11/10/2020

Can't Trust the Feeling? How Open Data Reveals Unexpected Behavior of High-level Music Descriptors

Cynthia C. S. Liem, Chris Mostert

Keywords: Evaluation, datasets, and reproducibility, Reproducibility, Domain knowledge, Machine learning/Artificial intelligence for music, Evaluation methodology, Musical features and properties, Musical affect, emotion, and mood, Musical style and genre, Philosophical and ethical discussions, Philosophical and methodological foundations

Abstract: Copyright restrictions prevent the widespread sharing of commercial music audio. Therefore, the availability of resharable pre-computed music audio features has become critical. In line with this, the AcousticBrainz platform offers a dynamically growing, open and community-contributed large-scale resource of locally computed low-level and high-level music descriptors. Beyond enabling research reuse, the availability of such an open resource allows for renewed reflection on the music descriptors we have at hand: while they were validated to perform successfully under lab conditions, they now are being run ‘in the wild’. Their response to these more ecological conditions can shed light on the degree to which they truly had construct validity. In this work, we seek to gain further understanding into this, by analyzing high-level classifier-based music descriptor output in AcousticBrainz. While no hard ground truth is available on what the true value of these descriptors should be, some oracle information can still be derived, relying on semantic redundancies between several descriptors, and multiple feature submissions being available for the same recording. We report on multiple unexpected patterns found in the data, indicating that the descriptor values should not be taken as absolute truth, and hinting at directions for more comprehensive descriptor testing that are overlooked in common machine learning evaluation and quality assurance setups.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at ISMIR 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers