Abstract:
Audio source separation is the process of separating a mixture into isolated sounds from individual sources. Deep learning models are the state of the art in source separation, provided that the mixture to be separated is similar to the mixtures the model was trained on. This requires the end user to know enough about each model’s training to select the correct model for a given audio mixture. In this work, we propose a confidence measure that can be applied to any clustering-based separation model. The proposed confidence measure does not require ground truth to estimate the quality of a separated source. We use our confidence measure to automate selection of the appropriate deep clustering model for an audio mixture. Results show that our confidence measure can reliably select the highest-performing model for an audio mixture without knowledge of the domain the mixture came from, enabling automatic selection of deep models.