Model selection for deep audio source separation via clustering analysis

02/11/2020

Alisa Liu, Prem Seetharaman, Bryan Pardo

Abstract: Audio source separation is the process of separating a mixture into isolated sounds from individual sources. Deep learning models are the state of the art in source separation, provided that the mixture to be separated is similar to the mixtures the deep model was trained on. This requires the end user to know enough about each model’s training to select the correct model for a given audio mixture. In this work, we propose a confidence measure that can be broadly applied to any clustering-based separation model. The proposed confidence measure does not require ground truth to estimate the quality of a separated source. We use our confidence measure to automate selection of the appropriate deep clustering model for an audio mixture. Results show that our confidence measure can reliably select the highest-performing model for an audio mixture without knowledge of the domain the audio mixture came from, enabling automatic selection of deep models.
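
The selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the confidence measure, so this sketch assumes a silhouette-style clustering score as a stand-in, and assumes each candidate model has already produced an embedding matrix (one row per time-frequency point) for the mixture. The names `kmeans`, `silhouette_confidence`, and `select_model` are hypothetical and are not taken from the paper.

```python
import numpy as np


def kmeans(points, k, iters=25, seed=0):
    """Plain k-means; returns one cluster label per point."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # Distance from every point to every center, shape (n_points, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels


def silhouette_confidence(points, labels):
    """Mean silhouette score in [-1, 1]; needs no ground-truth sources.

    Assumption: used here as a stand-in for the paper's confidence measure.
    """
    points = np.asarray(points, dtype=float)
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return 0.0
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    scores = []
    for i in range(len(points)):
        same = labels == labels[i]
        n_same = int(same.sum())
        if n_same == 1:
            scores.append(0.0)  # singleton cluster: score defined as 0
            continue
        # Mean distance to the other members of the point's own cluster.
        a = dists[i, same].sum() / (n_same - 1)
        # Mean distance to the nearest other cluster.
        b = min(dists[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return float(np.mean(scores))


def select_model(embeddings_by_model, num_sources=2):
    """Pick the model whose embeddings cluster most cleanly."""
    confidences = {
        name: silhouette_confidence(emb, kmeans(emb, num_sources))
        for name, emb in embeddings_by_model.items()
    }
    best = max(confidences, key=confidences.get)
    return best, confidences
```

Calling `select_model({"speech_model": emb_a, "music_model": emb_b})` with two embedding arrays returns the name with the higher score together with all scores. The intuition is that a model applied to a mixture unlike its training data tends to produce embeddings that do not form well-separated clusters, so its score drops.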

This is an embedded video. The talk and the corresponding paper were published at the DCASE 2020 virtual conference.

Similar Papers

06/12/2020

A Spectral Energy Distance for Parallel Speech Synthesis

Alexey Gritsenko, Tim Salimans, Rianne van den Berg, Jasper Snoek, Nal Kalchbrenner

3:11

06/12/2020

Unsupervised Sound Separation Using Mixture Invariant Training

Scott Wisdom, Efthymios Tzinis, Hakan Erdogan, Ron Weiss, Kevin Wilson, John R. Hershey

3:20

18/07/2021

Learning de-identified representations of prosody from raw audio

Jack Weston, Raphael Lenain, Udeepa Meepegama, Emil Fristed

Keywords: Applications, Audio and Speech Processing

4:37

06/12/2021

VoiceMixer: Adversarial Voice Style Mixup

Sang-Hoon Lee, Ji-Hoon Kim, Hyunseung Chung, Seong-Whan Lee

Keywords: representation learning

10:18

03/05/2021

Improving Zero-Shot Voice Style Transfer via Disentangled Representation Learning

Siyang Yuan, Pengyu Cheng, Ruiyi Zhang, Weituo Hao, Zhe Gan, Lawrence Carin

Keywords: Disentanglement, Mutual Information, Zero-shot Learning, Style Transfer

5:03

02/02/2021

TaLNet: Voice Reconstruction from Tongue and Lip Articulation with Transfer Learning from Text-to-Speech Synthesis

Jing-Xuan Zhang, Korin Richmond, Zhen-Hua Ling, Lirong Dai

19:58

06/12/2020

Universally Quantized Neural Compression

Eirikur Agustsson, Lucas Theis

3:03

06/12/2020

Listening to Sounds of Silence for Speech Denoising

Henry Xu, Rundi Wu, Yuko Ishiwaka, Carl Vondrick, Changxi Zheng

3:22

06/12/2021

VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text

Hassan Akbari, Liangzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Yin Cui, Boqing Gong

Keywords: machine learning, self-supervised learning, transformers, vision, contrastive learning

15:59

30/11/2020

Do We Need Sound for Sound Source Localization?

Takashi Oya, Shohei Iwase, Ryota Natsume, Takahiro Itazuri, Shugo Yamaguchi, Shigeo Morishima

8:43

19/08/2021

FedSpeech: Federated Text-to-Speech with Continual Learning

Ziyue Jiang, Yi Ren, Ming Lei, Zhou Zhao

Keywords: Natural Language Processing, Speech, Federated Learning, Privacy Preserving Data Mining

6:06

19/08/2021

Multi-Scale Selective Feedback Network with Dual Loss for Real Image Denoising

Xiaowan Hu, Yuanhao Cai, Zhihong Liu, Haoqian Wang, Yulun Zhang

Keywords: Computer Vision, Computational Photography, Photometry, Shape from X, Deep Learning

9:52

02/02/2021

Exploiting Audio-Visual Consistency with Partial Supervision for Spatial Audio Generation

Yan-Bo Lin, Yu-Chiang Frank Wang

15:06

03/05/2021

Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds

Efthymios Tzinis, Scott Wisdom, Aren Jansen, Shawn Hershey, Tal Remez, Dan Ellis, John Hershey

Keywords: self-supervised learning, universal sound separation, in-the-wild data, Audio-visual sound separation, unsupervised learning

5:06

06/12/2021

Compressive Visual Representations

Kuang-Huei Lee, Anurag Arnab, Sergio Guadarrama, John Canny, Ian Fischer

Keywords: theory, machine learning, robustness, self-supervised learning, contrastive learning

6:30

03/05/2021

End-to-end Adversarial Text-to-Speech

Jeff Donahue, Sander Dieleman, Mikolaj Binkowski, Erich Elsen, Karen Simonyan

Keywords: end-to-end, speech synthesis, feed-forward, text-to-speech, adversarial, generative model, GAN

15:23

26/04/2020

DDSP: Differentiable Digital Signal Processing

Jesse Engel, Lamtharn (Hanoi) Hantrakul, Chenjie Gu, Adam Roberts

Keywords: dsp, audio, music, nsynth, wavenet, wavernn, vocoder, synthesizer, sound, signal, processing, tensorflow, autoencoder, disentanglement

5:11

05/01/2021

Boosting Monocular Depth With Panoptic Segmentation Maps

Faraz Saeedan, Stefan Roth

4:59

06/12/2020

Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction

Yaodong Yu, Ryan Chan, Chong You, Chaobing Song, Yi Ma

3:20

06/12/2021

Understanding Adaptive, Multiscale Temporal Integration In Deep Speech Recognition Systems

Menoua Keshishian, Samuel Norman-Haignere, Nima Mesgarani

Keywords: deep learning, machine learning

10:28

26/04/2020

Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech

David Harwath, Wei-Ning Hsu, James Glass

Keywords: visually-grounded speech, self-supervised learning, discrete representation learning, vision and language, vision and speech, hierarchical representation learning

13:42

02/02/2021

Listen, Understand and Translate: Triple Supervision Decouples End-to-end Speech-to-text Translation

Qianqian Dong, Rong Ye, Mingxuan Wang, Hao Zhou, Shuang Xu, Bo Xu, Lei Li

14:09

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords: Cross-Lingual Classification, sentiment classification, unsupervised system, classification

12:23

05/01/2021

AVGZSLNet: Audio-Visual Generalized Zero-Shot Learning by Reconstructing Label Features From Multi-Modal Embeddings

Pratik Mazumder, Pravendra Singh, Kranti Kumar Parida, Vinay P. Namboodiri

4:46

14/06/2020

Training Noise-Robust Deep Neural Networks via Meta-Learning

Zhen Wang, Guosheng Hu, Qinghua Hu

Keywords: label noise, noise-robust learning, loss correction approach, noise transition matrix, meta-learning

1:01

03/05/2021

Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning

Beliz Gunel, Jingfei Du, Alexis Conneau, Veselin Stoyanov

Keywords: supervised contrastive learning, pre-trained language model fine-tuning, natural language understanding, generalization, few-shot learning, robustness

4:44

16/11/2020

The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection

Zibo Lin, Deng Cai, Yan Wang, Xiaojiang Liu, Haitao Zheng, Shuming Shi

Keywords: response selection, retrieval-based systems, learning-to-rank problem, learning-to-rank

12:03

14/06/2020

Deep Semantic Clustering by Partition Confidence Maximisation

Jiabo Huang, Shaogang Gong, Xiatian Zhu

Keywords: deep clustering, cluster separability, separability measurement, semantic plausibility

1:00

06/12/2021

PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition

Cheng-I Jeff Lai, Yang Zhang, Alexander Liu, Shiyu Chang, Yi-Lun Liao, Yung-Sung Chuang, Kaizhi Qian, Sameer Khurana, David Cox, Jim Glass

Keywords: self-supervised learning, representation learning

13:57

02/02/2021

Learning to Purify Noisy Labels via Meta Soft Label Corrector

Yichen Wu, Jun Shu, Qi Xie, Qian Zhao, Deyu Meng

13:01

14/06/2020

WaveletStereo: Learning Wavelet Coefficients of Disparity Map in Stereo Matching

Menglong Yang, Fangrui Wu, Wei Li

Keywords: stereo matching, wavelet coefficients, inverse wavelet transform, supervised learning, deep representation, multi-scale features, multi-resolution cost volume, wavelet regression, disparity reconstruction, disparity refinement

1:01

03/05/2021

LEAF: A Learnable Frontend for Audio Classification

Neil Zeghidour, Olivier Teboul, Félix de Chaumont Quitry, Marco Tagliasacchi

Keywords: sound classification, time-frequency representations, mel-filterbanks, learnable, frontend, audio understanding

5:18

06/12/2020

Unsupervised Data Augmentation for Consistency Training

Qizhe Xie, Zihang Dai, Eduard Hovy, Thang Luong, Quoc V Le

3:29

26/04/2020

Distance-Based Learning from Errors for Confidence Calibration

Chen Xing, Sercan Arik, Zizhao Zhang, Tomas Pfister

Keywords: Confidence Calibration, Uncertainty Estimation, Prototypical Learning

5:09

07/09/2020

NTGAN: Learning Blind Image Denoising without Clean Reference

Rui Zhao, Daniel P.K. Lun, Kin-Man Lam

Keywords: unsupervised image denoising, blind image denoising, pseudo supervision, noise transference

6:14

03/05/2021

Multi-Class Uncertainty Calibration via Mutual Information Maximization-based Binning

Kanil Patel, William H Beluch, Bin Yang, Michael Pfeiffer, Dan Zhang

Keywords: deep neural networks, histogram binning, post-hoc calibration, uncertainty calibration, mutual information

5:13

06/12/2020

Displacement-Invariant Matching Cost Learning for Accurate Optical Flow Estimation

Jianyuan Wang, Yiran Zhong, Yuchao Dai, Kaihao Zhang, Pan Ji, Hongdong Li

Keywords: Algorithms -> Large Scale Learning; Algorithms -> Online Learning; Algorithms -> Regression; Algorithms -> Stochastic Methods; Optimization -> Convex Optimization

3:10

14/06/2020

A Disentangling Invertible Interpretation Network for Explaining Latent Representations

Patrick Esser, Robin Rombach, Björn Ommer

Keywords: interpretability, inn, disentangling, generative models, invertible neural networks, autoencoders, normalizing flows, vae, explainable, xai

1:01

02/02/2021

Multi-SpectroGAN: High-Diversity and High-Fidelity Spectrogram Generation with Adversarial Style Combination for Speech Synthesis

Sang-Hoon Lee, Hyun-Wook Yoon, Hyeong-Rae Noh, Ji-Hoon Kim, Seong-Whan Lee

14:19

06/12/2020

Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching

Di Hu, Rui Qian, Minyue Jiang, Xiao Tan, Shilei Wen, Errui Ding, Weiyao Lin, Dejing Dou

3:07