Self-supervised classification for detecting anomalous sounds

02/11/2020

Ritwik Giri, Srikanth V. Tenneti, Fangzhou Cheng, Karim Helwani, Umut Isik, Arvindh Krishnaswamy

Abstract: Representation learning using self-supervised classification has recently been shown to give state-of-the-art accuracies for anomaly detection on computer vision datasets. Geometric transformations on images, such as rotations, translations, and flipping, have been used in these recent works to create auxiliary classification tasks for feature learning. This paper introduces a new self-supervised classification framework for anomaly detection in audio signals. Classification tasks are set up based on differences in the metadata associated with the audio files. Synthetic augmentations, such as linearly combining and warping audio spectrograms, are also used to increase the complexity of the classification task so that finer features are learned. The proposed approach is validated using the publicly available dataset of DCASE 2020 Challenge Task 2: Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring. We demonstrate the effectiveness of our approach by comparing against the baseline autoencoder model, showing an improvement of over 12.5% in the average AUC metrics.
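As a rough illustration of the "linearly combining" augmentation mentioned in the abstract, the sketch below blends two spectrograms and their class labels with a single weight. It is not the authors' implementation: the function name `combine_spectrograms`, the `weight` parameter, the nested-list representation of a spectrogram, and the soft-label target are all assumptions made for this example.

```python
def combine_spectrograms(spec_a, spec_b, label_a, label_b, num_classes, weight=0.5):
    """Blend two equally shaped spectrograms and build a matching soft label.

    Hypothetical helper for illustration only. Spectrograms are nested lists
    (rows = frequency bins, columns = time frames); labels are class indices
    that would come from file metadata in a setup like the one described.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(spec_a) != len(spec_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(spec_a, spec_b)
    ):
        raise ValueError("spectrograms must have the same shape")

    # Element-wise linear combination of the two inputs.
    mixed = [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(spec_a, spec_b)
    ]

    # Soft target: each source class receives its share of the mixing weight.
    target = [0.0] * num_classes
    target[label_a] += weight
    target[label_b] += 1.0 - weight
    return mixed, target


# Toy 2x2 "spectrograms" belonging to (assumed) metadata classes 0 and 2.
spec_a = [[1.0, 2.0], [3.0, 4.0]]
spec_b = [[0.0, 0.0], [2.0, 2.0]]
mixed, target = combine_spectrograms(
    spec_a, spec_b, label_a=0, label_b=2, num_classes=3, weight=0.5
)
```

A classifier trained on such blended inputs has to recover how much of each class is present, which is one plausible reading of how the augmentation makes the auxiliary task harder.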

The talk and the respective paper are published at the DCASE 2020 virtual conference.

Similar Papers

02/11/2020

ID-conditioned auto-encoder for unsupervised anomaly detection

Sławomir Kapka

13:51

02/11/2020

Group masked autoencoder based density estimator for audio anomaly detection

Ritwik Giri, Fangzhou Cheng, Karim Helwani, Srikanth V. Tenneti, Umut Isik, Arvindh Krishnaswamy

15:43

02/11/2020

DCASE 2020 Task2: Anomalous sound detection using relevant spectral feature and focusing techniques in the unsupervised learning scenario

Jihwan Park, Sooyeon Yoo

11:06

03/05/2021

Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders

Mangal Prakash, Alexander Krull, Florian Jug

Keywords: Variational Autoencoders, Noise model, Unsupervised denoising, Diversity denoising

4:56

14/06/2020

WaveletStereo: Learning Wavelet Coefficients of Disparity Map in Stereo Matching

Menglong Yang, Fangrui Wu, Wei Li

Keywords: stereo matching, wavelet coefficients, inverse wavelet transform, supervised learning, deep representation, multi-scale features, multi-resolution cost volume, wavelet regression, disparity reconstruction, disparity refinement

1:01

02/02/2021

Visualization of Supervised and Self-Supervised Neural Networks via Attribution Guided Factorization

Shir Gur, Ameen Ali, Lior Wolf

14:14

26/04/2020

High Fidelity Speech Synthesis with Adversarial Networks

Mikołaj Bińkowski, Jeff Donahue, Sander Dieleman, Aidan Clark, Erich Elsen, Norman Casagrande, Luis C. Cobo, Karen Simonyan

Keywords: text-to-speech, speech synthesis, audio synthesis, GANs, generative adversarial networks, implicit generative models

15:07

22/11/2021

Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video

Rishabh Garg, Ruohan Gao, Kristen Grauman

Keywords: Binaural Audio, Audio visual learning

9:48

30/11/2020

Do We Need Sound for Sound Source Localization?

Takashi Oya, Shohei Iwase, Ryota Natsume, Takahiro Itazuri, Shugo Yamaguchi, Shigeo Morishima

8:43

18/07/2021

SoundDet: Polyphonic Moving Sound Event Detection and Localization from Raw Waveform

Yuhang He, Niki Trigoni, Andrew Markham

Keywords: Applications, Audio and Speech Processing

4:34

03/05/2021

Learning to Set Waypoints for Audio-Visual Navigation

Changan Chen, Sagnik Majumder, Ziad Al-Halah, Ruohan Gao, Santhosh Kumar Ramakrishnan, Kristen Grauman

Keywords: visual navigation, audio visual learning, embodied vision

5:04

02/11/2020

Sound event localization and detection based on CRNN using rectangular filters and channel rotation data augmentation

Francesca Ronchini, Daniel Arteaga, Andrés Pérez-López

12:51

02/11/2020

Conformer-based sound event detection with semi-supervised learning and data augmentation

Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda

14:29

06/12/2021

Neural Population Geometry Reveals the Role of Stochasticity in Robust Perception

Joel Dapello, Jenelle Feather, Hang Le, Tiago Marques, David Cox, Josh McDermott, James J DiCarlo, Sueyeon Chung

Keywords: deep learning, machine learning, robustness, adversarial robustness and security, neuroscience

14:19

02/02/2021

Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations

Alex Wong, Mukund Mundhra, Stefano Soatto

17:02

16/11/2020

Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data

Lingkai Kong, Haoming Jiang, Yuchen Zhuang, Jie Lyu, Tuo Zhao, Chao Zhang

Keywords: augmented training, in-distribution calibration, text classification, expectation error

11:47

06/12/2021

Combating Noise: Semi-supervised Learning by Region Uncertainty Quantification

Zhenyu Wang, Ya-Li Li, Ye Guo, Shengjin Wang

Keywords: machine learning, vision, semi-supervised learning

7:12

06/12/2020

CoMIR: Contrastive Multimodal Image Representation for Registration

Nicolas Pielawski, Elisabeth Wetzer, Johan Öfverstedt, Jiahao Lu, Carolina Wählby, Joakim Lindblad, Natasa Sladoje

2:55

14/06/2020

Telling Left From Right: Learning Spatial Correspondence of Sight and Sound

Karren Yang, Bryan Russell, Justin Salamon

Keywords: audio-visual learning in video, self-supervision, video dataset, spatial audio, localization, spatialization, upmixing, source separation

4:41

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords: Cross-Lingual Classification, sentiment classification, unsupervised system, classification

12:23

03/05/2021

Neural Synthesis of Binaural Speech From Mono Audio

Alexander Richard, Dejan Markovic, Israel Gebru, Steven Krenn, Gladstone A Butler, Fernando Torre, Yaser Sheikh

Keywords: speech generation, speech processing, binaural speech, neural sound synthesis, sound spatialization, binaural audio

15:00

02/02/2021

Train a One-Million-Way Instance Classifier for Unsupervised Visual Representation Learning

Yu Liu, Lianghua Huang, Pan Pan, Bin Wang, Yinghui Xu, Rong Jin

15:15

06/12/2021

Improving Deep Learning Interpretability by Saliency Guided Training

Aya Abdelsalam Ismail, Hector Corrada Bravo, Soheil Feizi

Keywords: deep learning, transformers, vision, language, interpretability

10:45

30/11/2020

Project to Adapt: Domain Adaptation for Depth Completion from Noisy and Sparse Sensor Data

Adrian Lopez-Rodriguez, Benjamin Busam, Krystian Mikolajczyk

10:00

26/04/2020

I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively

Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma

Keywords: model comparison

4:53

02/02/2021

Joint Demosaicking and Denoising in the Wild: The Case of Training Under Ground Truth Uncertainty

Jierun Chen, Song Wen, S.-H. Gary Chan

18:17

12/07/2020

Variable-Bitrate Neural Compression via Bayesian Arithmetic Coding

Yibo Yang, Robert Bamler, Stephan Mandt

Keywords: Deep Learning - General

15:08

22/11/2021

PS-Transformer: Learning Sparse Photometric Stereo Network using Self-Attention Mechanism

Satoshi Ikehata

Keywords: photometric stereo, transformer

2:56

26/04/2020

From Variational to Deterministic Autoencoders

Partha Ghosh, Mehdi S. M. Sajjadi, Antonio Vergari, Michael Black, Bernhard Scholkopf

Keywords: Unsupervised learning, Generative Models, Variational Autoencoders, Regularization

4:59

06/12/2021

Dual Progressive Prototype Network for Generalized Zero-Shot Learning

Chaoqun Wang, Shaobo Min, Xuejin Chen, Xiaoyan Sun, Houqiang Li

10:51

02/02/2021

Model Uncertainty Guides Visual Object Tracking

Lijun Zhou, Antoine Ledent, Qintao Hu, Ting Liu, Jianlin Zhang, Marius Kloft

18:06

06/12/2021

TriBERT: Human-centric Audio-visual Representation Learning

Tanzila Rahman, Mengyu Yang, Leonid Sigal

Keywords: transformers, representation learning

13:54

02/02/2021

Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning

Elad Amrani, Rami Ben-Ari, Daniel Rotman, Alex Bronstein

14:04

05/01/2021

R-MNet: A Perceptual Adversarial Network for Image Inpainting

Jireh Jam, Connah Kendrick, Vincent Drouard, Kevin Walker, Gee-Sern Hsu, Moi Hoon Yap

5:02

05/01/2021

Mask Selection and Propagation for Unsupervised Video Object Segmentation

Shubhika Garg, Vidit Goel

4:38

14/06/2020

A U-Net Based Discriminator for Generative Adversarial Networks

Edgar Schönfeld, Bernt Schiele, Anna Khoreva

Keywords: gan, image synthesis, u-net, discriminator, consistency regularization, equivariance, generative adversarial networks, ffhq, biggan

1:01

02/11/2020

Anomalous sound detection as a simple binary classification problem with careful selection of proxy outlier examples

Paul Primus, Verena Haunschmid, Patrick Praher, Gerhard Widmer

15:23

06/12/2021

Learning Signal-Agnostic Manifolds of Neural Fields

Yilun Du, Katie Collins, Josh Tenenbaum, Vincent Sitzmann

Keywords: deep learning, generative model

11:08

06/12/2020

Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection

Xiang Li, Wenhai Wang, Lijun Wu, Shuo Chen, Xiaolin Hu, Jun Li, Jinhui Tang, Jian Yang

2:42

02/11/2020

Effects of word-frequency based pre- and post- processings for audio captioning

Daiki Takeuchi, Yuma Koizumi, Yasunori Ohishi, Noboru Harada, Kunio Kashino

13:56