02/11/2020

Ensemble of sequence matching networks for dynamic sound event localization, detection, and tracking

Thi Ngoc Tho Nguyen, Douglas L. Jones, Woon Seng Gan

Abstract: Sound event localization and detection consists of two subtasks: sound event detection and direction-of-arrival estimation. Sound event detection relies mainly on time-frequency patterns to distinguish sound classes, whereas direction-of-arrival estimation uses magnitude or phase differences between microphones to estimate source directions. It is therefore often difficult to train the two subtasks jointly. Our previous sequence matching approach solved sound event detection and direction-of-arrival estimation separately, then trained a convolutional recurrent neural network to associate the sound classes with the directions-of-arrival using the onsets and offsets of the sound events. This approach achieved better performance than other state-of-the-art networks, such as SELDnet and the two-stage networks, for static sources. Moving sound sources require higher spatial resolution than static sources. To estimate their directions-of-arrival, we propose separating the directional estimates into azimuth and elevation estimates before passing them to the sequence matching network. Experimental results on the new DCASE dataset for sound event localization, detection, and tracking of multiple moving sound sources show that the sequence matching network with separated azimuth and elevation inputs outperforms the sequence matching network with a joint azimuth and elevation input. We combined several sequence matching networks that use the newly proposed directional inputs into an ensemble to boost system performance. Our proposed ensemble achieves a localization error of 9.3 degrees and a localization recall of 90%, and ranked 2nd in the team category of the DCASE2020 sound event localization and detection challenge.
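As a rough illustration of the two quantities the abstract mentions, the sketch below splits a direction-of-arrival vector into separate azimuth and elevation angles and measures the angular error between two directions in degrees. It is not the authors' implementation: the Cartesian input convention, the function names `split_doa` and `angular_error`, and the use of the great-circle angle as the error measure are all assumptions made for this example.

```python
import math


def split_doa(x, y, z):
    """Split a Cartesian direction vector into (azimuth, elevation) in degrees.

    Assumed convention (not from the paper): azimuth is measured in the
    x-y plane from the x axis, elevation is measured up from that plane.
    """
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    return azimuth, elevation


def angular_error(az1, el1, az2, el2):
    """Great-circle angle in degrees between two azimuth/elevation pairs."""
    a1, e1, a2, e2 = map(math.radians, (az1, el1, az2, el2))
    cos_angle = (math.sin(e1) * math.sin(e2)
                 + math.cos(e1) * math.cos(e2) * math.cos(a1 - a2))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


# Hypothetical usage: one estimated direction compared with a reference.
est_az, est_el = split_doa(0.0, 1.0, 0.0)
error_deg = angular_error(est_az, est_el, 80.0, 5.0)
```

Keeping azimuth and elevation as separate values, as in `split_doa`, mirrors the idea of feeding them to a network as separate inputs; the error function only shows one common way such estimates could be scored.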

The talk and the respective paper are published at the DCASE 2020 virtual conference.
