MultiQT: Multimodal learning for real-time question tracking in speech

Abstract: We address the challenging and practical task of labeling questions in speech in real time during English-language telephone calls to emergency medical services. The task is embedded in a broader decision support system for emergency call-takers. We propose a novel multimodal approach to real-time sequence labeling in speech. Our model treats speech and its own textual representation as two separate modalities, or views: it learns jointly from streamed audio and from its noisy transcription into text via automatic speech recognition. Our results show significant gains from jointly learning from the two modalities compared to text or audio alone, under adverse noise and with a limited volume of training data. The results generalize to medical symptom detection, where we observe a similar pattern of improvements with multimodal learning.

19/08/2021

Jakob D. Havtorn, Jan Latko, Joakim Edin, Lars Maaløe, Lasse Borgholt, Lorenzo Belgrano, Nicolai Jacobsen, Regitze Sdun, Željko Agić

Similar Papers

A Streaming End-to-End Framework For Spoken Language Understanding

Nihal Potdar, Anderson Raymundo Avila, Chao Xing, Dong Wang, Yiran Cao, Xiao Chen

Keywords: Natural Language Processing, Dialogue, Speech

Attentively Embracing Noise for Robust Latent Representation in BERT

Gwenaelle Cunha Sergio, Dennis Singh Moirangthem, Minho Lee

Towards end-2-end learning for predicting behavior codes from spoken utterances in psychotherapy conversations

Karan Singla, Zhuohao Chen, David Atkins, Shrikanth Narayanan

Keywords: predicting codes, Spoken tasks, voice detection, speaker diarization

UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data

Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang

Keywords: Applications, Speech Recognition

Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions

Hannah Craighead, Andrew Caines, Paula Buttery, Helen Yannakoudakis

Keywords: automated transcriptions, automatically speech, multi-task learning, inductive transfer

Towards an Automated SOAP Note: Classifying Utterances from Medical Conversations

Benjamin Schloss, Sandeep Konam

SimulSpeech: End-to-End Simultaneous Speech to Text Translation

Yi Ren, Jinglin Liu, Xu Tan, Chen Zhang, Tao Qin, Zhou Zhao, Tie-Yan Liu

Keywords: simultaneous translation, simultaneous recognition, ASR, NMT

Semantic parsing of disfluent speech

Priyanka Sen, Isabel Groves

Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

Dongchan Min, Dong Bok Lee, Eunho Yang, Sung Ju Hwang

Keywords: Applications, Audio and Speech Processing

Speech2Talking-Face: Inferring and Driving a Face with Synchronized Audio-Visual Representation

Yasheng Sun, Hang Zhou, Ziwei Liu, Hideki Koike

Keywords: Computer Vision, 2D and 3D Computer Vision, Speech

Unsupervised Paraphasia Classification in Aphasic Speech

Sharan Pai, Nikhil Sachdeva, Prince Sachdeva, Rajiv Ratn Shah

Keywords: Unsupervised Classification, speech disorder, naming detection, treatment

MRD-Net: Multi-Modal Residual Knowledge Distillation for Spoken Question Answering

Chenyu You, Nuo Chen, Yuexian Zou

Keywords: Natural Language Processing, Question Answering, Sentiment Analysis and Text Mining, Speech

Robust Neural Machine Translation with ASR Errors

Haiyang Xue, Yang Feng, Shuhao Gu, Wei Chen

Revisiting Mahalanobis Distance for Transformer-Based Out-of-Domain Detection

Alexander Podolskiy, Dmitry Lipin, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya

Audio-Oriented Multimodal Machine Comprehension via Dynamic Inter- and Intra-modality Attention

Zhiqi Huang, Fenglin Liu, Xian Wu, Shen Ge, Helin Wang, Wei Fan, Yuexian Zou

Multi-task Learning of Spoken Language Understanding by Integrating N-Best Hypotheses with Hierarchical Attention

Mingda Li, Xinyue Liu, Weitong Ruan, Luca Soldaini, Wael Hamza, Chengwei Su

Modelling context emotions using multi-task learning for emotion controlled dialog generation

Deeksha Varshney, Asif Ekbal, Pushpak Bhattacharyya

Streaming models for joint speech recognition and translation

Orion Weller, Matthias Sperber, Christian Gollan, Joris Kluivers

End-to-End Speech Translation with Adversarial Training

Xuancai Li, Chen Kehai, Tiejun Zhao, Muyun Yang

Neural Machine Translation with Universal Visual Representation

Zhuosheng Zhang, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao

Keywords: Neural Machine Translation, Visual Representation, Multimodal Machine Translation, Language Representation

Contextualized Emotion Recognition in Conversation as Sequence Tagging

Yan Wang, Jiayu Zhang, Jun Ma, Shaojun Wang, Jing Xiao

Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis

Rafael Valle, Kevin J Shih, Ryan Prenger, Bryan Catanzaro
