Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals

06/12/2020

Jing Shi, Xuankai Chang, Pengcheng Guo, Shinji Watanabe, Yusuke Fujita, Jiaming Xu, Bo Xu, Lei Xie

Abstract: Neural sequence-to-sequence models are well established for applications that can be cast as mapping a single input sequence to a single output sequence. In this work, we focus on one-to-many sequence transduction problems, such as extracting multiple sequential sources from a mixture sequence. We extend the standard sequence-to-sequence model to a conditional multi-sequence model, which explicitly models the relevance between multiple output sequences with the probabilistic chain rule. Based on this extension, our model can conditionally infer output sequences one by one, making use of both the input and the previously estimated contextual output sequences. The model additionally has a simple and efficient stop criterion for the end of the transduction, which lets it infer a variable number of output sequences. We take speech data as the primary test field for our methods, since observed speech is often composed of multiple sources owing to the superposition principle of sound waves. Experiments on several tasks, including speech separation and multi-speaker speech recognition, show that our conditional multi-sequence models lead to consistent improvements over conventional non-conditional models.
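
The one-by-one inference described in the abstract can be sketched as a simple loop. By the probabilistic chain rule, the joint distribution over N output sequences factorizes as p(S_1, ..., S_N | O) = prod_i p(S_i | O, S_1, ..., S_{i-1}), so each step conditions on the input O and on every earlier output. The following is a minimal sketch under stated assumptions, not the authors' implementation: `infer_chain`, `toy_step`, and `max_outputs` are hypothetical names, and `toy_step` is a hand-written stand-in for a trained network. It peels off the samples with the largest remaining magnitude and returns `None` as the stop signal.

```python
from typing import Callable, List, Optional, Sequence

Signal = List[float]
# Hypothetical step interface: given the mixture and all previously
# estimated outputs, return the next output, or None to signal "stop".
StepFn = Callable[[Signal, Sequence[Signal]], Optional[Signal]]


def infer_chain(mixture: Signal, step: StepFn, max_outputs: int = 10) -> List[Signal]:
    """Infer output sequences one by one, each conditioned on the mixture
    and on the outputs estimated so far, until the step signals stop."""
    outputs: List[Signal] = []
    for _ in range(max_outputs):  # safety cap (an assumption of this sketch)
        nxt = step(mixture, outputs)
        if nxt is None:  # stop criterion: no further source to extract
            break
        outputs.append(nxt)
    return outputs


def toy_step(mixture: Signal, previous: Sequence[Signal]) -> Optional[Signal]:
    """Toy stand-in for a learned conditional model (assumption).

    It subtracts the previous outputs from the mixture and returns the
    samples whose magnitude equals the largest remaining magnitude.
    """
    residual = list(mixture)
    for seq in previous:
        residual = [r - s for r, s in zip(residual, seq)]
    if all(abs(r) < 1e-9 for r in residual):
        return None  # nothing left in the mixture, so stop
    peak = max(abs(r) for r in residual)
    return [r if abs(r) == peak else 0.0 for r in residual]


if __name__ == "__main__":
    mix = [3.0, 0.0, 1.0, 3.0, 1.0, 0.0]
    for i, seq in enumerate(infer_chain(mix, toy_step), start=1):
        print(f"output {i}: {seq}")
```

The number of outputs is not fixed in advance: the loop ends when the step function returns `None`, which mirrors the role of the stop criterion in handling a variable number of sources. In a real system the step function would be a trained network, and the stop decision would be learned rather than computed from a residual.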

The talk and the corresponding paper were published at the NeurIPS 2020 virtual conference.

Similar Papers

18/07/2021

EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture

Chenfeng Miao, Liang Shuang, Zhengchen Liu, Chen Minchuan, Jun Ma, Shaojun Wang, Jing Xiao

Keywords: Applications, Audio and Speech Processing

5:13

03/05/2021

Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling

Jiahui Yu, Wei Han, Anmol Gulati, Chung-Cheng Chiu, Bo Li, Tara Sainath, Yonghui Wu, Ruoming Pang

Keywords: Dual-mode ASR, Low-latency ASR, Streaming ASR, Speech Recognition

5:11

06/12/2021

Speech-T: Transducer for Text to Speech and Beyond

Jiawei Chen, Xu Tan, Yichong Leng, Jin Xu, Guihua Wen, Tao Qin, Tie-Yan Liu

Keywords: transformers

8:38

06/12/2021

Understanding Adaptive, Multiscale Temporal Integration In Deep Speech Recognition Systems

Menoua Keshishian, Samuel Norman-Haignere, Nima Mesgarani

Keywords: deep learning, machine learning

10:28

01/07/2020

Start-Before-End and End-to-End: Neural Speech Translation by AppTek and RWTH Aachen University

Parnia Bahar, Patrick Wilken, Tamer Alkhouli, Andreas Guta, Pavel Golik, Evgeny Matusov, Christian Herold

15:41

16/11/2020

Direct Segmentation Models for Streaming Speech Translation

Javier Iranzo-Sánchez, Adrià Giménez Pastor, Joan Albert Silvestre-Cerdà, Pau Baquero-Arnal, Jorge Civera Saiz, Alfons Juan

Keywords: st, streaming st, pipeline, automatic system

11:53

18/07/2021

SoundDet: Polyphonic Moving Sound Event Detection and Localization from Raw Waveform

Yuhang He, Niki Trigoni, Andrew Markham

Keywords: Applications, Audio and Speech Processing

4:34

04/07/2020

SimulSpeech: End-to-End Simultaneous Speech to Text Translation

Yi Ren, Jinglin Liu, Xu Tan, Chen Zhang, Tao Qin, Zhou Zhao, Tie-Yan Liu

Keywords: simultaneous translation, simultaneous recognition, ASR, NMT

5:51

19/04/2021

Streaming models for joint speech recognition and translation

Orion Weller, Matthias Sperber, Christian Gollan, Joris Kluivers

5:11

04/07/2020

Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions

Hannah Craighead, Andrew Caines, Paula Buttery, Helen Yannakoudakis

Keywords: automated transcriptions, automatically speech, multi-task learning, inductive transfer

11:37

19/08/2021

A Streaming End-to-End Framework For Spoken Language Understanding

Nihal Potdar, Anderson Raymundo Avila, Chao Xing, Dong Wang, Yiran Cao, Xiao Chen

Keywords: Natural Language Processing, Dialogue, Speech

14:09

03/05/2021

Representation Learning for Sequence Data with Deep Autoencoding Predictive Components

Junwen Bai, Weiran Wang, Yingbo Zhou, Caiming Xiong

Keywords: Unsupervised Learning, Mutual Information, Masked Reconstruction, Sequence Data

5:08

02/11/2020

Temporal sub-sampling of audio feature sequences for automated audio captioning

Khoa Nguyen, Konstantinos Drossos, Tuomas Virtanen

14:09

06/12/2021

Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling

Hongyu Gong, Yun Tang, Juan Pino, Xian Li

10:04

08/12/2020

Speaker-change Aware CRF for Dialogue Act Classification

Guokan Shang, Antoine Tixier, Michalis Vazirgiannis, Jean-Pierre Lorré

14:52

26/04/2020

Residual Energy-Based Models for Text Generation

Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato

Keywords: energy-based models, text generation

4:59

06/12/2021

Improving Deep Learning Interpretability by Saliency Guided Training

Aya Abdelsalam Ismail, Hector Corrada Bravo, Soheil Feizi

Keywords: deep learning, transformers, vision, language, interpretability

10:45

19/08/2021

Progressive Open-Domain Response Generation with Multiple Controllable Attributes

Haiqin Yang, Xiaoyuan Yao, Yiqun Duan, Jianping Shen, Jie Zhong, Kun Zhang

Keywords: Machine Learning, Learning Generative Models, Dialogue

14:43

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords: Cross-Lingual Classification, sentiment classification, unsupervised system, classification

12:23

19/04/2021

Modelling context emotions using multi-task learning for emotion controlled dialog generation

Deeksha Varshney, Asif Ekbal, Pushpak Bhattacharyya

9:50

06/12/2021

Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network

Xiaolin Hu, Kai Li, Weiyi Zhang, Yi Luo, Jean-Marie Lemercier, Timo Gerkmann

Keywords: deep learning

13:24

03/05/2021

DiffWave: A Versatile Diffusion Model for Audio Synthesis

Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, Bryan Catanzaro

Keywords: diffusion probabilistic models, generative models, speech synthesis, audio synthesis

15:12

16/11/2020

COD3S: Diverse Generation with Discrete Semantic Signatures

Nathaniel Weir, João Sedoc, Benjamin Van Durme

Keywords: causal generation, cods, neural models, seqseqs

7:09

02/02/2021

Interactive Speech and Noise Modeling for Speech Enhancement

Chengyu Zheng, Xiulian Peng, Yuan Zhang, Sriram Srinivasan, Yan Lu

14:47

18/07/2021

Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech

Vadim Popov, Ivan Vovk, Vladimir Gogoryan, Tasnima Sadekova, Mikhail Kudinov

Keywords: Applications, Audio and Speech Processing

5:12

03/05/2021

Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis

Rafael Valle, Kevin J Shih, Ryan Prenger, Bryan Catanzaro

Keywords: normalizing flows, deep learning, Text to speech synthesis

5:11

13/04/2021

Feedback coding for active learning

Gregory Canal, Matthieu Bloch, Christopher Rozell

2:55

04/07/2020

Good-Enough Compositional Data Augmentation

Jacob Andreas

Keywords: Good-Enough Augmentation, diagnostic tasks, semantic task, data protocol

11:31

01/07/2020

KIT’s IWSLT 2020 SLT Translation System

Ngoc-Quan Pham, Felix Schneider, Tuan-Nam Nguyen, Thanh-Le Ha, Thai Son Nguyen, Maximilian Awiszus, Sebastian Stüker, Alexander Waibel

14:58

01/07/2020

ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020

Maha Elbayad, Ha Nguyen, Fethi Bougares, Natalia Tomashenko, Antoine Caubrière, Benjamin Lecouteux, Yannick Estève, Laurent Besacier

14:54

14/06/2020

Discriminative Multi-Modality Speech Recognition

Bo Xu, Cheng Lu, Yandong Guo, Jacob Wang

Keywords: multi-modal, audio-visual, speech recognition, lip reading, deep learning, eleatt-gru

1:01

19/04/2021

WER-BERT: Automatic WER estimation with BERT in a balanced ordinal classification paradigm

Akshay Krishna Sheshadri, Anvesh Rao Vijjini, Sukhdeep Kharbanda

11:45

02/11/2020

On multitask loss function for audio event detection and localization

Huy Phan, Lam Pham, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins

15:16

26/04/2020

Improving Neural Language Generation with Spectrum Control

Lingxiao Wang, Jing Huang, Kevin Huang, Ziniu Hu, Guangtao Wang, Quanquan Gu

4:58

02/02/2021

Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks

Maurício Gruppi, Pin-Yu Chen, Sibel Adali

19:35

06/12/2020

A Simple Language Model for Task-Oriented Dialogue

Ehsan Hosseini-Asl, Bryan McCann, Chien-Sheng Wu, Semih Yavuz, Richard Socher

3:21

02/02/2021

Lexically Constrained Neural Machine Translation with Explicit Alignment Guidance

Guanhua Chen, Yun Chen, Victor O.K. Li

15:33

06/12/2021

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

Yichong Leng, Xu Tan, Linchen Zhu, Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiangyang Li, Edward Lin, Tie-Yan Liu

13:44

16/11/2020

Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation

Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan

Keywords: document-level translation, document-level systems, context-aware architecture, transformer

6:36

26/04/2020

Compositional languages emerge in a neural iterated learning model

Yi Ren, Shangmin Guo, Matthieu Labeau, Shay B. Cohen, Simon Kirby

Keywords: Compositionality, Multi-agent, Emergent language, Iterated learning

5:07