12/07/2020

Deep Graph Random Process for Relational-Thinking-Based Speech Recognition

Hengguan Huang, Fuzhao Xue, Hao Wang, Ye Wang

Keywords: Applications - Language, Speech and Dialog

Abstract: Both relational thinking and relational reasoning lie at the core of human intelligence. While relational reasoning has inspired many perspectives in artificial intelligence, relational thinking is relatively unexplored in solving machine learning problems. It is characterized by initially relying on innumerable unconscious percepts pertaining to relations between new sensory signals and prior knowledge; these percepts are then coupled into a recognizable concept or object. Such mental processes are difficult to model in real-world problems such as conversational automatic speech recognition (ASR), as the percepts (e.g., unconscious mental impressions formed while hearing sounds) are supposed to be innumerable and not directly observable. Yet the dialogue history of the conversation might still reflect such underlying processes, allowing an indirect way of modeling them. We present a framework that models a percept as weak relations between a current utterance and its history. We assume the probability of the existence of such a relation to be close to zero due to the unconsciousness of the percept. Given an utterance and its history, our method can generate an infinite number of probabilistic graphs representing percepts and analytically combine them into a new graph representing strong relations among utterances. This new graph can be further transformed to be task-specific and provide an informative representation for acoustic modeling. Our approach is able to successfully infer relations among utterances without using any relational data during training. Experimental evaluations on ASR tasks including CHiME-2, SWB-30k and CHiME-5 demonstrate the effectiveness and benefits of our method.
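The core idea of combining infinitely many near-zero-probability relation graphs into one graph of strong relations can be illustrated with a standard binomial-to-Poisson limit. The sketch below is not the authors' exact deep graph random process; it is a minimal NumPy simulation assuming a hypothetical per-pair relation strength `lam`, where each "percept" graph has independent Bernoulli edges with probability `lam / n`, and summing the n graphs yields edge counts that concentrate around `lam` as n grows.

```python
import numpy as np

def combine_weak_graphs(lam, n, rng):
    """Sample n Bernoulli 'percept' graphs whose edges exist with
    near-zero probability lam/n, then sum them into a combined
    relation graph. Binomial(n, lam/n) -> Poisson(lam) as n grows,
    so edge counts concentrate around the latent strength lam."""
    edges = rng.random((n,) + lam.shape) < lam / n
    return edges.sum(axis=0)

rng = np.random.default_rng(0)
# Hypothetical relation strength between 3 utterances in a dialogue.
lam = np.full((3, 3), 2.0)
counts = combine_weak_graphs(lam, n=100_000, rng=rng)
# Mean edge count per utterance pair should be close to lam = 2.0.
print(counts.mean())
```

Each individual percept graph here is almost empty (edge probability 2/100000), mirroring the assumption that a single unconscious percept carries a relation with probability close to zero; only the analytic combination of many such graphs reveals strong relations.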

The talk and the respective paper are published at the ICML 2020 virtual conference.

