Exploring Aligned Lyrics-informed Singing Voice Separation

11/10/2020

Chang-Bin Jeon, Hyeong-Seok Choi, Kyogu Lee

Keywords: MIR tasks, Sound source separation, Domain knowledge, Machine learning/Artificial intelligence for music, MIR fundamentals and methodology, Lyrics and other textual data, web mining, and natural language processing

Abstract: In this paper, we propose a method that uses aligned lyrics as additional information to improve singing voice separation. We combine a highway network-based lyrics encoder with the Open-Unmix separation network and show that the model trained with aligned lyrics performs better than a model that is not informed by lyrics. The question remains whether this gain actually comes from the phonetic content of the aligned lyrics. To answer it, we investigated the source of the improvement from several angles by observing how performance changes when incorrect lyrics are given to the model. Experimental results show that the model can use not only vocal activity information but also the phonetic content of the aligned lyrics.
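The abstract names a highway network-based lyrics encoder but gives no architectural details. As a rough illustration only, the sketch below shows a single generic highway layer, y = T(x) * H(x) + (1 - T(x)) * x, applied to one made-up embedding vector. The function names, the ReLU transform, the dimensions, and all weights are assumptions for illustration and are not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(x, weights, bias):
    """Compute W x + b for one input vector (weights is a list of rows)."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def highway_layer(x, w_h, b_h, w_t, b_t):
    """Generic highway layer: y = T(x) * H(x) + (1 - T(x)) * x."""
    h = [max(0.0, a) for a in affine(x, w_h, b_h)]   # transform H(x), ReLU assumed
    t = [sigmoid(a) for a in affine(x, w_t, b_t)]    # transform gate T(x) in (0, 1)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]

# Hypothetical 3-dimensional embedding of one aligned lyrics frame.
x = [0.5, -1.0, 2.0]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zeros = [[0.0] * 3 for _ in range(3)]

# Strongly negative gate bias: the gate is nearly closed, so x is carried through.
closed = highway_layer(x, identity, [0.0] * 3, zeros, [-30.0] * 3)
# Strongly positive gate bias: the gate is nearly open, so the output is ReLU(x).
opened = highway_layer(x, identity, [0.0] * 3, zeros, [30.0] * 3)
```

The gate lets each layer interpolate between transforming its input and passing it through unchanged, which is the general motivation for highway networks when stacking several layers.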

The talk and the accompanying paper were published at the ISMIR 2020 virtual conference.

Similar Papers

02/02/2021

Modeling the Compatibility of Stem Tracks to Generate Music Mashups

Jiawen Huang, Ju-Chiang Wang, Jordan B. L. Smith, Xuchen Song, Yuxuan Wang

19:31

11/10/2020

"Butter Lyrics Over Hominy Grit": Comparing Audio and Psychology-based Text Features in MIR Tasks

Jaehun Kim, Andrew M. Demetriou, Sandy Manolios, M. Stella Tavella, Cynthia C. S. Liem

Keywords: MIR fundamentals and methodology, Lyrics and other textual data, web mining, and natural language processing, Applications, Music recommendation and playlist generation, Domain knowledge, Machine learning/Artificial intelligence for music, Evaluation, datasets, and reproducibility, MIR tasks, Automatic classification

3:55

11/10/2020

Joyful for You and Tender for Us: the Influence of Individual Characteristics and Language on Emotion Labeling and Classification

Juan S. Gómez-Cañón, Estefania Cano, Perfecto Herrera, Emilia Gomez

Keywords: Musical features and properties, Musical affect, emotion, and mood, Domain knowledge, Cognitive MIR, Evaluation, datasets, and reproducibility, Annotation protocols, Evaluation methodology, Human-centered MIR, User-centered evaluation

3:38

08/12/2020

Multi-task Learning of Spoken Language Understanding by Integrating N-Best Hypotheses with Hierarchical Attention

Mingda Li, Xinyue Liu, Weitong Ruan, Luca Soldaini, Wael Hamza, Chengwei Su

14:43

02/02/2021

Interactive Speech and Noise Modeling for Speech Enhancement

Chengyu Zheng, Xiulian Peng, Yuan Zhang, Sriram Srinivasan, Yan Lu

14:47

11/10/2020

Explaining Perceived Emotion Predictions in Music: an Attentive Approach

Sanga Chaki, Pranjal Doshi, Sourangshu Bhattacharya, Prof. Priyadarshi Patnaik

Keywords: Musical features and properties, Musical affect, emotion, and mood, Applications, Music recommendation and playlist generation, Music retrieval systems, Domain knowledge, Machine learning/Artificial intelligence for music, MIR tasks, Automatic classification, Pattern matching and detection

3:15

11/10/2020

Content Based Singing Voice Source Separation via Strong Conditioning Using Aligned Phonemes

Gabriel Meseguer Brocal, Geoffroy Peeters

Keywords: MIR tasks, Sound source separation, Evaluation, datasets, and reproducibility, Novel datasets and use cases, MIR fundamentals and methodology, Lyrics and other textual data, web mining, and natural language processing, Multimodality

4:08

19/04/2021

WER-BERT: Automatic WER estimation with BERT in a balanced ordinal classification paradigm

Akshay Krishna Sheshadri, Anvesh Rao Vijjini, Sukhdeep Kharbanda

11:45

11/10/2020

Combining Musical Features for Cover Detection

Guillaume Doras, Furkan Yesiler, Joan Serra, Emilia Gomez, Geoffroy Peeters

Keywords: Applications, Music retrieval systems, Domain knowledge, Machine learning/Artificial intelligence for music, MIR tasks, Automatic classification, Similarity metrics, Musical features and properties, Harmony, chords, and tonality, Melody and motives

4:09

18/07/2021

Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Jaehyeon Kim, Jungil Kong, Juhee Son

Keywords: Applications, Audio and Speech Processing

7:21

02/11/2020

Conformer-based sound event detection with semi-supervised learning and data augmentation

Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda

14:29

14/06/2020

Discriminative Multi-Modality Speech Recognition

Bo Xu, Cheng Lu, Yandong Guo, Jacob Wang

Keywords: multi-modal, audio-visual, speech recognition, lip reading, deep learning, eleatt-gru

1:01

02/11/2020

DCASE 2020 Task2: Anomalous sound detection using relevant spectral feature and focusing techniques in the unsupervised learning scenario

Jihwan Park, Sooyeon Yoo

11:06

26/04/2020

Multilingual Alignment of Contextual Word Representations

Steven Cao, Nikita Kitaev, Dan Klein

Keywords: multilingual, natural language processing, embedding alignment, BERT, word embeddings, transfer

4:55

11/10/2020

Generating Music with a Self-correcting Non-chronological Autoregressive Model

Wayne Chi, Prachi Kumar, Suri Yaddanapudi, Suresh Rahul, Umut Isik

Keywords: Domain knowledge, Machine learning/Artificial intelligence for music, Applications, Music composition, performance, and production, Representations of music, MIR tasks, Music synthesis and transformation

4:33

19/04/2021

Modelling context emotions using multi-task learning for emotion controlled dialog generation

Deeksha Varshney, Asif Ekbal, Pushpak Bhattacharyya

9:50

01/07/2020

Filtering conversations through dialogue acts labels for improving corpus-based convergence studies

Simone Fuscone, Benoit Favre, Laurent Prévot

7:58

04/07/2020

Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation

Xabier Soto, Dimitar Shterionov, Alberto Poncelas, Andy Way

Keywords: Neural Translation, Machine MT, new systems, MT systems

14:24

08/12/2020

Bayesian Methods for Semi-supervised Text Annotation

Kristian Miok, Gregor Pirs, Marko Robnik-Sikonja

11:18

16/11/2020

Sparse Text Generation

Pedro Henrique Martins, Zita Marinho, André F. T. Martins

Keywords: story completion, dialogue generation, text generators, language models

11:27

11/10/2020

Data Cleansing with Contrastive Learning for Vocal Note Event Annotations

Gabriel Meseguer Brocal, Rachel Bittner, Simon Durand, Brian Brost

Keywords: Domain knowledge, Machine learning/Artificial intelligence for music, Evaluation, datasets, and reproducibility, Novel datasets and use cases, MIR tasks, Music transcription and annotation

3:51

11/10/2020

A Chorus-section Detection Method for Lyrics Text

Kento Watanabe, Masataka Goto

Keywords: MIR fundamentals and methodology, Lyrics and other textual data, web mining, and natural language processing, Musical features and properties, Structure, segmentation, and form

4:03

11/10/2020

Music Structure Analysis Based on an LSTM-HSMM Hybrid Model

Go Shibata, Ryo Nishikimi, Kazuyoshi Yoshii

Keywords: Musical features and properties, Structure, segmentation, and form

4:06

02/02/2021

Merging Statistical Feature via Adaptive Gate for Improved Text Classification

Xianming Li, Zongxi Li, Haoran Xie, Qing Li

14:56

06/12/2021

Unsupervised Noise Adaptive Speech Enhancement by Discriminator-Constrained Optimal Transport

Hsin-Yi Lin, Huan-Hsin Tseng, Xugang Lu, Yu Tsao

Keywords: theory, machine learning, adversarial robustness and security, domain adaptation, optimal transport

14:40

08/12/2020

Speaker-change Aware CRF for Dialogue Act Classification

Guokan Shang, Antoine Tixier, Michalis Vazirgiannis, Jean-Pierre Lorré

14:52

01/07/2020

Robust Neural Machine Translation with ASR Errors

Haiyang Xue, Yang Feng, Shuhao Gu, Wei Chen

8:15

02/02/2021

C2C-GenDA: Cluster-to-Cluster Generation for Data Augmentation of Slot Filling

Yutai Hou, Sanyuan Chen, Wanxiang Che, Cheng Chen, Ting Liu

15:01

11/10/2020

Mood Classification Using Listening Data

Filip Korzeniowski, Oriol Nieto, Matthew McCallum, Minz Won, Sergio Oramas, Erik Schmidt

Keywords: Musical features and properties, Musical affect, emotion, and mood, Applications, Music retrieval systems, MIR tasks, Automatic classification

4:01

19/04/2021

Streaming models for joint speech recognition and translation

Orion Weller, Matthias Sperber, Christian Gollan, Joris Kluivers

5:11

02/11/2020

Ensemble of sequence matching networks for dynamic sound event localization, detection, and tracking

Thi Ngoc Tho Nguyen, Douglas L. Jones, Woon Seng Gan

11:06

18/07/2021

Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech

Vadim Popov, Ivan Vovk, Vladimir Gogoryan, Tasnima Sadekova, Mikhail Kudinov

Keywords: Applications, Audio and Speech Processing

5:12

05/01/2021

Facial Emotion Recognition With Noisy Multi-Task Annotations

Siwei Zhang, Zhiwu Huang, Danda Pani Paudel, Luc Van Gool

4:48

16/11/2020

Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data

Lingkai Kong, Haoming Jiang, Yuchen Zhuang, Jie Lyu, Tuo Zhao, Chao Zhang

Keywords: augmented training, in-distribution calibration, text classification, expectation error

11:47

06/12/2021

Estimating High Order Gradients of the Data Distribution by Denoising

Chenlin Meng, Yang Song, Wenzhe Li, Stefano Ermon

Keywords: generative model

7:31

06/12/2021

Diversity Enhanced Active Learning with Strictly Proper Scoring Rules

Wei Tan, Lan Du, Wray Buntine

Keywords: machine learning, active learning

13:21

02/11/2020

Self-supervised classification for detecting anomalous sounds

Ritwik Giri, Srikanth V. Tenneti, Fangzhou Cheng, Karim Helwani, Umut Isik, Arvindh Krishnaswamy

13:28

04/07/2020

MultiQT: Multimodal learning for real-time question tracking in speech

Jakob D. Havtorn, Jan Latko, Joakim Edin, Lars Maaløe, Lasse Borgholt, Lorenzo Belgrano, Nicolai Jacobsen, Regitze Sdun, Željko Agić

Keywords: real-time speech, labeling speech, emergency services, real-time labeling

11:07

19/04/2021

SpanEmo: Casting multi-label emotion classification as span-prediction

Hassan Alhuzali, Sophia Ananiadou

10:03

11/10/2020

A Simple Method for User-driven Music Thumbnailing

Arianne N. van Nieuwenhuijsen, John Ashley Burgoyne, Frans Wiering, Mick Sneekes

Keywords: MIR tasks, Music summarization, Applications, Music retrieval systems, Human-centered MIR, User behavior analysis and mining, user modeling

3:39