Audeo: Audio Generation for a Silent Performance Video

Abstract: We present a novel system that takes as input video frames of a musician playing the piano and generates the music for that video. The generation of music from visual cues is a challenging problem, and it is not clear whether it is an attainable goal at all. Our main aim in this work is to explore the plausibility of such a transformation and to identify cues and components able to carry the association of sounds with visual events. To achieve the transformation, we built a full pipeline named 'Audeo' containing three components. We first translate the video frames of the keyboard and the musician's hand movements into a raw mechanical symbolic musical representation, the Piano-Roll (Roll), which records for each video frame the keys pressed at that time step. We then adapt the Roll to be amenable to audio synthesis by including temporal correlations. This step turns out to be critical for meaningful audio generation. In the last step, we implement MIDI synthesizers to generate realistic music. Audeo converts video to audio smoothly and clearly with only a few setup constraints. We evaluate Audeo on piano performance videos collected from YouTube and find that the generated music is of reasonable audio quality and can be successfully recognized with high precision by popular music identification software.
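
To make the Roll representation concrete, the following is a minimal sketch of how a frame-level binary Piano-Roll could be turned into note events (pitch, onset, offset) of the kind a MIDI synthesizer consumes. It is not the authors' method: the abstract's adaptation step adds temporal correlations, which this sketch does not attempt. The function name `roll_to_notes`, the default frame rate of 25 fps, and the mapping of key index 0 to MIDI pitch 21 (A0, the lowest key of an 88-key piano) are assumptions made for illustration.

```python
def roll_to_notes(roll, fps=25.0, lowest_midi_pitch=21):
    """Convert a frame-level binary piano roll into note events.

    roll: list of frames; each frame is a sequence of 0/1 flags, one per key.
    fps: assumed video frame rate (hypothetical default).
    lowest_midi_pitch: MIDI pitch assigned to key index 0 (assumed 21 = A0).
    Returns a list of (midi_pitch, onset_seconds, offset_seconds) tuples.
    """
    if not roll:
        return []
    n_keys = len(roll[0])
    notes = []
    for key in range(n_keys):
        start = None  # frame index where the current press began
        for t, frame in enumerate(roll):
            pressed = bool(frame[key])
            if pressed and start is None:
                start = t  # key goes down: note onset
            elif not pressed and start is not None:
                # key comes up: close the note at this frame
                notes.append((lowest_midi_pitch + key, start / fps, t / fps))
                start = None
        if start is not None:
            # key still held in the last frame: end the note with the video
            notes.append((lowest_midi_pitch + key, start / fps, len(roll) / fps))
    # order by onset time, then pitch
    notes.sort(key=lambda n: (n[1], n[0]))
    return notes


# Toy example: 3 keys, 4 frames, 2 frames per second, key 0 mapped to MIDI 60.
toy_roll = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
toy_notes = roll_to_notes(toy_roll, fps=2.0, lowest_midi_pitch=60)
```

A raw roll read frame by frame this way yields only mechanical on/off events, which is one way to see why the abstract describes a separate step that adapts the Roll before synthesis.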

Kun Su, Xiulong Liu, Eli Shlizerman

Similar Papers

Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation

Lincheng Li, Suzhen Wang, Zhimeng Zhang, Yu Ding, Yixing Zheng, Xin Yu, Changjie Fan

Improved Handling of Repeats and Jumps in Audio-sheet Image Synchronization

Mengyi Shan, Timothy Tsai

Keywords: MIR tasks, Alignment, synchronization, and score following, MIR fundamentals and methodology, Multimodality, Music signal processing

How Does it Sound?

Kun Su, Xiulong Liu, Eli Shlizerman

Keywords: transformers

DrumGAN: Synthesis of Drum Sounds with Timbral Feature Conditioning Using Generative Adversarial Networks

Javier Nistal, Stefan Lattner, Gaël Richard

Keywords: MIR tasks, Music synthesis and transformation, Domain knowledge, Machine learning/Artificial intelligence for music, Human-centered MIR, Human-computer interaction and interfaces

Music Creation by Example

Emma Frid, Celso Gomes, Zeyu Jin

Keywords: music generation, artificial intelligence, mixed-initiative interaction, algorithmic composition

Dance Beat Tracking from Visual Information Alone

Fabrizio Pedersoli, Masataka Goto

Keywords: Musical features and properties, Rhythm, beat, tempo, MIR fundamentals and methodology, Multimodality

POP909: a Pop-song Dataset for Music Arrangement Generation

Ziyu Wang, Ke Chen, Junyan Jiang, Yiyi Zhang, Maoran Xu, Shuqi Dai, Gus Xia

Multi-instrument Music Transcription Based on Deep Spherical Clustering of Spectrograms and Pitchgrams

Keitaro Tanaka, Takayuki Nakatsuka, Ryo Nishikimi, Kazuyoshi Yoshii, Shigeo Morishima

Keywords: MIR tasks, Music transcription and annotation

Composer Style Classification of Piano Sheet Music Images Using Language Model Pretraining

Timothy Tsai, Kevin Ji

Keywords: Musical features and properties, Musical style and genre, Domain knowledge, Machine learning/Artificial intelligence for music, Representations of music, MIR fundamentals and methodology, Symbolic music processing, MIR tasks, Automatic classification

Polyphonic Piano Transcription Using Autoregressive Multi-state Note Model

Taegyun Kwon, Dasaem Jeong, Juhan Nam

Keywords: MIR tasks, Music transcription and annotation, Applications, Music composition, performance, and production, Music retrieval systems, MIR fundamentals and methodology, Music signal processing

Example-driven virtual cinematography by learning camera behaviors

Hongda Jiang, Bin Wang, Xi Wang, Marc Christie, Baoquan Chen

Keywords: camera behaviors, machine learning, virtual cinematography

LARNet: Latent Action Representation for Human Action Synthesis

Naman Biyani, Aayush Jung Bahadur Rana, Shruti Vyas, Yogesh Rawat

Keywords: action synthesis, video synthesis, joint generative model, human action generation, end-to-end learning, conditional video generation

Skipping the Frame-Level: Event-Based Piano Transcription With Neural Semi-CRFs

Yujia Yan, Frank Cwitkowitz, Zhiyao Duan

PIANOTREE VAE: Structured Representation Learning for Polyphonic Music

Ziyu Wang, Yiyi Zhang, Yixiao Zhang, Junyan Jiang, Ruihan Yang, Gus Xia, Junbo Zhao

Keywords: Applications, Music composition, performance, and production, Domain knowledge, Machine learning/Artificial intelligence for music, Representations of music, MIR fundamentals and methodology, Symbolic music processing

Score Following with Hidden Tempo Using a Switching State-space Model

Yucong Jiang

Keywords: MIR tasks, Alignment, synchronization, and score following, Domain knowledge, Machine learning/Artificial intelligence for music, MIR fundamentals and methodology, Music signal processing

CATER: A diagnostic dataset for Compositional Actions & TEmporal Reasoning

Rohit Girdhar, Deva Ramanan

Keywords: Video Understanding, Temporal Reasoning

Conlon: a Pseudo-song Generator Based on a New Pianoroll, Wasserstein Autoencoders, and Optimal Interpolations

Luca Angioloni, Valentijn Borghuis, Lorenzo Brusci, Paolo Frasconi

Keywords: Domain knowledge, Machine learning/Artificial intelligence for music, Applications, Music composition, performance, and production

A Scalable Reasoning and Learning Approach for Neural-Symbolic Stream Fusion

Danh Le-Phuoc, Thomas Eiter, Anh Le-Tuan

Speech2Video Synthesis with 3D Skeleton Regularization and Expressive Body Poses

Miao Liao, Sibo Zhang, Peng Wang, Hao Zhu, Xinxin Zuo, Ruigang Yang

Synthesizer: Rethinking Self-Attention for Transformer Models

Yi Tay, Dara Bahri, Don Metzler, Da-Cheng Juan, Zhe Zhao, Che Zheng

Keywords: Deep Learning, Architectures

DOPS: Learning to Detect 3D Objects and Predict Their 3D Shapes

Mahyar Najibi, Guangda Lai, Abhijit Kundu, Zhichao Lu, Vivek Rathod, Thomas Funkhouser, Caroline Pantofaru, David Ross, Larry S. Davis, Alireza Fathi