11/10/2020

Learning Interpretable Representation for Controllable Polyphonic Music Generation

Ziyu Wang, Dingsu Wang, Yixiao Zhang, Gus Xia

Keywords: Applications, Music composition, performance, and production, Domain knowledge, Machine learning/Artificial intelligence for music, Representations of music, MIR fundamentals and methodology, Symbolic music processing

Abstract: While deep generative models have become the leading methods for algorithmic composition, it remains a challenging problem to control the generation process because the latent variables of most deep-learning models lack good interpretability. Inspired by the content-style disentanglement idea, we design a novel architecture, under the VAE framework, that effectively learns two interpretable latent factors of polyphonic music: chord and texture. The current model focuses on learning 8-beat long piano composition segments. We show that such chord-texture disentanglement provides a controllable generation pathway leading to a wide spectrum of applications, including compositional style transfer, texture variation, and accompaniment arrangement. Both objective and subjective evaluations show that our method achieves a successful disentanglement and high quality controlled music generation.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at ISMIR 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers