PhD Student
Computer Science & Engineering
University of California San Diego


Arranger – Towards Automatic Instrumentation by Learning to Separate Parts in Symbolic Multitrack Music

Collaborator: Chris Donahue
Advisors: Prof. Taylor Berg-Kirkpatrick and Prof. Julian McAuley

Arranger is a project on automatic instrumentation. In a nutshell, we aim to dynamically assign a suitable instrument to each note in solo music. Such an automatic instrumentation model could empower a musician to play multiple instruments on a keyboard at the same time. It could also assist a composer by suggesting suitable instrumentation for a solo piece. Our proposed models outperform various baseline models and are able to produce alternative convincing instrumentations for existing arrangements. Check out the demo!


The music is visualized in the piano-roll representation, where the x- and y-axes represent time and pitch, respectively. Colors indicate the instruments.

By downmixing a symbolic multitrack into a single-track mixture, we acquire paired data of solo music and its instrumentation. We then use these paired data to train a part separation model that aims to infer the part label (e.g., one out of the five instruments in this example) for each single note in a mixture. Automatic instrumentation can subsequently be accomplished by treating input from a keyboard player as a downmixed mixture (bottom) and separating out the relevant parts (top).
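
The downmixing step described above can be sketched in a few lines of Python. This is only an illustration of how paired data (mixture notes plus part labels) might be built. The `Note` type, the `downmix` function and the toy tracks are hypothetical names introduced here, not part of the Arranger codebase.

```python
from typing import Dict, List, NamedTuple, Tuple


class Note(NamedTuple):
    """A hypothetical note record used only for this sketch."""
    time: int      # onset time in time steps
    pitch: int     # MIDI pitch number
    duration: int  # length in time steps


def downmix(tracks: Dict[str, List[Note]]) -> Tuple[List[Note], List[str]]:
    """Merge all tracks into one mixture and keep the part label of each note."""
    labeled = [(note, name) for name, notes in tracks.items() for note in notes]
    # Sort by onset time, then pitch, so the mixture reads like a solo part.
    labeled.sort(key=lambda pair: (pair[0].time, pair[0].pitch))
    mixture = [note for note, _ in labeled]
    labels = [name for _, name in labeled]
    return mixture, labels


# Toy two-part example (made-up notes).
tracks = {
    "violin": [Note(0, 76, 4), Note(4, 74, 4)],
    "cello": [Note(0, 48, 8)],
}
mixture, labels = downmix(tracks)
```

A part separation model would then be trained to predict `labels` from `mixture`; at inference time, the notes played on a keyboard take the place of `mixture`.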

Hao-Wen Dong, Chris Donahue, Taylor Berg-Kirkpatrick and Julian McAuley
Proceedings of the 22nd International Society for Music Information Retrieval Conference (ISMIR), in press, 2021
homepage   video   paper   slides   slides (long)   arXiv   code   reviews

Flow-based Deep Generative Models

Collaborator: Jiarui Xu

We investigate flow-based deep generative models. We first compare different generative models, especially generative adversarial networks (GANs), variational autoencoders (VAEs) and flow-based generative models. We then survey different normalizing flow models, including non-linear independent components estimation (NICE), real-valued non-volume preserving (RealNVP) transformations, generative flow with invertible 1×1 convolutions (Glow), masked autoregressive flow (MAF) and inverse autoregressive flow (IAF). Finally, we conduct experiments on generating MNIST handwritten digits using NICE and RealNVP to examine the effectiveness of flow-based models. For more details, please refer to the report and the slides (pdf).
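
As a rough illustration of why such models are invertible, here is a minimal sketch of a NICE-style additive coupling layer in plain Python. It is not the code used in the report; `toy_shift` is a hypothetical stand-in for the neural network that a real model would learn.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def coupling_forward(x1: Vector, x2: Vector,
                     m: Callable[[Vector], Vector]) -> Tuple[Vector, Vector]:
    """Additive coupling: y1 = x1, y2 = x2 + m(x1)."""
    shift = m(x1)
    return list(x1), [b + s for b, s in zip(x2, shift)]


def coupling_inverse(y1: Vector, y2: Vector,
                     m: Callable[[Vector], Vector]) -> Tuple[Vector, Vector]:
    """Exact inverse: x1 = y1, x2 = y2 - m(y1). No inversion of m is needed."""
    shift = m(y1)
    return list(y1), [b - s for b, s in zip(y2, shift)]


def toy_shift(h: Vector) -> Vector:
    """Hypothetical shift function; any function of h keeps the layer invertible."""
    total = sum(h)
    return [math.tanh(total) + v * v for v in h]


x1, x2 = [0.5, -1.0], [2.0, 0.25]
y1, y2 = coupling_forward(x1, x2, toy_shift)
r1, r2 = coupling_inverse(y1, y2, toy_shift)
```

Because the first half passes through unchanged and the second half is only shifted, the Jacobian is triangular with ones on the diagonal, so its log-determinant is zero. This is the volume-preserving property that RealNVP relaxes by adding a scaling term.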

MusPy – A Toolkit for Symbolic Music Generation

Collaborator: Ke Chen
Advisors: Prof. Julian McAuley and Prof. Taylor Berg-Kirkpatrick

MusPy is an open source Python library for symbolic music generation. It provides essential tools for developing a music generation system, including dataset management, data I/O, data preprocessing and model evaluation.



To learn more about MusPy, please visit the demo page.

Hao-Wen Dong, Ke Chen, Julian McAuley, and Taylor Berg-Kirkpatrick
Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR), 2020
homepage   video   paper   slides   poster   arXiv   code   documentation   reviews

Music Chord Progression Analysis

Collaborator: Chun-Jhen Lai


We analyze the statistical properties of chords and chord progressions in over ten thousand songs available on the HookTheory platform. In particular, we analyze the prior probabilities of different chords, the transition probabilities between chords and the most frequent chord progressions. We also examine how these statistics differ from genre to genre, with an eye to revealing trends in the usage of chords and chord progressions in different genres. For more details, please refer to the report.
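
The three statistics mentioned above can be computed with simple counting. The sketch below assumes each song is a list of chord symbols; the three toy songs are made up for illustration and are not taken from the HookTheory data.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical toy corpus: each song is a sequence of chord symbols.
songs = [
    ["C", "G", "Am", "F"],
    ["C", "F", "G", "C"],
    ["Am", "F", "C", "G"],
]


def chord_priors(songs: List[List[str]]) -> Dict[str, float]:
    """Relative frequency of each chord over the whole corpus."""
    counts = Counter(chord for song in songs for chord in song)
    total = sum(counts.values())
    return {chord: n / total for chord, n in counts.items()}


def transition_probs(songs: List[List[str]]) -> Dict[Tuple[str, str], float]:
    """Estimate P(next chord | current chord) from adjacent chord pairs."""
    pairs = Counter((a, b) for song in songs for a, b in zip(song, song[1:]))
    outgoing = Counter()
    for (a, _), n in pairs.items():
        outgoing[a] += n
    return {(a, b): n / outgoing[a] for (a, b), n in pairs.items()}


def top_progressions(songs: List[List[str]], length: int, k: int):
    """Most frequent chord n-grams of the given length."""
    grams = Counter(
        tuple(song[i:i + length])
        for song in songs
        for i in range(len(song) - length + 1)
    )
    return grams.most_common(k)


priors = chord_priors(songs)
transitions = transition_probs(songs)
top = top_progressions(songs, length=2, k=1)
```

Running the same functions on per-genre subsets of the corpus would give the genre-level comparison.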

DANTest – Towards a Deeper Understanding of Adversarial Losses

Advisor: Dr. Yi-Hsuan Yang


In this project, we aim to gain a deeper understanding of adversarial losses by decoupling the effects of their component functions and regularization terms. In essence, we aim to answer the following two research questions:

For the first question, we derive some necessary and sufficient conditions on the component functions such that the adversarial loss is a divergence-like measure between the data and the model distributions. For the second question, we propose a new, simple framework called DANTest for comparing different adversarial losses. With DANTest, we are able to decouple the effects of the component functions and the regularization approaches.

To learn more about this project, please visit the demo page.

Hao-Wen Dong and Yi-Hsuan Yang
arXiv preprint arXiv:1901.08753, 2019
homepage   paper   arXiv   code

BinaryGAN – Modeling high-dimensional binary-valued data with GANs

Advisor: Dr. Yi-Hsuan Yang


BinaryGAN is a novel generative adversarial network (GAN) that uses binary neurons at the output layer of the generator. We employ sigmoid-adjusted straight-through estimators to estimate the gradients for the binary neurons and train the whole network by end-to-end backpropagation. The proposed model is able to directly generate binary-valued predictions at test time.
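
To give a feel for the idea, here is a minimal scalar sketch of a binary neuron with a sigmoid-adjusted straight-through estimator, written in plain Python rather than a deep learning framework. It assumes the common formulation in which the forward pass binarizes the sigmoid output and the backward pass uses the derivative of the sigmoid in place of the (zero almost everywhere) derivative of the binarization. The function names are hypothetical and this is not the BinaryGAN implementation.

```python
import math
import random
from typing import Optional


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def binary_neuron_forward(x: float, stochastic: bool = False,
                          rng: Optional[random.Random] = None) -> float:
    """Binarize the sigmoid output by thresholding or by Bernoulli sampling."""
    p = sigmoid(x)
    if stochastic:
        rng = rng or random.Random()
        return 1.0 if rng.random() < p else 0.0
    return 1.0 if p > 0.5 else 0.0


def binary_neuron_backward(x: float, upstream_grad: float) -> float:
    """Assumed sigmoid-adjusted straight-through gradient.

    The binarization step is treated as the identity, so the gradient
    with respect to x is upstream_grad * sigmoid'(x).
    """
    p = sigmoid(x)
    return upstream_grad * p * (1.0 - p)


out_pos = binary_neuron_forward(2.0)
out_neg = binary_neuron_forward(-2.0)
grad_at_zero = binary_neuron_backward(0.0, upstream_grad=1.0)
```

The forward outputs are exactly 0 or 1, while the backward pass still provides a usable gradient signal for the generator.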

To learn more about BinaryGAN, please visit the demo page.

Hao-Wen Dong and Yi-Hsuan Yang
arXiv preprint arXiv:1810.04714, 2018
homepage   paper   slides   arXiv   code

Pypianoroll – Open source Python library for handling multitrack piano rolls

Collaborator: Wen-Yi Hsiao
Advisor: Dr. Yi-Hsuan Yang

Pypianoroll is an open source Python library for working with piano rolls. It provides essential tools for handling multitrack piano rolls, including efficient I/O as well as manipulation, visualization and evaluation tools.



Hao-Wen Dong, Wen-Yi Hsiao, and Yi-Hsuan Yang
Late-Breaking Demos of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018
homepage   paper   poster   code   documentation

BinaryMuseGAN – Convolutional Generative Adversarial Networks with Binary Neurons for Polyphonic Music Generation

Advisor: Dr. Yi-Hsuan Yang

BinaryMuseGAN is a follow-up project of the MuseGAN project. In this project, we first investigate how the real-valued piano-rolls generated by the generator may lead to difficulties in training the discriminator for CNN-based models. To overcome the binarization issue, we propose to append to the generator an additional refiner network, which tries to refine the real-valued predictions generated by the pretrained generator into binary-valued ones. The proposed model is able to directly generate binary-valued piano-rolls at test time.


We trained the network with training data collected from the Lakh Pianoroll Dataset. We used the model to generate four-bar musical phrases consisting of eight tracks: Drums, Piano, Guitar, Bass, Ensemble, Reed, Synth Lead and Synth Pad. Audio samples are available here.

To learn more about BinaryMuseGAN, please visit the demo page.

Hao-Wen Dong and Yi-Hsuan Yang
Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018
homepage   video   paper   slides   slides (long)   poster   arXiv   code   reviews

MuseGAN – Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment

Collaborators: Wen-Yi Hsiao and Li-Chia Yang
Advisor: Dr. Yi-Hsuan Yang

MuseGAN is a project on music generation. In a nutshell, we aim to generate polyphonic music of multiple tracks (instruments). The proposed models are able to generate music either from scratch, or by accompanying a track given a priori by the user.

We train the model with training data collected from the Lakh Pianoroll Dataset to generate pop song phrases consisting of bass, drums, guitar, piano and strings tracks.


Listen to some of the best samples. (more results)

To learn more about MuseGAN, please visit the demo page.

Hao-Wen Dong,* Wen-Yi Hsiao,* Li-Chia Yang, and Yi-Hsuan Yang (*equal contribution)
Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI), 2018
homepage   paper   slides   arXiv   code

Meow Meow – A Smart Pet Interaction System

Embedded Systems Labs, 2017 Spring class at NTUEE
Collaborator: Yu-Hsuan Teng

Meow Meow, an intelligent pet interaction system, allows you to remotely monitor the temperature and humidity, control the light and fan and, on top of that, interact with your pets. We use two Tessel 2 boards to build the environment monitoring system and the interactive feeding system (one board for each system).


To learn more about Meow Meow, please visit the demo page.

A Game Theoretic Model for User Preference-Aware Resource Pricing in Wireless Mobile Networks

Undergraduate research at NTUEE
Advisor: Prof. Hung-Yu Wei


We study user preference-aware resource pricing in wireless mobile networks. We model this problem as a two-stage Stackelberg game between multiple users and an internet service provider (ISP). We consider different user types, and users can have their own preferences over resource types. With this game-theoretic model, we are able to derive the optimal prices for different resource types that maximize the ISP's profit.
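
The two-stage structure can be illustrated with a toy example solved by backward induction. Everything specific below is an assumption made for illustration and is not the model from the slides: each user is assumed to have a logarithmic utility `w * log(1 + x) - p * x` for a single resource type, the ISP is assumed to have a constant unit cost, and the price is found by a grid search.

```python
from typing import List


def best_response(weight: float, price: float) -> float:
    """Stage 2 (follower): demand maximizing w * log(1 + x) - p * x.

    Setting the derivative w / (1 + x) - p to zero gives x = w / p - 1,
    clipped at zero because demand cannot be negative.
    """
    return max(weight / price - 1.0, 0.0)


def isp_profit(price: float, cost: float, weights: List[float]) -> float:
    """Stage 1 (leader): profit when every user plays the best response."""
    return sum((price - cost) * best_response(w, price) for w in weights)


def optimal_price(cost: float, weights: List[float],
                  step: float = 0.01, max_price: float = 10.0) -> float:
    """Grid search over candidate prices, anticipating the users' responses."""
    n_steps = int(round(max_price / step))
    candidates = [step * k for k in range(1, n_steps + 1)]
    return max(candidates, key=lambda p: isp_profit(p, cost, weights))


# One user with preference weight 4 and unit cost 1 (made-up numbers).
single_user_price = optimal_price(cost=1.0, weights=[4.0])

# Two hypothetical user types with different preference weights.
mixed_price = optimal_price(cost=1.0, weights=[4.0, 9.0])
```

For a single user, the profit `(p - c) * (w / p - 1)` is maximized at `p = sqrt(c * w)`, so the grid search should land near 2 in the first call. With several resource types, the same search could be repeated per resource type under the assumption that utilities are separable.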


For more information, please see these slides.

Reverse Ordering in Dynamical 2D Hopper Flow

High school science fair project in Physics
Collaborator: Chen-Chieh Ping
Advisors: Dr. Kiwing To, Dr. Yu-Cheng Chien and Mrs. Yan-Ping Zhang

We study the exit ordering of grains in gravity-driven flow through two-dimensional hoppers of different hopper angles and outlet sizes with an adjustable reclining angle. We observe a reverse ordering phenomenon in which grains entering the hopper at earlier times may not come out earlier. We record the entry order and exit order of the grains and calculate the degree of reverse ordering, which is found to increase with increasing hopper angle, decreasing reclining angle, and increasing hopper outlet size.

In order to find the mechanism of reverse ordering, we construct maps which register the entry order and exit order according to the position of the grain in the hopper before the flow. By comparing the exit order map and the entry order map, we locate the regions where grains undergo reverse ordering. From the trajectories of the grains in the reverse ordering regions, we find that they take part in avalanches at the surface on their way to the exit. Hence, it is the dynamical process of surface avalanches that reverses the exit order of the grains when they flow out of the hopper.
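
The text does not spell out how the degree of reverse ordering is computed. One plausible way to quantify it, assumed here purely for illustration, is the fraction of grain pairs whose exit order is reversed relative to their entry order (a normalized inversion count). The function name is hypothetical.

```python
from itertools import combinations
from typing import Sequence


def reverse_ordering_degree(exit_ranks: Sequence[int]) -> float:
    """Fraction of grain pairs that leave in the opposite order they entered.

    exit_ranks[i] is the exit position of the grain that entered i-th.
    Returns 0.0 for perfect first-in-first-out and 1.0 for full reversal.
    This definition is an assumption, not necessarily the one used in the project.
    """
    n = len(exit_ranks)
    if n < 2:
        return 0.0
    inversions = sum(
        1 for i, j in combinations(range(n), 2) if exit_ranks[i] > exit_ranks[j]
    )
    return inversions / (n * (n - 1) / 2)


in_order = reverse_ordering_degree([0, 1, 2, 3])
reversed_order = reverse_ordering_degree([3, 2, 1, 0])
one_swap = reverse_ordering_degree([1, 0, 2, 3])
```

Under this definition, a single swapped pair among four grains gives a degree of one sixth, since there are six pairs in total.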

These results may be useful for special hopper design in agricultural and pharmaceutical industries to reduce or to enhance reverse ordering of materials for specific purposes.

Point, Line and Plane – Extrema of Area Enclosed by a Given Curve and a Variable Line Passing through a Fixed Point

High school science fair project in Mathematics
Advisors: Mrs. Yan-Ping Zhang and Mr. Chih-Chang Ou


We study the properties of the segment and area enclosed by a given curve and a variable line passing through a fixed point. We first examine the existence of an equally dividing line in different settings. We then show that for a convex curve and a point inside it, the area enclosed between the curve and a variable line passing through the point attains an extreme value if and only if the line is an equally dividing line. We also extend the analysis to nonconvex curves and find that this property holds only when the curve can be written as an explicit function in polar form using the fixed point as the origin.

Analysis of Gunmen’s Strategies

High school science fair project in Mathematics
Collaborator: Chen-Chieh Ping
Advisors: Mr. Chih-Chang Ou and Mrs. Yan-Ping Zhang


We study a game among three gunmen with the following rules: 1) they play in turn with no limit on the number of turns, 2) in each turn, the player can either pass the turn or shoot one of the other two players, 3) the game ends when there is only one player alive and 4) the only remaining player is the winner. In this work, we consider the case when the three gunmen each have their own precision and propose the decision box model for analyzing and visualizing their optimal strategies. We also extend our analysis to the imperfect information scenario.
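
The rules above are easy to simulate. The sketch below is a Monte Carlo simulation of one fixed, hypothetical strategy (never pass, always shoot the most precise surviving opponent). It is not the decision box model and does not compute optimal strategies. The precisions in the example are made-up numbers.

```python
import random
from typing import List, Sequence


def play_game(precisions: Sequence[float], rng: random.Random) -> int:
    """Play one game and return the index of the winner."""
    alive = [True, True, True]
    shooter = 0
    while sum(alive) > 1:
        if alive[shooter]:
            targets = [i for i in range(3) if alive[i] and i != shooter]
            # Assumed strategy: aim at the most precise surviving opponent.
            target = max(targets, key=lambda i: precisions[i])
            if rng.random() < precisions[shooter]:
                alive[target] = False
        shooter = (shooter + 1) % 3
    return alive.index(True)


def estimate_win_rates(precisions: Sequence[float], trials: int = 10000,
                       seed: int = 0) -> List[float]:
    """Estimate each player's winning probability under the fixed strategy."""
    if max(precisions) <= 0.0:
        raise ValueError("at least one player must be able to hit")
    rng = random.Random(seed)
    wins = [0, 0, 0]
    for _ in range(trials):
        wins[play_game(precisions, rng)] += 1
    return [w / trials for w in wins]


rates = estimate_win_rates([0.3, 0.5, 0.8])
sure_win = estimate_win_rates([1.0, 0.0, 0.0], trials=100)
```

Comparing such estimates across alternative strategies (for example, allowing a player to pass) is one way to explore how a player's best choice depends on the three precisions.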

Squeeze! Don’t Move! – Tight Configuration of Disks and their Circum-rectangle

High school science fair project in Mathematics
Collaborator: Chen-Chieh Ping
Advisors: Prof. Sen-Peng Eu, Mrs. Yan-Ping Zhang and Mr. Chih-Chang Ou

We study the tight configurations for n disks of the same size in their circum-rectangles. We find the biggest and smallest such rectangles when n ≤ 6 and the smallest rectangle for arbitrary n of certain configurations. We also propose several approaches for generating interesting tight configurations of any number of disks based on simple ones.