Hao-Wen Dong, Yi-Hsuan Yang

Music and AI Lab,
Research Center for IT Innovation,
Academia Sinica


Lakh Pianoroll Dataset

We use the cleansed version of the Lakh Pianoroll Dataset (LPD). LPD contains 174,154 unique multitrack pianorolls derived from the MIDI files in the Lakh MIDI Dataset (LMD). The cleansed version contains 21,425 pianorolls that are in 4/4 time and have been matched to distinct entries in the Million Song Dataset (MSD).

Training Data

The size of the target output tensor is 4 (bars) × 96 (time steps) × 84 (pitches) × 8 (tracks).
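As a rough illustration of the tensor size stated above, the following minimal Python sketch writes the four dimensions out as a shape tuple and counts the entries in one training sample. The constant and function names are hypothetical and not taken from any released code, and reading the 96 time steps as "per bar" is an assumption about the layout.

```python
from math import prod

# Dimension sizes as stated in the text; all names here are illustrative only.
N_BARS = 4
N_TIME_STEPS = 96  # assumed to be the number of time steps in each bar
N_PITCHES = 84
N_TRACKS = 8

# Axis order follows the order in which the text lists the dimensions.
TARGET_SHAPE = (N_BARS, N_TIME_STEPS, N_PITCHES, N_TRACKS)


def n_entries(shape):
    """Return the total number of entries in a tensor of the given shape."""
    return prod(shape)


def steps_per_sample(shape):
    """Return the number of time steps in one sample (bars times steps per bar)."""
    n_bars, n_steps = shape[0], shape[1]
    return n_bars * n_steps


if __name__ == "__main__":
    print("shape:", TARGET_SHAPE)
    print("entries per sample:", n_entries(TARGET_SHAPE))
    print("time steps per sample:", steps_per_sample(TARGET_SHAPE))
```

Under these assumptions, one sample spans 384 time steps and holds 258,048 entries across all pitches and tracks.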

The following are six sample pianorolls from our training data, where each block represents a bar. The tracks are (from top to bottom): Drums, Piano, Guitar, Bass, Ensemble, Reed, Synth Lead and Synth Pad.