Hao-Wen Dong, Yi-Hsuan Yang

Music and AI Lab,
Research Center for IT Innovation,
Academia Sinica

Model

The proposed model consists of a generator G and a discriminator D. The generator takes as input a random vector z drawn from a prior distribution p_z and generates a fake sample G(z). The discriminator takes as input either a real sample drawn from the data distribution or a fake sample produced by the generator, and outputs a scalar indicating the genuineness of that sample. The discriminator is trained to tell fake samples from real ones, while the generator is trained to fool the discriminator.
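The data flow described above can be sketched in a few lines of plain Python. This is only an illustrative toy, not the authors' implementation: the linear maps, the dimensions, the standard normal prior for p_z, and the made-up "real" sample are all assumptions introduced here, and no training step is shown.

```python
import random

def sample_prior(dim, rng):
    """Draw a latent vector z from the prior p_z (assumed standard normal here)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def generator(z, weights):
    """Toy linear generator G: maps a latent vector z to a sample G(z)."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in weights]

def discriminator(x, weights):
    """Toy linear discriminator D: maps a sample x to a single scalar score."""
    return sum(w * xi for w, xi in zip(weights, x))

rng = random.Random(0)
latent_dim, data_dim = 4, 6  # hypothetical sizes chosen for illustration

# Randomly initialised (untrained) parameters for G and D.
g_weights = [[rng.uniform(-1, 1) for _ in range(latent_dim)] for _ in range(data_dim)]
d_weights = [rng.uniform(-1, 1) for _ in range(data_dim)]

z = sample_prior(latent_dim, rng)
fake = generator(z, g_weights)         # fake sample G(z)
real = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0]  # made-up stand-in for a real binary sample

score_fake = discriminator(fake, d_weights)  # D's scalar for the fake sample
score_real = discriminator(real, d_weights)  # D's scalar for the real sample
```

In an actual training loop, D would be updated to separate `score_real` from `score_fake`, and G would be updated to move `score_fake` toward the scores D assigns to real data.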

To handle binary data, we propose using binary neurons, either deterministic or stochastic, at the output layer (i.e., the final layer) of the generator. We employ sigmoid-adjusted straight-through estimators to estimate the gradients for the binary neurons and train the whole network end to end by backpropagation.
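As a minimal sketch of the two kinds of binary neurons and the sigmoid-adjusted straight-through estimator, the following plain-Python example works on scalar pre-activations. It assumes the common formulation in which the forward pass thresholds or samples from sigmoid(x), and the backward pass treats the neuron as if it were sigmoid(x). The function names and example inputs are hypothetical, and a real implementation would use an automatic differentiation framework.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def deterministic_binary_neuron(x):
    """Forward pass: output 1 if sigmoid(x) exceeds 0.5, else 0."""
    return 1.0 if sigmoid(x) > 0.5 else 0.0

def stochastic_binary_neuron(x, rng):
    """Forward pass: output 1 with probability sigmoid(x), else 0."""
    return 1.0 if rng.random() < sigmoid(x) else 0.0

def straight_through_grad(x, upstream_grad):
    """Backward pass (sigmoid-adjusted straight-through estimator).

    The non-differentiable binarization is ignored and the gradient of
    sigmoid(x), i.e. s * (1 - s), is used in its place.
    """
    s = sigmoid(x)
    return upstream_grad * s * (1.0 - s)

rng = random.Random(0)
preactivations = [-2.0, -0.1, 0.0, 0.3, 4.0]  # made-up generator outputs

hard = [deterministic_binary_neuron(x) for x in preactivations]
sampled = [stochastic_binary_neuron(x, rng) for x in preactivations]
grads = [straight_through_grad(x, 1.0) for x in preactivations]
```

Both neuron types emit exact 0/1 values in the forward pass, while the estimated gradient stays nonzero everywhere, which is what allows the generator to be trained by backpropagation despite the binarization.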

The following is the system diagram for the proposed model, implemented with multilayer perceptrons (MLPs).

[Figure: system diagram of the proposed model]