
RNN_MIDI_Composer

Training an LSTM on Indonesian folk songs in MIDI format to compose new MIDI music.

Dependencies

  • numpy
  • pandas
  • pytorch==0.4.1
  • plac
  • tqdm
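
Assuming a standard pip setup, the pure-Python dependencies can be installed with:

pip install numpy pandas plac tqdm

PyTorch 0.4.1 is an old release; install the build matching your platform from https://pytorch.org (on many platforms pip install torch==0.4.1 works, but this is not guaranteed).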

How to prepare your data

Convert your MIDI files into .csv using Midicsv[1] and put them in a folder (by default, the dataset folder). It is recommended to remove channels that contain repetitive music (usually background sounds such as drums and snares) so the RNN does not produce uninteresting, repetitive output. This data cleaning was performed on the Indonesian Folk Song dataset.
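
For example, assuming the Midicsv tools are on your PATH and a hypothetical input file song.mid:

midicsv song.mid dataset/song.csv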

How to train model

The training is executed through a command-line interface (CLI). Check the CLI help documentation with:

python train.py -h

You may also use the default values by simply running

python train.py

You can visualize the model's performance using the Music Composer.ipynb notebook while training.
Note: The program will keep running until you interrupt it with Ctrl+C.

Parameters in Training configuration

  • n_hidden / -nh
    Number of hidden units.
  • n_layers / -nl
    Number of hidden layers.
  • bs / -bs
    Batch size.
  • seq_len / -sl
    Length of the input sequence.
  • lr / -lr
    Learning rate.
  • d_out / -do
    Dropout rate.
  • save_every / -se
    Number of steps between model checkpoints.
  • print_every / -pe
    Number of steps between printouts of training information (loss, etc.).
  • name / -o
    Folder name for the model. A new folder with this name is created if it does not exist.
  • midi_source_folder / -i
    Folder name for the data. It must contain .csv files in Midicsv[1] format.
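
For example, the defaults can be overridden explicitly (the values below are illustrative, and my_model and dataset are hypothetical folder names):

python train.py -nh 256 -nl 2 -bs 32 -sl 100 -lr 0.001 -do 0.5 -se 100 -pe 50 -o my_model -i dataset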

How to compose music

Use the Music Composer.ipynb notebook. Load the model, then set your desired configuration.

I have prepared some generated music in the sample folder. Use Midicsv[1] to convert it back to a MIDI file; then you can open it with any common MIDI player, or try MidiEditor[2].
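
The Midicsv package[1] also includes a csvmidi tool for the reverse conversion; assuming a hypothetical generated file sample/generated.csv:

csvmidi sample/generated.csv generated.mid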

Parameters in Composing configuration

  • fname
    The file name for the generated music (.csv). You need to convert it back to .mid using Midicsv[1].
  • prime
    The priming text fed to the RNN before it starts composing.
  • top_k
    Take the k most probable predictions and randomly choose among them. top_k = 1 means we always use the most probable character. A higher top_k produces more creative music (relative to the dataset); around 3-5 is recommended. If top_k is too large, the prediction may not follow the format required to convert it back to .mid (see the sketch after this list).
  • compose_len
    Number of characters to compose. One music note takes 8-14 characters.
  • channel
    The MIDI channels and track numbers. For example, [0, 1, 2] means three channels, on Tracks 0, 1, and 2.
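
To illustrate how top_k works, here is a minimal sampling sketch (not the notebook's actual code; probs is assumed to be the network's output probability distribution over characters):

import numpy as np

def sample_top_k(probs, k):
    # indices of the k largest probabilities
    top = np.argsort(probs)[-k:]
    # renormalize over the top k and sample one index
    p = probs[top] / probs[top].sum()
    return np.random.choice(top, p=p)

# k = 1 always picks the argmax; larger k allows less probable, more "creative" choices
probs = np.array([0.5, 0.3, 0.1, 0.05, 0.05])
print(sample_top_k(probs, 3))  # prints one of 0, 1, or 2

With k = 1 this reduces to a plain argmax, which is why top_k = 1 always produces the most probable character.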

Troubleshooting

  • If Retry music composing... keeps popping up
    The model's output does not follow the expected format. For example, we want C5-512-1024, but the model generated C5--512-1024; by analogy with char-RNN paragraph generation, it is like a typo.
    You can try using fewer channels, decreasing top_k, decreasing compose_len, training longer, or getting more data. A lower top_k helps because the model then follows the proper format of the data instead of generating characters more randomly; likewise, longer training and more data help it properly learn the format. A lower compose_len simply reduces the chance of hitting the problem. Fewer channels is a must: the more channels you try to generate, the higher the chance the model breaks the format.
  • If the model replicates music from the dataset
    It is overfitting. You can try decreasing the model complexity (lower n_hidden, n_layers, seq_len), choosing a checkpoint from an earlier epoch (a higher-loss model), or increasing d_out.
  • If the generated music sounds like gibberish
    Your data may be too complex. Try a more homogeneous dataset.

Sample Result

Have a listen: generated samples are provided in the sample folder.

(Figure: loss history)

Note: You do not have to push the loss to its minimum to generate good music.

References

This project would not have succeeded without these references. Thank you indeed!

[1] Midicsv: https://www.fourmilab.ch/webtools/midicsv/
[2] MidiEditor: https://www.midieditor.org/