This repository contains all code that produced the results for the Bachelor's thesis. The thesis discusses the generation of new jazz chords from the training dataset (the iRealPro Corpus of Jazz Standards). The corpus contains around 1000 jazz songs with a chord vocabulary of 1007 distinct chords. The general pipeline is as follows:
To train and run the networks, open `Models.ipynb` and run the cells. It relies on already generated data, so the notebook `PreProcessing.ipynb` does not need to be run beforehand. All required modules are imported together at the top of the notebook and should be installed first. Hyperparameters, and even the architecture of the RNNs, can be adjusted. Chords are not generated by default; instead, they are loaded from `outputs/sequences/`, where they are stored in JSON format. To generate new sequences, change the function parameters accordingly.
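
For reference, loading the pre-generated sequences could look like the minimal sketch below. It assumes that the files in `outputs/sequences/` are plain JSON lists of chord sequences; the exact file names and structure in the repository may differ.

```python
import json
from pathlib import Path

# Minimal sketch: collect every generated sequence stored under outputs/sequences/.
# Assumes each JSON file holds a list of chord sequences (lists of chord symbols).
sequences = []
for path in sorted(Path("outputs/sequences").glob("*.json")):
    with path.open() as f:
        sequences.extend(json.load(f))

print(f"Loaded {len(sequences)} generated sequences")
```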
In the notebook `PreProcessing.ipynb`, the data is loaded and processed. First, the `**kern` structure is handled and the chords are arranged according to the sequence information in the header. The chords are then collected in a 2D list, which is processed further to simplify the chords, shrinking the chord vocabulary from 1007 to 115 distinct chords. Finally, the chords are saved to `data/processed/chords.json`.
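
The processed file can be inspected directly to verify the reduced vocabulary. This is a sketch under the assumption that `data/processed/chords.json` stores the 2D list of simplified chord symbols (one inner list per song):

```python
import json

# Sketch: load the processed corpus and count the distinct simplified chords.
with open("data/processed/chords.json") as f:
    songs = json.load(f)

vocab = {chord for song in songs for chord in song}
print(f"{len(songs)} songs, {len(vocab)} distinct chords")  # expected: roughly 1000 songs, 115 chords
```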
`Statistics.ipynb` gives some insights into the data and produces plots that are saved in `img/`.
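
A typical statistic is the chord-frequency distribution. The sketch below is only illustrative (the actual plots in `Statistics.ipynb` may differ), and the output file name is hypothetical:

```python
import json
from collections import Counter
import matplotlib.pyplot as plt

with open("data/processed/chords.json") as f:
    songs = json.load(f)

# Count how often each simplified chord appears across the whole corpus.
counts = Counter(chord for song in songs for chord in song)
labels, values = zip(*counts.most_common(20))

plt.figure(figsize=(10, 4))
plt.bar(labels, values)
plt.xticks(rotation=45, ha="right")
plt.ylabel("count")
plt.title("20 most frequent chords")
plt.tight_layout()
plt.savefig("img/chord_frequencies.png")  # hypothetical file name
```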
The notebook `Models.ipynb` contains all the training and generation logic. Two recurrent neural networks (an LSTM and a baseline RNN) are set up with the same hyperparameters and compared later. Here is the general structure:
- Load `chords.json`: the chords are loaded and tokenized to the datatype `int`. This is all done by functions found in `functions/utils.py`. The sequences are also padded.
- Add tokens to the data (see the tokenization sketch after this list):
  - `<BOS>`: Beginning-of-Sequence token (marks the start of the chord sequence)
  - `<EOS>`: End-of-Sequence token (marks the end of the chord sequence)
  - `pad`: Padding token (sequences are padded to the same length)
- Set up the RNN: an RNN with two linear layers is set up. Its sequences are not packed, unlike in the LSTM.
- Set up the LSTM: the same architecture as the RNN, but with sequence packing (see the packed-sequence sketch after this list).
- Train both models for 50 epochs.
- Generate new sequences, using multinomial sampling to pick the next element (see the sampling sketch after this list).
- Compare the results: distribution similarity, padding content, ...
- Save the generated chords to MIDI files: they can be found in the `midi` folder for listening. I recommend looking into `outputs/midi/arranged`, where the chords have been used to build an entire arrangement, or, for the chords alone (without arrangement), `outputs/midi/piano`.
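
To make the list above more concrete, here is a hedged sketch of the tokenization and padding step. The helper names are hypothetical (the actual implementations live in `functions/utils.py`); it only illustrates mapping chords to integer IDs, wrapping each song in `<BOS>`/`<EOS>`, and padding to a common length.

```python
import torch

PAD, BOS, EOS = "pad", "<BOS>", "<EOS>"

def build_vocab(songs):
    # Hypothetical helper: special tokens first, so the pad token gets index 0.
    tokens = [PAD, BOS, EOS] + sorted({chord for song in songs for chord in song})
    return {tok: i for i, tok in enumerate(tokens)}

def encode(songs, token_to_id):
    # Wrap every song in <BOS>/<EOS> and map chord symbols to integer IDs.
    encoded = [
        [token_to_id[BOS]] + [token_to_id[c] for c in song] + [token_to_id[EOS]]
        for song in songs
    ]
    max_len = max(len(seq) for seq in encoded)
    # Pad all sequences to the same length with the pad ID.
    padded = [seq + [token_to_id[PAD]] * (max_len - len(seq)) for seq in encoded]
    return torch.tensor(padded, dtype=torch.long)

# Tiny usage example with two made-up "songs":
songs = [["Cmaj7", "A-7", "D-7", "G7"], ["F-7", "Bb7", "Ebmaj7"]]
token_to_id = build_vocab(songs)
batch = encode(songs, token_to_id)
print(batch.shape)  # torch.Size([2, 6])
```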
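
The main difference between the two models is sequence packing in the LSTM. The packed-sequence sketch below shows how packing can be wired into a PyTorch LSTM; it is simplified (a single output layer instead of the two linear layers used in the notebook), and the real architecture in `Models.ipynb` may differ.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class ChordLSTM(nn.Module):
    """Sketch of an LSTM language model over chord tokens with packed sequences."""

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, pad_id=0):
        super().__init__()
        self.pad_id = pad_id
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_id)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):
        lengths = (x != self.pad_id).sum(dim=1)  # true sequence lengths without padding
        emb = self.embed(x)
        # Packing lets the LSTM skip the padded time steps entirely.
        packed = pack_padded_sequence(emb, lengths.cpu(), batch_first=True, enforce_sorted=False)
        packed_out, _ = self.lstm(packed)
        out, _ = pad_packed_sequence(packed_out, batch_first=True, total_length=x.size(1))
        return self.out(out)  # logits of shape (batch, seq_len, vocab_size)
```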
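
Finally, generation with multinomial sampling repeatedly feeds the tokens generated so far into the model, converts the output logits to a probability distribution, and draws the next token from it instead of taking the argmax. The sampling sketch below assumes a model that maps a `(1, t)` tensor of token IDs to logits of shape `(1, t, vocab_size)`; the actual signatures in `Models.ipynb` may differ.

```python
import torch

@torch.no_grad()
def sample_sequence(model, token_to_id, id_to_token, max_len=64, temperature=1.0):
    """Autoregressively sample one chord sequence with multinomial sampling."""
    model.eval()
    bos, eos = token_to_id["<BOS>"], token_to_id["<EOS>"]
    generated = [bos]
    for _ in range(max_len):
        inp = torch.tensor([generated], dtype=torch.long)          # shape (1, t)
        logits = model(inp)[:, -1, :]                              # logits for the next token
        probs = torch.softmax(logits / temperature, dim=-1)        # logits -> probabilities
        next_id = torch.multinomial(probs, num_samples=1).item()   # sample instead of argmax
        generated.append(next_id)
        if next_id == eos:
            break
    # Drop the special tokens and map IDs back to chord symbols.
    return [id_to_token[i] for i in generated[1:] if i != eos]
```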