deep_piano

Piano music generation using deep learning.

Given a few piano notes as input (around 10 seconds), the program will generate a full piece of piano music (around 3 minutes).

Windows 10 Home, Intel Core i7, 16 GB RAM (most subsequent steps should still hold on Mac/Linux, but some will need to be modified, such as the conda env setup batch file and the ffmpeg script)
NVIDIA GeForce GTX 1080 with Max-Q, 8 GB
conda env setup: dnn_gpu_setup_test\conda_dnn_gpu_setup.bat (Anaconda3)

Music format ready:
- convert mp3 to wav using ffmpeg: ffmpeg_mp3_to_wav.bat
- the ffmpeg executable should be downloaded for win64 and placed at .\FFmpeg\bin\ffmpeg.exe
- Note: the conversion to wav is only required because the piano music available happened to be in mp3 format.
- the samples used are performances by Richard Clayderman (https://en.wikipedia.org/wiki/Richard_Clayderman)
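
For readers not on Windows, the batch conversion can be sketched in Python. This is a minimal sketch, not the contents of the batch file: the helper names `build_ffmpeg_command` and `convert_folder` are hypothetical, and the executable path is simply the one from the bullet above.

```python
import subprocess
from pathlib import Path

# Assumed location, taken from the bullet above; adjust for Mac/Linux.
FFMPEG = Path("FFmpeg") / "bin" / "ffmpeg.exe"


def build_ffmpeg_command(mp3_path, wav_path, ffmpeg=FFMPEG):
    """Return the argument list for converting one mp3 file to wav."""
    # -y overwrites an existing output file; -i names the input file.
    return [str(ffmpeg), "-y", "-i", str(mp3_path), str(wav_path)]


def convert_folder(src_dir, dst_dir, ffmpeg=FFMPEG):
    """Convert every .mp3 in src_dir to a .wav of the same name in dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    converted = []
    for mp3 in sorted(Path(src_dir).glob("*.mp3")):
        wav = dst / (mp3.stem + ".wav")
        # check=True raises CalledProcessError if ffmpeg exits with an error.
        subprocess.run(build_ffmpeg_command(mp3, wav, ffmpeg), check=True)
        converted.append(wav)
    return converted
```

On Mac/Linux, passing `ffmpeg="ffmpeg"` should work when ffmpeg is on the PATH.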
Music to numeric data:
- The Python package librosa is used to convert each wav file into a 1-D numpy array, with sample_rate (sr) as a hyper-parameter
- music length (in seconds) x sample_rate = numpy array length (1-D)
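
The length relation above can be checked with a few lines of Python. This is a sketch under assumptions: 22050 Hz is librosa's default sample rate, not necessarily the value set in music_config.py, and the helper names are hypothetical.

```python
def expected_length(duration_seconds, sample_rate):
    """Number of samples in a mono clip of the given duration."""
    return int(round(duration_seconds * sample_rate))


def load_wav(path, sample_rate=22050):
    """Load a wav file as a 1-D float32 numpy array, resampled to sample_rate."""
    import librosa  # imported here so the arithmetic helper works without it

    samples, sr = librosa.load(path, sr=sample_rate, mono=True)
    return samples, sr


# At an assumed 22050 Hz: a 10-second seed and a 3-minute piece.
seed_len = expected_length(10, 22050)    # 220500 samples
piece_len = expected_length(180, 22050)  # 3969000 samples
```

A lower sample rate shrinks both arrays proportionally, which is why it acts as a hyper-parameter: it trades audio fidelity against sequence length.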
DNN based on data:
- train the network to predict the next sample from the last N samples
- LSTM + Dense
- Loss: mean squared error (mse)
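
The training setup above might look like the following. It is a minimal sketch, not the code in models.py: Keras is an assumed framework, and `make_windows`, `build_model`, `lstm_units` and the Adam optimizer are illustrative choices that the README does not specify.

```python
import numpy as np


def make_windows(samples, n_steps):
    """Slice a 1-D signal into (last n_steps samples, next sample) pairs."""
    samples = np.asarray(samples, dtype=np.float32)
    n_pairs = len(samples) - n_steps
    if n_pairs <= 0:
        raise ValueError("signal must be longer than n_steps")
    # Row i of idx selects samples[i : i + n_steps].
    idx = np.arange(n_steps)[None, :] + np.arange(n_pairs)[:, None]
    x = samples[idx][..., None]   # shape (n_pairs, n_steps, 1) for an LSTM
    y = samples[n_steps:]         # shape (n_pairs,), the sample after each window
    return x, y


def build_model(n_steps, lstm_units=64):
    """LSTM + Dense regressor trained with mse (Keras is an assumption)."""
    from tensorflow import keras  # lazy import: only needed when training

    model = keras.Sequential([
        keras.Input(shape=(n_steps, 1)),
        keras.layers.LSTM(lstm_units),
        keras.layers.Dense(1),    # one output: the predicted next sample
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```

Materializing every window at once is only practical for short clips; for full pieces a batch generator over the same index arithmetic would likely be needed.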
DNN to generate music:
- start from a short piece of piano melody (a random segment from a hold-out piece)
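
Since the network predicts one sample from the last N, generation can plausibly proceed by feeding each prediction back in as input. The sketch below assumes that autoregressive loop; `generate`, `pick_seed` and the `predict_next` callable are hypothetical names, not functions from this repository.

```python
import numpy as np


def pick_seed(holdout, seed_len, rng):
    """Cut a random contiguous segment of seed_len samples from a hold-out piece."""
    start = rng.integers(0, len(holdout) - seed_len + 1)
    return holdout[start:start + seed_len]


def generate(predict_next, seed, n_new, n_steps):
    """Extend seed by n_new samples, feeding each prediction back as input."""
    seed = np.asarray(seed, dtype=np.float32)
    if len(seed) < n_steps:
        raise ValueError("seed must contain at least n_steps samples")
    out = np.empty(len(seed) + n_new, dtype=np.float32)
    out[:len(seed)] = seed
    for t in range(len(seed), len(out)):
        window = out[t - n_steps:t]      # the last n_steps samples so far
        out[t] = predict_next(window)    # one new sample per step
    return out
```

With a Keras model, `predict_next` could be something like `lambda w: model.predict(w[None, :, None], verbose=0)[0, 0]`. Producing about 3 minutes of audio this way takes millions of single-sample predictions, so the loop is slow unless the sample rate is kept low.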
Data to music:
- librosa
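
The README names librosa for this step but does not say which call is used. Older librosa releases exposed `librosa.output.write_wav`, which later releases removed, so the sketch below swaps in Python's standard `wave` module to write the generated array back to a wav file. The function name `write_wav` and the 16-bit mono format are assumptions.

```python
import wave

import numpy as np


def write_wav(path, samples, sample_rate):
    """Write a mono float signal in [-1, 1] as a 16-bit PCM wav file."""
    # Clip first so out-of-range predictions do not wrap around as integers.
    samples = np.clip(np.asarray(samples, dtype=np.float32), -1.0, 1.0)
    pcm = (samples * 32767.0).astype("<i2")   # little-endian 16-bit integers
    with wave.open(str(path), "wb") as f:
        f.setnchannels(1)            # mono, matching the 1-D array
        f.setsampwidth(2)            # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())
    return len(pcm) / sample_rate    # duration in seconds
```

The sample rate passed here should match the one used when loading the training audio; otherwise the output plays back at the wrong speed and pitch.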