Music Recommendation System with Deep Learning and Cosine Similarity

This is a content-based recommendation system that focuses on the properties of items. The similarity between items is determined by measuring the similarity of their properties.

Overview

Traditionally, collaborative filtering is a common method for recommendation systems. The idea of collaborative filtering is to determine users' preferences from historical usage data, focusing on the relationship between users and items. Perhaps its biggest problem is that new and unpopular items cannot be recommended: if there is no usage data to analyze, the collaborative filtering approach breaks down. This is the so-called cold-start problem.
Content-based recommendations make it possible to recommend newly released or unpopular songs to listeners. The basic idea is that I train a CNN as a classifier on the Free Music Archive dataset, using 8 different song genres as labels. The trained network is then modified by discarding the softmax layer, which creates a new model that works as an encoder. This encoder takes slices of a spectrogram as input, one at a time, and outputs a 40-dimensional latent representation of each slice. One spectrogram therefore yields multiple latent vectors, as many as the number of slices generated from it. These vectors are then averaged to get a single latent representation for each spectrogram.
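The averaging step can be sketched as follows. This is a minimal illustration, not the repository's code: the function name `average_latents` is hypothetical, and the toy vectors have 4 dimensions instead of 40 to keep the example readable. It uses only the standard library so it runs without any dependencies.

```python
from typing import List, Sequence


def average_latents(slice_latents: Sequence[Sequence[float]]) -> List[float]:
    """Average per-slice latent vectors into one vector for the whole track.

    Hypothetical helper: each inner sequence stands for the encoder output
    of one spectrogram slice.
    """
    if not slice_latents:
        raise ValueError("need at least one slice latent vector")
    dim = len(slice_latents[0])
    if any(len(vec) != dim for vec in slice_latents):
        raise ValueError("all latent vectors must have the same dimension")
    count = len(slice_latents)
    # Element-wise mean across all slices.
    return [sum(vec[i] for vec in slice_latents) / count for i in range(dim)]


# Toy example: 3 slices, 4 dimensions (the real encoder outputs 40).
slices = [
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 0.0, 2.0, 0.0],
    [2.0, 3.0, 2.0, 2.0],
]
track_vector = average_latents(slices)
print(track_vector)
```

A track with more slices simply contributes more rows to the mean, so tracks of different lengths still end up with vectors of the same size.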
The cosine similarity metric is used to generate a similarity score between one anchor song and each of the other songs in the playlist set. The two songs with the highest similarity scores with respect to the anchor song are then returned as the recommendations.
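A minimal sketch of this ranking step is shown below, assuming each song has already been reduced to a single latent vector. The names `cosine_similarity` and `recommend`, the song keys, and the 2-dimensional toy vectors are illustrative assumptions, not taken from the repository.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(
    anchor: str, playlist: Dict[str, Sequence[float]], top_k: int = 2
) -> List[Tuple[str, float]]:
    """Return the top_k songs most similar to the anchor, best first."""
    anchor_vec = playlist[anchor]
    scores = [
        (name, cosine_similarity(anchor_vec, vec))
        for name, vec in playlist.items()
        if name != anchor  # never recommend the anchor itself
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical playlist with 2-dimensional vectors for readability.
playlist = {
    "song_a": [1.0, 0.0],
    "song_b": [0.9, 0.1],
    "song_c": [0.0, 1.0],
    "song_d": [0.5, 0.5],
}
recommendations = recommend("song_a", playlist)
print(recommendations)
```

Because cosine similarity depends only on the direction of the vectors, two songs score as similar when their latent vectors point the same way, regardless of their magnitudes.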
The network architecture looks like this:

Training model

Process data

The dataset I use to train the network is the fma_small subset of the Free Music Archive, which consists of 8,000 tracks of 30 s each in 8 balanced genres (Hip-Hop, International, Electronic, Folk, Experimental, Rock, Pop, and Instrumental), similar to GTZAN.
The mel-frequency spectrograms of tracks in the dataset look like this:
I then slice each spectrogram into small grayscale images of 128x128 pixels.
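The slicing step might look roughly like the sketch below. It assumes the spectrogram is stored with 128 mel bins as rows and time frames as columns, that slices do not overlap, and that a trailing remainder shorter than 128 frames is dropped. None of these details are stated above, and `slice_spectrogram` is a hypothetical name.

```python
from typing import List

# Rows are mel bins, columns are time frames (an assumed layout).
Spectrogram = List[List[float]]


def slice_spectrogram(spec: Spectrogram, width: int = 128) -> List[Spectrogram]:
    """Cut a spectrogram into non-overlapping slices along the time axis.

    Assumption: frames left over at the end that do not fill a whole
    slice are discarded.
    """
    if not spec:
        return []
    n_frames = len(spec[0])
    return [
        [row[start:start + width] for row in spec]
        for start in range(0, n_frames - width + 1, width)
    ]


# Toy spectrogram: 128 mel bins x 300 frames, where each value is its frame index.
spec = [[float(t) for t in range(300)] for _ in range(128)]
slices = slice_spectrogram(spec)
print(len(slices), len(slices[0]), len(slices[0][0]))
```

With 300 frames and a slice width of 128, this yields two 128x128 slices and drops the last 44 frames.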

Training

I use residual blocks in the convolutional neural network to improve the model's performance.
Tensorboard
Performance on test set

Test Accuracy: 89.9450 %

Usage

1. Prerequisites
Requires Python >= 3.5. Installing Anaconda is optional but recommended.

pip install -r requirements.txt

2. Download Playlist songs
Download the 40-song playlist from here, extract the file, and put it into the folder torch/templates.
3. Run app
Go to the torch folder and run:

python app_server.py

and go to https://localhost:5000

Usage with Docker and Docker-compose

1. Requirements: install Docker and Docker-compose, plus nvidia-docker2 for GPU support.
2. Build app

docker-compose up --build

Go to https://localhost

