This is a content-based recommendation system, which focuses on the properties of items: the similarity of two items is determined by measuring the similarity of their properties.
Traditionally, collaborative filtering is a common method for recommendation systems. The idea of collaborative filtering is to determine users' preferences from historical usage data, focusing on the relationship between users and items. Perhaps its biggest problem is that new and unpopular items cannot be recommended: if there is no usage data to analyze, the collaborative filtering approach breaks down. This is the so-called cold-start problem.
Content-based recommendation makes it possible to recommend newly released or unpopular songs to listeners.
The basic idea is to train a CNN as a classifier on the Free Music Archive dataset, with 8 song genres as labels. The trained network is then modified by discarding the softmax layer, which yields a new model that works as an encoder. This encoder takes spectrogram slices as input one at a time and outputs a 40-dimensional latent representation of each slice. One spectrogram therefore produces multiple latent vectors, one per slice, and these are averaged to get a single latent representation for the spectrogram.
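The averaging step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `song_embedding` is hypothetical, and random numbers stand in for the encoder's per-slice outputs.

```python
import numpy as np


def song_embedding(slice_latents):
    """Average per-slice latent vectors into one vector per song.

    slice_latents: array-like of shape (n_slices, latent_dim).
    """
    latents = np.asarray(slice_latents, dtype=np.float64)
    if latents.ndim != 2 or latents.shape[0] == 0:
        raise ValueError("expected a non-empty (n_slices, latent_dim) array")
    # Mean over the slice axis leaves one latent_dim-sized vector.
    return latents.mean(axis=0)


# Stand-in for encoder output (assumption): 5 slices, 40 dimensions each.
rng = np.random.default_rng(0)
fake_latents = rng.normal(size=(5, 40))
embedding = song_embedding(fake_latents)
print(embedding.shape)
```

A plain mean gives every slice equal weight, so the result does not depend on the order of the slices.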
Cosine similarity is used to compute a similarity score between an anchor song and every other song in the playlist set. The two songs with the highest similarity scores with respect to the anchor song are returned as the recommendations.
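A minimal sketch of this ranking step, assuming each song has already been reduced to a single latent vector. The function name `recommend`, its `top_k` parameter, and the toy 2-dimensional vectors are illustrative assumptions, not taken from the project.

```python
import numpy as np


def recommend(anchor, candidates, top_k=2):
    """Return (index, score) pairs for the top_k candidates closest to anchor.

    Scores are cosine similarities; indices refer to rows of `candidates`.
    """
    anchor = np.asarray(anchor, dtype=np.float64)
    candidates = np.asarray(candidates, dtype=np.float64)
    # cos(a, b) = a.b / (|a| |b|); the floor guards against zero vectors.
    norms = np.linalg.norm(candidates, axis=1) * np.linalg.norm(anchor)
    scores = candidates @ anchor / np.maximum(norms, 1e-12)
    order = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in order]


# Toy embeddings (assumption): row 0 is the anchor, the rest are candidates.
playlist = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
top = recommend(playlist[0], playlist[1:])
print(top)
```

The anchor is left out of the candidate set here; otherwise it would always rank first with a similarity of 1.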
The network architecture looks like this:
The dataset I use to train the network is the fma_small subset of the Free Music Archive, which consists of 8,000 tracks of 30 s each across 8 balanced genres (Hip-Hop, International, Electronic, Folk, Experimental, Rock, Pop, and Instrumental), similar in structure to GTZAN.
The mel-frequency spectrograms of tracks in the dataset look like this:
Each spectrogram is then sliced into small grayscale images of 128x128 pixels.
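The slicing step might look like the sketch below. It assumes the spectrogram is a 2-D array with 128 mel bands on the first axis, that slices do not overlap, and that a trailing remainder shorter than 128 frames is dropped; none of these details are confirmed by the project, and `slice_spectrogram` is a hypothetical name.

```python
import numpy as np


def slice_spectrogram(spec, width=128):
    """Split a (n_mels, n_frames) array into non-overlapping slices.

    Each slice has shape (n_mels, width); leftover frames are discarded.
    """
    spec = np.asarray(spec)
    n_slices = spec.shape[1] // width
    return [spec[:, i * width:(i + 1) * width] for i in range(n_slices)]


# Placeholder spectrogram (assumption): 128 mel bands, 300 time frames.
spec = np.zeros((128, 300))
slices = slice_spectrogram(spec)
print(len(slices), slices[0].shape)
```

With 300 frames and a width of 128, this yields two full slices and discards the last 44 frames.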
I add residual blocks to the convolutional neural network to improve the model's performance.
Tensorboard
Performance on test set
Test Accuracy: 89.9450 %
1. Prerequisites
Requires Python >= 3.5. Installing Anaconda is optional but recommended.
pip install -r requirements.txt
2. Download Playlist songs
Download the 40-song playlist from here, extract the archive, and put its contents into the folder torch/templates.
3. Run app
Go to the torch folder and run:
python app_server.py
then open https://localhost:5000 in a browser.
1. Requirements
Install Docker and Docker Compose, plus nvidia-docker2 for GPU support.
2. Build app
docker-compose up --build
Go to https://localhost