Multimodal Movie Genre Classification (Text+Image): Prediction on New Data

This notebook shows how to use a pre-trained Keras model to make predictions on new data.
The model used in this notebook is trained following the architecture and code specified in this GitHub repo, which is explained in this Medium article.

Required files

To run this notebook in Google Colab, you need to download the following files and adjust the paths inside the notebook (a minimal path-setup sketch follows the list):

  1. The trained model
  2. The dataset used to fit the tokenizer (filtered from the original TMDB data)
  3. Movie posters for prediction
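
The exact filenames depend on how you store the downloads; as a rough sketch (the Drive paths and filenames below are placeholders, not the notebook's actual names), the Colab setup might look like this:

```python
# Mount Google Drive and point the notebook at the downloaded files.
# All paths and filenames below are placeholders -- adjust them to your copies.
from google.colab import drive
from tensorflow.keras.models import load_model

drive.mount("/content/drive")

MODEL_PATH = "/content/drive/MyDrive/tmdb/multimodal_genre_model.h5"  # 1. trained model
DATA_PATH = "/content/drive/MyDrive/tmdb/tmdb_filtered.csv"           # 2. data for the tokenizer
POSTER_DIR = "/content/drive/MyDrive/tmdb/posters/"                   # 3. posters for prediction

model = load_model(MODEL_PATH)
model.summary()
```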

Input & Output

Input: Movie Overview (text) & Movie Poster (image)

Output: Predicted movie genre

Example:
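
For a single new movie, a prediction could be produced roughly as follows. This is a sketch only: the tokenizer settings, sequence length, image size, input order, column name, file paths, and genre labels are assumptions rather than the repository's exact values.

```python
import numpy as np
import pandas as pd
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Placeholder constants -- the real values must match how the model was trained.
MAX_WORDS, MAX_LEN, IMG_SIZE = 20000, 200, (224, 224)
GENRES = ["Action", "Comedy", "Drama", "Horror", "Romance"]  # placeholder label order

model = load_model("multimodal_genre_model.h5")  # placeholder path

# Refit the tokenizer on the same filtered TMDB overviews used for training.
df = pd.read_csv("tmdb_filtered.csv")  # placeholder path and column name
tokenizer = Tokenizer(num_words=MAX_WORDS)
tokenizer.fit_on_texts(df["overview"].astype(str))

# Text input: the new movie's overview.
overview = "A retired assassin is pulled back in for one last job."
text_input = pad_sequences(tokenizer.texts_to_sequences([overview]), maxlen=MAX_LEN)

# Image input: the new movie's poster, resized and scaled to [0, 1].
poster = img_to_array(load_img("posters/new_movie.jpg", target_size=IMG_SIZE)) / 255.0
image_input = np.expand_dims(poster, axis=0)

# The multimodal model takes both inputs and returns genre probabilities.
probs = model.predict([text_input, image_input])[0]
print("Predicted genre:", GENRES[int(np.argmax(probs))])
```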

Model

The model used in this notebook is trained following the architecture and code specified in this GitHub repo.

Explanations of how to train and evaluate the models can be found in this Medium article and the above GitHub repo.

In short, the model receives inputs from two modalities: image (the movie poster) and text (the movie overview). The image branch is a CNN and the text branch is an LSTM, and their outputs are combined to predict the genre. Below is the illustration from the above Medium article.

(Architecture illustration from the above Medium article.)
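
For a concrete picture of that two-branch design, here is a minimal Keras sketch. The layer sizes, vocabulary size, pooling choices, and number of genres are illustrative assumptions, not the exact architecture from the linked repository.

```python
# Minimal sketch of a two-branch (CNN + LSTM) multimodal classifier in Keras.
from tensorflow.keras import layers, models

NUM_GENRES = 5  # placeholder

# Image branch: a small CNN over the movie poster.
image_in = layers.Input(shape=(224, 224, 3), name="poster")
x = layers.Conv2D(32, 3, activation="relu")(image_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

# Text branch: an embedding followed by an LSTM over the movie overview.
text_in = layers.Input(shape=(200,), name="overview")
y = layers.Embedding(input_dim=20000, output_dim=128)(text_in)
y = layers.LSTM(64)(y)

# Fuse the two modalities and predict the genre.
z = layers.concatenate([x, y])
z = layers.Dense(64, activation="relu")(z)
out = layers.Dense(NUM_GENRES, activation="softmax")(z)

model = models.Model(inputs=[text_in, image_in], outputs=out)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```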

Notes

Detailed explanations can be found inside the notebook.
