Skip to content

SimonPop/AudiobookSeeker

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

30 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

📖 Audiobook Search Engine

This project is about creating a neural search engine that allows to discover new audio-books based on title the user already likes.

🔎 Search Engine

Neural search engine is created using the Jina framework.

🗂️ Data

The data used for this project are scrapped from Audible.

Data is scrapped using a Breadth First Search strategy: from a starting point (random audiobook), all recommended audibooks are enqueued and explored turn in turn.

📃 Embedding

Embeddings are created using a link prediction model based on already existing Audible recommendations.

A recommendation graph of books is extracted from scrapping. A model is then trained to predict whenever two nodes of that graph are linked. During the training, embeddings are tuned and can finally be used during neural search.

The link prediction model using dot products in order to predict the existence of a link, embeddings are guided to be closer whenever two books should be recommended and farther when not.

A naive approach of averaging is used in order to allow multiple books to be selected as inputs.

The link prediction model has been created using PyG.

💻 UI

A small demo UI is available in the project. It allows the user to select among available books, the ones he likes most and get the top n recommendations from the engines.

The demo UI was created using Streamlit.

About

Audiobook neural search engine

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published