This project is based on a Kaggle competition. The goal is to develop product recommendations from previous transactions as well as from customer and product metadata. The available metadata ranges from simple attributes, such as garment type and customer age, to text from product descriptions and images of the garments.
The final task is to recommend up to 12 products for each customer to purchase during the 7-day period immediately after the training data ends. Performance is evaluated with Mean Average Precision @ 12 (MAP@12).
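For reference, MAP@12 averages, over all customers, the precision at each rank where a purchased item appears, capped at 12 predictions. A minimal sketch of the metric (function names are illustrative, not from the competition code):

```python
def apk(actual, predicted, k=12):
    """Average precision at k for one customer."""
    predicted = predicted[:k]
    hits, score = 0, 0.0
    for i, p in enumerate(predicted):
        # Count a hit only the first time an actually-purchased item appears.
        if p in actual and p not in predicted[:i]:
            hits += 1
            score += hits / (i + 1)
    return score / min(len(actual), k) if actual else 0.0

def mapk(actuals, predictions, k=12):
    """Mean average precision at k over all customers."""
    return sum(apk(a, p, k) for a, p in zip(actuals, predictions)) / len(actuals)
```

For example, predicting `["a", "x", "b"]` when the customer bought `["a", "b"]` scores (1/1 + 2/3) / 2 ≈ 0.83.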
The dataset can be downloaded here.
- `articles.csv`: detailed metadata for each `article_id` available for purchase
- `customers.csv`: metadata for each `customer_id` in the dataset
- `transactions_train.csv`: the training data, consisting of the purchase log of customers
- `images/`: a folder of images corresponding to each `article_id`
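These files are large, so it pays to downcast column dtypes on load. A minimal sketch of the idea on a small in-memory frame mirroring the `transactions_train.csv` schema (the specific target dtypes are reasonable assumptions, not taken from the notebooks):

```python
import pandas as pd

# Illustrative transactions mirroring the competition schema.
transactions = pd.DataFrame({
    "t_dat": ["2020-09-15", "2020-09-22"],
    "customer_id": ["c1", "c2"],
    "article_id": [108775015, 108775044],
    "price": [0.0169, 0.0254],
    "sales_channel_id": [2, 1],
})

before = transactions.memory_usage(deep=True).sum()

# Downcast: article_id fits in int32, price in float32, the sales
# channel in int8, and dates parse into 8-byte datetime64 values.
transactions["article_id"] = transactions["article_id"].astype("int32")
transactions["price"] = transactions["price"].astype("float32")
transactions["sales_channel_id"] = transactions["sales_channel_id"].astype("int8")
transactions["t_dat"] = pd.to_datetime(transactions["t_dat"])

after = transactions.memory_usage(deep=True).sum()
```

The same idea applies when calling `pd.read_csv` directly via its `dtype` and `parse_dates` arguments.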
However, `articles.csv` contains detailed metadata, such as shape, material, and color information, for each article. This project assumes the information gathered from `articles.csv` is sufficient, so the images from the `images/` folder are not used to train the model.
- This project is carried out in two notebooks.
- 1st notebook: EDA (Exploratory Data Analysis)
- This notebook examines the data, analyzes its content, checks for missing values, studies the data distributions, explores the relations between the various files, and performs visualizations and statistical analysis.
- 2nd notebook: Candidate generation and model
- This notebook prepares the data, reducing the training data needed from 4.5 GB + 512 MB + 117 MB to 788 MB + 17 MB + 11 MB, roughly a 6x memory reduction.
- Generates candidates as negative examples for training the model and for evaluation. For each customer, the generated candidates are the 12 best sellers of the last week, the products most recently purchased by the customer, and the best sellers of the week in which the customer made those purchases.
- Feeds the training data and candidates to LGBMRanker and uses the ranker to output predictions. This achieves a score of 0.2045 and place 1798/2952, better than roughly 40% of the other competitors.