Repository for the Fall 2022 Peak.AI x NYU Data Science Club Datathon, Recommender Systems Challenge (Winner)

Team Son, Sons & Company

Teammates:

Sunny Son (LinkedIn, GitHub)
Morgan Xu (LinkedIn, GitHub)
Shane Sun (LinkedIn)
Sunny Yang (LinkedIn)

Problem Statement

The goal of this Datathon was to act as developers for the host company, Peak.AI, and build a recommender system for our customer, the Brazilian e-commerce company Olist, that advertises products to users based on their previous purchase history.

Data Processing

We joined all necessary tables to collect every relevant order, keyed on customer_id and ultimately linked to product_id, following the entity relationship diagram shown below:
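As a rough illustration of the join described above, the sketch below links orders to purchased products and groups them by customer. It uses plain Python dictionaries so it runs without dependencies. The tiny `orders` and `order_items` tables, the `order_id` column, and the helper `purchases_by_customer` are assumptions made for this example; only `customer_id` and `product_id` come from the text above.

```python
# Toy rows standing in for the real tables (assumed layout, illustrative values).
orders = [
    {"order_id": "o1", "customer_id": "c1"},
    {"order_id": "o2", "customer_id": "c2"},
]
order_items = [
    {"order_id": "o1", "product_id": "p1"},
    {"order_id": "o1", "product_id": "p2"},
    {"order_id": "o2", "product_id": "p1"},
]


def purchases_by_customer(orders, order_items):
    """Inner-join orders to order items and group product_ids by customer_id."""
    # Lookup table: which customer placed each order.
    customer_of_order = {row["order_id"]: row["customer_id"] for row in orders}
    history = {}
    for item in order_items:
        customer = customer_of_order.get(item["order_id"])
        if customer is None:
            # Item with no matching order: dropped, as in an inner join.
            continue
        history.setdefault(customer, []).append(item["product_id"])
    return history


history = purchases_by_customer(orders, order_items)
print(history)
```

The resulting mapping from each customer to the products they bought is the kind of purchase history the later steps rely on.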

Data Preparation

We then followed the steps below to finalize the data for modeling:

Model

We used a k-Nearest Neighbors (k-NN) model to map "features" (e.g. cost, location) to a "label" (e.g. Lamp), minimizing a distance metric between the labels of previously purchased items and the label of the target item. This procedure is shown below:
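A minimal sketch of a nearest-neighbor lookup over product features, assuming each product is represented by a small numeric vector and that plain Euclidean distance is used. The `catalog` contents and the function name `k_nearest` are hypothetical and are not taken from the notebook.

```python
import math

# Hypothetical catalog: product_id -> numeric feature vector
# (for example, scaled price and scaled weight).
catalog = {
    "p1": [0.0, 0.0],
    "p2": [1.0, 0.0],
    "p3": [0.0, 2.0],
    "p4": [5.0, 5.0],
}


def k_nearest(target_id, catalog, k):
    """Return the ids of the k products closest to target_id (Euclidean distance)."""
    target = catalog[target_id]
    distances = [
        (math.dist(target, features), product_id)
        for product_id, features in catalog.items()
        if product_id != target_id  # never recommend the target itself
    ]
    distances.sort()  # smallest distance first; ties broken by product_id
    return [product_id for _, product_id in distances[:k]]


print(k_nearest("p1", catalog, k=2))
```

Here `k` plays the same role as the input parameter described in the next section: it caps how many of the closest products are returned.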

Generating Recommendations

With this, our model is ready to generate recommendations. We accept an input parameter "k" for the number of closest products to return, and find them using a modified Euclidean distance metric based on product category:

"Product Category" Metric Distance

To define a "distance" metric for our model, we scale the k-NN distance by the cosine similarity between product categories, computed from 50-dimensional Global Vectors (GloVe) embeddings, so that similar products are placed close together:
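One possible reading of this scaling, sketched under stated assumptions: the Euclidean distance between two products is multiplied by `(2 - cosine_similarity)`, so identical categories leave the distance unchanged and dissimilar categories stretch it. The exact scaling used in the notebook may differ. The 3-dimensional vectors below are made-up stand-ins for GloVe 50d embeddings, and the function names are hypothetical.

```python
import math

# Made-up 3-d vectors standing in for GloVe 50d category embeddings
# (illustrative values only, not real GloVe data).
CATEGORY_VECTORS = {
    "lamp": [1.0, 0.0, 0.0],
    "lighting": [0.8, 0.6, 0.0],
    "perfume": [0.0, 0.0, 1.0],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def category_scaled_distance(features_a, category_a, features_b, category_b):
    """Euclidean feature distance, stretched when the categories are dissimilar.

    Assumed scaling: factor 1 for identical categories (similarity 1),
    factor 2 for orthogonal categories (similarity 0).
    """
    euclidean = math.dist(features_a, features_b)
    similarity = cosine_similarity(
        CATEGORY_VECTORS[category_a], CATEGORY_VECTORS[category_b]
    )
    return euclidean * (2.0 - similarity)


same = category_scaled_distance([0.0, 0.0], "lamp", [3.0, 4.0], "lamp")
related = category_scaled_distance([0.0, 0.0], "lamp", [3.0, 4.0], "lighting")
unrelated = category_scaled_distance([0.0, 0.0], "lamp", [3.0, 4.0], "perfume")
print(same, related, unrelated)
```

With identical feature vectors in all three calls, the product in the same category comes out closest and the product in an unrelated category farthest, which is the ordering the category-aware metric is meant to produce.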

Notebook

For the full notebook, please refer to the file peak-ai-x-nyu-dsc-datathon-team-7-submission.ipynb
