gblssroman/vk_internship

🏢 VK Internship March '24

EDA (incl. KNN), CatBoost, and Optuna for predicting business object success scores as accurately as possible.

Repository for the VK Internship Data Science task.

A .zip archive is included.

Instructions:

  1. Install the dependencies with pip install -r requirements.txt (no junk :))
  2. Run generate_submission.py
  3. Collect the final submission.csv (a generated copy is already in the output folder for quick reference).
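
Steps 2 and 3 can also be driven from Python. The snippet below is a minimal sketch, not code from this repository: the helper name run_submission is hypothetical, and it assumes the script is started from the repository root and writes its result to output/submission.csv there.

```python
import subprocess
import sys
from pathlib import Path


def run_submission(repo_dir="."):
    """Run generate_submission.py and return the path to the produced CSV.

    Assumption: the script writes output/submission.csv relative to repo_dir.
    """
    repo = Path(repo_dir)
    # Use the current interpreter so the packages from requirements.txt are visible.
    subprocess.run([sys.executable, "generate_submission.py"], cwd=repo, check=True)
    submission = repo / "output" / "submission.csv"
    if not submission.is_file():
        raise FileNotFoundError(f"expected {submission} after the run")
    return submission


if __name__ == "__main__":
    print(run_submission())
```

check=True makes a failing script raise an error instead of silently leaving the previously generated submission.csv in place.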

Description of the other files:

  • classifier.cbm - Trained CatBoost model (a regressor, despite the file name)
  • do_eda.py - Script used by generate_submission.py to prepare the given datasets and features
  • datasets - Folder containing the datasets
  • cols_to-drop.pkl - Pickled dict of columns to drop: they cause multicollinearity and add no useful information.
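
The files above could be combined at prediction time roughly as follows. This is a sketch under stated assumptions, not the repository's actual code: the layout of the pickled dict is not documented here, so the example assumes it maps a dataset name to a list of column names, and the helper names and the "test" key are hypothetical. CatBoostRegressor, load_model, and predict are real CatBoost calls; the model is loaded as a regressor even though the file is named classifier.cbm.

```python
import pickle
from pathlib import Path


def load_cols_to_drop(path, key):
    """Read the pickled dict and return the columns listed under `key`.

    Assumption: the dict maps a dataset name to a list of column names.
    """
    with Path(path).open("rb") as fh:
        cols_by_dataset = pickle.load(fh)
    return list(cols_by_dataset.get(key, []))


def drop_columns(rows, cols):
    """Return copies of the row dicts without the listed columns."""
    unwanted = set(cols)
    return [{k: v for k, v in row.items() if k not in unwanted} for row in rows]


def load_regressor(model_path="classifier.cbm"):
    """Load the trained model as a regressor, despite the file name."""
    # Imported lazily so the helpers above work without catboost installed.
    from catboost import CatBoostRegressor

    model = CatBoostRegressor()
    model.load_model(str(model_path))
    return model
```

With the requirements installed, a prepared feature table would then be scored with load_regressor().predict(features), where features must match the columns the model was trained on.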
