Predict customers' average spend over the next 3 months (a regression problem).
The task is to predict customer spending over the next 3 months from advertisement details. The dataset includes training and test sets, and the features are described in the data dictionary.
-
Import Libraries:
- Import necessary libraries for data analysis and model building.
-
Load Data:
- Load training and testing data along with the data dictionary.
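
A minimal sketch of the loading step, assuming the three files are CSVs in one directory. The file names train.csv, test.csv, and data_dictionary.csv and the helper load_data are hypothetical; substitute the names the dataset actually uses.

```python
from pathlib import Path

import pandas as pd


def load_data(data_dir):
    """Read the training set, test set, and data dictionary from data_dir.

    The file names below are assumptions, not taken from the dataset.
    """
    data_dir = Path(data_dir)
    train = pd.read_csv(data_dir / "train.csv")
    test = pd.read_csv(data_dir / "test.csv")
    data_dict = pd.read_csv(data_dir / "data_dictionary.csv")
    return train, test, data_dict


# Usage (assumes the files exist in ./data):
# train, test, data_dict = load_data("data")
# print(train.shape, test.shape)
```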
-
Data Exploration and Preprocessing:
- Treat NULL values and handle outliers.
- Explore gender distribution and perform basic data analysis.
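
One possible way to handle the steps above, shown on made-up data. The choices here (median fill for numeric columns, most frequent value for categorical columns, IQR-based clipping for outliers) are assumptions; the original analysis may use different rules. The column names gender and spend are illustrative.

```python
import numpy as np
import pandas as pd


def fill_missing(df):
    """Fill numeric NULLs with the median and other NULLs with the mode."""
    out = df.copy()
    for col in out.columns:
        if out[col].isna().any():
            if pd.api.types.is_numeric_dtype(out[col]):
                out[col] = out[col].fillna(out[col].median())
            else:
                # mode() ignores NULLs; assumes the column is not entirely NULL
                out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


def clip_outliers_iqr(series, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


# Made-up example data
df = pd.DataFrame({
    "gender": ["M", "F", None, "F"],
    "spend": [10.0, 12.0, np.nan, 500.0],
})

clean = fill_missing(df)
clean["spend"] = clip_outliers_iqr(clean["spend"])

# Share of each gender after filling missing values
gender_share = clean["gender"].value_counts(normalize=True)
print(gender_share)
```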
-
Feature Engineering:
- Log-transform right-skewed columns.
- Check and address collinearity issues.
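
A sketch of both steps on synthetic data. The skewness threshold of 1.0, the correlation threshold of 0.9, and the helper names are assumptions chosen for illustration. np.log1p is used instead of a plain log so that zero values stay finite.

```python
import numpy as np
import pandas as pd


def log_transform_skewed(df, threshold=1.0):
    """Apply log1p to non-negative numeric columns with skewness above threshold."""
    out = df.copy()
    numeric = out.select_dtypes(include="number")
    skewed = [
        col for col in numeric.columns
        if numeric[col].skew() > threshold and (numeric[col] >= 0).all()
    ]
    for col in skewed:
        out[col] = np.log1p(out[col])
    return out, skewed


def correlated_pairs(df, threshold=0.9):
    """Return pairs of numeric columns whose absolute correlation exceeds threshold."""
    corr = df.select_dtypes(include="number").corr().abs()
    cols = list(corr.columns)
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if corr.iloc[i, j] > threshold:
                pairs.append((cols[i], cols[j]))
    return pairs


# Synthetic example: x is right-skewed, z is a linear copy of x, y is symmetric
rng = np.random.default_rng(0)
x = rng.lognormal(size=500)
df = pd.DataFrame({"x": x, "y": rng.normal(size=500), "z": 2 * x + 1})

transformed, skewed = log_transform_skewed(df)
pairs = correlated_pairs(df)
print("log-transformed:", skewed)
print("highly correlated:", pairs)
```

A common way to address a flagged pair is to drop one of the two columns, though which one to keep depends on the data dictionary.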
-
Encoding and Scaling:
- Label encode categorical variables and standardize numerical features.
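
A minimal sketch using scikit-learn's LabelEncoder and StandardScaler. The helper encode_and_scale and the column names are hypothetical. The encoder and scaler are fitted on the training data only and then applied to the test data, which is an assumption about the workflow rather than something the outline states. Note that LabelEncoder raises an error for categories that appear only in the test data.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler


def encode_and_scale(train, test, cat_cols, num_cols):
    """Label encode cat_cols and standardize num_cols, fitting on train only."""
    train, test = train.copy(), test.copy()
    for col in cat_cols:
        encoder = LabelEncoder().fit(train[col])
        train[col] = encoder.transform(train[col])
        test[col] = encoder.transform(test[col])
    scaler = StandardScaler().fit(train[num_cols])
    train[num_cols] = scaler.transform(train[num_cols])
    test[num_cols] = scaler.transform(test[num_cols])
    return train, test


# Made-up example data
train_df = pd.DataFrame({"gender": ["F", "M", "F"], "age": [20.0, 30.0, 40.0]})
test_df = pd.DataFrame({"gender": ["M", "F"], "age": [30.0, 50.0]})

train_enc, test_enc = encode_and_scale(train_df, test_df, ["gender"], ["age"])
print(train_enc)
print(test_enc)
```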
-
Train-Validation Split:
- Split the training data into training and validation sets.
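
The split can be sketched with scikit-learn's train_test_split. The 80/20 ratio, the random seed, and the column names (including the target column spend) are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up training data with a hypothetical target column "spend"
rng = np.random.default_rng(0)
train = pd.DataFrame({
    "feature_a": rng.normal(size=100),
    "feature_b": rng.normal(size=100),
    "spend": rng.gamma(2.0, 50.0, size=100),
})

X = train.drop(columns="spend")
y = train["spend"]

# Hold out 20% of the rows for validation
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_val.shape)  # (80, 2) (20, 2)
```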
-
Model Building:
- Train Linear Regression, Random Forest, and CatBoost models to predict spending.
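
A sketch of fitting the three model types on synthetic data. The hyperparameters are placeholders, not the values used in the original analysis, and the helper build_models is hypothetical. CatBoost is a separate package, so the sketch skips it when it is not installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression


def build_models():
    """Return a dict of unfitted regressors keyed by name."""
    models = {
        "linear_regression": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    }
    try:
        from catboost import CatBoostRegressor
        models["catboost"] = CatBoostRegressor(
            iterations=200, verbose=0, random_seed=0, allow_writing_files=False
        )
    except ImportError:
        pass  # catboost not installed; continue with the scikit-learn models
    return models


# Synthetic regression data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=0.1, size=200)

fitted = {name: model.fit(X, y) for name, model in build_models().items()}
for name, model in fitted.items():
    print(name, model.predict(X[:2]))
```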
-
Model Evaluation:
- Assess model performance using RMSLE (root mean squared logarithmic error) scores.
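
For reference, RMSLE is the square root of the mean of (log(1 + prediction) - log(1 + actual))^2, so it penalizes relative rather than absolute errors. A small NumPy sketch follows. Clipping negative predictions to zero is an assumption added here because the logarithm is undefined for values at or below -1, and a linear model can produce negative predictions.

```python
import numpy as np


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    # Assumption: negative predictions are clipped to zero before taking logs
    y_pred = np.clip(np.asarray(y_pred, dtype=float), 0.0, None)
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))


print(rmsle([10.0, 20.0], [10.0, 20.0]))  # 0.0 for perfect predictions
print(rmsle([100.0, 200.0], [110.0, 180.0]))
```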
-
Making Predictions:
- Use trained models to predict customer spending on the test data.
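
A sketch of the final step with a single fitted model on made-up data. The helper predict_spend and the column names customer_id and predicted_spend are hypothetical. If the target was log-transformed during training, the predictions would also need to be mapped back (for example with np.expm1); the outline does not say whether that applies, so it is left out here.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def predict_spend(model, X_test, ids):
    """Predict spend for the test rows and return a tidy DataFrame."""
    # Spend cannot be negative, so clip predictions at zero (an assumption)
    preds = np.clip(model.predict(X_test), 0.0, None)
    return pd.DataFrame({"customer_id": list(ids), "predicted_spend": preds})


# Made-up training and test data
X_train = pd.DataFrame({"feature_a": [1.0, 2.0, 3.0, 4.0]})
y_train = [10.0, 20.0, 30.0, 40.0]
model = LinearRegression().fit(X_train, y_train)

X_test = pd.DataFrame({"feature_a": [5.0, 6.0]})
submission = predict_spend(model, X_test, ids=[101, 102])
print(submission)
```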