Gradient Boosting on the Lending Club dataset

In this Mini Project we explore pre-processing methods and Gradient Boosting on the popular Lending Club dataset. We are provided with two files: loan_train.csv and loan_test.csv.

We have to pre-process the data appropriately, and then apply gradient boosting to classify whether a customer should be given a loan or not.

The target attribute is in the column loan_status. The value “Fully Paid” is assigned the label +1 and “Charged off” is assigned the label -1. Records whose loan_status is “Current” (in both train and test) are not relevant to this problem and are left out.
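The label mapping described above could be sketched as follows. This is a minimal illustration, not the project's actual code: the helper name `encode_loan_status`, the `target` column, and the toy rows are assumptions, and the exact spelling and casing of the status strings should be checked against the CSV files.

```python
import pandas as pd

# Label mapping from the problem statement; any other status (e.g. "Current") is dropped.
# The exact strings are an assumption and should be verified against the raw data.
LABELS = {"Fully Paid": 1, "Charged off": -1}


def encode_loan_status(df: pd.DataFrame, column: str = "loan_status") -> pd.DataFrame:
    """Keep only rows with a known outcome and add a +1/-1 `target` column."""
    kept = df[df[column].isin(list(LABELS))].copy()
    kept["target"] = kept[column].map(LABELS).astype(int)
    return kept


# Tiny made-up frame for illustration only; these are not real Lending Club records.
toy = pd.DataFrame(
    {
        "loan_amnt": [5000, 12000, 8000, 15000],
        "loan_status": ["Fully Paid", "Charged off", "Current", "Fully Paid"],
    }
)
encoded = encode_loan_status(toy)
```

The same helper would be applied to both the training and the testing frame so that the two sets share one labelling convention.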

We will be mainly working on the following:

(a) Pre-process the data as needed to apply the classifier to the training data.

(b) Apply gradient boosting using the class sklearn.ensemble.GradientBoostingClassifier for training the model.
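Steps (a) and (b) might look roughly like the sketch below. It is run on synthetic data, and everything specific in it is an assumption made for illustration: the feature names `loan_amnt` and `term`, the helper `prepare_features`, the median imputation, and the hyperparameter values. The pre-processing actually used in this project is described in the Report.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier


def prepare_features(df, feature_columns, train_columns=None):
    """One-hot encode categorical columns, align to training columns, fill gaps."""
    X = pd.get_dummies(df[feature_columns], dtype=float)
    if train_columns is not None:
        # Give the test matrix exactly the columns seen during training.
        X = X.reindex(columns=train_columns, fill_value=0.0)
    # Simple imputation choice for this sketch: fill missing values with column medians.
    return X.fillna(X.median())


rng = np.random.default_rng(0)


def make_toy(n):
    """Synthetic stand-in for the loan data; the labelling rule is made up."""
    amount = rng.integers(1000, 35000, size=n)
    term = rng.choice(["36 months", "60 months"], size=n)
    target = np.where(amount + rng.normal(0, 5000, size=n) < 20000, 1, -1)
    return pd.DataFrame({"loan_amnt": amount, "term": term, "target": target})


train, test = make_toy(200), make_toy(50)
features = ["loan_amnt", "term"]  # hypothetical feature subset

X_train = prepare_features(train, features)
X_test = prepare_features(test, features, train_columns=X_train.columns)

# Hyperparameters here are placeholders, not tuned values.
model = GradientBoostingClassifier(
    n_estimators=50, learning_rate=0.1, max_depth=3, random_state=0
)
model.fit(X_train, train["target"])

predictions = model.predict(X_test)            # array of +1 / -1 labels
accuracy = model.score(X_test, test["target"])  # mean accuracy on the test split
```

Aligning the test columns to the training columns matters because one-hot encoding can produce different column sets when a category appears in only one of the two files.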

Lending Club Dataset

Training Data : loan_train.csv

Testing Data : loan_test.csv

Analysis

The full analysis can be found in the Report.

Authors