pjbk/Breast-Cancer-Analysis-Prediction

Breast Cancer Analysis & Prediction

Dataset Description

The Breast Cancer dataset is available from the UCI Machine Learning Repository, maintained by the University of California, Irvine. It contains 569 samples of malignant and benign tumor cells.

The first two columns store the unique ID number of each sample and its diagnosis (M = malignant, B = benign), respectively. Columns 3-32 contain 30 real-valued features computed from digitized images of the cell nuclei, which can be used to build a model that predicts whether a tumor is benign or malignant.

Target encoding: 1 = Malignant (cancerous, present, M); 0 = Benign (not cancerous, absent, B)
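As a minimal sketch of the column layout and label encoding described above, the snippet below parses comma-separated rows into IDs, 0/1 labels, and 30-value feature vectors. It assumes a header-less CSV file with exactly 32 columns; the function name `parse_wdbc` and the demo rows are hypothetical, and the demo values are synthetic placeholders rather than real records.

```python
import csv
from io import StringIO

# Encoding stated in this README: M (malignant) -> 1, B (benign) -> 0.
DIAGNOSIS_TO_LABEL = {"M": 1, "B": 0}
N_FEATURES = 30  # columns 3-32


def parse_wdbc(text):
    """Parse header-less CSV text into (ids, labels, features).

    Assumed layout: column 1 = sample ID, column 2 = diagnosis (M/B),
    columns 3-32 = 30 real-valued features.
    """
    ids, labels, features = [], [], []
    for row in csv.reader(StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 2 + N_FEATURES:
            raise ValueError(
                f"expected {2 + N_FEATURES} columns, got {len(row)}"
            )
        ids.append(row[0].strip())
        labels.append(DIAGNOSIS_TO_LABEL[row[1].strip()])
        features.append([float(value) for value in row[2:]])
    return ids, labels, features


# Synthetic placeholder rows (NOT real records), only to show the shape.
demo = "\n".join(
    f"{sample_id},{diagnosis}," + ",".join(["0.0"] * N_FEATURES)
    for sample_id, diagnosis in [("id-1", "M"), ("id-2", "B")]
)
ids, labels, features = parse_wdbc(demo)
print(labels)            # [1, 0]
print(len(features[0]))  # 30
```

If the copy of the dataset you download includes a header row or extra columns, the column check will raise a `ValueError`, which is a cue to adjust the parsing rather than silently mislabel features.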

Proposed Approach: EDA, Data Preprocessing, Feature Decomposition, PCA, Random Forest Classifier, XGBoost, Model Evaluation Metrics, Confusion Matrix (CM), ROC, AUC, Model Comparison
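To make the evaluation steps in the list above concrete, here is a small dependency-free sketch of a binary confusion matrix and the ROC AUC, using the 1 = malignant, 0 = benign encoding. It is an illustration and not necessarily how the notebook computes these metrics; the function names and the toy labels and scores are hypothetical.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels where 1 = malignant."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1  # malignant case predicted as benign
        elif pred == 1:
            fp += 1  # benign case predicted as malignant
        else:
            tn += 1
    return tn, fp, fn, tp


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample gets the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy values for illustration only (not model output).
y_true = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print(confusion_matrix(y_true, y_pred))  # (1, 1, 1, 1)
print(roc_auc(y_true, scores))           # 0.75
```

The pairwise AUC loop is quadratic in the number of samples, which is fine for a dataset of 569 rows. In practice, scikit-learn's `sklearn.metrics.confusion_matrix` and `sklearn.metrics.roc_auc_score` cover the same ground and make it easy to compare Random Forest and XGBoost on the same held-out split.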

For more exciting notebooks, visit my Kaggle workspace: https://www.kaggle.com/pankajbhowmik