-
Notifications
You must be signed in to change notification settings - Fork 7
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
1 changed file
with
84 additions
and
29 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,37 +1,92 @@ | ||
## Welcome to GitHub Pages | ||
|
||
You can use the [editor on GitHub](https://github.com/MoganaD/Machine-Learning-on-Breast-Cancer-Survival-Prediction/edit/master/index.md) to maintain and preview the content for your website in Markdown files. | ||
# Model Evaluation using 6 different Machine Learning Algorithms in R, for Breast Cancer dataset | ||
|
||
Whenever you commit to this repository, GitHub Pages will run [Jekyll](https://jekyllrb.com/) to rebuild the pages in your site, from the content in your Markdown files. | ||
## 1. Random Forest | ||
### Load the package | ||
>library (randomForest) | ||
#### Read file_ | ||
>all_data <- read.csv (file='D:/model_evaluation/all_data.csv') | ||
### Run randomForest | ||
>all_data.rf <- randomForest (V24 ~. , data=all_data, ntree=500, mtry=5, keep.forest=FALSE, importance=TRUE) | ||
### Print result of randomForest (OOB error and confusion matrix) | ||
>print (all_data.rf) | ||
### Markdown | ||
## 2. Decision Tree | ||
### Load the packages | ||
>library (caret) | ||
>library (rpart) | ||
### Split data into testing and training | ||
>intrain <- createDataPartition (y = all_data $V24, p= 0.7, list = FALSE) | ||
>training <- all_data [intrain,] | ||
>testing <- all_data [-intrain,] | ||
### Train the model | ||
>trctrl <- trainControl (method = "repeatedcv", number = 10, repeats = 3) | ||
>dtree_fit <- train (V24 ~. , data = training, method = "rpart", parms = list (split = "information"), trControl=trctrl, tuneLength = 10) | ||
### Prediction using testing data | ||
>predict (dtree_fit, newdata = testing) | ||
>test_pred <- predict (dtree_fit, newdata = testing) | ||
### print result of randomForest (Accuracy and confusion matrix) | ||
>confusionMatrix (test_pred, testing$V24) | ||
Markdown is a lightweight and easy-to-use syntax for styling your writing. It includes conventions for | ||
## 3. Support Vector Machine (SVM) | ||
### Load the packages | ||
>library(caret) | ||
### Assign the categorical variable as a factor in both training and testing sets (use training and testing sets from Decision Tree) | ||
>training[["V24"]] = factor(training[["V24"]]) | ||
>testing[["V24"]] = factor(testing[["V24"]]) | ||
### Train the svm model | ||
>trctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3) | ||
>set.seed(3233) | ||
>svm_Linear <- train(V24 ~., data = training, method = "svmLinear",trControl=trctrl, preProcess = c("center", "scale"), | ||
tuneLength = 10) | ||
>svm_Linear | ||
### Test set prediction | ||
>test_pred <- predict(svm_Linear, newdata = testing) | ||
>test_pred | ||
### Compute accuracy using confusion matrix | ||
>confusionMatrix(test_pred, testing$V24) | ||
```markdown | ||
Syntax highlighted code block | ||
## 4. Logistic Regression | ||
### Training the dataset with lm | ||
>lmMod <- lm(V24 ~ ., data=training) #build the model | ||
distPred <- predict(lmMod, testing) #predict survival | ||
>summary (lmMod) | ||
### Test set prediction | ||
>actuals_preds <- data.frame(cbind(actuals=testing$V24, | ||
predicteds=distPred)) #make actuals_predicteds dataframe | ||
>correlation_accuracy <- cor(actuals_preds) | ||
>head(actuals_preds) | ||
# Header 1 | ||
## Header 2 | ||
### Header 3 | ||
## 5. Neural Networks | ||
### Load packages | ||
>library(caret) | ||
>library(neuralnet) | ||
### Fit neural network | ||
>set.seed(2) | ||
> NN = neuralnet(V24 ~ V1+V2+V3+V4+V5+V6+V7+V8+V9+V10+V11+V12+V13+V14+V15+V16+V17+V18+V19+V20+V21+V22+V23, training, hidden = 3 , linear.output = T) | ||
### Prediction using neural network | ||
>predict_testNN = compute(NN, testing[,c(1:23)]) | ||
>predict_testNN = (predict_testNN$net.result * (max(all_data$V24) - >min(all_data$V24))) + min(all_data$V24) | ||
>plot(testing$V24, predict_testNN, col='blue', pch=16, ylab = "predicted rating NN", xlab = "real rating") | ||
>abline(0,1) | ||
### Calculate Root Mean Square Error (RMSE) | ||
>RMSE.NN = (sum((testing$V24 - predict_testNN)^2) / nrow(testing)) ^ 0.5 | ||
- Bulleted | ||
- List | ||
|
||
1. Numbered | ||
2. List | ||
|
||
**Bold** and _Italic_ and `Code` text | ||
|
||
[Link](url) and ![Image](src) | ||
``` | ||
|
||
For more details see [GitHub Flavored Markdown](https://guides.github.com/features/mastering-markdown/). | ||
|
||
### Jekyll Themes | ||
|
||
Your Pages site will use the layout and styles from the Jekyll theme you have selected in your [repository settings](https://github.com/MoganaD/Machine-Learning-on-Breast-Cancer-Survival-Prediction/settings). The name of this theme is saved in the Jekyll `_config.yml` configuration file. | ||
|
||
### Support or Contact | ||
|
||
Having trouble with Pages? Check out our [documentation](https://help.github.com/categories/github-pages-basics/) or [contact support](https://github.com/contact) and we’ll help you sort it out. | ||
## 6. Extreme Gradient Boosting | ||
### Load packages | ||
>library(xgboost) | ||
>library(caret) | ||
### Train with xgboost | ||
>folds = createFolds(training$V24, k=10) | ||
cv = lapply (folds, function(x){ | ||
training_fold= training[-x,] | ||
testing_fold= testing[x,] | ||
classifier = xgboost(data=as.matrix(training[-24]), label=training$V24, nrounds=10 ) | ||
y_pred = predict(classifier, newdata=as.matrix(testing_fold[24])) | ||
y_pred =(y_pred >= 0.5) | ||
cm = table(testing_fold[,24], y_pred) | ||
accuracy = (cm[1,1] + cm[2,2]) / (cm[1,1] + cm[2,2] + cm[1,2] + cm[2,1]) | ||
return (accuracy) | ||
}) | ||
accuracy = mean(as.numeric(cv)) |