Update index.md

MoganaD · Sep 3, 2018 · 7e49d9e
1 parent 26cef1b
commit 7e49d9e
Showing 1 changed file with 84 additions and 29 deletions.
diff --git a/index.md b/index.md
@@ -1,37 +1,92 @@
 ## Welcome to GitHub Pages
 
-You can use the [editor on GitHub](https://github.com/MoganaD/Machine-Learning-on-Breast-Cancer-Survival-Prediction/edit/master/index.md) to maintain and preview the content for your website in Markdown files.
+# Model Evaluation Using Six Machine Learning Algorithms in R on a Breast Cancer Dataset
 
-Whenever you commit to this repository, GitHub Pages will run [Jekyll](https://jekyllrb.com/) to rebuild the pages in your site, from the content in your Markdown files.
+## 1. Random Forest
+### Load the package
+>library (randomForest)
+### Read the data file
+>all_data <- read.csv (file='D:/model_evaluation/all_data.csv')
+### Run randomForest
+>all_data.rf <- randomForest (V24 ~. , data=all_data, ntree=500, mtry=5, keep.forest=FALSE, importance=TRUE)
+### Print result of randomForest (OOB error and confusion matrix)
+>print (all_data.rf)
 
-### Markdown
+## 2. Decision Tree
+### Load the packages
+>library (caret)
+>library (rpart)
+### Split data into testing and training
+>intrain <- createDataPartition (y = all_data$V24, p = 0.7, list = FALSE)
+>training <- all_data [intrain,]
+>testing <- all_data [-intrain,]
+### Train the model
+>trctrl <- trainControl (method = "repeatedcv", number = 10, repeats = 3)
+>dtree_fit <- train (V24 ~. , data = training, method = "rpart", parms = list (split = "information"), trControl=trctrl, tuneLength = 10)
+### Prediction using testing data
+>predict (dtree_fit, newdata = testing)
+>test_pred <- predict (dtree_fit, newdata = testing)
+### Print result of the decision tree (accuracy and confusion matrix)
+>confusionMatrix (test_pred, testing$V24)
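The accuracy that `confusionMatrix` reports can be reproduced by hand from a plain contingency table. The following is a minimal base R sketch under stated assumptions: the `actual` and `predicted` vectors are hypothetical labels invented for illustration and are not drawn from the breast cancer data.

```r
# Hypothetical class labels (assumption: 0/1 coding, not the real data)
actual    <- factor(c(1, 0, 1, 1, 0, 0, 1, 0), levels = c(0, 1))
predicted <- factor(c(1, 0, 0, 1, 0, 1, 1, 0), levels = c(0, 1))

# Cross-tabulate predictions (rows) against reference labels (columns)
cm <- table(Predicted = predicted, Actual = actual)

# Accuracy is the share of cases that fall on the diagonal
accuracy <- sum(diag(cm)) / sum(cm)

print(cm)
print(accuracy)
```

Fixing the factor levels on both vectors keeps the table square even when one class is never predicted.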
 
-Markdown is a lightweight and easy-to-use syntax for styling your writing. It includes conventions for
+## 3. Support Vector Machine (SVM)
+### Load the packages
+>library(caret)
+### Assign the categorical variable as a factor in both training and testing sets (use training and testing sets from Decision Tree)
+>training[["V24"]] = factor(training[["V24"]])
+>testing[["V24"]] = factor(testing[["V24"]])
+### Train the svm model
+>trctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
+>set.seed(3233)
+>svm_Linear <- train(V24 ~ ., data = training, method = "svmLinear", trControl = trctrl, preProcess = c("center", "scale"), tuneLength = 10)
+>svm_Linear
+### Test set prediction
+>test_pred <- predict(svm_Linear, newdata = testing)
+>test_pred
+### Compute accuracy using confusion matrix
+>confusionMatrix(test_pred, testing$V24)
 
-```markdown
-Syntax highlighted code block
+## 4. Logistic Regression
+### Train the model with lm
+>lmMod <- lm(V24 ~ ., data=training) # build the model (note: lm fits a linear model; a logistic fit would use glm with family = binomial)
+>distPred <- predict(lmMod, testing) # predict survival
+>summary (lmMod)
+### Test set prediction
+>actuals_preds <- data.frame(cbind(actuals=testing$V24, predicteds=distPred)) # make actuals_predicteds data frame
+>correlation_accuracy <- cor(actuals_preds)
+>head(actuals_preds)
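For comparison, a logistic regression in base R is usually fitted with `glm` and `family = binomial`. The sketch below is illustrative only: it uses the built-in `mtcars` data as a stand-in (an assumption, since the breast cancer file is not available here), with `am` as the 0/1 outcome and `wt` and `hp` as arbitrarily chosen predictors.

```r
# Assumption: mtcars stands in for the breast cancer data; am is a 0/1 outcome
data(mtcars)

# Fit a logistic regression (binomial family with the default logit link)
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Predicted probabilities of the positive class
prob <- predict(fit, newdata = mtcars, type = "response")

# Convert probabilities to class labels with a 0.5 cutoff
pred_class <- ifelse(prob >= 0.5, 1, 0)

# Cross-tabulate predicted classes against the observed outcome
cm <- table(Predicted = pred_class, Actual = mtcars$am)
print(cm)
```

Unlike `lm`, this keeps predictions inside the 0 to 1 range, so they can be read as probabilities and thresholded into classes.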
 
-# Header 1
-## Header 2
-### Header 3
+## 5. Neural Networks
+### Load packages
+>library(caret)
+>library(neuralnet)
+### Fit neural network
+>set.seed(2)
+>NN = neuralnet(V24 ~ V1+V2+V3+V4+V5+V6+V7+V8+V9+V10+V11+V12+V13+V14+V15+V16+V17+V18+V19+V20+V21+V22+V23, training, hidden = 3, linear.output = TRUE)
+### Prediction using neural network
+>predict_testNN = compute(NN, testing[,c(1:23)])
+>predict_testNN = (predict_testNN$net.result * (max(all_data$V24) - min(all_data$V24))) + min(all_data$V24)
+>plot(testing$V24, predict_testNN, col='blue', pch=16, ylab = "predicted rating NN", xlab = "real rating")
+>abline(0,1)
+### Calculate Root Mean Square Error (RMSE)
+>RMSE.NN = (sum((testing$V24 - predict_testNN)^2) / nrow(testing)) ^ 0.5
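The rescaling and RMSE steps above can be checked on a small example. This is a minimal sketch: `unscale` and `rmse` are hypothetical helper names introduced here, and the numbers are made up for illustration.

```r
# Hypothetical helper: map values from [0, 1] back to the original range
unscale <- function(x, lo, hi) x * (hi - lo) + lo

# Hypothetical helper: root mean square error, same formula as RMSE.NN
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Made-up network outputs on a 0-1 scale, rescaled to the range 2..10
scaled_out <- c(0, 0.5, 1)
rescaled <- unscale(scaled_out, lo = 2, hi = 10)
print(rescaled)

# Made-up actual and predicted values
actual    <- c(3, 5, 2, 7)
predicted <- c(2, 5, 4, 7)
err <- rmse(actual, predicted)
print(err)
```

Here the errors are 1, 0, -2 and 0, so the mean squared error is 1.25 and the RMSE is its square root.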
 
-- Bulleted
-- List
-
-1. Numbered
-2. List
-
-**Bold** and _Italic_ and `Code` text
-
-[Link](url) and ![Image](src)
-```
-
-For more details see [GitHub Flavored Markdown](https://guides.github.com/features/mastering-markdown/).
-
-### Jekyll Themes
-
-Your Pages site will use the layout and styles from the Jekyll theme you have selected in your [repository settings](https://github.com/MoganaD/Machine-Learning-on-Breast-Cancer-Survival-Prediction/settings). The name of this theme is saved in the Jekyll `_config.yml` configuration file.
-
-### Support or Contact
-
-Having trouble with Pages? Check out our [documentation](https://help.github.com/categories/github-pages-basics/) or [contact support](https://github.com/contact) and we’ll help you sort it out.
+## 6. Extreme Gradient Boosting
+### Load packages
+>library(xgboost)
+>library(caret)
+### Train with xgboost
+>folds = createFolds(training$V24, k=10)
+cv = lapply (folds, function(x){
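+training_fold= training[-x,] # rows outside the held-out fold
+testing_fold= training[x,] # held-out fold; the fold indices refer to rows of training
+classifier = xgboost(data=as.matrix(training_fold[-24]), label=training_fold$V24, nrounds=10 )
+y_pred = predict(classifier, newdata=as.matrix(testing_fold[-24]))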
+y_pred =(y_pred >= 0.5)
+cm = table(testing_fold[,24], y_pred)
+accuracy = (cm[1,1] + cm[2,2]) / (cm[1,1] + cm[2,2] + cm[1,2] + cm[2,1])
+return (accuracy)
+})
+accuracy = mean(as.numeric(cv))
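The fold logic in the cross-validation loop can be sketched without `caret` or `xgboost`. The version below rests on stated assumptions: the built-in `mtcars` data stands in for the breast cancer file, a `glm` logistic model replaces the boosted classifier, and `fold_id` is a hypothetical name. Small folds may trigger `glm` convergence warnings, which do not stop the run.

```r
# Assumption: mtcars stands in for the training set, am is the 0/1 label
data(mtcars)
set.seed(42)

k <- 5
n <- nrow(mtcars)

# Assign each row to one of k folds at random, with near-equal fold sizes
fold_id <- sample(rep(seq_len(k), length.out = n))

acc <- sapply(seq_len(k), function(i) {
  train_fold <- mtcars[fold_id != i, ]  # fit on the other k - 1 folds
  test_fold  <- mtcars[fold_id == i, ]  # evaluate on the held-out fold
  fit  <- glm(am ~ wt + hp, data = train_fold, family = binomial)
  prob <- predict(fit, newdata = test_fold, type = "response")
  # Share of held-out rows whose thresholded prediction matches the label
  mean((prob >= 0.5) == (test_fold$am == 1))
})

# Average accuracy across folds
mean_accuracy <- mean(acc)
print(mean_accuracy)
```

Both the training and held-out rows come from the same data frame, and the model is refitted on each training fold, so no held-out row influences the model that scores it.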