Within the scope of NLP, this project tackles the SMS spam detection problem using various ML algorithms such as Logistic Regression, Multinomial Naïve Bayes, K-Nearest Neighbour, Decision Tree, Random Forest, AdaBoost, and Gradient Boosting.

gozdeorhan/Spam-or-Ham

Spam-or-Ham

The dataset contains 5,572 short messages in English: 4,825 legitimate (ham) messages and 747 spam messages. The classes are therefore imbalanced: 86.6% of the messages are labelled as ham and only 13.4% are labelled as spam.
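The percentages above follow directly from the counts. The short sketch below reproduces the arithmetic and also computes the accuracy of a trivial classifier that always predicts ham. That baseline is not part of the original project description; it is added here only as a reference point that any trained model should beat on data this imbalanced.

```python
# Class counts as stated in the README.
counts = {"ham": 4825, "spam": 747}
total = sum(counts.values())  # 5,572 messages in total

# Share of each class as a percentage of all messages.
shares = {label: 100 * n / total for label, n in counts.items()}

# Accuracy of a trivial classifier that always predicts the majority class.
# (Illustrative reference point, not a result reported by the project.)
majority_baseline = max(counts.values()) / total

for label, pct in shares.items():
    print(f"{label}: {pct:.1f}%")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
```

Because always answering "ham" is already right about 86.6% of the time, metrics such as precision and recall on the spam class are usually more informative than plain accuracy here.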

The dataset was obtained as a .csv file in which each row corresponds to a single message and has two columns: the label (v1) and the raw text (v2).
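A minimal sketch of reading a file with this two-column layout, using only the Python standard library. The sample rows and the helper name `load_messages` are made up for illustration and are not taken from the dataset or the repository; the sketch also assumes the file has a header row naming the columns v1 and v2.

```python
import csv
import io

# Hypothetical sample in the described layout: a header row, then one
# message per row with the label in v1 and the raw text in v2.
SAMPLE_CSV = '''v1,v2
ham,"Ok, see you at lunch"
spam,"WINNER! Claim your free prize now"
ham,Running late
'''

def load_messages(handle):
    """Return parallel lists of labels and texts from a CSV with v1/v2 columns."""
    reader = csv.DictReader(handle)
    labels, texts = [], []
    for row in reader:
        labels.append(row["v1"])
        texts.append(row["v2"])
    return labels, texts

labels, texts = load_messages(io.StringIO(SAMPLE_CSV))
print(labels)
print(texts[0])
```

For a file on disk, the same function can be given a handle from `open(path, newline="")`; the path and the file's text encoding depend on the copy of the dataset being used.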

Link to dataset: https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection
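Of the algorithms the project names, Multinomial Naïve Bayes is compact enough to sketch without any library. The code below is an illustrative from-scratch version with add-one (Laplace) smoothing over word counts; it is not the repository's implementation. The tokenizer, the class name `MultinomialNB`, and the four training messages are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

class MultinomialNB:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # messages per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        self.vocab = set()                        # all words seen in training
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]  # skip unseen words
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            # Start from the log prior of the class.
            score = math.log(n_label / n_docs)
            # Smoothed denominator: words in this class plus vocabulary size.
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical toy training data, not drawn from the SMS Spam Collection.
train_texts = [
    "Free prize, claim now",
    "Win cash now, free entry",
    "Are we still meeting for lunch",
    "See you at home tonight",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = MultinomialNB().fit(train_texts, train_labels)
print(model.predict("claim your free cash"))
print(model.predict("see you at lunch"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and the smoothing keeps a word that appears in only one class from driving the other class's probability to zero.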
