Spam-or-Ham

Within NLP, this project tackles the SMS spam detection problem using several machine learning algorithms: Logistic Regression, Multinomial Naïve Bayes, K-Nearest Neighbours, Decision Tree, Random Forest, AdaBoost, and Gradient Boosting.
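
To make one of the listed algorithms concrete, here is a minimal sketch of Multinomial Naïve Bayes with add-one smoothing in plain Python. It is not the notebook's code (which may well rely on a library implementation); the class name `TinyMultinomialNB`, the `tokenize` helper, and the toy messages are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class TinyMultinomialNB:
    """Toy Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # messages per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            # Start from the log prior of the class.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # skip words never seen in training
                count = self.word_counts[label][token]
                # Smoothed log likelihood of the token given the class.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Invented toy messages, not taken from the dataset.
train_texts = [
    "win a free prize now",
    "free entry claim your prize",
    "are we still meeting for lunch",
    "see you at home tonight",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = TinyMultinomialNB().fit(train_texts, train_labels)
prediction_spam = model.predict("claim your free prize")
prediction_ham = model.predict("lunch at home")
print(prediction_spam, prediction_ham)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the add-one term keeps a word that is absent from one class from forcing that class's probability to zero.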

The dataset contains 5,572 short English messages: 4,825 legitimate (ham) and 747 spam. The classes are therefore imbalanced, with 86.6% of the messages labelled ham and only 13.4% labelled spam.
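
The percentages follow directly from the two counts. A short check, using only the figures quoted above (the variable names are illustrative):

```python
ham_count = 4825
spam_count = 747
total = ham_count + spam_count

# Share of each class as a percentage of all messages.
ham_share = 100 * ham_count / total
spam_share = 100 * spam_count / total

print(f"total messages: {total}")
print(f"ham:  {ham_share:.1f}%")
print(f"spam: {spam_share:.1f}%")
```

One consequence of this imbalance is that a model which always predicts ham would already be right about 86.6% of the time, so accuracy on its own says little about how well spam is caught.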

The dataset is provided as a .csv file in which each row corresponds to a single message and has two columns: the label (v1) and the raw text (v2).
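
As a rough sketch of how a file with this layout could be read, the snippet below uses Python's standard `csv` module. The `load_messages` helper is hypothetical, and the two sample rows are invented so that the example runs without the real file.

```python
import csv
import io


def load_messages(stream):
    """Read (label, text) pairs from a CSV stream with v1 and v2 columns."""
    reader = csv.DictReader(stream)
    return [(row["v1"], row["v2"]) for row in reader]


# Small in-memory sample with the same two columns, invented for illustration.
sample = io.StringIO(
    "v1,v2\n"
    'ham,"Ok, see you later"\n'
    "spam,Free entry to win a prize\n"
)

messages = load_messages(sample)
labels = [label for label, _ in messages]
print(messages)
```

For the actual dataset, the same function could be given an open file handle, for example `open("spam.csv", newline="", encoding="latin-1")`. Both the file name and the encoding are assumptions here and may need adjusting for a particular copy of the data.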

Link to dataset: https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection


gozdeorhan/Spam-or-Ham
