enjuichang/covid_sentiment

Covid Sentiment Analysis with spaCy

First time creating an NLP classification project? Want hands-on experience with spaCy? Want to work with tweet datasets?

In this article, we walk through the main stages of an NLP project: data selection, exploratory data analysis, NLP preprocessing, NLP models (statistical and neural language models), metrics selection, and implementation on another dataset. The dataset of interest is the COVID-19 tweet dataset on Kaggle, and all NLP-related tasks are performed using spaCy.

The data source is the Coronavirus Tweet Data dataset from Kaggle.
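Metrics selection matters for a multi-class sentiment task because class sizes are often uneven, so accuracy alone can hide poor performance on smaller classes. The following is a minimal, library-independent sketch of accuracy and macro-averaged F1. It is not taken from the project's notebooks: the function names and the three example labels are illustrative assumptions.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    if not y_true:
        raise ValueError("y_true must not be empty")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1 score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but it was wrong
            fn[t] += 1  # missed the true label t
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pred_total = tp[label] + fp[label]
        true_total = tp[label] + fn[label]
        precision = tp[label] / pred_total if pred_total else 0.0
        recall = tp[label] / true_total if true_total else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    if not scores:
        raise ValueError("no labels to score")
    return sum(scores.values()) / len(scores)


# Illustrative labels only; the real dataset's label set may differ.
gold = ["pos", "pos", "neg", "neu"]
pred = ["pos", "neg", "neg", "neu"]
print(accuracy(gold, pred), macro_f1(gold, pred))
```

Macro averaging weights each sentiment class equally regardless of its frequency, which is one reasonable choice when minority classes matter. A frequency-weighted average would be an alternative.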

Structure of the project

│
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── raw            <- The original, immutable data dump.
│   └── processed      <- The final, canonical data sets for modeling.
│
├── config             <- Config files for training in spaCy.
│
├── models             <- Trained and serialized models, model predictions, or model summaries.
│
├── images             <- Images for the notebooks.
│
├── notebooks          <- Serialized Jupyter notebooks created in the project.
│   ├── All            <- Notebook that includes all the code.
│   ├── EDA            <- Exploratory data analysis process.
│   ├── Traditional    <- The training of traditional statistical models.
│   └── Neural         <- The training of neural network models.
│
├── templates          <- HTML code for model deployment.
│
├── app.py             <- Code for deploying the model.
│
├── Procfile           <- Procfile for Heroku.
│
└── requirements.txt   <- The requirements file for reproducing the analysis environment.

About

NLP sentiment classification for COVID-19 tweets.
