SWB-GVCEH

Summary: Statisticians Without Borders sentiment analysis project for GVCEH. The project collects data from Twitter daily using the Twitter API and performs sentiment analysis and relevance classification on the tweets.

Automation: GitHub Actions triggers the scraper.py script every night at 8 pm Pacific Time (03:00 UTC). The scraper uses a 48-hour lookback period. New tweets are deduplicated against historical tweets, which are stored in the SWB-GVCEH/data/processed/twitter/github_actions folder. If the latest file there exceeds 5 MB, a new CSV file is created in that folder and the new tweets are saved to it; otherwise the new tweets are appended to the latest CSV.
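
The dedup-and-rotate step can be illustrated with a short pandas sketch. This is not the project's scraper.py; the tweet_id column and the file-naming scheme below are assumptions for illustration only.

```python
# Illustrative sketch of the dedup-and-rotate step (not the actual scraper.py).
# Assumes pandas, a unique "tweet_id" column, and a placeholder file-naming scheme.
import glob
import os
from datetime import datetime, timezone

import pandas as pd

DATA_DIR = "data/processed/twitter/github_actions"
MAX_BYTES = 5 * 1024 * 1024  # 5 MB rotation threshold


def save_new_tweets(new_tweets: pd.DataFrame) -> None:
    files = sorted(glob.glob(os.path.join(DATA_DIR, "*.csv")))

    if files:
        # Dedup against all historical tweets already stored in the folder.
        history_ids = pd.concat(
            (pd.read_csv(f, usecols=["tweet_id"]) for f in files),
            ignore_index=True,
        )["tweet_id"]
        new_tweets = new_tweets[~new_tweets["tweet_id"].isin(history_ids)]

    if not files or os.path.getsize(files[-1]) > MAX_BYTES:
        # Latest file exceeds 5 MB (or no file exists yet): start a new CSV.
        stamp = datetime.now(timezone.utc).strftime("%Y%m%d")
        new_tweets.to_csv(os.path.join(DATA_DIR, f"tweets_{stamp}.csv"), index=False)
    else:
        # Otherwise append the deduped tweets to the latest CSV.
        new_tweets.to_csv(files[-1], mode="a", header=False, index=False)
```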

Modeling: Two models are used in this project: a sentiment model and a relevance model. The sentiment model required no additional training, while the relevance model was further trained on historical data from this project.
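
As an illustration of the sentiment step, the sketch below runs an off-the-shelf Hugging Face transformers pipeline over tweet text. The checkpoints actually used (and the fine-tuned relevance model) are defined in this repository's model code, so everything named here is a placeholder.

```python
# Illustrative only: an off-the-shelf sentiment pipeline applied to tweet text.
# The project's actual checkpoints and fine-tuned relevance model live in the repo.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")  # pretrained; no extra training needed

tweets = [
    "The new shelter program is making a real difference downtown.",
    "Great coffee this morning, nothing else to report.",
]
for text, result in zip(tweets, sentiment(tweets)):
    print(f"{result['label']} ({result['score']:.3f}): {text}")
```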

Dashboard: The dashboard is built and hosted with Streamlit (https://statisticians-without-borders-gvceh-swb--dashboardmockup-9l8zxx.streamlit.app/). To run the app locally, navigate to the repository root and run: streamlit run ./dashboard/mockup.py
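
For orientation, a minimal Streamlit app along these lines looks like the sketch below. This is not the project's dashboard/mockup.py, and the column names are assumptions.

```python
# Minimal Streamlit sketch (not the project's dashboard/mockup.py).
# Assumes the processed CSVs contain "text" and "sentiment" columns.
import glob

import pandas as pd
import streamlit as st

st.title("GVCEH tweet sentiment")

files = glob.glob("data/processed/twitter/github_actions/*.csv")
df = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)

st.metric("Tweets collected", len(df))
st.bar_chart(df["sentiment"].value_counts())
st.dataframe(df[["text", "sentiment"]].head(50))
```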
