The goal of this project is to analyze vaccine myths on Reddit.
I downloaded the dataset from Kaggle: https://www.kaggle.com/gpreda/reddit-vaccine-myths
- Importing the required libraries and loading the dataset.
- Exploratory Data Analysis and Visualizing different aspects of the dataset.
- Finding number of observations and outliers in the dataset.
- Plotting different attributes of the dataset.
- Text Preprocessing
- Sentiment Analysis
- Topic Modeling
- numpy
- pandas
- matplotlib
- seaborn
- sklearn
- spacy
- textblob
- 2019 has the maximum number of comments.
- April 2019 is the month with the maximum number of comments.
- Around half of the sentiments were positive, with the other half split between neutral and negative.
- The longest post title has a length of 120.
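
The year, month, and title-length findings above could be reproduced along the following lines. This is only a minimal sketch, not the project's actual notebook code: the column names `title` and `timestamp` are assumptions about the dataset, the title length is assumed to be measured in characters, and the four rows are made-up illustrative data standing in for the Kaggle CSV.

```python
import pandas as pd

# Tiny illustrative frame; the real data would be loaded from the Kaggle CSV.
# The column names "title" and "timestamp" are assumptions about the dataset.
df = pd.DataFrame(
    {
        "title": ["Vaccines and myths", "Comment", "Comment", "Another post"],
        "timestamp": [
            "2019-04-02 10:15:00",
            "2019-04-20 08:00:00",
            "2019-11-05 17:30:00",
            "2020-01-12 09:45:00",
        ],
    }
)
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Number of rows per calendar year.
per_year = df["timestamp"].dt.year.value_counts().sort_index()

# Number of rows per (year, month) period.
per_month = df["timestamp"].dt.to_period("M").value_counts()

busiest_year = per_year.idxmax()
busiest_month = per_month.idxmax()

# Longest title, measured in characters (an assumption about the unit).
longest_title = df["title"].str.len().max()

print(busiest_year, busiest_month, longest_title)
```

With the real dataset, the `pd.DataFrame(...)` literal would be replaced by a `pd.read_csv` call on the downloaded file, and the same three lines would give the busiest year, the busiest month, and the maximum title length.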
- Created by @Nirvik07, HRSoc 2022