Reddit Vaccine Myths Analysis

covid-19-vaccine

Goal

The goal of this project is to analyze the vaccine myths on Reddit.

Dataset

I downloaded this dataset from Kaggle: https://www.kaggle.com/gpreda/reddit-vaccine-myths

What Have I Done?

  • Imported the required libraries and the dataset.
  • Performed exploratory data analysis (EDA) and visualized different aspects of the dataset.
  • Counted the observations and checked for outliers in the dataset.
  • Plotted different attributes of the dataset.
  • Text Preprocessing
  • Sentiment Analysis
  • Topic Modeling
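
The sentiment analysis step could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names `label_sentiment`, `sentiment_shares`, and `polarity_of` are hypothetical, and the zero-width neutral band is an assumption. Only `TextBlob(text).sentiment.polarity` comes from textblob, which is listed among the libraries used.

```python
from collections import Counter


def label_sentiment(polarity: float, neutral_band: float = 0.0) -> str:
    """Map a polarity score in [-1, 1] to a coarse label (hypothetical helper)."""
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


def sentiment_shares(polarities):
    """Return the fraction of positive, neutral, and negative scores."""
    labels = [label_sentiment(p) for p in polarities]
    if not labels:
        return {}
    counts = Counter(labels)
    return {
        label: counts[label] / len(labels)
        for label in ("positive", "neutral", "negative")
    }


def polarity_of(text: str) -> float:
    """Score one text with TextBlob; imported lazily so the helpers above
    can be used without textblob installed."""
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity
```

With textblob installed, `sentiment_shares(polarity_of(t) for t in texts)` would give the share of each label for a list of post texts. Widening `neutral_band` treats weakly polar texts as neutral.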

Libraries used:

  1. numpy
  2. pandas
  3. matplotlib
  4. seaborn
  5. sklearn
  6. spacy
  7. textblob

Visualization and EDA of different attributes:

[Five EDA plots; images not included]

Conclusion:

  • 2019 has the maximum number of comments.
  • April 2019 has the maximum number of comments of any month.
  • Around half of the sentiments were positive, with the other half split between neutral and negative.
  • The maximum post title length is 120.
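
Figures like the busiest year, the busiest month, and the maximum title length can be derived with a few counts. The sketch below uses only the standard library and made-up rows. The field names `title` and `timestamp` and the timestamp format are assumptions about the dataset's layout, and `summarize` is a hypothetical helper, so the printed values do not reproduce the results above.

```python
from collections import Counter
from datetime import datetime

# Hypothetical rows for illustration only; not taken from the real dataset.
rows = [
    {"title": "Example post A", "timestamp": "2019-04-02 10:15:00"},
    {"title": "A longer example post title", "timestamp": "2019-04-20 08:00:00"},
    {"title": "Example post C", "timestamp": "2020-01-05 12:30:00"},
]


def summarize(rows):
    """Count rows per year and per (year, month), and find the longest title."""
    stamps = [datetime.strptime(r["timestamp"], "%Y-%m-%d %H:%M:%S") for r in rows]
    per_year = Counter(s.year for s in stamps)
    per_month = Counter((s.year, s.month) for s in stamps)
    return {
        "busiest_year": per_year.most_common(1)[0][0],
        "busiest_month": per_month.most_common(1)[0][0],
        "max_title_length": max(len(r["title"]) for r in rows),
    }


summary = summarize(rows)
print(summary)
```

The same counts map directly onto pandas, for example by parsing the timestamp column with `pd.to_datetime` and calling `value_counts()` on its year.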

Authors