What kind of emotions do Donald Trump's tweets convey?
The aim of this study is to analyze Trump's message strategies and evaluate his aggregated sentiment toward a given topic using sentiment analysis. Sentiment analysis algorithms categorize the opinions expressed in a text by classifying its words into categories. Because tweets are very noisy, the dataset at hand is sparse and high-dimensional, which could reduce the efficiency of the K-means algorithm. To overcome this, relevant features are selected using the term frequency-inverse document frequency (tf-idf) technique, and the dimensionality of the dataset is reduced using principal component analysis (PCA) while retaining the most relevant elements.
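As a rough illustration of the tf-idf weighting and PCA reduction described above, here is a minimal base-R sketch. The four toy documents and all variable names are assumptions made for this example and are not taken from sentiment_analysis.R, which may rely on text-mining packages instead.

```r
# Toy corpus standing in for cleaned tweets (illustrative, not from the dataset)
docs <- c("great jobs great economy",
          "fake news media",
          "jobs economy growing",
          "fake media fake news")

# Tokenize on whitespace and build a document-term count matrix
tokens <- strsplit(docs, " ", fixed = TRUE)
vocab  <- sort(unique(unlist(tokens)))
dtm    <- t(sapply(tokens, function(tk) table(factor(tk, levels = vocab))))

# tf-idf: term frequency within each document times log inverse document frequency
tf    <- dtm / rowSums(dtm)
idf   <- log(nrow(dtm) / colSums(dtm > 0))
tfidf <- sweep(tf, 2, idf, `*`)

# PCA on the tf-idf matrix; keep the first two principal components
pca     <- prcomp(tfidf, center = TRUE, scale. = FALSE)
reduced <- pca$x[, 1:2]
```

Terms that appear in few documents receive a larger idf weight, so they contribute more to the principal components than terms spread across every document.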
To get a local copy up and running, download sentiment_analysis.R and the text input file, donald_tweets.csv. Then run the code in an IDE such as RStudio, with the working directory set to the location of the CSV file.
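A minimal sketch of the setup step, assuming the CSV sits in the current working directory. The path in the commented setwd() call is a placeholder, and the variable names are chosen for this example only.

```r
# Hypothetical path; replace with the folder that holds donald_tweets.csv
# setwd("path/to/project")

csv_file <- "donald_tweets.csv"
if (file.exists(csv_file)) {
  tweets <- read.csv(csv_file, stringsAsFactors = FALSE)
  str(tweets)             # column names and data types
  print(summary(tweets))  # per-column summary statistics
} else {
  message("donald_tweets.csv not found in ", getwd())
}
```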
The code guides you through the following:
- Importing the CSV file
- Inspecting the structure of the variables (data types, number of rows/columns, measures of central tendency, statistical descriptions, etc.)
- Creating a corpus to store the collection of texts
- Pre-processing the text: installing packages, cleanup, transformation, and normalization (removing unique identifiers and irrelevant variables, cleaning out special characters and stopwords, etc.)
- Exploratory analysis, such as sentiment analysis based on the NRC Word-Emotion Association Lexicon from the tidytext package or on the bag-of-words model
- Transforming the corpus into a Document Term Matrix and a Term Document Matrix so that frequent terms can be found easily, and creating a word cloud, box plot, and histogram
- Running the K-means clustering algorithm and using the elbow method to choose the optimal K
- Visualizing the result through a cluster plot and a cross tabulation for comparison
- Evaluating the sentiment score on all tweets
- Generating the top 5 words from each cluster, which suggest the general topics the tweets are about
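The clustering steps in the list above can be sketched in base R roughly as follows. The toy matrix, the choice of k = 2, and the variable names are assumptions for illustration; the project script may structure these steps differently.

```r
set.seed(42)  # k-means starts from random centers; fix the seed for repeatability

# Toy document-term counts standing in for the real matrix (illustrative only)
dtm <- rbind(
  c(3, 2, 0, 0), c(2, 3, 0, 0), c(3, 3, 1, 0),
  c(0, 0, 3, 2), c(0, 1, 2, 3), c(0, 0, 3, 3)
)
colnames(dtm) <- c("jobs", "economy", "fake", "news")

# Elbow method: total within-cluster sum of squares for each candidate k
ks  <- 1:4
wss <- sapply(ks, function(k) kmeans(dtm, centers = k, nstart = 10)$tot.withinss)
plot(ks, wss, type = "b", xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")

# Fit the chosen k (2 here, read off the toy elbow) and tabulate cluster sizes
fit <- kmeans(dtm, centers = 2, nstart = 10)
print(table(fit$cluster))

# Top terms per cluster: highest-weighted terms in each cluster center
top_n <- 2  # the project reports the top 5; this toy vocabulary is too small for that
top_terms <- apply(fit$centers, 1, function(center) {
  names(sort(center, decreasing = TRUE))[seq_len(top_n)]
})
print(top_terms)
```

The elbow is the value of k beyond which the within-cluster sum of squares stops dropping sharply. Reading the largest entries of each cluster center is one simple way to get the "top words" that hint at a cluster's topic.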
The top sentiments Trump expressed were positivity and trust, which allowed him to build a strong and loyal base. He capitalized on these sentiments by wording his tweets as calls to action. His word usage and patterns were consistent: the clusters generated by the analysis overlapped and captured similar words.
Karishma Mathur
Project Link: https://github.com/Mathurkarishma/trump-tweets
- Group Members: Grant Lum, Vanessa Fotso, Brandon Clark
- Dr. Firdu Bati at University of Maryland Global Campus - Fall 2019