Social scientists increasingly use large quantities of text-based data to address problems in industry and academia. This course provides students with an overview of popular techniques for collecting, processing, and analyzing text data from a social science perspective. We will first learn how to collect text data from a variety of sources, including application programming interfaces (APIs) and web scraping. The second portion of the class provides an overview of popular methods for analyzing text data, including sentiment analysis, topic models, supervised classification, and word embeddings. The course is applied in nature. While many of the techniques we discuss have their origins in computer science or statistics, this is not a CS or statistics course. Ultimately, the goal is to introduce students to modern techniques for computational text analysis and help them apply these methods to their own research.
Run the code below in R to download this repo onto your machine:
# Install tidyverse and usethis if you have not already done so.
# (use_course() below comes from the usethis package.)
# install.packages(c("tidyverse", "usethis"))
library("usethis")
use_course("https://github.com/rochelleterman/TAD-F22/archive/main.zip")
These materials are still in development and are subject to change.
Rochelle Terman, Assistant Professor in Political Science, University of Chicago
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.