Code and data accompanying the Programming Historian tutorial on text reuse with Passim by Romanello & Hengchen.

impresso/PH-passim-tutorial


This repository contains the sample data for the Programming Historian's lesson Detecting Text Reuse with Passim, written by Matteo Romanello and Simon Hengchen (currently in preparation).

Data come from two different sources (see respective READMEs for license statements and further details):

  1. books from EEBO (Early English Books Online) → more info
  2. newspaper articles from impresso → more info

The Jupyter notebook explore-passim-output.ipynb contains an example of how to load passim's JSON output into a pandas DataFrame to compute some statistics.
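As a rough illustration of that loading step, the sketch below reads JSON-lines files into a pandas DataFrame and counts the passages per cluster. It is not taken from the notebook: the helper names `load_passim_output` and `cluster_sizes` are hypothetical, and it assumes that passim's output directory holds files ending in `.json` with one JSON record per line, each carrying a `cluster` field. Adjust the file pattern and column name to match your own output.

```python
import json
from pathlib import Path

import pandas as pd


def load_passim_output(output_dir):
    """Read every JSON-lines file in output_dir into one DataFrame.

    Assumption: each file ends in .json and holds one JSON object per line.
    """
    records = []
    for path in sorted(Path(output_dir).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # skip blank lines
                    records.append(json.loads(line))
    return pd.DataFrame(records)


def cluster_sizes(df, cluster_column="cluster"):
    """Count the passages in each cluster, largest cluster first.

    Assumption: the cluster identifier lives in a column named "cluster".
    """
    return df.groupby(cluster_column).size().sort_values(ascending=False)
```

Calling `cluster_sizes(load_passim_output("path/to/output"))` returns a pandas Series indexed by cluster identifier, which can then be summarised with `.describe()` or plotted.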

To run the notebook and the script eebo/code/main.py, first install the required dependencies into a new virtual environment (created with conda, pyenv, venv, etc.):

pip install -r requirements.txt