patrickmineault/zipf

Zipf's law project

This is an example project for the Good Research Code Handbook. It's a reinterpretation of the Zipf's law project from Research Software Engineering in Python. It reuses and modifies some of the code from the original project, which was licensed under a CC-BY license. For this reason, this repo is under a CC-BY 4.0 license.

Installation

Make a copy of this repo (e.g. with git clone), cd into the root folder of the repo, and run:

pip install -e .

Organization

The project is organized into folders:

  • zipf contains the main module code that runs the analysis
  • scripts contains scripts that glue the module code together
  • tests contains tests of the module code
  • data contains the data for the analysis
  • results will contain the output of the analysis

Running the analysis

cd into the scripts folder and run run_analysis.py via:

python run_analysis.py --in_folder ../data --out_folder ../results

You can then open visualize_results.ipynb in Jupyter to visualize the results.
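To make the shape of such an analysis concrete, here is a minimal sketch of a script that accepts the same `--in_folder` and `--out_folder` flags. It is not the repo's actual `run_analysis.py`: the helper names `count_words` and `analyze_folder`, the `*.txt` input pattern, and the one-CSV-per-document output layout are all assumptions made for illustration.

```python
import argparse
import csv
import re
from collections import Counter
from pathlib import Path


def count_words(text):
    """Return a Counter of lowercase word frequencies (hypothetical helper)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def analyze_folder(in_folder, out_folder):
    """Write one word-count CSV per .txt file found in in_folder.

    The .txt pattern and the CSV layout are assumptions, not the repo's format.
    """
    in_path, out_path = Path(in_folder), Path(out_folder)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for txt in sorted(in_path.glob("*.txt")):
        counts = count_words(txt.read_text(encoding="utf-8"))
        target = out_path / f"{txt.stem}.csv"
        with target.open("w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["word", "count"])
            # most_common() yields (word, count) pairs, most frequent first
            writer.writerows(counts.most_common())
        written.append(target)
    return written


def main(argv=None):
    """Parse the two folder flags and run the analysis."""
    parser = argparse.ArgumentParser(description="Count words per document.")
    parser.add_argument("--in_folder", required=True)
    parser.add_argument("--out_folder", required=True)
    args = parser.parse_args(argv)
    analyze_folder(args.in_folder, args.out_folder)
```

In a real script, `main()` would be called under the usual `if __name__ == "__main__":` guard. Keeping the counting logic in plain functions, separate from argument parsing, is what lets the tests import it directly.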

Running tests

cd into the tests folder and run pytest.
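For readers new to pytest, a test file is a module whose `test_*` functions contain plain `assert` statements. The sketch below shows the pattern. The `count_words` function is a stand-in defined inline so the example is self-contained. It is not the API of the `zipf` module, and in a real test you would import the function under test instead.

```python
from collections import Counter


def count_words(text):
    """Stand-in for a function under test; not the repo's actual API."""
    return Counter(text.lower().split())


def test_counts_are_case_insensitive():
    # All three spellings should be folded into the same key
    assert count_words("The the THE")["the"] == 3


def test_empty_text_has_no_words():
    # An empty document should produce an empty counter
    assert count_words("") == Counter()
```

Saved as, say, `test_example.py` (a hypothetical file name) in the tests folder, this would be collected and run by `pytest` automatically.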

Adding data sources

I've pre-populated the data folder with books from Project Gutenberg. You can add more documents to the folder as you wish.
