Cookiecutter Flexible ML

Anders Poirel

A flexible cookiecutter template for reproducible data science. cookiecutter is a command-line utility that creates projects from cookiecutters (project templates). See cookiecutter.readthedocs.io.

Usage

Requirements:

cookiecutter >= 1.1

Run

cookiecutter gh:Jswig/cookiecutter-flexible-ml

[Enter] will select the default option in prompt.

Generated project structure

Some of these might not be created depending on options picked

.
├── LICENSE
├── environment.yml /  <- File specifying an environment
|   requirements.txt   
├── README.md
├── run.py/Makefile    <- Script for running the full
|    /Snakefile          analysis.
│                        
├── data
|   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling. 
│   └── raw            <- The original, immutable data dump.
├── notebooks          <- Jupyter notebooks.
├── output             <- For outputs of running the analysis.
│   ├── models         <- Saved models and predictions.      
│   └── visualizations <- Graphics created during analysis.       
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
└── src                <- Source code for this project.
    ├── __init__.py    <- Makes this a Python module
    ├── data                 <- Scripts to download or generate data.
    |   └── make_dataset.py  
    ├── features             <- Scripts to turn raw data into features for modeling.
    |   └── build_features.py  
    ├── models               <- Scripts used to generate models and inference results.
    └── visualization        <- Scripts to generate graphics.
        └── visualize.py

Motivation

A project template that promotes good practices for reproducible data science (immutablity of raw data, seperation of exploratory code and "final" analysis code), while giving options for more or less complex projects

A python script is used by default for workflow automation instead of a Makefile (though it is left as an option) to avoid issues for Windows users. snakemake is preferred for "real" projects.
The template does not include boilerplate for generating documentation or python packages, which is not necessary for the intended use case of personal /classroom projects.

This project is meant to adapt (and borrows liberally from) Driven Data's cookicutter-data-science structure and philosophy to slightly different needs.

License

This project is distributed under the MIT License.

Name		Name	Last commit message	Last commit date
Latest commit History 57 Commits
hooks		hooks
{{cookiecutter.repo_name}}		{{cookiecutter.repo_name}}
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
cookiecutter.json		cookiecutter.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Cookiecutter Flexible ML

Usage

Generated project structure

Motivation

License

About

Releases

Packages

Languages

License

Jswig/cookiecutter-flexible-ml

Folders and files

Latest commit

History

Repository files navigation

Cookiecutter Flexible ML

Usage

Generated project structure

Motivation

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages