A Python package for optimizing processing pipelines using statistical design of experiments (DoE).

clicumu/doepipeline

doepipeline

Optimize your data processing pipelines with doepipeline. The optimization strategy implemented in doepipeline is based on methods from statistical Design of Experiments (DoE). Use it to optimize quantitative and/or qualitative factors of simple (single tool) or complex (multiple tool) pipelines.

[Figure: doepipeline overview]
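To make the idea of an experimental design over mixed factors concrete, here is a minimal sketch of the simplest DoE design, a two-level full factorial. It is an illustration only: the factor names and levels are hypothetical, the helper `full_factorial` is not part of doepipeline, and the designs doepipeline actually generates may differ.

```python
from itertools import product

# Hypothetical factors for illustration only; not taken from doepipeline.
factors = {
    "kmer_size": [21, 31],          # quantitative factor, low and high level
    "min_coverage": [2, 5],         # quantitative factor, low and high level
    "mode": ["fast", "sensitive"],  # qualitative factor, two categories
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


design = full_factorial(factors)

# Each entry is one experiment: a complete set of pipeline settings to run.
for run_id, settings in enumerate(design, start=1):
    print(run_id, settings)
```

Three factors at two levels each give eight runs. Each run would be executed as one pipeline invocation, and the measured responses would then be used to decide where to look for better settings.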

Features

  • Community developed: Users are welcome to contribute additional functionality.
  • Installation: Easy installation through conda or PyPI.
  • Generic: The optimization is useful for all kinds of CLI applications.
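The reason a DoE approach fits any CLI application is that an experiment reduces to "fill in the settings, run the command, read back a response". The sketch below shows that loop for a single run. It is an assumption-laden illustration, not doepipeline's implementation: `run_with_settings` and the factor names `threads` and `block_size` are hypothetical, and a small Python one-liner stands in for a real command-line tool.

```python
import subprocess
import sys

# Hypothetical command template: placeholders in braces are factor names.
# The Python interpreter stands in for any real command-line tool.
template = [
    sys.executable,
    "-c",
    "import sys; print(int(sys.argv[1]) * int(sys.argv[2]))",
    "{threads}",
    "{block_size}",
]


def run_with_settings(template, settings):
    """Substitute factor settings into the template, run it, return stdout."""
    command = [part.format(**settings) for part in template]
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()


# One experiment: a single combination of factor settings.
response = run_with_settings(template, {"threads": 4, "block_size": 8})
print(response)
```

Passing the command as a list of arguments rather than a single shell string avoids quoting problems when factor values contain spaces.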

Quick start links

Take a look at the wiki documentation to get started with doepipeline. In brief, the following steps are needed:

  1. Install doepipeline
  2. Create a YAML configuration file
  3. Run optimization

Four example cases (including data and configuration files) are provided to help you get started:

  1. de-novo genome assembly
  2. scaffolding of a fragmented genome assembly
  3. k-mer taxonomic classification of ONT MinION reads
  4. genetic variant calling

Cite

Svensson D, Sjögren R, Sundell D, Sjödin A, Trygg J. doepipeline: a systematic approach for optimizing multi-level and multi-step data processing workflows. bioRxiv. doi: https://doi.org/10.1101/504050

About this software

doepipeline is implemented as a Python package. It is open source software made available under the MIT license.

If you experience difficulties with this software, have suggestions, or want to contribute directly, you have the following options:

  • submit a bug report or feature request to the issue tracker
  • contribute directly to the source code through the GitHub repository. Pull requests are especially welcome.