PISPINO (PIpits SPIN-Off tools): Bioinformatics toolkits for processing NGS data

hsgweon/pispino

License: GPL v3

PISPINO (PIpits SPIN-Off tools)

A bioinformatics toolkit for processing NGS data

These tools were originally part of PIPITS (a fungal ITS pipeline) but have been moved here so that they can be used more generally and maintained separately. Linux and macOS only.

Installation

Prerequisite: set up conda channels (only for the first time)

Add the Bioconda channel and the other channels Bioconda depends on. The order matters, so add them exactly as shown (this only needs to be done once):

$ conda config --add channels defaults
$ conda config --add channels conda-forge
$ conda config --add channels bioconda
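The order matters because "conda config --add channels" puts each new channel at the top of the list, so the channel added last ends up with the highest priority. As a rough sketch for checking the result, the helper below (list_channels is a hypothetical name, not part of conda or PISPINO) strips the output of "conda config --show channels" down to the bare channel names, highest priority first. It assumes the usual YAML-style output in which every channel appears on its own line after a leading "- ".

```shell
#!/bin/sh
# Hypothetical helper (not part of conda or PISPINO).
# Reads the output of `conda config --show channels` on stdin and prints
# one channel name per line, highest priority first.
# Assumption: channels are listed as "  - name", one per line.
list_channels() {
    # Keep only lines that start with "- " (after optional indentation)
    # and drop that list marker, leaving the bare channel name.
    sed -n 's/^[[:space:]]*-[[:space:]]*//p'
}
```

After running the three commands above, "conda config --show channels | list_channels" should print bioconda first, then conda-forge, then defaults, provided no other channels were configured beforehand.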

Create a Conda environment with PISPINO

We recommend using a Conda environment so that the tools and their dependencies are installed in a "sandbox" without affecting the rest of your system. It only takes one command. N.B. pispino only supports Python 3, but this should not matter if you use the Conda environment.

Create a Conda environment (here named "myNGSenv", but you can choose any name):

$ conda create -n myNGSenv python=3.6 pispino 

Tool 1: SEQPREP

For preparing (quality filtering, reindexing, joining, merging, etc.) raw data from the Illumina sequencing platform for further processing by PIPITS, QIIME, etc.

Prerequisite

All you need is a directory containing your raw FASTQ sequences (compressed with .gz or .bzip2, or uncompressed).

Usage

Illumina reads are generally provided as demultiplexed FASTQ files where BASESPACE (Illumina software) splits the reads into separate files, one for each barcode.

First, activate the environment you just created:

$ source activate myNGSenv

Create a list file that specifies (1) sample names, (2) file names of forward reads and (3) file names of reverse reads. This can be done with pispino_createreadpairslist, which generates a tab-delimited text file covering all read pairs in the directory containing the raw sequences from the sequencer. Here "rawdata" is the directory with your FASTQ sequences:

$ pispino_createreadpairslist -i rawdata -o readpairslist.txt
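Part of the check on the list file can be scripted. The sketch below is only an illustration: check_readpairs_list is a hypothetical helper, not a PISPINO command. It assumes the layout described above (sample name, forward file, reverse file, separated by tabs), and additionally assumes that blank lines and lines starting with "#" can be ignored. It reports lines that do not have exactly three fields and read files that are missing from the raw data directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of PISPINO): sanity-check a read-pairs list.
# Assumed layout per line: sample<TAB>forward_file<TAB>reverse_file
# Usage: check_readpairs_list LIST_FILE RAW_DIR
# Returns 0 if every line looks fine, 1 otherwise.
check_readpairs_list() {
    list_file=$1
    raw_dir=$2
    status=0
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip blank lines and comment lines (assumption: '#' marks a comment).
        case $line in ''|'#'*) continue ;; esac
        # Count the tab-separated fields on this line.
        nfields=$(printf '%s\n' "$line" | awk -F'\t' '{ print NF }')
        if [ "$nfields" -ne 3 ]; then
            echo "expected 3 fields, found $nfields: $line"
            status=1
            continue
        fi
        sample=$(printf '%s\n' "$line" | cut -f1)
        fwd=$(printf '%s\n' "$line" | cut -f2)
        rev=$(printf '%s\n' "$line" | cut -f3)
        # Both read files should exist inside the raw data directory.
        for f in "$fwd" "$rev"; do
            if [ ! -f "$raw_dir/$f" ]; then
                echo "missing file for sample $sample: $raw_dir/$f"
                status=1
            fi
        done
    done < "$list_file"
    return $status
}
```

With the function defined in the current shell, "check_readpairs_list readpairslist.txt rawdata" prints nothing and returns 0 when every line passes. It does not replace looking at the file yourself, since it cannot tell whether the sample names are the ones you intended.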

Inspect "readpairslist.txt" to check that everything looks right. Once you are happy with it, process the data with the following command (see more options with "pispino_seqprep -h"):

$ pispino_seqprep -i rawdata -o pispino_seqprep_output -l readpairslist.txt

To leave the environment:

$ source deactivate
