GitHub - bilille/miRkwood

SYNOPSIS

miRkwood is an application that allows for the fast and easy identification of microRNAs. It is specifically designed for plant microRNAs.

INSTALL

See file miRkwood_installation.md.

USAGE

miRkwood comes in two distinct pipelines, according to the input data type.

-mirkwood.pl (ab initio pipeline): scans a genomic sequence and finds all potential microRNA precursors.
Input: a FASTA file.

-mirkwood-bed.pl (smallRNAseq pipeline): analyses small RNA deep sequencing data and finds all potential microRNAs.
Input: a BED file.

OPTIONS

-mirkwood.pl: perl -I/{miRkwood_path}/cgi-bin/lib/ mirkwood.pl [options]
Mandatory options:
--input
Path to the fasta file.

--output
Output directory. It is created if it does not exist. If it already
exists, it must be empty.

Additional options:
--both-strands
Scan both strands.

--species-mask
Mask coding regions against the given organism.

--shuffles
Compute thermodynamic stability (shuffled sequences).

--filter-mfei
Select only sequences with MFEI < -0.6.

--filter-rrna
Filter out ribosomal RNAs (using RNAmmer).

--filter-trna
Filter out tRNAs (using tRNAscan-SE).

--align
Flag conserved mature miRNAs (alignment with miRBase + miRdup).

--varna
Allow the structure generation using Varna.

--help
Prints a brief help message and exits.

--man
Prints the manual page and exits.
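
As an illustration of how the options above combine, here is a minimal sketch of an ab initio run. The installation path, input file and output directory names are placeholders, and run_or_print is a hypothetical helper (not part of miRkwood) that only prints the command unless DRY_RUN=0 is set. The sketch assumes that the additional options used here are plain flags, as their descriptions suggest, and that mirkwood.pl is reachable from the current directory.

```shell
#!/bin/sh
# Placeholder locations: adjust to your installation and data.
MIRKWOOD_PATH="${MIRKWOOD_PATH:-/opt/miRkwood}"   # hypothetical install path
INPUT="sequences.fa"                              # hypothetical FASTA file
OUTPUT="abinitio_results"                         # must be empty or not exist yet

# Hypothetical helper: print the command by default, run it when DRY_RUN=0.
run_or_print() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Scan both strands, keep only sequences with MFEI < -0.6,
# and flag conserved mature miRNAs.
run_or_print perl -I"${MIRKWOOD_PATH}/cgi-bin/lib/" mirkwood.pl \
    --input "$INPUT" \
    --output "$OUTPUT" \
    --both-strands \
    --filter-mfei \
    --align
```

Printing the command first makes it easy to check paths and options before starting a long run. Set DRY_RUN=0 to actually execute it.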

-mirkwood-bed.pl: perl -I/{miRkwood_path}/cgi-bin/lib/ mirkwood-bed.pl [options]
Mandatory options:
--input
Path to the BED file (created with our script mirkwood-bam2bed.pl).

--genome
Path to the genome (fasta format).

--output
Output directory. It is created if it does not exist. If it already
exists, it must be empty.

Additional options:
--shuffles
Compute thermodynamic stability (shuffled sequences).

--align
Flag conserved mature miRNAs (alignment with miRBase + miRdup).

--no-filter-mfei
Don't filter out sequences with MFEI >= -0.6. Default: only keep
sequences with MFEI < -0.6.

--mirbase
If you have a gff file containing known miRNAs for this assembly,
use this option to give the path to this file.

--gff
List of annotation files (gff or gff3 format). Reads matching with
an element of these files will be filtered out. For instance you can
filter out CDS by providing a suitable GFF file.

--no-filter-bad-hairpins
By default the candidates with a quality score of 0 and no
conservation are discarded from results and are stored in a BED
file. Use this option to keep all results.

--min-read-positions-nb
Minimum number of positions for each read to be kept. Default: 0.

--max-read-positions-nb
Maximum number of positions for each read to be kept. Default: 5
(reads that map at more than 5 positions are filtered out).

--varna
Allow the structure generation using Varna.

--help
Prints a brief help message and exits.

--man
Prints the manual page and exits.
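
A smallRNAseq run could be sketched as follows. Because --output must point to a missing or empty directory, the sketch first checks that condition with output_dir_is_usable, a hypothetical helper that is not part of miRkwood. All file names and the installation path are placeholders, and passing the value of --max-read-positions-nb as a separate argument is an assumption about the option syntax. The command is printed rather than executed so that it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not part of miRkwood): succeeds when the directory
# is missing or empty, which is what --output requires.
output_dir_is_usable() {
    dir=$1
    [ ! -e "$dir" ] && return 0     # missing: miRkwood will create it
    [ -d "$dir" ] || return 1       # exists but is not a directory
    [ -z "$(ls -A "$dir")" ]        # existing directory: usable only if empty
}

# Placeholder locations: adjust to your installation and data.
MIRKWOOD_PATH="${MIRKWOOD_PATH:-/opt/miRkwood}"   # hypothetical install path
BED="reads.bed"                                   # created with mirkwood-bam2bed.pl
GENOME="genome.fa"                                # genome in FASTA format
OUTPUT="smallrnaseq_results"

if output_dir_is_usable "$OUTPUT"; then
    # Print the command for review; remove "echo" to run it.
    echo perl -I"${MIRKWOOD_PATH}/cgi-bin/lib/" mirkwood-bed.pl \
        --input "$BED" --genome "$GENOME" --output "$OUTPUT" \
        --max-read-positions-nb 5 --align
else
    echo "Output directory '$OUTPUT' exists and is not empty" >&2
fi
```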

OUTPUT

For both pipelines:

alignments: folder containing all alignment files
(only if option --align is on).

images: folder containing images created by VARNA
(only if option --varna is on).

results: folder containing all result files, in several
formats (csv, fa, gff, html and txt).

sequences: folder containing the sequences for each candidate
in fasta and dotbracket format, alternative sequences
if they exist, and the optimal structure if it is different
from the stemloop structure.

YML: folder containing all candidates data in YAML format.

basic_candidates.yml: contains a summary of all candidates
with basic information (this file is needed to create
the result files).

log.log: log file (hey, what did you expect?)

run_options.cfg: config file with the chosen options.
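
To check quickly what a finished run produced, the entries listed above can be tested one by one. The sketch below uses a hypothetical helper, report_mirkwood_output (not part of miRkwood), and assumes that all entries sit directly inside the output directory. The default directory name is a placeholder. An absent alignments or images folder is expected when --align or --varna was not used.

```shell
#!/bin/sh
# Hypothetical helper: report which of the documented entries exist
# directly under the given output directory.
report_mirkwood_output() {
    out_dir=$1
    for entry in alignments images results sequences YML \
                 basic_candidates.yml log.log run_options.cfg; do
        if [ -e "$out_dir/$entry" ]; then
            echo "present: $entry"
        else
            echo "absent:  $entry"
        fi
    done
}

# Placeholder default: pass your own output directory as first argument.
report_mirkwood_output "${1:-abinitio_results}"
```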

ab initio pipeline only:

masks: folder containing results of BlastX, rnammer and tRNAscan-SE.

input_sequences.fas: your sequences.

smallRNAseq pipeline only:

read_clouds: folder containing all text files for the candidates'
read clouds.

bed_sizes.txt: tabulated file with the number of reads in each BED file.

summary.txt: contains a summary of your options and of results.

Depending on the options you chose for your job you may find
some of the following files:

your_bed_your_GFF.tar.gz: a compressed BED containing all reads matching
features from your GFF file (one archive for each GFF file that you
provided).

your_bed_multimapped.tar.gz: a compressed BED containing all reads from your
input BED file mapping at fewer than --min-read-positions-nb positions
or more than --max-read-positions-nb positions.

your_bed_miRNAs.tar.gz: a compressed BED containing all reads from your
input BED file corresponding to miRNAs present in miRBase.

your_bed_orphan_clusters.tar.gz: a compressed BED containing all reads from your
input BED file that fall into a peak but that don't correspond to
a valid miRNA candidate.

your_bed_orphan_hairpins.tar.gz: a compressed BED containing all candidates
with a quality score of 0 and no conservation. By default
these candidates are excluded from final results, but you can
change this behaviour with flag option --no-filter-bad-hairpins.

your_bed_filtered.bed: a BED containing all reads from your
input BED file that have not been filtered out in one of the
previous categories.
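
The files above can be inspected with standard tools. The following sketch lists the members of every .tar.gz archive and counts the lines (one per read, assuming a BED file without header lines) in the remaining filtered BED file. Both helpers are hypothetical, the default directory name is a placeholder, and the location of these files directly inside the output directory is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: list the members of each .tar.gz archive in a directory.
list_filtered_archives() {
    out_dir=$1
    for archive in "$out_dir"/*.tar.gz; do
        [ -e "$archive" ] || continue      # the glob matched nothing
        echo "== $(basename "$archive")"
        tar -tzf "$archive"
    done
}

# Hypothetical helper: count the lines of each *_filtered.bed file.
count_bed_lines() {
    out_dir=$1
    for bed in "$out_dir"/*_filtered.bed; do
        [ -e "$bed" ] || continue
        echo "$(basename "$bed"): $(wc -l < "$bed" | tr -d ' ')"
    done
}

# Placeholder default: pass your own output directory as first argument.
list_filtered_archives "${1:-smallrnaseq_results}"
count_bed_lines "${1:-smallrnaseq_results}"
```

Comparing these counts with bed_sizes.txt is one way to see how many reads each filter removed.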
