-
Notifications
You must be signed in to change notification settings - Fork 14
Home
The Trans-ABySS package has 2 main applications:
- transabyss - RNAseq assembler at a single k-mer size
- transabyss-merge - merge multiple assemblies from (1)
PART 1. Assembly with transabyss
- Usage
- Running transabyss
- Input reads
- Output assembly path
- k-mer size
- MPI and multi-threading
- Strand-specific assembly
PART 2. Merging assemblies with transabyss-merge
- Usage
- Running transabyss-merge
- Minimum and maximum k-mer sizes
- Output merged assembly path
- Output sequence prefixes
Please use our Google Group [email protected] for discussions and support. Existing topics can be viewed at https://groups.google.com/d/forum/trans-abyss.
You may also create issues on our GitHub repository at https://github.com/bcgsc/transabyss/issues.
transabyss --help
transabyss --se reads.fq
See below for assembling single-end reads and/or pair-end reads.
Supported formats are compressed (gzip/bzip) FASTQ/FASTA/SAM/QSEQ or BAM files.
Paired-end reads in FASTQ/A can be interleaved or in separate files.
Paired-end reads in FASTQ/A must have the suffixes /1
or /2
in the read name.
Use options --se
and --pe
to specify the path(s) of single-end reads and paired-end reads, respectively. Example usages:
combination | reads for sequence content | reads for paired-end linkage | assembly |
---|---|---|---|
--se r.fq |
r.fq | none | single-end |
--pe r.fq |
r.fq | r.fq | paired-end |
--se SE.fq --pe PE.fq |
SE.fq | PE.fq | paired-end |
--se SE.fq PE.fq --pe PE.fq |
SE.fq, PE.fq | PE.fq | paired-end |
default: ./transabyss_2.0.0_assembly/transabyss-final.fa
Use options --name
and --outdir
to change the output directory and assembly name.
default: 32
Use option --kmer
to adjust the k-mer size.
k=32 has a good trade-off for assembling both rare and common transcripts.
Using larger k-mers improve the assembly quality of common transcripts and transcripts with repetitive regions, but the assembly of rare transcripts may suffer.
default: no mpi processes; singe-threaded
Use option --threads
to specify the number of threads.
Use option --mpi
to specify the number of MPI processes.
Only the first stage of assembly (de Bruijn graph) could benefit from MPI; all remaining stages may be multi-threaded.
Use option --SS
to indicate that input reads are strand-specific.
Strand-specific reads are expected to have /1
reads in reverse direction and /2
reads in forward direction, ie.
<-- R/1
5' =======================> 3' transcript
F/2 -->
Should you choose to assemble the same dataset with different settings (different k-mer sizes), you can merge the assemblies together into one FASTA file for downstream analyses (with transsabyss-analyze
). When a sequence is contained in a longer sequence, the longer sequence is kept.
transabyss-merge --help
Example:
transabyss-merge a.fa b.fa --mink 32 --maxk 64
--mink
and --maxk
are used to specific the smallest and largest k-mer sizes in the input assemblies.
default: ./transabyss-merged.fa
Use option --out
to specify the output path.
Use option --prefixes
to specify the prefixes of output sequences.
Example:
transabyss-merge a.fa b.fa c.fa --mink 32 --maxk 64 --prefix k32. k48. k64.
file | prefix |
---|---|
a.fa | k32. |
b.fa | k48. |
c.fa | k64. |
One prefix for sequences from each input assembly FASTA file. This feature helps you keep track of the origin of each seqeunce in the merged assembly.