NAME
coverm contig - Calculate read coverage per-contig (version 0.7.0)
SYNOPSIS
coverm contig <MAPPING_INPUT> ..
DESCRIPTION
coverm contig calculates the coverage of a set of reads on a set of contigs.
This process can be undertaken in several ways, for instance by specifying BAM files or raw reads as input, using different mapping programs, thresholding read alignments, using different methods of calculating coverage and printing the calculated coverage in various formats.
The source code for CoverM is available at https://github.com/wwood/CoverM
READ MAPPING PARAMETERS
- -1 PATH ..
Forward FASTA/Q file(s) for mapping. These may be gzipped or not.
- -2 PATH ..
Reverse FASTA/Q file(s) for mapping. These may be gzipped or not.
- -c, --coupled PATH ..
One or more pairs of forward and reverse possibly gzipped FASTA/Q files for mapping in order <sample1_R1.fq.gz> <sample1_R2.fq.gz> <sample2_R1.fq.gz> <sample2_R2.fq.gz> ..
- --interleaved PATH ..
Interleaved FASTA/Q file(s) for mapping. These may be gzipped or not.
- --single PATH ..
Unpaired FASTA/Q file(s) for mapping. These may be gzipped or not.
- -b, --bam-files PATH
Path to BAM file(s). These must be reference sorted (e.g. with samtools sort) unless --sharded is specified, in which case they must be read name sorted (e.g. with samtools sort -n). When specified, no read mapping algorithm is undertaken.
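The ordering expected by --coupled can be pictured as a flat list consumed two files at a time. The following is a minimal Python sketch of that grouping; the helper name pair_coupled is hypothetical and not part of CoverM, and the file names are the placeholders used above.

```python
def pair_coupled(paths):
    """Group a flat --coupled style list into (forward, reverse) tuples.

    Hypothetical helper for illustration only; not CoverM code.
    """
    if len(paths) % 2 != 0:
        raise ValueError("a coupled list needs an even number of files")
    # Files at even indices are forward reads, the following file is the reverse.
    return [(paths[i], paths[i + 1]) for i in range(0, len(paths), 2)]


files = [
    "sample1_R1.fq.gz", "sample1_R2.fq.gz",
    "sample2_R1.fq.gz", "sample2_R2.fq.gz",
]
pairs = pair_coupled(files)
for forward, reverse in pairs:
    print(forward, reverse)
```

An odd number of files cannot be split into pairs, so the sketch rejects it rather than guessing which file is unpaired.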
REFERENCE
- -r, --reference PATH
FASTA file of contigs e.g. concatenated genomes or metagenome assembly, or minimap2 index (with --minimap2-reference-is-index), strobealign index (with --strobealign-use-index), or BWA index stem (with -p bwa-mem/bwa-mem2). If multiple reference FASTA files are provided and --sharded is specified, then reads will be mapped to references separately as sharded BAMs. [required unless -b/--bam-files is specified]
MAPPING ALGORITHM OPTIONS
-p, --mapper NAME
name | description |
---|---|
minimap2-sr | minimap2 with '-x sr' option |
bwa-mem | bwa mem using default parameters |
bwa-mem2 | bwa-mem2 using default parameters |
minimap2-ont | minimap2 with '-x map-ont' option |
minimap2-pb | minimap2 with '-x map-pb' option |
minimap2-hifi | minimap2 with '-x map-hifi' option |
minimap2-no-preset | minimap2 with no '-x' option |
- --minimap2-params PARAMS
Extra parameters to provide to minimap2, for both the indexing command (if used) and mapping. Note that usage of this parameter has security implications if untrusted input is specified. '-a' is always specified to minimap2. [default: none]
- --minimap2-reference-is-index
Treat reference as a minimap2 database, not as a FASTA file. [default: not set]
- --bwa-params PARAMS
Extra parameters to provide to BWA or BWA-MEM2. Note that usage of this parameter has security implications if untrusted input is specified. [default: none]
- --strobealign-params PARAMS
Extra parameters to provide to strobealign. Note that usage of this parameter has security implications if untrusted input is specified. [default: none]
- --strobealign-use-index
Use a pregenerated index (one that has been created with 'strobealign --create-index'). The --reference option should be specified as the original FASTA file i.e. 'ref.fna' not 'ref.fna.r100.sti' [default: not set]
ALIGNMENT THRESHOLDING
- --min-read-aligned-length INT
Exclude reads with smaller numbers of aligned bases. [default: 0]
- --min-read-percent-identity FLOAT
Exclude reads by overall percent identity e.g. 95 for 95%. [default: 0]
- --min-read-aligned-percent FLOAT
Exclude reads by percent aligned bases e.g. 95 means 95% of the read's bases must be aligned. [default: 0]
- --min-read-aligned-length-pair INT
Exclude pairs with smaller numbers of aligned bases. Implies --proper-pairs-only. [default: 0]
- --min-read-percent-identity-pair FLOAT
Exclude pairs by overall percent identity e.g. 95 for 95%. Implies --proper-pairs-only. [default: 0]
- --min-read-aligned-percent-pair FLOAT
Exclude pairs by percent aligned bases e.g. 95 means 95% of the pair's bases must be aligned. Implies --proper-pairs-only. [default: 0]
- --proper-pairs-only
Require reads to be mapped as proper pairs. [default: not set]
- --exclude-supplementary
Exclude supplementary alignments. [default: not set]
- --include-secondary
Include secondary alignments. [default: not set]
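The three read-level thresholds above can be thought of as independent filters applied to each alignment. The Python sketch below illustrates that idea only. The Alignment record and passes_thresholds function are hypothetical names, and the identity formula (aligned bases minus mismatches, divided by aligned bases) is an assumption, since this page does not state exactly how CoverM computes percent identity.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Hypothetical per-read summary; not a CoverM data structure."""
    read_length: int    # total bases in the read (assumed > 0)
    aligned_bases: int  # bases of the read that take part in the alignment
    mismatches: int     # edit distance over the aligned bases (assumption)


def passes_thresholds(aln, min_aligned_length=0,
                      min_percent_identity=0.0, min_aligned_percent=0.0):
    """Return True if the alignment survives all three read-level filters."""
    if aln.aligned_bases < min_aligned_length:
        return False
    # Assumed definition of percent identity; see the lead-in above.
    if aln.aligned_bases:
        identity = 100.0 * (aln.aligned_bases - aln.mismatches) / aln.aligned_bases
    else:
        identity = 0.0
    aligned_percent = 100.0 * aln.aligned_bases / aln.read_length
    return (identity >= min_percent_identity
            and aligned_percent >= min_aligned_percent)


good = Alignment(read_length=150, aligned_bases=148, mismatches=3)
clipped = Alignment(read_length=150, aligned_bases=80, mismatches=1)
print(passes_thresholds(good, min_percent_identity=95, min_aligned_percent=90))
print(passes_thresholds(clipped, min_aligned_percent=90))
```

With all thresholds left at 0, as in the defaults listed above, every alignment passes.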
COVERAGE CALCULATION OPTIONS
-m, --methods METHOD
method | description |
---|---|
mean | (default) Average number of aligned reads overlapping each position on the contig |
trimmed_mean | Average number of aligned reads overlapping each position after removing the most deeply and shallowly covered positions. See --trim-min/--trim-max to adjust. |
coverage_histogram | Histogram of coverage depths |
covered_bases | Number of bases covered by 1 or more reads |
variance | Variance of coverage depths |
length | Length of each contig in base pairs |
count | Number of reads aligned to each contig. Note that supplementary alignments are not counted. |
metabat | ("MetaBAT adjusted coverage") Coverage as defined in Kang et al 2015 https://doi.org/10.7717/peerj.1165 |
reads_per_base | Number of reads aligned divided by the length of the contig |
rpkm | Reads mapped per kilobase of contig, per million mapped reads |
tpm | Transcripts Per Million as described in Li et al 2010 https://doi.org/10.1093/bioinformatics/btp692 |
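
Several of the methods in the table follow directly from their one-line descriptions. The Python sketch below works through mean, covered_bases, reads_per_base, rpkm and tpm on toy numbers. The function names and inputs are hypothetical, and the sketch assumes that "million mapped reads" means all reads mapped in the sample; it is not CoverM's implementation.

```python
def mean_coverage(depths):
    """Average depth over all positions of one contig."""
    return sum(depths) / len(depths)


def covered_bases(depths):
    """Number of positions covered by 1 or more reads."""
    return sum(1 for d in depths if d >= 1)


def reads_per_base(read_count, contig_length):
    """Reads aligned divided by contig length."""
    return read_count / contig_length


def rpkm(read_count, contig_length, total_mapped_reads):
    """Reads per kilobase of contig, per million mapped reads (assumed total)."""
    return read_count / (contig_length / 1_000) / (total_mapped_reads / 1_000_000)


def tpm(read_counts, contig_lengths):
    """Length-normalised read rates rescaled so that they sum to one million."""
    rates = [c / l for c, l in zip(read_counts, contig_lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    return [r / total * 1_000_000 for r in rates]


depths = [0, 2, 2, 4]          # toy per-position depths for one contig
counts = [100, 300]            # toy read counts for two contigs
lengths = [1_000, 3_000]       # toy contig lengths in base pairs

print(mean_coverage(depths), covered_bases(depths))
print(reads_per_base(counts[0], lengths[0]))
print(rpkm(counts[0], lengths[0], sum(counts)))
print(tpm(counts, lengths))
```

In the toy data both contigs have the same number of reads per base, so they receive equal TPM values even though their raw counts differ.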
- --min-covered-fraction FRACTION
Contigs with a smaller fraction of bases covered than this are reported as having zero coverage. [default: 0]
- --contig-end-exclusion INT
Exclude bases at the ends of reference sequences from calculation [default: 75]
- --trim-min FRACTION
Remove this smallest fraction of positions when calculating trimmed_mean [default: 5]
- --trim-max FRACTION
Maximum fraction for trimmed_mean calculations [default: 95]
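One way to picture how --contig-end-exclusion, --trim-min, --trim-max and --min-covered-fraction interact is the Python sketch below. It is an illustration under stated assumptions rather than CoverM's code: the function names are hypothetical, trim values are taken as percentages, cut points are rounded down, and a contig too short to survive end exclusion is given a coverage of 0.

```python
def trimmed_mean(depths, trim_min=5, trim_max=95, contig_end_exclusion=75):
    """Mean depth after dropping contig ends and the extreme positions.

    Sketch only: rounding and short-contig handling are assumptions.
    """
    # Ignore positions close to either end of the contig.
    inner = depths[contig_end_exclusion:len(depths) - contig_end_exclusion]
    if not inner:
        return 0.0
    ordered = sorted(inner)
    n = len(ordered)
    low = int(n * trim_min / 100)    # positions below this rank are removed
    high = int(n * trim_max / 100)   # positions at or above this rank are removed
    kept = ordered[low:high]
    return sum(kept) / len(kept) if kept else 0.0


def apply_min_covered_fraction(coverage, depths, min_covered_fraction=0.0):
    """Report zero coverage when too small a fraction of bases is covered."""
    covered = sum(1 for d in depths if d >= 1) / len(depths)
    return coverage if covered >= min_covered_fraction else 0.0


# 80 evenly covered positions flanked by deeply covered contig ends.
depths = [1000] * 10 + [3] * 80 + [1000] * 10
print(trimmed_mean(depths, contig_end_exclusion=10))

# Only 2 of 10 positions are covered, so a 0.5 threshold zeroes the result.
sparse_depths = [5, 5, 0, 0, 0, 0, 0, 0, 0, 0]
print(apply_min_covered_fraction(1.0, sparse_depths, min_covered_fraction=0.5))
```

Trimming both tails makes the estimate less sensitive to a few positions with unusually high or low depth.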
OUTPUT
- -o, --output-file FILE
Output coverage values to this file, or '-' for STDOUT. [default: output to STDOUT]
- --output-format FORMAT
Shape of output: 'sparse' for long format, 'dense' for species-by-site. [default: dense]
- --no-zeros
Omit printing of contigs that have zero coverage. [default: not set]
- --bam-file-cache-directory DIRECTORY
Output BAM files generated during alignment to this directory. The directory may or may not exist. Note that BAM files in this directory contain all mappings, including those that later are excluded by alignment thresholding (e.g. --min-read-percent-identity) or genome-wise thresholding (e.g. --min-covered-fraction). [default: not used]
- --discard-unmapped
Exclude unmapped reads from cached BAM files. [default: not set]
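The difference between a long ('sparse') and a wide ('dense') layout can be sketched generically in Python. The record fields, sample names and values below are hypothetical and are not CoverM's actual column names or output; the sketch only shows how one-row-per-observation records pivot into one row per contig.

```python
from collections import defaultdict


def long_to_wide(records):
    """Pivot (sample, contig, value) records into {contig: {sample: value}}."""
    wide = defaultdict(dict)
    for sample, contig, value in records:
        wide[contig][sample] = value
    return dict(wide)


# Long layout: one record per sample and contig combination (made-up values).
long_records = [
    ("sampleA", "contig1", 12.5),
    ("sampleA", "contig2", 0.0),
    ("sampleB", "contig1", 3.0),
    ("sampleB", "contig2", 7.25),
]

wide = long_to_wide(long_records)
samples = ["sampleA", "sampleB"]
print("\t".join(["contig"] + samples))
for contig, values in wide.items():
    print("\t".join([contig] + [str(values[s]) for s in samples]))
```

The long layout grows by one row per sample and contig, while the wide layout keeps one row per contig and adds a column per sample.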
GENERAL OPTIONS
- -t, --threads INT
Number of threads for mapping, sorting and reading. [default: 1]
- -h, --help
Output a short usage message. [default: not set]
- --full-help
Output a full help message and display in 'man'. [default: not set]
- --full-help-roff
Output a full help message in raw ROFF format for conversion to other formats. [default: not set]
- -v, --verbose
Print extra debugging information. [default: not set]
- -q, --quiet
Unless there is an error, do not print log messages. [default: not set]
FREQUENTLY ASKED QUESTIONS (FAQ)
Can the temporary directory used be changed? CoverM makes use of the system temporary directory (often /tmp) to store intermediate files. This can cause problems if the amount of storage available there is small or used by many programs. To fix, set the TMPDIR environment variable e.g. to set it to use the current directory: TMPDIR=. coverm genome <etc>
For thresholding arguments e.g. --dereplication-ani and --min-read-percent-identity, should a percentage (e.g. 97%) or fraction (e.g. 0.97) be specified? Either is fine; CoverM determines which is being used by whether the value is less than or greater than 1.
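The percentage-or-fraction rule described in this answer amounts to a simple comparison against 1. A minimal Python sketch follows; the function name as_fraction is hypothetical, and treating a value of exactly 1 as a fraction (100%) is an assumption, since the text above only covers values less than or greater than 1.

```python
def as_fraction(value):
    """Interpret a threshold given either as a fraction or as a percentage.

    Hypothetical helper; the handling of exactly 1 is an assumption.
    """
    if value < 0 or value > 100:
        raise ValueError("threshold must lie between 0 and 100")
    # Values up to 1 are taken as fractions, larger values as percentages.
    return value if value <= 1 else value / 100


print(as_fraction(0.97))
print(as_fraction(97))
```

One consequence of such a rule is that a percentage below 1% cannot be expressed as a percentage and would need to be written as a fraction instead.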
EXIT STATUS
- 0
Successful program execution.
- 1
Unsuccessful program execution.
- 101
The program panicked.
EXAMPLES
- Calculate mean coverage from reads and assembly
$ coverm contig --coupled read1.fastq.gz read2.fastq.gz --reference assembly.fna
- Calculate MetaBAT adjusted coverage from a sorted BAM file, saving the unfiltered BAM files in the saved_bam_files folder
$ coverm contig --methods metabat --bam-files my.bam --bam-file-cache-directory saved_bam_files