vcfcompare

Compare VCF files using Illumina hap.py and Nextflow

Install Nextflow if it is not already installed

See Nextflow Documentation

Install Docker if it is not already installed

See Docker Documentation

Get the hap.py Docker image from Docker Hub

   docker pull pkrusche/hap.py:v0.3.9
   singularity build hap.py_v0.3.9.img docker://pkrusche/hap.py:v0.3.9  # if you prefer Singularity

Get code

   git clone https://github.com/oxfordfun/vcfcompare
   cd vcfcompare

Run Test

   nextflow run compare.nf -profile docker

Default test parameters (override them on the Nextflow command line)

--input tests/input (folder of VCF files to compare against the truth VCF)
--ref tests/ref/NC_000962.fasta (reference used to generate the VCFs)
--refindex tests/ref/NC_000962.fasta.fai (reference index created by samtools faidx)
--refvcf tests/ref/snps.vcf (truth VCF to compare against)
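
To run the comparison on your own data, the four parameters above can be passed explicitly. The sketch below only prints the command so it can be checked before running. `compare_cmd` is a hypothetical helper name, and it assumes the index sits next to the reference as `<ref>.fai`, which is where `samtools faidx` writes it.

```shell
# Hypothetical helper: print the nextflow command for a given input folder,
# reference FASTA and truth VCF. Assumes the index is <ref>.fai.
compare_cmd() {
    input_dir=$1
    ref=$2
    truth_vcf=$3
    echo "nextflow run compare.nf -profile docker" \
         "--input ${input_dir} --ref ${ref}" \
         "--refindex ${ref}.fai --refvcf ${truth_vcf}"
}

# Print the command using the bundled test defaults
compare_cmd tests/input tests/ref/NC_000962.fasta tests/ref/snps.vcf
```

Paths containing spaces would need extra quoting; the helper is meant for inspection, not as a robust wrapper.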

Test dataset (one truth VCF and three test VCF files)

tests/ref/snps.vcf (generated by [snippy](https://github.com/tseemann/snippy) and used as the truth VCF)
tests/input/snps-test-0.vcf (identical to snps.vcf)
tests/input/snps-test-1.vcf (one mutation introduced)
tests/input/snps-test-2.vcf (two mutations introduced)
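
One way to check how many records differ between the truth VCF and a test VCF is to compare their non-header lines. This is a rough sketch and not part of the pipeline: `changed_records` is a hypothetical helper, and it compares whole lines, so any difference within a record counts, not only a changed allele.

```shell
# Hypothetical helper: count records in the second VCF that do not appear
# verbatim in the first. Header lines (starting with '#') are ignored.
changed_records() {
    awk '
        /^#/          { next }                 # skip VCF header lines
        NR == FNR     { seen[$0] = 1; next }   # first file: remember each record
        !($0 in seen) { n++ }                  # second file: count unseen records
        END           { print n + 0 }
    ' "$1" "$2"
}

# Usage, from the repository root:
#   changed_records tests/ref/snps.vcf tests/input/snps-test-1.vcf
```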

Test summary

snps.vcf vs snps-test-0.vcf

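| Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | QUERY.UNK | FP.gt | METRIC.Recall | METRIC.Precision | METRIC.Frac_NA | METRIC.F1_Score | TRUTH.TOTAL.TiTv_ratio | QUERY.TOTAL.TiTv_ratio | TRUTH.TOTAL.het_hom_ratio | QUERY.TOTAL.het_hom_ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| INDEL | ALL | 120 | 120 | 0 | 120 | 0 | 0 | 0 | 1.0 | 1.0 | 0.0 | 1.0 | | | 0.0 | 0.0 |
| INDEL | PASS | 120 | 120 | 0 | 120 | 0 | 0 | 0 | 1.0 | 1.0 | 0.0 | 1.0 | | | 0.0 | 0.0 |
| SNP | ALL | 1356 | 1356 | 0 | 1356 | 0 | 0 | 0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.66404715128 | 1.66404715128 | 0.0 | 0.0 |
| SNP | PASS | 1356 | 1356 | 0 | 1356 | 0 | 0 | 0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.66404715128 | 1.66404715128 | 0.0 | 0.0 |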

snps.vcf vs snps-test-1.vcf

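| Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | QUERY.UNK | FP.gt | METRIC.Recall | METRIC.Precision | METRIC.Frac_NA | METRIC.F1_Score | TRUTH.TOTAL.TiTv_ratio | QUERY.TOTAL.TiTv_ratio | TRUTH.TOTAL.het_hom_ratio | QUERY.TOTAL.het_hom_ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| INDEL | ALL | 120 | 120 | 0 | 120 | 0 | 0 | 0 | 1.0 | 1.0 | 0.0 | 1.0 | | | 0.0 | 0.0 |
| INDEL | PASS | 120 | 120 | 0 | 120 | 0 | 0 | 0 | 1.0 | 1.0 | 0.0 | 1.0 | | | 0.0 | 0.0 |
| SNP | ALL | 1356 | 1355 | 1 | 1356 | 1 | 0 | 0 | 0.999263 | 0.999263 | 0.0 | 0.999263 | 1.66404715128 | 1.66404715128 | 0.0 | 0.0 |
| SNP | PASS | 1356 | 1355 | 1 | 1356 | 1 | 0 | 0 | 0.999263 | 0.999263 | 0.0 | 0.999263 | 1.66404715128 | 1.66404715128 | 0.0 | 0.0 |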
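
The metric columns follow from the counts. As a worked check of the SNP rows above, the sketch below recomputes recall, precision and F1 with `awk`. It assumes the usual definitions, recall = TP / (TP + FN), precision = TP / (TP + FP) and F1 = 2PR / (P + R), and it approximates the query-side TP as QUERY.TOTAL - QUERY.FP - QUERY.UNK because the summary does not list that value. `happy_metrics` is a hypothetical helper name, not a hap.py command.

```shell
# Hypothetical helper: recompute recall, precision and F1 from summary counts.
# Arguments: TRUTH.TP TRUTH.FN QUERY.TOTAL QUERY.FP QUERY.UNK
happy_metrics() {
    awk -v tp="$1" -v fneg="$2" -v qtotal="$3" -v fpos="$4" -v unk="$5" 'BEGIN {
        recall = tp / (tp + fneg)
        qtp = qtotal - fpos - unk          # assumed query-side true positives
        precision = qtp / (qtp + fpos)
        f1 = 2 * precision * recall / (precision + recall)
        printf "%.6f %.6f %.6f\n", recall, precision, f1
    }'
}

# SNP row for snps-test-1.vcf: prints 0.999263 0.999263 0.999263
happy_metrics 1355 1 1356 1 0
```

With one false negative and one false positive out of 1356 SNPs, all three values reduce to 1355/1356, which matches the table.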

snps.vcf vs snps-test-2.vcf

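| Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | QUERY.UNK | FP.gt | METRIC.Recall | METRIC.Precision | METRIC.Frac_NA | METRIC.F1_Score | TRUTH.TOTAL.TiTv_ratio | QUERY.TOTAL.TiTv_ratio | TRUTH.TOTAL.het_hom_ratio | QUERY.TOTAL.het_hom_ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| INDEL | ALL | 120 | 120 | 0 | 120 | 0 | 0 | 0 | 1.0 | 1.0 | 0.0 | 1.0 | | | 0.0 | 0.0 |
| INDEL | PASS | 120 | 120 | 0 | 120 | 0 | 0 | 0 | 1.0 | 1.0 | 0.0 | 1.0 | | | 0.0 | 0.0 |
| SNP | ALL | 1356 | 1354 | 2 | 1356 | 2 | 0 | 0 | 0.998525 | 0.998525 | 0.0 | 0.998525 | 1.66404715128 | 1.65882352941 | 0.0 | 0.0 |
| SNP | PASS | 1356 | 1354 | 2 | 1356 | 2 | 0 | 0 | 0.998525 | 0.998525 | 0.0 | 0.998525 | 1.66404715128 | 1.65882352941 | 0.0 | 0.0 |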
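
In this last comparison the query Ti/Tv ratio differs from the truth ratio. Assuming every SNP is counted as either a transition or a transversion, so that Ti + Tv equals the SNP total, the underlying counts can be recovered from the ratio as Tv = total / (1 + ratio). The sketch below does that arithmetic; `titv_counts` is a hypothetical helper and not part of hap.py.

```shell
# Hypothetical helper: recover transition and transversion counts from a
# SNP total and a Ti/Tv ratio, assuming Ti + Tv = total.
# Arguments: SNP_TOTAL TITV_RATIO
titv_counts() {
    awk -v total="$1" -v ratio="$2" 'BEGIN {
        tv = total / (1 + ratio)
        printf "%.0f %.0f\n", total - tv, tv   # prints "Ti Tv"
    }'
}

titv_counts 1356 1.66404715128   # truth: prints 847 509
titv_counts 1356 1.65882352941   # query: prints 846 510
```

Under that assumption, the two edits in snps-test-2.vcf amount to a net change of one transition into one transversion.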