Tool for analysing reports for Antimicrobial resistance and virulence genes, plasmid types and multilocus sequence typing data

hkaspersen/VAMPIR

VAMPIR - Virulence, Amr, Mlst and Plasmid analysis In R

This script takes reports from the software ARIBA (https://github.com/sanger-pathogens/ariba) and creates summary reports, basic statistics reports and visualizations based on user-defined settings.

Author: Håkon Kaspersen, Norwegian Veterinary Institute

Dependencies

Non-CRAN packages:

CRAN packages: dplyr, tidyr, tibble, optparse, purrr, reprex, stringr, knitr, ape

Bioconductor packages: ggtree

If you are not running on the SAGA cluster, you need to change the paths to the executable scripts in the VAMPIR.R file.

Usage

Make sure that your .Rprofile file has the correct library path before you begin. The .Rprofile file is located in your home folder, which you reach by typing cd. Open the file with nano .Rprofile and paste in the following:

path <- "/cluster/projects/nn9305k/lib/R/"

.libPaths(c(path, .libPaths()))

Load R and run the script:

module load R/4.0.0-foss-2020a

Rscript VAMPIR.R [options] -o output_folder
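
As an illustration, the sketch below assembles two possible calls from the options listed in the help screen. The directory names (ariba/megares_reports, ariba/resfinder_reports, vampir_out) are placeholders, not paths from this repository, and depending on how your ARIBA reports are named you may also need -f. The commands are only printed, so they can be checked before running.

```shell
# Placeholder locations (assumptions); replace with your own directories.
mut_dir="ariba/megares_reports"
acq_dir="ariba/resfinder_reports"
out_dir="vampir_out"

# Intrinsic track: gyr and par genes, mutations limited to the QRDR (-q).
intrinsic_cmd="Rscript VAMPIR.R -u $mut_dir -i gyr,par -q -o $out_dir"

# Acquired track: blaTEM plus every gene whose name contains "qnr".
acquired_cmd="Rscript VAMPIR.R -a $acq_dir -c blaTEM,qnr -o $out_dir"

# Print the commands for review; paste a line into the terminal to run it.
printf '%s\n' "$intrinsic_cmd" "$acquired_cmd"
```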

For help:

Rscript VAMPIR.R -h

Help screen:

Usage: VAMPIR.R [options] -o output_folder


Options:
        -h, --help
                Show this help message and exit

        -u MUT, --mut=MUT
                Directory of megaRes reports.

        -a ACQ, --acq=ACQ
                Directory of resFinder reports.

        -i INTRINSIC, --intrinsic=INTRINSIC
                List of intrinsic genes of interest, used with -u.
                Type 'all' for including all reported genes.
                Can partially match gene names, f. ex. 'gyr' will match all gyr genes identified.
                Example: -i gyr,par,mar
                
        -q, --gyrfix
                Add to filter the reported mutations in gyrA, gyrB, parC, and parE to those in the QRDR only.

        -c ACQUIRED, --acquired=ACQUIRED
                List of acquired genes of interest, used with -a.
                Type 'all' for including all reported genes.
                Can partially match gene names, f. ex. 'qnr' will match all qnr genes identified.
                Example: -c blaTEM,oqxAB,qnr

        -v VIR, --vir=VIR
                Directory of ARIBA virulence reports.

        -r VIRGENES, --virgenes=VIRGENES
                Virulence genes of interest, use with -v.
                Type 'all' for including all reported genes.

        -d DATABASE, --database=DATABASE
                Virulence database used: virfinder or vfdb

        -m MLST, --mlst=MLST
                Directory of ARIBA MLST reports.
        
        -p PLASMID, --plasmid=PLASMID
                Directory of ARIBA plasmid reports.

        -o OUTPUT, --output=OUTPUT
                Output directory.
                One folder for each analysis will be created
                at given location.
        
        -f FILEENDING, --fileending
                The suffix of the reports generated by ARIBA, f.ex. "_amr_report.tsv".

        --version
                Print version info.
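
The partial matching described for -i and -c can be pictured as a substring search over the reported gene names. The sketch below is an illustration only, not VAMPIR's own code: the gene list is made up, and it assumes a case-sensitive match on any of the comma-separated terms.

```shell
# Made-up gene names standing in for the genes found in a set of reports.
genes="gyrA gyrB parC parE marR acrB"

# Selection as it would be passed to -i.
selection="gyr,par"

# Turn the comma-separated list into an alternation pattern: gyr|par
pattern=$(printf '%s' "$selection" | tr ',' '|')

# Keep every gene whose name contains one of the terms, joined by spaces.
matched=$(printf '%s\n' $genes | grep -E "$pattern" | paste -sd ' ' -)

echo "$matched"
```

With this selection, marR and acrB are left out because neither name contains gyr or par.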

Tracks

  • Intrinsic AMR gene analysis (-u, genes: -i, filter: -q)

    • This track analyses reports from the MEGARes database and gives reports for the genes specified by the user in -i. With -q, the mutations reported in gyrA, gyrB, parC, and parE are filtered to only those inside the quinolone resistance-determining region (QRDR).
  • Acquired AMR gene analysis (-a, genes: -c)

    • This track analyses reports from the ResFinder database and gives reports for the genes specified by the user in -c.
  • Virulence gene analysis (-v, database: -d, genes: -r)

    • This track analyses virulence reports from ARIBA and gives a summary report and a detailed report on which virulence genes were found.
  • Multilocus sequence typing analysis (-m)

    • This track takes MLST summary reports from ARIBA and gives a summary report on sequence types and alleles, as well as a neighbor-joining tree based on allele distances.
  • Plasmid typing analysis (-p)

    • This track takes PlasmidFinder reports from ARIBA (-p) and gives summary reports on which plasmid types were identified.
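
For the MLST track, one common definition of allele distance is the number of loci at which two isolates carry different alleles. The sketch below computes that pairwise count from a small made-up allele table. The file layout, isolate names and allele numbers are assumptions, and VAMPIR's own distance calculation may differ.

```shell
# Made-up allele table: one row per isolate, one column per locus.
tmp=$(mktemp -d)
{
  printf 'isolate\tadk\tfumC\tgyrB\n'
  printf 'iso1\t10\t11\t4\n'
  printf 'iso2\t10\t11\t5\n'
  printf 'iso3\t6\t4\t12\n'
} > "$tmp/mlst.tsv"

# For every pair of isolates, count the loci with differing alleles.
dist=$(awk -F'\t' '
  NR > 1 {
    n++; id[n] = $1; nf = NF
    for (j = 2; j <= NF; j++) allele[n, j] = $j
  }
  END {
    for (i = 1; i <= n; i++)
      for (k = i + 1; k <= n; k++) {
        d = 0
        for (j = 2; j <= nf; j++)
          if (allele[i, j] != allele[k, j]) d++
        print id[i] "\t" id[k] "\t" d
      }
  }' "$tmp/mlst.tsv")

echo "$dist"
rm -r "$tmp"
```

In this toy table iso1 and iso2 differ at one locus, while iso3 differs from both at all three. Pairwise distances of this kind are what a neighbor-joining routine takes as input.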

Output files

  • *_report.txt: A tab-separated text file with one row per isolate and one column per gene, where 1 means the gene is present and 0 means it is absent.

  • *_flags.txt: A tab-separated text file containing the quality control values (flags) for each gene/mutation found in the respective report. The column "flag_result" indicates whether the respective gene/mutation passed quality control (1) or not (0). Note that all reported genes and mutations are listed here, regardless of whether they passed.

  • *_stats.txt: A tab-separated text file containing the summary statistics of the respective analysis type. It presents the percentage of isolates in which each gene is present, together with a 95% confidence interval.

  • intrinsic_mut_report.txt: A tab-separated text file listing the individual mutations found in the specified intrinsic genes.

  • mlst_tree.png: A figure presenting the phylogenetic tree based on allele distances.

  • mlst_tree.newick: A newick-format phylogenetic tree based on allele distances.
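
To make the relationship between *_report.txt and *_stats.txt concrete, the sketch below derives a percentage and a 95% interval per gene from a small made-up presence/absence table. The file layout and gene names are assumptions. The interval uses the normal approximation, clipped to the range 0 to 100; the method VAMPIR itself uses is not stated here and may differ.

```shell
# Made-up presence/absence table: one row per isolate, one column per gene.
tmp=$(mktemp -d)
{
  printf 'isolate\tblaTEM\tqnrS1\n'
  printf 'iso1\t1\t0\n'
  printf 'iso2\t1\t1\n'
  printf 'iso3\t0\t0\n'
  printf 'iso4\t1\t0\n'
} > "$tmp/report.tsv"

# Output columns: gene, percent present, lower bound, upper bound.
stats=$(awk -F'\t' '
  NR == 1 { nf = NF; for (j = 2; j <= NF; j++) gene[j] = $j; next }
  { n++; for (j = 2; j <= NF; j++) present[j] += $j }
  END {
    for (j = 2; j <= nf; j++) {
      p = present[j] / n
      half = 1.96 * sqrt(p * (1 - p) / n)   # normal approximation (assumed)
      lo = p - half; if (lo < 0) lo = 0
      hi = p + half; if (hi > 1) hi = 1
      printf "%s\t%.1f\t%.1f\t%.1f\n", gene[j], 100 * p, 100 * lo, 100 * hi
    }
  }' "$tmp/report.tsv")

echo "$stats"
rm -r "$tmp"
```

Here blaTEM is present in three of four isolates (75.0 %) and qnrS1 in one of four (25.0 %). With so few isolates the intervals are very wide, which is why the interval is worth reading alongside the percentage.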
