sigDriver is a driver discovery tool base on mutational signature exposures. Currently, single base substitution signatures are supported.
sigDriver
works in R (>= 3.4.0) and depends on the following packages: data.table, optparse, dplyr, GenomicRanges, reshape2, SKAT, matrixStats, rtracklayer, trackViewer, qqman, VariantAnnotation, DDoutlier, R.utils
sigDriver
package can run on a standard computer. The memory requirment is dependent on the number of the input somatic variants and number of requested threads.
The runtimes below are generated using a computer with 16 GB RAM, 4 [email protected] GHz.
This package is supported for macOS and Linux. The package has been tested on the following systems:
- Linux: openSUSE 42.3
Installation in R using devtools:
- This should take less than 1 minute if all prerequisites were meet
library(devtools)
install_github("wkljohn/sigDriver",subdir="R_library")
Installing helper Rscripts and the example data:
git clone https://github.com/wkljohn/sigDriver.git
Run sigDriver
- The approximate run time for this demo is 9 minutes
Rscript run_sigdriver.R -s SBS9 -t 4 -v ./example/example_SBS9.snv.simple.gz -e ./example/example_signatures.tsv -m ./example/example_SBS9_metadata.tsv -o ./output/
Run results annotation
- The approximate run time for this demo is 1 hour 43 minutes
Rscript run_sigdriver_annotate.R -g gencode.v19.gtf -t 4 -l ./output/SBS9_results.tsv -d ./output/SBS9_var_meta.rds -s SBS9 -v ./example/example_SBS9.snv.simple.gz -e ./example/example_signatures.tsv -m ./example/example_SBS9_metadata.tsv -o ./output/
- Running
run_sigdriver.R
require the following inputs:
Parameter | Description | Note |
---|---|---|
-v | Input variants in simple format | Tab delimited without column names {Entity,ID,Cohort,Genome_Build,Variant_type(SNV/INDEL),Chr,Start,End,Ref,Alt} |
-m | Sample metadata | Tab delimited with column names {ID,entity,gender} |
-e | Signature exposures | 'signatures' x 'ID' matrix column names=ID, row names=Signature |
-o | Output folder | |
-s | Signature to test | signature(s) listed in -e |
-t | Threads |
- Running
run_sigdriver_annotate.R
require inputs in-addition to runningrun_sigdriver.R
:
Parameter | Description | Note |
---|---|---|
-g | Reference GTF | |
-l | Results from sigdriver | {Signature}_results.tsv in output folder |
-d | RDS object from sigdriver | {Signature}_var_meta.rds in output folder |
- Running
run_sigdriver.R
produce the following outputs:
File name | Description |
---|---|
{signature}_results.tsv |
the final output of the script on associated hotspots and the details |
{signature}_results_FULL.tsv |
the hotspots output before filtering and p-values correction |
{signature}_qq.png |
the quantile-quantile plot for the distribution of p-values |
{signature}_var_meta.rds |
The corrected variants table for downstream analysis(note: this R object is incompatible across different versions of GenomicRanges) |
- Running
run_sigdriver_annotate.R
produce the following outputs:
File name | Description |
---|---|
{signature}_perturb_sites_annotated_results.tsv |
the perturbed p-values and gene annotations of each hotspot point mutation |
./plots/{signature}/[hotspot].png |
the graphical presentation of the significant hotspot |
./plots/{signature}/[hotspot].png |
the graphical presentation of the significant hotspot |
For questions and suggestions, please contact:
John Wong: [email protected]