regsnp-intron

regsnp-intron predicts the disease-causing probability of intronic single nucleotide variants (iSNVs) based on both genomic and protein structural features.

Prerequisites

ANNOVAR (>= 2016Feb01): Follow the instructions at https://annovar.openbioinformatics.org/en/latest to install, and prepare Ensembl gene annotation.

tar -xf annovar.latest.tar.gz
cd annovar
perl annotate_variation.pl -downdb -buildver hg19 -webfrom annovar ensGene humandb/

BEDTools (>= 2.25.0): Follow the instructions at https://bedtools.readthedocs.io/en/latest to install, and make sure the programs are in your PATH.

Python (>= 2.7.11): Installing libraries such as Numpy and Scipy can be a little difficult for inexperienced users. We highly recommend installing Anaconda. Anaconda conveniently installs Python and other commonly used packages for scientific computing and data science. (Python 3 is not currently supported.)

The following Python libraries are also required. They will be automatically installed if you use pip (see Installation).

Numpy (>= 1.10.4),
Scipy (>= 0.17.0),
Pandas (>= 0.17.1),
Scikit-learn (>= 0.17),
pysam (>= 0.8.4),
bx-python (0.7.3),
pybedtools (>= 0.7.6)

Installation

The easiest way is to install with pip (recommended). This will also install all the required Python libraries:

pip install regsnp-intron

To manually install, you need to install all the required Python libraries first. Then, use the following command:

git clone https://github.com/linhai86/regsnp_intron.git
cd regsnp_intron
python setup.py install

Configuration

Download the database and annotation files for the human genome (hg19):

wget to_be_added
unzip db.zip

Modify the paths in the settings/settings.json file. Type regsnp_intron --help to find the location of the default settings.json file. You can also provide a customized settings.json file with -s (see Usage):

{
    "annovar_path": "/path/to/annovar",
    "annovar_db_path": "/path/to/annovar/humandb",
    "db_dir": "/path/to/db"
}
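As a convenience, a customized settings file can be generated and sanity-checked from Python before running the tool. This is a minimal sketch, not part of regsnp-intron: the helper names `write_settings` and `missing_paths` are hypothetical, and the only thing taken from this README is the three keys shown above.

```python
import json
import os


def write_settings(path, annovar_path, annovar_db_path, db_dir):
    """Write a settings file containing the three keys shown above."""
    settings = {
        "annovar_path": annovar_path,
        "annovar_db_path": annovar_db_path,
        "db_dir": db_dir,
    }
    with open(path, "w") as handle:
        json.dump(settings, handle, indent=4)
    return settings


def missing_paths(settings):
    """Return the keys whose configured path does not exist on disk."""
    return sorted(
        key for key, value in settings.items() if not os.path.exists(value)
    )
```

Calling `missing_paths(write_settings("my_settings.json", ...))` returns an empty list when every configured directory exists; the resulting file can then be passed with -s.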

Usage

usage: regsnp_intron [-h] [-s SFNAME] [-f] ifname out_dir

Given a list of intronic SNVs, predict the disease-causing probability based
on genomic and protein structural features.

positional arguments:
  ifname                input SNV file. Contains four columns: chrom, pos,
                        ref, alt.
  out_dir               directory contains output files

optional arguments:
  -h, --help            show this help message and exit
  -s SFNAME, --sfname SFNAME
                        JSON file containing settings. Default setting file
                        locate at: regsnp_intron/settings/settings.json
  -f, --force           overwrite existing directory
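
The usage text above can be scripted, for example when preparing the input from another pipeline. The sketch below is illustrative only: it assumes the four input columns are tab-separated with no header line, which this README does not state, and the helper names `write_snv_file`, `build_command`, and `run` are hypothetical. Only the flags -s and -f and the two positional arguments come from the usage text.

```python
import subprocess


def write_snv_file(path, snvs):
    """Write (chrom, pos, ref, alt) tuples, one line per SNV.

    Assumption: columns are tab-separated and there is no header line.
    """
    with open(path, "w") as handle:
        for chrom, pos, ref, alt in snvs:
            handle.write("{}\t{}\t{}\t{}\n".format(chrom, pos, ref, alt))


def build_command(ifname, out_dir, sfname=None, force=False):
    """Assemble the regsnp_intron argument list from the usage text."""
    cmd = ["regsnp_intron"]
    if sfname is not None:
        cmd += ["-s", sfname]  # customized settings.json
    if force:
        cmd.append("-f")  # overwrite an existing output directory
    cmd += [ifname, out_dir]
    return cmd


def run(ifname, out_dir, sfname=None, force=False):
    """Run regsnp_intron; raises CalledProcessError on a non-zero exit."""
    subprocess.check_call(build_command(ifname, out_dir, sfname, force))
```

Building the argument list separately from running it makes the command easy to inspect or log before anything is executed.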

Output

The following files will be generated under the output directory:

snp.prediction.txt: tab-delimited text file containing prediction results and all the features for iSNVs.
snp.features.txt: tab-delimited text file containing all the features for iSNVs (can be deleted).
tmp: temporary folder containing all the intermediate results (can be deleted).

snp.prediction.txt contains the following columns:

chrom: Chromosome.
pos: Position.
ref: Reference allele.
alt: Alternative allele.
disease: Categorical prediction.
prob: Disease-causing probability [0, 1]. Higher score indicates higher probability of being pathologic.
splicing_site: Indicates on/off splicing site. Splicing sites are defined as +7bp from donor site and -13bp from acceptor site.
features: The remaining columns contain all the genomic and protein structural features around each iSNV.
. 
.
.

Citation

Please cite:

To be added
