Machine learning on transposable elements in the human genome

DanielY1783/te_ml

Machine Learning Analysis on Human Genome Transposable Elements

Overview

This project analyzes the overlap between transposable elements and enhancers in the human genome using machine learning algorithms such as random forests and SVMs, together with dimensionality reduction and visualization methods such as PCA and t-SNE.

Documentation Details

Many README files refer to file paths on the Vanderbilt ACCRE cluster, which start with /dors/capra_lab/users/yand1/te_ml/. When looking for files in this GitHub repository, treat /dors/capra_lab/users/yand1/te_ml/ as equivalent to the repository's root directory.
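The path convention above can be sketched in Python. This is a minimal illustration, not part of the project: the helper name to_repo_path is hypothetical, and the only value taken from this README is the cluster prefix.

```python
from pathlib import PurePosixPath

# Cluster prefix quoted in this README; it stands in for the repository root.
ACCRE_ROOT = PurePosixPath("/dors/capra_lab/users/yand1/te_ml")


def to_repo_path(cluster_path: str) -> PurePosixPath:
    """Map an ACCRE path to a path relative to the repository root.

    Hypothetical helper for illustration only. Paths that do not start
    with the cluster prefix are returned unchanged.
    """
    path = PurePosixPath(cluster_path)
    try:
        return path.relative_to(ACCRE_ROOT)
    except ValueError:
        # The path is not under the cluster prefix, so leave it as-is.
        return path


print(to_repo_path("/dors/capra_lab/users/yand1/te_ml/results"))  # results
```

The cluster prefix itself maps to ".", which is the repository root.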

Source Files

Source files are in the bin folder, which is organized into directories by the creation date of the files they contain. Detailed documentation is inside the bin folder.

Data

The data files are too large to store on GitHub and are kept on the Vanderbilt ACCRE cluster at

/dors/capra_lab/users/yand1/te_ml/data

Note that the data files are gzip-compressed and must be decompressed with gunzip before use.
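If gunzip is not convenient, the same decompression can be done from Python's standard library. The sketch below assumes a file ending in .gz. The helper name gunzip_file and the file name in the usage note are hypothetical and are not taken from the project. Unlike gunzip, this helper keeps the compressed file unless told otherwise.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(gz_path, keep_original=True):
    """Decompress a .gz file next to the original and return the new path.

    Hypothetical helper for illustration only. With keep_original=False it
    mimics gunzip's default of deleting the compressed file afterwards.
    """
    gz_path = Path(gz_path)
    if gz_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {gz_path.name}")

    # Drop only the trailing ".gz", e.g. "example.bed.gz" -> "example.bed".
    out_path = gz_path.with_suffix("")

    # Stream the data in chunks so large files are never held in memory.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    if not keep_original:
        gz_path.unlink()
    return out_path
```

For example, gunzip_file("example.bed.gz") would write example.bed in the same directory, where example.bed.gz is a placeholder name.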

Results

The results folder was added on 2018-07-13 to make synchronizing between a local machine and the Vanderbilt ACCRE cluster easier. More detailed documentation is inside the results folder.

Note

The results/2018_06_21_chromehmm_te folder is ignored because of the large size of a file inside it, but it is available on the Vanderbilt ACCRE cluster at /dors/capra_lab/users/yand1/te_ml/results/2018_06_21_chromehmm_te
