JanZrimec/ExpressionGAN

ExpressionGAN

Controlling gene expression with deep generative design of regulatory DNA

Link to paper: 10.1038/s41467-022-32818-8

Figure. Predictor-guided generator optimization enables gene-specific navigation of the regulatory sequence-expression landscape. T-distributed stochastic neighbor embedding (t-SNE) mapping of the input latent subspaces that produce novel sequence variants spanning ~6 orders of magnitude of gene expression (colored and black dots), uncovered using predictor-guided generator optimization. Black dots represent selections of 10 sequence variants for each of the 4 expression groups, covering a 4 order-of-magnitude range of predicted expression levels from TPM ~10 to ~10,000.


Note

Arrowheads were rendered incorrectly and are missing from schematic figures 1a, 1d, 3a and 6e. The correct panels are available in the 'docs' folder.

Figure 1a. Schematic depiction of sequence data and model training strategies.

Figure 1d. Overview of the generative adversarial network (GAN) approach.

Figure 3a. Schematic depiction of the procedure to optimize the generator.

Figure 6e. Schematic depiction of the mutagenesis strategy.


Scripts for training and optimizing ExpressionGAN, and for reproducing the analysis, are provided in the 'scripts' folder.

The data, including the generated sequence data, are available at DOI. Extract the archive to a folder named 'data'.

Software dependencies are specified in the environment files in the 'docs' folder, with env_training.yml used for GAN training and optimization and env_analysis.yml used for the data analysis.
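Before running any script, it can help to confirm that the folders and environment files named above are in place. The following is a minimal sketch, assuming it is run from the repository root and that the 'data' folder sits at that root. The helper `missing_paths` and the constant `EXPECTED_PATHS` are hypothetical names used for illustration and are not part of the repository.

```python
from pathlib import Path

# Paths named in this README, relative to the repository root.
# Placing 'data' at the root is an assumption.
EXPECTED_PATHS = (
    "scripts",
    "data",
    "docs/env_training.yml",
    "docs/env_analysis.yml",
)


def missing_paths(root="."):
    """Return the expected paths that do not exist under root (hypothetical helper)."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected folders and environment files are present.")
```

If 'data' is reported as missing, the archive has most likely not yet been extracted into a folder of that name.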
