TMvis 🧬

Welcome to TMvis - a pipeline for transmembrane protein annotation and 3D visualization.

TMvis combines AlphaFold 2 [1] structures from the AlphaFold DB [2] with predicted transmembrane protein (TMP) annotations into interactive 3D visualizations of protein structures embedded in membranes. The TMPs are predicted by TMbed [3], a method based on the protein language model ProtT5 [4] that provides per-residue predictions of alpha-helical and beta-barrel transmembrane segments. The corresponding AlphaFold 2 structures are then annotated with the predicted TMbed topology in the 3D visualization. In addition, TMvis can add membrane embeddings predicted by ANVIL [5] or PPM3 [6].
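To make "per-residue predictions" concrete, here is a minimal sketch that collapses a per-residue label string into transmembrane segments. The label alphabet (H/h for alpha-helical segments, B/b for beta strands, anything else for non-membrane residues) and the function name tm_segments are assumptions for illustration; check the TMbed documentation for the output format you actually use.

```python
from itertools import groupby

# Assumed label alphabet (not defined in this README):
# H/h = alpha-helical TM residue, B/b = beta-strand TM residue,
# every other character = residue outside the membrane.
TM_CLASSES = {
    "H": "alpha-helix",
    "h": "alpha-helix",
    "B": "beta-strand",
    "b": "beta-strand",
}


def tm_segments(labels):
    """Return (start, end, kind) tuples for each run of TM labels.

    Positions are 1-based and inclusive, matching residue numbering.
    """
    segments = []
    position = 1
    for label, run in groupby(labels):
        length = len(list(run))
        if label in TM_CLASSES:
            segments.append((position, position + length - 1, TM_CLASSES[label]))
        position += length
    return segments


if __name__ == "__main__":
    # Hypothetical 20-residue prediction with two helical segments.
    print(tm_segments("iiiHHHHHoooohhhhhiii"))
```

Segments like these are what a viewer needs in order to color the transmembrane stretches of a structure differently from the rest of the chain.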

Quick access

As an example, we provide a subset of 496 predicted TMPs. TMbed predicted 4,967 TMPs for the human proteome (20,375 proteins, UniProt [7] version April 2022; excluding TITIN_HUMAN due to its length). From these, we extracted the AlphaFold 2 structures with an average per-residue confidence score (pLDDT) above 90, which led to this subset. Please download the set here.

Once you have the dataset ready, you can use the Jupyter notebook TMvis.ipynb in the TMvis folder to visualize every protein structure in the dataset in 3D, together with the predictions of ANVIL, PPM3, and TMbed. Additionally, you can visualize the per-residue confidence scores (pLDDT) of AlphaFold.
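As an illustration of how per-residue pLDDT scores can be turned into colors, the sketch below maps a score to a confidence band. The band limits (90, 70, 50) follow the commonly used AlphaFold confidence categories; the hex colors and the function name plddt_band are assumptions, and the notebook may use a different scheme.

```python
def plddt_band(score):
    """Map a pLDDT score (0-100) to a (band name, hex color) pair.

    Scores exactly on a limit fall into the lower band; this boundary
    handling is a choice made for this sketch.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"pLDDT must be between 0 and 100, got {score}")
    if score > 90:
        return "very high", "#0053D6"  # assumed color: dark blue
    if score > 70:
        return "confident", "#65CBF3"  # assumed color: light blue
    if score > 50:
        return "low", "#FFDB13"  # assumed color: yellow
    return "very low", "#FF7D45"  # assumed color: orange


if __name__ == "__main__":
    for value in (97.2, 81.0, 55.5, 23.4):
        print(value, plddt_band(value))
```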

Running TMvis 🧬

Step 0: Clone repository

git clone https://github.com/Rostlab/TMvis

Step 1: Install dependencies

Python and Conda

conda env create -n TMvis --file TMvis.yml    
conda activate TMvis

Nextflow
Docker
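Before moving on, it can help to confirm that the command-line tools listed above are reachable. The following is a small sketch that only checks whether each executable is on your PATH; the helper name missing_tools is hypothetical, and the check says nothing about installed versions.

```python
import shutil

# Executables named in the dependency list above.
REQUIRED_TOOLS = ("conda", "nextflow", "docker")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```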

Step 3: Generate dataset

See the data/ folder for details. After this step, make sure you have a folder data/current/ that contains one folder with your AlphaFold 2 structures and a text file with the TMbed predictions.
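The expected layout can be checked with a short script before you start preprocessing. This is a sketch under assumptions: the README does not name the structure folder or the prediction file, so the helper (check_dataset_layout, a hypothetical name) only verifies that data/current/ holds exactly one non-empty subfolder and at least one regular file.

```python
from pathlib import Path


def check_dataset_layout(root="data/current"):
    """Return a list of problems found in the dataset folder (empty if none)."""
    root = Path(root)
    if not root.is_dir():
        return [f"{root} does not exist or is not a folder"]

    entries = list(root.iterdir())
    folders = [entry for entry in entries if entry.is_dir()]
    files = [entry for entry in entries if entry.is_file()]

    problems = []
    if len(folders) != 1:
        problems.append(f"expected exactly one structure folder, found {len(folders)}")
    elif not any(folders[0].iterdir()):
        problems.append(f"structure folder {folders[0].name} is empty")
    if not files:
        problems.append("no prediction file found next to the structure folder")
    return problems


if __name__ == "__main__":
    for problem in check_dataset_layout():
        print("Problem:", problem)
```

Note that any regular file counts as the prediction file here, including hidden files, so an empty result means the layout is plausible rather than verified.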

Step 4: Preprocessing and subset extraction

Run python3 ./TMvis/main.py

main.py generates a results/db folder that contains the AlphaFold 2 structures of the alpha-helical and beta-barrel TMPs predicted by TMbed. Additionally, db/**/pLDDT90F1 is a subset of db with alpha and beta proteins selected by the following criteria:

At most 2,700 residues long (the maximum length covered by a single AlphaFold 2 PDB file)
Mean pLDDT per protein above 90 (highly accurate structures). You can change this threshold if needed.
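The two criteria above can be sketched in a few lines of Python. AlphaFold PDB files store the pLDDT in the B-factor column, so this example reads one value per residue from the CA atoms. It is an illustrative sketch, not the code in main.py; the function names and the choice to average over CA atoms are assumptions.

```python
def ca_plddt(pdb_text):
    """Return one pLDDT value per residue, read from the CA ATOM records.

    Uses the fixed PDB columns: atom name in columns 13-16 and
    B-factor (pLDDT in AlphaFold files) in columns 61-66.
    """
    scores = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def passes_filter(pdb_text, max_length=2700, min_mean_plddt=90.0):
    """Check the length limit and the mean pLDDT threshold for one structure."""
    scores = ca_plddt(pdb_text)
    if not scores or len(scores) > max_length:
        return False
    return sum(scores) / len(scores) > min_mean_plddt
```

To apply it to a file, read the text first, for example passes_filter(open(path).read()), where path points to one of your AlphaFold 2 PDB files. Passing a different min_mean_plddt corresponds to changing the threshold mentioned above.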

Step 5: Predict membrane with PPM3

Unpack the Docker container. See the docker/ folder for details.
Run nextflow run ./PPM3/run_PPM3.nf -c custom.config

Step 6 (optional): Predict membrane with ANVIL

Note: to run ANVIL, you need an access key.

Unpack the Docker container. See the docker/ folder for details on how to do that and where to get the access key.
Run nextflow run ./ANVIL/run_anvil.nf -c custom.config

Step 7: Visualize TMPs in 3D

Run jupyter notebook ./TMvis/TMvis.ipynb

References

[1] AlphaFold - Jumper, John, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn Tunyasuvunakool, et al. 2021. “Highly Accurate Protein Structure Prediction with AlphaFold.” Nature 596 (7873): 583–89.
[2] AlphaFold DB - Varadi, Mihaly, Stephen Anyango, Mandar Deshpande, Sreenath Nair, Cindy Natassia, Galabina Yordanova, David Yuan, et al. 2022. “AlphaFold Protein Structure Database: Massively Expanding the Structural Coverage of Protein-Sequence Space with High-Accuracy Models.” Nucleic Acids Research 50 (D1): D439–44.
[3] TMbed - Bernhofer, Michael, and Burkhard Rost. 2022. “TMbed – Transmembrane Proteins Predicted through Language Model Embeddings.” bioRxiv.
[4] ProtT5 - Elnaggar, A., et al. 2021. “ProtTrans: Towards Cracking the Language of Life's Code Through Self-Supervised Deep Learning and High Performance Computing.” IEEE Transactions on Pattern Analysis and Machine Intelligence. doi: 10.1109/TPAMI.2021.3095381.
[5] ANVIL - Postic, Guillaume, Yassine Ghouzam, Vincent Guiraud, and Jean-Christophe Gelly. 2016. “Membrane Positioning for High- and Low-Resolution Protein Structures through a Binary Classification Approach.” Protein Engineering, Design & Selection: PEDS 29 (3): 87–91.
[6] PPM3 - Lomize, Mikhail A., Irina D. Pogozheva, Hyeon Joo, Henry I. Mosberg, and Andrei L. Lomize. 2012. “OPM Database and PPM Web Server: Resources for Positioning of Proteins in Membranes.” Nucleic Acids Research 40 (Database issue): D370–76.
[7] UniProt - UniProt Consortium. 2021. “UniProt: The Universal Protein Knowledgebase in 2021.” Nucleic Acids Research 49 (D1): D480–D489.
