TCRmodel2 models TCR-pMHC complex structures, as well as unbound TCR structures, with high fidelity.
You can download and install TCRmodel2 locally, but we highly recommend using our web server to generate predictions. It offers a user-friendly interface and requires no local installation. The web server is available at:
https://tcrmodel.ibbr.umd.edu/
If you use our tool, please cite:
Yin R, Ribeiro-Filho HV, Lin V, Gowthaman R, Cheung M, Pierce BG. (2023) TCRmodel2: high-resolution modeling of T cell receptor recognition using deep learning. Nucleic Acids Res. In press. https://doi.org/10.1093/nar/gkad356
- Quick start
- Generate TCR-pMHC complex predictions
- Generate unbound TCR predictions
- Thanks
- References
- Copyright and license
The TCRmodel2 code is adapted from AlphaFold v.2.3.0.
NVIDIA CUDA driver >= 11.2
To install the dependencies, we recommend the following steps:
- Install the AlphaFold requirements in a conda environment. If you prefer to install AlphaFold without Docker, this resource is useful: https://github.com/kalininalab/alphafold_non_docker
- Install two additional packages, ANARCI and MDAnalysis, into the conda environment created in the previous step. Neither package is required for generating structural predictions. ANARCI is used to trim TCR sequences to their variable domains and to renumber PDB outputs. MDAnalysis is used for output renumbering and output alignment.
conda install -c bioconda anarci
conda config --add channels conda-forge
conda install mdanalysis
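To confirm that the two optional packages are visible from the active conda environment, a quick check along these lines may help. This is a minimal sketch, not part of TCRmodel2: the function name is hypothetical, and the import names `anarci` and `MDAnalysis` are assumptions about how the installed packages are exposed to Python.

```python
import importlib.util

# Assumed import names: ANARCI as "anarci", MDAnalysis as "MDAnalysis".
OPTIONAL_PACKAGES = {"ANARCI": "anarci", "MDAnalysis": "MDAnalysis"}


def check_optional_packages(packages=OPTIONAL_PACKAGES):
    """Return {display name: bool} telling whether each module can be found."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in packages.items()
    }


# Report the status of each optional package in the current environment.
for name, found in check_optional_packages().items():
    print(f"{name}: {'found' if found else 'missing'}")
```

A missing package here only affects trimming, renumbering, and alignment of outputs, since neither package is needed for the structural prediction itself.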
While most database files can be found in the data/databases/ folder, file size limits mean that you need to:
- unzip pdb sequence database file:
cd data/databases
tar -xvzf pdb_seqres.txt.tar.gz
- download the pdb_mmcif and params databases used by AlphaFold (around 120 GB in total after unzipping) to a database folder of your choice. The path to this folder is passed as the ori_db variable to the run_tcrmodel2.py and run_tcrmodel2_ub_tcr.py scripts. Please refer to the download instructions in download_pdb_mmcif.sh and download_alphafold_params.sh in the AlphaFold repository.
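Before starting a long run, it can be worth checking that the folder you intend to pass as ori_db has the expected layout. The sketch below assumes only what is stated above, namely that the folder contains subfolders named `pdb_mmcif` and `params`; it does not inspect their contents, and the helper name is hypothetical.

```python
from pathlib import Path

# Subfolder names taken from the description of ori_db above; the layout
# inside each subfolder is not checked.
REQUIRED_SUBDIRS = ("pdb_mmcif", "params")


def missing_database_dirs(ori_db, required=REQUIRED_SUBDIRS):
    """Return the required subfolders that are absent from ori_db."""
    root = Path(ori_db)
    return [name for name in required if not (root / name).is_dir()]
```

An empty list from `missing_database_dirs("/path/to/alphafold_database")` means both subfolders are present.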
Workflow for creating TCR-pMHC complex structure predictions:
- Receive TCR alpha, beta, peptide, MHC sequences
- Build pMHC template alignment file
- Generate MSA features using a reduced database for all chains, each considered separately
- Generate all other features by concatenating the peptide and MHC into one chain
- Predict structures
- Output 5 structures and a text file containing 1) templates used 2) prediction scores
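Since the workflow starts from raw sequences, a simple sanity check on the inputs can catch copy-paste errors before any features are computed. The following is an illustrative sketch only: the helper name is hypothetical, and restricting input to the 20 standard amino-acid letters is an assumption, since the set of characters accepted by the scripts is not stated here.

```python
# The 20 standard amino-acid one-letter codes (assumed input alphabet).
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def invalid_residues(sequence):
    """Return (1-based position, character) pairs for non-standard letters."""
    return [
        (position, char)
        for position, char in enumerate(sequence, start=1)
        if char.upper() not in STANDARD_AMINO_ACIDS
    ]
```

An empty result means every character in the sequence is a standard residue letter. Otherwise, the returned positions show where to look.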
To make a class I TCR-pMHC prediction:
python run_tcrmodel2.py \
--job_id=test_clsI_6kzw \
--output_dir=experiments/ \
--tcra_seq=AQEVTQIPAALSVPEGENLVLNCSFTDSAIYNLQWFRQDPGKGLTSLLLIQSSQREQTSGRLNASLDKSSGRSTLYIAASQPGDSATYLCAVTNQAGTALIFGKGTTLSVSS \
--tcrb_seq=NAGVTQTPKFQVLKTGQSMTLQCSQDMNHEYMSWYRQDPGMGLRLIHYSVGAGITDQGEVPNGYNVSRSTTEDFPLRLLSAAPSQTSVYFCASSYSIRGSRGEQFFGPGTRLTVL \
--pep_seq=RLPAKAPLL \
--mhca_seq=SHSLKYFHTSVSRPGRGEPRFISVGYVDDTQFVRFDNDAASPRMVPRAPWMEQEGSEYWDRETRSARDTAQIFRVNLRTLRGYYNQSEAGSHTLQWMHGCELGPDGRFLRGYEQFAYDGKDYLTLNEDLRSWTAVDTAAQISEQKSNDASEAEHQRAYLEDTCVEWLHKYLEKGKETLLH \
--ori_db=/path/to/alphafold_database #set it as the path to the folder containing pdb_mmcif and params
To make a class II TCR-pMHC prediction:
python run_tcrmodel2.py \
--job_id=test_clsII_7t2c \
--output_dir=experiments \
--tcra_seq=LAKTTQPISMDSYEGQEVNITCSHNNIATNDYITWYQQFPSQGPRFIIQGYKTKVTNEVASLFIPADRKSSTLSLPRVSLSDTAVYYCLVGDTGFQKLVFGTGTRLLVSP \
--tcrb_seq=GAVVSQHPSWVICKSGTSVKIECRSLDFQATTMFWYRQFPKQSLMLMATSNEGSKATYEQGVEKDKFLINHASLTLSTLTVTSAHPEDSSFYICSARDPGGGGSSYEQYFGPGTRLTVT \
--pep_seq=LAWEWWRTV \
--mhca_seq=IKADHVSTYAAFVQTHRPTGEFMFEFDEDEMFYVDLDKKETVWHLEEFGQAFSFEAQGGLANIAILNNNLNTLIQRSNHTQAT \
--mhcb_seq=PENYLFQGRQECYAFNGTQRFLERYIYNREEFARFDSDVGEFRAVTELGRPAAEYWNSQKDILEEKRAVPDRMCRHNYELGGPMTLQR \
--ori_db=/path/to/alphafold_database #set it as the path to the folder containing pdb_mmcif and params
run_tcrmodel2.py accepts additional flags that control other behaviors of the script. To see the full list of flags:
python run_tcrmodel2.py --help
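When running many complexes, it may be convenient to assemble the command from Python rather than by hand. The sketch below uses only the flags shown in the two examples above. The function name is hypothetical, and treating the presence of `mhcb_seq` as the switch between class I and class II simply mirrors those examples.

```python
import shlex


def build_tcrmodel2_command(job_id, output_dir, ori_db, tcra_seq, tcrb_seq,
                            pep_seq, mhca_seq, mhcb_seq=None,
                            script="run_tcrmodel2.py"):
    """Build the argument list for one TCR-pMHC prediction run."""
    cmd = [
        "python", script,
        f"--job_id={job_id}",
        f"--output_dir={output_dir}",
        f"--tcra_seq={tcra_seq}",
        f"--tcrb_seq={tcrb_seq}",
        f"--pep_seq={pep_seq}",
        f"--mhca_seq={mhca_seq}",
    ]
    # The class II example passes a second MHC chain; the class I example does not.
    if mhcb_seq:
        cmd.append(f"--mhcb_seq={mhcb_seq}")
    cmd.append(f"--ori_db={ori_db}")
    return cmd


def format_command(cmd):
    """Render the argument list as a single shell-quoted string for logging."""
    return shlex.join(cmd)
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting issues with long sequences.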
Workflow for creating unbound TCR structure predictions:
- Receive TCR alpha, beta sequences
- Generate MSA features using a reduced database and a modified TCR template search protocol
- Predict structures
- Output 5 structures and a text file containing 1) templates used 2) prediction scores
To make an unbound TCR prediction:
python run_tcrmodel2_ub_tcr.py \
--job_id=test_tcr_7t2b \
--output_dir=experiments \
--tcra_seq=SQQGEEDPQALSIQEGENATMNCSYKTSINNLQWYRQNSGRGLVHLILIRSNEREKHSGRLRVTLDTSKKSSSLLITASRAADTASYFCATDKKGGATNKLIFGTGTLLAVQP \
--tcrb_seq=NAGVTQTPKFRVLKTGQSMTLLCAQDMNHEYMYWYRQDPGMGLRLIHYSVGEGTTAKGEVPDGYNVSRLKKQNFLLGLESAAPSQTSVYFCASSQGGGEQYFGPGTRLTVT \
--ori_db=/path/to/alphafold_database #set it as the path to the folder containing pdb_mmcif and params
run_tcrmodel2_ub_tcr.py accepts additional flags that control other behaviors of the script. To see the full list of flags:
python run_tcrmodel2_ub_tcr.py --help
We would like to thank the AlphaFold, alphafold_finetune, and ColabFold teams for developing and distributing their code. The content of the alphafold/ folder is modified from alphafold/ in the AlphaFold repository. The featurization of custom templates is modified from predict_utils.py of alphafold_finetune. The chain break introduction and mock template feature steps are modified from batch.py of ColabFold.
Yin R, Ribeiro-Filho HV, Lin V, Gowthaman R, Cheung M, Pierce BG. (2023) TCRmodel2: high-resolution modeling of T cell receptor recognition using deep learning. Nucleic Acids Res. In press. https://doi.org/10.1093/nar/gkad356
Apache License 2.0