GTE: A Graph Learning Framework for Prediction of T-cell Receptors and Epitopes Binding Specificity

Welcome to GTE, a graph learning framework for predicting the binding specificity between T-cell receptors (TCRs) and epitopes.

Folder Structure

The project's folder structure is as follows:

models folder:

The 'models' folder contains the saved models generated by GTE. It covers four datasets, each split with both the RandomTCR and StrictTCR partitions, with one model per fold of the 5-fold split. That gives 4 datasets x 2 partitions x 5 folds = 40 models in total.

The naming convention for model files is XXXXX_0123_4, where XXXXX is the dataset name, 0123 lists the folds used for training, and 4 is the fold held out for testing.
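
The naming convention can be decoded mechanically. The snippet below is a minimal sketch, assuming that every model name has exactly the form `<dataset>_<training folds>_<test fold>` with single-digit fold indices, and that the models for the other folds follow the same pattern (for example `_0134_2` when fold 2 is held out). `parse_model_name` and `expected_name` are hypothetical helpers written for illustration; they are not part of GTE.

```python
from typing import NamedTuple


class ModelName(NamedTuple):
    dataset: str
    train_folds: list[int]
    test_fold: int


def parse_model_name(name: str) -> ModelName:
    """Split a name such as 'pMTnet_0123_4' into its three parts."""
    # rsplit keeps any underscores inside the dataset name intact.
    dataset, train_part, test_part = name.rsplit("_", 2)
    return ModelName(dataset, [int(ch) for ch in train_part], int(test_part))


def expected_name(dataset: str, test_fold: int, n_folds: int = 5) -> str:
    """Build the assumed name of the model that holds out `test_fold`."""
    train_part = "".join(str(f) for f in range(n_folds) if f != test_fold)
    return f"{dataset}_{train_part}_{test_fold}"


parsed = parse_model_name("pMTnet_0123_4")
print(parsed.dataset, parsed.train_folds, parsed.test_fold)
print(expected_name("VDJdb", 2))
```

Under these assumptions, looping `expected_name` over the four datasets and five test folds gives 20 names per partition, which matches the 40 models across RandomTCR and StrictTCR.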

You can download our 40 models for inference here.

processed_data folder:

This folder contains the raw data for each dataset and the pre-processed 5-fold data. These data are used for training and testing the models.

results folder:

In this folder, we store the model's predictions on the datasets. These results can help us analyze model performance and generate further visualizations and reports.

Quick Start

Create a Conda Environment:

Start by creating a Conda environment with Python 3.11. If you haven't already installed Conda, you can get it from Anaconda.
```
conda create -n GTE python=3.11
```
Activate the environment:
```
conda activate GTE
```
Install Dependencies:

Use pip to install the required packages listed in the requirements.txt file.
```
pip install -r requirements.txt
```
How to Run:

To quickly run the program, use the following command:
```
python inference.py --split RandomTCR --dataset pMTnet
```
Available options:
- --split:
  - Default: "RandomTCR"
  - Choices: ["RandomTCR", "StrictTCR"]
- --dataset:
  - Default: "pMTnet"
  - Choices: ["McPAS", "pMTnet", "VDJdb", "TEINet"]
- --device:
  - Default: "cpu"
  - Choices: ["cpu", "gpu"]
- --gpu_id:
  - Default: 0
  - Description: When using a GPU, this specifies which GPU to use by its ID. The default is the first GPU (ID 0).
  - Example:
```
  python inference.py --split RandomTCR --dataset pMTnet --device gpu --gpu_id 0
```
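
The options listed above map naturally onto an `argparse` parser. The following is only a sketch of how such a parser could look, built from the defaults and choices documented here; the actual `arg_parser.py` in the repository may be organised differently, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical re-creation of the documented options; not copied
    # from the repository's arg_parser.py.
    parser = argparse.ArgumentParser(description="GTE inference options (sketch)")
    parser.add_argument("--split", default="RandomTCR",
                        choices=["RandomTCR", "StrictTCR"])
    parser.add_argument("--dataset", default="pMTnet",
                        choices=["McPAS", "pMTnet", "VDJdb", "TEINet"])
    parser.add_argument("--device", default="cpu", choices=["cpu", "gpu"])
    # Only meaningful when --device gpu is selected.
    parser.add_argument("--gpu_id", type=int, default=0)
    return parser


# Parse an explicit argument list instead of sys.argv so the example
# runs anywhere; unspecified options fall back to their defaults.
args = build_parser().parse_args(["--dataset", "VDJdb", "--device", "gpu"])
print(args.split, args.dataset, args.device, args.gpu_id)
```

With `choices` set, argparse rejects any value outside the documented lists and prints the allowed values, so a misspelled dataset name fails before any model is loaded.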

Example Output:

You chose the dataset: pMTnet
The split method is: RandomTCR
Fold: 0, AUC: 0.9113, AUPR: 0.6501
Fold: 1, AUC: 0.9098, AUPR: 0.6438
Fold: 2, AUC: 0.9079, AUPR: 0.6438
Fold: 3, AUC: 0.9077, AUPR: 0.6404
Fold: 4, AUC: 0.9111, AUPR: 0.6512
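
To summarise the per-fold lines into a single figure, the output can be parsed and averaged. This is a small sketch that assumes the output format is exactly the one shown above; the regular expression and variable names are illustrative and not part of GTE.

```python
import re
from statistics import mean

# The per-fold lines from the example output above.
output = """\
Fold: 0, AUC: 0.9113, AUPR: 0.6501
Fold: 1, AUC: 0.9098, AUPR: 0.6438
Fold: 2, AUC: 0.9079, AUPR: 0.6438
Fold: 3, AUC: 0.9077, AUPR: 0.6404
Fold: 4, AUC: 0.9111, AUPR: 0.6512
"""

pattern = re.compile(r"Fold: (\d+), AUC: ([\d.]+), AUPR: ([\d.]+)")
rows = [(int(fold), float(auc), float(aupr))
        for fold, auc, aupr in pattern.findall(output)]

mean_auc = mean(auc for _, auc, _ in rows)
mean_aupr = mean(aupr for _, _, aupr in rows)
print(f"folds={len(rows)} mean AUC={mean_auc:.4f} mean AUPR={mean_aupr:.4f}")
# prints: folds=5 mean AUC=0.9096 mean AUPR=0.6459
```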

Additional Information:

For more details and customization options, please refer to our paper. Have fun exploring the GTE framework!

How to Train

The downloaded test model contains embeddings generated by TCRpeg. If you need embeddings from ESM-2, please refer to ESM-2's GitHub.

Next, simply run the following command:

python train.py --gpu 0 --configs_path configs/pMTnet.yml --droup_out 0.1 --split StrictTCR

Please ensure that the paths in configs/XXXXX.yml are correct, including the paths for training and testing files, and the embeddings.
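
Checking the configured paths before starting a long training run can be automated. The sketch below walks a nested configuration dictionary and reports string values that look like paths but do not exist on disk. The key names in `example` are invented for illustration, since the real keys in `configs/pMTnet.yml` are not listed here, and `missing_paths` is a hypothetical helper rather than part of GTE.

```python
from pathlib import Path
from typing import Any, Iterator


def iter_string_values(node: Any, prefix: str = "") -> Iterator[tuple[str, str]]:
    """Yield (dotted_key, value) for every string leaf in a nested config."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_string_values(value, f"{prefix}{key}.")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_string_values(value, f"{prefix}{index}.")
    elif isinstance(node, str):
        yield prefix.rstrip("."), node


def missing_paths(config: dict) -> list[tuple[str, str]]:
    """Return string entries that look like paths but do not exist on disk."""
    return [
        (key, value)
        for key, value in iter_string_values(config)
        # Heuristic: treat only values with a path separator as paths.
        if ("/" in value or "\\" in value) and not Path(value).exists()
    ]


# Hypothetical config contents; the real keys in configs/pMTnet.yml may differ.
example = {
    "train_file": "processed_data/does_not_exist/train.csv",
    "model": {"name": "GTE"},
}
for key, value in missing_paths(example):
    print(f"{key}: {value} not found")
```

If PyYAML is installed, the dictionary can be obtained with `yaml.safe_load` on the opened config file and passed to `missing_paths` directly. The separator heuristic is deliberately loose, so a bare file name without a directory component would not be checked.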

