HiGNN is a well-designed hierarchical and interactive informative graph neural network framework for molecular property prediction, based on co-representation learning of molecular graphs and chemically synthesizable BRICS fragments. In addition, a plug-and-play feature-wise attention block is built into the HiGNN architecture to adaptively recalibrate atomic features after the message-passing phase. HiGNN has been accepted for publication in the Journal of Chemical Information and Modeling.

Fig. 1 The overview of HiGNN
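Conceptually, the feature-wise attention block acts as channel-wise gating over atom features: pool a per-molecule context vector, squeeze it through a small MLP, and use the resulting gate to rescale each atom's features. Below is a minimal PyTorch sketch of that idea, assuming a squeeze-and-excitation-style design; the class name FeatureAttention, the mean pooling, and the reduction ratio are illustrative choices, and the actual block used by HiGNN is defined in source/model.py.

```python
import torch
import torch.nn as nn
from torch_scatter import scatter_mean


class FeatureAttention(nn.Module):
    """Illustrative squeeze-and-excitation-style recalibration of atom features.

    A sketch of the idea only; see source/model.py for HiGNN's actual block.
    """

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor, batch: torch.Tensor) -> torch.Tensor:
        # x: [num_atoms, channels] atom features after message passing
        # batch: [num_atoms] molecule index of each atom
        context = scatter_mean(x, batch, dim=0)  # [num_molecules, channels]
        gate = self.mlp(context)[batch]          # broadcast gates back to atoms
        return x * gate                          # feature-wise recalibration
```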
This project is developed using Python 3.7.10 and mainly requires the following libraries:

```
rdkit==2021.03.1
scikit_learn==1.1.1
torch==1.7.1+cu101
torch_geometric==1.7.1
torch_scatter==2.0.7
```

To install the requirements:

```bash
pip install -r requirements.txt
```
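After installation, a quick sanity check (a hypothetical snippet, not part of the repository) can confirm the pinned libraries are importable and that PyTorch sees the GPU:

```python
# Print the versions of the main dependencies and check CUDA visibility.
import rdkit
import sklearn
import torch
import torch_geometric

print("rdkit:", rdkit.__version__)
print("scikit-learn:", sklearn.__version__)
print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("torch_geometric:", torch_geometric.__version__)
```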
- source: the source code for HiGNN.
  - config.py
  - datasets.py
  - utils.py
  - model.py
  - loss.py
  - train.py
  - cross_validate.py
- configs: HiGNN uses yacs for experimental configuration; you can customize the relevant hyperparameters for each experiment with a YAML file.
- data: the datasets for training.
  - raw: where the original CSV datasets are stored.
  - processed: the dataset objects generated by PyG.
- test: where the training logs, checkpoints, and TensorBoard files are saved.
- example: Jupyter notebooks for fragment counting, t-SNE visualization, interpretation, and so on (see the BRICS sketch after this list).
- dataset: the 11 real-world drug-discovery-related datasets used in this study.
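Because HiGNN's hierarchy is built from BRICS fragments, it can help to see what a BRICS decomposition looks like. The sketch below calls RDKit's BRICS.BRICSDecompose directly; caffeine is an arbitrary example input, and this is not the repository's fragment-counting code (see the example folder for that).

```python
from rdkit import Chem
from rdkit.Chem import BRICS

# Decompose a molecule into chemically synthesizable BRICS fragments.
# Caffeine is used here purely as a toy input.
mol = Chem.MolFromSmiles("CN1C=NC2=C1C(=O)N(C)C(=O)N2C")
fragments = sorted(BRICS.BRICSDecompose(mol))

print(len(fragments), "BRICS fragments:")
for frag in fragments:
    print(" ", frag)
```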
Taking the BBBP dataset as an example, an experiment can be run via:
```bash
git clone https://github.com/idruglab/hignn
cd ./hignn

# For one random seed
python ./source/train.py --cfg ./configs/bbbp/bbbp.yaml --opts 'SEED' 2022 'MODEL.BRICS' True 'MODEL.F_ATT' True --tag seed_2022

# For 10 different random seeds (2021~2030)
python ./source/cross_validate.py --cfg ./configs/bbbp/bbbp.yaml --opts 'MODEL.BRICS' True 'MODEL.F_ATT' True 'HYPER' False --tag 10_seeds

# For hyperparameter optimization
python ./source/cross_validate.py --cfg ./configs/bbbp/bbbp.yaml --opts 'MODEL.BRICS' True 'MODEL.F_ATT' True --tag hignn  # HiGNN
python ./source/cross_validate.py --cfg ./configs/bbbp/bbbp.yaml --opts 'MODEL.F_ATT' True --tag w/o_hi  # the variant (w/o HI)
python ./source/cross_validate.py --cfg ./configs/bbbp/bbbp.yaml --opts 'MODEL.BRICS' True --tag w/o_fa  # the variant (w/o FA)
python ./source/cross_validate.py --cfg ./configs/bbbp/bbbp.yaml --tag vanilla  # the variant (w/o All)
```
More hyperparameter details can be found in config.py.
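As a rough illustration of how the --cfg file and the --opts pairs above are combined, the sketch below uses the standard yacs API; the default keys shown (SEED, MODEL.BRICS, MODEL.F_ATT) merely mirror the options used in the commands, while the authoritative definitions live in config.py.

```python
from yacs.config import CfgNode as CN

# Assumed defaults mirroring the option names used in the commands above;
# config.py is the authoritative source of the real defaults.
_C = CN()
_C.SEED = 2021
_C.MODEL = CN()
_C.MODEL.BRICS = True
_C.MODEL.F_ATT = True

cfg = _C.clone()
cfg.merge_from_file("./configs/bbbp/bbbp.yaml")           # the --cfg YAML file
cfg.merge_from_list(["SEED", 2022, "MODEL.F_ATT", True])  # the --opts overrides
cfg.freeze()
print(cfg)
```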
For the interpretability of HiGNN, refer to interpretation_bace and interpretation_bbbp.
- The datasets used in this study are available in dataset or MoleculeNet.
- The training logs, checkpoints, and TensorBoard files for each dataset can be found on BaiduNetdisk.
In the present study, we evaluated the proposed HiGNN model on 11 commonly used, publicly available drug-discovery-related datasets from Wu et al., covering both classification and regression tasks. Following previous studies, 14 learning tasks were designed on these 11 benchmark datasets: 11 classification tasks based on random- and scaffold-splitting methods, and three regression tasks based on the random-splitting method.
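Scaffold splitting groups molecules by their Bemis-Murcko scaffold so that structurally similar compounds end up in the same fold, giving a harder generalization test than random splitting. A small illustrative RDKit sketch of the underlying grouping (not the exact split routine used in this study):

```python
from collections import defaultdict

from rdkit.Chem.Scaffolds import MurckoScaffold

# Toy SMILES; real splits operate on the full dataset CSVs.
smiles_list = ["c1ccccc1O", "c1ccccc1N", "C1CCCCC1", "CCO"]

groups = defaultdict(list)
for smi in smiles_list:
    # Bemis-Murcko scaffold ('' for acyclic molecules such as CCO).
    scaffold = MurckoScaffold.MurckoScaffoldSmiles(smiles=smi)
    groups[scaffold].append(smi)

for scaffold, members in groups.items():
    print(repr(scaffold), "->", members)
```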
Table 1 Predictive performance results of HiGNN on the drug discovery-related benchmark datasets.
Dataset | Split Type | Metric | Chemprop | GCN | GAT | Attentive FP | HRGCN+ | XGBoost | HiGNN |
---|---|---|---|---|---|---|---|---|---|
BACE | random | ROC-AUC | 0.898 | 0.898 | 0.886 | 0.876 | 0.891 | 0.889 | 0.890 |
BACE | scaffold | ROC-AUC | 0.857 | – | – | – | – | – | 0.882 |
HIV | random | ROC-AUC | 0.827 | 0.834 | 0.826 | 0.822 | 0.824 | 0.816 | 0.816 |
HIV | scaffold | ROC-AUC | 0.794 | – | – | – | – | – | 0.802 |
MUV | random | PRC-AUC | 0.053 | 0.061 | 0.057 | 0.038 | 0.082 | 0.068 | 0.186 |
Tox21 | random | ROC-AUC | 0.854 | 0.836 | 0.835 | 0.852 | 0.848 | 0.836 | 0.856 |
ToxCast | random | ROC-AUC | 0.764 | 0.770 | 0.768 | 0.794 | 0.793 | 0.774 | 0.781 |
BBBP | random | ROC-AUC | 0.917 | 0.903 | 0.898 | 0.887 | 0.926 | 0.926 | 0.932 |
BBBP | scaffold | ROC-AUC | 0.886 | – | – | – | – | – | 0.927 |
ClinTox | random | ROC-AUC | 0.897 | 0.895 | 0.888 | 0.904 | 0.899 | 0.911 | 0.930 |
SIDER | random | ROC-AUC | 0.658 | 0.634 | 0.627 | 0.623 | 0.641 | 0.642 | 0.651 |
FreeSolv | random | RMSE | 1.009 | 1.149 | 1.304 | 1.091 | 0.926 | 1.025 | 0.915 |
ESOL | random | RMSE | 0.587 | 0.708 | 0.658 | 0.587 | 0.563 | 0.582 | 0.532 |
Lipo | random | RMSE | 0.563 | 0.664 | 0.683 | 0.553 | 0.603 | 0.574 | 0.549 |

– : not reported.
The code was built partly on chemprop, TrimNet, and Swin Transformer. Many thanks for their open-source code!