AdsMT: Multimodal Transformer for Predicting Global Minimum Adsorption Energy

AdsMT is a novel multimodal transformer that rapidly predicts the global minimum adsorption energy (GMAE) of diverse catalyst/adsorbate combinations from surface graphs and adsorbate feature vectors, without requiring any binding information.

🚀 Environment Setup

  • System requirements: This package requires a standard Linux computer with a GPU (CUDA >= 11) and sufficient RAM (> 2 GB). The code has been tested on NVIDIA RTX 3090, A6000, and A100 GPUs. To run on a GPU that does not support CUDA >= 11, modify the PyTorch and CUDA versions in the env.yml file.
  • We'll use conda to install dependencies and set up the environment on an NVIDIA GPU machine. We recommend the Miniconda installer.
  • After installing conda, install mamba into the base environment. mamba is a faster, drop-in replacement for conda:
    conda install mamba -n base -c conda-forge
  • Then create a conda environment and install the dependencies:
    mamba env create -f env.yml
    Activate the environment with conda activate adsmt. Configuring the environment takes about 10 minutes.
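  • Optional sanity check: confirm that PyTorch can see your GPU. This assumes the environment installed a CUDA-enabled PyTorch build:
    conda activate adsmt
    python -c "import torch; print(torch.__version__, torch.cuda.is_available())"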

📌 Datasets

Dataset links: Zenodo and Figshare

We built three GMAE benchmark datasets, named OCD-GMAE, Alloy-GMAE, and FG-GMAE, from the OC20-Dense, Catalysis Hub, and FG-dataset datasets through strict data cleaning; each data point represents a unique combination of catalyst surface and adsorbate.

| Dataset    | Combination Num. | Surface Num. | Adsorbate Num. | Range of GMAE (eV) |
| ---------- | ---------------- | ------------ | -------------- | ------------------ |
| Alloy-GMAE | 11,260           | 1,916 (37)   | 12 (5)         | -4.3 ~ 9.1         |
| FG-GMAE    | 3,308            | 14 (14)      | 202 (5)        | -4.0 ~ 0.8         |
| OCD-GMAE   | 973              | 967 (54)     | 74 (4)         | -8.0 ~ 6.4         |

Note: The values in parentheses are the numbers of element types.

Run scripts/download_datasets.sh to download all datasets:

bash scripts/download_datasets.sh

🔥 Model Training

1. Training from scratch

To train an AdsMT model with a chosen graph encoder on a dataset, use scripts/train.sh with the following command:

bash scripts/train.sh [DATASET] [GRAPH_ENCODER]

This code repo includes 7 graph encoders: SchNet (schnet), CGCNN (cgcnn), DimeNet++ (dpp), GemNet-OC (gemnet-oc), TorchMD-NET (et), eSCN (escn), and AdsGT (adsgt, this work). The log file with the experiment results is written to exp_results/[DATASET]/[GRAPH_ENCODER].log. One training task takes 3-24 hours, depending on the dataset and graph encoder.
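
For example, to train AdsMT with the AdsGT encoder on the Alloy-GMAE dataset (the lowercase dataset key alloy is an assumption; use the names scripts/train.sh expects, e.g. the directory names under configs/):

# "alloy" is a hypothetical dataset key; check configs/ for the actual names
bash scripts/train.sh alloy adsgt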

2. Pretraining on the OC20-LMAE dataset

We provide scripts for model pretraining on the OC20-LMAE dataset. For example, an AdsMT model with a chosen graph encoder can be pretrained by running:

bash scripts/pretrain_base.sh [GRAPH_ENCODER]

The checkpoint file of the pretrained model can be found at the checkpoint_dir given in the log file.
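
For instance, to pretrain the AdsGT encoder variant:

bash scripts/pretrain_base.sh adsgt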

3. Finetuning on the GMAE datasets

To finetune an AdsMT model on a GMAE dataset, change the ckpt_path parameter in the model's configuration file (configs/[DATASET]/finetune/[GRAPH_ENCODER].yml) to the path of your pretrained checkpoint, then run the following command:

bash scripts/finetune.sh [DATASET] [GRAPH_ENCODER]
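
As a minimal sketch of the config edit, assuming the Alloy-GMAE/AdsGT config path and a placeholder checkpoint location (both hypothetical; you can equally edit the YAML by hand):

# hypothetical paths; substitute your config file and pretrained checkpoint
sed -i 's|ckpt_path:.*|ckpt_path: /path/to/pretrained/checkpoint.pt|' configs/alloy/finetune/adsgt.yml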

4. Cross-attention scores for adsorption site identification

The script scripts/attn4sites.sh computes the cross-attention scores of a trained AdsMT model on a GMAE dataset:

bash scripts/attn4sites.sh [CONFIG_PATH] [CHECKPOINT_PATH]

The output file is stored at the results_dir given in the log file.
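
For example, with placeholder arguments (point them at your own config and trained checkpoint):

# both paths are placeholders
bash scripts/attn4sites.sh configs/alloy/finetune/adsgt.yml /path/to/checkpoint.pt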

We provide the notebook visualize/vis_3D.ipynb to visualize cross-attention-score-colored surfaces and compare them with the DFT-optimized adsorption configurations under GMAE.
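
To open the notebook (assuming Jupyter is available in the adsmt environment; if not, run mamba install jupyter in that environment first):

jupyter notebook visualize/vis_3D.ipynb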

🌈 Acknowledgements

This work was supported as part of NCCR Catalysis (grant number 180544), a National Centre of Competence in Research funded by the Swiss National Science Foundation.

This code repo is based on several existing repositories:

📝 Citation

If you find our work useful, please consider citing it:

📫 Contact

If you have any questions, feel free to contact me at:

Junwu Chen: [email protected]
