Authors: Shengchao Liu, Weitao Du, Yanjing Li, Zhuoxinran Li, Zhiling Zheng, Chenru Duan, Zhiming Ma, Omar Yaghi, Anima Anandkumar, Christian Borgs, Jennifer Chayes, Hongyu Guo, Jian Tang
This is Geom3D, a platform for geometric modeling on 3D structures.
Set up Anaconda:
wget https://repo.continuum.io/archive/Anaconda3-2019.10-Linux-x86_64.sh
bash Anaconda3-2019.10-Linux-x86_64.sh -b
export PATH="$HOME/anaconda3/bin:$PATH" # with -b, the installer defaults to $HOME/anaconda3
Start with some basic packages.
conda create -n Geom3D python=3.7
conda activate Geom3D
conda install -y -c rdkit rdkit
conda install -y numpy networkx scikit-learn
conda install -y -c conda-forge -c pytorch pytorch=1.9.1
conda install -y -c pyg -c conda-forge pyg=2.0.2
pip install ogb==1.2.1
pip install sympy
pip install ase
pip install lie_learn # for TFN and SE3-Trans
pip install packaging # for SEGNN
pip install e3nn # for SEGNN
pip install transformers # for smiles
pip install selfies # for selfies
pip install atom3d # for Atom3D
pip install cffi # for Atom3D
pip install biopython # for Atom3D
pip install cython # for pyximport
conda install -y -c conda-forge py-xgboost-cpu # for XGB
We cover three types of datasets:
- Small Molecules
  - QM9
  - MD17
  - rMD17
  - COLL
- Proteins
  - EC
  - FOLD
- Small Molecules and Proteins
  - LBA
  - LEP
- Materials
  - MatBench
  - QMOF
For dataset acquisition, please refer to the data folder.
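For scripting, the dataset grouping listed above can be written down as a small lookup table. This is only an illustrative sketch: `DATASETS` and `category_of` are hypothetical names, not part of the Geom3D API, and the entries simply mirror the list in this README.

```python
# Hypothetical lookup table mirroring the dataset list in this README;
# not part of the Geom3D API.
DATASETS = {
    "Small Molecules": ["QM9", "MD17", "rMD17", "COLL"],
    "Proteins": ["EC", "FOLD"],
    "Small Molecules and Proteins": ["LBA", "LEP"],
    "Materials": ["MatBench", "QMOF"],
}


def category_of(dataset):
    """Return the category that lists `dataset` (case-insensitive), or None."""
    wanted = dataset.lower()
    for category, names in DATASETS.items():
        if wanted in (name.lower() for name in names):
            return category
    return None


print(category_of("qm9"))  # Small Molecules
```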
Geom3D includes the following representation models:
- SchNet, NeurIPS'18
- TFN, NeurIPS'18 Workshop
- DimeNet, ICLR'20
- SE(3)-Trans, NeurIPS'20
- EGNN, ICML'21
- PaiNN, ICML'21
- GemNet, NeurIPS'21
- SphereNet, ICLR'22
- SEGNN, ICLR'22
- NequIP, Nature Communications'22
- Allegro, Nature Communications'23
- Equiformer, ICLR'23
- GVP-GNN, ICLR'21
- IEConv, ICLR'21
- GearNet, ICLR'23
- ProNet, ICLR'23
- CDConv, ICLR'23
We also include the following 7 1D models and 11 2D models (specifically for small molecules):
- 1D Fingerprints: MLP, RF, XGB
- 1D SMILES: CNN, BERT
- 1D Selfies: CNN, BERT
- 2D topology:
Note that no pretraining is considered at this stage. For geometric pretraining models, see the following section.
We include the following 14 geometric pretraining methods:
- Pure 3D:
  - Supervised
  - Atom Type Prediction
  - Distance Prediction
  - Angle Prediction
  - 3D InfoGraph, from GeoSSL, ICLR'23
  - GeoSSL-RR, from GeoSSL, ICLR'23
  - GeoSSL-InfoNCE, from GeoSSL, ICLR'23
  - GeoSSL-EBM-NCE, from GeoSSL, ICLR'23
  - GeoSSL-DDM, ICLR'23
  - GeoSSL-DDM-1L, ICLR'23
  - 3D-EMGP, AAAI'23
- Joint 2D-3D: