chao1224/Geom3D

Symmetry-Informed Geometric Representation for Molecules, Proteins, and Crystalline Materials

Authors: Shengchao Liu, Weitao Du, Yanjing Li, Zhuoxinran Li, Zhiling Zheng, Chenru Duan, Zhiming Ma, Omar Yaghi, Anima Anandkumar, Christian Borgs, Jennifer Chayes, Hongyu Guo, Jian Tang

This is Geom3D, a platform for geometric modeling on 3D structures.

Environment

Conda

Set up Anaconda:

wget https://repo.continuum.io/archive/Anaconda3-2019.10-Linux-x86_64.sh
bash Anaconda3-2019.10-Linux-x86_64.sh -b
export PATH=$HOME/anaconda3/bin:$PATH

Packages

Start with some basic packages.

conda create -n Geom3D python=3.7
conda activate Geom3D
conda install -y -c rdkit rdkit
conda install -y numpy networkx scikit-learn
conda install -y -c conda-forge -c pytorch pytorch=1.9.1
conda install -y -c pyg -c conda-forge pyg=2.0.2
pip install ogb==1.2.1

pip install sympy

pip install ase

pip install lie_learn # for TFN and SE3-Trans

pip install packaging # for SEGNN
pip3 install e3nn # for SEGNN

pip install transformers # for smiles
pip install selfies # for selfies

pip install atom3d # for Atom3D
pip install cffi # for Atom3D
pip install biopython # for Atom3D

pip install cython # for pyximport 

conda install -y -c conda-forge py-xgboost-cpu # for XGB
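
After installation, it can help to confirm that every package is visible from the Geom3D environment. The following is a minimal sketch, not part of the Geom3D code base: the `IMPORT_NAMES` table and the `missing_packages` helper are hypothetical names introduced here, and the import module names are assumptions based on how these packages are commonly imported.

```python
import importlib.util
from typing import Dict, List

# Maps each package installed above to the module name it is imported under.
# These import names are assumptions, not taken from the Geom3D repository.
IMPORT_NAMES: Dict[str, str] = {
    "rdkit": "rdkit",
    "numpy": "numpy",
    "networkx": "networkx",
    "scikit-learn": "sklearn",
    "pytorch": "torch",
    "pyg": "torch_geometric",
    "ogb": "ogb",
    "sympy": "sympy",
    "ase": "ase",
    "lie_learn": "lie_learn",
    "packaging": "packaging",
    "e3nn": "e3nn",
    "transformers": "transformers",
    "selfies": "selfies",
    "atom3d": "atom3d",
    "cffi": "cffi",
    "biopython": "Bio",
    "cython": "Cython",
    "py-xgboost-cpu": "xgboost",
}


def missing_packages(import_names: Dict[str, str]) -> List[str]:
    """Return the package names whose import module cannot be located."""
    missing = []
    for package, module in import_names.items():
        try:
            # find_spec locates a top-level module without importing it.
            found = importlib.util.find_spec(module) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = missing_packages(IMPORT_NAMES)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All packages found.")
```

Run the script with the Geom3D environment activated. It only checks that each module can be located; it does not verify the pinned versions.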

Datasets

We cover four types of datasets:

  • Small Molecules
    • QM9
    • MD17
    • rMD17
    • COLL
  • Proteins
    • EC
    • FOLD
  • Small Molecules and Proteins
    • LBA
    • LEP
  • Materials
    • MatBench
    • QMOF

For dataset acquisition, please refer to the data folder.
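
For scripting over the datasets listed above, the grouping can be written down as a small lookup table. This is an illustrative sketch only: `DATASETS` and `category_of` are hypothetical names and are not part of the Geom3D API.

```python
from typing import Dict, List

# Dataset grouping copied from the list above.
# The variable and function names are hypothetical, for illustration only.
DATASETS: Dict[str, List[str]] = {
    "Small Molecules": ["QM9", "MD17", "rMD17", "COLL"],
    "Proteins": ["EC", "FOLD"],
    "Small Molecules and Proteins": ["LBA", "LEP"],
    "Materials": ["MatBench", "QMOF"],
}


def category_of(dataset: str) -> str:
    """Return the category a dataset belongs to, or raise KeyError."""
    for category, names in DATASETS.items():
        if dataset in names:
            return category
    raise KeyError(f"Unknown dataset: {dataset}")
```

For example, `category_of("LBA")` looks up the category for the LBA dataset, and an unrecognized name raises `KeyError`.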

Representation Models

Geom3D includes the following representation models:

We also include the following 7 1D models and 11 2D models (specifically for small molecules):

Note that no pretraining is considered at this stage. For geometric pretraining models, see the following section.

Geometric Pretraining

We include the following 14 geometric pretraining methods: