Dataset

Processed data

Download the processed data geom_drug.tar.gz (~2GB) from here and extract it (tar -zxvf) in the data folder. The folder should then look like this:

data
├── geom_drug
│   ├── mol_summary.csv
│   ├── split_by_molid.pt
│   ├── processed.lmdb
│   └── processed_molid2idx.pt
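
The extraction and the layout above can be checked with a short script. This is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes the archive unpacks into a top-level geom_drug directory, as the tree suggests.

```python
import tarfile
from pathlib import Path

# File names taken from the directory tree above.
EXPECTED_FILES = (
    "mol_summary.csv",
    "split_by_molid.pt",
    "processed.lmdb",
    "processed_molid2idx.pt",
)


def extract_archive(archive: Path, data_dir: Path) -> None:
    """Extract a .tar.gz archive into data_dir (same effect as tar -zxvf)."""
    data_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(data_dir)


def missing_processed_files(data_dir: Path) -> list[str]:
    """Return the expected entries that are absent from data_dir/geom_drug."""
    geom_dir = data_dir / "geom_drug"
    # exists() is used because processed.lmdb may be a file or a directory.
    return [name for name in EXPECTED_FILES if not (geom_dir / name).exists()]
```

For example, after calling extract_archive(Path("data/geom_drug.tar.gz"), Path("data")), an empty list from missing_processed_files(Path("data")) means the layout matches the tree. The archive location used here is an assumption.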

From sdf files

To process the data from the sdf files yourself, additionally download sdf.tar.gz (~4GB) from here. Extract it (~28GB once extracted) into the data/geom_drug folder, then remove processed.lmdb and processed_molid2idx.pt so that the folder looks like this:

data
├── geom_drug
│   ├── mol_summary.csv
│   ├── split_by_molid.pt
│   └── sdf
│       ├── 0.sdf
│       ├── ...
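
The cleanup step can be sketched as follows. The helper name is hypothetical and not part of the repository. It only assumes that the sdf folder has already been extracted into data/geom_drug, and it leaves mol_summary.csv and split_by_molid.pt untouched.

```python
import shutil
from pathlib import Path

# The two processed outputs the README says to remove before reprocessing.
STALE_OUTPUTS = ("processed.lmdb", "processed_molid2idx.pt")


def prepare_for_reprocessing(geom_dir: Path) -> list[str]:
    """Remove stale processed outputs and return the names that were removed."""
    sdf_dir = geom_dir / "sdf"
    if not sdf_dir.is_dir():
        raise FileNotFoundError(f"expected extracted sdf folder at {sdf_dir}")
    removed = []
    for name in STALE_OUTPUTS:
        path = geom_dir / name
        if path.is_dir():
            shutil.rmtree(path)  # handles the case where the LMDB is a directory
        elif path.exists():
            path.unlink()
        else:
            continue  # nothing to remove for this name
        removed.append(name)
    return removed
```

Calling prepare_for_reprocessing(Path("data/geom_drug")) a second time returns an empty list, so the helper is safe to rerun.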

The data will then be processed automatically the first time you run any sampling, training, or evaluation script.