MSGO

This repository provides the data and methods for the paper:
Pseudodata-based molecular structure generator to reveal unknown chemicals (under review)

Authors: Nanyang Yu, Zheng Ma, Qi Shao, Laihui Li, Xuebing Wang, and Si Wei*

Setup

Environment

Python: 3.7
Torch: 1.7.1
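
As a quick sanity check before training or evaluation, the pinned versions above can be compared against the active interpreter. The following is a minimal sketch, not part of the repository: the helper names `version_matches` and `check_environment` are hypothetical, and it assumes that a local build tag such as `+cu110` on the Torch version is acceptable.

```python
import sys

# Versions listed in the Environment section of this README.
EXPECTED_PYTHON = (3, 7)
EXPECTED_TORCH = "1.7.1"


def version_matches(found, expected):
    """Compare version strings, ignoring a local build tag such as '+cu110'."""
    return str(found).split("+")[0] == expected


def check_environment():
    """Return a list of human-readable mismatches (empty if everything matches)."""
    problems = []
    if sys.version_info[:2] != EXPECTED_PYTHON:
        found = "{}.{}".format(*sys.version_info[:2])
        problems.append("Python {} found, 3.7 expected".format(found))
    try:
        import torch  # imported lazily so the check also works without torch
    except ImportError:
        problems.append("torch is not installed")
    else:
        if not version_matches(torch.__version__, EXPECTED_TORCH):
            problems.append(
                "torch {} found, {} expected".format(torch.__version__, EXPECTED_TORCH)
            )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

A mismatch does not necessarily mean the code will fail; it only flags that the environment differs from the one listed here.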

Data

For training, we use 30k+ pseudo SMILES-spectrum pairs generated by CFM-ID (you can download the raw SMILES list file here). For evaluation, we use 300+ real spectra to verify our method (download here). For evaluation on real samples, we use one LC–QTOF dataset of wastewater samples to verify our model (download here, code: gmas).

Model weights

We provide MSGO models (pfas, code: 0bfg; lipid, code: 37it) trained on pseudo SMILES-spectrum pairs with all of the methods described in the paper. You can also train your own model with other methods.

Training

You can replicate our experiment, including all the techniques:

python tools/train.py --id all_trick --user_precurso 1 --use_mask 1 --use_formual 1

More options are listed in opt.py.

Evaluation

Download the model weights into ckpts/pfas or ckpts/lipid, then run:

python tools/eval.py --log_path [ckpts/pfas or ckpts/lipid]
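
Evaluation depends on the weights being present in the expected directory, so it can help to verify that first. The sketch below is an illustration only: the helper `missing_checkpoints` is hypothetical, and because this README does not list the file names inside each checkpoint directory, it only checks that each directory exists and is not empty.

```python
from pathlib import Path

# Checkpoint locations named in this README.
CHECKPOINT_DIRS = ("ckpts/pfas", "ckpts/lipid")


def missing_checkpoints(repo_root=".", subdirs=CHECKPOINT_DIRS):
    """Return the checkpoint directories that are absent or contain no files."""
    missing = []
    for sub in subdirs:
        path = Path(repo_root) / sub
        # any() over iterdir() is True as soon as one entry is found.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(sub)
    return missing


if __name__ == "__main__":
    for sub in missing_checkpoints():
        print("no weights found in {}".format(sub))
```

Only the directory passed to --log_path needs to be populated; a missing entry for the other model is harmless if you do not plan to use it.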

Predict real data

We provide example data in data/example.

For PFAS, run:

python tools/eval_standard.py --log_path ckpts/pfas --real_csv ./data/example/pfas.csv --out_csv ./pfas_results.csv --beam_size 500 --polar neg

For lipids, run:

python tools/eval_standard.py --log_path ckpts/lipid --real_csv ./data/example/lipid.csv --out_csv ./lipid_results.csv --beam_size 300 --polar pos

You will then obtain a results CSV file containing the top 10 predictions.
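
To take a first look at the output, the header and the first few rows of the results file can be printed with the standard library. This is a minimal sketch: the helper `preview_csv` is hypothetical, and since the column layout of the results file is not documented here, it makes no assumption about column names.

```python
import csv
from itertools import islice
from pathlib import Path


def preview_csv(path, n_rows=3):
    """Return (header, first n_rows data rows) of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no lines
        rows = list(islice(reader, n_rows))
    return header, rows


if __name__ == "__main__":
    # Output path taken from the PFAS command above.
    results = Path("./pfas_results.csv")
    if results.exists():
        header, rows = preview_csv(results)
        print(header)
        for row in rows:
            print(row)
```

For the lipid run, point the same helper at ./lipid_results.csv.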

Todos

Release model weights
Release pseudo and real data
Release training process
