ANI-2x training data set #39

Open
JMorado opened this issue Jun 22, 2021 · 0 comments

JMorado commented Jun 22, 2021

Hi,

How can one find out exactly which data set was used to train ANI-2x?

The original ANI-2x paper states that the training data set is composed of molecules from a variety of sources, including the GDB-11 database, the ChEMBL database, the S66x8 benchmark, and some randomly generated amino acids and dipeptides.
However, as far as I understand, these data sets are not included in full, because specific sampling techniques are applied to them.

Is it possible to find out exactly which molecules were used for training?

Thank you.
Best,
João
