Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Script for generating AM1-BCC charges? #94

Open
ghutchis opened this issue Jul 5, 2022 · 9 comments
Open

Script for generating AM1-BCC charges? #94

ghutchis opened this issue Jul 5, 2022 · 9 comments
Labels
enhancement New feature or request

Comments

@ghutchis
Copy link

ghutchis commented Jul 5, 2022

Is your feature request related to a problem? Please describe.

The AM1-BCC charge scheme is fairly popular in drug discovery, but as far as I can tell, only Amber and OpenEye toolkits can generate them.

Describe the solution you'd like

I'd love to have an example script that can take a molecule in SDF format and use Open Babel or RDKit to calculate AM1-BCC charges using open-source MOPAC.

@godotalgorithm godotalgorithm added the enhancement New feature or request label Jul 6, 2022
@godotalgorithm
Copy link
Collaborator

This is a reasonable request, and the possibility of AM1-BCC in MOPAC has come up before in development discussions. The AM1 atomic charge part of the AM1-BCC model is straightforward to produce with MOPAC, it is mainly the atom and bond typing that is required for the BCC correction that can get tricky. As suggested, this process needs connectivity information, which could be input from something like an SDF file or produced internally by MOPAC using its Lewis structure assignment feature. Even with this connectivity information, there is a non-trivial amount of logic that is needed to make the type assignments, and without an existing open-source implementation, someone will have to re-implement it.

Are you aware of any open-source implementations of comparable/compatible atom and bond typing given molecular connectivity information?

Whether as an external script or integrated into MOPAC, I do see this as a desirable feature. I'd be very willing to accept a feature like that, and I might be willing to take a pass at implementing the typing logic myself in the near future.

Presumably Amber and/or OpenEye can be used to verify an implementation. With so few previous implementations, I'm slightly worried about a deviation between principle and practice - the typing logic in the implementation papers might not 100% match what Amber/OpenEye do now.

@ghutchis
Copy link
Author

ghutchis commented Jul 6, 2022

Are you aware of any open-source implementations of comparable/compatible atom and bond typing given molecular connectivity information?

Yes. AmberTools is GPL and has the relevant typing and code.
http:https://ambermd.org/GetAmber.php#ambertools
conda install -c conda-forge ambertools=22

For example:
antechamber -i name.sdf -fi sdf -o lig1.mol2 -fo mol2 -c bcc # typing and BCC charges to mol2
antechamber -i lig1.mol2 -fi mol2 -o lig2.mol2 -fo mol2 -c wc -cf charges.txt # write to charges.txt

I haven't looked through the source, but antechamber prints out what it's doing with sqm, etc.

I can certainly take a test set of molecules and generate the relevant types / charges for comparison. Let me know if you have a preferred set or I'll find something we have here.

@godotalgorithm
Copy link
Collaborator

I may be slightly misinterpreting this request and/or the availability of various components of Amber. The antechamber program from AmberTools has command-line options to use mopac in place of the default sqm option for the AM1 calculation (-df 0). However, this use case isn't well documented, and it appears to require a missing script file, mopac.sh, in the same directory as the AmberTools programs. This file runs MOPAC, making sure that the input file is named mopac.in and the output file is named mopac.out. For example, I got it to work with the script:

sed 's/ANALYT//g' mopac.in > mopac_temp.mop
mopac mopac_temp.mop
cp mopac_temp.out mopac.out
rm mopac_temp.*

I also had to remove an obsolete, unnecessary keyword, ANALYT, that antechamber is trying to use. This appears to run correctly, but I didn't verify that the results are equivalent to when sqm is used. I expect that you have a more handy and thorough set of tests, and if there are further problems or related requests, feel free to reopen this issue. Otherwise, I'm not sure there is anything for me to do here, since antechamber appears to be able to meet your request with a small amount of tinkering.

@flatstik
Copy link

flatstik commented Jul 7, 2022

Just for curiosity - is there some reason why to use AM1-BCC instead of PM7 charges?

@ghutchis
Copy link
Author

ghutchis commented Jul 7, 2022

There are people who prefer AM1-BCC for various reasons. Personally, since it emulates HF, I'd probably prefer some sort of DFT-RESP model or a more advanced electrostatics model, but YMMV.

@ghutchis
Copy link
Author

ghutchis commented Jul 7, 2022

I'm hoping for an implementation outside of antechamber but if you don't want to deal with the BCC part, that's fine.

@godotalgorithm godotalgorithm reopened this Jul 7, 2022
@godotalgorithm
Copy link
Collaborator

I was just unclear about the request - whether it was to have a way to produce AM1-BCC results using MOPAC AM1 outputs (which antechamber does allow for) or to remove the AmberTools dependency completely. I can't just copy the atomtype and bondtype source from AmberTools into MOPAC unless I move the overall license "further left" from LGPL to GPL, which I am not prepared to do. However, having self-contained open-source references for these features would make them a lot easier to independently re-implement in MOPAC if necessary.

This isn't going to have a high priority since it's more about eliminating an undesirable dependency than providing previously unavailable functionality, but I'll consider it to be under active consideration.

As for reasons to use AM1-BCC charges over raw PM6/PM7 charges, I can see pros/cons to both. AM1-BCC was explicitly fit to reproduce charge data (Hartree-Fock-based RESP charges), whereas the most direct modeling pressure on PM6/PM7 to produce good charges is dipole-moment reference data. On the other hand, the reference data for AM1-BCC isn't necessarily that accurate (Hartree-Fock rather than post-HF data), the BCC corrections don't seem to be that large in practice, and AM1-BCC charges are not continuous functions of atomic coordinates since they depend on discrete connectivity data. There is a recent survey of semiempirical dipole moments where PM6/PM7 are reasonably keeping pace with other models, but again I'm not sure what this says about the quality of the underlying charges.

@ghutchis
Copy link
Author

ghutchis commented Jul 7, 2022

I don't know if I have a ton of free coding time, since I'm pouring most of it into Avogadro2 right now. But I'll see if we can implement some of the BCC pieces in straight RDKit or some other form.

@godotalgorithm
Copy link
Collaborator

For anyone trying to use antechamber with MOPAC, there is another problem with the MOPAC input files that it generates besides the presence of the obsolete ANALYT keyword. The input file has only one comment line instead of two, so the first atom in the molecule is not read in properly. A mopac.sh job script that fixes the problem is:

sed 's/ANALYT//g' mopac.in | sed 's/for mopac/for mopac\n/g' > mopac_temp.mop
mopac mopac_temp.mop
cp mopac_temp.out mopac.out
rm mopac_temp.*

I've reported this problem to AmberTools developers, and it will hopefully be fixed in future versions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants