Infrastructure for making input files for QC codes #43

Closed
tovrstra opened this issue Jan 28, 2019 · 9 comments

@tovrstra
Member

tovrstra commented Jan 28, 2019

We should have functions to easily write input files for a few quantum chemistry codes of interest (Gaussian, PSI4, ORCA). Suggested API:

from string import Template

# This is just a bare example template, probably lots of features can be added.
default_orca_template = """\
! ${lot} ${basis_name} ${runtype}

*xyz ${charge} ${spinmult}
${atom_lines}
*
"""

def write_input_orca(filename, iodata, template=None):
    """TODO docstring."""
    # some code here to translate iodata attributes to specific names and commands
    # of the QC code. At least the following should be present
    fields = dict(vars(iodata))  # shallow copy: adding fields must not alter iodata
    # which can be followed by some code that edits the fields dictionary.
    # For example, you may want to add an 'atom_lines' field that contains a multi-line
    # string with atomic elements, atomic coordinates, and possibly other things like
    # ghost-atom modifiers, etc.
    # Just make sure the values of the fields dictionary are not modified in-place.
    # In some cases, you want to use part of the filename as a field, e.g. to construct
    # the filename of the chk, molden or wfn file.
    # Finally, call a generic routine.
    if template is None:
        template = default_orca_template
    _write_input_generic(filename, fields, template)

def _write_input_generic(filename, fields, template):
    """Write a QC input file.
    
    Parameters
    ----------
    filename
        Filename of the input file to be written.
    fields
        Dictionary of parameters that could potentially be written into the input file.
    template
        A template string.
    """
    with open(filename, 'w') as f:
        f.write(Template(template).substitute(fields))

Required attributes for the IOData class are defined in #41.

Some remotely related code can be found here: https://github.com/theochem/horton/blob/master/horton/scripts/atomdb.py
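To make the role of the `atom_lines` field concrete, here is a minimal sketch of how the fields could be assembled and substituted into the template. It is not the iodata implementation: `SimpleNamespace` stands in for an IOData object, and the attribute names `atnums` and `atcoords`, the `SYMBOLS` table, and the helpers `make_atom_lines` and `render_orca_input` are assumptions made for illustration. Coordinates are assumed to be in angstrom already.

```python
from string import Template
from types import SimpleNamespace

ORCA_TEMPLATE = """\
! ${lot} ${basis_name} ${runtype}

*xyz ${charge} ${spinmult}
${atom_lines}
*
"""

# Hypothetical lookup table; a real implementation would cover all elements.
SYMBOLS = {1: "H", 8: "O"}


def make_atom_lines(atnums, atcoords):
    """Format one 'symbol x y z' line per atom (coordinates assumed in angstrom)."""
    return "\n".join(
        f"{SYMBOLS[n]:<2s} {x:12.6f} {y:12.6f} {z:12.6f}"
        for n, (x, y, z) in zip(atnums, atcoords)
    )


def render_orca_input(data, template=ORCA_TEMPLATE):
    """Return the contents of the input file as a string."""
    fields = dict(vars(data))  # copy, so the stand-in object is left untouched
    fields["atom_lines"] = make_atom_lines(data.atnums, data.atcoords)
    # Fields that the template does not mention are simply ignored.
    return Template(template).substitute(fields)


# Stand-in for an IOData object; the attribute names are assumptions.
water = SimpleNamespace(
    lot="HF", basis_name="sto-3g", runtype="SP", charge=0, spinmult=1,
    atnums=[8, 1, 1],
    atcoords=[(0.0, 0.0, 0.117), (0.0, 0.757, -0.469), (0.0, -0.757, -0.469)],
)
text = render_orca_input(water)
print(text)
```

Returning a string rather than writing a file keeps the sketch easy to inspect; the `_write_input_generic` function above would do the actual writing.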

@dgasmith

It would be good to join forces on this; we will begin implementing this kind of tech in two of our codes, Elemental and Engine. Both of these will produce and consume schema.

So far we haven't focused much on increasing the number of codes that we interface to, but as this ecosystem enters beta that will become a focus. Feel free to join our Slack from any link on the READMEs.

@tovrstra
Member Author

@dgasmith That could be interesting and I have considered it, but we seem to have requirements that are not covered in QCEngine. Anyway, your comments would be very much appreciated.

We're not aiming to fully standardize the interface to QC codes. We just want something pragmatic to generate input files for various use cases. One requirement, which goes against standardization, is that users should be able to provide their own template input file, to make it possible to control uncommon features of a QC code. Similarly, we'd like to keep the existing output files rather than only a condensed summary, because they sometimes contain useful but very code-specific results. (These two points were useful in previous studies.) Would QCEngine somehow be able to deal with these requirements?
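As an illustration of the user-template requirement, the sketch below fills a user-written ORCA template that adds a `%maxcore` line the default template lacks. The `render` helper, its `extra_fields` keyword arguments, and the field name `maxcore_mb` are hypothetical and only serve the example.

```python
from string import Template

# A user-written template with an ORCA-specific line not in the default one.
USER_TEMPLATE = """\
! ${lot} ${basis_name} ${runtype}
%maxcore ${maxcore_mb}

*xyz ${charge} ${spinmult}
${atom_lines}
*
"""


def render(template, fields, **extra_fields):
    """Fill a template; extra_fields (hypothetical) carry user-only settings."""
    merged = {**fields, **extra_fields}
    return Template(template).substitute(merged)


fields = {
    "lot": "B3LYP", "basis_name": "def2-SVP", "runtype": "Opt",
    "charge": 0, "spinmult": 1,
    "atom_lines": "He 0.0 0.0 0.0",
}
text = render(USER_TEMPLATE, fields, maxcore_mb=2000)
print(text)

# A placeholder without a matching field raises KeyError, so a typo in a
# user template fails loudly instead of producing a broken input file.
try:
    render(USER_TEMPLATE, fields)
    missing = None
except KeyError as exc:
    missing = exc.args[0]
```

Using `Template.substitute` rather than `safe_substitute` is a deliberate choice in this sketch: an unfilled `${...}` left in an input file would only be noticed when the QC code fails to parse it.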

@dgasmith

Generating input files was, granted, not originally in our purview. However, we have had many requests for this, and three of the MolSSI Software fellows have decided to take this on, so we have been working to relax this constraint. As this decision was made about a week ago, progress has been relatively small and we are still working out the best way to accomplish this behavior. This is an area where we would really like to join forces.

We always intended to keep full output files by default as well, and we do so for all compute backends that contain them (this is the default for all programs except Psi4, which will change today). There are stdout/stderr fields for the canonical case, and we are still discussing how to handle multiple output files.

We can bring more people to the conversation on our Slack.

@evohringer
Contributor

evohringer commented Mar 1, 2019

I'm thinking about the best way to organize the interface to other QC programs so that it is compatible with the existing iodata structure.

Here are some thoughts:

  • To have one class which writes all input files, depending on the selected QC program and the provided template
  • To have an output log parser for each QC program, e.g. cp2k, gaussian, etc.
  • Wavefunction-containing files have their own load and dump functions since they are more specific.

I just started thinking about how to begin and was not sure how to proceed: make something specific for one QC program, or something generic with different options for QC input files, as @tovrstra suggested.

@tovrstra
Member Author

tovrstra commented Mar 4, 2019

The parsers could just use the same API as we already have with the load functions for file formats. See e.g. for Gaussian log files https://github.com/theochem/iodata/blob/master/iodata/log.py. (The module should be renamed for clarity, but that is a separate issue.) For this part, what is being loaded (wavefunctions or other things) does not matter so much, as long as we have the quantities to be loaded in iodata.py. If you feel some things are missing, just comment here: #41.

For writing input files, classes seem overkill. Just functions, as shown in the first message, should cover all the needs. This also makes it fairly easy to move all the shared code to a low-level but generic write_input function, which does not have to be much more than what is already shown in the example. When writing inputs for different QC codes, the main differences are found in how to write out the geometry.
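A rough sketch of this function-based layout is given below, assuming a plain dictionary of fields instead of an IOData object. The field names `symbols` and `coords`, the helper `_format_atoms`, and both bare-bones templates are assumptions for illustration, not the code that ended up in iodata. Each per-code function only supplies a default template and a geometry format, and the shared writer does the rest.

```python
import os
import tempfile
from string import Template

ORCA_TEMPLATE = """\
! ${lot} ${basis_name}

*xyz ${charge} ${spinmult}
${atom_lines}
*
"""

# Gaussian inputs end with a blank line, hence the trailing empty line.
GAUSSIAN_TEMPLATE = """\
# ${lot}/${basis_name}

${title}

${charge} ${spinmult}
${atom_lines}

"""


def _write_input_generic(filename, fields, template):
    """Shared low-level writer: fill the template and write it to disk."""
    with open(filename, "w") as f:
        f.write(Template(template).substitute(fields))


def _format_atoms(fields, line_format):
    """Build the geometry block, the main code-specific part."""
    return "\n".join(
        line_format.format(sym=sym, x=x, y=y, z=z)
        for sym, (x, y, z) in zip(fields["symbols"], fields["coords"])
    )


def write_input_orca(filename, fields, template=None):
    fields = dict(fields)  # copy, so the caller's dictionary is not modified
    fields["atom_lines"] = _format_atoms(
        fields, "{sym:<2s} {x:12.6f} {y:12.6f} {z:12.6f}"
    )
    if template is None:
        template = ORCA_TEMPLATE
    _write_input_generic(filename, fields, template)


def write_input_gaussian(filename, fields, template=None):
    fields = dict(fields)
    fields.setdefault("title", "Generated input")
    fields["atom_lines"] = _format_atoms(fields, "{sym} {x:.8f} {y:.8f} {z:.8f}")
    if template is None:
        template = GAUSSIAN_TEMPLATE
    _write_input_generic(filename, fields, template)


fields = {
    "lot": "HF", "basis_name": "sto-3g", "charge": 0, "spinmult": 1,
    "symbols": ["H", "H"],
    "coords": [(0.0, 0.0, 0.0), (0.0, 0.0, 0.74)],
}
with tempfile.TemporaryDirectory() as tmp:
    orca_path = os.path.join(tmp, "h2.inp")
    gauss_path = os.path.join(tmp, "h2.com")
    write_input_orca(orca_path, fields)
    write_input_gaussian(gauss_path, fields)
    with open(orca_path) as f:
        orca_text = f.read()
    with open(gauss_path) as f:
        gauss_text = f.read()
```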

@evohringer
Contributor

Thanks for the quick reply. I will write a first version of an ORCA log parser and put all QC input file formats in one file, "write_input.py", as functions, as you suggested in the beginning.

Once ready, I will push the draft versions to get feedback from everyone.

@FarnazH
Member

FarnazH commented Aug 21, 2020

A note from @tovrstra about a feature not supported in PR #188 when writing Gaussian input files:
"The format of the atom_lines should be configurable, with about the same level of flexibility as the template for the whole input file. See e.g. the Gaussian Molecule Specification: https://gaussian.com/molspec/. For some job types, more information than element and Cartesian coordinates needs to be added (fragment, charge, ...)."

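One way a configurable atom line could look is sketched here with a second, per-atom `string.Template`. The function `format_atom_lines`, its `atom_line_template` argument, and the per-atom field names are hypothetical and do not describe the API of PR #188. The `Fragment=` label follows the Gaussian molecule specification linked above.

```python
from string import Template

DEFAULT_ATOM_LINE = "${symbol} ${x} ${y} ${z}"


def format_atom_lines(atoms, atom_line_template=DEFAULT_ATOM_LINE):
    """Render one line per atom from a list of per-atom field dictionaries."""
    line = Template(atom_line_template)
    return "\n".join(line.substitute(atom) for atom in atoms)


# Per-atom fields; anything beyond symbol and coordinates is optional and
# only used when the atom-line template refers to it.
atoms = [
    {"symbol": "O", "x": "0.000000", "y": "0.000000", "z": "0.117000", "fragment": 1},
    {"symbol": "H", "x": "0.000000", "y": "0.757000", "z": "-0.469000", "fragment": 1},
    {"symbol": "He", "x": "0.000000", "y": "0.000000", "z": "3.000000", "fragment": 2},
]

plain = format_atom_lines(atoms)
with_fragments = format_atom_lines(
    atoms, "${symbol}(Fragment=${fragment}) ${x} ${y} ${z}"
)
print(with_fragments)
```

The resulting multi-line string would then be passed as the `atom_lines` field of the whole-file template, so both levels stay user-configurable.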
@tovrstra
Member Author

@FarnazH I'm going to turn that comment into a separate issue, so we can deal with it in a separate PR to keep things manageable.

@tovrstra
Member Author

tovrstra commented Apr 1, 2021

All points in this issue are addressed. Related issues are #192 (with PR #253) and #221. Feel free to continue discussions there or open new issues if needed.

@tovrstra tovrstra closed this as completed Apr 1, 2021