General API for preparing IOData object before dumping #191

Open
15 of 22 tasks
tovrstra opened this issue Aug 23, 2020 · 1 comment
Labels
API breaking Should be done first to stabilize API

tovrstra commented Aug 23, 2020

Motivation

At the moment, it is mostly assumed that an IOData instance contains all the right attributes in the right form before it is passed on to a dump_one call. Some file formats (WFN, WFX and potentially also FCHK) modify the IOData object to make it compatible with the format. Typical modifications include:

  • WFX/WFN: converting basis to Cartesian functions (with transformation of the MO coeffs)
  • WFX/WFN: decontracting the basis (with transformation of the MO coeffs)
  • FCHK: converting (natural) orbitals to a density matrix, recontracting basis sets.
  • In general, and possibly too far-fetched: reverse-engineering contractions from WFN/WFX files.

This is problematic for several reasons:

  • It makes dump_one functions long and complicated.
  • Users may not be aware of the conversion taking place, which may result in loss of information.
  • It may sometimes be of interest to disable conversions, e.g. when they are optional or when the user does not want any conversions (and prefers an exception to be raised instead). The latter is typical when converting large data sets, where unintended loss of information must be avoided.
  • Some of the current conversions introduce redundant data, which results in inefficient use of storage.

See also:

Proposal

  • Add an optional prepare_dump function to the file format modules. If present, it takes an IOData instance as argument and returns a potentially modified one. The given IOData instance is not modified. An option allow_changes=False should be added to allow disabling any conversion. If this flag is set to False and the file cannot be written without conversion, an exception is raised. If this flag is set to True, a warning is emitted when a conversion is applied.

  • The dump_one and dump_many functions in the file format modules call the new prepare_dump function before dumping.

  • Add an allow_changes=False option to the dump_one and dump_many functions in the file format modules. This is passed on to the prepare_dump function. The dump functions return the potentially modified IOData instance(s).

  • Add an allow_changes=False option to the dump_one and dump_many functions in the module iodata.api. This is passed on to the dump functions of the selected file format. These dump functions also return the potentially modified IOData instance(s).

  • Factor out some of the reusable utility functions to modify the IOData object, e.g. manipulations of basis set and corresponding changes to MO coefficients.

  • Add an option to the script iodata-convert to enable or disable modifications before dumping.

  • Add a basic sanity check to dump_one and dump_many that verifies that required attributes are not None before creating a file. Without this check, a missing attribute raises an error during writing, which may overwrite the output with an empty file. That is never useful and may occasionally lead to data loss. This type of pre-flight check could be added to prepare_dump, but it is better to write one general implementation for all file formats, so it is always checked.
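
The proposal above could look roughly like the sketch below. It is only an illustration under stated assumptions and not IOData code: FakeIOData, its basis_kind field, PrepareDumpError, PrepareDumpWarning and check_required are hypothetical stand-ins. Only the names prepare_dump, dump_one and allow_changes come from the proposal itself.

```python
import warnings
from dataclasses import dataclass, replace
from typing import Optional


class PrepareDumpError(ValueError):
    """Hypothetical error: the data cannot be written as given."""


class PrepareDumpWarning(UserWarning):
    """Hypothetical warning: the data was converted before writing."""


@dataclass(frozen=True)
class FakeIOData:
    """Stand-in for IOData with only the fields needed in this sketch."""

    atnums: Optional[tuple] = None
    basis_kind: Optional[str] = None  # assumed values: "pure" or "cartesian"


def check_required(data, required, fmt):
    """General pre-flight check, run before any file is created."""
    missing = [name for name in required if getattr(data, name, None) is None]
    if missing:
        raise PrepareDumpError(
            f"Cannot write {fmt} file, missing attribute(s): {', '.join(missing)}"
        )


def prepare_dump(data, allow_changes=False):
    """Return data in a form the (imaginary) format can store.

    The format is assumed to support Cartesian functions only.
    The argument is never modified; a converted copy is returned.
    """
    if data.basis_kind == "cartesian":
        return data
    if not allow_changes:
        raise PrepareDumpError("Conversion to Cartesian functions is required.")
    warnings.warn(
        "Converting basis to Cartesian functions.", PrepareDumpWarning, stacklevel=2
    )
    return replace(data, basis_kind="cartesian")


def dump_one(data, f, allow_changes=False):
    """Check, prepare, write to the file-like object f, and return the data."""
    check_required(data, ["atnums", "basis_kind"], "example")
    data = prepare_dump(data, allow_changes)
    f.write(f"natom={len(data.atnums)} basis={data.basis_kind}\n")
    return data
```

With this layout, a caller who passes allow_changes=False gets an exception instead of a silent conversion, and a caller who passes allow_changes=True can inspect the returned object to see what was actually written.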

TODO list

tovrstra commented Jun 6, 2024

Another example of required conversion is discussed in #252: many formats do not support restricted orbitals with "unrestricted occupation numbers". In this case, the orbitals need to be converted to unrestricted form to be able to write a file.
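
As a rough illustration of such a conversion, the sketch below duplicates one set of spatial orbitals into separate alpha and beta blocks. It uses plain NumPy arrays rather than IOData objects. The function name restricted_to_unrestricted and the column layout (alpha orbitals first, then beta) are assumptions made for this example.

```python
import numpy as np


def restricted_to_unrestricted(coeffs, occs_a, occs_b, energies):
    """Duplicate restricted orbitals into alpha and beta blocks.

    coeffs has shape (nbasis, norb). occs_a, occs_b and energies have
    shape (norb,). The result uses an assumed layout with the alpha
    orbitals in the first norb columns and the beta orbitals after them.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    occs_a = np.asarray(occs_a, dtype=float)
    occs_b = np.asarray(occs_b, dtype=float)
    energies = np.asarray(energies, dtype=float)
    norb = coeffs.shape[1]
    if not (occs_a.shape == occs_b.shape == energies.shape == (norb,)):
        raise ValueError("Occupations and energies must have shape (norb,).")
    # Both spin channels share the same spatial orbitals and energies;
    # only the occupation numbers differ between alpha and beta.
    return (
        np.hstack([coeffs, coeffs]),
        np.concatenate([occs_a, occs_b]),
        np.concatenate([energies, energies]),
    )
```

The converted data takes twice the storage for the coefficients, which is the kind of redundancy that an allow_changes flag would make explicit to the user.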
