-
Notifications
You must be signed in to change notification settings - Fork 1
evbopt
This program generates and optimizes the parameters of EVB coupling terms. The coupling smoothly connects two QMDFFs, therefore these need to be present (generated with qmdffgen) when an EVB coupling term is optimized. In addition, some reference data describing the reaction must be present, serving as the training set for the optimization routine. The xyz-coordinates of the structures and the respective reference energies in a linewise format must be present. Usually, a reaction path, generated from a NEB or IRC calculation, would be used for this. For the RP-EVB and the related TREQ method, a reaction path is mandatory.
List of examples:
- 1: dE-EVB (energy difference coupling term): evbopt/dE-EVB
- 2: dQ-EVB (coordinate-dependent coupling term): evbopt/dQ-EVB
- 3: DG-EVB (distributed Gaussian coupling term), folder: evbopt/DG-EVB
Folder: evbopt/dE-EVB
The simplest EVB coupling version is the energy-gap (dE)-EVB coupling.
Here, the off-diagonal coupling term depends only on the energetic distances of both
QMDFFs at the current coordinates (delta E or short dE).
A bunch of different functions depending on dE can be used, the easiest are
the constant coupling a with no dE-dependence at all and the 1g
coupling C= a exp(-b deltaE^2)
with a Gaussian shape.
The main input file is the \texttt{evbopt_de_1g.key} file, where all essentially all control parameters are listed:
pes de_evb
energies_ref ref.dat
coords_ref struc.xyz
qmdff {
ffnames min1.qmdff min2.qmdff
eshift -272.22659664700754 -272.16798367875066
shift_manual
}
de_evb{
coupling sp
}
The first keyword is always needed and specifies the used class of PES functions. Here, we
want to use the dE-EVB coupling (de_evb
). Other options (for evbopt) are
dQ-EVB (dq_evb
) and DG-EVB (dg_evb
), see below.
Both the energies_ref
and coords_ref
keywords are needed to provide the
needed reference data for the reaction of interest that shall be described with EVB.
Usually this would be a one-dimensional reaction path, but higher dimensional sampling
data could be used as well.
The keywords in the qmdff
section: ffnames
and eshift
specifiy the properties of the
single QMDFFs, both their filenames and the energy shifts, where shift_manual
prevents the automatic correction of QMDFF shifts to the first/last given energy point of the reaction path. These parameters are written
automatically if qmdffgen is executed first (see section qmdffgen).
The optimization is invoked by typing:
evbopt.x evbopt.key
A multi-start local-search Levenberg-Marquardt optimization is executed with the objective
to minimize the error sum between the EVB-QMDFF energies and the reference energies listed
in ref.dat
, an usual output is:
...
Used off-diagonal-Term:
a*exp(-b*deltaE^2)+c*deltaE*exp(-d*deltaE^2) (sp)
After Multi-Start-Local-Search step: 1
Local optimization finished, new best fitness: 8.3897511547073188E-004
After Multi-Start-Local-Search step: 15
Local optimization finished, new best fitness: 8.3877256721804922E-004
After Multi-Start-Local-Search step: 35
Local optimization finished, new best fitness: 8.3632257233213895E-004
.. .----. -- -.. --- -. .
Multi start local optimization finished!
The best fitness-value is: 8.3632257233213895E-004
Look into evbopt.log for further details.
...
The total fitness is the central indicator of how good the reference reaction path is reproduced with the EVB-QMDFF model, however, in order to gain more detailed knowledge of the quality, one can look at the additional output, namely:
-
energies_final.dat
: The EVB-QMDFF energies of all reference structures -
single_qmdff.dat
: The energies of both QMDFFs of all reference structure
Plotting the files together with the reference energies in ref.dat
results in the
figure shown in the figure:
The optimized EVB parameters (which are needed for subsequent calculations with other programs
like dynamic), are written to file
evb_pars.dat
. The resulting file will look something like this (the exact values differ in each
run due to the randomized parameter search algorithm):
** This file contains parameters for a EVB-dE-calculation **
0.11580080347625064
5.7673300124919509
1.9401589887079447E-004
-4.7760155261137269
Folder: evbopt/dQ-EVB
Within coordinate-dependent or dQ-EVB, the coupling strength depends on the distance to a chosen point within the configuation space of the system. Usually, the transition state (TS) of the reaction (or a structure nearby) should be chosen to be that point. In order to get a rotationally invariant description, the distance to the TS is measured in internal coordinates. It is not required to define a full set of 3N-6 internal coordinates for this, instead, a small subset being representative for the reaction is usually sufficient and more effective.
The main input file is again the evbopt.key
file, containing all general settings:
pes dq_evb
energies_ref energies_ref.dat
coords_ref path.xyz
qmdff {
ffnames min1.qmdff min2.qmdff
eshift -195.19018649184099 -195.19422087959535
}
dq_evb {
ts_file ts.xyz
coupling sd2
}
Its structure is mostly identical to that for the dE optimization. The dQ coupling term is activated with the pes dq_evb
keyword.
Names and energy shifts of QMDFFs are read in from the qmdff
section, as well as the structures and the energies of the reaction path serving as reference for the parameter optimization (energies_ref energies_ref.dat
and coords_ref path.xyz
).
The dq_evb
section provides specific settings for the dQ coupling term: the coupling function (sd2
) and the structure of the reference point, in this case also the TS: ts_file ts.xyz
.
Besides evbopt.key
and the other files speficied in it (min1.qmdff
, min2.qmdff
, energies_ref.dat
, path.xyz
and ts.xyz
), the set of internal coordinates to be used must be defined manually and written to the file coord_def.inp
, where one internal coordinate is defined in each line:
3 7
7 10
3 10
1 7
In this case, four bond lengths are the active set of internal coordinates, referring to the region most interesting for the reaction:
The optimization can be invoked as usual by typing
evbopt.x evbopt.key
The optimization will be finished after a fraction of a second, printing something like this:
...
Used off-diagonal-Term:
a*exp(-b*deltaQ^2)+c*deltaQ^2*exp(-d*deltaQ^2)+
e*deltaQ^2*exp(-f*deltaQ^2) (sd2)
. ...- -... --.- -- -.. ..-. ..-.
After Multi-Start-Local-Search step: 1
Local optimization finished, new best fitness: 1.4688854696209078E-003
After Multi-Start-Local-Search step: 2
Local optimization finished, new best fitness: 1.4679300705026375E-003
After Multi-Start-Local-Search step: 5
Local optimization finished, new best fitness: 1.4677972875365361E-003
After Multi-Start-Local-Search step: 7
Local optimization finished, new best fitness: 1.4675796326565325E-003
After Multi-Start-Local-Search step: 26
Local optimization finished, new best fitness: 1.4675447295631309E-003
.. .----. -- -.. --- -. .
Multi start local optimization finished!
The best fitness-value is: 1.4675447295631309E-003
Look into evbopt.log for further details.
...
Besides the file energies_final.dat
with the (optimized) EVB-QMDFF energies of the structures in path.xyz
and single_qmdff.dat
containing the energies of both QMDFFs for those structures, the optimized coupling parameters are again written to evb_pars.dat
.
The quality of the optimized coupling can again be accessed by plotting QMDFF and EVB-QMDFF energies besides the reference energies:
Folder: evbopt/DG-EVB
The distributed Gaussian (DG-EVB) coupling term enables the inclusion of QM-reference gradients and frequencies into the formulation of the transition region, offering a much better quality than dE-EVB or dQ-EVB.
The following explanation and the example assumes that an optimized 1D reaction path of the reaction of interest is given. In principle, however, arbitrary collections of reference data, e.g. from MD samplings, can be used as well.
Including gradients and frequencies to the used reference set needs somewhat more preparational work than using only energies of all structures of the reaction path. The first question is: how many and which points along the reaction path shall be chosen. In principle, we could simply choose all given structures and calculate gradients and frequencies on it. Depending on the resolution of the path, dozens to hundreds of frequency calculations might be needed, which almost always implies heavy computational demands. Further, the separate optimization of coupling basis function widths needs an additional set of energies along the path. In this example, the resulting input files are provided, such that this preparational part might be skipped. It serves, however, as a template for how to prepare an DG-EVB-QMDFF surface for other reactions of interest.... bla bla
For the manual preparation, an utility script for the selection and preparation
of gradient/frequency reference is given in the scripts main folder.
Copy the script dg_evb_ref.sh
to the
actual folder if you want to redo the reference calculations.
It requires the reaction path stored in struc.xyz
and a list of structure numbers
on which the reference data shall be calculated: struc_num.txt
.
A file containing the header of the desired QM input (for Gaussan: gauss.com
,
for orca: orca.inp
, for CP2K: run.inp
) must be provided as well.
For convenience, the folder structure was already prepared for seven reference points distributed along the path such that the transition state and the important transition region is covered well. Gaussian is here the chosen QM program, the frequency calculations can be executed for all seven folders.
After this, the needed reference input file dg_ref.dat
can be generated
by executing the program gen_dg_evb
.
The main input file for EVB optimizations is evbopt.key
. It has the following
entries:
pes dg_evb
energies_ref ref.dat
coords_ref struc.xyz
qmdff {
ffnames min1.qmdff min2.qmdff
eshift -132.2974220 -132.3079820
shift_manual
}
dg_evb{
mode 2
points 7
point_ref dg_ref.dat
read_coord
}
After specification of a DG-EVB potential energy surfaces, the information concerning the
single QMDFFs and the used reference data for optimization is listed.
All settings concerning the DG-EVB coupling term in particular are listed inside the
dg_evb
block. They have the following meaning:
-
mode
: Which type of DG-EVB function built upon which kind of ererence shall be used.1
: only energies,2
: energies and gradients,3
energies, gradients and Hessians. -
points
: Number of DG-EVB reference points on which the energies, gradients and Hessians were calculated. -
point_ref
: File in which the reference information for the DG-EVB points is stored. -
read_coord
: If given, the defined internal coordinates are read in from the filecoord_def.inp
, else, they are determined automatically.
Altogether, 4 or 5 different input files are needed to perform the optimization of a DG-EVB coupling term (named as above):
-
evbopt.key
: Main input file -
ref.dat
: File with energies of the reaction path for Gaussian exponent optimization. -
struc.xyz
: File with structures of the reaction path. -
dg_ref.dat
: Reference information for the DG-EVB points. -
coord_def.inp
: Definition of internal coordinates for DG-EVB coupling (optional).
The dg_ref.dat
file needs to have a predefined formate, which is generated by the
gen_dg_ref
program, given in the scripts folder.
Essentially, energies, geometries, gradients and Hessians of all reference points are there
listed in order, here for seven reference points along the path:
NPOINTS 7
NATOM 6
*POINT 1
ENERGY -132.073392
GEOMETRY
-0.2598966E+01 -0.1689562E+00 -0.2337807E-02 -0.2485496E+01 0.1654694E+01
0.3883103E-01 0.2523122E+01 -0.1090708E+00 -0.8583873E-02 0.1909804E+01
0.1018732E+00 0.1786691E+01 0.1266274E+01 -0.1242103E+01 -0.8934828E+00
0.2439287E+01 0.1600681E+01 -0.8532495E+00
END
GRADIENT
0.3391129E-02 0.5053883E-05 -0.3630294E-03 0.1683734E-03 -0.3711497E-05
0.1950042E-03 -0.2993883E-02 -0.4142712E-05 -0.4566316E-04 0.3697710E-04
0.4438535E-04 0.7573431E-04 -0.3227359E-03 -0.1882671E-04 0.1766393E-03
-0.2798607E-03 -0.2275832E-04 -0.3868530E-04
END
HESSIAN
0.1371910E-01 0.3332208E-01 0.2103093E-03 -0.4587440E-02 -0.3262215E-01
-0.6211021E-03 -0.5506326E-02 -0.1298989E-02 -0.9317258E-05 -0.4769846E-03
....
END
*POINT 2
...
Depending on the chosen DG-EVB coupling mode, energies (and geometries) might be sufficient as well.
If internal coordinates for the coupling are given in the coord_def.inp
file, a
predefined format must be obeyed as well:
One coordinate per line, each defined by the involved atom indices: Two for a bond,
three for an angle, and four for dihedral/out of plane. For the latter one, a fifth
number indicating if dihedral or out of plane must be added after the atom
indices, such that each dihedral/out of plane line consists of five numbers.
In the example system, the internal coordinate set defined in coord\_def.inp
consists of four bonds:
1 5
3 5
1 3
1 2
The choice of internal coupling coordinates is not exactly black-box. The usual rule is that coordinates involved in the reaction of interest are the most important, but the overall quality might significantly raise or fall if another assumingly unimportant coordinate is added or removed. In the evb_kt_driver program explained below a more systematic and quite well working approach of finding the internal coordinates for the there used TREQ PES is implemented.
The actual optimization of the DG-EVB coupling term is very simple: Just type
evbopt.x evbopt.key
and the MSLS-LM algorithm is executed to optimize the parameter. This will take some seconds to minutes.
The program output should look something like this:
...
Used DG-EVB mode: 3 - energies, gradients and hessians (E+G+H)
The dimension of the linear equation F=DB is 105
################## QMDFFs!
. ...- -... --.- -- -.. ..-. ..-.
DG-EVB-parameters are optimized with Multi Start Local Search.
We will start 100 single optimization runs.
The coefficient matrix D for DG-EVB will be calculated analytically!
Multi Start Local Search algorithm:
---------------------------------------------------
After Multi-Start-Local-Search step: 1
Local optimization not converged, new best fitness: 1.5411906000070418E-006
After Multi-Start-Local-Search step: 2
Local optimization finished, new best fitness: 1.5075939192383248E-006
After Multi-Start-Local-Search step: 10
Local optimization finished, new best fitness: 1.3147886030294805E-006
After Multi-Start-Local-Search step: 11
Local optimization finished, new best fitness: 1.2190281156368069E-006
After Multi-Start-Local-Search step: 16
Local optimization finished, new best fitness: 1.1965456308064260E-006
After Multi-Start-Local-Search step: 30
Local optimization finished, new best fitness: 1.1388462232211951E-006
After Multi-Start-Local-Search step: 78
Local optimization finished, new best fitness: 1.1059136690605482E-006
.. .----. -- -.. --- -. .
Multi start local optimization finished!
The best fitness-value is: 1.1059136690605482E-006
Look into evbopt.log for further details.
...
The total fitness is the central indicator of how good the reference reaction path is reproduced with the EVB-QMDFF model, however, in order to gain more detailed knowledge of the quality, one can look at the additional output, namely:
-
energies.qmdff
: The EVB-QMDFF energies of all reference structures -
single_qmdff.dat
: The energies of both QMDFFs of all reference structure
Plotting the files together with the reference energies in ref.dat
results in the
figure shown in the figure below.
CARACAL Wiki
Getting Started
Program Tutorials
Miscellaneous