Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

further cleaning of pkg/autodiff, pkg/ctrl, if necessary #786

Open
4 of 5 tasks
Tracked by #440
mjlosch opened this issue Oct 12, 2023 · 9 comments
Open
4 of 5 tasks
Tracked by #440

further cleaning of pkg/autodiff, pkg/ctrl, if necessary #786

mjlosch opened this issue Oct 12, 2023 · 9 comments

Comments

@mjlosch
Copy link
Member

mjlosch commented Oct 12, 2023

After closing #440, this is a new place for collecting and discussing cleanup of pkg/ctrl and pkg/autodiff. The remaining item on the list was:

  • adcommon.h establishes the transformation of ad variables from var_ad to advar. But the only place that this is used is in addummy_in_stepping.f. It seems like we could just rewrite this using the suffix form that TAF naturally generates, and save the headache of updating adcommon.h every time the common blocks change. Update: we decided not to address this issue as it is not as simple as it seemed originally, see also discussion below and issue naming convention for hand-written AD and TL src code #547
  • optim.h -> OPTIM.h? (PR rename optim.h into OPTIMCYCLE.h #787)
  • with the deprecated ECCO-legacy code gone, pkg/grdchk could be cleaned up to remove reference to unavailable control variables, also the relationship between grdchkvarindex, index, and ncvarindex(index) should be more transparent in the stdout (s/r ctrl_summary), so that is easier to find out which variable grdchkvarindex refers to (currently index, but I am not sure that that's the best choice). Now that the number control variables has been reduced, one could change ncvarindex to be the same as index and then it's clear what to use for grdchkvarindex. OR/AND: one could set the index according to the position in data.ctrl (for all control variables, not sure what to do about the obcs-contols, though). The only thing that ncvarindex seems to be used is to make sure that it is set (in ctrl_pack/unpack and in optim/optim_readdata.F). (PR grdchk incremental improvement #788 and another pkg/ctrl cleanup with implications for pkg/grdchk #796)
  • there are still a few variables in pkg/ctrl/CTRLh.h defined in common blocks that are only used as local variables in one file, e.g. filenvartype, etc. (PR another pkg/ctrl cleanup with implications for pkg/grdchk #796)
  • TURNOFF_MODEL_IO in ctrl_init_ctrlvar.F is executed each time a control variable is set-up (e.g., 5 control-vars, called 5 times). Solution: move the test for file cost_final from ctrl_init_ctrlvar.F (again, no need to check 5 times) to the calling routine ctrl_init.F and pass the information (new argument) to ctrl_init_ctrlvar.F (PR another pkg/ctrl cleanup with implications for pkg/grdchk #796)
  • do we need wunit at all?
@mjlosch mjlosch changed the title adcommon.h establishes the transformation of ad variables from var_ad to advar. But the only place that this is used is in addummy_in_stepping.f. It seems like we could just rewrite this using the suffix form that TAF naturally generates, and save the headache of updating adcommon.h every time the common blocks change. further cleaning of pkg/autodiff, pkg/ctrl, if necessary Oct 12, 2023
@jm-c
Copy link
Member

jm-c commented Oct 13, 2023

I am copying here some pieces that were mentioned in issue #440 and relevant to "adcommon,h" point:

  1. From @timothyas (on Jun 28, 2021):
    Turns out the adcommon issue is more tricky than expected. I thought that this only existed to deal with the fact that TAF defaults to adding the _ad suffix, but some routines use ad as a prefix for variables. It turns out this is not at all the case, and something like adcommon.h will always be necessary to compile the code so that e.g. the fields like theta_ad, etc are defined in theaddummy routines at compile time.

  2. From @jm-c (on Sep 11, 2023):
    @mjlosch the adcommon.h issue is more complicated that what is described here (I think even with the same suffix that TAF uses we will still need this header file).

And regarding the file names "adcommon.h" and "g_common.h", there is some discussion in issue #547

@timothyas
Copy link
Member

Thanks @jm-c for the tag and apologies for delays here. I do remember originally thinking that the adcommon.h is not necessary, and then having the realization that any hand written adjoint code relies on having adcommon.h. So, I think that item can be checked or crossed off safely.

@jm-c
Copy link
Member

jm-c commented Oct 25, 2023

I would like to add a new bullet point to the list: seems like the call to TURNOFF_MODEL_IO in ctrl_init_ctrlvar.F is going to be executed each time a control variable is set-up (e.g., 5 control-vars, called 5 times). Is it necessary ?
I think a cleaner way would be to move the test for file 'cost_final" outside ctrl_init_ctrlvar.F (again, no need to check 5 times) and pass the information (new argument) to ctrl_init_ctrlvar.F

@mjlosch
Copy link
Member Author

mjlosch commented Oct 26, 2023

I would like to add a new bullet point to the list: seems like the call to TURNOFF_MODEL_IO in ctrl_init_ctrlvar.F is going to be executed each time a control variable is set-up (e.g., 5 control-vars, called 5 times). Is it necessary ?
I think a cleaner way would be to move the test for file 'cost_final" outside ctrl_init_ctrlvar.F (again, no need to check 5 times) and pass the information (new argument) to ctrl_init_ctrlvar.F

Makes sense!

@mjlosch
Copy link
Member Author

mjlosch commented Nov 17, 2023

With #788 merged, the mapping of the control variables to indices, grid type etc, is now "safer" in the sense, that there are only two places where the mapping is hardwired:

  1. s/r ctrl_init defines the map by calling s/r ctrl_init_ctrlvar with the appropriate parameters.
  2. s/r grdchk_ctrl_fname uses this map (again hardwired) to determine from the grdchk_index the file name (fname)

I would like to try this: define an additional character arrays ncvartype and ncvarfname (not the best names, but follows the convention in CTRL.h) that are filled with whichxyz and xx_fname in s/r ctrl_init_ctrlvar. From then on, the map between the index and variable is complete, and we do not need to have a second place, where the map is required. The only other places are in optim/optim_readdata.F and optim/optim_writedata.F, where the assignment of fixed indices for the obcs-ctrl-parameters is necessary.
s/r grdchk_ctrl_fname would reduce to fname = ncvarfname(grdchk_index) and with the availability of ncvartype=whichxyz we could simplify a few other if-statements in pkg/grdchk, e.g. in grdchk_getadxx.F etc.

The small disadvantage is that we then have two places where the control variable names are stored: (xx_genarr2d_file, xx_genarr2d_file, xx_gentim2d_file, xx_obcsn_file,xx_obcss_file,xx_obcse_file,xx_obcsw_file) and ncvarfname. The latter would hold the "active" control variables in one array, while the former list of files does the same.

@jm-c
Copy link
Member

jm-c commented Nov 27, 2023

The way we specify which control var to perturb in pkg/grdchk, using the control-var index, is not very robust since this depends on the ordering of the controls in data.ctrl. To be more specific, e.g., in global_oce_latlon/input_ad/data.ctrl, if I permute the 1 and 2 in xx_genarr3d_file(1/2) =, the adjoint run is the same but not the gradient-check since now the perturbation applies to "xx_salt" instead of "xx_theta".

We could decide to change this and define which control var to perturb by setting the control var/file name in data.grdchk (as one of the ctrl var/file name listed in data.ctrl), in the example above it would be "xx_theta" instead of "201".

This would make the grdchk selection a little bit easier and independent of the ordering in data.ctrl namelist.
Also, with this change, no need to jump by 100 between genarr2d, genarr3d and gentim2d for the control-index so that maxcvars could be reduced to just "maxCtrlArr2D + maxCtrlArr3D + maxCtrlTim2D + 4" (and this is how it is related to this issue).

@mjlosch
Copy link
Member Author

mjlosch commented Nov 28, 2023

@jm-c

We could decide to change this and define which control var to perturb by setting the control var/file name in data.grdchk (as one of the ctrl var/file name listed in data.ctrl), in the example above it would be "xx_theta" instead of "201".

very good idea. The associated code changes are very few, once the variable name (xx_theta) is avaiable (as in PR #796).

@jm-c
Copy link
Member

jm-c commented Apr 2, 2024

Regarding bullet-point 1, was mentioned in issue #440 discussion, then added to to the list there, and we (including myself) commented about it back at that time;
then it got listed again here (but without the comments that discourage to change this) so that I had to copy/paste the comments again here. Now that most of the bullet-points are marked as "done" here, I am worried that this first point is going to be moved again to an other/new issue as this one is closed.

@mjlosch
Copy link
Member Author

mjlosch commented Apr 2, 2024

I can change this bullet point and add that we will not change this (and remove the "ticklable" bullet).

What about the last one (wunit.bib)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants