Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolves building issues (WRF/WRF-chem) with Intel compilers (ifx/icx) #1942

Merged
merged 10 commits into from
Feb 20, 2024

Conversation

ahheo
Copy link
Contributor

@ahheo ahheo commented Nov 28, 2023

Resolving building issues with Intel compilers (ifx/icx) for WRF/WRF-chem

TYPE: bug fix

KEYWORDS: Missing/Wrong Prototypes

SOURCE: Changgui Lin

DESCRIPTION OF CHANGES:
Problem:
Problem: Most C code written long ago. No prototypes were used. See #1823. And, this pr is kind of an extension to #1823 addressing the same issue for building WRF-chem.

Solution:
Add missing prototypes; rearrange function order.

LIST OF MODIFIED FILES: list of changed files (use git diff --name-status master to get formatted list)
modified: arch/configure.defaults
modified: chem/KPP/kpp/kpp-2.1/src/code.c
modified: chem/KPP/kpp/kpp-2.1/src/code.h
modified: chem/KPP/kpp/kpp-2.1/src/code_c.c
modified: chem/KPP/kpp/kpp-2.1/src/code_f77.c
modified: chem/KPP/kpp/kpp-2.1/src/code_f90.c
modified: chem/KPP/kpp/kpp-2.1/src/code_matlab.c
modified: chem/KPP/kpp/kpp-2.1/src/gdata.h
modified: chem/KPP/kpp/kpp-2.1/src/gen.c
modified: chem/KPP/kpp/kpp-2.1/src/gen_org.c
modified: chem/KPP/kpp/kpp-2.1/src/kpp.c
modified: chem/KPP/kpp/kpp-2.1/src/scan.h
modified: chem/KPP/kpp/kpp-2.1/src/scan.l
modified: chem/KPP/kpp/kpp-2.1/src/scan.y
modified: chem/KPP/util/wkc/Makefile
modified: chem/KPP/util/wkc/change_chem_Makefile.c
modified: chem/KPP/util/wkc/compare_kpp_to_species.c
modified: chem/KPP/util/wkc/gen_kpp.c
modified: chem/KPP/util/wkc/gen_kpp_args_to_Update_Rconst.c
modified: chem/KPP/util/wkc/gen_kpp_interf_utils.c
modified: chem/KPP/util/wkc/gen_kpp_interface.c
modified: chem/KPP/util/wkc/gen_kpp_mech_dr.c
modified: chem/KPP/util/wkc/gen_kpp_utils.c
modified: chem/KPP/util/wkc/get_kpp_chem_specs.c
modified: chem/KPP/util/wkc/get_wrf_jvals.c
modified: chem/KPP/util/wkc/get_wrf_radicals.c
modified: chem/KPP/util/wkc/kpp_data.c
modified: chem/KPP/util/wkc/protos_kpp.h
modified: chem/KPP/util/wkc/registry_kpp.c
modified: chem/KPP/util/write_decomp/integr_edit.c
modified: compile
modified: external/RSL_LITE/buf_for_proc.c
modified: external/RSL_LITE/c_code.c
modified: external/RSL_LITE/cycle.c
modified: external/RSL_LITE/gen_comms.c
modified: external/RSL_LITE/period.c
modified: external/RSL_LITE/rsl_bcast.c
modified: external/RSL_LITE/rsl_lite.h
modified: external/RSL_LITE/rsl_malloc.c
modified: external/RSL_LITE/swap.c
modified: external/RSL_LITE/task_for_point.c
modified: external/io_grib1/MEL_grib1/FTP_getfile.c
modified: external/io_grib1/MEL_grib1/gribfuncs.h
modified: external/io_grib1/MEL_grib1/gribgetbds.c
modified: external/io_grib1/MEL_grib1/gribhdr2file.c
modified: external/io_grib1/MEL_grib1/init_enc_struct.c
modified: external/io_grib1/MEL_grib1/init_gribhdr.c
modified: external/io_grib1/MEL_grib1/init_struct.c
modified: external/io_grib1/MEL_grib1/ld_enc_input.c
modified: external/io_grib1/MEL_grib1/ld_enc_lookup.c
modified: external/io_grib1/MEL_grib1/ld_grib_origctrs.c
modified: external/io_grib1/MEL_grib1/map_lvl.c
modified: external/io_grib1/MEL_grib1/map_parm.c
modified: external/io_grib1/MEL_grib1/prt_badmsg.c
modified: external/io_grib1/MEL_grib1/prt_inp_struct.c
modified: external/io_grib1/MEL_grib1/set_bytes.c
modified: external/io_grib1/MEL_grib1/upd_child_errmsg.c
modified: external/io_grib1/grib1_routines.c
modified: external/io_grib1/grib1_util/read_grib.c
modified: external/io_grib1/grib1_util/read_grib.h
modified: external/io_grib1/grib1_util/write_grib.c
modified: external/io_grib1/gribmap.c
modified: external/io_grib1/gribmap.h
modified: external/io_grib1/trim.c
new file: external/io_grib1/trim.h
modified: external/io_grib_share/get_region_center.c
modified: external/io_grib_share/open_file.c
modified: frame/pack_utils.c
modified: tools/Makefile
modified: tools/gen_config.c
modified: tools/gen_interp.c
modified: tools/gen_irr_diag.c
modified: tools/gen_streams.c
modified: tools/gen_wrf_io.c
modified: tools/misc.c
modified: tools/protos.h
modified: tools/reg_parse.c
modified: tools/sym.c
modified: tools/sym.h
modified: tools/symtab_gen.c

SCC = icx # icc
CCOMP = icx # icc
DM_FC = mpiifx #mpif90 -f90=$(SFC)
DM_CC = mpiicx #mpicc -cc=$(SCC)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahheo Should not change this option. Add another option like it is in PR#1893.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Well, what should I do with this PR for the next step? Should I convert it as a draft and make a new commit and submit this PR again?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahheo Please continue to modify this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weiwangncar I have made this modification following your suggestion: restoring the changed option and adding another option as PR #1893 , by another commit 7361882. Well, the commit still failed in WRF regression test, and I was not able to check the details. Sorry that this is my first time making a PR, and I am still learning the procedures to make things work.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahheo The failure of the regression tests is not your fault. We currently have some trouble running the tests on the Cloud. We are working to resolve the issues.

@weiwangncar
Copy link
Collaborator

@ahheo Can we assume that your changes are limited in chem/KPP/ directory, while the rest is identical to that in PR#1823?

@ahheo
Copy link
Contributor Author

ahheo commented Nov 30, 2023

@ahheo Can we assume that your changes are limited in chem/KPP/ directory, while the rest is identical to that in PR#1823?

Well, you might, in principle, assume this. But, to make it precisely, it is worth mentioning that I also made some modifications dealing with the following compilation warnings:
warning: for loop has empty body [-Wempty-body]
warning: format string is not a string literal (potentially insecure) [-Wformat-security]
warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses]
Should I mention these in the revised PR?

@weiwangncar
Copy link
Collaborator

@ahheo Can we assume that your changes are limited in chem/KPP/ directory, while the rest is identical to that in PR#1823?

Well, you might, in principle, assume this. But, to make it precisely, it is worth mentioning that I also made some modifications dealing with the following compilation warnings: warning: for loop has empty body [-Wempty-body] warning: format string is not a string literal (potentially insecure) [-Wformat-security] warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses] Should I mention these in the revised PR?

@ahheo It would be helpful if you identify the routines you modified in addition to those modified in PR#1823, and explain why you need to modify them.

@weiwangncar
Copy link
Collaborator

The regression test results:

Test Type              | Expected  | Received |  Failed
= = = = = = = = = = = = = = = = = = = = = = = =  = = = =
Number of Tests        : 23           24
Number of Builds       : 60           57
Number of Simulations  : 158           150        0
Number of Comparisons  : 95           86        0

Failed Simulations are: 
None
Which comparisons are not bit-for-bit: 
None

@weiwangncar
Copy link
Collaborator

@ahheo It would be helpful if you identify the routines you modified in addition to those modified in PR#1823, and explain why you need to modify them.

@weiwangncar
Copy link
Collaborator

@ahheo I tried to use Intel oneAPI to compile WRF-Chem + KPP, it failed. It worked when compiling KPP only. Have you tried to compile WRF-Chem and KPP? I also verified that your modification still works for compiling WRF along.

@ahheo
Copy link
Contributor Author

ahheo commented Dec 6, 2023

@ahheo I tried to use Intel oneAPI to compile WRF-Chem + KPP, it failed. It worked when compiling KPP only. Have you tried to compile WRF-Chem and KPP? I also verified that your modification still works for compiling WRF along.

@weiwangncar, yes, I compiled with WRF_CHEM=1 together with WRF_KPP=1 using ifx / icx / mpiifx / mpiicx successfully, (and also successfully using ifx / icx / mpif90 -fc ifx / mpicc -cc icx). It would be nice if you can provide the compile log.

@weiwangncar
Copy link
Collaborator

@ahheo Here are the steps I took to compile your branch:

  1. clone your branch and check out branch CL_test
  2. on our HPC, use intel-oneapi/2023.0.0 module
  3. set environtment variables:
setenv WRF_EM_CORE 1
setenv WRF_CHEM 1
setenv WRF_KPP 1
setenv FLEX_LIB_DIR /usr/lib64
setenv YACC '/usr/bin/yacc -d'
  1. configure using option 78 (icx/ifx, dmpar)
  2. compile using 'compile em_real' >& make78chemkpp.txt

Here is the log file.
make78chemkpp.txt

@ahheo
Copy link
Contributor Author

ahheo commented Dec 7, 2023

@ahheo Here are the steps I took to compile your branch:

  1. clone your branch and check out branch CL_test
  2. on our HPC, use intel-oneapi/2023.0.0 module
  3. set environtment variables:
setenv WRF_EM_CORE 1
setenv WRF_CHEM 1
setenv WRF_KPP 1
setenv FLEX_LIB_DIR /usr/lib64
setenv YACC '/usr/bin/yacc -d'
  1. configure using option 78 (icx/ifx, dmpar)
  2. compile using 'compile em_real' >& make78chemkpp.txt

Here is the log file. make78chemkpp.txt

@weiwangncar the compile problem is due to the failure in finding libfl.a. I found the following lines in your make log:

No libfl.a in /usr/lib64
check if FLEX_LIB_DIR environment variable is set correctly
(FLEX_LIB_DIR should be the complete pathname of the FLEX library libfl.a)
OR: Enter full path to flex library on your system
PROBLEM: libfl.a NOT FOUND IN
You should check if libfl.a located in /usr/lib64.

@weiwangncar
Copy link
Collaborator

@ahheo Thanks for pointing it out. True, /usr/lib64 does not have libfl.a, it has /usr/libfl.so. Let me find out more about this lib.

@weiwangncar
Copy link
Collaborator

@ahheo Replacing libfl.a by libfl.so allowed the code to compile.

@weiwangncar
Copy link
Collaborator

@ahheo You probably received an email as well. If you have not, the regression tests have failed. Mostly because it failed to compile for all.

@ahheo
Copy link
Contributor Author

ahheo commented Dec 11, 2023

@ahheo You probably received an email as well. If you have not, the regression tests have failed. Mostly because it failed to compile for all.

@weiwangncar Hope this commit 26964a0 works fine.

@weiwangncar
Copy link
Collaborator

@ahheo The last change did not pass the regression tests. I have moved this to the develop branch to give both you and us more time to review and test the change.

@ahheo
Copy link
Contributor Author

ahheo commented Dec 19, 2023

@ahheo The last change did not pass the regression tests. I have moved this to the develop branch to give both you and us more time to review and test the change.

@weiwangncar Absolutely OK. It would be nice if you can tell more information why the regression tests failed.

@weiwangncar
Copy link
Collaborator

@ahheo The problem is that no output was sent back. This can indicate serious problems. Have you succeeded in compiling the updated code on your local computer?

@ahheo
Copy link
Contributor Author

ahheo commented Dec 19, 2023

@ahheo The problem is that no output was sent back. This can indicate serious problems. Have you succeeded in compiling the updated code on your local computer?

@weiwangncar Yes, I successfully compiled the updated code on my own machine. So, something goes weird?

@weiwangncar weiwangncar changed the base branch from release-v4.5.2 to develop January 13, 2024 02:57
chem/KPP/kpp/kpp-2.1/src/gen.c Outdated Show resolved Hide resolved
chem/KPP/kpp/kpp-2.1/src/scan.l Outdated Show resolved Hide resolved
chem/KPP/kpp/kpp-2.1/src/scan.y Outdated Show resolved Hide resolved
chem/KPP/util/wkc/kpp_data.c Outdated Show resolved Hide resolved
compile Outdated Show resolved Hide resolved
tools/gen_irr_diag.c Outdated Show resolved Hide resolved
tools/gen_wrf_io.c Outdated Show resolved Hide resolved
@weiwangncar
Copy link
Collaborator

@ahheo Would you like to update the PR based on the suggestions made?

@benkirk
Copy link

benkirk commented Feb 13, 2024

Hi, I lead the user support group for Derecho, and I just wanted to say we will really value this contribution. WRF-Chem has been quite painful, so far.

Can I also ask a naïve question, in hopes of getting pointed in a useful direction?
I see from the KPP github site it is actively developed, with releases as recent as 2 months ago. Yet as this PR correctly observes the version in WRF-Chem is 20+ years old. I'm sure there is a good reason for that(?). Can anyone explain why??

Thanks again, this is great work.

@ahheo
Copy link
Contributor Author

ahheo commented Feb 16, 2024

@ahheo Would you like to update the PR based on the suggestions made?

@weiwangncar Sorry that for the late reply because I just come back from holidays to office today. The PR was updated as suggested. Please see and review the commit 911facc.

compile Outdated Show resolved Hide resolved
tools/gen_wrf_io.c Outdated Show resolved Hide resolved
This reverts commit 82447aa.

	modified:   compile
	modified:   tools/gen_wrf_io.c
…an if condition grouping (tools/gen_wrf_io.c)
@ahheo ahheo requested a review from islas February 19, 2024 08:38
@weiwangncar
Copy link
Collaborator

@islas Can you take hopefully a final check on this PR?

Copy link
Collaborator

@weiwangncar weiwangncar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For WRF-Chem since Jordan Schnell has approved this before.

@weiwangncar weiwangncar merged commit 2e15abb into wrf-model:develop Feb 20, 2024
1 of 2 checks passed
@weiwangncar
Copy link
Collaborator

@ahheo If you could let us where you work, we can add it to the PR. Thanks for contributing to our model development effort.

@ahheo
Copy link
Contributor Author

ahheo commented Feb 21, 2024

If you could let us where you work, we can add it to the PR.

I work in National Space Science Center, CAS. Feel free to add it or not as I made the changes on my own :).

@weiwangncar
Copy link
Collaborator

@ahheo OK, thanks for letting us know.

Copy link
Contributor

@DWesl DWesl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I'm glad you're clarifying the precedence of the bit-mask and logical-not operators, I have some questions about the choice of how to do so, and would appreciate hearing the reasons for responding to the warning by insisting you meant the behavior the compiler thought unlikely?

@@ -377,7 +377,7 @@ fprintf(fp," ngrid%%parent_grid_ratio, ngrid%%parent_grid_ratio
strcpy( tmpstr , pp->interpu_aux_fields ) ;
} else if ( down_path & FORCE_DOWN ) {
/* by default, add the boundary and boundary tendency fields to the arg list */
if ( ! p->node_kind & FOURD ) {
if ( (! p->node_kind) & FOURD ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if ( (! p->node_kind) & FOURD ) {
if ( ! (p->node_kind & FOURD) ) {

or

Suggested change
if ( (! p->node_kind) & FOURD ) {
if ( (~ p->node_kind) & FOURD ) {

registry.h has #define FOURD 8 so this conditional will never be met (!p->node_kind) will be 0 or 1)

@@ -85,7 +85,7 @@ gen_wrf_io2 ( FILE * fp , char * fname, char * structname , char * fourdname, no
{


if ( p->ndims > 3 && ! p->node_kind & FOURD ) continue ; /* short circuit anything with more than 3 dims, (not counting 4d arrays) */
if ( p->ndims > 3 && ( (! p->node_kind) & FOURD ) ) continue ; /* short circuit anything with more than 3 dims, (not counting 4d arrays) */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if ( p->ndims > 3 && ( (! p->node_kind) & FOURD ) ) continue ; /* short circuit anything with more than 3 dims, (not counting 4d arrays) */
if ( p->ndims > 3 && ! (p->node_kind & FOURD) ) continue ; /* short circuit anything with more than 3 dims, (not counting 4d arrays) */

or

Suggested change
if ( p->ndims > 3 && ( (! p->node_kind) & FOURD ) ) continue ; /* short circuit anything with more than 3 dims, (not counting 4d arrays) */
if ( p->ndims > 3 && (~ p->node_kind) & FOURD ) continue ; /* short circuit anything with more than 3 dims, (not counting 4d arrays) */

registry.h has #define FOURD 8 so this conditional will never be met (!p->node_kind) will be 0 or 1)

@@ -1058,7 +1058,7 @@ check_dimspecs()
p->assoc_nl_var_s,p->name ) ;
return(1) ;
}
if ( ! q->node_kind & RCONFIG )
if ( (! q->node_kind) & RCONFIG )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if ( (! q->node_kind) & RCONFIG )
if ( ! (q->node_kind & RCONFIG) )

or

Suggested change
if ( (! q->node_kind) & RCONFIG )
if ( (~ q->node_kind) & RCONFIG )

I suspect RCONFIG != 1, so this conditional is unlikely to be met as-is.

@@ -1083,7 +1083,7 @@ check_dimspecs()
p->assoc_nl_var_e,p->name ) ;
return(1) ;
}
if ( ! q->node_kind & RCONFIG )
if ( (! q->node_kind) & RCONFIG )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if ( (! q->node_kind) & RCONFIG )
if ( ! (q->node_kind & RCONFIG) )

or

Suggested change
if ( (! q->node_kind) & RCONFIG )
if ( (~ q->node_kind) & RCONFIG )

I suspect RCONFIG != 1, so this conditional will never be met as-is.

@ahheo
Copy link
Contributor Author

ahheo commented Feb 27, 2024

While I'm glad you're clarifying the precedence of the bit-mask and logical-not operators, I have some questions about the choice of how to do so, and would appreciate hearing the reasons for responding to the warning by insisting you meant the behavior the compiler thought unlikely?

@DWesl Thank you for raising the questions. I would point out that my corresponding changes focused only on the grouping of if-conditions but not the precedence of the bit-mask and logical-not operators. Assuming that the original code yields warnings but gives the right results, the rule applied for grouping is to have the same result after grouping. Indeed, I have made the following test (currently with only icx compiler):

a is 1 and b is 0:
! a & b is False
! (a & b) is True
(! a) & b is False

a is 0 and b is 0:
! a & b is False
! (a & b) is True
(! a) & b is False

a is 0 and b is 1:
! a & b is True
! (a & b) is True
(! a) & b is True

a is 1 and b is 1:
! a & b is False
! (a & b) is False
(! a) & b is False

But I cannot confirm that the assumption is reasonable, and I wouldn't say if the warnings are related to the ifx compiler. Finally, it would be very appreciated if you can validate the assumption and/or made changes with your reasons provided.

@DWesl
Copy link
Contributor

DWesl commented Feb 27, 2024

FOURD = 8, and RCONFIG = 4:

WRF/tools/registry.h

Lines 27 to 28 in 794843f

#define RCONFIG 4
#define FOURD 8

I think if you use either value for b, the first and third results will always be false, even if you use 8, 12, or even 255 for a, at which point we may as well delete the conditional and make any else branch evaluate every time. Phrased another way, I think the compiler warning is correct that whoever wrote the code was wrong about the relative precedence of ! and &.

@ahheo
Copy link
Contributor Author

ahheo commented Feb 28, 2024

I think if you use either value for b, the first and third results will always be false, even if you use 8, 12, or even 255 for a
I think the compiler warning is correct that whoever wrote the code was wrong about the relative precedence of ! and &.

@DWesl I agree with you. Well, I have not idea what I need to do at this point as the pull request has been merged already.

islas added a commit that referenced this pull request Apr 2, 2024
…ags (#2030)

Fix issue with generalized compiler version output using the wrong flags

TYPE: bug fix

KEYWORDS: compile, version

SOURCE: internal

DESCRIPTION OF CHANGES:
Problem:
A generalized compile version check was proposed in #1987. This logic was implemented in #1942 but the originating logic contains a bug where the `-V` and `--version` flag commands' output is flipped.

Solution:
Use `-V` in the appropriate spot when `$status` is zero for the `-V` check, and respectively for the `--version` check.

LIST OF MODIFIED FILES: 
M       compile

TESTS CONDUCTED: 
1. With previous bad logic the compile log output shows a compiler error as the wrong flag is used to output version info. With the fix, the correct output now shows in the compile log.
islas added a commit to islas/WRF that referenced this pull request Jun 10, 2024
…ags (wrf-model#2030)

Fix issue with generalized compiler version output using the wrong flags

TYPE: bug fix

KEYWORDS: compile, version

SOURCE: internal

DESCRIPTION OF CHANGES:
Problem:
A generalized compile version check was proposed in wrf-model#1987. This logic was implemented in wrf-model#1942 but the originating logic contains a bug where the `-V` and `--version` flag commands' output is flipped.

Solution:
Use `-V` in the appropriate spot when `$status` is zero for the `-V` check, and respectively for the `--version` check.

LIST OF MODIFIED FILES: 
M       compile

TESTS CONDUCTED: 
1. With previous bad logic the compile log output shows a compiler error as the wrong flag is used to output version info. With the fix, the correct output now shows in the compile log.
islas added a commit to islas/WRF that referenced this pull request Jun 10, 2024
…ags (wrf-model#2030)

Fix issue with generalized compiler version output using the wrong flags

TYPE: bug fix

KEYWORDS: compile, version

SOURCE: internal

DESCRIPTION OF CHANGES:
Problem:
A generalized compile version check was proposed in wrf-model#1987. This logic was implemented in wrf-model#1942 but the originating logic contains a bug where the `-V` and `--version` flag commands' output is flipped.

Solution:
Use `-V` in the appropriate spot when `$status` is zero for the `-V` check, and respectively for the `--version` check.

LIST OF MODIFIED FILES: 
M       compile

TESTS CONDUCTED: 
1. With previous bad logic the compile log output shows a compiler error as the wrong flag is used to output version info. With the fix, the correct output now shows in the compile log.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants