Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unneeded redundant HDF5 linking #1935

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

islas
Copy link
Collaborator

@islas islas commented Nov 8, 2023

TYPE: bug fix

KEYWORDS: hdf5, relink, duplicate, netCDF

SOURCE: internal

DESCRIPTION OF CHANGES:
Problem:
HDF5 linking is handled separately by HDF5(configure)/sw_hdf5_path(Config.pl) and is not required anymore. Even if netCDF has HDF5 compression, those libraries should be provided via nc-config --libs or nf-config equivalent. This default injection of HDF5 causes compilation to fail in environments that don't have HDF5 loaded in path, but would otherwise build fine (ex. Derecho, 32 (serial), module load gcc netcdf).

How does netCDF find HDF5 if it has HDF5 compression but no library is listed now?
nc-config when used in its shared library format should dictate which HDF5 to use - Unidata/netcdf-c#1257 (comment). This is important to follow so as to avoid library splicing or overlinking if conflicting HDF5 libraries are specified. Note that this is accomplished via RPATH

For static builds we would use --libs --static - Unidata/netcdf-c#1383. In that case one should provide their own locations for HDF5 and other dependent libraries.

This all likely should be addressed for zlib AND most likely conflicts with specifying HDF5(configure) / HDF5_PATH(configure) / sw_hdf5_path(Config.pl). In the latter case, users probably have either (1) a difficult time building and are maybe not even using the HDF5 they expect, (2) can't build as desired, or (3) have non-HDF5 enabled netCDF to allow supplying their own HDF5 usage within WRF. Solving that issue is outside the scope of this bug and would require discussion of all the methods to supply alternate libraries and their respective benefits/costs.

Solution:
Remove non-configurable use of sw_hdf5/$(HDF5)

LIST OF MODIFIED FILES: list of changed files (use git diff --name-status master to get formatted list)
M arch/Config.pl
M arch/preamble

TESTS CONDUCTED:

  1. On Derecho after module purge, compile with module load gcc netcdf and 32 (serial) should configure properly
  2. Are the Jenkins tests all passing? Waiting

RELEASE NOTE:
Remove default HDF5 library linking in lieu of netCDF-informed HDF5 linking

@islas islas requested a review from a team as a code owner November 8, 2023 20:22
@weiwangncar
Copy link
Collaborator

The regression test results:

Test Type              | Expected  | Received |  Failed
= = = = = = = = = = = = = = = = = = = = = = = =  = = = =
Number of Tests        : 23           24
Number of Builds       : 60           57
Number of Simulations  : 158           150        0
Number of Comparisons  : 95           86        0

Failed Simulations are: 
None
Which comparisons are not bit-for-bit: 
None

@weiwangncar
Copy link
Collaborator

The regression test results after resolving the conflict:

Test Type              | Expected  | Received |  Failed
= = = = = = = = = = = = = = = = = = = = = = = =  = = = =
Number of Tests        : 23           24
Number of Builds       : 60           57
Number of Simulations  : 158           150        0
Number of Comparisons  : 95           86        0

Failed Simulations are: 
None
Which comparisons are not bit-for-bit: 
None

@mgduda
Copy link
Collaborator

mgduda commented Dec 19, 2023

@islas Perhaps there have been changes to the modules on derecho, or perhaps my process was different, but I didn't find any issues in compiling with option 32 on derecho with the current release-v4.5.2 branch (without the changes in this PR). It didn't look to me like the libraries specified by the HDF5 variable were being included in the final linking steps for the WRF executables.

I seem to recall that there is an HDF5 implementation of WRF's I/O API, and it may be that the HDF5 logic in the build system was put in place for that purpose (and, speculating, perhaps it was later co-opted for NetCDF4 purposes?). If we intend to continue support for the HDF5 option (independent of NetCDF4), then we may want to consider carefully how we modify the logic for HDF5. On the other hand, perhaps we decide that we won't support or maintain the HDF5 implementation of the I/O API moving forward.

@islas
Copy link
Collaborator Author

islas commented Dec 19, 2023

@mgduda These changes were made independent of #1923. Testing with current develop/master/ or where release-v4.5.2 branches from develop should demonstrate the issues in the test/nc4_test.log logfile. As that PR was merged in first, I think it resolved the build issues with forced HDF5 linkage since NETCDF4_DEP_LIB now comes directly from nc-config --libs instead of broken down and reconstructed to specify $(HDF5) $(ZLIB) $(GPFS) $(CURL) manually, with default values being injected.

As that PR removes the in-situ modification of specifying HDF5 libraries for use with netCDF, I think it might still be worth considering this PR since this (the HDF5 linking code in question) is more or less dead code now.

It does look like the HDF5 I/O API assumes to use the NETCDF4_DEP_LIB (here). Whether one was co-opted for the other or vice versa (I think the dep_libs variable (netCDF or not) just became a catch-all for dependencies), the current logic would not have been ideal when using both netCDF and HDF5 as it could potentially load different library versions. The take away here I think should be that the current system already assumed sourcing HDF5 from netCDF only (this may have been overlooked on my part when reviewing #1923) - if there is a use case reinstate support for HDF5 independent of netCDF then the current logic [prior to PR 1923] would only have sufficed in that context and not when both are present, and thus would need to have been revisited / rewritten regardless.

My overarching question is then - should this PR be about removing the traces of hazardous in-situ linkage of HDF5 via co-opting netCDF dependencies (which no longer works), or repairing linkage to HDF5 I/O API when netCDF is not present in such a manner that it does play nicely when it is?

I think both are perfectly valid PRs and push in the right direction, though I view the former as an incremental step to the latter, especially in light of #1923.

@weiwangncar
Copy link
Collaborator

@islas @mgduda Should we leave it for 4.6 consideration?

@islas
Copy link
Collaborator Author

islas commented Dec 20, 2023

@weiwangncar I think that is fine

@mgduda
Copy link
Collaborator

mgduda commented Dec 20, 2023

@islas @mgduda Should we leave it for 4.6 consideration?

I think that seems reasonable. It may turn out that we remove even the HDF5 I/O API code, which seems to be in an unmaintained and untested state.

@mgduda mgduda changed the base branch from release-v4.5.2 to develop December 21, 2023 22:26
@mgduda mgduda requested a review from a team as a code owner December 21, 2023 22:26
@mgduda
Copy link
Collaborator

mgduda commented Dec 21, 2023

@islas I've changed the base branch for this PR to develop. Could you rebase the branch so that only the intended commits are included?

@weiwangncar
Copy link
Collaborator

@mgduda ready for approval?

@weiwangncar
Copy link
Collaborator

@mgduda @islas I missed this one. Can you again review this, M.?

@mgduda
Copy link
Collaborator

mgduda commented Feb 12, 2024

If I understand correctly, WRFDA sometimes requires HDF5 even outside of its use in the WRF I/O API; see the installation instructions. Would this PR affect compilation of WRFDA with HDF5 if the NetCDF libraries were build without HDF5 support?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants