Update upp submodule #2213

Merged
merged 26 commits into from
Apr 19, 2024

@WenMeng-NOAA
Contributor

WenMeng-NOAA commented Mar 28, 2024

Commit Queue Requirements:

  • Fill out all sections of this template.
  • All sub component pull requests have been reviewed by their code managers.
  • Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules
  • Commit 'test_changes.list' from previous step

Description:

This PR updates the revision of the upp submodule, which sits under the FV3 subcomponent. The main changes are the Rocky8 transition and other UPP updates to post-processing for UFS-based global and regional applications.

Commit Message:

* UFSWM - Update inline post
  * FV3 - Update upp submodule for inline post

Priority:

  • High: Support global-workflow Rocky8 transition on Hera

Git Tracking

UFSWM:

  • None

Sub component Pull Requests:

UFSWM Blocking Dependencies:

  • None

Changes

Regression Test Changes (Please commit test_changes.list):

  • PR Adds New Tests/Baselines.
  • PR Updates/Changes Baselines.
    • regional_control intel
    • regional_restart intel
    • regional_decomp intel
    • regional_2threads intel
    • regional_2dwrtdecomp intel
    • regional_wofs intel
    • regional_spp_sppt_shum_skeb intel
    • regional_control_faster intel
    • rap_clm_lake_debug intel
    • regional_spp_sppt_shum_skeb_dyn32_phy32 intel
    • hafs_regional_atm intel
    • hafs_global_multiple_4nests_atm intel
    • hafs_regional_specified_moving_1nest_atm intel
  • No Baseline Changes.

Input data Changes:

  • None.

Library Changes/Upgrades:

  • No Updates

Testing Log:

  • RDHPCS
    • Hera
    • Orion
    • Hercules
    • Jet
    • Gaea
    • Derecho
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
  • opnReqTest (complete task if unnecessary)

Merge remote-tracking branch 'upstream/develop' into upp_HR4
@FernandoAndrade-NOAA added the Baseline Updates label ("Current baselines will be updated.") on Mar 29, 2024
@jkbk2004
Collaborator

jkbk2004 commented Apr 1, 2024

@WenMeng-NOAA @FernandoAndrade-NOAA Can we schedule work on this PR sometime this week?

@WenMeng-NOAA
Contributor Author

@jkbk2004 That would be great! Please let me know any actions from my end.

@FernandoAndrade-NOAA
Collaborator

It looks like the test_changes.list was overwritten during the sync, I'm recommitting that as those changes were confirmed to be expected from this update.

@FernandoAndrade-NOAA
Collaborator

@zach1221 @BrianCurtis-NOAA FYI getting started on testing this PR

@FernandoAndrade-NOAA added the Ready for Commit Queue label ("The PR is ready for the Commit Queue. All checkboxes in PR template have been checked.") on Apr 5, 2024
@WenMeng-NOAA
Contributor Author

@FernandoAndrade-NOAA Yes, during my syncing process this morning, I was not sure which version of test_changes.list was needed.

@FernandoAndrade-NOAA
Collaborator

No worries! We should be good to go now, thanks.

@FernandoAndrade-NOAA
Collaborator

Baseline creation for the rap_clm_lake_debug_intel test failed on Hera, Gaea, and Jet due to timeouts. The err file is unusually large on all machines; please note the line count. @jkbk2004 FYI:
Hera: /scratch1/NCEPDEV/stmp2/Fernando.Andrade-maldonado/FV3_RT/rt_1277667/rap_clm_lake_debug_intel/err

37280183   0: slurmstepd: error: *** STEP 58007894.0 ON h10c53 CANCELLED AT 2024-04-05T17:38:55 DUE TO TIME LIMIT ***
37280184 144: fv3.exe            000000000094783A  Unknown               Unknown  Unknown
37280185 144: fv3.exe            00000000011E4A59  Unknown               Unknown  Unknown
37280186 144: fv3.exe            0000000000A98E8A  Unknown               Unknown  Unknown
37280187 144: fv3.exe            0000000000967070  Unknown               Unknown  Unknown
37280188 144: fv3.exe            0000000000C96201  Unknown               Unknown  Unknown
37280189 144: fv3.exe            000000000042E9EB  MAIN__                    406  UFS.F90
37280190 144: fv3.exe            000000000042AEE2  Unknown               Unknown  Unknown
37280191 144: libc-2.28.so       0000153E58C0AD85  __libc_start_main     Unknown  Unknown
37280192 144: fv3.exe            000000000042ADEE  Unknown               Unknown  Unknown
37280193 149: forrtl: warning (406): fort: (1): In call to RSEARCH1, an array temporary was created for argument #4

Jet: /lfs4/HFIP/h-nems/Fernando.Andrade-maldonado/RT_RUNDIRS/Fernando.Andrade-maldonado/FV3_RT/rt_715069/rap_clm_lake_debug_intel/err

Gaea: /gpfs/f5/epic/scratch/Fernando.Andrade-maldonado/RT_RUNDIRS/Fernando.Andrade-maldonado/FV3_RT/rt_148379/rap_clm_lake_debug_intel/err
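
An err file with tens of millions of lines is impractical to open in an editor, so a quick summary is often the first step. The following is a minimal sketch, not part of rt.sh: `summarize_err` is a hypothetical helper, and it assumes each line starts with a `<rank>: ` prefix as in the excerpt above. It prints the line count and the most frequent messages once that prefix is stripped.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from rt.sh): print the line count of an err file
# and its most frequent messages.
summarize_err() {
  local err_file="$1"
  local top_n="${2:-5}"   # how many distinct messages to show

  printf 'lines: %d\n' "$(wc -l < "${err_file}")"

  # Drop the leading "<rank>: " prefix (format assumed from the excerpt) so
  # the same message emitted by different ranks is counted together.
  sed -E 's/^[[:space:]]*[0-9]+:[[:space:]]*//' "${err_file}" \
    | sort | uniq -c | sort -rn | head -n "${top_n}"
}
```

For example, `summarize_err /path/to/rap_clm_lake_debug_intel/err 10` would list the ten most repeated messages, which shows which ones account for most of the file.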

@jkbk2004
Collaborator

jkbk2004 commented Apr 8, 2024

@WenMeng-NOAA If a quick fix is not ready, we can reschedule this pr. We will move to #2145. @FernandoAndrade-NOAA @zach1221 @BrianCurtis-NOAA FYI

@WenMeng-NOAA
Contributor Author

@jkbk2004 It seems to me the errors are not from the inline post code. It would be difficult for me to debug this issue.

@WenMeng-NOAA
Contributor Author

@jkbk2004 You may move on to the next PR. Meanwhile, I will investigate further.

@BrianCurtis-NOAA
Collaborator

I see the test is a debug test, but it seems to be lacking debug information (a lot of "Unknown" labels where we should see line numbers and subroutine names). @jkbk2004 Can your team find out which subroutine is causing the issue?

@jkbk2004 removed the Ready for Commit Queue label ("The PR is ready for the Commit Queue. All checkboxes in PR template have been checked.") on Apr 8, 2024
@WenMeng-NOAA
Contributor Author

@jkbk2004 The fix recommended by @DusanJovic-NOAA has been implemented on the UPP side. Both @FernandoAndrade-NOAA and I have run tests on WCOSS2 and Hera. This PR is ready for testing.

@zach1221
Collaborator

Hi, @WenMeng-NOAA. Can you sync up your branch here so we can begin testing?

@WenMeng-NOAA
Contributor Author

@zach1221 Done.

@FernandoAndrade-NOAA
Collaborator

FernandoAndrade-NOAA commented Apr 18, 2024

Per the Gaea admins' suggestion, a quick sample run with export LD_PRELOAD=/opt/cray/pe/gcc/12.2.0/snos/lib64/libstdc++.so.6 set beforehand resolved the error. @WenMeng-NOAA, could you add this line to your PR while I run baseline creation and the full RTs?

@WenMeng-NOAA
Contributor Author

@FernandoAndrade-NOAA Could you specify which file should be updated?

@jkbk2004
Collaborator

@WenMeng-NOAA You can put 'export LD_PRELOAD=/opt/cray/pe/gcc/12.2.0/snos/lib64/libstdc++.so.6' somewhere around https://github.com/WenMeng-NOAA/ufs-weather-model/blob/upp_HR4/tests/rt.sh#L750-L751. @FernandoAndrade-NOAA, please confirm.

@FernandoAndrade-NOAA
Collaborator

FernandoAndrade-NOAA commented Apr 18, 2024

I would say around line 726, just to be sure. @WenMeng-NOAA FYI

@WenMeng-NOAA
Contributor Author

@FernandoAndrade-NOAA @jkbk2004 Added. Thanks!
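
The thread does not show the exact form of the line that was added to rt.sh. As a minimal sketch only, the export could be guarded so that it takes effect just on Gaea and only when the library is readable. The function name `preload_gaea_libstdcxx` and the lowercase machine name "gaea" are assumptions for illustration; only the library path comes from the comments above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from rt.sh: preload the libstdc++ named in
# the thread only on Gaea, and only if the library file is readable.
preload_gaea_libstdcxx() {
  local machine_id="$1"   # assumed to be a lowercase machine name, e.g. "gaea"
  local lib="${2:-/opt/cray/pe/gcc/12.2.0/snos/lib64/libstdc++.so.6}"

  if [[ "${machine_id}" == "gaea" && -r "${lib}" ]]; then
    export LD_PRELOAD="${lib}"
  fi
}
```

It would be called once before the tests are launched, for example `preload_gaea_libstdcxx "${MACHINE_ID}"`, assuming a variable such as `MACHINE_ID` holds the machine name. On other machines the call leaves `LD_PRELOAD` untouched.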

@FernandoAndrade-NOAA
Collaborator

Leaving a note that baseline creation was successful; however, Gaea is unresponsive to the point that I can't create and copy the new baseline directory. Trying again tomorrow morning. Apologies for the delays with Gaea.

@zach1221
Collaborator

Gaea and Derecho were extremely slow yesterday and our tests weren't running. Finishing up those two machines currently.

@zach1221
Collaborator

zach1221 commented Apr 19, 2024

I think we should probably skip Gaea and proceed with merging process. We can sync up the Gaea baselines later, when we're able to do so.

Fernando is reaching out to their admins again to ensure they're aware of the issue.

@zach1221
Collaborator

@WenMeng-NOAA The fv3atm sub-PR is merged. Can you please revert the URL change in .gitmodules and update the submodule hash?
Hash: NOAA-EMC/fv3atm@da95cc4

@WenMeng-NOAA
Contributor Author

@zach1221 Done.

@zach1221 zach1221 merged commit 5d2ca19 into ufs-community:develop Apr 19, 2024
3 checks passed
zhanglikate added a commit to zhanglikate/ufs-weather-model that referenced this pull request May 3, 2024
commit f234a3e
Author: Ufuk Turunçoğlu
Date:   Tue Apr 30 11:35:25 2024 -0600

    Fix for land component model (ufs-community#2191)

    * UFSWM - fix fully coupled land component configuration
      * NOAHMP - get fixed information from surface file

commit 04bbc15
Author: jiandewang
Date:   Thu Apr 25 14:52:00 2024 -0400

    update MOM6 to its main repo. 20240401 commit (ufs-community#2241)

    * UFSWM -
      * MOM6 - update MOM6 to its main repo. 20240401 commit (NCAR-candidate-20240319)

commit b6c576d
Author: Daniel Sarmiento
Date:   Tue Apr 23 12:24:22 2024 -0400

    Merged global namelist (ufs-community#2173)

    * UFSWM - global_control.nml_IN has been added as the new regression test namelist template for all global regression tests. The namelist now uses pointers (i.e. @[abc]) for variables and default values have been added to the default_vars.sh script. A new section in default_vars.sh has been added (export_tiled) to account for tiled RTs that pulls the correct parameter files using the ATMRES variable.
    Regression tests have been modified to account for these changes. Tests that were not compatible with the GFSv17_p8 core have been disabled for now. They will be turned on as they are updated from GFSv16 to GFSv17.

commit 5d2ca19
Author: WenMeng-NOAA
Date:   Fri Apr 19 13:59:12 2024 -0400

    Update upp submodule (ufs-community#2213)

    * UFSWM - Update inline post
      * FV3 - Update upp submodule for inline post

commit 47c0099
Author: Brian Curtis
Date:   Wed Apr 17 15:59:48 2024 -0400

    Add bash linting to CI. Cleanup .sh scripts a bit. Address .sh bugs. Adds -v Verbose option. (ufs-community#2218)  Remove nowarn Intel compiler flag (ufs-community#2225)

    * UFSWM
    - Add bash linting to CI:
      - uses superlinter to check for consistent bash code writing
    - Cleans up .sh scripts to comply with superlinter
    - Cleans up .sh scripts to be more consistent, easier to read.
    - Adds a -v verbose option for when debugging output is needed; otherwise simplifies the rt.sh run echoes.
    - Addresses smaller bugs
      - quota/timeout search logic adjusted.
      - check for dirs existing (DISKNM, STMP, PTMP) before starting.
      - adjustments/cleanup to ecflow/rocoto sections
      - rt.sh will attempt to start ecflow, and only stop ecflow if it started from rt.sh.
      - fix for issue where run_dir will not delete properly.
    * FV3: Address compiler warnings
      * atmos_cubed_sphere: Address compiler warnings.

commit 4f32a4b
Author: Rick Grubin
Date:   Mon Apr 15 07:21:08 2024 -0600

    Document ATMW / ATMAERO / HAFS WM configurations (ufs-community#2160)

    * UFSWM
      * doc/Userguide
        * source
          * conf.py
          * Configurations.rst
          * FAQ.rst
          * InputsOutputs.rst
          * Introduction.rst

commit ac4445d
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Apr 15 08:59:42 2024 -0400

    Bump idna from 3.6 to 3.7 in /doc/UsersGuide (ufs-community#2234)

    * doc/UserGuide
       * requirements.txt - updates idna version from 3.6 to 3.7

commit 281b32f
Author: Samuel Trahan (NOAA contractor)
Date:   Mon Apr 15 08:38:01 2024 -0400

    bug fixes: kchunk3d ignored, hailwat uninitialized in dycore, tile_num wrong for nests (ufs-community#2201)

    * UFSWM - None.
      * FV3 - Write component will use kchunk3d. Model init sends the right tile number to CCPP.
        * atmos_cubed_sphere - Initialize the hailwat variable. Pass global_tile index to model.

commit 8a5f711
Author: Denise Worthen
Date:   Thu Apr 11 13:32:26 2024 -0400

    Add PIO namelist control for CICE (ufs-community#2145)

    Update to CICE-Consortium/CICE aca8357. Adds implementation of namelist PIO options for CICE

commit 45c8b2a
Author: JONG KIM
Date:   Thu Apr 4 19:49:13 2024 -0400

    Hotfix/cubed sphere hash fix: HAILCAST diagnostic code (units issue) (ufs-community#2223)

    cubed_sphere hash update: f060e85 for a bug fix in the HAILCAST diagnostic code (units issue)

commit 26e6db6
Author: Denise Worthen
Date:   Wed Apr 3 19:57:08 2024 -0400

    Enable cpl_scalars export from ATM and NoahMP for use by CMEPS (ufs-community#2175)

      * CMEPS - allow additional dimension in cpl_scalars for CSG and regional ATM domains for use in mediator history files
      * CMEPS - fix mapping mask for lnd->atm
      * FV3 - add export of cpl_scalars
      * NOAHMP - add export of cpl_scalars

commit 1411b90
Author: Dusan Jovic
Date:   Mon Apr 1 18:04:44 2024 -0400

    Update module_write_netcdf to avoid hangs in RRFS runs (ufs-community#2193)

    * UFSWM - Update module_write_netcdf to avoid hangs in RRFS runs
      * FV3 - Update module_write_netcdf to avoid hangs in RRFS runs

commit 87c27b9
Author: Matthew Masarik
Date:   Fri Mar 29 15:23:42 2024 -0400

    WW3 feature:  Langmuir turbulence parameterization (ufs-community#2195)

      * WW3 - Langmuir turbulence parameterization

commit c54e986
Author: Samuel Trahan (NOAA contractor)
Date:   Wed Mar 27 16:11:03 2024 -0400

    regression test system bug fixes, eliminate MOM6 warnings (ufs-community#2197), add xr_cnvcld flag to FV3 (ufs-community#2185) (ufs-community#2202)

    * UFSWM - atparse.bash: correctly handle input that doesn't end with an end-of-line character. Fix some bugs in Rocoto support and clean up rt.sh.
      * FV3 - namelist flag xr_cnvcld to control if suspended grid-mean convective cloud condensate should be included in cloud fraction and optical depth calculation in radiation in the GFS suite
        * ccpp - physics-level changes to implement new namelist variable
      * MOM6 - update MOM6 code to eliminate all compiler warnings
Labels
  • Baseline Updates: Current baselines will be updated.
  • Ready for Commit Queue: The PR is ready for the Commit Queue. All checkboxes in PR template have been checked.
6 participants