[develop] Fix issues on AQM and NCO mode caused by new YAML interface PR #676 #722

Merged: 22 commits merged into ufs-community:develop on Apr 25, 2023

Conversation

chan-hoo
Collaborator

DESCRIPTION OF CHANGES:

  • Fix AQM configuration issues on both community and nco modes.
  • Fix workflow entity issues on nco mode.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

TESTS CONDUCTED:

Fundamental WE2E tests on WCOSS2

  • hera.intel
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)

ISSUE:

Fixes issue mentioned in #709

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

Collaborator

@christinaholtNOAA christinaholtNOAA left a comment

@chan-hoo Again, I am sorry that I did not see your comments on the Issue earlier.

I have just a few concerns about the changes below.

jobs/JREGIONAL_RUN_POST
if [ "${NUM_FCST_LEN_CYCL}" -gt "1" ]; then
cyc_mod=$(( ${cyc} - ${DATE_FIRST_CYCL:8:2} ))
CYCLE_IDX=$(( ${cyc_mod} / ${INCR_CYCL_FREQ} ))
FCST_LEN_HRS=${FCST_LEN_CYCL[$CYCLE_IDX]}

Collaborator

I don't think this logic is quite right.

If DATE_FIRST_CYCL starts at 18Z, and we're looking at the next 00Z cycle, we get:

cyc_mod=$(( 00 - 18 ))
CYCLE_IDX=$(( -18 / 6 ))

So we're left with a negative CYCLE_IDX and can't look up the right entry in the list. I'd suggest that we add a requirement on FCST_LEN_CYCL: if you need variable forecast lengths, define a list that starts from 00Z and increments by the FCST_LEN_CYCL.
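
The arithmetic in this comment can be reproduced with a minimal sketch. The values below (an 18Z first cycle and 6-hourly cycling) are illustrative assumptions taken from the example, not settings from the PR, and the `10#` prefix is added here only so that zero-padded hours are read as base-10:

```bash
#!/usr/bin/env bash
# Illustrative values only: first cycle at 18Z, 6-hourly cycling.
DATE_FIRST_CYCL=2023030518
INCR_CYCL_FREQ=6
cyc=00   # the 00Z cycle on the following day

# Same arithmetic as the snippet under review; 10# forces base-10 so
# zero-padded hours such as "08" are not parsed as octal.
cyc_mod=$(( 10#${cyc} - 10#${DATE_FIRST_CYCL:8:2} ))
CYCLE_IDX=$(( cyc_mod / INCR_CYCL_FREQ ))

echo "cyc_mod=${cyc_mod} CYCLE_IDX=${CYCLE_IDX}"
# prints: cyc_mod=-18 CYCLE_IDX=-3
```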

Collaborator Author

@chan-hoo chan-hoo Apr 16, 2023

@christinaholtNOAA, I don't agree with you. {cyc} starts from the lowest value: if {cyc} includes "06" and "18", the order will be ["06", "18"] (not ["18", "06"]). I think 'cyc_mod' will never be negative, because these variable forecast lengths are only set per day, as you changed it in your earlier PR:

# Check that the number of entries divides into a day

What do you think about this? In the above example of ["06", "18"], the current condition will fail because their indexes will be [ "06/12", "18/12"]. With a new condition, they will be ["0", "1"] (["(6-6)/12", "(18-6)/12"]).
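
The index calculation argued for here can be sketched as follows. The helper name `cycle_index` and the date value are hypothetical and used only for illustration; the formula is the one from the snippet under review:

```bash
#!/usr/bin/env bash
# Hypothetical helper: index of a cycle hour relative to the first cycle hour.
cycle_index() {
  local first_hr=$1 cyc=$2 incr=$3
  # 10# forces base-10 so zero-padded hours are not parsed as octal.
  echo $(( (10#${cyc} - 10#${first_hr}) / incr ))
}

DATE_FIRST_CYCL=2023030506   # illustrative: first cycle at 06Z
INCR_CYCL_FREQ=12            # two cycles per day

for cyc in 06 18; do
  idx=$(cycle_index "${DATE_FIRST_CYCL:8:2}" "${cyc}" "${INCR_CYCL_FREQ}")
  echo "cyc=${cyc} -> index ${idx}"
done
# prints:
# cyc=06 -> index 0
# cyc=18 -> index 1
```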

Collaborator

I am still concerned about the use case given that there would be a strong coupling between the FCST_LEN_CYCL and the DATE_FIRST_CYCL and DATE_LAST_CYCL variables.

In the above example, I was suggesting that these might be the user-defined settings for two cycles, where the first is meant to be a short forecast and the second a long one:

DATE_FIRST_CYCL=2023030518
DATE_LAST_CYCL=2023030600
FCST_LEN_CYCL=["06", "18"]

This is a specific example of where I'm saying it might be better to require users to define FCST_LEN_CYCL as if it were all the possible daily cycles like this:

FCST_LEN_CYCL=["18", "06", "06", "06"]

This means that we don't have to handle the "special" cases and aren't so heavily coupling the user-defined dates to the forecast lengths.
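
A rough sketch of this alternative, assuming FCST_LEN_CYCL holds one entry per possible daily cycle starting at 00Z. The helper name `fcst_len_for_cycle` is hypothetical and the array uses bash syntax rather than the YAML form shown above:

```bash
#!/usr/bin/env bash
# Assumption: one entry per possible daily cycle, anchored at 00Z.
INCR_CYCL_FREQ=6
FCST_LEN_CYCL=( "18" "06" "06" "06" )   # 00Z, 06Z, 12Z, 18Z

# Hypothetical helper: forecast length for a given cycle hour.
# The index depends only on the hour of day, not on DATE_FIRST_CYCL.
fcst_len_for_cycle() {
  local cyc=$1
  local idx=$(( 10#${cyc} / INCR_CYCL_FREQ ))
  echo "${FCST_LEN_CYCL[$idx]}"
}

for cyc in 18 00; do
  echo "cyc=${cyc}Z -> FCST_LEN_HRS=$(fcst_len_for_cycle "${cyc}")"
done
# prints:
# cyc=18Z -> FCST_LEN_HRS=06
# cyc=00Z -> FCST_LEN_HRS=18
```

With this layout the 18Z cycle gets the short forecast and the following 00Z cycle the long one, regardless of which cycle the run starts on.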

Collaborator Author

@chan-hoo chan-hoo Apr 20, 2023

@christinaholtNOAA, I understand your concern. However, the change in this PR is better than the current version you made because it at least works for the special case. The current version does not work for either case (mine or yours). In my opinion, the best solution is to use my original version:

  for i_cdate in "${!ALL_CDATES[@]}"; do
    if [ "${ALL_CDATES[$i_cdate]}" = "${PDY}${cyc}" ]; then
      FCST_LEN_HRS="${FCST_LEN_CYCL_ALL[$i_cdate]}"
      break
    fi
  done

I understand that you didn't want to use ALL_CDATES, but this one will not have any issues. What do you think about that?
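
For reference, a self-contained sketch of this date-keyed lookup. The array contents are made-up sample values and `lookup_fcst_len` is a hypothetical wrapper; in the workflow, ALL_CDATES and FCST_LEN_CYCL_ALL would be generated elsewhere:

```bash
#!/usr/bin/env bash
# Illustrative values only: two cycles per day over two days.
ALL_CDATES=( "2023030506" "2023030518" "2023030606" "2023030618" )
FCST_LEN_CYCL_ALL=( "06" "18" "06" "18" )

# Hypothetical wrapper around the lookup loop: prints the forecast
# length for a cycle date, or returns 1 if the date is not listed.
lookup_fcst_len() {
  local cdate=$1 i
  for i in "${!ALL_CDATES[@]}"; do
    if [ "${ALL_CDATES[$i]}" = "${cdate}" ]; then
      echo "${FCST_LEN_CYCL_ALL[$i]}"
      return 0
    fi
  done
  return 1
}

PDY=20230306
cyc=06
FCST_LEN_HRS=$(lookup_fcst_len "${PDY}${cyc}")
echo "FCST_LEN_HRS=${FCST_LEN_HRS}"
# prints: FCST_LEN_HRS=06
```

The lookup only succeeds for dates that appear in the list, so it presumes the full set of cycle dates is known in advance.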

Collaborator

Leaning on ALL_CDATES is not a valid solution for when we want to run a real-time run indefinitely. Do you mind sharing the use case that you are trying to get running? The start and end date, and the forecast lengths you'd like to support?

Collaborator Author

@christinaholtNOAA:

  1. Official NRT (Near Real Time) run: 4 cycles per day ("00" "06" "12" "18") and varying forecast length hours = ("06" "72" "72" "06").
  2. Official Retro run: same 4 cycles per day and lengths as 1); 3 month period
  3. Non-official test run: 2 cycles per day ("06" "18") for 3 days (INCR_CYCL_FREQ: 12). This case causes an error in the current state. We don't officially support this case. However, I remember someone asking me whether this case was available in the workflow, which is why I am trying to update this part in my PR.

Collaborator Author

@christinaholtNOAA, if you don't agree with my change, please propose a reasonable alternative. This PR has been in review for a long time. The AQM users are currently unable to use the develop branch and are waiting for this PR to be merged. @MichaelLueken, could you please ask other reviewers to review this PR?

ush/config_defaults.yaml

Collaborator

@danielabdi-noaa danielabdi-noaa left a comment

Looks good to me. Some minor comments.

etc/lmod-setup.sh
jobs/JREGIONAL_RUN_POST
scripts/exregional_aqm_lbcs.sh
scripts/exregional_bias_correction_o3.sh
ush/setup.py
ush/setup.py
scripts/exregional_run_post.sh

Collaborator

@christinaholtNOAA christinaholtNOAA left a comment

After chatting with @chan-hoo, we decided to open an Issue on the limitations of this formulation, and go ahead with this PR as-is.

Collaborator

@MichaelLueken MichaelLueken left a comment

@chan-hoo Thank you for working with @christinaholtNOAA and @danielabdi-noaa to address their concerns! These changes look good to me, so approving and submitting Jenkins tests now.

@MichaelLueken
Collaborator

@chan-hoo The verification WE2E tests on Cheyenne and Hera failed (these are noted in issue #688). All other tests successfully passed. Will now move forward with merging this work.

@MichaelLueken MichaelLueken merged commit e0655f1 into ufs-community:develop Apr 25, 2023
@chan-hoo chan-hoo deleted the bugfix/aqm_config branch June 26, 2023 11:36
Labels

  • bug (Something isn't working)
  • Priority: HIGH
  • run_we2e_coverage_tests (Run the coverage set of SRW end-to-end tests)

Development

Successfully merging this pull request may close these issues.

AQM configuration files do not work after YAML change PR #676
4 participants