Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[develop] Added an option for RRFS external model files used as ICS and LBCS #1089

Open
wants to merge 28 commits into
base: develop
Choose a base branch
from

Conversation

natalie-perlin
Copy link
Collaborator

@natalie-perlin natalie-perlin commented May 30, 2024

DESCRIPTION OF CHANGES:

  • An option to use RRFS model output (control) files are added as initial and lateral boundary conditions, ICS and LBCS.
    RRFS_a data for the test was retrieved from the NODD website ((https://registry.opendata.aws/noaa-rrfs/)), pressure-level grib2 files from the control directory, RRFS forecasts interpolated into 3-km regular grid.

  • A new test has been added grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_v1beta with RRFS input files for the event on 06/05/2024 with the tornadoes reported in Maryland

  • Workflow for the plotting tasks has been updated to allow graphic output for individual ensemble members.

  • Python plotting script is updated to have geographical data visible over the plotted fields.

  • Updated devclean.sh script with safety checks, addressing devclean.sh demonstrates very unsafe directory removal behavior #1073 (devclean.sh demonstrates very unsafe directory removal behavior #1073)

UPDATE (6/13/2024):

RRFS data location:
https://noaa-rrfs-pds.s3.amazonaws.com/rrfs_a/rrfs_a.{yyyymmdd}/{hh}/control/
files are in the format rrfs.t{hh}z.prslev.f{fcst_hr:03d}.conus.grib2
where {yyyymmdd} are 4-digit year, 2-digit month, and 2-digit day of the forecast cycle, and {hh} is a 2-digit hour of the forecast cycle (forecast start), and {fcst_hr:03d} is a 3-digit forecast hour.

Browsing the bucket could be done at the site: browse the bucket:
https://noaa-rrfs-pds.s3.amazonaws.com/index.html#rrfs_a/

For this PR, RRFS input data uses are interpolated into a regular 3-km grid, these files need older sfs_data v1.
The sfc_data v2 that contains rotated u,v fields or fractional grids will be needed to use a newer UFS_UTILS version and tag. This would allow use of full RRFS input files, i.e. on a native grid with no remapping into regular grids; these files are ~6GB per file, and also require higher-version of packages (g2) that are not present in a spack-stack v1.5.1 or 1.6.0.

The following needs to added to config.yaml file to use RRFS ICS/LBCS option: (an example)

task_get_extrn_ics:
  EXTRN_MDL_NAME_ICS: RRFS
  USE_USER_STAGED_EXTRN_FILES: true
  FV3GFS_FILE_FMT_ICS: grib2
  EXTRN_MDL_SOURCE_BASEDIR_ICS: /lustre/SRW_DATA/RRFS/rrfs_a.20230501/${hh}/control
  EXTRN_MDL_FILES_ICS:
    - 'rrfs.t{hh}z.prslev.f{fcst_hr:03d}.conus.grib2'
task_get_extrn_lbcs:
  EXTRN_MDL_NAME_LBCS: RRFS
  USE_USER_STAGED_EXTRN_FILES: true
  LBC_SPEC_INTVL_HRS: 1
  FV3GFS_FILE_FMT_LBCS: grib2
  EXTRN_MDL_SOURCE_BASEDIR_LBCS: /lustre/SRW_DATA/RRFS/rrfs_a.20230501/${hh}/control
  EXTRN_MDL_FILES_LBCS:
    - 'rrfs.t{hh}z.prslev.f{fcst_hr:03d}.conus.grib2'

An example of a config.yaml file is attached. It accessed the data from a pre-stage standard location. Variables such as EXTRN_MDL_SOURCE_BASEDIR_ICS, EXTRN_MDL_FILES_ICS,
EXTRN_MDL_SOURCE_BASEDIR_LBCS, EXTRN_MDL_FILES_LBCS
need to be added for another date/forecast cycle.
If data are not found on disk, it is retrieved from the AWS.

config.yaml.txt

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

TESTS CONDUCTED:

Conducted a test for RRFS_CONUScompact_25km grid, setting ICS and LBCS to "RRFS" option, running on NOAA AWS cloud. One-, two-, and 3-ensemble member experiments.
A new test configured, config.grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_v1beta , which could be launched on all the platforms, with data staged in standard location for EPIC project.
Fundamental tests (not including a newly developed one) pass successfully on AWS.

staged data for the test using RRFS ICS/LBCS for the SRW:

NOAA Cloud: /contrib/EPIC/UFS_SRW_data/develop/input_model_data/RRFS/
Derecho: /glade/work/epicufsrt/contrib/UFS_SRW_data/develop/input_model_data/RRFS/
Hera: /scratch1/NCEPDEV/nems/role.epic/UFS_SRW_data/develop/input_model_data
Gaea: /gpfs/f5/epic/world-shared/UFS_SRW_data/develop/input_model_data/RRFS/
Jet: /mnt/lfs4/HFIP/hfv3gfs/role.epic/UFS_SRW_data/develop/input_model_data/RRFS/
Orion/Hercules: /work/noaa/epic/role-epic/contrib/UFS_SRW_data/develop/input_model_data/RRFS/

A directory that uses forecast cycle date stamp for the test, ./2024060517, has 10 files:
rrfs.t17z.prslev.f000.conus.grib2
rrfs.t17z.prslev.f001.conus.grib2
rrfs.t17z.prslev.f002.conus.grib2
rrfs.t17z.prslev.f003.conus.grib2
rrfs.t17z.prslev.f004.conus.grib2
rrfs.t17z.prslev.f005.conus.grib2
rrfs.t17z.prslev.f006.conus.grib2
rrfs.t17z.prslev.f007.conus.grib2
rrfs.t17z.prslev.f008.conus.grib2
rrfs.t17z.prslev.f009.conus.grib2

  • hera.intel
  • orion.intel
  • hercules.intel
  • cheyenne.intel
  • cheyenne.gnu
  • derecho.intel
  • gaea.intel
  • gaeac5.intel
  • jet.intel
  • wcoss2.intel
  • NOAA Cloud (AWS)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)

DEPENDENCIES:

DOCUMENTATION:

A new option for "RRFS" used as ICS and LBCS may need to be documented.

ISSUE:

In preparation for RRFS integration tasks, option to use "RRFS" model file as ICS and LBCS was added.

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

LABELS (optional):

A Code Manager needs to add the following labels to this PR:

  • Work In Progress
  • bug
  • enhancement
  • documentation
  • release
  • high priority
  • run_ci
  • run_we2e_fundamental_tests
  • run_we2e_comprehensive_tests
  • Needs Cheyenne test
  • Needs Jet test
  • Needs Hera test
  • Needs Orion test
  • help wanted

CONTRIBUTORS (optional):

@christinaholtNOAA

@MichaelLueken
Copy link
Collaborator

@natalie-perlin -

I'll move this work to On-Hold until a WE2E test has been added so that we can properly test this new functionality.

@MichaelLueken MichaelLueken added enhancement New feature or request DO_NOT_MERGE Ensure that a PR isn't merged labels May 30, 2024
@natalie-perlin
Copy link
Collaborator Author

natalie-perlin commented May 31, 2024

NB: @MichaelLueken - this PR requires an option "RRFS" to be allowed in UFS_UTILS. Current develop branch of ufs-community/UFS_UTILS does have the "RRFS" option enabled, but the version checked out by the SRW does not (a correction needs to be to allow it).

How should we proceed with this requirement?

@natalie-perlin
Copy link
Collaborator Author

An updated tag could be used for the UFS_UTILS that had this option implemented

@MichaelLueken
Copy link
Collaborator

@natalie-perlin -

I'll check if updating the version of UFS_UTILS will work in the SRW App. The commit in UFS_UTILS following what is currently in the SRW App's External.cfg file causes the weather model to fail (the weather model is expecting sheleg, while chgres_cube is generating sheleg_ice and sheleg_land, leading to the previously mentioned failure).

I'll go ahead and try updating the UFS_UTILS version to the latest version and see if it works. If it does, then we can move forward with this update. However, if it continues to fail, I will need to open an issue in the UFS_UTILS repository to let them know about the continued failures and see what can be done.

@MichaelLueken
Copy link
Collaborator

@natalie-perlin -

What version of UFS_UTILS contains the necessary fix so that we can exercise the use of RRFS ICs/LBCs in the SRW App? I can try to update to that version and see what issues appear.

@MichaelLueken
Copy link
Collaborator

It looks like UFS_UTILS PR #902 includes the necessary changes for chgres_cube to work with RRFS. I'll try a later version of the UFS_UTILS repository, then this one, to see if either will work.

@natalie-perlin
Copy link
Collaborator Author

It looks like UFS_UTILS PR #902 includes the necessary changes for chgres_cube to work with RRFS. I'll try a later version of the UFS_UTILS repository, then this one, to see if either will work.

Yes - thank you!! I was having troubles finding exact time/version when this change was implemented!.
The changes required to allow for RRFS option had to be done in two locations in UFS_UTILS repository in ./sorc/chgres_cube.fd/program_setup.F90: line 57 and lines 321-322. It looks like the PR you mentioned address that:
https://github.com/ufs-community/UFS_UTILS/pull/902/files#diff-6b6d24e7712144952ef83ca8f5e9d56e164fdcab1f7faab27812e91bfd483ba2

@MichaelLueken
Copy link
Collaborator

@natalie-perlin -

Using the version of UFS_UTILS associated with PR #902 is causing a failure in the fundamental tests:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta_2  COMPLETE               9.31
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2_20240  COMPLETE               8.22
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot  COMPLETE              15.17
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR_2024060  DEAD                   5.71
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_RAP_suite_WoFS_v0_20240603165  COMPLETE              23.23
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16_2024060316510  COMPLETE              20.14
----------------------------------------------------------------------------------------------------
Total                                                              DEAD                  81.78

The grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR WE2E test failed in run_fcst_mem000 with the following error message:

FATAL from PE 0: NetCDF: Start+count exceeds dimension bound: netcdf_read_data_3d: file:INPUT/sfc_data.nc- variable:tiice

I'll try backing my way through the commits in the UFS_UTILS repository to see which entry is causing issues with tiice.

@natalie-perlin
Copy link
Collaborator Author

@MichaelLueken - thank you for testing!! Let me look into these errors - look like a data problem. I might need to stage an additional directory in the EPIC space with data that I though was not needed... will get back to you!


CHGRES_CUBE=${SRW_DIR}/sorc/UFS_UTILS/sorc/chgres_cube.fd
${SED} -i 's/"RAP","HRRR"/"RAP","HRRR","RRFS"/g' ${CHGRES_CUBE}/program_setup.F90
fi
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not an appropriate fix to fortran codes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a temporary workaround to allow chgres_cube to build before a later tag is used that officially introduces RRFS capability. The current UFS_UTILS tag simply flag the "RRFS" option as not valid, and reports an error. Newer UFS_UTILS tags do not yet work with the SRW, and require more adaptation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

@natalie-perlin natalie-perlin Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed an extra empty line

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The appropriate workaround to needing code not yet committed to an authoritative repo is to keep the necessary code in a branch and point to that branch in the interim. Editing external code in the build process is a bad practice that sets a horrible precedent.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating the UFS_UTILS hash to 1dac855 would bring in the necessary change to chgres_cube's program_setup.F90. However, all WE2E tests that use RAP and HRRR physics will fail while moving to this version of UFS_UTILS, as it is after the fractional grid update. It doesn't look like there has been any work on issue #961 in UFS_UTILS, so this is still keeping the SRW App from moving forward with updated versions of UFS_UTILS.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally understand that there is no way to get the UFS_UTILS hash updated right now.

There are two ways to handle it -- a temporary branch in a fork that EPIC controls to hold this source code change, or holding off on this PR until that update can be made.

Editing source code in the build script is unacceptable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christinaholtNOAA @MichaelLueken

Another possible approach could be to solve the issue of chgres_cube flagging the RRFS option to be done on SRW scripts level, instead of changing the error condition on the UFS_UTILS level.

My suggestion is to change the variable "external_model" from "RRFS" to "HRRR" in namelist fort.41 files before chgres_cube is called. These namelist files are stored in $EXPTDIR/[TEST_NAME]/[YYYYMMDDHH]/tmp_MAKE_ICS/ and ./tmp_MAKE_LBCS/ .
The changes are to be done in exregional_make_ics.sh and exregional_make_lbcs.sh, or in exregional_get_extrn_mdl_files.sh

That way the RRFS files interpolated to a regular grid still could be retrieved from the RRFS AWS location as needed, according to the external model type set for the experiment, but will be processed by chgres_cube similar to those for HRRR. This should be functional as long as HRRR and RAP remain as options for the SRW initial and lateral boundary condition.

Please let me know if this sounds a viable alternative to error flag removal in chgres_cube's program_setup.F90!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Labeling RRFS ICs/LBCs as "HRRR" to get it through is fine with me. Modifying source code is not.

parm/data_locations.yml Show resolved Hide resolved
parm/wflow/plot.yaml Show resolved Hide resolved
@@ -3,6 +3,7 @@ grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_plot
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_HRRR
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta
grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_v1beta
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of the coverage tests is that by running a portion of the suite on each platform, all the comprehensive tests will have run. Should this only be done on one platform?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other tests seem to exist for all the machines; e.g,
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR
...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the test should only be added to a single platform's coverage test suite. I'd recommend either Hera GNU, Derecho, Orion, or Hercules (machines with only 11 tests).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MichaelLueken thanks for confirming!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we have a coverage.cheyenne file now that Cheyenne has been decommissioned?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am adamant that this more expensive ensemble test not be included as a test on every machine. I appreciate @MichaelLueken's suggestion above about choosing one of the platforms with a lighter load.

natalie-perlin and others added 6 commits June 24, 2024 14:53
updated /WE2E/test_configs/grids_extrn_mdls_suites_community/config.grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_v1beta.yaml

Co-authored-by: Christina Holt <[email protected]>
Placing variable in alphabetical order as requested
removed debugging diagnostics
@natalie-perlin
Copy link
Collaborator Author

@gspetro-NOAA -
Please feel free to comment on the RRFS-related documentation changes for the SRW.

@natalie-perlin
Copy link
Collaborator Author

A current develop branch has been merged into the rrfs_ics_lbcs branch, and successfully tested by running the grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_v1beta case.

Copy link
Collaborator

@MichaelLueken MichaelLueken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@natalie-perlin -

During one of the latest pushes, it appears as though changes to the documentation had been undone. Please reapply these updates to the documentation.

doc/UsersGuide/BackgroundInfo/Components.rst Outdated Show resolved Hide resolved
doc/UsersGuide/CustomizingTheWorkflow/InputOutputFiles.rst Outdated Show resolved Hide resolved
doc/UsersGuide/CustomizingTheWorkflow/InputOutputFiles.rst Outdated Show resolved Hide resolved
@natalie-perlin
Copy link
Collaborator Author

@MichaelLueken - thank you for noticing documentation changes! I'll compare with the changes done and documentation built at a local system, to verify all the necessary changes are in place

@natalie-perlin
Copy link
Collaborator Author

@MichaelLueken -
verified that all the documentation changes are in place. Please let me know how to proceed, and whether anything else is needed.

@MichaelLueken
Copy link
Collaborator

@natalie-perlin -

Thanks for reapplying the documentation modifications again! The documentation updates look good to me. It caught me off guard when I noticed that they had been changed to the way they were before you originally addressed my concerns with the documentation, but everything is once again good to go.

@natalie-perlin
Copy link
Collaborator Author

@natalie-perlin -

Thanks for reapplying the documentation modifications again! The documentation updates look good to me. It caught me off guard when I noticed that they had been changed to the way they were before you originally addressed my concerns with the documentation, but everything is once again good to go.

My apologies - some merges did not go well right away when I attempted to implement changes, address comments, and run a new test before yesterday's demo. Using different platforms to test the changes (AWS, Hera), build and change documentation (local system), and addressing comments + commits (GitHub) were likely not recorded properly. I'm glad that it's back to the expected.

@MichaelLueken
Copy link
Collaborator

MichaelLueken commented Jun 25, 2024

@natalie-perlin -

I think I have finally figured out the issue with chgres_cube v2 surface files (fractional grid) not being read in by the weather model. RAP and HRRR use RUC LSM, which requires setting tiice to 2 vertical levels. However, the number of ice levels is not being set to 2 for RAP and HRRR.

I'll need to do some work tomorrow to fully figure this out, but I should be able to update the UFS_UTILS hash to 1dac855, which will include the necessary changes to chgres_cube's program_setup.F90 and add the necessary kice entry to the model_configure file so that the correct number of vertical levels will be used. With this, you should be able to remove the modification that was made to devbuild.sh, which is one of the major issues keeping @christinaholtNOAA from approving this work.

Thank you, Michael, fingers crossed!!

@@ -61,6 +59,45 @@ usage_error () {
exit 1
}

# remove_directory function
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being difficult, but I don't like this solution either. Why do we need to give the users a choice for what directories to delete? If users want to specify some custom directory to delete, they can just delete it! Get rid of these options entirely, simply hard-code the directories and files that need to be deleted. No need to bother with input arguments and checking for wildcards: all of this complexity just leads to more potential failure modes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to get a better understanding what are the features that are desired vs. what was coded earlier. Let me list the old/existing behavior and a new behavior below; please let me know what is to be added to the existing features, and what are excessive. I did refer to the issues mentioned in #1073, such as removing directories outside the SRW tree, or those mentioned in the links: this Quora link , ValveSoftware/steam-for-linux#3671 , which included raising issues of having empty variables or wildcards.

The old behavior:

  • User requests what type of cleaning is to be done using arguments to a script, which could be to do the following
    • remove build directory only;
    • remove all build artifacts (except for built conda directory + conda_loc file);
    • remove all checked-out external repositories/submodules;
    • remove conda directory and conda_loc;
  • When type of cleaning is determined based on the arguments, the script checks if a directory or directories exists and then removes them. This is done for all but conda + conda_loc (check for directory existence not done).

New behavior, type of cleaning to be done using similar arguments:

  • Checks whether a directory exists;
  • Check whether a directory is immediately under the expected SRW tree (ask to confirm a remove request if it is not)
  • Check whether a directory name contains wildcards, such as " [ ] | ? + *" or a space, do not remove if it does and ask the user to do it from a command line

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me clarify some of the background of my comments: I did not know about the existence of this script for some time, and when I found out about it, I immediately noticed some unsafe behavior as I've mentioned in my open issue. On thinking about it further as I typed up that issue, it struck me that this script is completely unnecessary: all it does is remove directories in known locations. But since you insisted it is still necessary to support users with no advanced knowledge, then I can be happy with a simple script that removes the default directory locations, even if I personally think it's unnecessary.

This functionality that you have added in this PR has greatly increased the complexity of the script, and offers countless more potential failure modes due to introducing complex/advanced regex that purports to remove certain potentially dangerous characters. Furthermore, the functionality you are attempting to preserve (deleting custom build, install, etc. directories) only applies to advanced users anyway, who should be able to follow very simple instructions to remove the directories if they want a clean build. These are directories that they will already have had to specify explicitly at build time (assuming that they are different than the default locations), and it is functionality that is not even documented in the users guide!

Simple is better here, especially if you are talking about a tool for supporting novice users. I propose that this script, if it must continue to exist, simply deletes the default locations of these directories (relative to the script itself), and does not attempt to allow the user to input custom locations for these directories.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkavulich - thank you for your comments. This devclean.sh script was added in the late 2022: #415

This devclean.sh was created to accompany a devbuild.sh script, which accept custom locations for the build artifacts that could be specified, and have variables INSTALL_DIR, CONDA_DIR, and BIN_DIR available. It is also included in the SRW Documentation
https://ufs-srweather-app.readthedocs.io/en/develop/UsersGuide/Reference/FAQ.html#how-can-i-clean-up-the-srw-app-code-if-something-went-wrong-during-the-build

If these variables are not set by the user, they accept default locations, such as under main SRW tree.
So if the original code looks good enough, I could only proposed to add a check for the existence of a CONDA_DIR before removing it, when conda cleaning is requested.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is also included in the SRW Documentation

I am talking specifically about the customization options for INSTALL_DIR, etc. This is advanced behavior that is only documented in the script itself, so I do not expect anyone to be using it outside of advanced users.

if the original code looks good enough

Do you mean the code before your changes here? I have already expressed my concerns about that in Issue #1073 that started this discussion. Otherwise I'm not sure what you mean.

I don't think you are addressing my primary concern here: There needs to be a good reason for including all this complexity and potential failure modes in a script, especially one explicitly designed for novice users, and I just do not see that reason at all. Why include all of this for a use case that I can't imagine anyone ever needing? You are implying that there are users out there who are advanced enough to have a very specific use case which requires customizing their build, install, or conda directories, but these same hypothetical users are not advanced enough to follow simple instructions that they should remove those same directories to clean up an existing build? More code is not always better, and I don't understand the argument against replacing this now-200+ line script with a single sentence in the users guide.

Copy link
Collaborator

@christinaholtNOAA christinaholtNOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR has grown to accumulate too much in general. Please limit PRs to one feature.

The title should definitely be changed since it is not only adding support for RRFS ICs/LBCS, but also introducing graphics for ensembles, and addressing issues with the clean script.

All the features must pass PR before you can merge any of them and that's going to be an extremely tall order given the roadblocks of these particular features.


CHGRES_CUBE=${SRW_DIR}/sorc/UFS_UTILS/sorc/chgres_cube.fd
${SED} -i 's/"RAP","HRRR"/"RAP","HRRR","RRFS"/g' ${CHGRES_CUBE}/program_setup.F90
fi
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Labeling RRFS ICs/LBCs as "HRRR" to get it through is fine with me. Modifying source code is not.

--bin-dir=BIN_DIR
binary directory name ("exec" by default); full path is \${INSTALL_DIR}/\${BIN_DIR})
--build-dir=BUILD_DIR
main build directory, absolute path (\${SRW_DIR}/build/ by default)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option should not exist bc it's dangerous. If a user chooses to go against the default, then they need to clean their own directories. It's perfectly acceptable to provide support for ppl who go the supported path and leave the others to their own devices.

This applies to any of the flags that allow users to specify something other than the default build directories created by the build script.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This options is created to mimic the options in ./devbuild.sh, as the script ./devclean.sh is stated to be cleaning any issues with the build:
https://ufs-srweather-app.readthedocs.io/en/develop/UsersGuide/Reference/FAQ.html#how-can-i-clean-up-the-srw-app-code-if-something-went-wrong-during-the-build

parm/data_locations.yml Show resolved Hide resolved
native: '{{ platform.SCHED_NATIVE_CMD }}'
nnodes: 1
nnodes: 2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the plotting script MPI capable? If not, this is wasting resources that will never be used.

#
# Path to the RRFS/HRRRX geogrid file.
#
geogrid_file_input_grid="${FIXgsm}/geo_em.d01.nc_HRRRX"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the geogrid file used to produce RRFS or HRRR or both?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used for both.

@@ -3,6 +3,7 @@ grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_plot
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_HRRR
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta
grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_v1beta
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am adamant that this more expensive ensemble test not be included as a test on every machine. I appreciate @MichaelLueken's suggestion above about choosing one of the platforms with a lighter load.

@natalie-perlin
Copy link
Collaborator Author

This PR has grown to accumulate too much in general. Please limit PRs to one feature.

The title should definitely be changed since it is not only adding support for RRFS ICs/LBCS, but also introducing graphics for ensembles, and addressing issues with the clean script.

All the features must pass PR before you can merge any of them and that's going to be an extremely tall order given the roadblocks of these particular features.

@christinaholtNOAA @MichaelLueken @mkavulich
Yes, planning to take out additional features and improvements out of RRFS-focused PR, to be combined into a separate PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants