Restore error checking in regression test system. #2335

SamuelTrahanNOAA · 2024-06-21T19:46:33Z

Commit Queue Requirements:

Fill out all sections of this template.
N/A No subcomponents. ~~All sub component pull requests have been reviewed by their code managers.~~
See description. ~~Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules~~
See description. ~~Commit 'test_changes.list' from previous step~~

Description:

The regression test system ignores all errors in all jobs. A job where fv3.exe crashed is treated the same as a job where it ran and produced different results. Compilation job errors are also ignored. This leads to several problems:

Test jobs are executed even if the compile job fails.
Jobs with prerequisites (such as restart tests) run even if their prerequisite fails.
A job that failed to copy input data or had syntax errors won't be caught until the entire workflow completes.
Temporary system issues require rerunning the entire workflow instead of only affected jobs.
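
One common way a shell-driven job loses its error status is through a pipeline, where only the last command's exit code is reported. The commit list for this PR mentions pipefail, so the following is a minimal sketch of that mechanism. It is not taken from the regression test scripts; `failing_job` and both wrapper functions are hypothetical names.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a job card that crashes partway through.
failing_job() {
    echo "starting model run"
    return 3
}

# Without pipefail, the pipeline's status is that of the last command (tee),
# so the failure of failing_job is lost.
run_without_pipefail() (
    set +o pipefail
    failing_job | tee /dev/null > /dev/null
)

# With pipefail, the pipeline returns the status of the last command in it
# that exited non-zero, so the failure is visible to the caller.
run_with_pipefail() (
    set -o pipefail
    failing_job | tee /dev/null > /dev/null
)

status_plain=0
run_without_pipefail || status_plain=$?
status_pipefail=0
run_with_pipefail || status_pipefail=$?

echo "without pipefail: ${status_plain}"
echo "with pipefail:    ${status_pipefail}"
```

In this sketch the first call reports 0 and the second reports 3, which is the difference between a scheduler seeing a successful job and a failed one.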

In this new version of the regression test system:

Errors are caught and cause the metascheduler to treat the job as failed.
Dependencies are honored; if a job fails, anything that depends on it won't run.
Jobs that run to completion but have changed results are considered to have succeeded. This behavior is unchanged.
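
The three rules above can be sketched in a few lines of bash. This is an illustration under stated assumptions, not the code in this PR: `run_job`, `run_with_dependency`, the outcome labels, and the demo jobs are all hypothetical, and a plain file comparison stands in for the real baseline check.

```bash
#!/usr/bin/env bash
# All names below are hypothetical; they are not taken from the rt scripts.

# Run a job and print its outcome:
#   FAILED  - the job itself exited non-zero (crash, bad input, ...)
#   CHANGED - the job ran, but results differ from the baseline
#   PASSED  - the job ran and results match the baseline
# Only FAILED makes this function return non-zero, mirroring the rule that
# changed results still count as a successful job.
run_job() {
    local job_cmd=$1 result_file=$2 baseline_file=$3
    if ! "${job_cmd}" > "${result_file}"; then
        echo "FAILED"
        return 1
    fi
    if cmp -s "${result_file}" "${baseline_file}"; then
        echo "PASSED"
    else
        echo "CHANGED"
    fi
    return 0
}

# Run a dependent job only if its prerequisite succeeded.
run_with_dependency() {
    local prereq_status=$1
    shift
    if [[ ${prereq_status} -ne 0 ]]; then
        echo "SKIPPED"
        return 1
    fi
    run_job "$@"
}

# Demo jobs standing in for compile and test steps.
good_job()     { echo "42"; }
changed_job()  { echo "43"; }
crashing_job() { echo "partial"; return 1; }

workdir=$(mktemp -d)
echo "42" > "${workdir}/baseline"

outcome_good=$(run_job good_job "${workdir}/good.out" "${workdir}/baseline")
outcome_changed=$(run_job changed_job "${workdir}/changed.out" "${workdir}/baseline")

compile_status=0
outcome_compile=$(run_job crashing_job "${workdir}/compile.out" "${workdir}/baseline") || compile_status=$?
outcome_test=$(run_with_dependency "${compile_status}" good_job "${workdir}/test.out" "${workdir}/baseline") || true

echo "good job:      ${outcome_good}"
echo "changed job:   ${outcome_changed}"
echo "compile job:   ${outcome_compile}"
echo "dependent job: ${outcome_test}"
rm -rf "${workdir}"
```

The point of the design is that "could not run" and "ran with different results" become distinct outcomes, and only the former stops downstream jobs.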

Temporary Changes to Make Some Tests Fail

To test this feature, I've modified a few jobs so they break or have changed results. These modifications should be reverted before merging to develop:

rrfs_v1beta_failing - This new test will always fail at runtime.
compile_atm_faster_dyn32_intel - Removed the -DFASTER=ON flag. Its tests will still succeed, but their results will change. The test_changes.list file will contain control_wam intel and control_wam_debug intel.
compile_hafsw_intel - Will always fail at runtime due to the new --invalid-argument argument. Tests that require this compilation will never run.

IMPORTANT: These changes are marked with # FIXME and should be reverted before merging to develop.
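
Before merging, the markers can be located with a recursive grep. The helper below is a small sketch; `list_fixme_markers` is a hypothetical name, and the demo directory and file contents are made up for illustration rather than copied from the repository.

```bash
#!/usr/bin/env bash
# List every line carrying a "# FIXME" marker under a directory.
# Output format is path:line-number:line-content (grep -rn).
# The directory argument is an assumption; point it at wherever the
# regression test scripts and configuration live in your checkout.
list_fixme_markers() {
    local search_dir=$1
    grep -rn -- '# FIXME' "${search_dir}"
}

# Demo on a throwaway directory with one marked line (hypothetical content).
demo_dir=$(mktemp -d)
printf '%s\n' 'first line, no marker' 'second line  # FIXME: revert before merge' \
    > "${demo_dir}/demo.conf"

fixme_lines=$(list_fixme_markers "${demo_dir}")
echo "${fixme_lines}"
rm -rf "${demo_dir}"
```

Note that grep exits with status 1 when no marker is found, so a caller can also use the function as a check that all deliberate errors have been removed.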

Commit Message:

* UFSWM - restore error checking to regression test system

Priority:

Normal

Git Tracking

UFSWM:

Closes #2330 ("modify report of test failures to clearly indicate when a test failed to compare because it did not run")

Sub component Pull Requests:

N/A

UFSWM Blocking Dependencies:

N/A

Changes

Regression Test Changes (Please commit test_changes.list):

No Baseline Changes.

Some deliberate errors must be removed before this PR will run to completion without changing baselines. They are marked with # FIXME.

Input data Changes:

None.

Library Changes/Upgrades:

No Updates

Testing Log:


SamuelTrahanNOAA · 2024-06-21T19:52:46Z

Pinging @DeniseWorthen who authored the relevant issue. Also, @DusanJovic-NOAA who authored the original regression test system.

SamuelTrahanNOAA · 2024-06-21T19:55:03Z

I've been testing this on top of #2326. The regression test system has proven completely unusable for development because it lacks error checking. Updating UPP and the modulefiles required many changes that caused subsets of the tests to fail. A regression test system that cannot distinguish a test with changed results from a test that could not run at all is not useful for development.

jkbk2004 · 2024-06-27T12:38:22Z

@SamuelTrahanNOAA Can you follow up to clean up the super-linter complaint?

SamuelTrahanNOAA · 2024-06-27T15:46:45Z

After updating this branch, I'm getting out-of-memory errors from some jobs on Hera when using Rocoto.

SamuelTrahanNOAA · 2024-06-27T18:49:15Z

> After updating this branch, I'm getting out-of-memory errors from some jobs on Hera when using Rocoto.

The jobs succeeded on the second attempt. This may have been a temporary system issue.

SamuelTrahanNOAA added 3 commits June 21, 2024 15:45

restore error checking to workflow and tweak some jobs to fail, to test the features

7c6316c

job-failing functionality (for test purposes) moved to right script

c7cbada

ignore some regression test system flag and temporary files that should not be committed

325c261

use pipefail to detect if job card fails

7bb328c

Merge remote-tracking branch 'upstream/develop' into error-checking

fd212fc

SamuelTrahanNOAA added 3 commits June 27, 2024 15:48

make linter happy

bae897a

try again to make linter happy

05835b8

set pipefail again for linter

eef6e73

SamuelTrahanNOAA added 2 commits June 27, 2024 20:02

run_compile.sh: do not duplicate redirect_out_err

b8ba251

correct a comment

e536905
