Errors on v2.3.0rc5 (E3SM Unified 1.9.0rc10) #474
Comments
I'm afraid I don't have anything useful to suggest beyond what I mentioned on Slack. So I hope further debugging on your part or experience from other zppy users can help narrow down the problem.
I will see if I can discover anything with @chengzhuzhang @golaz @tomvothecoder @mahf708 On the off-chance any of you have some thoughts on the above errors, please let me know! Thanks
@forsyth2. One thing I can think of is to redo the zppy test with rc9, but with the cmip timeseries and ilamb tasks disabled in the configuration file, and then run the same test with rc10. That would show how the behavior differs between the two RCs.
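A minimal sketch of how the same configuration file could be reused for both RCs with selected tasks switched off. It assumes the tasks to skip live in top-level `[section]` blocks that honor an `active` key; the helper name `disable_sections` and the section names in the test are illustrative, not taken from the configs in this issue.

```python
import re


def disable_sections(cfg_text, sections):
    """Return cfg_text with `active = False` set in each named top-level section.

    Assumption: a top-level section looks like `[name]`, subsections look like
    `[[name]]`, and the tool reading the file honors an `active` key.
    """
    out = []
    in_target = False  # inside a section we want to disable
    in_sub = False     # inside a [[subsection]] of the current section
    for line in cfg_text.splitlines():
        top = re.match(r"^\s*\[([^\[\]]+)\]\s*$", line)
        if top:
            in_target = top.group(1).strip() in sections
            in_sub = False
            out.append(line)
            if in_target:
                out.append("active = False")
            continue
        if re.match(r"^\s*\[\[", line):
            in_sub = True
        # Drop the section-level `active` line that we just replaced.
        if in_target and not in_sub and re.match(r"^\s*active\s*=", line):
            continue
        out.append(line)
    return "\n".join(out) + "\n"
```

Writing the result to a second file and running it under each RC environment keeps the two runs identical apart from the environment itself.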
If we can get a simpler reproducer, we could potentially isolate it to a single package. I checked a few of the changes. For example,
@mahf708 Yeah, it's also not clear to me how we could do that. At least on Chrysalis, @chengzhuzhang seems to have narrowed it down to an E3SM Diags problem. As for Compy, #475 (comment) seems to be a possible path forward.
Closing this issue, resolved by Unified |
Chrysalis
Contents of tests/integration/generated/test_complete_run_chrysalis.cfg:
I ran:
This generates files that I have since moved:
/lcrc/group/e3sm/ac.forsyth2/zppy_test_complete_run_output/v2.LR.historical_0201/post -> /lcrc/group/e3sm/ac.forsyth2/zppy_test_complete_run_output/v2.LR.historical_0201/post_20230808
/lcrc/group/e3sm/public_html/diagnostic_output/ac.forsyth2/zppy_test_complete_run_www/v2.LR.historical_0201 -> /lcrc/group/e3sm/public_html/diagnostic_output/ac.forsyth2/zppy_test_complete_run_www/v2.LR.historical_0201_20230808
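The renaming pattern used here (appending a date stamp to the output directory) can be sketched as a small helper. The function name `dated_name` is hypothetical, and the `_YYYYMMDD` suffix is inferred from the paths listed in this issue.

```python
from datetime import date
from pathlib import PurePosixPath


def dated_name(path, day):
    """Return `path` with `_YYYYMMDD` appended to its final component."""
    p = PurePosixPath(path)
    return p.with_name(f"{p.name}_{day:%Y%m%d}")
```

The actual move could then be done with something like `shutil.move(src, str(dated_name(src, date.today())))`, so that a later run writes into a fresh directory.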
Selection of output from e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1850-1851.o370766:
@chengzhuzhang says the above error is expected.
[chr-0080:487898:0:488057] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x8ec69ef4)
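To see which job logs contain a crash like the one above, a small scan for the "Caught signal 11" marker can help. This is only a sketch: the helper name `segfault_lines` is made up here, and the pattern is taken from the single log line quoted in this issue, so other systems may word it differently.

```python
import re

# Pattern copied from the log line quoted above; an assumption for other logs.
SEGFAULT = re.compile(r"Caught signal 11 \(Segmentation fault")


def segfault_lines(lines):
    """Return (line_number, text) pairs for lines that report a segmentation fault."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if SEGFAULT.search(line)
    ]
```

Passing an open log file (for example `segfault_lines(open(log_path))`) returns the matching lines with their positions, which makes it easier to compare where each task failed.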
Compy
Contents of tests/integration/generated/test_complete_run_compy.cfg:
I ran:
This generates files that I have since moved:
/compyfs/fors729/zppy_test_complete_run_output/v2.LR.historical_0201/post -> /compyfs/fors729/zppy_test_complete_run_output/v2.LR.historical_0201/post_20230808
/compyfs/www/fors729/zppy_test_complete_run_www/v2.LR.historical_0201 -> /compyfs/www/fors729/zppy_test_complete_run_www/v2.LR.historical_0201_20230808
Compy thus fails earlier than Chrysalis -- in the climo tasks rather than the e3sm_diags tasks.
Selection of output from climo_atm_monthly_180x360_aave_1850-1851.o551026:
:Perlmutter
Contents of tests/integration/generated/test_complete_run_pm-cpu.cfg:
I ran:
This generates files that I have since moved:
/global/cfs/cdirs/e3sm/forsyth/zppy_test_complete_run_output/v2.LR.historical_0201/post -> /global/cfs/cdirs/e3sm/forsyth/zppy_test_complete_run_output/v2.LR.historical_0201/post_20230808
/global/cfs/cdirs/e3sm/www/forsyth/zppy_test_complete_run_www/v2.LR.historical_0201/ -> /global/cfs/cdirs/e3sm/www/forsyth/zppy_test_complete_run_www/v2.LR.historical_0201_20230808
Yet I ran python -u -m unittest tests/integration/test_complete_run.py (before moving the files to different directories) and ran into the following error:
(Chrysalis actually had an UnidentifiedImageError, which is what alerted me to the problem over there in the first place.)
Selection of output from e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1850-1851.o1349980:
This is the same error that ultimately stopped the Chrysalis run of E3SM Diags -- just without the segmentation faults.
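On the UnidentifiedImageError mentioned above: assuming it came from Pillow failing to open an expected image file (for example an empty or truncated PNG left behind by a task that crashed), a standard-library check of the PNG signature can list the suspect files before the test is rerun. The function names here are illustrative, and whether the failing images were PNGs is an assumption.

```python
from pathlib import Path

# Every valid PNG file starts with these eight bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def has_png_signature(data):
    """Return True if `data` begins with the PNG file signature."""
    return data[:8] == PNG_SIGNATURE


def unreadable_pngs(root):
    """Return sorted paths of *.png files under `root` lacking a valid signature."""
    bad = []
    for path in sorted(Path(root).rglob("*.png")):
        with open(path, "rb") as handle:
            if not has_png_signature(handle.read(8)):
                bad.append(path)
    return bad
```

A valid signature does not guarantee the rest of the file is intact, so this only catches the most obvious cases, such as zero-byte files.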