Debug rc5 issues #475
Commit 1: Chrysalis -- latest
Output:
|
Commit 2: Chrysalis -- latest
Output:
|
Do you have a log of the package versions that changed between rc9 and rc10? I can probably produce it if not... |
1c1
< # packages in environment at /lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.0rc9_login:
---
> # packages in environment at /lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.0rc10_login:
16c16
< async-lru 2.0.3 pyhd8ed1ab_0 conda-forge
---
> async-lru 2.0.4 pyhd8ed1ab_0 conda-forge
59c59
< cdms2 3.1.5 py310heeafeea_20 conda-forge
---
> cdms2 3.1.5 py310heeafeea_21 conda-forge
67d66
< cfitsio 4.2.0 hd9d235c_0 conda-forge
78c77
< comm 0.1.3 pyhd8ed1ab_0 conda-forge
---
> comm 0.1.4 pyhd8ed1ab_0 conda-forge
90c89
< debugpy 1.6.7 py310heca2aa9_0 conda-forge
---
> debugpy 1.6.8 py310hc6cd4ac_0 conda-forge
96c95
< e3sm-unified 1.9.0rc9 mpi_mpich_py310_hdc99501_0 e3sm/label/e3sm_dev
---
> e3sm-unified 1.9.0rc10 mpi_mpich_py310_hdc99501_0 e3sm/label/e3sm_dev
98c97
< e3sm_to_cmip 1.10.0rc1 pyhe9a6732_0 conda-forge/label/e3sm_to_cmip_dev
---
> e3sm_to_cmip 1.10.0rc2 pyhe9a6732_0 conda-forge/label/e3sm_to_cmip_dev
119c118
< fonttools 4.41.1 py310h2372a71_0 conda-forge
---
> fonttools 4.42.0 py310h2372a71_0 conda-forge
156c155
< imagecodecs 2023.7.10 py310h4c4fb95_0 conda-forge
---
> imagecodecs 2023.7.10 py310hc929067_2 conda-forge
167c166
< ipywidgets 8.0.7 pyhd8ed1ab_0 conda-forge
---
> ipywidgets 8.1.0 pyhd8ed1ab_0 conda-forge
170c169
< jedi 0.18.2 pyhd8ed1ab_0 conda-forge
---
> jedi 0.19.0 pyhd8ed1ab_0 conda-forge
179c178
< jsonschema 4.18.4 pyhd8ed1ab_0 conda-forge
---
> jsonschema 4.18.6 pyhd8ed1ab_0 conda-forge
181c180
< jsonschema-with-format-nongpl 4.18.4 pyhd8ed1ab_0 conda-forge
---
> jsonschema-with-format-nongpl 4.18.6 pyhd8ed1ab_0 conda-forge
187c186
< jupyter_events 0.6.3 pyhd8ed1ab_1 conda-forge
---
> jupyter_events 0.7.0 pyhd8ed1ab_1 conda-forge
190c189
< jupyterlab 4.0.3 pyhd8ed1ab_0 conda-forge
---
> jupyterlab 4.0.4 pyhd8ed1ab_0 conda-forge
207c206
< libarrow 12.0.1 h657c46f_6_cpu conda-forge
---
> libarrow 12.0.1 h657c46f_7_cpu conda-forge
214c213
< libcap 2.67 he9d0100_0 conda-forge
---
> libcap 2.69 h0f662aa_0 conda-forge
219,220c218,219
< libclang 15.0.7 default_h7634d5b_2 conda-forge
< libclang13 15.0.7 default_h9986a30_2 conda-forge
---
> libclang 15.0.7 default_h7634d5b_3 conda-forge
> libclang13 15.0.7 default_h9986a30_3 conda-forge
248c247
< libllvm14 14.0.6 hcd5def8_3 conda-forge
---
> libllvm14 14.0.6 hcd5def8_4 conda-forge
262c261
< librsvg 2.56.1 h98fae49_0 conda-forge
---
> librsvg 2.56.3 h98fae49_0 conda-forge
268c267
< libsystemd0 253 h8c4010b_1 conda-forge
---
> libsystemd0 254 h3516f8a_0 conda-forge
272c271
< libudev1 253 h0b41bf4_1 conda-forge
---
> libudev1 254 h3f72095_0 conda-forge
293c292
< mache 1.17.0rc1 pyh4bc9f2b_0 conda-forge/label/mache_dev
---
> mache 1.17.0rc2 pyh4bc9f2b_0 conda-forge/label/mache_dev
302c301
< mpas-analysis 1.9.0rc3 pyh320ef33_0 conda-forge/label/mpas_analysis_dev
---
> mpas-analysis 1.9.0rc4 pyh320ef33_0 conda-forge/label/mpas_analysis_dev
309c308
< mpich 4.1.1 h846660c_100 conda-forge
---
> mpich 4.1.2 h846660c_100 conda-forge
322c321
< nbformat 5.9.1 pyhd8ed1ab_0 conda-forge
---
> nbformat 5.9.2 pyhd8ed1ab_0 conda-forge
343c342
< openssl 3.1.1 hd590300_1 conda-forge
---
> openssl 3.1.2 hd590300_0 conda-forge
368c367
< platformdirs 3.9.1 pyhd8ed1ab_0 conda-forge
---
> platformdirs 3.10.0 pyhd8ed1ab_0 conda-forge
385c384
< pyarrow 12.0.1 py310h0576679_6_cpu conda-forge
---
> pyarrow 12.0.1 py310h0576679_7_cpu conda-forge
391c390
< pyparsing 3.1.0 pyhd8ed1ab_0 conda-forge
---
> pyparsing 3.1.1 pyhd8ed1ab_0 conda-forge
405c404
< python-utils 3.7.0 pyhd8ed1ab_0 conda-forge
---
> python-utils 3.7.0 pyhd8ed1ab_1 conda-forge
418c417
< referencing 0.30.0 pyhd8ed1ab_0 conda-forge
---
> referencing 0.30.1 pyhd8ed1ab_0 conda-forge
433c432
< sip 6.7.10 py310hc6cd4ac_0 conda-forge
---
> sip 6.7.11 py310hc6cd4ac_0 conda-forge
535c534
< zppy 2.3.0rc3 pyh51c0ceb_0 conda-forge/label/zppy_dev
---
> zppy 2.3.0rc5 pyh51c0ceb_0 conda-forge/label/zppy_dev
|
Okay, do you have any guesses where exactly the segfault is originating? I cannot quite figure it out from the logs. Also, is dask being used to trigger additional jobs or just in-node jobs? (Not sure if you know the latter) |
The nodes associated with your seg faults are: chr-0249, chr-0244, chr-0250 |
Has anyone ever seen
|
@mahf708 thanks for helping with troubleshooting. The listed packages are for the login node. I think we should look at the compute-node version of the packages? |
They're almost the same. I will post the other list here. |
1c1
< # packages in environment at /lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.0rc9_chrysalis:
---
> # packages in environment at /lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.0rc10_chrysalis:
16c16
< async-lru 2.0.3 pyhd8ed1ab_0 conda-forge
---
> async-lru 2.0.4 pyhd8ed1ab_0 conda-forge
59c59
< cdms2 3.1.5 py310heeafeea_20 conda-forge
---
> cdms2 3.1.5 py310heeafeea_21 conda-forge
67d66
< cfitsio 4.2.0 hd9d235c_0 conda-forge
78c77
< comm 0.1.3 pyhd8ed1ab_0 conda-forge
---
> comm 0.1.4 pyhd8ed1ab_0 conda-forge
90c89
< debugpy 1.6.7 py310heca2aa9_0 conda-forge
---
> debugpy 1.6.8 py310hc6cd4ac_0 conda-forge
96c95
< e3sm-unified 1.9.0rc9 hpc_py310_hd6e50ed_0 e3sm/label/e3sm_dev
---
> e3sm-unified 1.9.0rc10 hpc_py310_hd6e50ed_0 e3sm/label/e3sm_dev
98c97
< e3sm_to_cmip 1.10.0rc1 pyhe9a6732_0 conda-forge/label/e3sm_to_cmip_dev
---
> e3sm_to_cmip 1.10.0rc2 pyhe9a6732_0 conda-forge/label/e3sm_to_cmip_dev
119c118
< fonttools 4.41.1 py310h2372a71_0 conda-forge
---
> fonttools 4.42.0 py310h2372a71_0 conda-forge
156c155
< imagecodecs 2023.7.10 py310h4c4fb95_0 conda-forge
---
> imagecodecs 2023.7.10 py310hc929067_2 conda-forge
167c166
< ipywidgets 8.0.7 pyhd8ed1ab_0 conda-forge
---
> ipywidgets 8.1.0 pyhd8ed1ab_0 conda-forge
170c169
< jedi 0.18.2 pyhd8ed1ab_0 conda-forge
---
> jedi 0.19.0 pyhd8ed1ab_0 conda-forge
179c178
< jsonschema 4.18.4 pyhd8ed1ab_0 conda-forge
---
> jsonschema 4.18.6 pyhd8ed1ab_0 conda-forge
181c180
< jsonschema-with-format-nongpl 4.18.4 pyhd8ed1ab_0 conda-forge
---
> jsonschema-with-format-nongpl 4.18.6 pyhd8ed1ab_0 conda-forge
187c186
< jupyter_events 0.6.3 pyhd8ed1ab_1 conda-forge
---
> jupyter_events 0.7.0 pyhd8ed1ab_1 conda-forge
190c189
< jupyterlab 4.0.3 pyhd8ed1ab_0 conda-forge
---
> jupyterlab 4.0.4 pyhd8ed1ab_0 conda-forge
207c206
< libarrow 12.0.1 h657c46f_6_cpu conda-forge
---
> libarrow 12.0.1 h657c46f_7_cpu conda-forge
214c213
< libcap 2.67 he9d0100_0 conda-forge
---
> libcap 2.69 h0f662aa_0 conda-forge
218,219c217,218
< libclang 15.0.7 default_h7634d5b_2 conda-forge
< libclang13 15.0.7 default_h9986a30_2 conda-forge
---
> libclang 15.0.7 default_h7634d5b_3 conda-forge
> libclang13 15.0.7 default_h9986a30_3 conda-forge
246c245
< libllvm14 14.0.6 hcd5def8_3 conda-forge
---
> libllvm14 14.0.6 hcd5def8_4 conda-forge
259c258
< librsvg 2.56.1 h98fae49_0 conda-forge
---
> librsvg 2.56.3 h98fae49_0 conda-forge
265c264
< libsystemd0 253 h8c4010b_1 conda-forge
---
> libsystemd0 254 h3516f8a_0 conda-forge
289c288
< mache 1.17.0rc1 pyh4bc9f2b_0 conda-forge/label/mache_dev
---
> mache 1.17.0rc2 pyh4bc9f2b_0 conda-forge/label/mache_dev
297c296
< mpas-analysis 1.9.0rc3 pyh320ef33_0 conda-forge/label/mpas_analysis_dev
---
> mpas-analysis 1.9.0rc4 pyh320ef33_0 conda-forge/label/mpas_analysis_dev
315c314
< nbformat 5.9.1 pyhd8ed1ab_0 conda-forge
---
> nbformat 5.9.2 pyhd8ed1ab_0 conda-forge
334c333
< openssl 3.1.1 hd590300_1 conda-forge
---
> openssl 3.1.2 hd590300_0 conda-forge
357c356
< platformdirs 3.9.1 pyhd8ed1ab_0 conda-forge
---
> platformdirs 3.10.0 pyhd8ed1ab_0 conda-forge
374c373
< pyarrow 12.0.1 py310h0576679_6_cpu conda-forge
---
> pyarrow 12.0.1 py310h0576679_7_cpu conda-forge
380c379
< pyparsing 3.1.0 pyhd8ed1ab_0 conda-forge
---
> pyparsing 3.1.1 pyhd8ed1ab_0 conda-forge
394c393
< python-utils 3.7.0 pyhd8ed1ab_0 conda-forge
---
> python-utils 3.7.0 pyhd8ed1ab_1 conda-forge
407c406
< referencing 0.30.0 pyhd8ed1ab_0 conda-forge
---
> referencing 0.30.1 pyhd8ed1ab_0 conda-forge
422c421
< sip 6.7.10 py310hc6cd4ac_0 conda-forge
---
> sip 6.7.11 py310hc6cd4ac_0 conda-forge
521c520
< zppy 2.3.0rc3 pyh51c0ceb_0 conda-forge/label/zppy_dev
---
> zppy 2.3.0rc5 pyh51c0ceb_0 conda-forge/label/zppy_dev |
diff of diffs...
2c2
< < # packages in environment at /lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.0rc9_chrysalis:
---
> < # packages in environment at /lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.0rc9_login:
4c4
< > # packages in environment at /lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.0rc10_chrysalis:
---
> > # packages in environment at /lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.0rc10_login:
24c24
< < e3sm-unified 1.9.0rc9 hpc_py310_hd6e50ed_0 e3sm/label/e3sm_dev
---
> < e3sm-unified 1.9.0rc9 mpi_mpich_py310_hdc99501_0 e3sm/label/e3sm_dev
26c26
< > e3sm-unified 1.9.0rc10 hpc_py310_hd6e50ed_0 e3sm/label/e3sm_dev
---
> > e3sm-unified 1.9.0rc10 mpi_mpich_py310_hdc99501_0 e3sm/label/e3sm_dev
71c71
< 218,219c217,218
---
> 219,220c218,219
77c77
< 246c245
---
> 248c247
81c81
< 259c258
---
> 262c261
85c85
< 265c264
---
> 268c267
89c89,93
< 289c288
---
> 272c271
> < libudev1 253 h0b41bf4_1 conda-forge
> ---
> > libudev1 254 h3f72095_0 conda-forge
> 293c292
93c97
< 297c296
---
> 302c301
97c101,105
< 315c314
---
> 309c308
> < mpich 4.1.1 h846660c_100 conda-forge
> ---
> > mpich 4.1.2 h846660c_100 conda-forge
> 322c321
101c109
< 334c333
---
> 343c342
105c113
< 357c356
---
> 368c367
109c117
< 374c373
---
> 385c384
113c121
< 380c379
---
> 391c390
117c125
< 394c393
---
> 405c404
121c129
< 407c406
---
> 418c417
125c133
< 422c421
---
> 433c432
129c137
< 521c520
---
> 535c534 |
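The "diff of diffs" above can also be expressed as a set comparison, which avoids the noise from shifted line numbers. This is only a minimal sketch: the function name `changes_not_shared` and the dictionary layout are hypothetical, and the entries are a hand-copied excerpt of the login and compute-node diffs above rather than the full lists.

```python
def changes_not_shared(changes_a, changes_b):
    """Return the entries of each change set that the other does not contain."""
    only_a = {k: v for k, v in changes_a.items() if changes_b.get(k) != v}
    only_b = {k: v for k, v in changes_b.items() if changes_a.get(k) != v}
    return only_a, only_b


# package -> (rc9 "version build", rc10 "version build"), excerpted from above
login_changes = {
    "cdms2": ("3.1.5 py310heeafeea_20", "3.1.5 py310heeafeea_21"),
    "e3sm-unified": ("1.9.0rc9 mpi_mpich_py310_hdc99501_0",
                     "1.9.0rc10 mpi_mpich_py310_hdc99501_0"),
    "libudev1": ("253 h0b41bf4_1", "254 h3f72095_0"),
    "mpich": ("4.1.1 h846660c_100", "4.1.2 h846660c_100"),
}
compute_changes = {
    "cdms2": ("3.1.5 py310heeafeea_20", "3.1.5 py310heeafeea_21"),
    "e3sm-unified": ("1.9.0rc9 hpc_py310_hd6e50ed_0",
                     "1.9.0rc10 hpc_py310_hd6e50ed_0"),
}

only_login, only_compute = changes_not_shared(login_changes, compute_changes)
print("login only:", sorted(only_login))
print("compute only:", sorted(only_compute))
```

For this excerpt, the cdms2 build bump is common to both environments, while libudev1 and mpich change only in the login environment and e3sm-unified differs only in its build string.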
It looks like you've managed to look at the output files. In any case, running
I unfortunately have no idea. I don't work with |
In a standalone test for
And followed by Traceback
|
Commit 3: Compy -- latest
Output:
|
Commit 4: Compy -- latest
E.g., from
Output:
|
@chengzhuzhang Thank you for confirming stand-alone E3SM Diags results on Perlmutter. |
Regarding Compy, I feel really stuck.
I will see today if I can get ESMF_RegridWeightGen to build successfully with some combination of Gnu and MPI modules on Compy. That seems like the only plausible way forward at the moment. |
Thanks Xylar! |
I'm still working on building an rc11 on Compy with Gnu and MVAPICH2. Spack build is taking hours because it's Compy... |
@forsyth2, the outcome is that I built Spack packages for about 8 hours today (first with Intel by mistake and then with Gnu and MVAPICH2), and TempestRemap just failed to build:
So that seems to be a dead end. Please focus on the machines other than Compy for now while I try to come up with a plan C. Or is it plan H? plan Z? |
What needs multi-node MPI functionality besides ncclimo? @xylar, I am sorry 😢 this is a total maintenance nightmare. We gotta think of the pros and cons of having these spack packages... |
Skimming the shapely repo, seems like there was a geos bug that was fixed for 3.12 |
I narrowed my stacktrace in the comment above to the Fatal Python error: Segmentation fault
Current thread 0x00007fc2c5be3740 (most recent call first):
File "/home/vo13/mambaforge/envs/e3sm_diags_dev/lib/python3.10/site-packages/shapely/predicates.py", line 69 in has_z
File "/home/vo13/mambaforge/envs/e3sm_diags_dev/lib/python3.10/site-packages/shapely/decorators.py", line 77 in wrapped
File "/home/vo13/mambaforge/envs/e3sm_diags_dev/lib/python3.10/site-packages/shapely/geometry/base.py", line 607 in has_z
File "/home/vo13/mambaforge/envs/e3sm_diags_dev/lib/python3.10/site-packages/shapely/geometry/base.py", line 206 in coord
The difference between shapely=1.8.5 and shapely>=2.0.0 is that >=2.0.0 implements
|
@tomvothecoder, thanks for narrowing things down so much! What I don't understand is that nothing related to shapely (or geos) changed between rc9 and rc10. You can see #475 (comment) for Chrysalis and here is Perlmutter:
1c1
< # packages in environment at /global/common/software/e3sm/anaconda_envs/base/envs/e3sm_unified_1.9.0rc9_pm-cpu:
---
> # packages in environment at /global/common/software/e3sm/anaconda_envs/base/envs/e3sm_unified_1.9.0rc10_pm-cpu:
16c16
< async-lru 2.0.3 pyhd8ed1ab_0 conda-forge
---
> async-lru 2.0.4 pyhd8ed1ab_0 conda-forge
59c59
< cdms2 3.1.5 py310heeafeea_20 conda-forge
---
> cdms2 3.1.5 py310heeafeea_21 conda-forge
67d66
< cfitsio 4.2.0 hd9d235c_0 conda-forge
78c77
< comm 0.1.3 pyhd8ed1ab_0 conda-forge
---
> comm 0.1.4 pyhd8ed1ab_0 conda-forge
90c89
< debugpy 1.6.7 py310heca2aa9_0 conda-forge
---
> debugpy 1.6.8 py310hc6cd4ac_0 conda-forge
96c95
< e3sm-unified 1.9.0rc9 hpc_py310_hd6e50ed_0 e3sm/label/e3sm_dev
---
> e3sm-unified 1.9.0rc10 hpc_py310_hd6e50ed_0 e3sm/label/e3sm_dev
98c97
< e3sm_to_cmip 1.10.0rc1 pyhe9a6732_0 conda-forge/label/e3sm_to_cmip_dev
---
> e3sm_to_cmip 1.10.0rc2 pyhe9a6732_0 conda-forge/label/e3sm_to_cmip_dev
119c118
< fonttools 4.41.1 py310h2372a71_0 conda-forge
---
> fonttools 4.42.0 py310h2372a71_0 conda-forge
156c155
< imagecodecs 2023.7.10 py310h4c4fb95_0 conda-forge
---
> imagecodecs 2023.7.10 py310hc929067_2 conda-forge
167c166
< ipywidgets 8.0.7 pyhd8ed1ab_0 conda-forge
---
> ipywidgets 8.1.0 pyhd8ed1ab_0 conda-forge
170c169
< jedi 0.18.2 pyhd8ed1ab_0 conda-forge
---
> jedi 0.19.0 pyhd8ed1ab_0 conda-forge
179c178
< jsonschema 4.18.4 pyhd8ed1ab_0 conda-forge
---
> jsonschema 4.18.6 pyhd8ed1ab_0 conda-forge
181c180
< jsonschema-with-format-nongpl 4.18.4 pyhd8ed1ab_0 conda-forge
---
> jsonschema-with-format-nongpl 4.18.6 pyhd8ed1ab_0 conda-forge
187c186
< jupyter_events 0.6.3 pyhd8ed1ab_1 conda-forge
---
> jupyter_events 0.7.0 pyhd8ed1ab_1 conda-forge
190c189
< jupyterlab 4.0.3 pyhd8ed1ab_0 conda-forge
---
> jupyterlab 4.0.4 pyhd8ed1ab_0 conda-forge
207c206
< libarrow 12.0.1 h657c46f_6_cpu conda-forge
---
> libarrow 12.0.1 h657c46f_7_cpu conda-forge
214c213
< libcap 2.67 he9d0100_0 conda-forge
---
> libcap 2.69 h0f662aa_0 conda-forge
218,219c217,218
< libclang 15.0.7 default_h7634d5b_2 conda-forge
< libclang13 15.0.7 default_h9986a30_2 conda-forge
---
> libclang 15.0.7 default_h7634d5b_3 conda-forge
> libclang13 15.0.7 default_h9986a30_3 conda-forge
246c245
< libllvm14 14.0.6 hcd5def8_3 conda-forge
---
> libllvm14 14.0.6 hcd5def8_4 conda-forge
259c258
< librsvg 2.56.1 h98fae49_0 conda-forge
---
> librsvg 2.56.3 h98fae49_0 conda-forge
265c264
< libsystemd0 253 h8c4010b_1 conda-forge
---
> libsystemd0 254 h3516f8a_0 conda-forge
289c288
< mache 1.17.0rc1 pyh4bc9f2b_0 conda-forge/label/mache_dev
---
> mache 1.17.0rc3 pyh4bc9f2b_0 conda-forge/label/mache_dev
297c296
< mpas-analysis 1.9.0rc3 pyh320ef33_0 conda-forge/label/mpas_analysis_dev
---
> mpas-analysis 1.9.0rc4 pyh320ef33_0 conda-forge/label/mpas_analysis_dev
315c314
< nbformat 5.9.1 pyhd8ed1ab_0 conda-forge
---
> nbformat 5.9.2 pyhd8ed1ab_0 conda-forge
334c333
< openssl 3.1.1 hd590300_1 conda-forge
---
> openssl 3.1.2 hd590300_0 conda-forge
357c356
< platformdirs 3.9.1 pyhd8ed1ab_0 conda-forge
---
> platformdirs 3.10.0 pyhd8ed1ab_0 conda-forge
374c373
< pyarrow 12.0.1 py310h0576679_6_cpu conda-forge
---
> pyarrow 12.0.1 py310h0576679_7_cpu conda-forge
380c379
< pyparsing 3.1.0 pyhd8ed1ab_0 conda-forge
---
> pyparsing 3.1.1 pyhd8ed1ab_0 conda-forge
394c393
< python-utils 3.7.0 pyhd8ed1ab_0 conda-forge
---
> python-utils 3.7.0 pyhd8ed1ab_1 conda-forge
407c406
< referencing 0.30.0 pyhd8ed1ab_0 conda-forge
---
> referencing 0.30.1 pyhd8ed1ab_0 conda-forge
422c421
< sip 6.7.10 py310hc6cd4ac_0 conda-forge
---
> sip 6.7.11 py310hc6cd4ac_0 conda-forge
466c465
< wheel 0.41.0 pyhd8ed1ab_0 conda-forge
---
> wheel 0.41.1 pyhd8ed1ab_0 conda-forge
521c520
< zppy 2.3.0rc3 pyh51c0ceb_0 conda-forge/label/zppy_dev
---
> zppy 2.3.0rc5 pyh51c0ceb_0 conda-forge/label/zppy_dev
One of these changes must be related. |
I tried creating an
Then, on a compute node:
|
I think that will help a lot with debugging because it suggests it's nothing specific to Spack or E3SM-Unified itself. It's some conda package or combination thereof. Update: I see that @tomvothecoder had already been testing with just e3sm_diags on its own. Sorry, I missed that. |
@chengzhuzhang, bad news, I've been able to narrow it down to the cdms2 patch I did in conda-forge/cdms2-feedstock#89. If I create an environment with:
I get
and the segfault. If I do:
I get:
and no segfault. So it seems like CDMS2 can't be used with the latest ESMF/ESMPy. This is a pretty giant setback because we've been trying to work hard here to move from ESMF v8.2.0 to v8.4.2, which would be a big jump forward for us. @mahf708 and I have also put quite a bit of work into supporting ESMF without MPI and that work only applies to v8.4.2, not preceding versions. I need to think about what the options are. Update: So far the only thing I've come up with is to install development versions of both |
@xylar Thanks for narrowing down the issue! I'm sorry this has become such an ordeal. |
@xylar Thank you for your continued effort troubleshooting. In my standalone e3sm_diags environment, I have the following: This environment works okay. So my understanding is that we need to patch cdms2 to work with the MPI version of esmf? |
@chengzhuzhang, I don't think MPI has anything to do with it. I just happened to get the MPI version in one case and the nompi in another. I tested with both and build 20, and it was fine either way. So I think the issue is entirely with my patch and unrelated to |
I tested Xylar's e3sm_diags environments in his comment here and confirmed that the e3sm_diags works with the older patch version 20 In my comment here, the latest patch version 21 of In any case, as Xylar mentioned, fixing |
Oh, I understand now! It is cdms2 build 21 that caused the problem. I failed to notice the build number. Thanks @xylar and @tomvothecoder for confirming further! |
Unfortunately, I didn't have time to debug this today. I will try to figure it out tomorrow. |
Thanks for the heads-up. Let us know if there is anything the team can do to help troubleshoot or solve this issue. |
I created stand-alone environments with different versions
Test code
import cdms2
from shapely.geometry.polygon import LinearRing
geo = LinearRing(((-180, -90), (-180, 90), (180, 90), (180, -90), (-180, -90)))
geo.has_z
Environment test cases
Result
Test case 4 with cdms2 patch 21 and shapely=2.01 is the only one that breaks. These versions of cdms2 and shapely are being installed in the latest
Possible reasons
As mentioned in this comment, shapely>=2.0 now wraps the My findings lead me to believe that the The question is: what was introduced in cdms2 patch 21 (and probably esmf=8.4.2) that might be conflicting with shapely>=2.0? |
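Because a segfault kills the interpreter, checking several environments is easier if the test code runs in a child process and the parent inspects the exit status. The following is a minimal sketch of that idea, not the procedure used in this thread: `run_snippet` and `is_segfault` are hypothetical helpers, and the `128 + SIGSEGV` check is an assumption about how a wrapper such as a shell might report the signal.

```python
import signal
import subprocess
import sys

# Test code from the comment above, kept as a string so it runs in a child.
REPRO = """
import cdms2
from shapely.geometry.polygon import LinearRing
geo = LinearRing(((-180, -90), (-180, 90), (180, 90), (180, -90), (-180, -90)))
geo.has_z
"""


def is_segfault(returncode):
    """True if an exit status indicates death by SIGSEGV.

    subprocess reports a signal as a negative return code; 128 + signal is
    the convention shells use (assumed here for wrapped commands).
    """
    return returncode in (-signal.SIGSEGV, 128 + signal.SIGSEGV)


def run_snippet(code, python_cmd=(sys.executable,)):
    """Run `code` in a separate interpreter with faulthandler enabled.

    Returns (returncode, stderr). With `-X faulthandler`, a crash prints a
    "Fatal Python error: Segmentation fault" traceback to stderr.
    """
    proc = subprocess.run(
        [*python_cmd, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr
```

To check one environment, call `run_snippet(REPRO)` with that environment's interpreter passed as `python_cmd` (for example a tuple holding the path to its `python`), then pass the return code to `is_segfault` and print the captured stderr to see the faulthandler traceback.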
@tomvothecoder, your investigation is very helpful indeed! |
Some further clues. Conda env
(so the same as @tomvothecoder's
Tom's script:
#!/usr/bin/env python3
import cdms2
from shapely.geometry.polygon import LinearRing
geo = LinearRing(((-180, -90), (-180, 90), (180, 90), (180, -90), (-180, -90)))
geo.has_z
Result: segfault. Replace
|
My proposed fix is E3SM-Project/e3sm_diags#715. Please have a look and also test this out more extensively. |
Thank you @xylar! I'm glad you found a quick solution to get |
I tested standalone e3sm_diags on both Perlmutter and Compy. Both runs are okay. @xylar I'm going to create an e3sm_diags rc3. I'm not sure if there is a viable path forward for the Compy issue with building NCO? It seems like the last issue that needs to be resolved... |
Using rc12, I'm not actually encountering this error anymore. (Note: I am running zppy from |
@forsyth2, I undid changes I made from rc9 to rc10 in rc12. I am building the various spack packages with Gnu and OpenMPI again rather than building with Intel and installing NCO from conda even on compute nodes. So to me it is unsurprising that your issues are fixed. The problem is that mine with ESMF_RegridWeightGen in MPAS-Analysis likely remain, so I have to live with certain configurations being broken for now. |
For reference, this is how to create the versions lists/diffs @mahf708 shows in #475 (comment), #475 (comment), #475 (comment):
|
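The commands from that comment are not reproduced here. As a rough illustration only, one way to get an equivalent comparison is to save the `conda list` output of each environment to text and compare the parsed package tables. The helper names `parse_conda_list` and `compare_envs` are hypothetical, and the sample text is a short excerpt built from entries in the diffs above, not a full package list.

```python
def parse_conda_list(text):
    """Map package name -> (version, build) from saved `conda list` output."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank and header lines
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        name, version, build = fields[:3]
        packages[name] = (version, build)
    return packages


def compare_envs(old, new):
    """List (name, old_entry, new_entry) for packages that differ.

    An entry of None means the package is absent from that environment.
    """
    changes = []
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes.append((name, before, after))
    return changes


RC9_EXCERPT = """\
# packages in environment at /lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.0rc9_chrysalis:
cdms2 3.1.5 py310heeafeea_20 conda-forge
cfitsio 4.2.0 hd9d235c_0 conda-forge
zppy 2.3.0rc3 pyh51c0ceb_0 conda-forge/label/zppy_dev
"""
RC10_EXCERPT = """\
# packages in environment at /lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.0rc10_chrysalis:
cdms2 3.1.5 py310heeafeea_21 conda-forge
zppy 2.3.0rc5 pyh51c0ceb_0 conda-forge/label/zppy_dev
"""

changes = compare_envs(parse_conda_list(RC9_EXCERPT), parse_conda_list(RC10_EXCERPT))
for name, before, after in changes:
    print(name, before, "->", after)
```

Comparing by package name rather than by line keeps build-only changes, such as cdms2 going from build `_20` to `_21`, visible even when the version string is unchanged.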
Closing this pull request. Resolved by Unified |
Debug rc5 issues mentioned in #474.
This pull request will probably not be merged. Making it to easily share the debugging process.