[Bug]: open_dataset should handle missing bounds on ORCA grid more gracefully #284

Closed
jypeter opened this issue Jul 28, 2022 · 4 comments · Fixed by #281
Labels
type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors.

jypeter commented Jul 28, 2022

What happened?

[ Problem initially reported by @oliviermarti ]

I opened a file with data on an ORCA grid, with no bounds available in the file, and got a long traceback instead of a friendly warning. (The file uses one of the ORCA grids, but the problem would probably be the same for data on any non-rectilinear grid.)

Generating a file that will reproduce the traceback

Get a clean IPSL areacello file from ESGF. This file can be opened with open_dataset without any problem.

wget https://vesg.ipsl.upmc.fr/thredds/fileServer/cmip6/CMIP/IPSL/IPSL-CM6A-LR/piControl/r1i2p1f1/Ofx/areacello/gn/v20190319/areacello_Ofx_IPSL-CM6A-LR_piControl_r1i2p1f1_gn.nc

Remove the references to the bounds, and remove the bounds variables

ncatted -a bounds,,d,, areacello_Ofx_IPSL-CM6A-LR_piControl_r1i2p1f1_gn.nc areacello_no_bounds_attributes.nc

ncks -vareacello,nav_lon,nav_lat areacello_no_bounds_attributes.nc  areacello_no_bounds_at_all.nc
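
For anyone who prefers to script the two NCO steps above, here is a minimal Python sketch. The helper names `build_strip_bounds_commands` and `strip_bounds` are hypothetical (they are not part of NCO or xcdat); the command-line arguments are the ones shown above, and the sketch assumes `ncatted` and `ncks` are on the PATH and that the areacello file has already been downloaded.

```python
import shutil
import subprocess
from typing import List

SRC = "areacello_Ofx_IPSL-CM6A-LR_piControl_r1i2p1f1_gn.nc"


def build_strip_bounds_commands(src: str, no_attrs: str, no_bounds: str) -> List[List[str]]:
    """Return the two NCO command lines from the steps above as argv lists."""
    return [
        # Delete the 'bounds' attribute from every variable.
        ["ncatted", "-a", "bounds,,d,,", src, no_attrs],
        # Keep only the data variable and its two coordinate variables.
        ["ncks", "-v", "areacello,nav_lon,nav_lat", no_attrs, no_bounds],
    ]


def strip_bounds(src: str = SRC) -> None:
    """Run both commands in order, stopping at the first failure."""
    for tool in ("ncatted", "ncks"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH; install NCO first")
    commands = build_strip_bounds_commands(
        src,
        "areacello_no_bounds_attributes.nc",
        "areacello_no_bounds_at_all.nc",
    )
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling `strip_bounds()` in the directory containing the downloaded file should leave `areacello_no_bounds_at_all.nc` next to it.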

Reproducing the long traceback

Open the file with cdms2 and open_dataset

>>> import cdms2
>>> import xcdat as xc
>>> nb_file = './areacello_no_bounds_at_all.nc'
>>> f_cdms2 = cdms2.open(nb_file)
>>> f_cdms2.listvariables()
['areacello', 'nav_lat', 'nav_lon']

>>> ds_nb = xc.open_dataset(nb_file)
Traceback (most recent call last):
  File "/home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/lib/python3.8/site-packages/xcdat/bounds.py", line 178, in get_bounds
    bounds_key = coord_var.attrs["bounds"]
KeyError: 'bounds'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/lib/python3.8/site-packages/xcdat/bounds.py", line 145, in add_missing_bounds
    self.get_bounds(axis)
  File "/home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/lib/python3.8/site-packages/xcdat/bounds.py", line 180, in get_bounds
    raise KeyError(
KeyError: "The coordinate variable 'nav_lon' has no 'bounds' attr. Set the 'bounds' attr to the name of the bounds data variable."

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/lib/python3.8/site-packages/xcdat/bounds.py", line 178, in get_bounds
    bounds_key = coord_var.attrs["bounds"]
KeyError: 'bounds'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/lib/python3.8/site-packages/xcdat/bounds.py", line 220, in add_bounds
    self.get_bounds(axis)
  File "/home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/lib/python3.8/site-packages/xcdat/bounds.py", line 180, in get_bounds
    raise KeyError(
KeyError: "The coordinate variable 'nav_lon' has no 'bounds' attr. Set the 'bounds' attr to the name of the bounds data variable."

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/lib/python3.8/site-packages/xcdat/dataset.py", line 95, in open_dataset
    ds = _postprocess_dataset(ds, data_var, center_times, add_bounds, lon_orient)
  File "/home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/lib/python3.8/site-packages/xcdat/dataset.py", line 482, in _postprocess_dataset
    dataset = dataset.bounds.add_missing_bounds()
  File "/home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/lib/python3.8/site-packages/xcdat/bounds.py", line 147, in add_missing_bounds
    self._dataset = self.add_bounds(axis, width)
  File "/home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/lib/python3.8/site-packages/xcdat/bounds.py", line 225, in add_bounds
    dataset = self._add_bounds(axis, width)
  File "/home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/lib/python3.8/site-packages/xcdat/bounds.py", line 269, in _add_bounds
    raise ValueError("Cannot generate bounds for multidimensional coordinates.")
ValueError: Cannot generate bounds for multidimensional coordinates.

We can get rid of the traceback by overriding the default value of add_bounds

>>> ds_nb = xc.open_dataset(nb_file, add_bounds=False)
>>>
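
Until a fix is released, the workaround can be wrapped so that a casual user gets a warning instead of the traceback. This is only a sketch on the user side, not how xcdat handles it internally: `open_with_bounds_fallback` is a hypothetical helper, it matches on the error message text from the traceback above, and it assumes the opener accepts an `add_bounds` keyword as `xc.open_dataset` does in 0.3.0.

```python
import warnings
from typing import Any, Callable


def open_with_bounds_fallback(opener: Callable[..., Any], path: str, **kwargs: Any) -> Any:
    """Call opener(path, **kwargs); if bounds cannot be generated, warn and
    retry once with add_bounds=False."""
    try:
        return opener(path, **kwargs)
    except ValueError as err:
        # Only handle the specific failure seen in the traceback above.
        if "multidimensional coordinates" not in str(err):
            raise
        warnings.warn(
            f"Could not generate bounds for {path!r} ({err}) "
            "Retrying with add_bounds=False.",
            UserWarning,
            stacklevel=2,
        )
        retry_kwargs = {**kwargs, "add_bounds": False}
        return opener(path, **retry_kwargs)


# Intended usage (requires xcdat, so it is left as a comment):
#   import xcdat as xc
#   ds_nb = open_with_bounds_fallback(xc.open_dataset, nb_file)
```

Passing the opener as an argument keeps the helper independent of the installed xcdat version; any other `ValueError` is re-raised unchanged.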

What did you expect to happen?

I expect a behavior that will not frighten the user away!

It was not too hard for me to identify what the problem was, and find the add_bounds=False option, but you can't really expect that from a casual user or beginning intern who has only read an example notebook.

open_dataset should be able to detect this missing data/information, print a nice warning, and suggest the add_bounds=False workaround.
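
To illustrate the kind of check I have in mind, here is a rough sketch. It is not xcdat code: both function names are hypothetical, and the check relies only on each coordinate exposing `ndim` and `attrs`, which xarray coordinates do. The stand-in object at the end mimics the stripped file, assuming `nav_lon` and `nav_lat` are decoded as 2-D coordinates.

```python
import warnings
from types import SimpleNamespace
from typing import Any, List


def multidim_coords_without_bounds(ds: Any) -> List[str]:
    """Names of coordinates with more than one dimension and no 'bounds' attr."""
    return sorted(
        str(name)
        for name, coord in ds.coords.items()
        if coord.ndim > 1 and "bounds" not in coord.attrs
    )


def warn_about_missing_bounds(ds: Any) -> List[str]:
    """Emit one friendly warning instead of failing; return the affected names."""
    names = multidim_coords_without_bounds(ds)
    if names:
        warnings.warn(
            f"Bounds are missing for multidimensional coordinates {names} and "
            "cannot be generated automatically; consider add_bounds=False.",
            UserWarning,
            stacklevel=2,
        )
    return names


# Stand-in for the stripped file (hypothetical object, nothing is read from disk).
fake_ds = SimpleNamespace(
    coords={
        "nav_lon": SimpleNamespace(ndim=2, attrs={}),
        "nav_lat": SimpleNamespace(ndim=2, attrs={}),
    }
)
```

With a real dataset, `warn_about_missing_bounds(ds)` could be called on the result of `xc.open_dataset(nb_file, add_bounds=False)` before deciding whether to add bounds for the remaining 1-D axes.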

Minimal Complete Verifiable Example

Available in the "What happened?" section

Relevant log output

No response

Anything else we need to know?

No response

Environment

conda list | grep xcdat
xcdat                     0.3.0              pyhd8ed1ab_0    conda-forge


>>> xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27)
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.45.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.17.0
pandas: 1.2.3
numpy: 1.21.4
scipy: 1.7.0
netCDF4: 1.5.6
pydap: None
h5netcdf: 0.13.1
h5py: 2.10.0
Nio: None
zarr: 2.11.0
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.9.0
iris: 3.0.1
bottleneck: None
dask: 2021.02.0
distributed: 2021.02.0
matplotlib: 3.3.4
cartopy: 0.18.0
seaborn: 0.11.1
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 22.0.4
conda: None
pytest: 6.2.4
IPython: 7.21.0
sphinx: 3.5.2

jypeter added the type: bug label Jul 28, 2022
@pochedls
Collaborator

Thank you for reporting this - I was able to reproduce this issue, which is now fixed in #278. I believe the goal is to produce a minor release in the next week with this fix. I will ping this thread when that is done.

@pochedls
Collaborator

pochedls commented Aug 3, 2022

@jypeter - I updated to the main branch and tried your code and did not hit any issues opening the dataset – so this is resolved. I think the plan is to do a "patch release" in the next week to address this issue and a few others. @tomvothecoder - is that timeline correct?

pochedls closed this as completed Aug 3, 2022

@tomvothecoder
Collaborator

@pochedls You're correct, v0.3.1 (patch release) will be out in the next week. I'll make sure to keep our devs and testers posted.

@jypeter
Author

jypeter commented Aug 4, 2022

Thanks! I will update to the new version when it is available.
