to_zarr removes global attributes in destination dataset #8755
Comments
Thanks for opening your first issue here at xarray! Be sure to follow the issue template! |
I'm not a maintainer, but ... not sure if this is a bug:
Sorry if I'm adding noise to the issue, just commenting because I'm looking now at using |
I believe this is a bug. The Modifying existing Zarr stores section of the documentation outlines ways to modify existing zarr stores and to limit what data is modified using
|
What's the desired behaviour here? Never update |
I'm not knowledgeable of the zarr API, but my desired behavior when appending is to not update the root (Dataset) attrs. @pnorton-usgs's example shows a use case where it seems they would like to append to and overwrite conflicts of root attrs, though this would require more processing and might not be desirable for use cases with large metadata. However this is separate from dealing with variable (DataArray) attrs. When appending a new variable using |
Looking briefly over the code, it seems like we'd just need to change xarray/xarray/backends/zarr.py Lines 650 to 651 in 8a23e24 to include mode not in ["a", "a-"].
Variable attrs seem to be written here, which we can leave untouched: xarray/xarray/backends/zarr.py Line 779 in 8a23e24
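A rough sketch of the guard being suggested, in plain Python rather than xarray's actual code. The names `should_write_root_attrs` and `store_root_attrs` are hypothetical and exist only for illustration; the thread only gives the `mode not in ["a", "a-"]` check, so how other modes such as "r+" are treated in the real condition is not modelled here.

```python
# Hypothetical illustration only: this is NOT the code in xarray/backends/zarr.py.
# It models the proposed rule "skip the root attrs write when appending".
APPEND_MODES = ("a", "a-")


def should_write_root_attrs(mode: str) -> bool:
    """Return True unless the write mode is one of the append modes."""
    return mode not in APPEND_MODES


def store_root_attrs(existing: dict, new: dict, mode: str) -> dict:
    """Return the root (Dataset-level) attrs the store would hold after a write."""
    if should_write_root_attrs(mode):
        # Non-append write: the incoming attrs replace the stored ones.
        return dict(new)
    # Append modes: leave the stored root attrs untouched.
    return dict(existing)
```

Under this rule, an append with mode "a" or "a-" keeps the destination's root attrs as they were, while variable attrs are handled separately and are unaffected.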
|
With Personally I find |
@slevang the difference is with
For an When the file only has one variable/array, such as after performing |
Yeah I think that's a reasonable distinction. Totally agree |
Agree 💯 @slevang. Seems like what is needed is a more nuanced Zarr insert / update / upsert / merge utility with a richer syntax. |
Your mention of merge gets me thinking in terms of the standard xarray ops everyone knows well. |
What happened?
Adding new variables to a zarr dataset with to_zarr() always removes the existing global attributes. New global attributes in the source dataset are not always added to the destination dataset depending on how to_zarr() is called.
What did you expect to happen?
I would expect that existing global attributes would always be preserved. If there are new global attributes I would expect them to be added to the existing global attributes instead of replacing all existing global attributes.
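The expected behaviour can be sketched in plain Python, independent of xarray or zarr. The function name `merge_global_attrs` is hypothetical, and letting the new value win when a key exists on both sides is an assumption: the description above only says that existing attributes are kept and new ones are added.

```python
def merge_global_attrs(existing: dict, new: dict) -> dict:
    """Combine stored global attrs with incoming ones without dropping any.

    Assumption (not stated in the issue): on a key conflict the incoming
    value replaces the stored one.
    """
    merged = dict(existing)  # start from what is already in the store
    merged.update(new)       # add new keys; overwrite conflicting ones
    return merged


stored = {"title": "demo", "history": "created"}
incoming = {"history": "appended", "source": "model"}
print(merge_global_attrs(stored, incoming))
# {'title': 'demo', 'history': 'appended', 'source': 'model'}
```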
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
No response
Anything else we need to know?
No response
Environment
INSTALLED VERSIONS
commit: None
python: 3.11.0 | packaged by conda-forge | (main, Jan 14 2023, 12:26:40) [Clang 14.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1
xarray: 2024.1.1
pandas: 2.2.0
numpy: 1.26.4
scipy: 1.12.0
netCDF4: 1.6.0
pydap: installed
h5netcdf: 1.3.0
h5py: 3.8.0
Nio: None
zarr: 2.17.0
cftime: 1.6.3
nc_time_axis: None
iris: None
bottleneck: 1.3.7
dask: 2024.2.0
distributed: 2024.2.0
matplotlib: 3.8.2
cartopy: 0.22.0
seaborn: None
numbagg: None
fsspec: 2023.12.2
cupy: None
pint: 0.23
sparse: None
flox: None
numpy_groupies: None
setuptools: 69.0.3
pip: 24.0
conda: None
pytest: 8.0.0
mypy: None
IPython: 8.21.0
sphinx: None