Invalid units are treated inconsistently after saving and re-loading cubes #3359

Closed
schlunma opened this issue Jul 23, 2019 · 4 comments

@schlunma
Contributor

Hey guys, I think I found a bug regarding invalid units (of coordinates).

Consider this example:

import iris
from netCDF4 import Dataset

# Open the sample file in append mode and give latitude invalid units
# (note: this modifies the sample file in place)
ds = Dataset('iris-sample-data/iris_sample_data/sample_data/A1B_north_america.nc', mode='a')
ds.variables['latitude'].units = 'invalid units'
ds.close()

# Load file with iris
cube = iris.load_cube('iris-sample-data/iris_sample_data/sample_data/A1B_north_america.nc')
print(cube.coord('latitude').units)  # gives 'unknown'

# Save it and load it again
iris.save(cube, 'test.nc')
cube_new = iris.load_cube('test.nc')
print(cube_new.coord('latitude').units)  # gives '1'

So after saving and re-loading a file with invalid coordinate units, iris changes the units from 'unknown' to '1'. Is this behavior intended? It is not very intuitive, and it may prevent concatenation of newly created cubes with saved ones.

@bjlittle bjlittle self-assigned this Jul 29, 2019
@bjlittle bjlittle added this to To do in Iris v2.3.0 via automation Jul 29, 2019
@bjlittle bjlittle added this to the v2.3.0 milestone Jul 29, 2019
@bjlittle
Member

@schlunma Thanks for sharing this 👍

Round-tripping should give consistent results, so let us dig deeper to determine exactly what is going on. Thanks.

@lbdreyer lbdreyer moved this from To do to In progress in Iris v2.3.0 Sep 19, 2019
@stephenworsley
Contributor

What seems to be happening is that, when loading, an invalid units string is stored on the coordinate as an attribute called 'invalid_units'. When saving, that string is then written to the file as the attribute 'invalid_units' rather than 'units', which was the attribute it originally came from. The saved coordinate variable therefore has no 'units' attribute at all, as opposed to an invalid one, and is treated differently on the next load.
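To make the sequence above concrete, here is a toy model of it in plain Python. It is only a sketch: the dicts stand in for netCDF variable attributes and for coordinates, and `VALID_UNITS`, `load_coord` and `save_coord` are hypothetical names for illustration, not part of iris.

```python
# Toy model only: plain dicts stand in for netCDF variable attributes and for
# coordinates. None of these names exist in iris.
VALID_UNITS = {"degrees_north", "degrees_east", "m", "K", "1"}  # hypothetical whitelist


def load_coord(nc_attrs):
    """Mimic the loading behaviour described in this comment."""
    attrs = dict(nc_attrs)
    raw = attrs.pop("units", None)
    if raw is None:
        # No 'units' attribute in the file: the coordinate gets units of '1'.
        return {"units": "1", "attributes": attrs}
    if raw not in VALID_UNITS:
        # Invalid units: keep the string as an 'invalid_units' attribute.
        attrs["invalid_units"] = raw
        return {"units": "unknown", "attributes": attrs}
    return {"units": raw, "attributes": attrs}


def save_coord(coord):
    """Mimic saving: attributes are written, 'unknown' units are not."""
    nc_attrs = dict(coord["attributes"])
    if coord["units"] != "unknown":
        nc_attrs["units"] = coord["units"]
    return nc_attrs


original = {"units": "invalid units"}
first = load_coord(original)
saved = save_coord(first)
second = load_coord(saved)

print(first["units"])   # unknown
print(saved)            # {'invalid_units': 'invalid units'}
print(second["units"])  # 1
```

The second load sees a variable with no 'units' attribute, so it falls into the first branch and reports '1', which matches the output in the original report.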

The reasoning behind the current behaviour is that we will always end up saving a file which is CF compliant. This currently takes precedence over consistent loading and saving of non-CF compliant files. That said, if there were a need for non-CF compliant information to be preserved, we could consider changing the behaviour of iris in future versions, or adding an option to preserve such information.

The following options may be worth considering (ordered from major to minor changes):

  • Changing the default behaviour of iris when handling invalid units.

  • Giving an option to preserve invalid units (and perhaps other non-CF compliant information) when loading/saving.

  • Interpreting a unitless coordinate which has an 'invalid_units' attribute as having 'unknown' units rather than units of '1'.

  • Adding a note to the documentation warning that non-CF compliant information may be changed between loading and saving.
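As a rough illustration of the third option, the decision on load could look something like the sketch below. This is an assumption about the shape of the logic, not existing iris code: `load_units` is a hypothetical helper, it takes the variable's attributes as a plain dict, and it assumes that validation of a 'units' attribute that is present happens elsewhere.

```python
def load_units(nc_attrs):
    """Hypothetical helper: choose the units for a coordinate on load.

    nc_attrs is a plain dict of the variable's netCDF attributes.
    """
    if "units" in nc_attrs:
        # A 'units' attribute exists; checking its validity is out of scope here.
        return nc_attrs["units"]
    if "invalid_units" in nc_attrs:
        # No 'units', but an earlier load recorded invalid units: stay 'unknown'.
        return "unknown"
    # Genuinely unitless variable.
    return "1"


print(load_units({"invalid_units": "invalid units"}))  # unknown
print(load_units({}))                                  # 1
print(load_units({"units": "degrees_north"}))          # degrees_north
```

With a rule like this, a file saved from a cube with 'unknown' units would load back as 'unknown' as long as the 'invalid_units' attribute survives the save.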

@pp-mo
Member

pp-mo commented Sep 26, 2019

At least some of this possibly relates to #3394?

@lbdreyer lbdreyer removed this from In progress in Iris v2.3.0 Oct 1, 2019
@lbdreyer lbdreyer removed this from the v2.3.0 milestone Oct 1, 2019
@stephenworsley
Contributor

Closed by #3711.

This resolves the round trip cube -> NetCDF -> cube: the units on both cubes will now be "unknown".
It does not address the difference when round-tripping NetCDF -> cube -> NetCDF, as this is due to changes iris imposes to remain CF compliant. We consider this expected behaviour and not a bug.
