-
Notifications
You must be signed in to change notification settings - Fork 283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve floating point representation in generation of regular grids that use a reference point and spacings #3559
Comments
I believe this issue triggered odd behaviour in Cartopy for me: SciTools/cartopy#2128 |
My problem pp-fields have lbrow: 325 The |
We have seen similar problems in various regional model output datasets. This is why CORDEX specifies that projection (CRS) parameters and coordinate data have to be saved as double precision. But despite this, it occasionally happens that the historic simulation is not perfectly consistent with one (or several) of the future projection simulations. They may have been run on different machines, or using different library or OS versions, or ... When I way back first stumbled on the problem that concatenation did not work as expected it was suggested that I should filter (round) the coordinate data. OK, this is a reasonable solution when the difference in double precision coordinate values amounts to nanometres in the real world (see this old google group comment). Another solution is to disregard the 2d latitude/longitude data and recreate this information from the model grid (in my case either But recently a colleague struggled with a historic and future part of an experiment that still could not be concatenated. After quite some data debugging we found out that the problem was that for one of the projection parameters the values in the two files differed at the double precision last significant bit. This was not at all easy to spot and only underlines that generally exact comparisons should not be used with floating point numbers. If I run the same example as in the first post, but change to float64 the |
Thanks @larsbarring, yes I think I've concluded that for my case the file format was the problem. The header of my file states that there are 325 rows with spacing of 0.5555556 degrees, so if Iris takes that at face value it will come out with a latitude range of slightly more than 180 degrees. It's not obvious to me what, if anything Iris should do instead. The file is generated by the Met Office Unified Model and so decisions about that format were made decades before we thought about using python! I don't know about the OP, but it seems likely that that was also Met Office UM data. I think we should remove the "Good First Issue" label as this is problem is clearly more complicated that was originally assumed. |
Some regular grids in Iris are generated from a reference point along a given coordinate direction along with an associated spacing.
The way this is currently implemented uses numpy.arange as follows:
iris/lib/iris/coords.py
Line 2293 in b2ce2a3
Where zeroth is the value prior to the first point value, step is the numeric difference between successive point values and count is the number of point values.
As discussed at https://stackoverflow.com/questions/47243190/numpy-arange-how-to-make-precise-array-of-floats (amongst other places), this results in grids that contain fractionally incorrectly represented floats. E.g. using zeroth = 352.84, step = 0.11, count = 146, it produces the following array:
array([352.95 , 353.06 , 353.17 , 353.28 , 353.39 , 353.5 , 353.61002, 353.72 , 353.83002, 353.94 , 354.05002, 354.16 , 354.27002, 354.38 , 354.49002, 354.6 , 354.71002, 354.82 , 354.93002, 355.04 , 355.15002, 355.26 , 355.37003, 355.48 , 355.59003, 355.7 , 355.81 , 355.92 , 356.03 , 356.14 , 356.25 , 356.36002, 356.47 , 356.58002, 356.69 , 356.80002, 356.91 , 357.02002, 357.13 , 357.24002, 357.35 , 357.46002, 357.57 , 357.68002, 357.79 , 357.90002, 358.01 , 358.12003, 358.23 , 358.34003, 358.45 , 358.56 , 358.67 , 358.78 , 358.89 , 359. , 359.11002, 359.22 , 359.33002, 359.44 , 359.55002, 359.66 , 359.77002, 359.88 , 359.99002, 360.1 , 360.21002, 360.32 , 360.43002, 360.54 , 360.65002, 360.76 ], dtype=float32)
One consequence of this is that two grids that should be identical cannot be manipulated together (e.g. subtract one from the other) if one is fully specified from a source file, and the other is generated in this way from a reference point and spacings.
A potential solution as mentioned in the above link is to use an approach involving numpy.linspace instead such as:
start = zeroth + step
end = zeroth + (count * step)
points = np.linspace(start, end, num=count, dtype=np.float32)
Using the same values of zeroth, step and count as above this produces the following array:
array([352.95, 353.06, 353.17, 353.28, 353.39, 353.5 , 353.61, 353.72, 353.83, 353.94, 354.05, 354.16, 354.27, 354.38, 354.49, 354.6 , 354.71, 354.82, 354.93, 355.04, 355.15, 355.26, 355.37, 355.48, 355.59, 355.7 , 355.81, 355.92, 356.03, 356.14, 356.25, 356.36, 356.47, 356.58, 356.69, 356.8 , 356.91, 357.02, 357.13, 357.24, 357.35, 357.46, 357.57, 357.68, 357.79, 357.9 , 358.01, 358.12, 358.23, 358.34, 358.45, 358.56, 358.67, 358.78, 358.89, 359. , 359.11, 359.22, 359.33, 359.44, 359.55, 359.66, 359.77, 359.88, 359.99, 360.1 , 360.21, 360.32, 360.43, 360.54, 360.65, 360.76])
The text was updated successfully, but these errors were encountered: