Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

writing to netcdf changes dimensions and can result in ArgumentError #461

Closed
tiemvanderdeure opened this issue Jun 7, 2023 · 2 comments
Closed

Comments

@tiemvanderdeure
Copy link
Contributor

I have a Raster with regular mapped Intervals on each axis in a tif file.

When writing this Raster to a netcdf and loading it back in, the points look different and I sometimes get an error when filtering using Contains().

This also works when just generating a Raster myself like so:

using DimensionalData.LookupArrays, Rasters

xlen = 801
ylen = 1001
x = X(Projected(LinRange{Float64}(-20.05, 59.95, xlen), crs = EPSG(4326), mappedcrs = EPSG(4326), sampling = Intervals(Start())))
y = Y(Projected(LinRange{Float64}(59.95, -40.05, ylen), crs = EPSG(4326), mappedcrs = EPSG(4326), sampling = Intervals(Start())))

my_raster = ones(x, y)

write("my_raster.nc", my_raster)

raster_from_nc = Raster("my_raster.nc")

raster_from_nc[Y = Contains(7.35)] # ArgumentError: No interval contains 7.35

my_raster[Y = Contains(7.35)] # just to check 7.35 should be there

I can't find much of a pattern in which values throw an error and which don't. 7.15, 7.25, and 7.45 don't, but 7.55 does.

Writing to .tif does not give the same problem.

@rafaqz
Copy link
Owner

rafaqz commented Jun 7, 2023

Floating point error is creeping in somehow.

We save explicit position bounds for netcdf, its not really needed for floating point but very much is for dates. Maybe we can tweak this to be more type specific.

Its embedded in the Explicit span:

s = span(lookup(raster_from_nc, X))

You can see the values of the start and end of intervals don't match:

julia> s.val[:, 525:527][1]
7.550000000000001

julia> s.val[:, 525:527][4]
7.549999999999994

Somehow the code that is generating this vector is pushing the floating point errors in opposite directions, probably by starting the second range from a position offset by 1.

The good news is that is fixable. The bad news is that no matter what we do you may get the wrong answer from Contains here due to that 00000000000001 being the result of innacuracies from floating points in the range (e.g. internally in LinRange, multiplying the step size by the index plus the offset does not hit 7.55, it hits 7.550000000000001).

What a pain.

Also: there is an attempt at fixing some of this problem using StepRange rather than LinRange for tifs, as step range has som TwicePrecision magic to imrove accuracy of range values. #429. But we hit the other kind of range floating point error, which is having the wrong length. And thats much worse than innaccurate lookups. If you want to have a go at writing an algorithm to detect this and try getting the best of both worlds...

@rafaqz
Copy link
Owner

rafaqz commented Jun 11, 2023

Should be working on Rasters 0.8.1 and DD 0.24.14

@rafaqz rafaqz closed this as completed Jun 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants