writing to netcdf changes dimensions and can result in ArgumentError #461

tiemvanderdeure · 2023-06-07T11:09:35Z

I have a Raster with regular mapped Intervals on each axis in a tif file.

When writing this Raster to a netcdf and loading it back in, the points look different and I sometimes get an error when filtering using Contains().

This also works when just generating a Raster myself like so:

using DimensionalData.LookupArrays, Rasters

xlen = 801
ylen = 1001
x = X(Projected(LinRange{Float64}(-20.05, 59.95, xlen), crs = EPSG(4326), mappedcrs = EPSG(4326), sampling = Intervals(Start())))
y = Y(Projected(LinRange{Float64}(59.95, -40.05, ylen), crs = EPSG(4326), mappedcrs = EPSG(4326), sampling = Intervals(Start())))

my_raster = ones(x, y)

write("my_raster.nc", my_raster)

raster_from_nc = Raster("my_raster.nc")

raster_from_nc[Y = Contains(7.35)] # ArgumentError: No interval contains 7.35

my_raster[Y = Contains(7.35)] # just to check 7.35 should be there

I can't find much of a pattern in which values throw an error and which don't. 7.15, 7.25, and 7.45 don't, but 7.55 does.

Writing to .tif does not give the same problem.

The text was updated successfully, but these errors were encountered:

rafaqz · 2023-06-07T13:57:13Z

Floating point error is creeping in somehow.

We save explicit position bounds for netcdf, its not really needed for floating point but very much is for dates. Maybe we can tweak this to be more type specific.

Its embedded in the Explicit span:

s = span(lookup(raster_from_nc, X))

You can see the values of the start and end of intervals don't match:

julia> s.val[:, 525:527][1]
7.550000000000001

julia> s.val[:, 525:527][4]
7.549999999999994

Somehow the code that is generating this vector is pushing the floating point errors in opposite directions, probably by starting the second range from a position offset by 1.

The good news is that is fixable. The bad news is that no matter what we do you may get the wrong answer from Contains here due to that 00000000000001 being the result of innacuracies from floating points in the range (e.g. internally in LinRange, multiplying the step size by the index plus the offset does not hit 7.55, it hits 7.550000000000001).

What a pain.

Also: there is an attempt at fixing some of this problem using StepRange rather than LinRange for tifs, as step range has som TwicePrecision magic to imrove accuracy of range values. #429. But we hit the other kind of range floating point error, which is having the wrong length. And thats much worse than innaccurate lookups. If you want to have a go at writing an algorithm to detect this and try getting the best of both worlds...

rafaqz · 2023-06-11T22:59:50Z

Should be working on Rasters 0.8.1 and DD 0.24.14

rafaqz closed this as completed Jun 11, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

writing to netcdf changes dimensions and can result in ArgumentError #461

writing to netcdf changes dimensions and can result in ArgumentError #461

tiemvanderdeure commented Jun 7, 2023

rafaqz commented Jun 7, 2023 •

edited

Loading

rafaqz commented Jun 11, 2023

writing to netcdf changes dimensions and can result in ArgumentError #461

writing to netcdf changes dimensions and can result in ArgumentError #461

Comments

tiemvanderdeure commented Jun 7, 2023

rafaqz commented Jun 7, 2023 • edited Loading

rafaqz commented Jun 11, 2023

rafaqz commented Jun 7, 2023 •

edited

Loading