Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further optimize extract() for regularly spaced arrays #653

Closed
alex-s-gardner opened this issue May 1, 2024 · 4 comments
Closed

Further optimize extract() for regularly spaced arrays #653

alex-s-gardner opened this issue May 1, 2024 · 4 comments

Comments

@alex-s-gardner
Copy link
Contributor

alex-s-gardner commented May 1, 2024

Manually calculating indices or using Interpolations is 3-5x faster than using extract()

Using extract()

using Rasters
using GeoInterface
dd = 0.001
x = X(0:dd:90);
y = Y(0:dd:90);
A = Raster(zeros(UInt8, y, x));

#number of points
n = 1e7;
f = GeoInterface.PointTuple.([(X=rand()*90, Y = rand()*90) for i in 1:n]);

@time pts = extract(A, f, atol=dd/2; index=true, geometry=false);
8.664601 seconds (50 allocations: 228.884 MiB, 0.93% gc time, 0.06% compilation time)

Using calculated index

@time begin
  r = round.(Int64,(pt_x .- x[1])./step(x)).+1;
  c = round.(Int64, (pt_y .- y[1]) ./ step(y)).+1;
  
  idx = [CartesianIndex(a,b) for (a,b) in  zip(r,c)];
  
  A[idx];
end;
1.690515 seconds (29.37 k allocations: 316.762 MiB, 11.33% gc time, 2.62% compilation time)

Using Interpolations

using Interpolations
@time begin    
    interp = Constant();
    itp = Interpolations.extrapolate(scale(Interpolations.interpolate(A.data, BSpline(interp)), A.dims[1].val.data, A.dims[2].val.data), NaN);
    val = itp.(pt_x, pt_y);
end;
2.974507 seconds (21 allocations: 694.414 MiB, 2.11% gc time)
@rafaqz
Copy link
Owner

rafaqz commented May 1, 2024

On main? And why not Intervals??

Ok I'm getting similar timings. The last PR was just general performance fixes not the Regular case, seems that is still needed.

(also note round is incorrect here as it rounds towards zero, floor is what you want...)

@rafaqz
Copy link
Owner

rafaqz commented May 2, 2024

I think really this is a DimensionalData.jl selector performance issue. We shouldn't have to think about Contains being slow on Regular here or write any custom code, it should just work and be as fast as just diving by step and calling floor.

@alex-s-gardner
Copy link
Contributor Author

100% agree... I don't think Rasters should need a specialized extract function... all of that should be handled by DD

@rafaqz
Copy link
Owner

rafaqz commented May 2, 2024

Ok I made a DD issue to track this and fix at the source.

@rafaqz rafaqz closed this as completed May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants