Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Floats as keys #13

Closed
grahamas opened this issue Mar 16, 2020 · 7 comments
Closed

Floats as keys #13

grahamas opened this issue Mar 16, 2020 · 7 comments

Comments

@grahamas
Copy link

Currently, the two types that cannot be keys are <: CartesianIndices and <: Real. My principal use case for this library is to indicate spatiotemporal dimensions of an array, meaning my keys are Float64. However, Float64 <: Real, so these aren't allowed. This issue is meant to explore if 1) the <: Real restriction is appropriate, and if so, 2) how to make sure it's still as easy as possible to use Float axes.

First, exploring reasons for the <: Real restriction:

  1. So that 1 and 1.0 and UInt(1) all act the same. Since they don't act the same on the underlying array (1.0 is not a valid array index), I'm not totally sold on this reason. I think the true restriction should be on types that could be valid indices for the underlying array, i.e. <: CartesianIndices and <: Integer.
  2. In general, Float64 doesn't work! I made the trivial change in is_key_type from <: Real to <: Integer. However, when you do this, you quickly notice a more fundamental problem: exact equality doesn't work for Floats; you inevitably get a spurious error due to a Float64 key not exactly matching the value in the axis range.

To address the latter, how difficult would it be to introduce the analogous to ==?

Even better, could there be an operator that behaves differently for different axis types? I haven't yet looked into how you're implementing the unary operator in the indexing. (I have a separate issue of how to broadcast unary == across a tuple for indexing, where the tuple might be heterogeneous both Float and categorical).

@Tokazama
Copy link
Owner

Tokazama commented Mar 17, 2020

I think you make a valid point on the use of Float64 as a key. You're also hitting on what I think is the biggest problem right now in the package, there probably needs to be a formal set of traits for determining key or indices based indexing. I've been playing with this already so I can put up some code here to see what people think.

isapprox will have this behavior in the latest version of Julia (see here)

@Tokazama
Copy link
Owner

Tokazama commented Mar 18, 2020

I'm working on a branch now that will hopefully solve this more completely. I'm going to try and make the problem a little bit more concrete, because I've looked into this before and it's not necessarily straightforward.

  • Issue 1: I think for the majority of cases == would work fine, even with floats. The only way I see this being an issue is when the float value used for indexing is calculated and floating point errors could be an issue. Otherwise it's probably a bug. So the first issue is basically should be the default (because it's not a question of whether I will be an option b/c it will be).

  • Issue 2: isapprox is a lot slower than ==. See this.

julia> using BenchmarkTools

julia> y = 1.0
1.0

julia> y = 1.0^C

julia> x = y = 1.0
1.0

julia> @btime ==($x, $y)
  0.035 ns (0 allocations: 0 bytes)
true

julia> @btime isapprox($x, $y)
  3.364 ns (0 allocations: 0 bytes)
true

So we should avoid it whenever we can.

  • Issue 3: How approximate do we want to be. isapprox allows you to set this but I don't have a clear idea of how to cleanly provide users with the option to adjust how "approximate". Nevermind, I just checked and you can do something like isapprox(i, atol=2).

@Tokazama
Copy link
Owner

Tokazama commented Mar 19, 2020

#14 Should accomplish all of this and more. I'll leave it open for the next 24-48 hours in case someone spots something, but it's looking pretty nice at this point.

@grahamas
Copy link
Author

grahamas commented Mar 21, 2020

Here's a simple example where floating point imprecision causes an indexing error:

using AbstractIndices # dev-ing Float-keys branch
A = AxisIndicesArray(rand(5), 0.1:0.1:0.5)
A[0.3]

(sorry I disappeared for a while)

@Tokazama
Copy link
Owner

Oh, I see what the problem is. There's some weird thing going on with StepRangeLen in StaticRanges. Okay, I'll try to get it fixed this weekend.

@Tokazama
Copy link
Owner

Actually, it was super simple to fix. You can probably expect a release by tomorrow morning.

@Tokazama
Copy link
Owner

Resolved with #14

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants