RFC: Safer, extensible ﹫inbounds #8227

simonster · 2014-09-04T04:38:16Z

This implements my proposal from #7799 (comment). unsafe_getindex/unsafe_setindex! provide getindex/setindex! functionality without bounds checks, and @inbounds rewrites the AST to call the unsafe versions of these functions. @inbounds only affects the expression it's applied to, not all inlined function calls. @boundscheck still works and provides the old behavior in case there's a really good reason to use it. As a bonus, this variant of @inbounds could be made to pass through the value of the enclosed expression, ~~although at the moment I don't think the behavior is as expected for setindex! expressions~~.

This would have been a lot easier if it didn't require reimplementing the expansion of expressions containing ref into getindex/setindex! in Julia before array.jl is loaded. It may be a better idea to adjust the femtolisp code so that it can perform this expansion without expanding anything else. Being able to rewrite getindex/setindex! with a macro may also be useful for the transition to arrays as views.
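To make the idea concrete, here is a minimal sketch of an AST rewrite of this kind, written for current Julia rather than the 2014 code base. It is not the code in this PR: `rewrite_refs` and `@unchecked_refs` are hypothetical names, the two `unsafe_*` methods are toy definitions for `Vector` only, and the sketch ignores `end` inside indices, `+=` and friends, and quoted expressions.

```julia
# Toy unchecked accessors (assumption: Vector only; they lean on Base's
# existing @inbounds just to keep the sketch short).
unsafe_getindex(a::Vector, i::Int) = @inbounds a[i]
function unsafe_setindex!(a::Vector, v, i::Int)
    @inbounds a[i] = v
    return a
end

# Leaves (symbols, literals, line numbers) are returned unchanged.
rewrite_refs(ex) = ex

# Rewrite `a[i]` to `unsafe_getindex(a, i)` and `a[i] = v` to
# `unsafe_setindex!(a, v, i)`, recursing into nested expressions.
function rewrite_refs(ex::Expr)
    if ex.head === :(=) && ex.args[1] isa Expr && ex.args[1].head === :ref
        lhs = ex.args[1]
        target = rewrite_refs(lhs.args[1])
        idxs = map(rewrite_refs, lhs.args[2:end])
        val = rewrite_refs(ex.args[2])
        return Expr(:call, :unsafe_setindex!, target, val, idxs...)
    elseif ex.head === :ref
        return Expr(:call, :unsafe_getindex, map(rewrite_refs, ex.args)...)
    end
    return Expr(ex.head, map(rewrite_refs, ex.args)...)
end

# Only the expression handed to the macro is rewritten; functions it calls
# are left untouched.
macro unchecked_refs(ex)
    esc(rewrite_refs(ex))
end

v = [10, 20, 30]
y = @unchecked_refs v[2] + 1
@unchecked_refs v[3] = 7
```

Note that in this sketch the assignment form evaluates to whatever `unsafe_setindex!` returns (the array), not the assigned value, which is the kind of mismatch the struck-out caveat above refers to.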

JeffBezanson · 2014-09-04T05:21:35Z

I'm amazed you managed to reimplement that part of the expansion pass. Having that code duplicated is not good. It probably would have been easier to replace ref with unsafe_ref and do the rest by modifying the existing front-end code. It also might work to call expand from the macro, then replace calls.

I think this is also a sign that this macro is slightly sketchy (admittedly the existing @inbounds is pretty sketchy too). Replacing identifiers is inherently unhygienic, as in this contrived example:

let getindex(x) = 2x
    @inbounds getindex(...)
end

So I think there is still something subtly unsatisfying about this approach.

simonster · 2014-09-04T06:03:40Z

Yes, modifying the front-end is probably a better solution. Maybe we could have a custom_ref Expr that accepts the substitute getindex and setindex! in its args, so we could reuse it for a future array indexing backwards compatibility macro. I tried using expand, but splicing the resulting Exprs into the AST did not always seem to work. It also seemed like a bad idea because other macros might not expect an expanded AST.

This macro doesn't actually replace any identifiers; it just expands ref differently, and shouldn't do anything to your example. It is true that:

let getindex(x) = 2x
    @inbounds y = x[1]
end

would end up calling unsafe_getindex from outside the let block. It might be better to fall back to getindex if it's from a different scope than unsafe_getindex, if that's even possible. Or we could automatically define unsafe_getindex(x...) = getindex(x...) whenever getindex is redefined, although that doesn't seem so pretty. But I suspect code that redefines getindex and uses @inbounds is quite rare in practice, and at least the custom ref expansion is easy to reason about.

simonbyrne · 2014-09-04T13:44:03Z

It would be great if this pattern could be easily applied to other functions, as this sort of problem pops up in other cases. Two that come to mind:

@checked for replacing arithmetic operations (+,*, ...) by their checked equivalents (checked_mul, checked_add, ...). Or conversely, @unchecked if we end up having checked arithmetic by default.
Specifying output destinations for array operations. I've already had a stab at this with InplaceOps.jl
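As a rough illustration of the first suggestion, the following sketch rewrites binary `+`, `-` and `*` calls to the checked variants that current Julia provides in `Base.Checked`. The macro name follows the comment above; `CHECKED_OPS` and `to_checked` are hypothetical helpers, and unary or n-ary calls are deliberately left alone.

```julia
# Map from operator to its checked counterpart in Base.Checked.
const CHECKED_OPS = Dict(:+ => :checked_add, :- => :checked_sub, :* => :checked_mul)

# Leaves are returned unchanged.
to_checked(ex) = ex

# Replace binary arithmetic calls with their checked equivalents.
function to_checked(ex::Expr)
    args = map(to_checked, ex.args)
    if ex.head === :call && length(args) == 3 && haskey(CHECKED_OPS, args[1])
        f = GlobalRef(Base.Checked, CHECKED_OPS[args[1]])
        return Expr(:call, f, args[2], args[3])
    end
    return Expr(ex.head, args...)
end

macro checked(ex)
    esc(to_checked(ex))
end

six = @checked 2 * 3

overflowed = try
    @checked typemax(Int) + 1
    false
catch err
    err isa OverflowError
end
```

Like the indexing rewrite, this only touches operators that appear literally in the expression passed to the macro.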

simonster · 2014-09-05T04:56:59Z

Here's another crack at changing how ref is expanded, this time by replacing ref nodes with custom_ref and letting the front-end expand it. That approach is indeed much less code and seems to work. But I'm not all that confident in my Scheme programming abilities and I also don't understand the indentation convention. (Why both tabs and spaces?)

timholy · 2014-09-10T18:41:19Z

👍. I can't review the Scheme code changes, but I have no complaints about the Julia code.

Looking forward to having @inbounds work on Ranges!

ivarne · 2014-09-10T19:07:25Z

base/bitarray.jl

@@ -342,8 +342,9 @@ end
 ## Indexing: getindex ##

 function unsafe_bitgetindex(Bc::Vector{Uint64}, i::Int)
- return (Bc[@_div64(i-1)+1] & (uint64(1)<<@_mod64(i-1))) != 0
+ return @inbounds (Bc[@_div64(i-1)+1] & (uint64(1)<<@_mod64(i-1))) != 0


It's great if this no longer causes the function to return nothing.

simonster · 2014-09-11

Yup:

julia> x = [1, 2];

julia> @inbounds x[1]
1

vtjnash · 2014-09-12T01:46:38Z

What if these were to go into a special module? I may need to write a longer Julep for this, but here's my basic idea for it:

Instead of unsafe_getindex & friends, implement:

module Base
  baremodule Unsafe
    getindex(x...) = Base.getindex(x...)
    setindex!(x...) = Base.setindex!(x...)
    # unsafe versions go here
  end
  # the existing definitions remain here
end

And perhaps add a few new intrinsics to do the unsafe operations (or perhaps just enhance pointerref to handle multiple dimensions and accept an Array type)

If the user wanted, they could then access these directly, or with an importall

But in typical usage, @inbounds would just set a flag. Then in Base.resolve_globals and codegen.cpp::emit_var, this would select between looking in Base directly, or in Base.Unsafe.

But really, ideally, the following could probably be made just as efficient, without too much effort:

macro inbounds(expr)
    quote
        let getindex = Base.Unsafe.getindex, setindex! = Base.Unsafe.setindex!
            $(esc(expr))
        end
    end
end

Which would be really sweet, since it doesn't require any special handling.

simonster · 2014-09-12T02:18:07Z

@vtjnash That's an interesting approach. If we set a flag we could presumably choose to substitute Base.Unsafe.getindex only for Base.getindex and not a user-defined getindex function in another scope, thus avoiding @JeffBezanson's hygiene complaint above. We might also have to tweak the frontend to get x == @inbounds x. let would be cleanest in terms of implementation but it seems to have the same hygiene problem, unless you want to make let getindex = $(esc(getindex)) === Base.getindex ? Base.Unsafe.getindex : $(esc(getindex)) efficient too.

vtjnash · 2014-09-12T03:28:53Z

A slightly-less contrived example is:

module MyModule
  global getindex
  getindex(x...) = ...
  function f()
      @inbounds x[1] = 2
  end
end

But what I perhaps like most about the let block approach is that it requires absolutely nothing special added to the compiler, just a new, general inference pass for eliding let expressions. And as such, it is easy to document that it is doing nothing special. So, if you want the previous version to work in a customized manner, it's still completely transparent and simple:

module MyModule
    macro inbounds(expr)
        :( let getindex = my_getindex; $(esc(expr)); nothing; end )
    end
    # now @inbounds is something different
end

p.s. I might as well mention this too: I think the module approach also generalizes really nicely for other checked/unchecked operations (cough, arithmetic, cough). So you could rewrite the default arithmetic in Base to always do checked math, and then have an @unchecked_math flag that worked the same, but listed all of the relevant math operators (+,-,*,/,div,^,.+,.-,.*,./,.^)

p.p.s I think Unchecked might be a better name than Unsafe for the module, because it is more evocative of why it is potentially dangerous, than simply a generic warning.

p.p.s. while my approach may seem similar, note that it would have significantly different behavior in the following case, which may not be entirely desirable:

function f()
    @inbounds map(getindex, [[1],[2],[3]])
end

But then again, having the following operations calling different functions is perhaps a bit odd too:

function f()
    @inbounds begin
        let gi = getindex
            getindex(x) == gi(x) == x[]
        end
    end
end

p.p.p.s. I corrected my above @inbounds macro to have a result of nothing, like the existing macro

simonster · 2014-09-12T03:48:42Z

A problem with the let approach is that it creates a new scope for whatever's inside @inbounds, so you'd get:

@inbounds x = a[1]
x # undefined

Also, I don't think we want @inbounds to have a result of nothing. That seems like an undesirable consequence of the current implementation. I think we want it to pass through the result of whatever's inside it.
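The scoping behavior behind this objection can be checked directly. The sketch below uses current Julia and hypothetical function and variable names; it only contrasts a `let` block with a `begin` block and does not involve any `@inbounds` implementation.

```julia
# An assignment inside `let` creates a variable local to the let block,
# so the enclosing function never sees it.
function assign_in_let()
    a = [1, 2, 3]
    let
        x_in_let = a[1]
    end
    return @isdefined(x_in_let)
end

# A `begin` block introduces no new scope, so the variable survives.
function assign_in_begin()
    a = [1, 2, 3]
    begin
        x_in_begin = a[1]
    end
    return @isdefined(x_in_begin)
end
```

Under these assumptions `assign_in_let()` reports the variable as undefined while `assign_in_begin()` reports it as defined, which is why a `let`-based macro would change the meaning of `@inbounds x = a[1]`.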

simonster · 2014-09-22T20:36:42Z

So given the scoping issue with let, I think we have three options:

1. Use the strategy currently implemented in this PR, which only touches the frontend.
2. Add a new type of Expr that tells the codegen/type inference to substitute calls to f with calls to g, so that @inbounds $ex is Expr(:substitutecall, {Base.getindex, Base.unsafe_getindex, Base.setindex!, Base.unsafe_setindex!}, ex) or something like that.
3. Add a new type of block (with or without syntax) that behaves like let but does not introduce a new scope, and make unscoped_let getindex = Base.unsafe_getindex, setindex! = Base.unsafe_setindex! fast.

vtjnash · 2014-09-22T22:34:58Z

Options two and three are the same. I'll let Jeff make the final call.

simonster · 2014-09-22T22:58:25Z

@vtjnash At least as I'm envisioning them, they're not quite the same: option 2 would only substitute functions for functions, while option 3 would work at the level of identifiers. Option 2 would avoid the hygiene issue (we could substitute Base.unsafe_getindex for Base.getindex only, and not for getindex from some other scope) while option 3 wouldn't, but option 3 might be useful in some other context.

timholy · 2014-11-20T13:10:37Z

Bump. Now that we have most of the other critical elements of #7941, this (in my opinion) is the key outstanding issue. It's not obvious to me that one wants to turn on getindex(A, I...) = sub(A, I...) until we have bounds-checking.

timholy · 2015-02-25T18:49:47Z

Given the interest in moving towards finalizing 0.4, @JeffBezanson, input/code review is needed here. In my opinion this is the main obstacle for returning views from indexing in 0.4. Most of the work is already done: in addition to the SubArray and multidimensional indexing revamp, there's #9150 waiting in the wings. But we need to make a couple of decisions before we can move forward.

Two comments:

I'm not clear on whether the Base.Unsafe solution is truly "user-extensible," meaning that a package author could also make use of it.
The question of whether @inbounds should recurse is now slightly more interesting in light of Should reshaping a SubArray produce another SubArray? #9874 (comment). If both ReshapeArray and SubArray check bounds upon construction, then checking bounds on the "outer" composition is a guarantee that it's safe all the way down. SubArray{Array} is of course going to be very common; presumably at least one ReshapeArray (so 3 layers in total) will not be exactly rare. However, one might need a boundschecked_upon_construction trait to be safe in general.

simonster · 2015-03-17T16:46:31Z

Some more thoughts:

In RFC: Give AbstractArrays smart and performant indexing behaviors for free #10525 (comment) @mbauman proposes getindex(::Type{BoundsCheckOn}, x, I...), which has some elegance, but see the caveats in my post below in that issue.
Yet another way to express this is to have an Expr that says that we should first look for imported bindings from module A in module B, so Base.getindex could instead get looked up as Base.Unsafe.getindex. This is another variant of option 2 above but might be a bit less ugly, and could also be useful for @fastmath. But this doesn't solve the issue of potential code duplication between getindex and Unsafe.getindex.

mbauman · 2015-03-17T16:49:52Z

Moving the discussion here, referencing #10525 (comment):

> The first problem is that it makes defining getindex uglier for cases where performance doesn't matter

Couldn't we have ref create calls with BoundsCheckOn() by default? Here's what Base julia could define:

abstract BoundsCheck
immutable BoundsCheckOn <: BoundsCheck; end
immutable BoundsCheckOff <: BoundsCheck; end
getindex(::BoundsCheck, x, I...) = getindex(x, I...)

Then users can opt into getting bounds check information when defining their own methods. It makes calling getindex by name more complicated, but I think it could be ok… it's already a bit of an expert syntax over x[I...].

> The second problem is that, at least as far as I can tell, there's still a lowering issue.

I was thinking of that part as solved by the code you've already written here. :)
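For readers on current Julia, the dispatch idea above might look roughly like the following. This is a sketch in modern syntax (`abstract type`/`struct` instead of `abstract`/`immutable`); `lookup` and `Squares` are hypothetical stand-ins for `getindex` and a user-defined array type, chosen so that nothing in Base is overloaded.

```julia
abstract type BoundsCheck end
struct BoundsCheckOn <: BoundsCheck end
struct BoundsCheckOff <: BoundsCheck end

# Toy container: element i is i^2 for 1 <= i <= n.
struct Squares
    n::Int
end

# Unchecked method: trusts the caller.
lookup(::BoundsCheckOff, s::Squares, i::Int) = i^2

# Checked method: validates, then forwards to the unchecked one.
function lookup(::BoundsCheckOn, s::Squares, i::Int)
    1 <= i <= s.n || throw(BoundsError(s, i))
    return lookup(BoundsCheckOff(), s, i)
end

# What plain indexing syntax would lower to by default in this scheme.
lookup(s::Squares, i::Int) = lookup(BoundsCheckOn(), s, i)

s = Squares(3)
checked_value = lookup(s, 2)
unchecked_value = lookup(BoundsCheckOff(), s, 10)   # no error, caller's responsibility
```

An `@inbounds`-style macro would then only need to swap `BoundsCheckOn()` for `BoundsCheckOff()` at the call site.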

simonster · 2015-03-17T17:01:20Z

@mbauman With the fallback, it seems that any method defined as getindex(::BoundsCheckOn, x, I...) would take precedence over all methods defined as getindex(x, I...). Does that still help with #10525? Wouldn't you still need to define all general-purpose methods with bounds checks as getindex(x, I...) rather than getindex(::BoundsCheckOn, x, I...) to avoid shadowing user definitions?
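The precedence concern can be reproduced with a small sketch in current Julia. All names here (`idx`, `AbstractContainer`, `MyContainer`) are hypothetical stand-ins, with `idx` playing the role of `getindex`.

```julia
abstract type BoundsCheck end
struct BoundsCheckOn <: BoundsCheck end

abstract type AbstractContainer end
struct MyContainer <: AbstractContainer end

# Fallback: drop the bounds-check argument and retry.
idx(::BoundsCheck, x, I...) = idx(x, I...)

# A general-purpose checked method written against the new signature.
idx(::BoundsCheckOn, x::AbstractContainer, I...) = :generic_checked

# A user definition in the traditional two-argument style.
idx(::MyContainer, i::Int) = :user_defined

c = MyContainer()
via_lowering = idx(BoundsCheckOn(), c, 1)   # what x[i] would call by default
direct_call = idx(c, 1)
```

Because the general-purpose method is more specific than the fallback, the call carrying `BoundsCheckOn()` never reaches the user's two-argument definition.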

mbauman · 2015-03-17T17:25:31Z

The reason I liked having this as an aspect of multiple dispatch is not for defining different methods, but rather as a way to allow the compiler to elide bounds checks. ~~But it seems like the optimization that I thought would occur isn't happening~~. Here's how I would write #10525 with this proposal:

Bool(::BoundsCheckOn) = true
Bool(::BoundsCheckOff) = false
getindex(b::BoundsCheck, x::AbstractArray, I...) = _getindex(b, linearindexing(x), x, I...)
function _getindex(b::BoundsCheck, ::LinearFast, x, I::Union(Real, AbstractArray, Colon)...)
    Bool(b) && checkbounds(x, I...) # I initially tried isa(b, BoundsCheckOn), but of course that doesn't work
    … # compute i::Int from I...
    getindex(BoundsCheckOff(), x, i) # If the user hasn't defined this, it will fall back to the next method
end
_getindex(::BoundsCheck, ::LinearFast, x, I::Int) = getindex(x, I) # And now this will throw the appropriate no method error!

Edit: ~~Oh, I see what you mean now~~. I think the same general strategy that I'm already using will work just fine with this setup. I've updated my code sketch.

mbauman · 2015-03-17T18:57:22Z

Oooh, right you are. This will cause a strange middle-ground where getindex(::MyArray, ::Int) = 1 works but getindex(::MyArray, ::Float) = 2 doesn't. I don't like that. :\

simonster force-pushed the sjk/boundscheck branch from 4516065 to 457ce1b Compare September 5, 2014 04:46

simonster force-pushed the sjk/boundscheck branch from 457ce1b to c81021a Compare September 5, 2014 20:01

Safer, extensible ﹫inbounds

7cb11d5

simonster force-pushed the sjk/boundscheck branch from c81021a to 7cb11d5 Compare September 5, 2014 20:03

ivarne reviewed Sep 10, 2014

timholy mentioned this pull request Sep 22, 2014

Towards array nirvana #7941

Closed

15 tasks

timholy mentioned this pull request Sep 28, 2014

New array views based on stagedfunctions #8501

Merged

vtjnash mentioned this pull request Sep 29, 2014

WIP: checked integer conversions #8420

Merged

jiahao force-pushed the master branch 2 times, most recently from 2ef98c5 to 0388647 Compare October 5, 2014 00:57

ivarne mentioned this pull request Oct 11, 2014

rand of AbstractArray #8649

Closed

jiahao force-pushed the master branch from 6c7c7e3 to 1a4c02f Compare October 11, 2014 22:06

timholy mentioned this pull request Oct 22, 2014

sub2ind bounds checking #8746

Closed

jiahao force-pushed the master branch from cdde4df to 7fdc860 Compare October 28, 2014 04:20

simonbyrne mentioned this pull request Nov 10, 2014

In-place "bang" versions of functions. JuliaLinearAlgebra/AppleAccelerate.jl#1

Closed

MikeInnes force-pushed the master branch from 5c60996 to b1c3df3 Compare November 14, 2014 17:07

vtjnash mentioned this pull request Dec 19, 2014

Implement @fastmath #9406

Merged

This was referenced Jan 14, 2015

@inbounds loses expression value #9774

Closed

＠fastmath does not handle +=, *=, etc. #9778

Closed

timholy mentioned this pull request Feb 1, 2015

SubArray bounds checking (or not): bug or feature? #4044

Closed

timholy mentioned this pull request Feb 23, 2015

Check bounds upon construction of SubArrays. #10296

Merged

mbauman mentioned this pull request Mar 9, 2015

utfxx vs. UTFxx #10456

Closed

timholy mentioned this pull request Mar 17, 2015

RFC: Give AbstractArrays smart and performant indexing behaviors for free #10525

Merged

mbauman mentioned this pull request Mar 25, 2015

extensible bounds checking removal #7799

Closed

mbauman mentioned this pull request Jun 25, 2015

WIP: A traits-based user-extensible @inbounds #11867

Closed

sjkelly mentioned this pull request Dec 4, 2015

Improve codegen for getindex(::Array, ::Face) JuliaGeometry/GeometryTypes.jl#29

Merged

simonster closed this Jan 29, 2016

simonster mentioned this pull request Jan 29, 2016

Arraypocalypse Now and Then #13157

Closed

27 tasks

tkelman deleted the sjk/boundscheck branch March 22, 2016 12:38
