Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove bounds checks in iterate(::Tuple) #28847

Merged
merged 1 commit into from
Aug 24, 2018

Conversation

haampie
Copy link
Contributor

@haampie haampie commented Aug 23, 2018

This fix seems to help address #28844 partially to get compile-time optimizations

Before:

julia> g() = 9000 in ntuple(identity, Val(32))
julia> @btime g() # evaluated at runtime
  11.124 ns (0 allocations: 0 bytes)
false

After this PR:

julia> g() = 9000 in ntuple(identity, Val(32))
julia> @btime g() # known at compile-time
  1.541 ns (0 allocations: 0 bytes)
false

It also adds @nospecialize like in the rest of the function definitions in tuple.jl.

Potentially there's a breaking change here, since iterate((1,2), -1) === nothing now, whereas it used to throw a BoundsError.

@KristofferC
Copy link
Sponsor Member

Potentially there's a breaking change here, since iterate((1,2), -1) === nothing now, whereas it used to throw a BoundsError.

Seems like a bugfix, c.f. Array behavior?

@haampie
Copy link
Contributor Author

haampie commented Aug 23, 2018

Well, if it's considered a bug fix we could merge this, right? :)

For reference, beyond the MAX_TUPLE_SPLAT limit there's some improvement with this implementation of iterate:

julia> g() = 9000 in ntuple(identity, Val(512))
julia> @btime g() # before
  324.355 ns (0 allocations: 0 bytes)
false
julia> @btime g() # after this PR
  235.632 ns (0 allocations: 0 bytes)
false

@KristofferC KristofferC added the performance Must go faster label Aug 23, 2018
@maleadt
Copy link
Member

maleadt commented Aug 23, 2018

#26247 (comment)

@KristofferC
Copy link
Sponsor Member

Could you elaborate how that comment is related.

@haampie
Copy link
Contributor Author

haampie commented Aug 23, 2018

@maleadt thx for the reference, I missed that. Fortunately with the new iteration protocol we can have safe & inbounds iteration in the case of ::Array (see #27386). Same applies here with tuples!

@maleadt
Copy link
Member

maleadt commented Aug 23, 2018

Oh nice, I missed that discussion. This is great for GPU support, too 🙂

@mbauman
Copy link
Sponsor Member

mbauman commented Aug 23, 2018

Yes, this seems worth doing.

@haampie
Copy link
Contributor Author

haampie commented Aug 23, 2018

@mbauman we can't use that modulus-trick from #27386 here because rem is not yet defined in tuple.jl and the definition of rem in int.jl requires iteration over tuples ^^.

@mbauman
Copy link
Sponsor Member

mbauman commented Aug 23, 2018

Oh that's a fun bootstrapping puzzle. Is it possible that the iteration required would only be on two-tuples or less? Would just a few definitions for small tuples be enough to get us over the hump? E.g., something like:

iterate(::Tuple{}, state::Nothing=nothing) = nothing
iterate(t::Tuple{Any}) = (@inbounds t[1], nothing)
iterate(t::Tuple{Any}, ::Nothing) = nothing
iterate(t::Tuple{Any,Any}) = (@inbounds t[1], true)
iterate(t::Tuple{Any,Any}, s::Bool) = s ? (@inbounds t[2], false) : nothing

@haampie
Copy link
Contributor Author

haampie commented Aug 23, 2018

Not really, iteration is over all bit integer types + bool: (Int8, Int16, Int32, Int64, Int128, UInt8, UInt16, UInt32, UInt64, UInt128, Bool).

Another solution which seems tempting is to avoid indexing:

iterate(::Tuple{}) = nothing
iterate(::Tuple, ::Tuple{}) = nothing
iterate(it::Tuple) = first(it), tail(it)
iterate(::Tuple, st::Tuple) = first(st), tail(it)

but then the state is type unstable, and it does not work well :(.

At any rate, the implementation of this PR seems good enough to me.

@mbauman
Copy link
Sponsor Member

mbauman commented Aug 23, 2018

I know Nanosoldier is down at the moment, but it'd be great to even just run the Tuple benchmarks locally.

@Keno
Copy link
Member

Keno commented Aug 23, 2018

You don't need rem. You can inline the definition of rem for your particular int type directly.

@haampie
Copy link
Contributor Author

haampie commented Aug 24, 2018

Right. To get it compiling I used

iterate(@nospecialize(t::Tuple), i::Int=1) = bitcast(UInt, i) - bitcast(UInt, 1) < bitcast(UInt, length(t)) ? (@inbounds t[i], i+1) : nothing

Turns out this performs exactly the same, so I'll stick to the current changes in this PR.

@haampie
Copy link
Contributor Author

haampie commented Aug 24, 2018

@mbauman: iteration of tuples is not benchmarked in BaseBenchmarks.jl it seems -- all loops over tuples use indexing. Also if I run the benchmarks locally they're a bit flaky.

Let me just summarize results from my benchmark of a linear scan:

# Linear scan through Array of 32 / 256 / 1024 elements (reference)
26.789 ns, 159.156 ns, 583.630 ns

# Linear scan through Tuple of 32 / 256 / 1024 elements
29.265 ns, 142.273 ns, 583.619 ns # this branch
31.882 ns, 198.612 ns, 768.126 ns # julia 1.0

@KristofferC KristofferC mentioned this pull request Aug 24, 2018
@KristofferC KristofferC merged commit 35c67e5 into JuliaLang:master Aug 24, 2018
KristofferC pushed a commit that referenced this pull request Aug 25, 2018
KristofferC pushed a commit that referenced this pull request Sep 8, 2018
KristofferC pushed a commit that referenced this pull request Sep 8, 2018
@KristofferC
Copy link
Sponsor Member

This caused a regression for #28764 (comment).

MWE

perf_setindex!(A, val, inds) = setindex!(A, val, inds...)

using BenchmarkTools
s = 2
A = rand(Float64, ntuple(one, s)...)
y = one(eltype(A))
i = length(A)
@btime perf_setindex!($(fill!(A, y)), $y, $i)

@haampie
Copy link
Contributor Author

haampie commented Sep 11, 2018

Thanks for pointing that out! I'll see if I can understand this.

@KristofferC
Copy link
Sponsor Member

Fixed by #29133.

@haampie
Copy link
Contributor Author

haampie commented Sep 11, 2018

wow, that was impressively fast 😅

@vtjnash
Copy link
Sponsor Member

vtjnash commented Sep 13, 2018

Good change—but probably shouldn't have been marked for back-porting. Need to re-open this as a new PR now, and look into #29147 in conjunction

@KristofferC
Copy link
Sponsor Member

KristofferC commented Sep 13, 2018

I think this should be marked for backporting since otherwise iterate threw with negative indices which is a bug, c.f.

julia> iterate([1], 0)
(nothing)

@vtjnash
Copy link
Sponsor Member

vtjnash commented Sep 13, 2018

iterate does not define what result should be returned if you pass an arbitrary value to state, but we also generally should not be backporting changes (without a clear need)

KristofferC pushed a commit that referenced this pull request Feb 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants