
Array indexing with UInt is ~5x slower than with Int #17103

Closed
bicycle1885 opened this issue Jun 25, 2016 · 14 comments
Labels
domain:arrays, performance (Must go faster), status:help wanted (Indicates that a maintainer wants help on an issue or pull request)

Comments

@bicycle1885
Member

This is the benchmark script:

function mysum_int(data)
    sum = 0
    for i in Int(1):Int(endof(data))
        @inbounds sum += data[i]
    end
    return sum
end

function mysum_uint(data)
    sum = 0
    for i in UInt(1):UInt(endof(data))
        @inbounds sum += data[i]
    end
    return sum
end

let
    size = 128 * 2^20
    data = zeros(UInt8, size)
    println("mysum_int")
    mysum_int(data)
    @time mysum_int(data)
    @time mysum_int(data)
    @time mysum_int(data)

    println("mysum_uint")
    mysum_uint(data)
    @time mysum_uint(data)
    @time mysum_uint(data)
    @time mysum_uint(data)
end

And the result:

~/.j/v/BGZFStreams (master|✚1…) $ julia-dev access.jl
mysum_int
  0.022742 seconds (152 allocations: 9.028 KB)
  0.020606 seconds (4 allocations: 160 bytes)
  0.023985 seconds (4 allocations: 160 bytes)
mysum_uint
  0.118161 seconds (4 allocations: 160 bytes)
  0.116512 seconds (4 allocations: 160 bytes)
  0.118702 seconds (4 allocations: 160 bytes)

I don't know the internals, but I guess the index value is converted to Int before the access and the inexactness check imposes the cost. data[Int(i)] results in the same level of slowdown, but data[reinterpret(Int, i)] doesn't show any slowdown.
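To make the workaround described above concrete, here is a sketch of the UInt loop using the reinterpret trick. This is my illustration, not code from the issue; it uses length(data) rather than the 0.5-era endof so it runs on current Julia, and reinterpret is a bit-level cast, so it adds no runtime check.

```julia
# Sketch of the workaround: reinterpret the UInt index as Int so the
# checked Int conversion in indexing is skipped. Assumes 1-based Vector.
function mysum_uint_reinterpret(data)
    s = 0
    for i in UInt(1):UInt(length(data))
        # reinterpret is a no-op at the machine level; @inbounds skips
        # the bounds check as in the original benchmark.
        @inbounds s += data[reinterpret(Int, i)]
    end
    return s
end
```
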


julia> versioninfo()
Julia Version 0.5.0-dev+4934
Commit 1a1a9a6* (2016-06-25 02:49 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i5-4288U CPU @ 2.60GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.7.1 (ORCJIT, haswell)

@eschnett
Contributor

Examining @code_native [1.0][UInt(1)] shows that this code indeed has two branches. I assume the reason they cannot be combined is that Julia needs to throw two different errors depending on whether the unsigned index is too large for a signed integer, or too large for the array.

The elegant solution is probably to define all getindex functions with Integer as the index type, so that bounds checking sees the unsigned integer directly, avoiding the conversion.
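The suggestion above can be sketched as a single-branch unsigned bounds check. This is a hypothetical helper (not the actual Base definition): because the index is unsigned, i - 1 wraps around for i == 0, so one unsigned comparison covers both the zero case and the too-large case, and the reinterpret afterwards is free.

```julia
# Hypothetical sketch: bounds-check the UInt index directly, then
# reinterpret to Int, so only one branch is needed instead of two.
@inline function getindex_unsigned(A::Vector, i::UInt)
    # i - 1 wraps to typemax(UInt) when i == 0, so this single unsigned
    # comparison rejects both i == 0 and i > length(A).
    i - UInt(1) < UInt(length(A)) || throw(BoundsError(A, i))
    @inbounds return A[reinterpret(Int, i)]
end
```
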

@kshyatt kshyatt added the performance Must go faster label Jun 25, 2016
@ViralBShah
Member

Cc @timholy

@timholy
Sponsor Member

timholy commented Jun 26, 2016

This can probably be tackled while we're in RC, or for 0.5.x.

@timholy
Sponsor Member

timholy commented Jun 27, 2016

I just took a peek at this, and doing it properly is well beyond the scope of my interests and what I have time for (finishing off #16973), so I'm marking this as "up for grabs." The getindex functions support Real, not just Integer, so we're already plenty general. However, for general AbstractArrays it seems very likely that to_index and bounds checking might duplicate some conversion steps, and since these conversions are checked, they will be expensive.

In the specific case of Array, it's much more direct because of these definitions, but it might involve some changes to array.c and codegen.cpp since arrayref and the bounds-checking are defined/performed in C.

What's the use-case for indexing with unsigned integers, anyway? Arrays require Int sizes (they can't support arrays bigger than typemax(Int)), and unsigneds are evil in many ways.

@timholy timholy added the status:help wanted Indicates that a maintainer wants help on an issue or pull request label Jun 27, 2016
@bicycle1885
Member Author

The actual use-case would be limited; that would be why we have never noticed this performance problem until now. I noticed this problem while profiling a decompressor which uses an unsigned integer to specify a byte offset. I can avoid the performance degradation by using the reinterpret(Int, index) trick, but I thought this should be fixed in the upstream.

@oscardssmith
Member

Good news and bad news on this. In v0.5.0, the good: UInt performance is only 20% slower than Int. The bad: I think that is only because Int had a 4x regression.

@StefanKarpinski
Sponsor Member

How could we possibly have had that bad a regression on array indexing and not have noticed it?

@yuyichao
Contributor

yuyichao commented Dec 7, 2016

I cannot reproduce the regression on Int.

@oscardssmith
Member

That's possible; I was comparing direct times and assuming that there wouldn't be a 4x difference between computers (silly me). In that case, should this be closed, since the difference is now fairly negligible?

@yuyichao
Contributor

yuyichao commented Dec 7, 2016

I can still reproduce the Int, UInt difference.

@mbauman
Sponsor Member

mbauman commented Dec 8, 2016

The key is this:

data[Int(i)] results in the slowdown at the same level but data[reinterpret(Int, i)] doesn't show any slowdown.

If that's still the case, then the issue isn't in the complexity of the indexing fallbacks; rather, it's simply that we choose to use a checked conversion to Int. We may be able to change that for UInt, since the values that would be interpreted as negative are also those that would throw a bounds error, but this isn't the case for smaller integers.

This is particularly problematic since the optimal order of operations between checking bounds and index conversion is dependent on the index types themselves.
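The safety argument in the comment above can be checked directly: any UInt value too large for a checked Int conversion reinterprets to a negative Int, which every 1-based bounds check rejects anyway, so skipping the check cannot silently admit a bad index. A small illustration (my own, not from the thread):

```julia
# The smallest UInt that overflows a checked conversion to Int.
i = typemax(Int) % UInt + UInt(1)   # 0x8000000000000000 on 64-bit

# Checked conversion throws...
overflow_throws = try Int(i); false catch e; e isa InexactError end

# ...while reinterpret yields a negative Int, which any bounds check
# against a 1-based array rejects, so no error case is lost.
j = reinterpret(Int, i)             # == typemin(Int), always < 1
```
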

@bicycle1885
Member Author

This is particularly problematic since the optimal order of operations between checking bounds and index conversion is dependent on the index types themselves.

Yes, indexes into an array are always converted to Int before the actual indexing; that's the problem here. I'm not sure whether it is possible, but it would be better to flip the order of operations so that any integer can be used as an index.

@mbauman
Sponsor Member

mbauman commented Aug 30, 2017

Update:

For AbstractArrays in general, we've now fixed the double-conversion problems Tim mentions in #17103 (comment). We absolutely do need to convert to Int in those cases, since that's what we require library creators to implement. Library creators, of course, are always free to define optimized indexing for any index type they like.

For Array, however, the rest of that comment still stands. It'd be a decent amount of work for this limited situation.
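As an illustration of the "library creators are free to define optimized indexing" point above, here is a hypothetical wrapper type (the name UIntFriendlyVector and this design are invented for the example) that adds a specialized UInt method alongside the required Int one:

```julia
# Hypothetical example: a custom array type can add a fast-path method
# for UInt indices while still implementing the required Int method.
struct UIntFriendlyVector{T} <: AbstractVector{T}
    data::Vector{T}
end

Base.size(v::UIntFriendlyVector) = size(v.data)

# The Int method is what the AbstractArray interface requires.
Base.getindex(v::UIntFriendlyVector, i::Int) = v.data[i]

# Specialized UInt method: reinterpret instead of a checked conversion;
# the inner Vector still bounds-checks the (now Int) index.
Base.@propagate_inbounds Base.getindex(v::UIntFriendlyVector, i::UInt) =
    v.data[reinterpret(Int, i)]
```
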

@KristofferC
Sponsor Member

I think this is fixed as far as we can with #29784.
