Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

type-instability in sum (and other reductions) #6069

Closed
stevengj opened this issue Mar 6, 2014 · 20 comments
Closed

type-instability in sum (and other reductions) #6069

stevengj opened this issue Mar 6, 2014 · 20 comments
Labels
performance Must go faster

Comments

@stevengj
Copy link
Member

stevengj commented Mar 6, 2014

The sum function is extremely slow for several types because of a type instability whenever typeof(zero(T)+zero(T)) != T. For example, the following implementation is almost 100x faster than sum for an Array{Int8}:

function sum2{T}(a::AbstractArray{T})
    s = zero(T)+zero(T)
    for x in a
        s += x
    end
    return s
end

Also, the sum function is type-unstable for the same reason, e.g.:

julia> typeof(sum(Int8[1]))
Int8

julia> typeof(sum(Int8[1,2]))
Int64

Recommendation: use typeof(zero(T)+zero(T)) for the accumulation in sum and friends, and return zero(T)+zero(T) for sums of empty arrays rather than zero(T). (Or is there some other way to infer the type of arithmetic results that is faster/cleaner than zero(T)+zero(T)?)

The same problem shows up in other reductions, e.g. typeof(reduce(+, Int8[1,2])) != typeof(reduce(+, Int8[1])).

cc: @jiahao

@StefanKarpinski
Copy link
Sponsor Member

That seems like the correct solution.

@carlobaldassi
Copy link
Member

What about types for which zero(T) is not defined? E.g. sum(Vector{Int}[[1,2,3],[4,5,6]]). Note that this has come up in the past.

@StefanKarpinski
Copy link
Sponsor Member

If the container isn't empty, we should use the first element to start this. Returning zero(T) + zero(T) is only necessary when there are no elements. That means that there will be a no method error for zero when someone tries to reduce over an empty container with no zero, but that's the best we can do.

@stevengj
Copy link
Member Author

stevengj commented Mar 6, 2014

What about if the container has only one element and zero(T) is not defined? Return a[1] + zero(a[1])?

@StefanKarpinski
Copy link
Sponsor Member

Hmm. Yes, that's a bit of a problem. We can use applicable but that feels pretty nasty.

@StefanKarpinski
Copy link
Sponsor Member

Maybe we should only use the zero(T) + zero(T) thing for numeric arrays and require that zero be defined for those. For loosely typed arrays, sum isn't going to be type stable anyway.

@stevengj
Copy link
Member Author

stevengj commented Mar 6, 2014

Is there any summable type T for which zero(x::T) is not defined?

@stevengj
Copy link
Member Author

stevengj commented Mar 6, 2014

It's a little disturbing that it so hard to make a simple function like sum type-stable...

@StefanKarpinski
Copy link
Sponsor Member

Well, the flip side in a statically typed system is that it's hard to even have a generic sum function – because it would have such a complicated type signature. Of course, this could all be made simpler by making the sum of two values of the same type always be of that type, but I'm not convinced that's a good idea yet – and it's a rather different question.

@jiahao
Copy link
Member

jiahao commented Mar 6, 2014

I found a counterexample:

julia> A=Matrix[randn(1,1), randn(2,2)]
2-element Array{Array{T,2},1}:
 [1.76797]                              
 [-0.0870554 0.307046
 1.15972 -0.36789]

julia> reduce(.*, A)
2x2 Array{Float64,2}:
 -0.153911   0.542849
  2.05035   -0.650418

julia> zero(A[1]).*zero(A[2])
2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

julia> T = eltype(A)
Array{T,2}

julia> zero(T)
ERROR: no method convert(Type{Array{T,2}}, Int64)
 in zero at abstractarray.jl:225

@stevengj
Copy link
Member Author

stevengj commented Mar 6, 2014

@jiahao, what about for sum? Type-stability in reduce is much harder, I agree.

@jiahao
Copy link
Member

jiahao commented Mar 6, 2014

I found an example which is type stable but for which zero(T) is undefined, so forcing the type to be zero(T)+zero(T) would fail:

julia> using Color; A=[LCHuv(rand(3)...) for i=1:10];

julia> sum(A)
LCHuv(3.9230728595082516,5.225114269676179,5.640624576230978)

julia> zero(eltype(A))
ERROR: no method convert(Type{LCHuv}, Int64)
 in zero at operators.jl:146

@jiahao
Copy link
Member

jiahao commented Mar 6, 2014

Arguably, this is just a case of a missing zero method, though.

@jiahao
Copy link
Member

jiahao commented Mar 6, 2014

Here is a more insidious one:

julia> A=[Bidiagonal(rand(5), rand(4), randbool()) for i=1:10] #Generate random lower and upper bidiagonals
10-element Array{Bidiagonal{Float64},1}:
 5x5 Bidiagonal{Float64}:
 diag: 0.859519  0.195208  0.300049  0.783655  0.584966
  sub: 0.825225  0.562332  0.149394  0.177572  
 5x5 Bidiagonal{Float64}:
 diag: 0.0359303  0.721596  0.17091  0.924345  0.96887
 super: 0.288528  0.138715  0.960179  0.75766    
...

julia> eltype(A)
Bidiagonal{Float64} (constructor with 1 method)

julia> sum(A)
5x5 Tridiagonal{Float64}:
 5.05453  2.20272  0.0      0.0      0.0    
 3.36481  5.43659  2.00832  0.0      0.0    
 0.0      2.56735  5.3427   3.2014   0.0    
 0.0      0.0      2.60014  5.09088  3.53915
 0.0      0.0      0.0      2.21059  5.00687

julia> zero(eltype(A)) #Lower or upper bidiagonal?
ERROR: no method convert(Type{Bidiagonal{Float64}}, Int64)
 in zero at operators.jl:146

@stevengj
Copy link
Member Author

stevengj commented Mar 6, 2014

You only need zero(::T), not zero(T), for summing non empty collections.

zero(Bidiagonal(rand(5), rand(4), randbool())) gives no method similar(Bidiagonal{Float64}, Type{Float64}, (Int64,Int64)), but that seems like a bug...we should really have similar for Bidiagonal.

@stevengj
Copy link
Member Author

stevengj commented Mar 6, 2014

Perhaps the goal should be to have sum (etc.) work on as many types as possible (without throwing unexpected no method similar exceptions), but be potentially type-unstable. However, we should make it type-stable and fast for as many types as feasible.

One option might be this:

  • define a reducetype(T::Type) method that yields the desired output type for reductions, falling back to the identity reducetype(T::Type) = T.
  • for various important types, e.g. Number subtypes and arrays thereof, we can define reducetype appropriately, e.g. reducetype{T<:Number}(::Type{T}) = typeof(zero(T)+zero(T)).
  • use reducetype to determine the type of zero to return for sums of empty arrays, and to determine the type to return for sums of 1-element arrays.
  • for arrays with 2 or more elements, initialize the sum with a[1]+a[2] (and similarly for other reductions), which should make everything type-stable even if there is no zero.

@stevengj
Copy link
Member Author

I've pushed a possible fix (stevengj@7f553ff) to a branch.

stevengj added a commit to stevengj/julia that referenced this issue Mar 10, 2014
@vtjnash
Copy link
Sponsor Member

vtjnash commented Mar 10, 2014

cross-reference discussion in #5588

@lindahua
Copy link
Contributor

+1 for reducetype. I guess sumtype might be more appropriate though, as for other kinds of reduction (other than sum), the result types can be different.

@stevengj
Copy link
Member Author

@lindahua, my latest version turns out not to need reducetype.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

No branches or pull requests

6 participants