Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement accumulate and friends #702

Merged
merged 11 commits into from
Feb 18, 2020
Merged

Implement accumulate and friends #702

merged 11 commits into from
Feb 18, 2020

Conversation

tkf
Copy link
Member

@tkf tkf commented Dec 21, 2019

This PR implements accumulate, cumsum and cumprod for StaticArray.

It const-props with vectors:

julia> function demov()
           A = SA[1, 2]
           cumsum(A)
       end
demov (generic function with 1 method)

julia> @code_typed optimize=true demov()
CodeInfo(
1return $(QuoteNode([1, 3]))
) => SArray{Tuple{2},Int64,1,2}

Functions are not completely inlined with matrices. Benchmarks show no allocations:

julia> function demom()
           A = SA[1 2; 3 4]
           cumsum(A, dims=Val(1))
       end
demom (generic function with 1 method)

julia> @code_typed optimize=true demom()
CodeInfo(
1%1 = Base.add_sum::typeof(Base.add_sum)
│   %2 = invoke Base.var"#kw##accumulate"()($(QuoteNode((dims = Val{1}(),)))::NamedTuple{(:dims,),Tuple{Val{1}}}, StaticArrays.accumulate::typeof(accumulate), %1::typeof(Base.add_sum), $(QuoteNode([1 2; 3 4]))::SArray{Tuple{2,2},Int64,2,4})::SArray{Tuple{2,2},Int64,2,4}
└──      return %2
) => SArray{Tuple{2,2},Int64,2,4}

julia> @btime demom()
  7.390 ns (0 allocations: 0 bytes)
2×2 SArray{Tuple{2,2},Int64,2,4} with indices SOneTo(2)×SOneTo(2):
 1  2
 4  6

(edited by @c42f to add: fixes #730)

@coveralls
Copy link

coveralls commented Dec 21, 2019

Coverage Status

Coverage increased (+0.2%) to 81.975% when pulling 8db9ad6 on tkf:accumulate into 26d0e05 on JuliaArrays:master.

@c42f
Copy link
Member

c42f commented Jan 7, 2020

This looks pretty good to me at a quick read, though I should take a closer look. It's impressive that you can get away with this high level of abstraction.

@tkf
Copy link
Member Author

tkf commented Jan 7, 2020

I wrote too many foldls last year 😆

Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I took so long to look at this.

I think it's basically good, though I worry just a bit that it's very different from the rest of the code in this package and is asking rather a lot from the compiler ;-) But hey, we should arguably be moving away from using so many generated functions, so perhaps we should just go with it.

src/mapreduce.jl Show resolved Hide resolved
src/mapreduce.jl Outdated
data = _map(a, CartesianIndices(a)) do _, CI
D = _valof(dims)
I = Tuple(CI)
J = Base.setindex(I, 1, D)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already import Base.setindex elsewhere in deque.jl so it should be available. Should we just move that import to the main StaticArrays.jl file and use setindex unqualified here?

src/mapreduce.jl Outdated

struct _InitialValue end

_maybeval(dims::Integer) = Val(Int(dims))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's just me, but I can't help reading this as _maybe_eval :-/ Can we change to _maybe_val?

results = _reduce(
a,
dims,
(init = (similar_type(a, Union{}, Size(0))(), init),),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I noticed here is that for length-0 input, the result has eltype of Union{}. I can see why, but in other cases we do try to to have a "sensible" output eltype for convenience, eg SA{Int}[] .+ SA{Int}[] preserves the Int eltype via some hopefully-not-dubious trickery (#664). Is it possible to do something similar here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I noticed here is that for length-0 input, the result has eltype of Union{}.

I don't think it's possible to solve this in general as the output value depends on the function op. I think another (somewhat) sensible output is to use the eltype of input array. But this is not always the valid answer; e.g., accumulate(//, [1, 2]) (note that Base's accumulate(//, Int[]) magically returns Rational{Int64}[] as it uses promote_op).

As all static arrays (including MArray) are shape-immutable, I don't think it is so harmful to return Union{} element type.

My preference is:

  1. Union{}
  2. promote_op(op, eltype(input), eltype(input))
  3. eltype(input)

Copy link
Member

@c42f c42f Feb 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's possible to solve this in general as the output value depends on the function op.

This is what #664 hacks around for map by using return_type 😬 (but it's no worse than Base which does the same thing for broadcast and collect on empty containers). I've never been quite sure that's the right choice, but if it can't be made to work or is otherwise terrible I'm ok with Union{} :-) Prior to #664 we used Union{} for map and people mostly seemed ok with it.

Copy link
Member Author

@tkf tkf Feb 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. I misread the last half of your first comment.

But I think one difficulty of accumulate compared to map is that output "eltype" is kind of ill-defined (or maybe it's better to say length-dependent). It's conceivable to have "type sequence" like this

op(::A, ::A) :: B
op(::B, ::A) :: C
op(::C, ::A) :: D
op(::D, ::A) :: D  # finally "stabilized"

But, even though B, C, D are all different types, promote_type(B, C, D) may return type E that is a concrete type. It is also easy to have "oscillatory" output:

osc_op(::Nothing, ::Any) = missing
osc_op(::Missing, ::Any) = nothing

So, maybe we should use something like this?

function f(op, acc, x)
    T = typeof(acc)
    while true
        acc = op(acc, x)
        T = promote_type(T, typeof(acc))
        non_existing_value && return T
        # `non_existing_value` modeling truncation arbitrary-sized input.
    end
end

return_type(f, Tuple{typeof(op), typeof(init), eltype(a)})

Impressively, julia can figure out some non-trivial examples like Base._return_type(f, Tuple{typeof(+), Int, Union{Int,Missing}}) and Base._return_type(f, Tuple{typeof(osc_op), Missing, Int}).

I'm not sure if we need to go this far, though. Base is not doing this for accumulate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it's confusing to know what the empty case should do here and it's worse than map. With your fix we have

julia> accumulate((a,b)->(a+b)/2, SA[1,2])
2-element SArray{Tuple{2},Float64,1,2} with indices SOneTo(2):
 1.0
 1.5

julia> accumulate((a,b)->(a+b)/2, SA[1])
1-element SArray{Tuple{1},Int64,1,1} with indices SOneTo(1):
 1

julia> accumulate((a,b)->(a+b)/2, SA{Int}[])
0-element SArray{Tuple{0},Int64,1,0} with indices SOneTo(0)

It seems like we must make a somewhat arbitrary choice for the length-0 case with init. You've got

julia> accumulate((a,b)->(a+b)/2, SA{Int}[1], init=0)
1-element SArray{Tuple{1},Float64,1,1} with indices SOneTo(1):
 0.5

julia> accumulate((a,b)->(a+b)/2, SA{Int}[], init=0)
0-element SArray{Tuple{0},Float64,1,0} with indices SOneTo(0)

Overall do you feel like this is an improvement on using Union? I do like that cumsum now preserves numeric eltypes and I feel like that could be helpful for users for the same reason that "fixing" map was.

We do have oddities like

julia> cumsum(Int8[1])
1-element Array{Int64,1}:
 1

julia> cumsum(SA{Int8}[1])
1-element SArray{Tuple{1},Int8,1,1} with indices SOneTo(1):
 1

julia> cumsum(SA{Int8}[1,2])
2-element SArray{Tuple{2},Int64,1,2} with indices SOneTo(2):
 1
 3

I think that inconsistency between here and Base might be harmless though, and it's certainly not clear how to "fix" it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall do you feel like this is an improvement on using Union?

That's a tough question... 😅 Personally, I think we should be using Union{} eltype everywhere when it's not possible to figure out eltype without relying on the compiler. However, we don't have a good toolset and coding convention for dealing with Union{} eltype ("mutate-or-widen" style, even at the surface API level).

But I think the consistency of the API within a package and with respect to Base matters. As map is already using return_type, overall, I think using return_type seems to be a decent solution here.

src/mapreduce.jl Show resolved Hide resolved
src/mapreduce.jl Show resolved Hide resolved
src/mapreduce.jl Show resolved Hide resolved
@c42f
Copy link
Member

c42f commented Feb 17, 2020

Looks great. I'd still like to see whether we can get everything inlined for the dims case, but as discussed that can be done separately.

Is this good to go from your end?

@tkf
Copy link
Member Author

tkf commented Feb 17, 2020

Yeah, I did everything I wanted to add. By the way, I added the inference-based eltype handling but I forgot to comment that. Did you check it?

Re: dims case, I think it'd be better to optimize things in a different round. I think I'll try simplifying _reduce signature at some point. Though maybe using @generated is the option with least hustle...

Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, I added the inference-based eltype handling but I forgot to comment that

Yes I noticed that 👍 I had a play around with it and it seems reasonable enough. I think the zero size choice may be somewhat arbitrary for cases other than init=_InitialValue? I guess there might be some weird sense in taking typeof(init)... considering the left-extended sequence init, init+a[1], init+a[1]+a[2], .... However that seems a bit theoretical rather than practical, given the value of init isn't actually used in the empty case.

I'd be happy to go with what's here.

src/mapreduce.jl Outdated
@inline Base.cumsum(a::StaticArray; kw...) =
accumulate(Base.add_sum, a; init = Base.reduce_empty(Base.add_sum, eltype(a)), kw...)
@inline Base.cumprod(a::StaticArray; kw...) =
accumulate(Base.mul_prod, a; init = Base.reduce_empty(Base.mul_prod, eltype(a)), kw...)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: cumsum(SA{Int8}[1]) #702 (comment)

It's easy to make it Base-compatible. So maybe it's worth doing this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I guess it's not so bad for + given that already has a specialized add_sum. So sure, if it's easy let's make this compatible.

@tkf
Copy link
Member Author

tkf commented Feb 17, 2020

I guess there might be some weird sense in taking typeof(init)...

I considered using typeof(init) but it doesn't work if you want to "adjoin" singleton initial value like we are doing. For example:

struct _CustomInitialValue end

function mapaccumulate(f, op, xs)
    accumulate(xs; init = _CustomInitialValue()) do acc, x
        if acc isa _CustomInitialValue
            f(x)
        else
            op(acc, f(x))
        end
    end
end

@c42f
Copy link
Member

c42f commented Feb 18, 2020

With the use of reduce_first it looks like you've greatly improved consistency with Base for things like cumsum(SA{Int8}[1]). I think that was the last quibble, so do feel free to merge when you're ready.

@tkf
Copy link
Member Author

tkf commented Feb 18, 2020

Yeah, I think it's better this way as we are using return_type anyway. (Due to reduce_first, we can't use eltype anymore even when init is not specified. So, we need to use return_type always.)

I tried to merge this but I don't see the merge button. Also, it looks like I can't push to StaticArrays.jl. Maybe I don't have the right permission?

@c42f
Copy link
Member

c42f commented Feb 18, 2020

Sorry, just a confusion with organization level permissions. Should be fixed now.

@tkf tkf merged commit 7a78efd into JuliaArrays:master Feb 18, 2020
@tkf
Copy link
Member Author

tkf commented Feb 18, 2020

Thanks! I just merged it. I appreciate that you reviewed this PR very thoroughly!

@c42f
Copy link
Member

c42f commented Feb 19, 2020

Very nice — thanks for all the thought you put into addressing my quibbles and for some some nice improvement in functionality!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

non-allocating cumsum for SVector
3 participants