
Let muladd accept arrays #37065

Merged: 7 commits into JuliaLang:master from mcabbott:muladd, Oct 29, 2020
Conversation

@mcabbott (Contributor) commented Aug 16, 2020

This extends Base.muladd to work on arrays, using mul!, which makes some combinations more efficient:

julia> A = rand(20,5); x = rand(5, 1000); b = rand(20);

julia> @btime $A * $x .+ $b;
  29.787 μs (4 allocations: 312.66 KiB)

julia> @btime muladd($A, $x, $b);
  22.322 μs (2 allocations: 156.33 KiB)

Motivated by things like FluxML/Flux.jl#1272, where it would be convenient to fuse these two operations, rather than only fusing the addition with the activation function via broadcasting, as in tanh.((W*x) .+ b).

Needs tests, and some thought about the muladd(::Adjoint, ...) and muladd(..., ::Number) cases, if anyone thinks this is a good idea.
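A rough sketch of the approach, with a hypothetical helper name (not the PR's exact code):

```julia
using LinearAlgebra

# Hypothetical sketch of the fused operation: write b into the output
# buffer by broadcasting, then accumulate A*x on top with 5-arg mul!.
function muladd_sketch(A::AbstractMatrix, x::AbstractVecOrMat, b)
    T = promote_type(eltype(A), eltype(x), eltype(b))
    C = similar(A, T, axes(A, 1), axes(x)[2:end]...)
    C .= b                     # broadcast b into C
    mul!(C, A, x, true, true)  # C = 1*A*x + 1*C, no temporary for A*x
end
```

Only the result array is allocated, which is where the halved allocation count in the benchmark above comes from.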

@mcabbott (Contributor, Author) commented Oct 2, 2020

Now has tests, and muladd(A, x, b) accepts any b for which C = A*x; C .= C .+ b would work. It behaves the obvious way when A is an Adjoint vector, with C (and hence the result) either a scalar or another Adjoint vector.
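For instance, the Adjoint behavior described above looks like this on a build including the PR:

```julia
using LinearAlgebra

v = [1, 2]; w = [3, 4]
muladd(v', w, 5)                   # scalar: (1*3 + 2*4) + 5 == 16
muladd(v', [1 0; 0 1], [10, 20]')  # another Adjoint row vector, equal to [11 22]
```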

@dkarrasch added the domain:linear algebra label on Oct 29, 2020
@dkarrasch merged commit 72971c4 into JuliaLang:master on Oct 29, 2020
@mcabbott deleted the muladd branch on Oct 29, 2020
@mateuszbaran (Contributor)

I think this change broke the following:
On 1.5.2:

julia> using LinearAlgebra

julia> muladd([2 2; 1 2], [2 1; 1 2], I)
2×2 Array{Int64,2}:
 7  6
 4  6

while currently on master:

julia> using LinearAlgebra

julia> muladd([2 2; 1 2], [2 1; 1 2], I)
ERROR: MethodError: no method matching length(::UniformScaling{Bool})
Closest candidates are:
  length(::Union{Base.KeySet, Base.ValueIterator}) at abstractdict.jl:54
  length(::Union{Adjoint{T, S}, Transpose{T, S}} where S where T) at /home/mateusz/repos/julia/julia/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/adjtrans.jl:194
  length(::IdDict) at iddict.jl:136
  ...
Stacktrace:
 [1] _similar_for(c::UnitRange{Int64}, #unused#::Type{Bool}, itr::UniformScaling{Bool}, #unused#::Base.HasLength)
   @ Base ./array.jl:575
 [2] _collect(cont::UnitRange{Int64}, itr::UniformScaling{Bool}, #unused#::Base.HasEltype, isz::Base.HasLength)
   @ Base ./array.jl:608
 [3] collect(itr::UniformScaling{Bool})
   @ Base ./array.jl:602
 [4] broadcastable(x::UniformScaling{Bool})
   @ Base.Broadcast ./broadcast.jl:682
 [5] broadcasted
   @ ./broadcast.jl:1303 [inlined]
 [6] muladd(A::Matrix{Int64}, B::Matrix{Int64}, z::UniformScaling{Bool})
   @ LinearAlgebra ~/repos/julia/julia/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/matmul.jl:214
 [7] top-level scope
   @ REPL[12]:1

julia> versioninfo()
Julia Version 1.6.0-DEV.1384
Commit 4711fc3cdf (2020-10-30 14:50 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4800MQ CPU @ 2.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.0 (ORCJIT, haswell)

@dkarrasch (Member)

Thanks for reporting this! I guess we should have restricted the third argument here to Union{Number,AbstractVecOrMat} then? That would actually make some of the dimension checks unnecessary. Or did you have other types in mind, @mcabbott?

@mcabbott (Contributor, Author) commented Oct 30, 2020

Oops, I was working on the assumption that the only existing use for muladd was with numbers. I left z untyped just for simplicity, but it could be constrained as you suggest.

The error here comes from the broadcast [6 6; 4 5] .+ I, i.e. from the fact that rand(2,2) .= I doesn't work; I'm somewhat surprised it doesn't. If it did, then this method would be more efficient than the fallback. Might it be worth fixing?
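The failure is easy to reproduce; the following is just an illustration, assuming a Julia build without special broadcasting support for UniformScaling:

```julia
using LinearAlgebra

# Broadcasting with I fails because Broadcast.broadcastable falls back to
# collect, and UniformScaling has no length (as in the stacktrace above):
err = try
    zeros(2, 2) .= I
    nothing
catch e
    e
end
err isa MethodError
```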

However, if there are other existing uses of muladd with matrices, then this whole thing may need to tread more carefully. For instance, on master this example hits this PR's mul!(similar(...)) method:

julia> muladd(Diagonal([1,2]), [3,4], [5,6])
2-element SparseArrays.SparseVector{Int64, Int64} with 2 stored entries:
  [1]  =  8
  [2]  =  14

julia> muladd(Diagonal([1,2]), [3,4], [5,6]) # on Julia 1.5
2-element Array{Int64,1}:
  8
 14

@mateuszbaran (Contributor)

Wouldn't replacing .= with copyto! fix this problem? It might be easier than fixing broadcasting with I.

@dkarrasch (Member)

Yes, broadcasting in setindex! is #34130, and there's also #23197. Unfortunately, I don't think copyto! would be a fix: if z is a Number but the output is a vector/matrix, then only the first element would get filled. Thus we would need many more methods.
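Concretely (plain Base behavior, nothing specific to this PR): a Number iterates as a single element, so copyto! touches only the first entry, while .= fills all of them.

```julia
# copyto! treats the Number as a one-element iterator:
C = zeros(2, 2)
copyto!(C, 5)
C == [5 0; 0 0]   # only C[1] was set

# Broadcasting assignment fills every entry, which is what muladd needs:
D = zeros(2, 2)
D .= 5
D == [5 5; 5 5]
```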

@mateuszbaran (Contributor)

Yes, good point. I don't have any good ideas then 🙂 .

@mcabbott (Contributor, Author) commented Oct 30, 2020

Thanks for the links. I guess this use of I is only possible in one method, and could be explicitly handled, or rejected. Things like muladd(I, [2 2; 1 2], [3 4; 5 6]) still hit the fallback and can't be made more efficient anyway. (Edit: muladd([1,2]', [3,4], I) is an error right now too.)

But I'm a little worried about all the possibilities with UpperTriangular and friends.

@dkarrasch (Member)

I was thinking about whether there is some way to deduce the type of A*B and make C similar to that, in order to do mul!(C, A, B, ...). I tried a little, but I don't think I found a good solution. At times, even that return type happens to be suboptimal. I don't remember the issue, but there was a difference depending on whether the structured matrix was the left or the right factor, because we usually (always?) take the right factor and make C similar to it.
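One way to sketch that idea (an illustration, not a proposed implementation): deduce the eltype with Base.promote_op and, as * itself does, allocate C similar to the right factor.

```julia
using LinearAlgebra

A = Diagonal([1, 2])
B = [1.0 2.0; 3.0 4.0]
T = Base.promote_op(*, eltype(A), eltype(B))  # eltype of A*B, without computing it
C = similar(B, T, axes(A, 1), axes(B, 2))     # container guessed from B
mul!(C, A, B)                                  # fill C with A*B
```

The container guess is the weak point: as noted above, the result can differ depending on which side the structured matrix sits.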

@mcabbott (Contributor, Author) commented Oct 31, 2020

I think quite a lot of the structured types returned by A*B are destroyed by .+ z. There are a few exceptions:

di = Diagonal(1:3)
ut = UpperTriangular(rand(1:9, 3,3))

di * di .+ di isa Diagonal
ut * ut .+ ut isa UpperTriangular
ut * ut .+ di isa UpperTriangular

And it looks like only similar(::Diagonal, axes...) gives a sparse array:

similar(di, 3,3) isa SparseMatrixCSC
similar(parent(di), 3,3) isa Matrix
similar(ut, 3,3) isa Matrix

So maybe something like this is almost enough?

const UpperUnion = Union{UpperTriangular, UnitUpperTriangular, Diagonal}

function Base.muladd(A::UpperUnion, B::UpperUnion, z::UpperUnion)
    T = promote_type(eltype(A), eltype(B), eltype(z))
    C = similar(parent(B), T, axes(A,1), axes(B,2))
    C .= z
    mul!(C, A, B, true, true)
    UpperTriangular(C)
end
Base.muladd(A::Diagonal, B::Diagonal, z::Diagonal) =
    Diagonal(A.diag .* B.diag .+ z.diag)
Base.muladd(A::Union{Diagonal, UniformScaling}, B::Union{Diagonal, UniformScaling}, z::Union{Diagonal, UniformScaling}) =
    Diagonal(_diag_or_value(A) .* _diag_or_value(B) .+ _diag_or_value(z))
Base.muladd(A::UniformScaling, B::UniformScaling, z::UniformScaling) =
    UniformScaling(A.λ * B.λ + z.λ)

_diag_or_value(A::Diagonal) = A.diag
_diag_or_value(A::UniformScaling) = A.λ

@mateuszbaran (Contributor)

I think there may be many more patterns where A*B is sparse but z is dense: Bidiagonal, Tridiagonal, SymTridiagonal, SparseMatrixCSC, etc. Determining the correct result type of Base.muladd(A::AbstractMatrix{TA}, B::AbstractMatrix{TB}, z) where {TA, TB} would be very complex; the signature is very broad. Maybe for now you could implement this muladd only for either A or B being a StridedVecOrMat?

@mcabbott (Contributor, Author)

Sparse matrices appear to be handled correctly, in that similar(sm) gives the same type as the naive A*y .+ z version:

m = rand(3,3); sm = sparse(m)
v = rand(3); sv = sparse(v);

sm * sm .+ 1 isa SparseMatrixCSC
sm * sm .+ v isa SparseMatrixCSC
sm * sm .+ m isa SparseMatrixCSC

sm * sm + 1 # error
sm * sm + v # error
sm * sm + m isa Matrix

The snippet above should handle the dense triangular cases. Calling similar on Bidiagonal or SymTridiagonal gives a sparse matrix, matching what * does.

Still hoping a little bit that it won't be necessary to go all the way down to StridedArray, but we may end up there.

@mateuszbaran (Contributor)

I don't quite understand. Right now in Base.muladd(A::AbstractMatrix{TA}, B::AbstractMatrix{TB}, z) where {TA, TB} you allocate the result based on the type of A. But if either B or z is dense, the result should also be dense. How are you going to achieve that?

@mcabbott (Contributor, Author)

Sure, matching the existing choices for every possible trio of types probably can't be done with any simple rule. Besides the sparse .+ dense examples, I don't know why these are the way they are:

SymTridiagonal(m+m') * m isa Matrix
m * SymTridiagonal(m+m') isa SparseMatrixCSC

I hope nobody's relying on that. But more importantly, someone may be relying on this, which ought not to go via mul!:

using StaticArrays
muladd(SA[1 2; 3 4], SA[5,6], SA[7,8]) 

Which argues for making it much more opt-in.

@dkarrasch (Member)

> I don't know why these are the way they are:

They are the way they are because, as I said, we call similar on the right factor, not on the left; see #36551.

@mcabbott (Contributor, Author) commented Oct 31, 2020

Thanks, I hadn't seen that discussion. There's "why" as in which method, and "why" as in what's desired... and it sounds like, for my failed attempt at an exotic case nobody would want, someone wants the opposite.

@mcabbott (Contributor, Author) commented Oct 31, 2020

Not quite done but this is a branch:

master...mcabbott:mulladd2

This supports triangular & diagonal matrices. It also supports UniformScaling by itself, mixed with Diagonal, and perhaps weirdly as the 3rd argument with two matrices.

@mcabbott (Contributor, Author)

Now at #38250.

c42f pushed a commit that referenced this pull request Nov 20, 2020
This adjusts #37065 to be much more cautious about what arrays it acts on: it calls mul! on StridedArrays, treats a few special types like Diagonal, UpperTriangular, and UniformScaling, and sends anything else to muladd(A,y,z) = A*y .+ z.

However, this broadcasting restricts the shape of z, mostly such that A*y .= z would work. That ensures you should get the same error from the mul!(::StridedMatrix, ...) method as from the fallback broadcasting one. Both allow z of lower dimension than the existing muladd(x,y,z) = x*y+z.

But x*y+z also allows z to have trailing dimensions, as long as they are of size 1. I made the broadcasting method allow these too, which I think should make this non-breaking. (I presume this is rarely used, and thus not worth sending to the fast method.)

Structured matrices such as UpperTriangular should all go to x*y+z. Some combinations could be made more efficient but it gets complicated. Only the case of 3 diagonals is handled.
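The trailing size-1 dimensions mentioned in the commit message are the ones that ordinary array addition already accepts via promote_shape; a quick check of that baseline behavior:

```julia
# x*y + z allows z with trailing dimensions of size 1:
A = ones(2, 2)
z3 = ones(2, 2, 1)
size(A * A + z3)   # the promoted shape, (2, 2, 1)
```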
3 participants