
Structured matrix multiplication #814

Merged · 28 commits · Sep 24, 2020

Conversation

mateuszbaran (Collaborator)

In this PR I'm expanding support for multiplication of structured, statically sized matrices, which should improve the usability of StaticArrays. It partially fixes #790. Together with the allocation changes in Julia 1.5, this makes some generic code in Manifolds.jl non-allocating.
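As a rough sketch (not part of the PR text), this is the kind of product the change targets; the exact set of supported wrapper types is defined by the PR, not by this snippet:

```julia
using StaticArrays, LinearAlgebra, BenchmarkTools

A = UpperTriangular(SMatrix{3,3}(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0))
B = SMatrix{3,3}(9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0)

# A product mixing a LinearAlgebra wrapper with a plain static matrix;
# with this PR it should stay statically sized, and on Julia 1.5+ it
# should not allocate for small sizes like this one.
C = A * B
@btime $A * $B
```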

@mateuszbaran (Collaborator, Author)

It should be possible to switch matrix multiplication methods from triangular.jl to this approach. What do you think?

@mateuszbaran linked an issue on Aug 12, 2020 that may be closed by this pull request
@mschauer (Collaborator)

Ref: I still have https://github.com/mschauer/StaticLinearMaps.jl/blob/master/src/StaticLinearMaps.jl lying around, just to demonstrate that sometimes you want to think of static matrices as static linear maps and leave the storage/details to the implementation

@mateuszbaran (Collaborator, Author)

> Ref: I still have https://github.com/mschauer/StaticLinearMaps.jl/blob/master/src/StaticLinearMaps.jl lying around, just to demonstrate that sometimes you want to think of static matrices as static linear maps and leave the storage/details to the implementation

That's interesting, but this PR is not that ambitious 🙂.

@mateuszbaran (Collaborator, Author)

A few notes:

  1. Triangular matrix multiplication on master is always fully unrolled.
  2. The approach I'm using here for code generation only works with full unrolling as far as I can tell, so it will only be used for small matrices (a sketch of the idea follows this list).
  3. For larger matrices I'm going to retain the old behavior: chunked or looped multiplication for non-wrapped matrices, and falling back to LinearAlgebra for larger ones. Do we want to convert back to a static matrix after going through LinearAlgebra?
  4. In-place multiplication needs a bit of work.
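To make note 2 concrete, here is a hedged sketch of full unrolling via a generated function; `unrolled_mul` is a hypothetical name, and the PR's actual kernels differ in structure and naming:

```julia
using StaticArrays

# Hypothetical sketch: fully unrolled multiplication via code generation.
# For statically known sizes M, K, N the generated body is straight-line
# code with no loops, which is why this only pays off for small matrices.
@generated function unrolled_mul(a::SMatrix{M,K}, b::SMatrix{K,N}) where {M,K,N}
    # One explicit sum-of-products expression per output element,
    # emitted in column-major order to match the SMatrix constructor.
    exprs = [:(+($([:(a[$i, $k] * b[$k, $j]) for k in 1:K]...)))
             for i in 1:M, j in 1:N]
    return quote
        @inbounds SMatrix{M, N}(tuple($(exprs...)))
    end
end

unrolled_mul(SMatrix{2,2}(1.0, 2.0, 3.0, 4.0), SMatrix{2,2}(1.0, 0.0, 0.0, 1.0))
```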

@mateuszbaran (Collaborator, Author)

I've made a few updates and here are some benchmarks: https://gist.github.com/mateuszbaran/ff63993fb4f66a9089dcf324139de6ac

This excludes many cases where master just throws an error. There are a few small regressions that I will investigate, but overall it is a huge improvement.

@mateuszbaran (Collaborator, Author)

These performance regressions are most likely random noise; I've checked the generated code for one of the worst regressions and it's the same in this PR as on master.

@mateuszbaran marked this pull request as ready for review on August 19, 2020.
@mateuszbaran (Collaborator, Author)

Here are some new benchmarks after incorporating changes from #818 (thanks @chriselrod!): https://gist.github.com/mateuszbaran/ff63993fb4f66a9089dcf324139de6ac

I've also tried looking for better heuristics for method selection, but there are no obviously better choices as far as I can tell: https://gist.github.com/mateuszbaran/e62ba317690b25270b2c8bc1ef307d6b
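For context on what "heuristics for method selection" means here, the choice is roughly size-based; the cutoffs and the name `mul_kernel_choice` below are illustrative guesses, not values from this PR:

```julia
# Illustrative sketch of size-based kernel selection; the thresholds
# are placeholders, not the PR's actual tuning.
function mul_kernel_choice(M::Int, K::Int, N::Int)
    if M * K * N <= 8^3
        return :unrolled       # small: fully unrolled generated code
    elseif max(M, K, N) <= 14
        return :chunked        # medium: chunked/looped kernel
    else
        return :blas_fallback  # large: go through LinearAlgebra
    end
end
```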

@c42f (Member)

c42f commented Sep 16, 2020

> Do we want to convert back to a static matrix after going through LinearAlgebra?

Yes, I think the rule should be that we preserve the static array type where possible. I think it makes composite array operations more predictable overall, even though there's a conversion cost.
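A minimal sketch of that rule, under the assumption of a hypothetical helper (`_mul_blas` is not the PR's actual function name):

```julia
using StaticArrays, LinearAlgebra

# Hypothetical sketch: past the static-kernel cutoff, multiply through
# plain Arrays (the BLAS path) and convert back, so the caller still
# gets a statically sized result at the cost of one extra copy.
function _mul_blas(a::StaticMatrix{M,K}, b::StaticMatrix{K,N}) where {M,K,N}
    c = Matrix(a) * Matrix(b)
    return SMatrix{M, N}(c)
end
```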

@c42f (Member) left a comment

This seems like an epic contribution! And it deletes almost as much code as it adds, which is always great to see. I'm very glad to see the triangular multiplication special cases gone.

To check, does this incorporate all of #818?

I think at this point you and @chriselrod are the experts on this, so I think you should merge it when you're both ready.

Unless, that is, you want to direct my attention to any particular part of the implementation? I had a quick look through at a high level.

TSize(s::Number) = TSize(Size(s))
istranpose(::TSize{<:Any,T}) where T = T
istranspose(::TSize{<:Any,T}) where T = (T === :transpose)
@c42f (Member)

Haha 👍

@mateuszbaran (Collaborator, Author)

> This seems like an epic contribution! And it deletes almost as much code as it adds, which is always great to see. I'm very glad to see the triangular multiplication special cases gone.

Thanks! I would like to go through those triangular multiplication special cases again because the old versions are sometimes faster and I'm not yet sure why.

> To check, does this incorporate all of #818?

Not all of it, just the muladd part. On the other hand, the reordering part has a smaller impact (#818 (comment)), and I don't know how reordering would impact performance for types other than Float64.
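For reference, "the muladd part" means accumulating inner products with fused multiply-adds instead of separate `*` and `+`; a toy version of the pattern (names illustrative, not from #818):

```julia
# Toy illustration of muladd-style accumulation: each step can compile
# to a fused multiply-add instruction, with one rounding per step.
function dot_muladd(a::NTuple{N}, b::NTuple{N}) where {N}
    acc = a[1] * b[1]
    for k in 2:N
        acc = muladd(a[k], b[k], acc)
    end
    return acc
end

dot_muladd((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))  # == 32.0
```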

> I think at this point you and @chriselrod are the experts on this, so I think you should merge it when you're both ready.

> Unless, that is, you want to direct my attention to any particular part of the implementation? I had a quick look through at a high level.

I would like to go through these changes once more before it's merged, but there aren't any particular parts that would need your attention.

@mateuszbaran (Collaborator, Author)

mateuszbaran commented Sep 23, 2020

Performance regressions are very few, and I don't see a consistent pattern in them; they look more or less random and are not that severe (up to about 50% slower on Skylake). I will merge this tomorrow if no one objects.

Successfully merging this pull request may close these issues: Uncallable method of _mul