
Faster inv 4x4 #597

Merged 2 commits into JuliaArrays:master on Mar 19, 2019
Conversation

@ryanelandt commented Mar 18, 2019

This makes the inverse function faster for 4x4 matrices: roughly 100% faster for floats and 60% faster for duals than the current implementation.

using StaticArrays
using LinearAlgebra
using ForwardDiff
using BenchmarkTools

# No-op used to estimate per-call benchmarking overhead.
estOverhead(A::SMatrix{4,4,T,16}) where {T} = nothing

### Before
sm44 = randn(SMatrix{4,4,Float64,16})
@btime inv(sm44)          #  46.160 ns (1 allocation: 144 bytes)
@btime estOverhead(sm44)  #   7.740 ns (0 allocations: 0 bytes)

sm44 = randn(SMatrix{4,4,ForwardDiff.Dual{Nothing,Float64,6},16})
@btime inv(sm44)          # 357.433 ns (1 allocation: 1008 bytes)
@btime estOverhead(sm44)  #   8.376 ns (0 allocations: 0 bytes)

### After 
sm44 = randn(SMatrix{4,4,Float64,16})
@btime inv(sm44)          #  25.904 ns (1 allocation: 144 bytes)
@btime estOverhead(sm44)  #   7.740 ns (0 allocations: 0 bytes)

sm44 = randn(SMatrix{4,4,ForwardDiff.Dual{Nothing,Float64,6},16})
@btime inv(sm44)          # 220.067 ns (1 allocation: 1008 bytes)
@btime estOverhead(sm44)  #   8.376 ns (0 allocations: 0 bytes)

@KristofferC (Contributor) commented Mar 18, 2019

Try running the benchmarks with interpolation:

@btime inv($sm44)

etc.

julia> @btime estOverhead($sm44)
  0.001 ns (0 allocations: 0 bytes)

for example (gets completely optimized away)

@ryanelandt (Author)

As does inv...

sm44 = randn(SMatrix{4,4,Float64,16})
@btime inv($sm44)          #   0.014 ns (0 allocations: 0 bytes)

@coveralls

Coverage Status

Coverage decreased (-0.3%) to 65.722% when pulling 4bbcf7b on ryanelandt:faster_inv_4x4 into 2c8d82f on JuliaArrays:master.

@KristofferC (Contributor)

Oh, sometimes I use

@btime inv($(Ref(sm44))[])

which usually works around it.
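To spell out why this workaround helps (my own sketch, not part of the thread): interpolating the value directly makes it a compile-time constant that can be folded through `inv`, while interpolating a `Ref` and dereferencing it inside the benchmark hides the value from the compiler:

```julia
using StaticArrays, LinearAlgebra, BenchmarkTools

sm44 = randn(SMatrix{4,4,Float64,16})

# $sm44 splices the matrix in as a constant, so `inv` can be
# constant-folded and the timing measures almost nothing.
@btime inv($sm44)

# $(Ref(sm44))[] dereferences a Ref inside the benchmark body,
# so the compiler cannot see the value at compile time and the
# inverse must actually be computed on every sample.
@btime inv($(Ref(sm44))[])
```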

@ryanelandt (Author)

Thanks, I didn't realize this. Here is a corrected test:

using StaticArrays
using LinearAlgebra
using ForwardDiff
using BenchmarkTools

### Before
sm44 = randn(SMatrix{4,4,Float64,16})
@btime inv($(Ref(sm44))[])     #  60.706 ns (0 allocations: 0 bytes)

sm44 = randn(SMatrix{4,4,ForwardDiff.Dual{Nothing,Float64,6},16})
@btime inv($(Ref(sm44))[])     #  518.232 ns (0 allocations: 0 bytes)

### After
sm44 = randn(SMatrix{4,4,Float64,16})
@btime inv($(Ref(sm44))[])  #  28.310 ns (0 allocations: 0 bytes)

sm44 = randn(SMatrix{4,4,ForwardDiff.Dual{Nothing,Float64,6},16})
@btime inv($(Ref(sm44))[])  #  346.906 ns (0 allocations: 0 bytes)

@andyferris (Member)

Cool - thanks @ryanelandt.

Out of curiosity, does anyone know anything about the numerical stability of this method? The referenced .pdf doesn't go into it. (Not that our static array methods have the highest level of stability, but we do make some attempt to trade off accuracy and performance.)

@ryanelandt (Author)

I ran a test comparing the accuracy of this method and the old one. I don't know the right way to test two versions of the same function against each other, so I copied both versions into functions called invert and invert_old and tested that way. The new function has roughly 10% less error for random matrices and nearly 30% less for matrices with all-positive entries. The two solutions are essentially identical (the old one is an expansion of the new one); my guess is that because the new one involves fewer floating point operations, it accumulates less error.

using StaticArrays
using LinearAlgebra

# invert and invert_old are local copies of the new and old 4x4
# inverse implementations, respectively.
function eval_err(is_make_pos::Bool)
	cum_err_old = 0.0
	cum_err_new = 0.0
	for k = 1:1_000_000
		sm44 = randn(SMatrix{4,4,Float64,16})
		is_make_pos && (sm44 = abs.(sm44))
		cum_err_old += norm(sm44 * invert_old(sm44) - I)
		cum_err_new += norm(sm44 * invert(sm44)     - I)
	end
	println("")
	println("cum_err_old: ", cum_err_old)
	println("cum_err_new: ", cum_err_new)
	println("new method is $(100 * (1 - cum_err_new / cum_err_old)) % more accurate than old method")
end


eval_err(false)
# cum_err_old: 5.442016380124342e-9
# cum_err_new: 4.801225758445484e-9
# new method is 11.774874916201883 % more accurate than old method

eval_err(true)
# cum_err_old: 1.4806311491135073e-8
# cum_err_new: 1.0418722581347007e-8
# new method is 29.633233857163077 % more accurate than old method

@andyferris (Member)

Thanks - seems like a good start.

@c42f (Member) commented Mar 19, 2019

CI/coverage failures seem unrelated.

@c42f (Member) commented Mar 19, 2019

Using the additional stress test in test/inv.jl (https://github.com/JuliaArrays/StaticArrays.jl/blob/master/test/inv.jl#L102) shows this is an improvement for "almost rank 1" 4x4 matrices. For other approximately low rank cases it fares similarly to the code on master.
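An "almost rank 1" matrix of the kind the stress test exercises can be built along the following lines (my own construction for illustration; the exact one in test/inv.jl may differ):

```julia
using StaticArrays, LinearAlgebra

# A rank-1 outer product plus a tiny full-rank perturbation gives an
# ill-conditioned, "almost rank 1" 4x4 matrix.
u = randn(SVector{4,Float64})
v = randn(SVector{4,Float64})
A = u * v' + 1e-9 * randn(SMatrix{4,4,Float64,16})

# Residual of the computed inverse; this grows with the condition
# number, which is roughly 1/1e-9 here.
err = norm(A * inv(A) - I)
```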

master: [screenshot of stress-test error plot on master]

this PR: [screenshot of stress-test error plot for this PR]

@c42f merged commit 1376c77 into JuliaArrays:master on Mar 19, 2019
@andyferris (Member)

Great work Chris. :)

@andyferris (Member)

And thank you very much, @ryanelandt

@c42f (Member) commented Mar 19, 2019

Yes, thanks @ryanelandt.

It's interesting to note that the low-rank numerical problems go away for the 5x5 case, because there we use the static LU decomposition rather than special-case code.

Similarly, if we comment out the 4x4 special case entirely, we get vastly improved numerical stability from LU, but on my machine there's a 7x speed hit to pay for that extra accuracy. Yikes!

julia> sm44 = randn(SMatrix{4,4,Float64,16})

# master, without 4x4 special case
julia> @btime inv($(Ref(sm44))[])
  136.877 ns (0 allocations: 0 bytes)

# ...
# master after merging this PR
julia> @btime inv($(Ref(sm44))[])
  18.941 ns (0 allocations: 0 bytes)
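For comparison, the LU-based behaviour can be imitated without modifying StaticArrays by round-tripping through a plain Matrix (a sketch for accuracy experiments only; the real N >= 5 code path uses StaticArrays' own static LU, and the function name here is made up):

```julia
using StaticArrays, LinearAlgebra

# Inverse via LU factorization of a heap-allocated Matrix, converted
# back to an SMatrix. Slower than the 4x4 special case, but more
# numerically robust for ill-conditioned inputs.
inv_via_lu(A::SMatrix{4,4,T,16}) where {T} =
    SMatrix{4,4,T,16}(inv(lu(Matrix(A))))

A = randn(SMatrix{4,4,Float64,16})
@show norm(A * inv_via_lu(A) - I)
```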
