
Feature request - specialized arithmetic for views of sparse matrices #56

Open
mkborregaard opened this issue May 11, 2017 · 9 comments

Comments

@mkborregaard
Contributor

I posted this originally as a question here https://discourse.julialang.org/t/slow-arithmetic-on-views-of-sparse-matrices/3644 and got some useful feedback. It seems, though, that this functionality could be built into Julia proper.

Arithmetic on views of sparse matrices falls back on generic methods and is thus slow. Here is an example:

d = sprand(Bool,10000,10000, 0.01)
e = view(d, rand(1:10000,5000), rand(1:10000,9000))

using BenchmarkTools
@benchmark sum($d, 1)
BenchmarkTools.Trial:
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     421.354 μs (0.00% GC)
  median time:      450.828 μs (0.00% GC)
  mean time:        463.575 μs (0.00% GC)
  maximum time:     1.226 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1


@benchmark sum($e, 1)
BenchmarkTools.Trial:
  memory estimate:  585.06 KiB
  allocs estimate:  51
  --------------
  minimum time:     3.286 s (0.00% GC)
  median time:      3.312 s (0.00% GC)
  mean time:        3.312 s (0.00% GC)
  maximum time:     3.339 s (0.00% GC)
  --------------
  samples:          2
  evals/sample:     1

Specialized functions for these operations should fix this.

This appears to be related to (but not identical to?) JuliaLang/julia#13438
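
One possible shape for such a specialized method: walk only the stored entries of the parent's selected columns instead of the generic O(length(I) * length(J)) fallback. This is a minimal sketch of my own (the function name and approach are illustrative, not from any existing package), using the modern `sum(...; dims=1)` API:

```julia
using SparseArrays

# Column sums of view(A, I, J), touching only stored entries of the
# parent's selected columns. I may contain repeated row indices, so we
# precompute how many times each parent row is selected.
function viewsum_cols(A::SparseMatrixCSC{Tv}, I, J) where Tv
    rowcount = zeros(Int, size(A, 1))      # multiplicity of each parent row in I
    for i in I
        rowcount[i] += 1
    end
    out = zeros(promote_type(Tv, Int), 1, length(J))
    rows = rowvals(A)
    vals = nonzeros(A)
    for (jj, j) in enumerate(J)
        s = zero(eltype(out))
        for k in nzrange(A, j)             # stored entries of parent column j only
            s += vals[k] * rowcount[rows[k]]
        end
        out[jj] = s
    end
    return out
end
```

For a view `e = view(d, I, J)`, `viewsum_cols(d, I, J)` should agree with `sum(e, dims=1)` while doing work proportional to the stored entries of the selected columns.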

@timholy
Contributor

timholy commented May 11, 2017

There's a framework underway for this kind of thing in https://github.com/timholy/ArrayIteration.jl, but that can't be very efficient until we have JuliaLang/julia#12205 (it creates about a zillion array wrappers, sometimes of fairly small regions, and it would be expensive to heap-allocate them all).

@mkborregaard
Contributor Author

That sounds amazing, thanks a lot for the response.

@perrutquist

sum(x,1) can be thought of as a special case of vector-matrix multiplication, which seems to be slow in general for views into sparse matrices.

d = sprand(Float64,10000,10000, 0.01)
i = rand(1:10000,5000)
j = rand(1:10000,9000)
e = view(d, i, j)
v = rand(Float64,1,5000)

f1(v,e) = v*e
f2(v,d,i,j) = (sparse(ones(Int,5000), i, v[:], 1, 10000)*d)[j]

# f2 computes the exact same thing as f1, only faster.
@assert isapprox(f1(v,e)[:], f2(v,d,i,j))

using BenchmarkTools
@benchmark f1($v,$e)
BenchmarkTools.Trial: 
  memory estimate:  585.39 KiB
  allocs estimate:  57
  --------------
  minimum time:     1.822 s (0.00% GC)
  median time:      1.823 s (0.00% GC)
  mean time:        1.823 s (0.00% GC)
  maximum time:     1.824 s (0.00% GC)
  --------------
  samples:          3
  evals/sample:     1

@benchmark f2($v,$d,$i,$j)
BenchmarkTools.Trial: 
  memory estimate:  751.31 KiB
  allocs estimate:  37
  --------------
  minimum time:     6.853 ms (0.00% GC)
  median time:      6.984 ms (0.00% GC)
  mean time:        7.106 ms (0.92% GC)
  maximum time:     10.845 ms (27.13% GC)
  --------------
  samples:          701
  evals/sample:     1

@mkborregaard
Contributor Author

I've got some functions for views into sparse matrices (currently here https://github.com/EcoJulia/SpatialEcology.jl/blob/master/src/Sparse_matrixfunctions.jl ) - most of them not coded by me, but based on answers from @perrutquist and @getzdan .

Would it be interesting for Base to have more functionality for (views into) sparse matrices? Alternatively:
Would it be better to make a SparseUtils.jl package with utility functions for sparse matrices (possibly moving Base functions out)?

@andreasnoack
Member

I think we should have better support for this in Base, at least for AbstractUnitRange slices of columns of sparse matrices. I have a few ad hoc implementations of these in various projects, and often most of the existing code can be reused just by using a generic function instead of direct field access.
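
The AbstractUnitRange case is especially cheap because a contiguous column range of a CSC matrix is itself stored contiguously. A hypothetical sketch (name and implementation are mine; it uses direct field access, which is exactly the kind of thing a generic accessor would replace):

```julia
using SparseArrays

# Materialize view(A, :, r) for a contiguous column range r by slicing
# the parent's CSC buffers directly, with no per-element work.
function colrange_copy(A::SparseMatrixCSC, r::AbstractUnitRange)
    lo = A.colptr[first(r)]
    hi = A.colptr[last(r) + 1]
    colptr = A.colptr[first(r):last(r) + 1] .- (lo - 1)  # rebase pointers to start at 1
    SparseMatrixCSC(size(A, 1), length(r),
                    colptr, A.rowval[lo:hi - 1], A.nzval[lo:hi - 1])
end
```

`colrange_copy(A, 20:50)` should produce the same matrix as `A[:, 20:50]`, copying only the three underlying arrays.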

@mkborregaard
Contributor Author

As an addition to this, copy is extremely slow on views into sparse matrices - here's an example:

using SparseArrays
c = sprand(Bool,6000,18000, 0.01)
d = view(c, 1:6000, rand(1:18000, 1800))
copy(d);
@time copy(d)
  2.626470 seconds (12 allocations: 18.557 MiB, 0.42% gc time)

2.6 seconds to generate a 6000x1800 matrix (a very realistically sized array) seems really excessive. I saw that recent PRs (JuliaLang/julia#30531) made progress on copying sparse vectors, but that does not seem to help this case.
Unfortunately, practically all of my workflow relies on this functionality, so I'm really motivated to help do something about it, though it's mostly above my pay grade.


@mkborregaard
Contributor Author

In terms of the copy issue described above, simply defining

mycopy(A::SubArray{R,S,T}) where {R, S, T <: SparseMatrixCSC} = A.parent[A.indices...]

makes it possible to do

using SparseArrays
c = sprand(Bool,6000,18000, 0.01);
d = view(c, 1:6000, rand(1:18000, 1800));
c1 = copy(d);
@time copy(d)
    2.456731 seconds (12 allocations: 18.556 MiB)
c2 = mycopy(d);
@time mycopy(d)
    0.000564 seconds (12 allocations: 962.156 KiB)
c2 == c1
    true

Which is several thousand times faster (at least according to @time). So it turns out it can indeed be optimised quite easily. Is this something to pursue as an optimisation, or is it too naive? @andreasnoack

@mbauman
Contributor

mbauman commented Jan 24, 2019

Not naive at all. I could even get behind doing that for all SubArrays on the assumption that other .parents may specialize on nonscalar indexing.
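
That generic version can be sketched in one line (the name is hypothetical; Base's eventual `copy(::SubArray)` would need more care around edge cases), using the public accessors rather than field access:

```julia
using SparseArrays

# Route copying of any SubArray through the parent's own nonscalar
# getindex, so parents that specialize indexing (like SparseMatrixCSC)
# get their fast path for free. parent/parentindices are the public
# accessors for a SubArray's components.
subarray_copy(V::SubArray) = parent(V)[parentindices(V)...]
```

For the sparse-view example above, `subarray_copy(d)` returns the same matrix as `copy(d)`, but via SparseMatrixCSC's specialized indexing.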

@KristofferC KristofferC transferred this issue from JuliaLang/julia Jan 14, 2022