Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In BLAS overrides, remove restriction to StridedMatrix #25247

Closed
dlfivefifty opened this issue Dec 22, 2017 · 20 comments
Closed

In BLAS overrides, remove restriction to StridedMatrix #25247

dlfivefifty opened this issue Dec 22, 2017 · 20 comments

Comments

@dlfivefifty
Copy link
Contributor

Definitions such as gemv! (https://github.com/JuliaLang/julia/blob/master/base/linalg/blas.jl#L553) restrict to StridedVecOrMat. This prevents user-defined types from calling these functions.

I think this restriction is unnecessary as a user AbstractMatrix which overrides the (as-of-yet not explicitly defined) strided matrix "interface" of unsafe_convert(::Type{Ptr{T}},A), and stride should be able to call gemv!.

To overcome this restriction, I end up having to copy and paste Base's code to make my own gemv!:

https://github.com/JuliaMatrices/BlockBandedMatrices.jl/blob/master/src/lapack.jl#L19

This is not too burdensome but seems unnecessary.

@andreasnoack
Copy link
Member

Why is your custom type not a DenseArray?

@dlfivefifty
Copy link
Contributor Author

Because it’s a SubArray (representing a block of a block matrix)

@timholy
Copy link
Sponsor Member

timholy commented Dec 22, 2017

That method you defined is going to give you wrong answers for quite a few matrices. For example, try it with view(rand(5,5), [1,2,5], :).

@dlfivefifty
Copy link
Contributor Author

dlfivefifty commented Dec 22, 2017

One could a check that stride(A,1) == 1 .

EDIT: I think I mean chkstride1(A)

@timholy
Copy link
Sponsor Member

timholy commented Dec 22, 2017

Unfortunately there's some confusion about what stride means, and it has fallbacks that it probably shouldn't have:

A = view(rand(5,5), [1,2,5], :)
b = vec(A)

julia> stride(b, 1)
1

@timholy
Copy link
Sponsor Member

timholy commented Dec 22, 2017

#17812

@timholy
Copy link
Sponsor Member

timholy commented Dec 22, 2017

@dlfivefifty, want to tackle that for 1.0? 😄

@dlfivefifty
Copy link
Contributor Author

Sure, I'm happy to put together a pull request. When's the deadline?

@StefanKarpinski
Copy link
Sponsor Member

Last week, but we're still wrapping some things up. This is a fairly non-disruptive change as I understand it though, so it could probably sneak into the alpha period.

@dlfivefifty
Copy link
Contributor Author

OK. While I'm at it I might also include some of the missing routines (sbgst) for banded matrix routines, which are currently in

https://github.com/JuliaMatrices/BandedMatrices.jl/blob/master/src/lapack.jl

@dlfivefifty
Copy link
Contributor Author

One issue here is with axpy! and axpby!, which exist for both generic arrays and BLAS arrays. I'd almost say deprecate axpy! for generic arrays in favour of y .= a.*x.

@andreasnoack
Copy link
Member

Made a comment about this in JuliaLinearAlgebra/IterativeSolvers.jl#184. For new array types, it is much more demanding to support general broadcasting so I think we should keep axpy! and friends.

@dlfivefifty
Copy link
Contributor Author

Yes I did a quick timing of axpy! with sparse vectors and it's a factor of 10 slower to use broadcasting.

@Sacha0
Copy link
Member

Sacha0 commented Dec 28, 2017

Yes I did a quick timing of axpy! with sparse vectors and it's a factor of 10 slower to use broadcasting.

Modulo benchmark details, broadcast should not be so much slower; perhaps something has regressed?

@dlfivefifty
Copy link
Contributor Author

Maybe it’s because axpy can take advantage of sparsity but broadcasting can’t.

@Sacha0
Copy link
Member

Sacha0 commented Dec 28, 2017

broadcast specializations for sparse vectors and matrices exist :).

@dlfivefifty
Copy link
Contributor Author

dlfivefifty commented Dec 28, 2017

Yes, but unless I'm missing something (e.g., being able to tell that a function is pure so can be evaluated only once for the zero entries) it can't know the sparsity pattern without evaluating the broadcast at every single entry.

FYI here is my test:

julia> x = SparseVector(1000, [1,2,3], [1.,2.,3.]);

julia> y = rand(1000);

julia> using BenchmarkTools

julia> function foo(x, y)
              y .+= 2.0 .* x 
              end
foo (generic function with 1 method)

julia> @btime foo(x,y);
  3.293 μs (0 allocations: 0 bytes)

julia> @btime BLAS.axpy!(2.0, x, y);
  26.386 ns (0 allocations: 0 bytes)

EDIT: It also has the same behaviour when y is zero:

julia> y = zeros(1000);

julia> @btime foo(x,y);
  3.343 μs (0 allocations: 0 bytes)

julia> @btime BLAS.axpy!(2.0, x, y);
  24.042 ns (0 allocations: 0 bytes)

@Sacha0
Copy link
Member

Sacha0 commented Dec 28, 2017

Yes, but unless I'm missing something (e.g., being able to tell that a function is pure so can be evaluated only once for the zero entries) it can't know the sparsity pattern without evaluating the broadcast at every single entry.

Sparse broadcast roughly assumes purity, allowing (with a bit of additional magic for scalar arguments) operation only over the union of the patterns of broadcast's array arguments :).

Solution to the benchmark mystery: The destination argument being dense, the broadcast benchmark hits the generic AbstractArray machinery (rather than the sparse arrays machinery), whereas the BLAS.axpy! benchmark hits a specialization that takes advantage of x's sparsity. Best!

@dlfivefifty
Copy link
Contributor Author

Huh, that’s interesting. If I’m allowed to assume purity that would be useful for broadcast for BandedMatrix if a general fill value is added. What is the justification for being able to assume purity?

@Sacha0
Copy link
Member

Sacha0 commented Dec 28, 2017

What is the justification for being able to assume purity?

Essentially that assuming the necessary weak form of purity is extremely useful when working with higher order functions over sparse and structured objects, whereas use cases that break that assumption seem marginal.

If I’m allowed to assume purity that would be useful for broadcast for BandedMatrix if a general fill value is added.

Absolutely :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants