
Suggestions for improving performance of sparse ishermitian, issym #11353

Closed
KristofferC opened this issue May 19, 2015 · 13 comments
Labels: domain:arrays:sparse (Sparse arrays), domain:linear algebra (Linear algebra), performance (Must go faster)

Comments

@KristofferC
Sponsor Member

The current implementations of ishermitian and issym for sparse matrices do the full-fledged A - A' == 0 check, see here. In many cases, however, it is unnecessary to look at all elements before deciding that the matrix is not hermitian.

Due to the way sparse matrices are stored, checking one element at a time might take longer than the current method. On my computer, checking element by element is about 20% slower than the current method for hermitian matrices.

However, instead of looking at all the elements in an element-by-element fashion, we can look at only a fraction of the elements, and if we find that the matrix is not hermitian we return false. If we do not yet know, we fall back to the full-fledged method. This adds insignificant time when the matrix is hermitian, but lets us discard obviously non-hermitian matrices almost instantly.
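A minimal sketch of this two-phase idea (illustrative only; the function name and the sampling cutoff are my assumptions, not the gist's actual code):

```julia
using SparseArrays, LinearAlgebra  # needed on Julia >= 0.7

# Two-phase check: test only the first few stored entries against their
# transposed partners; if none of them disproves hermitianness, fall back
# to the full check. `samples` is an arbitrary cutoff and would need tuning.
function ishermitian_quickcheck(A::SparseMatrixCSC; samples::Int = 100)
    size(A, 1) == size(A, 2) || return false
    rows = rowvals(A)
    vals = nonzeros(A)
    checked = 0
    for col in 1:size(A, 2)
        for j in nzrange(A, col)
            # random access into the transposed position costs a binary search
            A[col, rows[j]] == conj(vals[j]) || return false
            checked += 1
            # still undecided after `samples` entries: defer to the full method
            checked >= samples && return ishermitian(A)
        end
    end
    return true
end
```

For a matrix whose very first stored entry already breaks the symmetry, this returns false after a single lookup instead of scanning all of A.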

I wrote an example implementation of how this could look here:
https://gist.github.com/KristofferC/27dd9e1577f779c0d177

The gist also contains an example of an is_prob_hermitian function, which checks that |<Ax, y> - <x, Ay>| < eps for random x and y. I thought this could be called with a flag, but maybe these types of functions are unwanted in Base.
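A sketch of what such a probabilistic test could look like (the name, tolerance, and scaling are my assumptions; the gist may differ). For a hermitian A, <Ax, y> == <x, Ay> holds for all x and y, so a mismatch for a single random pair already proves A is not hermitian:

```julia
using SparseArrays, LinearAlgebra  # needed on Julia >= 0.7

# Probabilistic hermitian test: compare <Ax, y> with <x, Ay> for one pair of
# vectors. By default the vectors are random; they can be passed explicitly.
# `tol` is an assumed tolerance and would need a principled choice.
function is_prob_hermitian(A::SparseMatrixCSC;
                           x = randn(size(A, 2)),
                           y = randn(size(A, 2)),
                           tol = 1e-10)
    size(A, 1) == size(A, 2) || return false
    return abs(dot(A * x, y) - dot(x, A * y)) < tol * norm(x) * norm(y)
end
```

Note that this can only ever prove non-hermitianness with certainty; a `true` answer is probabilistic, which is presumably why it would need to sit behind a flag.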

I could polish what I have in the gist and make a PR if there is a positive response.

@stevengj
Member

The A - A' method seems very unsatisfying to me because it involves a lot of allocation and A might be extremely large. It seems better to try to speed up the element-by-element check: 20% is not an overwhelming margin, so I'm guessing that careful code optimization might be able to eliminate that penalty.

@KristofferC
Sponsor Member Author

I agree. One improvement would be, for a given column, to start the search at the last found element instead of starting from the beginning every time. I will see what difference that makes.

@dhoegh
Contributor

dhoegh commented May 19, 2015

There has been discussion around testing approximate ishermitian in #10369 and #10298.

@andreasnoack
Member

CHOLMOD has a function that checks the structure of a sparse matrix

```julia
function ishermitian(A::Sparse{Complex{Float64}})
```

and we could use it more extensively. It requires a conversion to CHOLMOD.Sparse but in many cases it is probably faster than the present method you linked to. A better solution would be to have something like that in pure Julia.

As @dhoegh's links show, there might also be reasons beyond speed to prefer an approximative version.

@KristofferC
Sponsor Member Author

This is the best I can do for now; it seems to be around 10% slower in the hermitian case for the test I did.

```julia
using Base.Order: Forward

function ishermitian_new(A::SparseMatrixCSC)
    m, n = size(A)
    if m != n; return false; end

    colptr = A.colptr
    rowval = A.rowval
    nzval = A.nzval
    start_indices = copy(A.colptr)
    @inbounds for col = 1 : A.n
        for j = colptr[col] : colptr[col+1]-1
            row = rowval[j]
            nz = nzval[j]

            r1 = start_indices[row]
            r2 = Int(A.colptr[row+1]-1)
            r1 = searchsortedfirst(A.rowval, col, r1, r2, Forward)
            start_indices[row] = r1
            if nz != ctranspose(A.nzval[r1])
                return false
            end
        end
    end
    true
end
```

test:

```julia
new_herm = Float64[]
old_herm = Float64[]

for i = 1:10
    A = sprand(10^5, 10^5, 0.0005);
    Aherm = A + A';
    push!(new_herm, @elapsed ishermitian_new(Aherm))
    push!(old_herm, @elapsed ishermitian(Aherm))
end
```

```julia
julia> mean(new_herm)
3.112574440315789

julia> mean(old_herm)
2.809209260263158
```

@stevengj
Member

I wonder if it would be faster to just do a linear search (rather than the binary search in searchsortedfirst) if r2-r1 is sufficiently small?
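One hypothetical way to write that hybrid (the cutoff of 8 is a guess and would need benchmarking; `find_row` mimics the semantics of `searchsortedfirst` on the window `r1:r2`):

```julia
# Return the first index i in r1:r2 with rowval[i] >= col (r2 + 1 if none),
# using a linear scan for short windows and a binary search otherwise.
function find_row(rowval::Vector{Int}, col::Int, r1::Int, r2::Int)
    if r2 - r1 < 8
        @inbounds for i in r1:r2
            rowval[i] >= col && return i
        end
        return r2 + 1
    end
    return searchsortedfirst(rowval, col, r1, r2, Base.Order.Forward)
end
```

Sparse columns are often only a handful of entries long, so the linear branch would likely be the common case here.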

@stevengj
Member

Also, shouldn't the test be if r1 <= r2 && nz != ctranspose(...), in case the sparsity pattern is not symmetrical?

@KristofferC
Sponsor Member Author

I realized I did not exploit that we only need to check the lower triangle against the upper.

With this: https://gist.github.com/KristofferC/36c760057b6e911e6258

It is around 2.5 times faster than the Base version for hermitian matrices.

Cholmod is likely a lot faster though, so maybe it is worth converting and ccalling.

@stevengj To be honest, I am not sure. Could you give a small example matrix where you think you would need that?

Regarding linear / binary search, it would be interesting to benchmark.

@KristofferC
Sponsor Member Author

@stevengj I realized what you meant now. I can't do the optimization I did above.

```julia
julia> A = speye(5)
julia> A[1,3] = 1
julia> ishermitian_new(A)
true
```

:(

@simonster added the performance (Must go faster) and domain:linear algebra (Linear algebra) labels on May 19, 2015
@toivoh
Contributor

toivoh commented May 19, 2015

If you can get n ints of workspace, you don't need to do any searching at all to check whether an nxn sparse matrix is symmetric (as long as each column is stored in order, but it is, right?). Go over the matrix column by column and, for the transpose access, keep an index per column pointing at the next entry to be accessed. Each time you access that column for the transpose, you check that the entry matches and increment the index. This works as long as each column is sorted and you go through the columns in order for the non-transpose access.
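A sketch of that scheme (my reading of the description above, not the code that eventually landed): `ptr[c]` holds the index of the next entry of column `c` that has not yet been matched against its transposed partner.

```julia
using SparseArrays  # needed on Julia >= 0.7

function ishermitian_nosearch(A::SparseMatrixCSC)
    m, n = size(A)
    m == n || return false
    colptr, rowval, nzval = A.colptr, A.rowval, A.nzval
    ptr = copy(colptr)  # n + 1 ints of workspace
    @inbounds for col in 1:n
        for j in colptr[col]:(colptr[col+1] - 1)
            row = rowval[j]
            p = ptr[row]
            # the partner (col, row) must be exactly the next unmatched
            # entry of column `row`, otherwise the pattern is asymmetric
            (p < colptr[row+1] && rowval[p] == col) || return false
            nzval[p] == conj(nzval[j]) || return false
            ptr[row] = p + 1
        end
    end
    return true
end
```

Unlike the searching version earlier in the thread, this rejects the speye-with-one-extra-entry counterexample, since the unmatched entry trips the pattern check; it touches each stored entry exactly twice and does no searching.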

@toivoh
Contributor

toivoh commented May 19, 2015

But maybe those are too many assumptions? I would guess that the algorithm in CHOLMOD does something similar though.

@KristofferC
Sponsor Member Author

CHOLMOD's source code is here, but I guess we shouldn't peek too much due to licensing:
https://www.cise.ufl.edu/research/sparse/cholmod/CHOLMOD/MatrixOps/cholmod_symmetry.c

You are right, though, that no searching is needed. I will try to implement what you described and test it more carefully this time :)

@KristofferC
Sponsor Member Author

Closing this and referring to: #11371
