Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable find on SparseVectors #15747

Closed
wants to merge 0 commits into from
Closed

Conversation

pranavtbhat
Copy link
Contributor

There seem's to be a discrepancy in the Sparse Matrix interface. Splicing an entire row/column of a sparse matrix yields a sparse vector, and the "find" method isn't defined on sparse vectors. So basically I can't do something like this:

a = sprand(10, 10, 0.3)
find(a[:,1])

But this does work with "full" arrays. Defining find to return the nzind field fixes this problem.

@tkelman
Copy link
Contributor

tkelman commented Apr 2, 2016

The equivalent function for sparse matrices checks that the nzval is nonzero first.

if S.nzval[k] != 0

@pranavtbhat
Copy link
Contributor Author

Oh. So I'll have to iterate through nzval to make sure that the values are non-zero?

@nalimilan
Copy link
Member

You can use filter for that.

@pranavtbhat
Copy link
Contributor Author

filter turns out to be quite slow. I just modified the SparseMatrix findn to handle SparseVectors. This is significantly faster:

function find{Tv,Ti}(x::SparseVector{Tv,Ti})
    numnz = nnz(x)
    I = Array(Ti, numnz)

    count = 1
    @inbounds for i = 1 : numnz
        if x.nzval[i] != 0
            I[count] = x.nzind[i]
            count += 1
        end
    end

    count -= 1
    if numnz != count
        deleteat!(I, (count+1):numnz)
    end

    return I
end

@nalimilan
Copy link
Member

You're right that since we expect to find few zero values, filter will be less efficient than your custom version. Though I think you can write this much more simply by creating an empty vector I, calling sizehint!(I, numnz), and then using push! to add the indices of non-zero elements. That way you don't need count and the deleteat! step.

@pranavtbhat
Copy link
Contributor Author

Using the deleteat step was almost twice as fast as using the sizehint step. I'm not sure why. Also, since the findn method for SparseMatrices uses count and deleteat, I guess we can just stick to it for the sake of uniformity. I'll rebase and push soon.

@pranavtbhat
Copy link
Contributor Author

Should I add a separate test for this as well? (One that constructs a sparse vector using a nzval array containing zeros)

@nalimilan
Copy link
Member

That's fine then (even if it's a bit surprising to me), thanks for checking. Keeping the code consistent with sparse matrices is also a good idea to help maintaining it.

This code path should definitely be tested.

@ararslan
Copy link
Member

ararslan commented May 5, 2016

Is this addressed by the now-merged #16110?

@pranavtbhat
Copy link
Contributor Author

Yup I think this is addressed.

@pranavtbhat pranavtbhat closed this May 5, 2016
@tkelman
Copy link
Contributor

tkelman commented May 5, 2016

It would still be good to have a specialized implementation, since enumerate is likely quite slow on a sparse input

@tkelman
Copy link
Contributor

tkelman commented May 5, 2016

#16110 makes no difference here because these were already AbstractArrays.

@tkelman tkelman reopened this May 5, 2016
@ararslan
Copy link
Member

ararslan commented May 5, 2016

Ah right, sorry :|

@ViralBShah
Copy link
Member

Let's have findnz also. This implementation is significantly faster than the generic one.

@ViralBShah
Copy link
Member

@pranavtbhat and I were trying to bring this up to date, but some git snafu closed this. A new PR is coming.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants