fix for np.count_nonzero not present in Numpy < 1.6
oddskool committed Nov 5, 2013
1 parent c6bae4c commit c7c6999
Showing 2 changed files with 5 additions and 4 deletions.
3 changes: 2 additions & 1 deletion benchmarks/bench_sparsify.py
@@ -45,14 +45,15 @@

  from scipy.sparse.csr import csr_matrix
  import numpy as np
+ from sklearn.utils.fixes import count_nonzero
  from sklearn.linear_model.stochastic_gradient import SGDRegressor
  from sklearn.metrics import r2_score

  np.random.seed(42)


  def sparsity_ratio(X):
-     return np.count_nonzero(X) / float(n_samples * n_features)
+     return count_nonzero(X) / float(n_samples * n_features)

  n_samples, n_features = 5000, 300
  X = np.random.randn(n_samples, n_features)
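For context, `np.count_nonzero` only appeared in NumPy 1.6, which is why the benchmark now imports `count_nonzero` from `sklearn.utils.fixes`. A compatibility shim of this kind can be sketched as follows (illustrative only, not the actual `fixes` implementation):

```python
import numpy as np

try:
    from numpy import count_nonzero
except ImportError:
    # NumPy < 1.6 has no count_nonzero; emulate it with a boolean sum.
    def count_nonzero(X):
        return int((np.asarray(X) != 0).sum())

# Behaves the same on either NumPy version:
n = count_nonzero(np.array([0.0, 1.5, 0.0, -2.0]))  # 2 non-zero entries
```

Callers can then import the name from one place and stay agnostic about which NumPy version is installed.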
6 changes: 3 additions & 3 deletions doc/modules/performance.rst
@@ -95,9 +95,9 @@ an optimized BLAS implementation.

  Here is sample code to test the sparsity of your input:

-     >>> import numpy as np
+     >>> from sklearn.utils.fixes import count_nonzero
      >>> def sparsity_ratio(X):
-     >>>    return 1.0 - np.count_nonzero(X) / float(X.shape[0] * X.shape[1])
+     ...     return 1.0 - count_nonzero(X) / float(X.shape[0] * X.shape[1])
      >>> print("input sparsity ratio:", sparsity_ratio(X))

  As a rule of thumb you can consider that if the sparsity ratio is greater
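Outside the doctest, the sample can be run as a self-contained script. Here `X` is a stand-in input matrix (the doctest leaves `X` undefined), and `np.count_nonzero` is used directly, which assumes NumPy >= 1.6:

```python
import numpy as np

def sparsity_ratio(X):
    # fraction of entries in X that are exactly zero
    return 1.0 - np.count_nonzero(X) / float(X.shape[0] * X.shape[1])

X = np.eye(4)  # identity matrix: 4 non-zeros out of 16 entries
print("input sparsity ratio:", sparsity_ratio(X))  # 0.75
```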
@@ -219,7 +219,7 @@ compromise between model compactness and prediction power. One can also
  further tune the ``l1_ratio`` parameter (in combination with the
  regularization strength ``alpha``) to control this tradeoff.

  A typical `benchmark <https://github.com/scikit-learn/scikit-learn/tree/master/benchmarks/bench_sparsify.py>`_
  on synthetic data yields a >30% decrease in latency when both the model and
  input are sparse (with 0.000024 and 0.027400 non-zero coefficients ratio
  respectively). Your mileage may vary depending on the sparsity and size of
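To illustrate the input side of "both the model and input are sparse", the following sketch (synthetic data; the threshold is chosen purely for illustration) measures the density of a mostly-zero array and converts it to the CSR format the benchmark feeds to the sparsified model:

```python
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.RandomState(42)
X = rng.randn(1000, 300)
X[X < 2.5] = 0.0  # zero out all but the far right tail (~0.6% of entries)

density = np.count_nonzero(X) / float(X.size)
X_csr = csr_matrix(X)  # stores only the non-zero entries
print("density:", density, "stored values:", X_csr.nnz)
```

Predicting from the CSR matrix lets the sparsified model skip the zero entries entirely, which is where the latency win comes from.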
