Division of a sparse matrix by row vector causes 0's to become NaN #23857

jsams · 2017-09-24T01:13:37Z

I think the title about says it all. This example creates a sparse CSC matrix with 7 stored entries, but then the division creates 12 stored entries, with the 5 extra being NaN. None of the column means are 0 or near 0, so this should not be happening.

julia> srand(2);
julia> X = sprand(3, 4, 0.6);
julia> Xmeandiv = X  ./ mean(X, 1)
3×4 SparseMatrixCSC{Float64,Int64} with 12 stored entries:
  [1, 1]  =  1.37231
  [2, 1]  =  1.20439
  [3, 1]  =  0.423301
  [1, 2]  =  1.91425
  [2, 2]  =  NaN
  [3, 2]  =  1.08575
  [1, 3]  =  NaN
  [2, 3]  =  3.0
  [3, 3]  =  NaN
  [1, 4]  =  3.0
  [2, 4]  =  NaN
  [3, 4]  =  NaN

division by a column vector doesn't create NaNs, but it does cause explicit 0s to be stored:

julia> X ./ mean(X, 2)
3×4 SparseMatrixCSC{Float64,Int64} with 12 stored entries:
  [1, 1]  =  1.90584
  [2, 1]  =  1.91323
  [3, 1]  =  1.50071
  [1, 2]  =  1.72613
  [2, 2]  =  0.0
  [3, 2]  =  2.49929
  [1, 3]  =  0.0
  [2, 3]  =  2.08677
  [3, 3]  =  0.0
  [1, 4]  =  0.36803
  [2, 4]  =  0.0
  [3, 4]  =  0.0

scalar division works as expected

julia> X ./ 3.3
3×4 SparseMatrixCSC{Float64,Int64} with 7 stored entries:
  [1, 1]  =  0.237995
  [2, 1]  =  0.208872
  [3, 1]  =  0.0734115
  [1, 2]  =  0.215552
  [3, 2]  =  0.12226
  [2, 3]  =  0.227818
  [1, 4]  =  0.0459582

(even if this was correct behavior, does it make sense to allow the result of an operation on a sparse matrix to return a dense matrix if the storage requirements will be lower or close to the same?)

julia> versioninfo()
Julia Version 0.6.0
Commit 9036443 (2017-06-19 13:05 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-7300U CPU @ 2.60GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Prescott)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)

The text was updated successfully, but these errors were encountered:

KlausC · 2017-11-25T15:09:43Z

minimal more concise cases to demonstrate the same effect:

julia> sparse([1; 0]) ./ [1]
2-element SparseVector{Float64,Int64} with 2 stored entries:
  [1]  =  1.0
  [2]  =  NaN

julia> sparse([1; 0]) ./ [1; 1]
2-element SparseVector{Float64,Int64} with 2 stored entries:
  [1]  =  1.0
  [2]  =  0.0

In the first case, the result of 0/0 is stored to all structural zero array locations, which is wrong.
In the second case, a stored zero is produced, where it could be structural, which is not optimal.

julia> X = sparse([1;1]);
julia> X[2,1] = 0;

julia> X
2-element SparseVector{Int64,Int64} with 2 stored entries:
  [1]  =  1
  [2]  =  0

julia> X ./ [1]
2-element SparseVector{Float64,Int64} with 2 stored entries:
  [1]  =  1.0
  [2]  =  0.0

This shows, that the wrong behaviour does not happen, it the zero is not structural.

KlausC · 2017-11-25T15:22:01Z

Another bug in the same realm, maybe even more severe, as this example shows:

julia> sparse([1;1]) .÷ [1]
ERROR: DivideError: integer division error
Stacktrace:
 [1] _diffshape_broadcast(::typeof(div), ::SparseVector{Int64,Int64}, ::SparseVector{Int64,Int64}) at ./sparse/higherorderfns.jl:126
 [2] broadcast(::typeof(div), ::SparseVector{Int64,Int64}, ::SparseVector{Int64,Int64}) at ./sparse/higherorderfns.jl:119
 [3] broadcast(::Function, ::SparseVector{Int64,Int64}, ::Array{Int64,1}) at ./broadcast.jl:455
 [4] top-level scope

The expression 0 ÷ 0 is evaluated, even if there is no need to divide by zero.
Unfortunately the code in sparse/higherorderfns.jl is too complex for a fast fix.

KlausC · 2017-11-25T15:30:34Z

Yet another surprise / bug:

julia> broadcast(+, sparse([1;0]), sparse([1]))
2-element SparseVector{Int64,Int64} with 2 stored entries:
  [1]  =  2
  [2]  =  1

julia> broadcast(+, sparse([1;0]), [1])
ERROR: DimensionMismatch("")
Stacktrace:
 [1] _binarymap(::typeof(+), ::SparseVector{Int64,Int64}, ::Array{Int64,1}, ::Int64) at ./sparse/sparsevector.jl:1320
 [2] broadcast(::typeof(+), ::SparseVector{Int64,Int64}, ::Array{Int64,1}) at ./sparse/sparsevector.jl:1383
 [3] top-level scope

Wonder why it works for / but not for + .

Sacha0 · 2017-11-25T16:26:44Z

Thanks for the additional examples! Sparse broadcast over combinations including dense objects indeed needs a good bit of attention after 1.0 :). Best!

KlausC · 2017-11-27T15:13:46Z

I just created a PR for the original issue: #24806

mbauman · 2018-04-23T23:05:16Z

Fixed by #24806.

jsams · 2018-04-25T22:10:57Z

There were two parts to this bug: The NaNs and the creation of, i believe the term is, 'non-structural zeros' (i.e. zeros stored in the array rather than being implied by the structure of the object). Based on a quick test of the lastest nightly, only the NaNs are fixed. But the NaNs, at least in some instances, have been replaced by these non-structural zeros.

As a user, I would not expect broadcast operations to add structural zeros when unnecessary. It might be reasonable to expect them to remove generated zeros, but I understand that can get iffy with floating point comparisons to zero. That said, it looks like non-structural zeros are being pruned in at least some circumstances. So that's nice, but the system isn't perfect. Here's an example set of tests:

using SparseArrays
using Test

@test size(sparse([1.0; 0.0]).nzval, 1) == 1
@test_broken size((sparse([1.0; 0.0]) ./ [1.0]).nzval, 1) == 1 # is 2, should be 1
@test size((sparse([1.0; 0.0]) .* [1.0]).nzval, 1) == 1 # expected
@test size((sparse([1.0; 0.0]) .+ [0.0]).nzval, 1) == 1 # expected
@test size((sparse([2.0; 1.0]) .- [1.0]).nzval, 1) == 1 # nice, pruning!

mbauman · 2018-04-25T22:17:18Z

I believe that is fairly well covered by #26561.

jsams · 2018-04-25T22:29:39Z

indeed it is! Sorry, I missed that issue.

andreasnoack added bug Indicates an unexpected problem or unintended behavior sparse Sparse arrays labels Sep 24, 2017

andreasnoack assigned Sacha0 Sep 24, 2017

KlausC mentioned this issue Nov 27, 2017

Wrong evaluation in sparse/higherorderfns.jl fix#23857 #24806

Merged

mbauman closed this as completed Apr 23, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Division of a sparse matrix by row vector causes 0's to become NaN #23857

Division of a sparse matrix by row vector causes 0's to become NaN #23857

jsams commented Sep 24, 2017

KlausC commented Nov 25, 2017

KlausC commented Nov 25, 2017

KlausC commented Nov 25, 2017 •

edited

Loading

Sacha0 commented Nov 25, 2017

KlausC commented Nov 27, 2017

mbauman commented Apr 23, 2018

jsams commented Apr 25, 2018

mbauman commented Apr 25, 2018

jsams commented Apr 25, 2018

Division of a sparse matrix by row vector causes 0's to become NaN #23857

Division of a sparse matrix by row vector causes 0's to become NaN #23857

Comments

jsams commented Sep 24, 2017

KlausC commented Nov 25, 2017

KlausC commented Nov 25, 2017

KlausC commented Nov 25, 2017 • edited Loading

Sacha0 commented Nov 25, 2017

KlausC commented Nov 27, 2017

mbauman commented Apr 23, 2018

jsams commented Apr 25, 2018

mbauman commented Apr 25, 2018

jsams commented Apr 25, 2018

KlausC commented Nov 25, 2017 •

edited

Loading