
1/x not the same as inv(x) for matrix in julia 0.2.0 #5807

Closed · freefrancisco opened this issue Feb 13, 2014 · 83 comments
Labels: domain:arrays · domain:linear algebra · kind:breaking (this change will break code) · needs decision (a decision on this change is needed)

Comments

@freefrancisco

I was trying to figure out how to invert a matrix in a 0.2.0 installation of julia, and I found that 1/x gives me a different result than inv(x).
Here is my output:

julia> x = rand(3, 3)
3x3 Array{Float64,2}:
 0.698453  0.500252  0.822943
 0.483441  0.078515  0.889188
 0.275469  0.341758  0.195685

julia> 1/x
3x3 Array{Float64,2}:
 1.43174   1.99899  1.21515
 2.06851  12.7364   1.12462
 3.63017   2.92605  5.11026

julia> inv(x)
Warning: Possible conflict in library symbol dtrtri_
Warning: Possible conflict in library symbol dgetri_
Warning: Possible conflict in library symbol dgetrf_
3x3 Array{Float64,2}:
  35.4312  -22.5164  -46.69
 -18.4623   11.0545   27.4109
 -17.6333   12.3904   22.9644

julia> versioninfo()
Julia Version 0.2.0
Commit 05c6461 (2013-11-16 23:44 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin12.5.0)
  WORD_SIZE: 64
  BLAS: libgfortblas
  LAPACK: liblapack
  LIBM: libopenlibm

@andreasnoack (Member)

Because 1 is a number, 1/x gives the reciprocal of each element of the matrix, whereas inv(x) is the matrix inverse.
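
A quick way to see the distinction (a hypothetical session, assuming the 0.2-era semantics described in this thread):

julia> x = rand(3, 3);

julia> 1/x == 1 ./ x                     # both are the elementwise reciprocal
true

julia> norm(inv(x)*x - eye(3)) < 1e-10   # inv(x) is the actual matrix inverse
true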

It is better to ask a question like this on the julia-users list.

@JeffBezanson (Member)

I wonder if that behavior is still a good idea. After all, a^-1 is also the matrix inverse.

@JeffBezanson reopened this Feb 13, 2014
@jiahao (Member) commented Feb 13, 2014

1/A computing the elementwise reciprocal of A is inconsistent with A/B computing A*inv(B). Why not make 1./A compute the elementwise reciprocal instead?

@freefrancisco (Author)

Also, there is at least one Julia tutorial that expects 1/A == A^-1:
https://www.statalgo.com/2012/04/15/statistics-with-julia-linear-algebra-with-lapack/

@jiahao (Member) commented Feb 13, 2014

It feels like 1.0 ./ A should be the elementwise reciprocal, eye()/A should be the matrix inverse, and 1.0/A should just be an error.

@jiahao (Member) commented Feb 13, 2014

That tutorial is quite outdated.

@andreasnoack (Member)

Agreed. I was too quick here. There is also this consequence:

julia> [1 1]\1
1x2 Array{Float64,2}:
 1.0  1.0

julia> [1 1]\[1]
2-element Array{Float64,1}:
 0.5
 0.5

Most linear algebra functions accept a Number, interpreting it as a Vector of length one or a 1x1 Matrix, so I guess the right thing for \(Matrix,Number) is to check whether the dimensions are correct.
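
A minimal sketch of such a check, in the 0.2-era method-definition style used later in this thread (the error message is illustrative only):

# Accept a Number right-hand side only when it is dimensionally
# valid, i.e. when the matrix has a single row.
Base.(:\)(A::Matrix, b::Number) =
    size(A, 1) == 1 ? A \ [b] : error("dimension mismatch")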

@StefanKarpinski (Member)

If we were to go down the road of making 1/A compute a matrix inverse, then we'd also want to change the behavior of a lot of other operations. For example, A + 1 should be equivalent to A + eye(size(A)) while A .+ 1 would be how you write the broadcasting sense. I've strongly considered this kind of change before, but it would be pretty disruptive. I think the only path to get there would be to deprecate the non-broadcasting forms for a fairly long time and then reintroduce the new "algebraically correct" behavior.

@JeffBezanson (Member)

I would seriously consider doing that. The case of + is a bit odd since I don't see that very often, but / and \ are used with matrices routinely and the 1/A change seems like the right thing.

@jiahao (Member) commented Feb 13, 2014

A + 1 (or in more conventional notation, A + λ*I) is actually a very important use case: A + λ*I is used in level shifts for all sorts of matrix operations, particularly ill-conditioned matrix eigenvalue problems; (A + λI)⁻¹ is a resolvent kernel for inverting linear operators in the spectral domain; and sqrt(A² + λ²I)⁻¹ and the like show up in regularization, to name just three use cases. Currently there is no more compact way to write A + λ*I than A + λ*eye(A), and that notation is wasteful: it initializes a second matrix only to scale it and add it to something else. So there are nice algebraic reasons why A + 1 should promote to A + eye(A).
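
For illustration, a sketch of the level-shift pattern (A + λI)⁻¹b written without allocating eye(A) (the helper name shift_solve is hypothetical; assumes A is square):

# Form A + λ*I by bumping the diagonal of a copy, then solve
# instead of inverting.
function shift_solve(A::Matrix, λ::Number, b::Vector)
    B = copy(A)
    for i = 1:size(B, 1)
        B[i, i] += λ
    end
    return B \ b    # (A + λI)⁻¹ b
end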

I'm not convinced, however, that this has to be conflated with 1/A computing the matrix inverse vs. the elementwise reciprocal.

@jiahao (Member) commented Feb 13, 2014

A few thoughts on the specific issue of matrix inverse notation though:

  1. The usual admonition is that people should not compute the matrix inverse A⁻¹ explicitly. There's really no need to make it easier to compute matrix inverses.
  2. However, we do use A/B as a synonym for A*inv(B) (or at least, we try to). So if we change 1/A to be the matrix inverse, then A/1 ought to be consistent with that as well.
  3. The real question is whether an expression containing both a matrix and 1 should have the 1 promote to eye(N) (the proposed behavior) or to ones(N,N) with elementwise operations (essentially the current behavior).

@JeffBezanson (Member)

I don't think scalar*matrix could do anything but scale the matrix, no matter how we interpret all the above arguments.

@jiahao (Member) commented Feb 13, 2014

Hm, maybe I wasn't being clear. My point is that there are plenty of contexts in which A + λ, for a matrix A and scalar λ, is a powerful synonym for A + λ*eye(A).

@jiahao (Member) commented Feb 13, 2014

...as opposed to the current behavior, in which A + λ behaves as a synonym for A + λ*ones(A) (and similarly in other such expressions involving matrices and scalars).

@StefanKarpinski (Member)

I would be very much in favor of that change. I can't imagine a single mathematician who would not prefer for A + λ to be an efficient and expressive way to modify the diagonal. The reason that and the 1/A thing should be coupled is that they are both about how we treat mathematical operations on a scalar and a matrix – should it broadcast or should it do the algebraically correct thing? What we're doing now is mathematically sketchy but at least it's consistent. If we changed one and not the other then we'd be neither consistent nor correct. I think that changing both is the best thing – then we'll be both consistent and correct.

@jiahao (Member) commented Feb 14, 2014

This proposal, by the way, is very much at its heart about whether a Matrix is a linear algebraic object or a two-dimensional container (cf. #987). The broadcasting behavior is consistent across all n-dimensional arrays, whereas this proposal would special-case behavior for n=2.

@mbauman (Member) commented Feb 14, 2014

I like the idea of making scalars behave like they're multiplied by I of the "correct size" when interacting with matrices through non-broadcasting operators. The fact that there's already operators designated to do element-wise broadcasting makes it a bit easier, too. Addition is well-defended above, and multiplication is free. Division is tricky, though: what's the correct size for I when dividing by non-square matrices?

If A is square, then eye(size(A)...)/A * A = I, no? My linear algebra is rusty, but isn't eye(size(A,2),size(A,1))/A the choice to make in general? I think that should behave like pinv(A).

I know I have tons of A + λ * eye(size(A)) littered through my Matlab code for state estimation and identification. I think this could be a great change.

@kmsquire (Member)

As an aside, the size(A) isn't necessary for ones or eye:

julia> ones(A) == ones(size(A))
true

julia> eye(A) == eye(size(A))
ERROR: no method eye((Int64,Int64))

julia> eye(A) == eye(size(A)...)
true

@jiahao (Member) commented Feb 14, 2014

@mbauman Tempting as it is to define 1/A as pinv(A) for rectangular matrices, pseudoinverses are left inverses but not right inverses (whereas a true inverse is both a left inverse and a right inverse). So I don't think we can reason consistently about 1/A for nonsquare matrices in a way that neglects order of operations.
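
To make the asymmetry concrete (hypothetical session; a random tall matrix has full column rank almost surely):

julia> A = rand(4, 2);

julia> norm(pinv(A)*A - eye(2)) < 1e-8   # pinv(A) is a left inverse
true

julia> norm(A*pinv(A) - eye(4)) < 1e-8   # A*pinv(A) is only a projection
false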

@jiahao (Member) commented Feb 14, 2014

I know I have tons of A + λ * eye(size(A)) littered through my Matlab code for state estimation

You could always define a custom +(A::Matrix, λ::Number) method in your code...

@mbauman (Member) commented Feb 14, 2014

Right, of course: since there is no true inverse for rectangular matrices, there is no universal solution here. Perhaps the answer is to punt and throw an error for rectangular matrices? I still like the consistency of (effectively) transforming the scalar λ into λI across all basic arithmetic operations with matrices.

(Thanks for the tips, Kevin and Jiahao - I'll be happy to ditch those Matlabisms in my Julia code)

@jiahao (Member) commented Feb 14, 2014

@kmsquire thanks, I've edited my comments.

@mbauman we wouldn't have to do anything new at the level of redefining 1/A, since if it calls inv(A), that would do the dimension check already.

@carlobaldassi (Member)

Also, speaking of customization, it probably wouldn't be too difficult to write a special matrix type which just keeps a scalar and represents a matrix proportional to the identity, of whatever size is needed, together with a constant I associated with the value 1. For example:

immutable IdentityMatrix{T<:Number}
    λ::T
end

const I = IdentityMatrix(1)

Base.(:*)(x::Number, M::IdentityMatrix) = IdentityMatrix(x*M.λ)
Base.(:+){T<:Number}(A::Matrix{T}, M::IdentityMatrix) = A + M.λ * eye(A)
Base.(:/){T<:Number}(M::IdentityMatrix, A::Matrix{T}) = M.λ * inv(A)

etc. (Of course + should be written more efficiently; see the sketch after the example below.)
With these definitions alone one already has:

julia> A = rand(2,2)
2x2 Array{Float64,2}:
 0.184807  0.557457
 0.342099  0.889467

julia> A + 2I
2x2 Array{Float64,2}:
 2.18481   0.557457
 0.342099  2.88947 

julia> I/A
2x2 Array{Float64,2}:
 -33.787   21.1754
  12.9948  -7.02  
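
For completeness, a sketch of the more efficient + mentioned above (same 0.2-era style; it copies A and bumps the diagonal instead of allocating eye(A)):

function Base.(:+){T<:Number}(A::Matrix{T}, M::IdentityMatrix)
    B = copy(A)
    for i = 1:min(size(B, 1), size(B, 2))
        B[i, i] += M.λ
    end
    return B
end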

@lindahua (Contributor)

A local variable I would hide the constant identity matrix I, so it probably won't break existing code (just as e is also used as a local variable or field in many places).

If we add this constant I, it will only be used in new or revised code, and in such cases I believe most people would avoid using I as a local variable whenever they want to use it as an identity matrix within the same local context.

@milktrader (Contributor)

@lindahua thanks for the excellent explanation

@stevengj (Member)

I like the IdentityMatrix idea too.

One could then define I[n,m] as a Kronecker delta function as well.

However, I don't think IdentityMatrix is the right name for the type, since it includes a scale factor. Maybe ScalingOperator (since it could be an operator on an arbitrary vector space, and in any case it is not a <: AbstractMatrix, since it doesn't have a size).
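
A sketch of the suggested Kronecker-delta indexing, building on the IdentityMatrix type defined above (illustrative only):

# I[n,m] is δ(n,m), scaled by λ.
Base.getindex(M::IdentityMatrix, i::Integer, j::Integer) = i == j ? M.λ : zero(M.λ)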

@carlobaldassi (Member)

I agree with @lindahua on everything, including the preference for deprecating α/I and forcing users to be explicit about that.
But I'm actually a bit concerned (not too much really, but still) about using I as a constant; it's true it would probably not break any code, but a quick grep through Julia base reveals it's very commonly used (480 hits), especially to indicate a group of indices. The best choice would probably be one of \U1D7D9 ('MATHEMATICAL DOUBLE-STRUCK DIGIT ONE') or \U1D540 ('MATHEMATICAL DOUBLE-STRUCK CAPITAL I'); maybe the (admittedly ugly) name II could be used to mimic that. Or, following im, it could be called ID. So in Base code we would use Unicode, but users could still use ASCII without too much hassle.

@nalimilan (Member)

@carlobaldassi Unfortunately, the standard notation for the identity matrix really seems to be I. Even though 1 is sometimes used, it is also used to designate a vector/matrix of ones, so that's not ideal. And the double-struck I does not seem to be used anywhere.

@StefanKarpinski (Member)

Using I seems fine to me since it won't break any existing code. The same name means different things in different contexts. That's just normal and not really a big deal at all.

@andreasnoack (Member)

Sorry for the long break. I can see that you have had a great party here.

I am not proposing the following behaviour for Julia, but just for completeness, here is Sage:

A=matrix([[1,1],[1,1]])

A+2
[3 1]
[1 3]

I think it would be better to add the IdentityMatrix.

Generally, I think it would be cleaner to let the . versions do the less-mathy elementwise operations and reserve +,-,*,/,^ for the math operations, i.e. make randn(2,2)+1 an error and point to .+. I don't see the benefit of sometimes having the non-dot version default to the dot version.

Finally, even though we might keep + and - defaulting to elementwise operations, I don't see why the fact that A^-1 == inv(A) should imply that 1/A == inv(A). I think the Number/Array methods should go away, except maybe for the case mentioned earlier, but that exception is not so important to me.

@JeffBezanson (Member)

I'd be fine with removing Number/Array.

@lsorber commented Feb 15, 2014

+1 for @andreasnoackjensen's arguments

@toivoh (Contributor) commented Feb 18, 2014

I want to put in my vote for letting matrix+scalar give an error, or be allowed only for square matrices where it would give matrix+scalar*I.

It was not always obvious that Julia should have both + and .+ operators. After all, + would typically just be a more restricted form of the broadcasting .+, so why not let + broadcast? The argument that finally swayed Stefan was that then * (matrix multiplication) distributes over + (non-broadcasting addition), while .* distributes over .+. This should definitely lead to fewer surprises.

Put another way, when you are writing linear algebra code using + and *, you really prefer an error when you try to add scalar+vector, scalar+matrix, or vector+matrix, rather than silently getting a broadcasted result. A lot of the strength of the + operator compared to .+ comes from the cases that it does not allow.

When it comes to being explicit about what operations do, I think that matrix+scalar working elementwise is really sneaky. If you want a broadcasting operation, be explicit about it and use .+. I think that a good way forward would be to deprecate matrix+scalar and make everyone use .+ instead.

If the distributive property should hold for matrix+scalar as well, it must evaluate to matrix+scalar*I, so that e.g.

matrix*vector + scalar*vector = (matrix+scalar)*vector = (matrix+scalar*I)*vector

But I think it would be ok if you have to put in the I yourself. Cases might come up where the above would be convenient, but if we deprecate matrix+scalar, that can always be discussed later on.

For 1/A where A is a matrix, I want to appeal to associativity. If we want that

B*1/A = (B*1)/A = B*(1/A)

then we need 1/A = inv(A), which I think is the right choice. This also goes in line with letting + - * / be linear algebra operators.

A lot of you have talked about consistency with other languages. I beg of you to consider the internal consistency of Julia as well. For this, we need to consider mathematical arguments. When one gains experience with numerical software, it is easy to be led to believe that it is by necessity much less consistent than mathematics, with corner cases that you have to watch out for everywhere you turn. Experiencing Julia has led me to believe that it does not have to be so; it's more that too many programming languages were designed without such considerations, often by people without sufficient mathematical background. I still hope that Julia can be different. I would argue that satisfying basic properties like distributivity and associativity is an important aspect of consistency, and generic programming thrives on consistency.

@andreasnoack (Member)

@toivoh Thanks for the comment. After reading it twice, I am convinced by your arguments, and also annoyed that I didn't see the associativity argument for inv(A)=1/A.

@Jutho (Contributor) commented Feb 25, 2014

It looks like a lot of the discussion is about the difference between interpreting an Array as just a multidimensional collection of data versus interpreting it as a mathematical object (vector, matrix, tensor = multilinear map).

My vote would also be for having number/matrix = number * inv(matrix), and for deprecating array + number and asking users to change their code to array .+ number, despite @lindahua's argument that this construction is often used in the technical computing community. I agree with @lindahua's argument for not having matrix + number add to the diagonal (cited below), but the same argument also rules out having matrix + number add the number everywhere. In fact, I cannot think of any algebraic construction that would ever want to use such a strange operation, but that's probably just my ignorance.

Citing @lindahua:
"From a pure algebraic standpoint, the Matrix Ring is formed of all n x n square matrices together with the addition and matrix multiplication defined thereon. Perhaps, I am missing something. I have yet to see an algebraic system (in terms of mathematical concept) that involves addition between matrix and scalar that exhibit the A + λI behaviour."

@lindahua (Contributor)

I don't have a problem deprecating Array + Number in favor of Array .+ Number. What I am against is quietly changing the behavior/semantics in a radical way.

If I were asked to design a system, I would prefer one like this (a usage sketch follows the list):

  • Use dot-prefixed operators .+, .-, .*, ./, .\, and .^ for element-wise operations.
  • Use +, -, *, \, and / for (linear) algebraic operations between arrays (considered as tensors).
  • A ^ n should be interpreted as the algebraic power, i.e. A * A * A ....
  • Allow λ * A and A / λ for scalar products (these are also well defined in linear algebra).
  • Adding scalars to diagonals should be written as A + λI; A + λ and A - λ are not allowed (due to potential confusion).
  • λ * inv(A) may be written as λI / A; λ / A is not allowed.
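
Under this proposal, and with an I object along the lines sketched earlier in the thread, usage would look something like this (hypothetical):

λ * A        # allowed: scalar product
A / λ        # allowed: scalar division
A + λ*I      # allowed: explicit diagonal shift
A .+ λ       # allowed: explicit elementwise addition
# A + λ      # error under the proposal
# λ / A      # error; write λI / A or λ * inv(A)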

@johnmyleswhite (Member)

+1 for that

@jiahao (Member) commented Feb 25, 2014

+1 in favor of @lindahua's proposal.

1/A as matrix inverse is cute, but more likely symptomatic of someone who doesn't quite know what they're doing. And people really should avoid computing inverses explicitly anyway.

@StefanKarpinski (Member)

Yes, that seems like a very sane approach. +1

@simleb commented Feb 25, 2014

+1

@andreasnoack (Member)

I support @lindahua's proposal, except that I think @toivoh's argument for allowing 1/A is convincing. Now that @jiahao raises the often-heard warning against the inverse, I would like to ask you about Druinsky and Toledo. I have also made a plot.

@jiahao (Member) commented Feb 25, 2014

You've caught me reciting scripture (* ̄m ̄)

It isn't so much that inv(A)*b is less accurate than A\b (that obviously depends on the actual algorithms used for inv and \), but rather that it is significantly slower in many practical use cases. Three in particular come to mind:

  1. For sparse matrices, computing inv(A) is often ill-advised, since the inverse usually has a very different sparsity pattern from A itself and is prone to fill-in. So even if A is sparse, inv(A) usually isn't.
  2. For singular matrices, you can gracefully overload \ to compute the least-squares solution A\b, which inv(A)*b won't give you; you'd need pinv(A)*b for that, which is significantly more expensive.
  3. For use cases with multiple right-hand sides b, it's faster to factorize once and solve with the factorization multiple times, as opposed to inverting the matrix once and computing multiple matvecs (see the sketch below). The factorizations usually involve special matrix forms for which the matvecs would be faster.
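
A sketch of point 3, assuming the 0.2-era lufact API:

F = lufact(A)    # one O(n³) factorization
x1 = F \ b1      # each subsequent solve reuses F at O(n²) cost
x2 = F \ b2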

You'll notice in your notebook that the spread of errors using A\b is narrower than for inv(A)*b. I think this is indicative of doing more flops in the latter than the former, so that you can have cancellation pushing the error lower, or compounding error pushing it higher. I don't think that is necessarily surprising.

I really can't think of a compelling use case for matrix inverses where the right thing to do is to express it as 1/A.

@kmsquire (Member)

You'll notice in your notebook that the spread of errors using A\b is narrower than for inv(A)*b

Actually, I think it's the other way around. LU errors are circles, INV errors are x's.

@jiahao (Member) commented Feb 26, 2014

Oops, fail.

I'm not sure why that would be the case. I notice that inv and \ both use the same LU factorization under the hood; is that right?

@toivoh (Contributor) commented Feb 26, 2014

+1 to @lindahua's suggestion; explicit is better than implicit. My argument for 1/A = inv(A) should perhaps even more be taken as an argument against elementwise division.

I think that we should consider what I really is, though. I would see it as an (identity) scaling operator, something more general than an identity matrix of unknown size.

@Jutho (Contributor) commented Feb 26, 2014

+1 for @lindahua's suggestion. I had no strong preference for 1/A being either inv(A) or just an error (but not elementwise division), and would definitely not use the 1/A notation myself.

@andreasnoack (Member)

I have updated pull request #5810 as a consequence of the discussion here.

@jiahao I understand that there are many situations where the \ version is preferred, but I think it is interesting that the broadly communicated warning about the numerical accuracy of inv(A)*b doesn't really hold. I have updated the plot with relative residual/backward errors. It explains the warning: inv(A)*b is not backward stable in b, but it should be the error of the solution that matters. One of the cited papers points out that the only place in Higham's 14 pages on matrix inversion where the forward error of inv(A)*b is mentioned is in an exercise, where you are asked to derive and interpret the forward error. That paper also gives examples where the use of inv(A)*b is faster because inv(A) is reused for changing b's. I can't tell whether their examples are reasonable or their timings fair.

@stevengj (Member)

inv(A)*b, given inv(A), requires asymptotically the same number of flops (2m² for an m×m matrix) as the two backsubstitutions in U \ L \ b, given the LU factors of A. The practical constant factor in the real execution time of the former is slightly better because matrix-vector multiplication has greater regularity than backsubstitution, I think, and hence is easier to optimize. However, the difference only matters if you have a huge number of right-hand sides (>> m), as otherwise the time to compute the inverse dominates. But I agree that it is mostly a question of performance, not accuracy.

(In practice, I suspect that inverting matrices for performance reasons is mostly seen for tiny matrices, e.g. 3×3. On the other hand, for pedagogical and exploration purposes, I compute inverse matrices of moderate-sized matrices all the time.)

@JeffBezanson (Member)

closed by #5810
