slow inv(::Eigen) method #38335

Closed
JeffBezanson opened this issue Nov 6, 2020 · 13 comments
Labels: domain:linear algebra (Linear algebra), performance (Must go faster)

Comments

@JeffBezanson (Sponsor Member) commented Nov 6, 2020

inv of an Eigen seems to be no faster (maybe a bit slower) than inv of the original matrix. The code is

inv(A::Eigen) = A.vectors * inv(Diagonal(A.values)) / A.vectors

Should that be

inv(A::Eigen) = A.vectors * inv(Diagonal(A.values)) * A.vectors'

instead?

@JeffBezanson added the performance (Must go faster) and domain:linear algebra (Linear algebra) labels on Nov 6, 2020
@simeonschaub (Member)

Isn't that only valid if the original matrix is hermitian?

@oscardssmith (Member)

It shouldn't be; vectors is an orthogonal matrix.

@timholy (Sponsor Member) commented Nov 6, 2020

No, @simeonschaub is right. V*D*V' is guaranteed to be symmetric, so if you compute the eigendecomposition of a nonsymmetric matrix the decomposition is V*D*inv(V) (and V isn't orthogonal).

We should probably add a field to Eigen that records the result of ishermitian(A) and then use a branch to compute the inverse more efficiently when possible.
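For illustration, a minimal sketch of such a branch (the hermitian keyword is hypothetical and passed explicitly here, since the current Eigen type has no such field):

using LinearAlgebra

function inv_eigen(E::Eigen; hermitian::Bool = false)
    Dinv = Diagonal(inv.(E.values))
    # For a Hermitian input the eigenvector matrix is unitary, so the right
    # division by E.vectors can be replaced by multiplication with E.vectors'.
    return hermitian ? E.vectors * Dinv * E.vectors' : E.vectors * Dinv / E.vectors
end

A = Symmetric(randn(4, 4)) + 5I
@assert inv_eigen(eigen(A); hermitian = true) ≈ inv(A)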

@KlausC (Contributor) commented Nov 9, 2020

Please note that the initial formula inv(A) == E.vectors * D^-1 / E.vectors is wrong or numerically unstable if A is not hermitian and has multiple (repeated) eigenvalues, even if A is optimally conditioned.
The reason is that the condition number of E.vectors may be arbitrarily bad if A is not hermitian or symmetric.
So the original formula is in general not advisable.

Example:

julia> A = [0.0 1 0; 0 0 1; 0 0 0] + I
3×3 Matrix{Float64}:
 1.0  1.0  0.0
 0.0  1.0  1.0
 0.0  0.0  1.0

julia> E = eigen(A)
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
3-element Vector{Float64}:
 1.0
 1.0
 1.0
vectors:
3×3 Matrix{Float64}:
 1.0  -1.0           1.0
 0.0   2.22045e-16  -2.22045e-16
 0.0   0.0           4.93038e-32

julia> inv(A)
3×3 Matrix{Float64}:
 1.0  -1.0   1.0
 0.0   1.0  -1.0
 0.0   0.0   1.0

julia> E.vectors * inv(Diagonal(E.values)) / E.vectors
3×3 Matrix{Float64}:
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  0.0  1.0

julia> cond(A)
4.048917339522305

@KlausC (Contributor) commented Nov 12, 2020

Would it make sense to (optionally) extend the output gathered in Eigen for non-hermitian matrices, as provided by LAPACK.geevx?
That could include left eigenvectors and condition numbers.
Unitary/orthonormal eigenvectors are guaranteed for hermitian/real-symmetric matrices and exist for all normal matrices (A'A == A*A'). A stored condition number of 1.0 could indicate this case, and then the right division / V could be replaced by multiplication with V'.

For non-unitary vectors, this condition number (which is then > 1) would let the user decide whether V * f.(D) * V^-1 is numerically stable.

@KlausC (Contributor) commented Nov 14, 2020

For the record, in the general (non-symmetric) case:
If we evaluate the left eigenvectors evl besides the right ones evr and the eigenvalues in D (which can be achieved by LAPACK.geevx), we have for an extended set of functions (at least polynomials and analytic or meromorphic functions)

f(A) = evr * f(D) / evr = evr * f(D) * (evl / X)'
where
X = evr' * evl

X is essentially block diagonal with block sizes equal to the algebraic multiplicities of the eigenvalues, because v' * w == 0 if v and w are left and right eigenvectors for two different eigenvalues. In the most common case it is diagonal.
So it is possible to avoid the division by evr and replace it by multiplication with a slightly modified adjoint of evl.
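For illustration, a small self-contained check of this identity, under assumptions: a real matrix with distinct real eigenvalues (so that the default sorting of eigen aligns left and right eigenvectors), and evl obtained here from eigen(A') rather than from geevx:

using LinearAlgebra

A   = [3.0 1.0 0.0; 0.0 2.0 1.0; 0.0 0.0 1.0]   # distinct real eigenvalues 1, 2, 3
E   = eigen(A)                     # right eigenvectors and eigenvalues
evr = E.vectors
evl = eigen(copy(A')).vectors      # left eigenvectors of A are eigenvectors of A'
X   = evr' * evl                   # diagonal here (block diagonal in general)

Dinv = Diagonal(1 ./ E.values)
inv_via_evl = evr * Dinv * (evl / X)'   # avoids the division by evr
inv_via_evr = evr * Dinv / evr          # current formula

@assert inv_via_evl ≈ inv(A)
@assert inv_via_evr ≈ inv(A)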

@KlausC (Contributor) commented Nov 14, 2020

For the timing, some measured values:

julia> A = randn(1000, 1000); # A a random general square matrix.

julia> @btime eigen(A);
  2.349 s (26 allocations: 31.60 MiB)

julia> @btime eigen(A; jvl = true); # yielding left eigenvectors form geevx
  2.685 s (28 allocations: 54.49 MiB)

julia> @btime E.vectors * Diagonal(1 ./ E.values) / E.vectors; # the current method for inv(E)
  866.606 ms (18 allocations: 61.06 MiB)

julia> @btime E.vectors * Diagonal(1 ./ E.values) * E.vectorsl; # using multiplication with evl'
                                                                # (neglecting evaluation of X and division by X)
  334.301 ms (13 allocations: 30.53 MiB)

julia> @btime inv(A);
  525.948 ms (5 allocations: 8.13 MiB)

### the symmetric case
julia> A = (A' + A) / 2;

julia> E = @btime eigen(A);
  305.186 ms (16 allocations: 23.25 MiB)

julia> @btime E.vectors * Diagonal(1 ./ E.values) * E.vectors';
  28.650 ms (14 allocations: 15.27 MiB)

julia> @btime E.vectors * Diagonal(1 ./ E.values) / E.vectors;
  93.815 ms (18 allocations: 30.53 MiB)

julia> @btime inv(A);
  48.002 ms (5 allocations: 8.13 MiB)

julia> versioninfo()
Julia Version 1.6.0-DEV.1473
Commit 695a8db142* (2020-11-11 14:40 UTC)
Platform Info:
  OS: Linux (x86_64-redhat-linux)
  CPU: Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz

@stevengj (Member) commented Nov 16, 2020

Anyone working with a diagonalization of a very non-normal matrix should know that the eigenvectors are potentially a badly conditioned matrix, and that anything they might want to do with the Eigen factorization might be unstable as a result. I don't think we can or should protect them from that.

I thought that getting the left eigenvectors requires extra computation (they are equivalent to inv(X)), so I don't think we can store them by default. Does LAPACK provide the condition number of the eigenvectors for free, either?

(I'm not sure it's essential to provide a faster inv(::Eigen) in the Hermitian case; since in practice it's a bit bonkers to compute matrix inverses this way.)

@KlausC (Contributor) commented Nov 17, 2020

so I don't think we can store them by default.

From the measured figures you can see that for n = 1000 the extra evaluation of the left eigenvectors takes about 336 ms,
which is 14% more than calculating only the right ones. If we cannot afford that by default, we could make it optional (a kwarg of eigen).
If the left eigenvectors are available, we save about 532 ms by using them instead of the inverse of the right ones
when calculating the inverse. So if we need all right eigenvectors and also the inverse matrix, it can make sense to calculate it this way.

The same timing argument is valid in the hermitian/symmetric case.

Does LAPACK provide the condition number of the eigenvectors for free either?

Calculating condition numbers for the eigenvalues is an option of geevx. It requires the calculation of right and left eigenvectors, but needs no significant extra time beyond that. Those numbers are 1.0 for unitary eigenvectors.

Calculating condition numbers for the right eigenvectors is also an option of geevx, but it is extremely expensive (25 seconds for the example). Those numbers explode in the case of (close to) multiple eigenvalues and don't help in deciding unitarity.

in practice it's a bit bonkers to compute matrix inverses this way

I agree, if you don't need the eigendecomposition. But once you have it, it can save you time (28 ms vs. 48 ms in the symmetric case).

Having a flag which indicates that the eigenvectors are unitary would also allow implementing other kinds of matrix functions from Eigen, like f(A) = isunitary ? evr * f(D) * evr' : evr * f(D) / evr, more efficiently (see the sketch below). For badly conditioned evr that is not an option, of course.
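A compact sketch of that branch as a generic matrix-function helper; the isunitary keyword is hypothetical and passed explicitly here:

using LinearAlgebra

function eigfun(f, E::Eigen; isunitary::Bool = false)
    fD = Diagonal(f.(E.values))
    return isunitary ? E.vectors * fD * E.vectors' : E.vectors * fD / E.vectors
end

A = Symmetric(randn(4, 4))
@assert eigfun(exp, eigen(A); isunitary = true) ≈ exp(A)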

@stevengj (Member) commented Nov 17, 2020

If you already have the eigendecomposition of a Hermitian matrix and you are interested in saving time, you should generally avoid computing the matrix inverse in the first place. Anything you might want to do with the matrix inverse you can do directly with the eigenvectors and eigenvalues.
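A sketch of that point: reusing an existing eigendecomposition of a Hermitian matrix to solve A * x = b without ever forming inv(A); it relies on E.vectors being unitary.

using LinearAlgebra

A = Symmetric(randn(100, 100))
b = randn(100)
E = eigen(A)
x = E.vectors * ((E.vectors' * b) ./ E.values)   # same as A \ b, no explicit inverse
@assert A * x ≈ b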

That being said, it makes sense to me to store an isnormal flag in the Eigen type, since we know when the object is created whether the initial matrix was Hermitian or some related type — no need to try to infer this information from the geevx outputs. That information is potentially useful for lots of calculations.

@KlausC (Contributor) commented Nov 17, 2020

Anything you might want to do with the matrix inverse you can do directly with the eigenvectors and eigenvalues.

Yes, of course. That calls into question the existence of inv(::Eigen) in general. One reason for an eigendecomposition would be to be able to simplify further calculations by making use of a more convenient basis.

it makes sense to me to store an isnormal flag in the Eigen type

From the existing type information we can derive that the matrix is normal when A is Hermitian or Symmetric{<:Real}. There is no such information available from the matrix type for any other kind of normal matrix. Since the definition of "normal" is
A'A == A*A', that expression would need to be evaluated to fill in isnormal.

For the purpose of the spectral theorem (not only the inverse), it would be useful to know if the eigenvector matrix is unitary. This can be assumed for the output of the LAPACK functions for the hermitian/symmetric case, but not for normal matrices in general. (For multiple eigenvalues, geevx does not deliver unitary eigenvectors, although such do exist).
Therefore it would be necessary to derive this unitarity property from the output.
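One possible numerical check, sketched with a heuristically chosen tolerance:

using LinearAlgebra

isunitary(V; tol = 10 * size(V, 1) * eps(real(float(eltype(V))))) =
    opnorm(V' * V - I) ≤ tol

isunitary(eigen(Symmetric(randn(5, 5))).vectors)   # true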

The condition numbers for the eigenvalues, which can be delivered by geevx, are useful on their own in the general case, primarily for sensitivity/error analysis. I would not understand why we do not exploit the abilities of geevx, if the user can opt in.
The Eigen struct would be the proper place to keep this information.

@stevengj (Member)

That questions the existence of inv(::Eigen) in general.

No reason not to define it, since it is well-defined and easy to compute (assuming the diagonalization itself is well conditioned). But I'm not sure if it's worth optimizing too much.

I would not understand why we do not exploit the abilities of geevx, if the user can opt in.

The key point here is opting in. Left eigenvectors and condition numbers can be very useful, but I don't think we should compute them by default. It's not clear to me what the opt-in interface would look like.

@KlausC (Contributor) commented Nov 18, 2020

It's not clear to me what the opt-in interface would look like.

Additional fields and corresponding type parameters in Eigen, plus additional keyword(s) in eigen and eigen!? The unused fields would be populated with empty vectors/matrices. I am working on a PR that includes this.
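Purely for illustration, one hypothetical shape such an extended struct could take; none of the names below (EigenX, vectorsl, rconde) exist in LinearAlgebra:

struct EigenX{S<:AbstractMatrix, U<:AbstractVector, R<:AbstractVector}
    values::U
    vectors::S
    vectorsl::S   # left eigenvectors; empty matrix unless requested
    rconde::R     # eigenvalue condition numbers; empty vector unless requested
end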
