Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove (c)transpose definitions from operators.jl. Add them to string.jl #9170

Closed
wants to merge 1 commit into from

Conversation

andreasnoack
Copy link
Member

The definitions give silent errors for matrix types that don't define transpose whenever the *(A,B,C) method is called. See the example below.

julia> Qf = qrfact(randn(4,4));

julia> Q = full(Qf[:Q]);

julia> A = eye(4)
4x4 Array{Float64,2}:
 1.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0
 0.0  0.0  1.0  0.0
 0.0  0.0  0.0  1.0

julia> Q'*A*Q #Matrix-transpose * Matrix * Matrix, should be approximately A
4x4 Array{Float64,2}:
  1.0          -2.22045e-16  -1.11022e-16  1.11022e-16
 -1.11022e-16   1.0           1.66533e-16  5.55112e-17
  0.0           3.33067e-16   1.0          8.32667e-17
  8.32667e-17   5.55112e-17   5.55112e-17  1.0        

julia> Qf[:Q]'*A*Qf[:Q] #QRCompactWYQ-transpose * Matrix * QRCompactWYQ, should be approximately A but isn't
4x4 Array{Float64,2}:
 -0.171812    -0.967699   -0.114264  -0.144855
 -0.592102     0.220933   -0.754075  -0.178816
  0.787293    -0.0461806  -0.594673  -0.156207
 -0.00810461  -0.112316   -0.25432    0.960542

@jiahao
Copy link
Member

jiahao commented Nov 26, 2014

To clarify, special matrix types like QRCompactWYQ have nontrivial semantics for transpose, but the default transpose(::Any) no-op method silently misleads users into believing that everything is ok when in fact a new method definition is required and a MethodError should have been raised. Therefore, the current behavior is Very Bad. 😿

@Jutho
Copy link
Contributor

Jutho commented Nov 26, 2014

I remember a discussion whether (c)transpose of a matrix needs to call (c)transpose on its elements, e.g to represent a block matrix as a matrix of matrices. Currently, transpose of a matrix does indeed call transpose on the elements, so if transpose of Any is not defined, transposing a matrix with custom types will not work.

@andreasnoack
Copy link
Member Author

@Jutho I'm wondering how often you transpose a matrix of non-number or non-string elements. In any case, I think it would be okay to require that the behavior of transpose is defined per type subtree. Falling back on a noop transpose seems too dangerous compared to the benefits.

@StefanKarpinski
Copy link
Sponsor Member

Doing A' to transpose non-numeric arrays is quite common.

@andreasnoack
Copy link
Member Author

Would it cause too much inconvenience to require that users define the behavior of transpose for their type?

If it does, we could consider to restrict the recursive behavior of transpose to be restricted to matrices with matrix elements.

@andreasnoack
Copy link
Member Author

@Jutho and @StefanKarpinski, can I go ahead and merge this or are your comments blocking objections.

@StefanKarpinski
Copy link
Sponsor Member

I'm not sure what spurred this change. I recall convincing at least @JeffBezanson that it was ok for transpose and conj to default to identity on non-numeric types.

@andreasnoack
Copy link
Member Author

It can be hard to remember old comments. Even when is's your own, #7244 (comment). I wrote the lines that I want to remove now, but at least I saw the potential issue. The issues #7244 and #6395 seem to be the relevant references here. I really don't think that it is a good idea to have methods defined for Any when they are wrong for a some types. Now we have a specific example of the problem.

@mbauman
Copy link
Sponsor Member

mbauman commented Nov 30, 2014

Perhaps transpose(::AbstractArray) = error(...)?

@StefanKarpinski
Copy link
Sponsor Member

Perhaps I'm being dense but I'm not seeing what the problem with the current definitions is.

@andreasnoack
Copy link
Member Author

@mbauman The problem is that not all matrix-like types will be subtypes of AbstractMatrix

@StefanKarpinski Could you elaborate why returning a wrong result is not a problem, i.e. returning the identity when transposing a matrix? Wouldn't a few no method errors be better than wrong results?

@JeffBezanson
Copy link
Sponsor Member

The choices are

  1. no-method error when transposing a type that doesn't know about transpose at all.
  2. silently using the identity function on a type that should know about transpose but hasn't implemented it.

Now we prefer (2), since the type in question should just implement transpose. Otherwise nearly every type will need to implement it, or face annoying errors when transposing arrays of random things. @mbauman 's suggestion above sounds like a good compromise.

@andreasnoack
Copy link
Member Author

I was too probably too brief when I opened this issue because I didn't really think that two dimensional arrays of non-number and strings were transposed that often.

The special Q matrices don't implement transpose because these matrices shouldn't be transposed explicitly. I see the point that matrices could be expected to implement a transpose, but very often you would like only to define transpose multiplication with the matrix.

However, this is not the main problem here. As mentioned before, I really don't like general fallbacks for Any that silently give a wrong result for some types. I think it is bad solution on the problem which is that sometimes you'd like to flip the dimensions of a two dimensional array that cannot really be considered a mathematical matrix. I have two other solutions which I think will give all of us what we want.

  1. ctranspose and transpose are defined recursively (as introduced in Make (c)transpose work correctly for block matrices #7244) only when the elements are matrices. Then arbitrary types don't have to implement (c)transpose and we can remove the definitions for Any without users facing annoying errors.
  2. ctranspose is defined recursively and transpose is not. Then users who'd like to flip the dimensions of their two dimensional array of IOBuffers Strings and BitVectors should just use .' instead of ' which takes us back to the discussion in transposition operator for string arrays #6395. Here we can again remove the definitions in operators.jl without introducing error when transposing general two dimensional arrays.

@jiahao
Copy link
Member

jiahao commented Nov 30, 2014

Unfortunately @mbauman's suggestion will not work here currently would work but arguably should not. In #987 we decided that some matrix objects should not be AbstractArrays, because they are not containers with "cheap" element lookups. Consequently, the type hierarchy currently does nothing to help us distinguish between types for which transposition can be safely ignored as a no-op and types which require nontrivial transposition methods. Arguably the current choice of (2) is misleading should users want to implement the latter kind of type, since it gives the false sense of "just working for free" when in fact the wrong result is computed.

@jiahao
Copy link
Member

jiahao commented Nov 30, 2014

Maybe it's time to reconsider introducing the LinearOperator type? There are a couple of types in LinAlg that arguably should not be subtypes of AbstractArray but could be some kind of LinearOperator:

  • Compact representations of rotation matrices like: Givens, Rotation, QRCompactWYQ, HessenbergQ, QRPackedQ. Only the last three are still defined as <: AbstractArray.
  • More marginal, but still worth considering are compact representations of structured matrices like SymmetricRFP, TriangularRFP in rectangular full packed format as described in LAWNS 199.
  • the UniformScaling type (currently a standalone type).
  • SparseMatrixCSC (currently <: AbstractArray).

In #987 we did not resolve what "cheap" means. Consider:

  • Some of these types, like Givens and UniformScaling matrices, store very little information but it is still possible to do getindex in O(1) time for each matrix element. This is cheaper than our current sparse CSC format, where getindex is not strictly O(1) but we decided that we want SparseMatrixCSC <: AbstractArray anyway.
  • Other types like Rotation and QRCompactWYQ store a bunch of rotations and it is technically possible to do getindex in something like O(k^2) time, where k is the number of rotations encoded. In principle k has no relation whatsoever with the nominal dimension N.

@andreasnoack can correct me here, but I think "things you shouldn't explicitly transpose" could be another useful criterion for defining a LinearOperator beyond "does not have 'cheap' getindex".

@Jutho
Copy link
Contributor

Jutho commented Nov 30, 2014

The most useful definition for a LinearOperator (or maybe LinearMap, not necessarily mapping a vector to a vector of the same length) should be: 'efficient application on / multiplication with a vector'. I guess this would apply to all of the above types? Whether or not one can easily define a transpose for these objects really depends on the type (e.g. it's ok for a sparse matrix).

@andreasnoack
Copy link
Member Author

Maybe I am looking at this in the wrong perspective and any further explanation would be appreciated. I don't understand why it is okay to assume structure for Any. At least not when you can point to specific cases where it results in errors. I can add definitions for transpose to our base code, but we cannot enforce that users define transpose and our present solution will therefore result in errors for any new array type defined. Even in Base it could be difficult to cover all corner cases when no errors are thrown, e.g. we'll now also have this case

julia> A = sub(sparse([1 0;1 1]), :, :)
2x2 SubArray{Int64,2,Base.SparseMatrix.SparseMatrixCSC{Int64,Int64},(Colon,Colon),0}:
 1  0
 1  1

julia> transpose(A) == A
true

You generally seem to think of Any as scalars: one(Any) used to return 1 before I got away with changing that and transpose(x)=x is also treating Any as a scalar. Even though it might be convenient and often gives the right result, I don't think it is worth the costs of introducing wrong results in the linear algebra code.

@Jutho
Copy link
Contributor

Jutho commented Dec 5, 2014

I have no problem with seeing the recursive transpose call go away, even for the case where you have a matrix with matrices as elements. It is not guaranteed that the user transposing the outer matrix really intended to transpose all the individual elements, and currently there is no way to just transpose the outer matrix (unless via permutedims). If a user wants to transpose all the inner elements, he can still do something like transpose(map(transpose,A))). I just don't have time to make this modification right now, and will be unavailable all of next week.

@StefanKarpinski
Copy link
Sponsor Member

I wonder if we can improve this by separating the x' syntax from the ctranspose function.

@andreasnoack
Copy link
Member Author

@Jutho For linear algebra, I think that recursive transposing is the right thing to do and that using permutedims is the right thing for arrays that are not considered numerical matrices. However, I recognize the syntax for the former is more handy.

@StefanKarpinski Could you elaborate? I don't understand your comment.

@StefanKarpinski
Copy link
Sponsor Member

I'm not sure there was anything to my comment. I don't think what I was thinking of really helps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants