Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing fallback methods for vector spaces #43903

Closed
juliohm opened this issue Jan 23, 2022 · 29 comments · Fixed by #44564
Closed

Missing fallback methods for vector spaces #43903

juliohm opened this issue Jan 23, 2022 · 29 comments · Fixed by #44564
Labels

Comments

@juliohm
Copy link
Sponsor Contributor

juliohm commented Jan 23, 2022

Given a vector space, the division by a non-zero scalar can be defined in terms of the scalar product:

/(v::SomeVectorType, λ::Number) = inv(λ) * v # where * is the implemented scalar product

Can this fallback method be added to Base? Currently we need to implement the fallback for all new vector spaces.

@dkarrasch dkarrasch added the domain:linear algebra Linear algebra label Jan 24, 2022
@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 20, 2022

Another operation that needs a fallback:

-(u::SomeVectorType, v::SomeVectorType) = u + (-v)
-(v::SomeVectorType) = -1 * v

@dkarrasch can I submit a PR? What is the file where these methods should go?

@juliohm juliohm changed the title Fallback for division by scalar in vector spaces Missing fallback methods for vector spaces Feb 20, 2022
@dkarrasch
Copy link
Member

Actually, we do have fallbacks. We have

-(A::AbstractArray) = broadcast_preserving_zero_d(-, A)

i.e, apply - elementwise, and

for f in (:+, :-)
    @eval function ($f)(A::AbstractArray, B::AbstractArray)
        promote_shape(A, B) # check size compatibility
        broadcast_preserving_zero_d($f, A, B)
    end
end

i.e., take the difference elementwise. As for the division, we have

for f in (:/, :\, :*)
    if f !== :/
        @eval ($f)(A::Number, B::AbstractArray) = broadcast_preserving_zero_d($f, A, B)
    end
    if f !== :\
        @eval ($f)(A::AbstractArray, B::Number) = broadcast_preserving_zero_d($f, A, B)
    end
end

So, if your vector space operations should not fall back to broadcasting, then I guess this is exactly a case where you need to overload and specialize these operations for your own types? Your issue reminds me of https://github.com/Jutho/TensorKit.jl.

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 20, 2022

@dkarrasch consider the following vector space:

import Base: *, +

# functional data type
struct Fun
    data
end

# scalar multiplication
*::Number, f::Fun) = Fun* f.data)

# vector addition
+(f::Fun, g::Fun) = Fun(f.data + g.data)

By default I get the following errors:

julia> Fun([1,2,3]) - Fun([4,5,6])
ERROR: MethodError: no method matching -(::Fun, ::Fun)
Stacktrace:
 [1] top-level scope
   @ REPL[6]:1

julia> Fun([1,2,3]) / 2.0
ERROR: MethodError: no method matching /(::Fun, ::Float64)
Closest candidates are:
  /(::StridedArray{P}, ::Real) where P<:Dates.Period at ~/Desktop/julia-1.7.2/share/julia/stdlib/v1.7/Dates/src/deprecated.jl:44
  /(::Union{SparseArrays.SparseVector{Tv, Ti}, SubArray{Tv, 1, <:SparseArrays.AbstractSparseMatrixCSC{Tv, Ti}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, false}, SubArray{Tv, 1, <:SparseArrays.AbstractSparseVector{Tv, Ti}, Tuple{Base.Slice{Base.OneTo{Int64}}}, false}} where {Tv, Ti}, ::Number) at ~/Desktop/julia-1.7.2/share/julia/stdlib/v1.7/SparseArrays/src/sparsevector.jl:1476
  /(::LinearAlgebra.Bidiagonal, ::Number) at ~/Desktop/julia-1.7.2/share/julia/stdlib/v1.7/LinearAlgebra/src/bidiag.jl:375
  ...
Stacktrace:
 [1] top-level scope
   @ REPL[7]:1

@dkarrasch
Copy link
Member

Sure, but you're not subtyping to anything. Of course you need to teach Julia every single operation for Fun (whose data field is also unparametrized BTW). Or do you want to define -(x::Any) based on (-1)*x, where you silently assume that *(::Number, ::Any) is defined? I think we need a meaningful "upper type bound" that is strictly lower than Any. Can you make Fun <: AbstractArray? Then you need to define size and getindex and the rest should pretty much work out of the box.

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 20, 2022

@dkarrasch this is an example of an infinite dimensional vector space. The idea of working with coordinates (i.e. size, and getindex) shouldn't be enforced on new vector space types.

Or do you want to define -(x::Any) based on (-1)*x, where you silently assume that *(::Number, ::Any) is defined?

Wouldn't that make sense? What other definition could we assign to -(::Any)? The definition of *(::Number, ::Any) is not necessary since we already have it explicitly defined in new vector spaces for a specific type. In my example above, I explicitly defined *(::Number, ::Fun)

whose data field is also unparametrized BTW

Yep, that was a quick simple example to illustrate the point.

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 20, 2022

What I mean here is that in order to implement a new vector space as we do in math, we only need to define two operations. The rest follows from proofs. We can prove that -x = -1 * x where 1 is the unit in the field of scalars, and so on.

@dkarrasch
Copy link
Member

I understand. I guess we need others to chime in (@Jutho @MasonProtter @andreasnoack @ViralBShah @dlfivefifty come to mind). It feels strange to me to assume that Any is by default a vector space element. Your approach really reminds of the abstract generic approach taken in TensorKit.jl. I don't have strong feelings either way, it's just that Any is very very generic.

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 20, 2022

I sympathize with your concern, but can't think of a more natural fallback. Most of us are doing linear algebra 24/7 and this fallback would simplify things tremendously for teaching students and for writing more generic code.

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 22, 2022

If the generic fallback -(x::Any) is not acceptable (I think it should be ok and harmless to add), then we could have a new abstract type in LinearAlgebra to represent vectors as in vector spaces, not vectors as in 1-dimensional arrays of numbers.

Any update on this issue?

@MasonProtter
Copy link
Contributor

MasonProtter commented Feb 22, 2022

I think my opinion on this is that the lack of upper bounds makes me a big nervous, but on the other hand, one should ask

What else could something like -x or v / λ possibly mean than the meaning suggested here?

I struggle to imagine an alternative meaning that we wouldn't consider a 'pun'. For /, the docstring is very explicit that this is the exact meaning we assign to /:

help?> /
search: / //

  /(x, y)

  Right division operator: multiplication of x by the inverse of y on the right. Gives
  floating-point results for integer arguments.

I think given this, if someone wanted to make a method a::T / b::Number be something semantically different than a * inv(b), then they're not really respecting the intended meaning of the / function anyways and should define their own function with its own methods.

I feel similarly for -, which simply just claims to mean subtraction, and subtraction is generally defined as the inverse of addition, so having a fallback for subtraction that uses addition with the additive inverse makes sense to me.

The more I think about this, the more I think my only real hesitance is fear of additional method ambiguities.

@MasonProtter
Copy link
Contributor

Generally speaking though, I think this sort of thing is why we want proper support for traits. Ultimately, we have bundled up all these math methods on the premise that if something is structurally a number or array, then it obeys the rules of a vector space, but really what we want is to be able to say

"I have a type, and this type obeys the rules of vector space arithmetic" without forcing the user to subtype AbstractArray or Number, which carry a ton of additional, inappropriate baggage for certain use cases.

@DilumAluthge
Copy link
Member

It feels strange to me to assume that Any is by default a vector space element.

I agree with this. Why should Julia make such an assumption? If you're defining a new vector space element, you should say so explicitly, either by subtyping or traits.

@MasonProtter
Copy link
Contributor

I would say it's not about Any at all. It's about the - and / functions who explicitly say in their docstrings that they are to be defined in this sense.

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 22, 2022

without forcing the user to subtype AbstractArray or Number, which carry a ton of additional, inappropriate baggage for certain use cases.

Exactly. That is the crux of the issue.

The proposed fallback methods for / and - are basically the definition of these operations for any algebraic structure.

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 22, 2022

However, I feel that the docstring of /:

multiplication of x by the inverse of y on the right

could be generalized to:

multiplication of the inverse of y by x on the right

I would say that the more general case also flips the order of the arguments so that we only rely on *(::Number, ::Any) and never on (::Any, ::Number). That is:

/(x, y) = inv(y) * x

In the above expression we assume that x is a vector (not necessarily an array) in a vector space and y is a scalar.

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 23, 2022

I added a new tutorial with the example I had in mind. The Pluto notebook has a hidden cell with the fallback methods implemented manually: https://www.youtube.com/watch?v=aQSlPDzqGPY

It would be nice to remove this hidden cell at some point in the future.

@dkarrasch
Copy link
Member

One possible way forward is to add those generic methods in a PR and let nanosoldier run, to see if any issues in the package ecosystem occur. If they don't, then indeed why not make those definitions.

@Jutho
Copy link
Contributor

Jutho commented Feb 23, 2022

@dkarrasch, I don't think TensorKit.jl is relevant here, but maybe KrylovKit.jl is. Here, I also want to support general user types that have the mathematical interpretation of being vectors in a vector space, rather than restricting to AbstractVector or AbstractArray as many other packages seem to be doing (even major packages as IterativeSolvers.jl, Optim.jl, DifferentialEquations.jl, ...). I find it somewhat unfortunate and discomforting that in a flexible and generic language as Julia, people tend to take over the habits from older languages and require things to be AbstractArray when this is not strictly necessary.

As stated by others, a common supertype cannot solve this. I think I learned early on when I started using Julia that the type system is not meant to express mathematical structures (as this would anyway be impossible without being able to have multiple supertypes). A trait system could help here, but for now there is only the possibility of having an informal interface (as e.g. with the iterator interface), i.e. requiring that a number of methods are implemented.

I thus fully agree that it would be useful to have an agreed upon specification of a vector interface in the Julia base library. In KrylovKit.jl, the methods that I request are documented in point 2 of: https://jutho.github.io/KrylovKit.jl/stable/#Package-features-and-alternatives

So for example, in the KrylovKit code, I will make sure to never write -v but rather (-1)*v.

Note that, on top of the minimal set of methods that appear in the mathematical definition of vectors (vector addition and multiplying with scalars), it is in practice useful to have in-place methods to do these operations. If I were to propose what such a minimal vector interface would need to have, it would be the following:

  • vector addition: v + w
  • scalar multiplication: a * v and v * a (for a vector space, where scalars belong to a field, only one of the two would be necessary, but I guess we want to support non-commutative scalars, and thus the weaker definition of a left or right module instead of a vector space)
  • in-place vector addition, where you can take a linear combination. Here, there are the BLAS methods axpy! and axpby!, but it would be nice to have a less cryptic name like add!: so add!(w, v, a = 1, b = 1) would have the effect of replacing w with b w + a v. Here, of course, it's already harder to deal with non-commutative scalars.
  • in-place scalar multiplication. I liked the method scale! that was around in pre julia-1 times, but that was removed in favor of rmul!, lmul! and mul!. I think the concept of scalar multiplication of a vector is very different than adding an actual multiplication rule to a vector space (turning it into an algebra, as e.g. matrices), and so I don't particularly like that these concepts have become conflated.

On top of those, it is probably useful to be able to extract the scalar type that is used by those vectors. From the mathematical perspective, that's probably real numbers or complex numbers (or more exotic), but in practice you wan't to know which specific implementation (e.g. Float32 or Float64). For Arrays, this corresponds to eltype, but eltype in Julia is supposed to return the type if you index into that variable, or if it does not support indexing, if you iterate over it. But this can be very different from the actual scalar type. Nested arrays would be a primary example here. In KrylovKit, I don't rely on a scalartype method, though this would make things more useful. Instead, I use implicit tricks like computing the inner product of two vectors to extract the scalar type, which is clearly suboptimal.

Then, finally, a method to create a similar vector, possibly with a different (promoted) scalar type. Something analogous to similar(v::Array, eltype::T) but which works with the aforementioned scalar type, rather than eltype. In KrylovKit.jl, this is currently solved by letting the out-of-place multiplication handle allocation and scalar type promotion. So if I have a vector v and I want to create a similar vector with a scalar type T, I would write one(T) * v or zero(T) * v. That is again suboptimal, as in fact the resulting scalar type can be different, namely promote_type(T, scalartype(v)). It's only because I already know that this yields T because of how T was determined that this works in my case.

Optionally, there would be dot (I would have liked inner) and norm functions for vectors in an inner product space. Not all vectors need to have those, but for those where they can be defined, I think there is not much debate about how the interface should look like.

@dlfivefifty
Copy link
Contributor

/(x, y) = inv(y) * x

is wrong:

julia> A/B
3×3 Matrix{Float64}:
 0.834777  0.328324  -0.335083
 0.538943  0.151829  -0.952223
 0.785972  2.19402   -0.826465

julia> A*inv(B)
3×3 Matrix{Float64}:
 0.834777  0.328324  -0.335083
 0.538943  0.151829  -0.952223
 0.785972  2.19402   -0.826465

julia> inv(B)*A
3×3 Matrix{Float64}:
  0.229277   0.440463  -1.20529
 -1.29909   -0.948649  -0.213654
  1.20497    1.4632     0.879513

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 24, 2022

Yes, perhaps we need a distinction between these two cases, or we can simply ignore the / for vector spaces. The main concern is the fallback for - at the moment.

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 28, 2022

Is it reasonable to consider the fallback for - in Julia v1.8?

-(u, v) = u + (-v)
-(v) = -1 * v

I fully agree @Jutho that we need a better treatment of spaces in the future, probably as a package with a well-defined API for the field of scalars, etc. But these fallbacks would solve a lot of headache already.

@N5N3
Copy link
Member

N5N3 commented Feb 28, 2022

Is there a strong reason to put them in Base? *// are generic functions, user can map anything to them with their own types.

Currently we need to implement the fallback for all new vector spaces.

Limiting these transform to, e.g. AbstractVectorSpaces, seems more reasonable?

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 28, 2022 via email

@dlfivefifty
Copy link
Contributor

You probably mean: (-one(v)) * v but if typeof(one(v)) == typeof(v) this introduces an infinite call.

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Feb 28, 2022 via email

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Mar 2, 2022

@dkarrasch where we could add these fallbacks for -? I am happy to submit a PR to check the new method in the existing test sets.

@juliohm
Copy link
Sponsor Contributor Author

juliohm commented Mar 9, 2022

@dkarrasch I am happy to submit a PR with the fallback for -. I really wish it could make it to Julia v1.8.

@KristofferC
Copy link
Sponsor Member

I really wish it could make it to Julia v1.8.

The feature freeze for 1.8 was a long time ago so that is not possible.

@ParadaCarleton
Copy link
Contributor

ParadaCarleton commented Mar 23, 2022

Another operation that needs a fallback:

-(u::SomeVectorType, v::SomeVectorType) = u + (-v)
-(v::SomeVectorType) = -1 * v

@dkarrasch can I submit a PR? What is the file where these methods should go?

I'd like to add that this fallback would have saved us from a bug in Distributions.jl that we just discovered, where we forgot to define a method for - being called on multivariate distributions. We have to define - by hand for many different subtypes (reals, arrays, ...) in a way that makes it easy to miss a particular combination.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants