Ellipsis slicing #5405

Open
mschauer opened this issue Jan 15, 2014 · 29 comments
Labels
domain:arrays, status:help wanted

Comments

@mschauer
Contributor

NumPy has an ellipsis notation for filling index tuples up to the array's dimension. In
A[..., i, k]
the ... is shorthand for as many colons (:, :, : etc.) as are needed to extend
the indexing tuple to dimension n.
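
To make the fill rule concrete, here is a minimal Python sketch of how an index tuple containing one ellipsis could be expanded into explicit colons. The helper name `expand_ellipsis` is made up for illustration and is not NumPy's actual implementation; special entries such as `newaxis` are ignored.

```python
def expand_ellipsis(index, ndim):
    """Replace a single Ellipsis in `index` with full slices so that the
    result has exactly `ndim` entries (hypothetical helper, not NumPy code)."""
    if not isinstance(index, tuple):
        index = (index,)
    # Locate ellipses by identity so arbitrary index objects are never compared.
    positions = [i for i, item in enumerate(index) if item is Ellipsis]
    if len(positions) > 1:
        raise IndexError("at most one ellipsis is allowed")
    explicit = len(index) - len(positions)
    if explicit > ndim:
        raise IndexError("too many indices")
    fill = (slice(None),) * (ndim - explicit)
    if not positions:
        # Without an ellipsis, missing trailing dimensions act as full slices.
        return index + fill
    pos = positions[0]
    return index[:pos] + fill + index[pos + 1:]


# A[..., i, k] on a 5-dimensional array becomes A[:, :, :, i, k].
expanded = expand_ellipsis((Ellipsis, 1, 2), 5)
```

The ellipsis may expand to zero colons, so `expand_ellipsis((Ellipsis, 0), 1)` gives just `(0,)`.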

https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html

This gives convenient access to contiguous subarrays in the dense array case and would be no more harmful than the : notation itself.

Was this feature already considered? It was mentioned by @toivoh on the mailing list, but did not trigger discussion.

@malmaud
Contributor

malmaud commented Jan 15, 2014

+1. I find myself using this a lot in numpy. I also like the numpy newaxis feature, which seems useful and harmless.

@toivoh
Contributor

toivoh commented Jan 16, 2014

+1 for newaxis and something that works like ... in numpy (I'm not sure if
it's appropriate to actually use ... for this purpose, since its meaning
right now is entirely different)

@EyeOfPython

In Python 3.x ... (or Ellipsis) is the singleton instance of type ellipsis:

Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:06:53) [MSC v.1600 64 bit (AMD64)] on win32
>>> ...
Ellipsis
>>> str(...)
'Ellipsis'
>>> type(...)
<class 'ellipsis'>
>>> type(...)()
Ellipsis
>>> type(...)() is type(...)()
True

You can use it as a prettier alternative to None, if you want.

In Python 2.7, however, you cannot just write ... as that results in a syntax error (but Ellipsis is possible).
Nevertheless it is possible to use it in __getitem__:

Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win32
>>> ...
  File "<stdin>", line 1
    ...
    ^
SyntaxError: invalid syntax
>>> class K:
...   def __getitem__(self, o): print(o)
...
>>> K()[...]
Ellipsis
>>> Ellipsis
Ellipsis
>>> type(Ellipsis)
<type 'ellipsis'>
>>> type(Ellipsis)() is type(Ellipsis)()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create 'ellipsis' instances
>>>

In Julia, we have to decide whether to adopt one of these two behaviours or to not allow ... at all, which would be sad.

@timholy
Sponsor Member

timholy commented Apr 23, 2015

In a sense this is already here, although at the scalar level rather than the vectorized level:

R = CartesianRange(size(A)[1:end-1])
for j = 1:size(A,ndims(A))
    for i in R
        # do something with A[i,j]
    end
end

In fact it's even better than that, because you can do this with "middle indexes," see #10814, which is presumably impossible with ellipsis indexing.

There are still some challenges with regard to type stability, at least in the context of the example in #10814. But for this example, I suspect it would be OK (I haven't tested).

@mbauman
Sponsor Member

mbauman commented Apr 23, 2015

If we do decide to add this, it'd be good to try it out first before adding syntax. We could experiment with allowing indexing with CartesianRange objects directly.

My hunch, though, is that this isn't needed nearly as much as it is in NumPy since we don't need to bend over backwards to vectorize things for performance. In fact, we do the opposite...

@kmsquire
Member

I don't have an opinion on the syntax, but it has been pointed out before
that sometimes vectorized versions of algorithms are easier to read and
understand (and that we want these to be fast as well).

(No bending over backwards, though...!)

@malmaud
Contributor

malmaud commented Apr 23, 2015

@timholy Does that example actually work? A[j, i] works, but it looks like A[i, j] (e.g., getindex(::Array, ::CartesianIndex, ::Int)) isn't implemented in master.

@timholy
Sponsor Member

timholy commented Apr 23, 2015

Aha! Yes, you're right that not all possible combinations have been implemented, for reasons described in #10814. Would be easy to add. The right way to do it is through #10525, though.

@mbauman
Sponsor Member

mbauman commented Apr 23, 2015

Indeed. Indexing with any combination of I::Union(Real,AbstractArray,Colon,CartesianIndex)... is already implemented there.

@mschauer
Contributor Author

Writing A[..., i] is very natural if A is a collection of, say, points or small matrices. In particular, expressions like x[..., j] = Phi*x[..., j-1] + b mean the same thing for A::Matrix and A::Vector{Point}, so code such as an Euler scheme is oblivious to the container type and defaults to a very fast operation in the latter case.

@mschauer
Contributor Author

mschauer commented May 6, 2015

For vectorized access, a not very intrusive approach is to define

getindex{T}(A::AbstractArray{T,1}, ::Type{Val{:...}}, i) = A[i]
getindex{T}(A::AbstractArray{T,2}, ::Type{Val{:...}}, i) = A[:,  i]

and friends. This does not need any additional infrastructure.

@ChrisRackauckas
Member

ChrisRackauckas commented May 28, 2016

Yes, as @mschauer mentioned, this isn't about forcing vectorization, but rather it is the natural thing to do in many cases. This allows you to index 1,...,n arbitrary size tensors, and because it's in a standard array it's very performant (and can make clean code in conjunction with the f.() syntax). Your fix there is a good stand-in until something more general comes along.

@ChrisRackauckas
Member

@mbauman contributed a great solution to EllipsisNotation.jl which essentially solves this problem.

https://github.com/ChrisRackauckas/EllipsisNotation.jl

It handles cases like A[5,..,5], A[..,5], etc. It's very robust and (now) it infers well!

I for one think it would be nice to have this in Base so it could be standardized as part of Julia notation. It would also solve the #24069 problem of wanting a type for all indices that keeps shape.

Since the implementation is all there, I think the main question would be notation, since the interval arithmetic people seem to want .. as well.

@mschauer
Contributor Author

If this gets into Base, then it should use A[i, ..., j]. Test: "I could not think of anything else that syntax could mean."

@StefanKarpinski
Sponsor Member

... is getting pretty overloaded, but this seems like a fairly natural meaning.

@mbauman
Sponsor Member

mbauman commented Oct 10, 2017

Amusingly :: parsed as an identifier in this context up through 0.6. It's also overloaded, though.

We could use a word like colons, which has the advantage of also having a natural extension to a fixed number of colons with colons(N). That'd give us a complete replacement for slicedim.

@ChrisRackauckas
Member

I like the idea of sometime in the future having colons(N), since then you'd be able to have more than one ... and still parse it.
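
As a rough Python analogy of the colons(N) idea (the names `colons` and `build_index` are hypothetical and not an existing API in Julia or NumPy): because each run states its own length, several runs can appear in one index without any ambiguity about how many dimensions each one covers.

```python
def colons(n):
    """Hypothetical helper: a run of exactly n full slices."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return (slice(None),) * n


def build_index(*parts):
    """Concatenate scalar indices and colons(n) runs into one flat index tuple."""
    index = ()
    for part in parts:
        index += part if isinstance(part, tuple) else (part,)
    return index


# Two separate runs of colons in one index, equivalent to [:, :, 3, :, 7].
index = build_index(colons(2), 3, colons(1), 7)
```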

@StefanKarpinski
Sponsor Member

I prefer ... for this anyway – at least the "and the rest" sense is fitting. The other sense of :: as a type annotation does not dovetail with the "many colons" meaning here, imo.

@AzamatB
Contributor

AzamatB commented Sep 6, 2018

I would like to propose the Unicode symbol … for this. It has the advantage of being succinct (only 1 character), yet preserves all the good perks of ....

@tianrluo

tianrluo commented Aug 7, 2019

Hopefully this is still on the "future features" check-list.
This kind of indexing could return slices of unknown-dimensional arrays,
and the returned slices would be shape-ready for broadcasting operations on the original array.

Related Discourse discussion.
And a workaround repo pointed out by @crstnbr (sorry if this @ bothers you).

@mbauman
Sponsor Member

mbauman commented Aug 7, 2019

I tried to make this possible in #24091, but that approach wasn't welcomed. Instead, someone would need to add this as a special form in the [] indexing syntax — adding both parser and lowering support. We could then use the implementation from EllipsisNotation.jl — it's a really small, efficient, and great module.

I'm not the biggest fan of adding a special case that works in A[...] but not view(A, ...), but if enough folks want this then I suppose I can get on board.

(I hope you don't mind I took the liberty of fixing your link).

@mschauer
Contributor Author

mschauer commented Aug 8, 2019

There is not much discussion in #24091, but well, I can trust your judgement "not gonna happen". A pity in my opinion.

@Cvikli

Cvikli commented Jul 7, 2020

Well, if we can vote for features, then I would say +1 for this syntax. It would be nice and clean if it were built into the language.
What exactly is the problem with this feature? Does it come with a performance penalty?

@mcabbott
Contributor

mcabbott commented Jul 7, 2020

Note BTW that the objection above that .. is also wanted for intervals is now solved, in that IntervalSets.jl uses the definition from EllipsisNotation.jl.

@ChrisRackauckas
Member

ChrisRackauckas commented May 18, 2022

> Note BTW that the objection above that .. is also wanted for intervals is now solved, in that IntervalSets.jl uses the definition from EllipsisNotation.jl.

And it would be fine too if it's defined in Base.

But https://github.com/ChrisRackauckas/EllipsisNotation.jl is pretty clean and stable at this point. IMO it's about time for it to move to Base and be a standard part of Julia syntax. We can bikeshed the symbol for it, I don't care too much about it, but I think the implementation is correct enough now to be stable enough for Base.

@mcabbott
Contributor

One argument for doing this at a lower level than EllipsisNotation is that it could avoid this problem (which is #35681):

julia> Meta.@lower A[.., end]
:($(Expr(:thunk, CodeInfo(
    @ none within `top-level scope`
1 ─ %1 = Base.lastindex(A, 2)   # always 2, even if A has 4 dimensions
│   %2 = Base.getindex(A, .., %1)
└──      return %2
))))
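
The bookkeeping a lower-level implementation would need can be sketched in Python (used here only because it is easy to run; the helper `target_dims` is hypothetical and not part of Julia or EllipsisNotation.jl). It maps each explicit index to the 1-based array dimension it addresses once the ellipsis is expanded, which is the dimension that `end` would have to refer to.

```python
def target_dims(index, ndim):
    """For each non-ellipsis entry of `index`, return the 1-based array
    dimension it addresses in an `ndim`-dimensional array (hypothetical helper)."""
    positions = [i for i, item in enumerate(index) if item is Ellipsis]
    if len(positions) > 1:
        raise IndexError("at most one ellipsis is allowed")
    explicit = len(index) - len(positions)
    if explicit > ndim:
        raise IndexError("too many indices")
    if not positions:
        return [i + 1 for i in range(len(index))]
    pos = positions[0]
    absorbed = ndim - explicit  # dimensions swallowed by the ellipsis
    dims = []
    for i, item in enumerate(index):
        if item is Ellipsis:
            continue
        # Entries before the ellipsis keep their position; entries after it
        # are shifted by the number of dimensions the ellipsis absorbed.
        dims.append(i + 1 if i < pos else i + absorbed)
    return dims


# In A[.., end] with a 4-dimensional A, `end` sits in slot 2 but addresses dimension 4.
dims = target_dims((Ellipsis, "end"), 4)
```

A purely syntactic rule that uses the slot position gives 2 here, which matches the lowered code above; the correct dimension depends on `ndim`, which is only known once the array is.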

> IntervalSets.jl uses the definition from EllipsisNotation.jl

This was later reversed, so that IntervalSets.jl does not have to load the entire ArrayInterface.jl.

@StefanKarpinski
Sponsor Member

This seems like a useful feature to me, I think it would just need someone to implement it.

@ChrisRackauckas
Member

What part needs to be implemented? Or do you mean just copy-paste into Base?

@StefanKarpinski
Sponsor Member

Making A[..., i] an actual syntax.

@LilithHafner added the status:help wanted label Oct 6, 2023