Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support sorting iterators #46104

Merged
merged 27 commits into from
May 30, 2023
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
c762cbd
widen sort's type signature
Jul 19, 2022
a00e018
fixup
Jul 19, 2022
d9316f8
remove broken cross-ref to Base.copymutable
Jul 19, 2022
54db6a6
Style
LilithHafner Jul 19, 2022
b046e7a
better error messages for `sort(4)` and `sort('c')`
Jul 20, 2022
8e94705
throw on string
Jul 30, 2022
15b2a02
Merge branch 'master' into sort-iter
LilithHafner Aug 11, 2022
fe2ce55
Add tests for error messages and drop iffy refference
Aug 11, 2022
e4cfffc
Update base/sort.jl
LilithHafner Sep 30, 2022
7c14f9d
Remove broken suggestion
LilithHafner Sep 30, 2022
36ff056
Throw on infinite iterator
LilithHafner Sep 30, 2022
f32de07
Merge branch 'master' into sort-iter
LilithHafner Oct 10, 2022
1280aea
Merge branch 'master' into sort-iter
LilithHafner Oct 15, 2022
d9c8a7d
Merge branch 'master' into sort-iter
LilithHafner Nov 25, 2022
449f9ff
Update compat entry to 1.10
LilithHafner Nov 25, 2022
60e42f5
Merge branch 'master' into sort-iter
LilithHafner Dec 3, 2022
55d5e7e
Merge branch 'master' into sort-iter
LilithHafner Feb 11, 2023
f5ad11a
make sort(::NTuple) return a tuple
Feb 17, 2023
55c8efe
Merge branch 'master' into sort-iter
LilithHafner Feb 17, 2023
ac32a38
Fallback to vector sorting for large tuples
Feb 17, 2023
264836d
resolve method ambiguity
Feb 18, 2023
0c53911
drop threshold to 9 and switch from home rolled split to IteratorsMD.…
Feb 18, 2023
9fc05de
Update test/sorting.jl
LilithHafner Feb 18, 2023
6a838e5
Merge branch 'master' into sort-iter
Mar 3, 2023
d42c234
support sorting AbstractStrings
Mar 3, 2023
741017e
Merge branch 'master' into sort-iter
LilithHafner Apr 21, 2023
a0d3e4d
Revert "support sorting AbstractStrings"
May 27, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 10 additions & 2 deletions base/sort.jl
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ using .Base: copymutable, LinearIndices, length, (:), iterate, OneTo,
extrema, sub_with_overflow, add_with_overflow, oneunit, div, getindex, setindex!,
length, resize!, fill, Missing, require_one_based_indexing, keytype, UnitRange,
min, max, reinterpret, signed, unsigned, Signed, Unsigned, typemin, xor, Type, BitSigned, Val,
midpoint, @boundscheck, checkbounds
midpoint, @boundscheck, checkbounds, IteratorSize, HasShape

using .Base: >>>, !==, !=

Expand Down Expand Up @@ -975,6 +975,8 @@ end

Variant of [`sort!`](@ref) that returns a sorted copy of `v` leaving `v` itself unmodified.

Uses `Base.copymutable` to support immutable collections and iterables.
LilithHafner marked this conversation as resolved.
Show resolved Hide resolved

# Examples
```jldoctest
julia> v = [3, 1, 2];
Expand All @@ -992,7 +994,13 @@ julia> v
2
```
"""
sort(v::AbstractVector; kws...) = sort!(copymutable(v); kws...)
function sort(v; kws...)
IteratorSize(v) == HasShape{0}() && throw(ArgumentError("$v cannot be sorted"))
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't infinite (and maybe unknown) also be errors? I see that people probably want an error for sort(1), so ok, but kind of strange that that of all things would be disallowed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unknown sizes should be supported

sort(w for w in ["karma", "infracostalis", "postencephalon", "Elia"] if all(islowercase, w))

For almost all infinite iterators, we already throw inside copymutable, but perhaps someone could define an infinite iterator that can be copymutableed, returning a sparse vector of infinite length. If they also define a sorting method for that sparse representation, then it would be a mistake to throw on infinite iterators.

Copy link
Contributor

@Seelengrab Seelengrab Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But sort is an eager operation - surely if someone wants to sort their infinite iterator lazily, they have to implement it either way and not just rely on the generic fallback that's supposed to collect all elements of the iterator? Wouldn't it be better UX to throw early and make them aware where the actual problem lies, instead of having them chase down an (incidental) error from copymutable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For generic iterables, copymutable collects, but it only needs to make a mutable copy, not necessarily instantiate every element. For example, I believe this is allowed:

struct WithTrailingZerosIterable
    head::Int
    tail::Union{Nothing, WithTrailingZerosIterable}
end

Base.iterate(x::WithTrailingZerosIterable) = iterate(x, x)
Base.iterate(::WithTrailingZerosIterable, x::WithTrailingZerosIterable) = x.head, x.tail
Base.iterate(::WithTrailingZerosIterable, ::Nothing) = 0, Nothing

Base.IteratorSize(::Type{<:WithTrailingZerosIterable}) = Base.SizeInfinite()


struct WithTrailingZerosVector <: AbstractVector{Int}
    data::Vector{Int}
end
Base.size(x::WithTrailingZerosVector) = (typemax(Int),)
Base.getindex(x::WithTrailingZerosVector, i::Int) = i <= length(x.data) ? x.data[i] : 0
function Base.show(io::IO, ::MIME"text/plain", x::WithTrailingZerosVector)
    println(io, "infinite-element WithTrailingZerosVector:")
    for i in 1:5
        x[i] >= 0 && print(io, ' ')
        println(io, x[i])
    end
    println("")
end


function Base.collect(x::WithTrailingZerosIterable)
    data = Int[]
    while x !== nothing
        push!(data, x.head)
        x = x.tail
    end
    WithTrailingZerosVector(data)
end

function Base.sort!(x::WithTrailingZerosVector)
    filter!(x -> x < 0, x.data)
    sort!(x.data)
    return x
end


const X = WithTrailingZerosIterable(1, WithTrailingZerosIterable(-2, WithTrailingZerosIterable(3, nothing)))
display(collect(X))
#=
infinite-element WithTrailingZerosVector:
 1
-2
 3
 0
 0

=#

display(sort!(Base.copymutable(X)))
#=
infinite-element WithTrailingZerosVector:
-2
 0
 0
 0
 0

=#

Nevertheless, we could put this in the same camp as sort(::AbstractString): perhaps it might make sense in some way but for now just throw because it is rarely a good idea to sort an infinite iterable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throwing on an infinite iterator seems like the sensible choice. I've made the change.

sort!(copymutable(v); kws...)
end
sort(::AbstractString; kws...) =
throw(ArgumentError("sort(x::AbstractString) is ambiguous. Use sort!(collect(x)) or String(sort!(collect(x))) instead."))
sort(v::AbstractVector; kws...) = sort!(copymutable(v); kws...) # for method disambiguation

## partialsortperm: the permutation to sort the first k elements of an array ##

Expand Down
17 changes: 17 additions & 0 deletions test/sorting.jl
Original file line number Diff line number Diff line change
Expand Up @@ -516,6 +516,23 @@ end
@test isequal(a, [8,6,7,NaN,5,3,0,9])
end

@testset "sort!(iterable)" begin
gen = (x % 7 + 0.1x for x in 1:50)
@test sort(gen) == sort!(collect(gen))
gen = (x % 7 + 0.1y for x in 1:10, y in 1:5)
@test sort(gen; dims=1) == sort!(collect(gen); dims=1)
@test sort(gen; dims=2) == sort!(collect(gen); dims=2)

@test_throws ArgumentError("dimension out of range") sort(gen; dims=3)

@test_throws UndefKeywordError(:dims) sort(gen)
@test_throws UndefKeywordError(:dims) sort(collect(gen))
@test_throws UndefKeywordError(:dims) sort!(collect(gen))

@test_throws ArgumentError sort("string")
@test_throws ArgumentError("1 cannot be sorted") sort(1)
end

@testset "sort!(::AbstractVector{<:Integer}) with short int range" begin
a = view([9:-1:0;], :)::SubArray
sort!(a)
Expand Down