Simpler syntax for creating uninitialized arrays #34775

jkrumbiegel · 2020-02-16T12:19:51Z

I find the syntax for creating uninitialized arrays a bit verbose, while there are nice and short options for almost all other common cases of creating arrays:

# compare to
Float64[1, 2, 3, 4, 5]
zeros(Float64, 10)
ones(Float64, 10)
fill(1.0, 10)

But for uninitialized arrays you always have to use curly bracket syntax if I'm correct:

v = Vector{Float64}(undef, 10)

How about one of these two alternatives, both of which seem to be unclaimed so far:

v = Float64[undef, 10]
arr = Int32[undef, 3, 4, 5]

# or

v = undef(Float64, 10)
arr = undef(Int32, 3, 4, 5)

The second one is actually easy to get via:

(::UndefInitializer)(T::Type, dims::Vararg{Int}) = Array{T}(undef, dims...)
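
The one-line definition above can be tried out directly. This is a minimal sketch for experimentation only: it adds a method to a type owned by Base (type piracy), so it is probably not something to ship in package code.

```julia
# Make the `undef` singleton callable. This extends a type that Base owns
# (type piracy), so treat it as an experiment rather than a recommendation.
(::UndefInitializer)(T::Type, dims::Vararg{Int}) = Array{T}(undef, dims...)

v = undef(Float64, 10)        # same as Vector{Float64}(undef, 10)
arr = undef(Int32, 3, 4, 5)   # same as Array{Int32}(undef, 3, 4, 5)

# Only the element type and shape are determined; the values are arbitrary.
println(typeof(v), " ", size(v))
println(typeof(arr), " ", size(arr))
```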

jakobnissen · 2020-02-17T12:48:00Z

See this issue and this Discourse discussion.

These posts are both very long and mostly discuss something other than your concrete proposal. However, there is at least one relevant point, namely that undef(Int, 10) returns an Array, when there are so many other AbstractArrays that could be usable.

Not sure I agree, though. The same could be said for zeros, ones and fill. Array still is by far the most used AbstractArray.

So I think your proposed undef(T, dims...) syntax would be nice. It's short and explicit, and would probably be used quite often.

martinholters · 2020-02-17T15:36:11Z

> would probably be used quite often.

...which might be a reason not to do it...

yuyichao · 2020-02-17T16:01:04Z

> ...which might be a reason not to do it...

It may be important to make the "uninitialized" part explicit, but I don't think it's necessary to make the syntax harder to use.

JeffBezanson · 2020-02-17T18:45:39Z

This is indeed somewhat intentional, to discourage uninitialized arrays. But we also wanted to move towards more general and regular syntax instead of all the special cases like zeros(...). The syntax undef(T, dims) is ok, but I question whether having more ways to write it is actually easier to use.

timholy · 2020-02-29T14:44:57Z

If we want to increase uniformity, one thought for 2.0: deprecate zeros, ones in favor of fill(v, axes), and consider allowing fill(Undef{T}, axes) for an uninitialized array with eltype T. (fill(T, axes) won't work because what if you want to create an array of types?)
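
A small sketch of the equivalence this relies on, using only existing Base functions (the proposed `Undef{T}` does not exist and is not shown). It also illustrates the parenthetical: `fill(T, dims)` already means "an array whose elements are the type `T`".

```julia
dims = (2, 3)

# zeros and ones expressed through fill
z = fill(zero(Float64), dims)
o = fill(one(Float64), dims)
@assert z == zeros(Float64, dims)
@assert o == ones(Float64, dims)

# fill(T, dims) is already taken: it builds an array of types,
# so it cannot also mean "uninitialized array with eltype T".
t = fill(Int, 3)
@assert t isa Vector{DataType}
@assert all(x -> x === Int, t)
```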

KristofferC · 2020-02-29T15:22:56Z

> deprecate zeros, ones

Wasn't this already discussed, cf #24444?

timholy · 2020-02-29T16:05:06Z

I guess I'm consistent!

StefanKarpinski · 2020-02-29T16:08:12Z

I’ve actually often wanted a way to go in the other direction: factor out the initializer concept so that I can do things uniformly like this:

Array{T}(undef, m, n)
Array{T}(zeros, m, n)
Array{T}(ones, m, n)

Why? It makes it easier to swap out any of the properties of what’s being done: it cleanly separates the container type, the element type, what to initialize it with and the dimensions.
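
One way to experiment with this separation without touching Base constructors is a standalone helper. The name `construct` below is hypothetical (it is not an existing or proposed Base function); the sketch only shows how container type, element type, initializer and dimensions can each be swapped independently.

```julia
# Hypothetical helper, not part of Base: (container{eltype}, initializer, dims...)
construct(::Type{Array{T}}, ::UndefInitializer, dims::Int...) where {T} =
    Array{T}(undef, dims...)
construct(::Type{Array{T}}, ::typeof(zeros), dims::Int...) where {T} =
    fill!(Array{T}(undef, dims...), zero(T))
construct(::Type{Array{T}}, ::typeof(ones), dims::Int...) where {T} =
    fill!(Array{T}(undef, dims...), one(T))

m, n = 2, 3
a = construct(Array{Float64}, undef, m, n)  # contents arbitrary
b = construct(Array{Float64}, zeros, m, n)  # all zero(Float64)
c = construct(Array{Int}, ones, m, n)       # all one(Int)
```

Changing only the second argument changes the initialization; changing only the first changes the element type.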

tpapp · 2020-02-29T16:34:25Z

Note also that while ones, fill etc. make sense for most <: AbstractArray types, undef is the odd one out in the sense that it is only practical for mutable arrays.

KristofferC · 2020-02-29T17:19:13Z

> undef is the odd one out in the sense that it is only practical for mutable arrays.

Not really, because in reality undef means uninitialized (which is what it was originally called). I made a PR to rename it to undef (shame on me), but in hindsight, uninit would probably have been better.

StefanKarpinski · 2020-02-29T17:23:33Z

I think the point is that making an uninitialized immutable array isn't very useful.
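
A short illustration of why mutability matters here, using only existing Base behavior: an uninitialized array becomes useful only once its elements are written in place.

```julia
# Bits-type elements: the memory holds arbitrary values until overwritten.
v = Vector{Float64}(undef, 3)
for i in eachindex(v)
    v[i] = i^2          # writing in place is what makes the array usable
end
@assert v == [1.0, 4.0, 9.0]

# Non-bits elements: slots start out unassigned and must be set before reading.
s = Vector{String}(undef, 2)
@assert !isassigned(s, 1)
s[1] = "a"
@assert isassigned(s, 1)
```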

KristofferC · 2020-02-29T17:37:50Z

Oh, yeah, I misread that.

Sacha0 · 2020-02-29T18:37:31Z

Comet topic! :)

> I’ve actually often wanted a way to go in the other direction: factor out the initializer concept so that I can do things uniformly like this: [...]

For interested newcomers to this discussion, #24595 (comment) discusses this direction at length as 'the second proposal':

> The more general extension of this model is MyArray[{...}](contentspec[, modifierspec...]). Roughly, contentspec defines the result's contents, while modifierspec... (if given) provides qualifications, e.g. shape.

StefanKarpinski · 2020-02-29T19:11:04Z

One thing we could do is:

make Array{T}(zeros, dims...) etc. work
make undef(T, dims...) and undef(dims...) work

That way we round out the collection of convenience constructors in a way that can always be expressed in terms of the fuller Container{Eltype}(initializer, dims...) form.

johnnychen94 · 2020-03-01T02:14:31Z

> make Array{T}(zeros, dims...) etc. work

What I see from this syntax is that whatever initializer is put here should be as fast as undef. Since zeros is way sloweeer than undef, I think it's perhaps time to get some updates on #130

julia> using BenchmarkTools  # provides @btime

julia> @btime zeros(Float64, 1000, 1000);
  443.589 μs (2 allocations: 7.63 MiB)

julia> @btime Array{Float64}(undef, 1000, 1000);
  37.140 μs (2 allocations: 7.63 MiB)

tpapp · 2020-03-01T05:50:56Z

I am trying to think about the implications of these proposals for generic code. It is not clear to me whether

1. these methods (zeros, ones, fill) were meant to be convenience constructors for Array{T,N}, or more generic (and if yes, how generic? should there be a unified API for various collections of homogeneous items? does that even make sense?)
2. the motivation for zeros and ones is syntactic convenience (shorter than fill(one(T), dims...)), or something more abstract (as zero and one are, for additive and multiplicative identities), or speed (we can do zeros faster for some types?)

As for (1), a lot of packages define Base.zeros etc. for their own types, which are not even necessarily <:AbstractArray. Should they do the same for the proposed undef(...) (if applicable)?

Regarding (2), it would be nice for custom types to be able to rely on a default like

function zeros(S::Type{SomeCustomType{T}}, shape...) where T
    fill(S, zero(eltype(S)), shape...)
end

and define only this fill method, adding zeros only when that confers an extra advantage. Then we could unify syntax with the fallback

# sketch: assumes undef were a generic function that custom types could extend
function undef(::Type{SomeCustomType{T}}, shape...) where T
    SomeCustomType{T}(undef, shape...)
end
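
The pattern above can be made concrete with a toy type. In this sketch everything is an assumption made for illustration: the one-dimensional wrapper `SomeCustomType`, and the standalone names `customfill`, `customzeros` and `customundef`, which stand in for the proposed `fill`, `zeros` and `undef` methods so that no methods are added to Base functions.

```julia
# Hypothetical one-dimensional container, used only for illustration.
struct SomeCustomType{T}
    data::Vector{T}
end

# Uninitialized constructor, mirroring Array{T}(undef, dims...).
SomeCustomType{T}(::UndefInitializer, n::Int) where {T} =
    SomeCustomType{T}(Vector{T}(undef, n))

Base.eltype(::Type{SomeCustomType{T}}) where {T} = T

# The single method a custom type would have to provide under this scheme.
function customfill(S::Type{SomeCustomType{T}}, v, n::Int) where {T}
    x = S(undef, n)
    fill!(x.data, v)
    return x
end

# Generic fallbacks built on top of it.
customzeros(S::Type{SomeCustomType{T}}, n::Int) where {T} =
    customfill(S, zero(eltype(S)), n)
customundef(S::Type{SomeCustomType{T}}, n::Int) where {T} = S(undef, n)

z = customzeros(SomeCustomType{Float64}, 4)
u = customundef(SomeCustomType{Int}, 2)   # contents arbitrary
```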

StefanKarpinski · 2020-03-01T19:51:39Z

> What I see from this syntax is that whatever initializer is put here should be as fast as undef.

I don't understand why that should be the case. Yes, we want initializers to be as fast as we can make them, but some require more work than others. Why would we require that they all be as fast as doing nothing?

KristofferC · 2020-03-01T19:56:42Z

> Since zeros is way sloweeer than undef

It is quite tricky to measure this since the OS can sometimes give out uninitialized memory "for free" and only commit to the actual allocation when the memory is used.
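
A rough way to probe this is to time allocation alone against allocation followed by a write to every element. This is only a sketch under loose assumptions: single-shot `@elapsed` timings are noisy, the helper names are made up for the example, and the numbers depend heavily on the OS and allocator, so no particular outcome is implied.

```julia
# Hypothetical helpers for the comparison.
alloc_only(n) = Array{Float64}(undef, n, n)                   # memory is never touched
alloc_and_write(n) = fill!(Array{Float64}(undef, n, n), 0.0)  # every element is written

# Warm up so compilation time is excluded from the measurements.
alloc_only(10); alloc_and_write(10); zeros(Float64, 10, 10)

t_undef = @elapsed alloc_only(1000)
t_write = @elapsed alloc_and_write(1000)
t_zeros = @elapsed zeros(Float64, 1000, 1000)

println("undef only:      ", t_undef, " s")
println("undef + fill!:   ", t_write, " s")
println("zeros:           ", t_zeros, " s")
```

If the OS defers committing pages as described above, part of the cost attributed to zeros may really be the first write to the memory, which the undef-only timing never pays.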

JeffBezanson added the arrays label Feb 17, 2020

mkitti mentioned this issue Oct 13, 2021

Define and export undefs similar to zeros and ones #42620

Closed

brenhinkeller added the feature label Nov 20, 2022
