Functions that return arrays with eltype as input should use container type instead? #11557

tkelman · 2015-06-03T07:40:01Z

ref #11280 (comment) and a few other places I've mentioned this. Functions like zeros, ones, eye, rand, mmap (maybe), etc which currently take an eltype as inputs and return some subtype of AbstractArray, would be much more general if they instead took the container type (with element type as a parameter) as input, so the current case would instead look like zeros(Array{Float64}, dims...). This would allow extending these functions to other types of AbstractArray in a much nicer way than right now. We could potentially get rid of a bunch of ugly-named functions like bitrand and spzeros and speye and sprandbool with this change.

I think it would by necessity be breaking, though maybe clever enough subtyping of zeros{T<:AbstractArray}(::Type{T}, dims...) would allow putting in deprecations (or defaults? might be fragile) for "scalar" types, if we have enough of a concept of interfaces to determine what makes a type a scalar (that's a different issue).

This would have to be a breaking change for arrays-of-arrays (if any of these functions currently work for arrays of arrays, that is?).

The text was updated successfully, but these errors were encountered:

ScottPJones · 2015-06-03T07:51:32Z

Not sure anybody gives a hoot about my opinion, but I think this is a very good idea, and is exactly the sort of thing that should only happen once in the language, and the coming 0.5 "Arraymeggedon" is exactly the perfect time to do so. 👍

ScottPJones · 2015-06-03T07:56:52Z

Should it be called zeros() though? Could this better be more general? inittype(initclass, type, dims...)? Just a thought off the top of my head... zeros(T, dims...) = inittype(INITZERO, T, dims...),
ones(T,dims...) = inittype(INITONE, T, dims...), initclass might be from an enum... [ignore my names for the function and classes... I'm not great at naming so early in the morning])

tkelman · 2015-06-03T08:02:20Z

That doesn't work for things like rand and eye though, it's a bit narrow-minded and probably well-served already by fill!. Completely eliminating zeros and ones would probably not go over all that well with the linear algebraists among us, though I suppose nothing is guaranteed to be sacred from banishment to MatlabCompat.jl, at least not before Julia hits 1.0.

ScottPJones · 2015-06-03T08:10:55Z

Who said anything about eliminating them? That's why I showed how they could be defined using a new general function... I think you are the one who will get the 🍅s for daring to suggest taking away spzeros, etc. 😀

tkelman · 2015-06-03T08:21:38Z

That is actually already done in terms of fill! at

julia/base/array.jl

Lines 234 to 240 in d198e90

 for (fname, felt) in ((:zeros,:zero), (:ones,:one)) 

 @eval begin 

 ($fname)(T::Type, dims...) = fill!(Array(T, dims...), ($felt)(T)) 

 ($fname)(dims...) = fill!(Array(Float64, dims...), ($felt)(Float64)) 

 ($fname){T}(A::AbstractArray{T}) = fill!(similar(A), ($felt)(T)) 

 end 

 end

- I guess the proposal here is to identify the rest of the places where we do this inconsistently and come up with something more uniform, apply it more broadly. Currently you can do zeros(A::AbstractArray), but not zeros{T<:AbstractArray}(::Type{T}, dims...).

The decision item is which is more general, needing to take an already-constructed instance of the AbstractArray type as input, or taking the array type itself as input to allow these functions to define their own direct construction?

ScottPJones · 2015-06-03T08:24:53Z

Well, I already stated that I liked the idea of passing the array type (actually, couldn't it be any collection type?)

jiahao · 2015-06-03T08:42:13Z

For matrices of matrices, the type information alone is insufficient to determine the appropriate zero, eye or one. The shape information of the inner nested matrix is also required. None of these functions currently work on matrices of matrices because, e.g. zero(::AbstractArray{T}) calls zero(T), but does not propagate the shape information of the current instance of T to the second call.

Thinking about zero(::SparseMatrix{<:SparseMatrix{T}}) and zero(::SparseMatrix{Matrix{T}}) makes my head hurt.

tkelman · 2015-06-03T08:50:43Z

Yeah, sparse arrays of sparse arrays, while godawfully cool in concept, would be pretty tough to support generically. A nerd can dream.

To give a concrete example where I ran into this as a major problem, I was making a separate sparse matrix type in a package using a different representation than CSC, and immediately found that there was no signature for spzeros (or zeros) that I could extend to construct an object of my new type. I wound up calling the function foozeros and being very displeased with that workaround.

jiahao · 2015-06-03T08:58:05Z

Arrays of arrays are not pleasant to work with in Julia. Recall #8759? There's definitely room for a variant of fill that does recursive copies.

ScottPJones · 2015-06-03T09:23:38Z

@tkelman, why are sparse arrays of sparse arrays so difficult? Is it just to handle the mathematical functions? (I've been using and implementing structures [multidimensional associative arrays], for the past 29 years, and they could be viewed/used as sparse arrays of sparse arrays)... sorry in advance for the potentially dumb question

tkelman · 2015-06-03T10:01:38Z

There are assumptions sprinkled around in various places that you can always call zero on the element type of an array once you try to do non-trivial mathematical operations on them. Constructing the data structure in the first place should work okay as long as the type and constructor are general enough to support that, but doing non-trivial math on arrays of arrays, or even just on sparse arrays while preserving and exploiting sparsity in as many operations as possible (ref #8001), runs into trouble with the current state of base. It's gradually improving but there are still some missing pieces in terms of formalizing interfaces and making nontrivial operations generic over both structure and element types.

edit: Also please please refrain from conflating missing data with sparsity in the linear algebraic sense. They are rather different in implementation aspects and semantics. You're very familiar with data structures that are optimized for missingness, those exist outside of base in the Julia ecosystem, primarily for statistical purposes right now. Most of what I'm discussing is about linear algebra and data structures optimized for that purpose.

ScottPJones · 2015-06-03T10:04:05Z

@tkelman Thanks for the explanation! (formalizing interfaces I think is very important, all over julia...)

andreasnoack · 2015-06-03T13:39:12Z

I'm generally positive about this proposal, but it is a general problem that parametric types can be cumbersome to write out, e.g. zeros(SparseMatrixCSC{Float64,Int64), 3, 3) vs spzeros(3,3). The latter is more restrictive, but probably a good default choice so a solution could be that spzeros is just a convenience layer. Just to spell out the problem: in DistributedArrays.jl it would be very unhandy to write zeros(DArray{Float64,2,Matrix{Float64}},10,10) instead of dzeros(10,10).

I solution might be to have default behavior when the type parameters are not specified, e.g. something like zeros(Type{SparseMatrixCSC}, dims...) = zeros(SparseMatrix{Float64,Int}, dims...) and zeros(Type{DArray}, args...) = zeros(DArray{Float64,2,Matrix{64}}, args...).

ScottPJones · 2015-06-03T13:52:14Z

@andreasnoack Yes, I think that makes a lot of sense, like the convenience of still having zeros and ones that I suggested...

tkelman · 2015-06-03T13:54:51Z

I think the convenience methods should eventually be behind a non-default import though, not exported globally by default.

jiahao · 2015-06-03T13:56:01Z

I think the convenience methods should eventually be behind a non-default import though, not exported globally by default.

Wouldn't unexported convenience methods just be inconvenient convenience methods then?

ScottPJones · 2015-06-03T13:59:09Z

Can't you do something to just bring all the convenience functions in, in one fell swoop, if desired?
(but only when asked for, not part of default import, as you want)

quinnj · 2015-06-03T14:01:17Z

Sounds like we need a Nicknames.jl package?

jiahao · 2015-06-03T14:06:35Z

Actually sprandn is a little different from the others; it must take a density parameter which cannot be specified in randn. Essentially, we only support the density = 1.0 case for dense matrices).

andreasnoack · 2015-06-03T14:10:08Z

Oh, I forgot that. Then just ignore randn and substitute it with zeros/ones.

nalimilan · 2015-06-04T21:08:16Z

If arrays of arrays are excluded from the scope of zeros() and friends, then for convenience maybe zeros(T, dims...) could be equivalent to zeros(Array{T}, dims...) when T is not a container? Then we would have both the flexible and the short form. We would need a definition of "container", but an abstract type for that wouldn't hurt anyway (Collection?).

stevengj · 2015-06-07T08:00:42Z

I'm not sure why this has to be breaking. zeros{T}(::Type{T}, m, n) does the current thing and zeros{T<:AbstractArray}(::Type{T}, m, n) can do the proposed (more flexible, but harder to type) thing, no?.

nalimilan · 2015-06-07T09:03:37Z

@stevengj Are you replying to my comment? If so, I don't think it has to be breaking indeed. My idea was precisely, as you say, to allow both.

ScottPJones · 2015-06-07T09:22:51Z

@tkelman I just noticed the edit you'd added after I'd read and responded:

edit: Also please please refrain from conflating missing data with sparsity in the linear algebraic sense. They are rather different in implementation aspects and semantics.

So far, I didn't see anything that makes these different in implementation... the sparse structures in Julia might be optimized for linear algebra, but that doesn't make them not be useful (with a bit more generalization) for dealing with missing data (and much more efficient than simply using dense arrays).
It's more a difference of interpretation, what is the meaning of "missing"... is it zero(), or is it a more general missing type, maybe Void, "", or zero()? I think generalizing things in that fashion wouldn't have to break anything... and maybe the statisticians and database programmers wouldn't need to invent more data structures in packages...

JeffBezanson · 2017-10-20T21:53:31Z

I think it's much better for functionality like this to be constructors of the container type you want, e.g. Array(Diagonal([1,1,1])), instead of eye(Matrix{Int}, 3, 3).

nalimilan · 2017-10-21T09:42:04Z

What about zeros, ones and rand, though? We would have to create e.g. a Zeros array type and do Matrix(Zeros{Int}(3, 3))?

JeffBezanson · 2017-10-21T16:36:26Z

It doesn't have to be a Zeros array type, it can be an iterator like repeated(0).

Do we know what the general set of affected functions is here? Or is it just zeros, ones, and rand? eye is a different case, since that really is more of a conversion from I or Diagonal.

Sacha0 · 2017-10-21T17:03:36Z

randn is another. Best!

martinholters · 2017-10-24T08:12:17Z

typed_hcat, typed_vcat, typed_hvcat - it would be useful if one could decide the output container type and avoid the construction of a temporary of another type.

quinnj · 2017-10-24T15:42:44Z

Has anybody mentioned collect? I think that'd be a great case where you may want to collect into an arbitrary type, that individual types can overload.

fredrikekre · 2017-10-24T15:45:19Z

Has anybody mentioned collect? I think that'd be a great case where you may want to collect into an arbitrary type, that individual types can overload.

#16029

JeffBezanson · 2017-10-26T19:32:28Z

Yes I think we should resolve this along with #16029 by deprecating the current Array constructors in favor of junk(T, dims) (this was proposed on the triage call), and then we can have array constructors accept iterators. Array(itr) can work if itr has the HasShape trait, otherwise you'd need Array(itr, dims). We can also make junk, ones, zeros, rand, and randn iterable, so MyArray(ones, dims) can work.

StefanKarpinski · 2017-10-26T20:01:50Z

From triage, alternatives here seem to be:

Change zeros(Int, 10, 10) to zeros(Array{Int}, 10, 10) which generalizes to zeros(GPUArray{Int}, 10, 10}, etc.
Leave zeros(Int, 10, 10) and introduce GPUArray{Int}(zeros, 10, 10) as the standard way to fill a custom array type with zeros, where zeros is some zero iterator object that's also callable as a function.
Same as 2 but deprecate zeros(Int, 10, 10) in favor of Array{Int}(zeros, 10, 10) and leave the zeros function as something that only acts as an iterator and constructs only Float64 arrays.

Note that 1 is not in conflict with more generically constructing arrays by passing them generators, and in fact zeros(A::Type{AbstractArray}, dims...) could simply be defined as A(repeated(zero(eltype(A))), dims...).

Sacha0 · 2017-10-28T21:48:30Z

One complication with e.g. Array(rand[n], dims) is RNG specification. Another is rand methods that sample from an array or string, though perhaps those methods should be deprecated in favor of package functionality or a sample function as previously discussed elsewhere.

Edit: Perhaps e.g. Array(rand, rng::AbstractRNG, dims...) would work for those cases?

fredrikekre · 2017-10-29T07:55:47Z

One complication with e.g. Array(rand[n], dims) is RNG specification.

Could we do that like:

rng = MersenneTwister(seed)
Array(rng, dims)

Another is rand methods that sample from an array or string

That could be done similarly then, e.g.

a = [1, 2, 3, 4, 5]
Array(a, dims)

Although thats pretty bad cause it is not apparent that we create something at random, so we should probably include rand somehow...

JeffBezanson · 2017-10-29T17:12:51Z

The key is that we need ways to get random numbers without constructing arrays, so you can do e.g. take(random_stream(...), n).

Sacha0 · 2017-11-02T18:15:10Z

Continuing discussion from #24389 (comment) here.

I think we might want to generalize "possibly shapeless iterator" to "thing that provides a mapping from indices to values", which includes iterators as a special case where the indices are implied by counting as iteration happens. In this case the I object provides a natural mapping from pairs of indices to values, yielding ones on the diagonals and zeros off of it.

Through working with eye/ones/zeros/junk over the last week, my thoughts have evolved in that direction as well :).

eye, ones, and zeros provide roughly the same interface, with two broad modes of invocation. The first mode of invocation is (eye|ones|zeros)([T, ]dims...), meaning "construct an Array, filled with these contents, of this shape, and (if specified) of this element type". The second mode of invocation is (eye|ones|zeros)(A, [T, ][dims...]), meaning "construct an array similar to A, filled with these contents, and, if specified, of this shape and element type instead of those of A". The primary reason these methods are pleasant is their brevity, achieved by admitting as much implicit specification as possible.

The new constructor direction can achieve similar brevity with a bit of generalization/relaxation. Consider ArrayType[{T,N}](contentspec[, modifierspec...]). Roughly, contentspec defines the result's contents, while (if provided) modifierspec... provides e.g. shape and/or element type (supplementing contentspec if contentspec did not provide e.g. shape or element type, or supplanting contentspec's e.g. shape and/or element type if contentspec did provide those). The {T,N} on the ArrayType of course also provides supplemental or overriding element type and dimension if present, and takes precedence over modifierspec.

Illustrating with eye and Matrix, for eye([T, ]m[, n]) we now have Matrix[{T}](I, m, n). For eye(A::Matrix), we now must write Matrix{eltype(A)}(I, size(A)), but with the above generalization could write Matrix(I, A)) (A serving as the modifierspec). Similarly, for eye(A::Matrix, T) we now have Matrix{T}(I, size(A)) but could have Matrix{T}(I, A). For eye(A::Matrix, T, m, n) we now have Matrix{T}(I, m, n), which seems solid as it stands. And analogously with ones/zeros, but with something appropriate in place of I above (~~which could simply be a literal, for example~~ edit: not wise, per triage).

With array types that carry information beyond shape and element type, for example OffsetArray, the ArrayType[{T,N}](contentspec, A::ArrayType) forms above could pass the additional information.

More on the second invocation mode later. Best!

martinholters · 2017-11-03T14:54:19Z

Could ones(eltype(x), n) become Vector(ones, x, n) or Vector(ones, n, x) or would they be equivalent? Would OffsetArray(ones, -2:3) take the element type and/or the indices from the -2:3? If the latter, should the indices be -2:3 or the indices of it, i.e. 1:6?

Should we consider keyword arguments to disambiguate these cases?

JeffBezanson · 2018-01-04T23:16:35Z

I believe we've decided not to do this.

tkelman · 2018-01-04T23:22:45Z

Looks like the conclusion has mostly been to deprecate them and use array constructors directly instead? Are any of the old style functions left? What about the rand family?

vtjnash · 2018-01-05T00:22:27Z

Are any of the old style functions left

That's OK, we mostly want to keep them for interactive use. If Base commonly uses a different style, that's not an issue.

tkelman added kind:speculative Whether the change will be implemented is speculative needs decision A decision on this change is needed kind:breaking This change will break code labels Jun 3, 2015

tkelman changed the title ~~Functions that return arrays with eltype as an input should use fully parameterized container type instead?~~ Functions that return arrays with eltype as input should use container type instead? Jun 3, 2015

quinnj mentioned this issue Jun 3, 2015

RFC: mmap makeover #11280

Merged

tkelman mentioned this issue Jun 4, 2015

What is the official similar interface? #11574

Closed

JeffBezanson removed the status:triage This should be discussed on a triage call label Oct 26, 2017

Sacha0 mentioned this issue Oct 28, 2017

constructors for Matrix and SparseMatrixCSC from UniformScaling #24372

Merged

Sacha0 mentioned this issue Oct 29, 2017

[WIP] constructors for Array from zeros/ones #24389

Closed

This was referenced Oct 29, 2017

[WIP] add some junk #24400

Closed

deprecate eye(::Type{Diagonal{T}}, n), introduce Diagonal[{T}](s::UniformScaling, n) #24415

Merged

fredrikekre mentioned this issue Nov 2, 2017

Deprecate ones? #24444

Closed

Sacha0 mentioned this issue Nov 13, 2017

design of array constructors #24595

Open

ViralBShah unassigned tkelman Nov 19, 2017

JeffBezanson added status:triage This should be discussed on a triage call and removed status:triage This should be discussed on a triage call labels Dec 28, 2017

JeffBezanson closed this as completed Jan 4, 2018

maleadt mentioned this issue Oct 31, 2018

Add optimized implementation of fill for types supported by memset. JuliaGPU/CuArrays.jl#178

Merged

maleadt mentioned this issue May 27, 2020

Array constructors for ones, zeros, rand, ... JuliaGPU/CUDA.jl#159

Open

Functions that return arrays with eltype as input should use container type instead? #11557

Functions that return arrays with eltype as input should use container type instead? #11557

Comments

tkelman commented Jun 3, 2015

ScottPJones commented Jun 3, 2015

ScottPJones commented Jun 3, 2015

tkelman commented Jun 3, 2015

ScottPJones commented Jun 3, 2015

tkelman commented Jun 3, 2015

ScottPJones commented Jun 3, 2015

jiahao commented Jun 3, 2015

tkelman commented Jun 3, 2015

jiahao commented Jun 3, 2015

ScottPJones commented Jun 3, 2015

tkelman commented Jun 3, 2015

ScottPJones commented Jun 3, 2015

andreasnoack commented Jun 3, 2015

ScottPJones commented Jun 3, 2015

tkelman commented Jun 3, 2015

jiahao commented Jun 3, 2015

ScottPJones commented Jun 3, 2015

quinnj commented Jun 3, 2015

jiahao commented Jun 3, 2015

andreasnoack commented Jun 3, 2015

nalimilan commented Jun 4, 2015

stevengj commented Jun 7, 2015

nalimilan commented Jun 7, 2015

ScottPJones commented Jun 7, 2015

JeffBezanson commented Oct 20, 2017

nalimilan commented Oct 21, 2017

JeffBezanson commented Oct 21, 2017

Sacha0 commented Oct 21, 2017

martinholters commented Oct 24, 2017

quinnj commented Oct 24, 2017

fredrikekre commented Oct 24, 2017

JeffBezanson commented Oct 26, 2017

StefanKarpinski commented Oct 26, 2017 • edited Loading

Sacha0 commented Oct 28, 2017 • edited Loading

fredrikekre commented Oct 29, 2017 • edited Loading

JeffBezanson commented Oct 29, 2017

Sacha0 commented Nov 2, 2017 • edited Loading

martinholters commented Nov 3, 2017

JeffBezanson commented Jan 4, 2018

tkelman commented Jan 4, 2018

vtjnash commented Jan 5, 2018

StefanKarpinski commented Oct 26, 2017 •

edited

Loading

Sacha0 commented Oct 28, 2017 •

edited

Loading

fredrikekre commented Oct 29, 2017 •

edited

Loading

Sacha0 commented Nov 2, 2017 •

edited

Loading