Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change fieldnames() and propertynames() to return a tuple rather than an array #25725

Merged
merged 1 commit into from
Feb 9, 2018

Conversation

nalimilan
Copy link
Member

Using an immutable structure makes sense since the names cannot be modified, and it avoids an allocation.

Fixes #25327.

Note: applying the change to propertynames is somewhat less natural given that the number of fields can vary depending on runtime values (as can be seen in the diff). Not sure whether that can be a problem in practice: if so, we could keep returning an array for that function.

@smldis
Copy link

smldis commented Jan 24, 2018

For the Note:
This means propertynames return type cannot be inferred in general.
Do we have a way to stabilize the type instability by collecting the (length unstable) tuple to an array after calling propertynames()?

For example using Vector{Symbol}(collect(propertynames(x)))

Thx for fixing that!

@KristofferC
Copy link
Sponsor Member

+1 for returning an array for propertynames.

@nalimilan
Copy link
Member Author

Do we have a way to stabilize the type instability by collecting the (length unstable) tuple to an array after calling propertynames()?

For example using Vector{Symbol}(collect(propertynames(x)))

You can always collect the tuple to an array, but the type-instability related to the length of the tuple will still affect the code which works with the tuple.

FWIW, the propertynames methods defined in LinearAlgebra don't seem to be a real problem, because the length of the returned tuple only depends on the private::Bool argument, and these functions are going to be inlined anyway given how simple they are (which should allow inference to get rid of the unused branch).

The type instability problem will be more serious for types like DataFrame, for which the property names are fully dynamic. But since type instabilities happen all the time with DataFrame, I'm not sure it's really a problem.

@nalimilan
Copy link
Member Author

Actually, I had forgotten that propertynames falls back to fieldnames, so at least in some cases it has to return the same type as fieldnames. Some types could be allowed to return a vector, though.

@KristofferC
Copy link
Sponsor Member

Actually, I had forgotten that propertynames falls back to fieldnames, so at least in some cases it has to return the same type as fieldnames

propertynames(x) could just be changed to collect(fieldnames(x))?

@nalimilan
Copy link
Member Author

Right, we could do that too.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jan 24, 2018

and it avoids an allocation.

It allocates a tuple, so this change doesn't make it allocate less or make it more inferrable (It's based on fieldname, which can't easily be inferred). If we do decide we want to do this, we need to change the structure of how these names are stored so that we are returning the existing object rather than necessarily copying it (from svec to vector or tuple).

@smldis
Copy link

smldis commented Jan 24, 2018

We are discussing the interface, not the implementation. But it seems the choice is driven by the optimizability (implementation can come later) of the collection of that tuple into an array without the traditional performance impact of the type unstability.

if we use an array then the allocation can still be avoided by a mutable propertynames!(myarray, mytypeinstance)

@nalimilan
Copy link
Member Author

More opinions?

@nalimilan nalimilan added the status:triage This should be discussed on a triage call label Feb 2, 2018
@StefanKarpinski
Copy link
Sponsor Member

Triage favors rebasing and merging this (with the exception of @vtjnash who dissents).

@StefanKarpinski StefanKarpinski removed the status:triage This should be discussed on a triage call label Feb 8, 2018
@StefanKarpinski StefanKarpinski added this to the 1.0 milestone Feb 8, 2018
@JeffBezanson
Copy link
Sponsor Member

I'm +1 on this. But, I would also say propertynames is more flexible and can return any collection of symbols, not necessarily a tuple.

… an array

Using an immutable structure makes sense since the names cannot be modified,
and it avoids an allocation.
@nalimilan
Copy link
Member Author

OK, I've rebased and changed the propertyname docstring to say "tuple or vector" so that custom types can use either type depending on whether the number of elements is statically known or not.

@StefanKarpinski
Copy link
Sponsor Member

StefanKarpinski commented Feb 9, 2018

@JeffBezanson, I've assigned you to review – whenever you have a chance, just take a look and merge since its tests are as solid green as it's possible to get at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants