ensure promotion rules don't alter eltype values #26109
Merged
As I showed at #25924, `Tuple{Union{T,Int}}` is currently much faster than `Tuple{Any}`, so AFAICT this would be a clear regression.

I also think we should consider these questions in light of the broader issue of inference and `NamedTuple` at #25925.
Looks like this is simply a very minor known optimization miss (specifically, that we don't yet implement the obvious optimization to split `x isa Union{Int, Missing}` into `x isa Int || x isa Missing`), so we spend almost all of the time running subtyping. This is quite trivial – and non-breaking – to fix, so we haven't worked on it yet. We can verify that this PR doesn't actually affect performance by forcing away the handling of `missing` at the end (and also reducing the expense of the kernel operation from `sin` to `*`).

Note that the data vector used to benchmark in #25924 isn't quite correct for general comparisons. I'm using `x = map(i -> (isodd(i) ? missing : 12345,), 1:10_000);`, which ensures I have the same percent missingness and avoids the small-integer cache.

By contrast, I don't really know that the issue at #25925 has a good solution. It's possible just to widen incrementally – which will give fairly decent performance on micro-benchmarks – but it will be impossible to precompile effectively, so load times might always be very long.

Addendum: if you actually do care about performance, doing this column-wise (e.g. as a `Tuple{Vector...}`) would give me a significant (3-10x) speedup*.

\* This is failing to inline, which will also prevent vectorization, and is a known performance issue (#23338) – it should be even faster yet.
Interesting. But how is the compiler supposed to detect that it can do `x isa Int || x isa Missing`, given that the only thing it knows about the input is that it's a `Tuple{Any}`? I guess in your example it's possible since `getx` has only two methods, but what happens for e.g. `sin` or `+`?

Also, the return type of `map` is only inferred as `::AbstractArray` when the input is a `Vector{Tuple{Any}}`, which can be very problematic down the line.

OK. I'd rather use something like `map(i -> (rand(Bool) ? missing : i,), 1:10_000)` though, to make sure the repetition of `(missing, 12345)` doesn't allow the CPU to predict the result.

That's precisely why I propose using inference rather than widening incrementally. But can we have that discussion there instead?
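The randomized data vector proposed here, as a runnable sketch (the seed is arbitrary, added only for reproducibility and not part of the proposal):

```julia
using Random
Random.seed!(1234)  # arbitrary seed, only for reproducibility

# Randomize which entries are missing so the CPU's branch predictor
# can't learn a fixed missing/non-missing alternation.
y = map(i -> (rand(Bool) ? missing : i,), 1:10_000);

# On average about half of the entries are missing.
count(t -> t[1] === missing, y)
```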
The input type doesn't actually play a role here; it's the output type that is failing to codegen. There's an open PR about this, although I can't find it right now.

As long as there's a function call boundary (or we finally implement loop out-lining), this has no performance significance.
Many inference optimizations currently rely on being able to compute the expanded form of the input unions, so I don't know that this will make much of a difference.
But how can the compiler infer the output type given that the input is `Tuple{Any}`?

Well, yeah, but by this line of reasoning we wouldn't care about type stability of any functions, would we?

Sorry, but I'm not sure what this means. Could you try to explain this for mere mortals like me? Also, this PR affects the public API, while inference could be improved without breaking the API in 1.x releases. So I think the question is more: what public API will allow for the best optimizations in the future?
It doesn't need to – we don't lose any performance from this. All that matters is that the output type is something easy to optimize (like a concrete type or a trivial union).

Well, actually, no. It's critical to be able to write kernels with "type stability" so that for-loops over `Array` are optimized. That translates to requiring that most Base code be written very carefully. But it doesn't actually carry over to mattering much for a significant portion of the stdlib and beyond.

There are limits to what will be possible. We optimize unions by computing the fully expanded version. In general, this has `O(2^n)` components, so that will never help here – it's really just present to handle simple cases like iteration (and even there, it's not simple; cf. the redesign in #26079 needed to handle just this low-order case). The best optimizations in the future will be for code kernels that operate directly on columns of `Array{T}` or `Array{Union{T, Nothing}}` – which also happens to be the fastest way to do it now (cf. the addendum above).