-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix ClassCastException with MV_EXPAND on missing field #110096
Fix ClassCastException with MV_EXPAND on missing field #110096
Conversation
Pinging @elastic/es-analytical-engine (Team:Analytics) |
Hi @luigidellaquila, I've created a changelog YAML for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Luigi, nice catch!
I have some minor remarks, but this LGTM!
plan = plan.transformExpressionsOnlyUp( | ||
FieldAttribute.class, | ||
f -> stats.exists(f.qualifiedName()) ? f : new Alias(f.source(), f.name(), null, Literal.of(f, null), f.id()) | ||
); | ||
} | ||
// otherwise transform fields in place | ||
else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ehm, isn't the default case here just asking for trouble? We're replacing field attributes by literals; that's only safe to do further down in the expression tree (if at all). I wonder if we rather shouldn't optimize at all in these cases; or maybe have a version of transformExpressionsOnlyUp
that never applies to the top-level expression. (Although that would still be unsafe in general, but could be acceptable here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a reasonable concern, probably we should not optimize at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I removed the optimization for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, you mean the final else
. Yes, that's fragile and could lead to more similar failures when we add new commands.
I didn't change it because it's needed to trigger a few more downstream optimizations, see all the LocalLogicalPlanOptimizerTests.testMissingField*
tests, and because it's relatively safe with current feature set.
It's worth remembering that it's dangerous though, I'll open an issue so that we can revisit it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// TODO but this could be a follow-up optimization | ||
plan = plan.transformExpressionsOnlyUp( | ||
FieldAttribute.class, | ||
f -> stats.exists(f.qualifiedName()) ? f : new Alias(f.source(), f.name(), null, Literal.of(f, null), f.id()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, rather than replacing the expression but keeping the MV_EXPAND, wouldn't it be cleaner to replace the MV_EXPAND by an EVAL? That might profit better from our EVAL
optimizations, pruning rules, etc. (if they're applied after this - didn't check.) or at least be easy to understand what to expect from.
E.g.
FROM idx
| MV_EXPAND field
becomes (if field is missing)
FROM idx
| EVAL field = null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered it as a follow-up.
I didn't do it with this PR because it can be slightly tricky and I wanted to treat it as an optimization rather than a fix.
In particular, in the example you propose, field
represents two different things:
- for MV_EXPAND it's the target; the output will have the same name but a different ID
- for EVAL it's a completely new field, that will need mask the previous one and have the same ID as the output for MV_EXPAND.
var plan = plan(""" | ||
from test | ||
| mv_expand last_name | ||
| keep first_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be
| keep last_name
This may (in the future) make this test fail (for the wrong reasons): last_name
is irrelevant and may be pruned.
…t' into esql/mv_expand_classcast
Please, hold on on merging this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks folks! Merging |
💔 Backport failed
You can use sqren/backport to manually backport by running |
Manual backport #110264 |
Fixes #109974
LocalLogicalPlanOptimizer.ReplaceMissingFieldWithNull
tries to replace missing fields (ie. fields that are not present on the mappings of local node) with a NULL literal.This is not possible for MvExpand, because the target is a
NamedExpression
(andLiteral
is not). This causedClassCastException
.This PR fixes the problem using a null
Alias
in this specific case.