-
Notifications
You must be signed in to change notification settings - Fork 24.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: Fix LOOKUP attribute shadowing #109807
ESQL: Fix LOOKUP attribute shadowing #109807
Conversation
Pinging @elastic/es-analytical-engine (Team:Analytics) |
Hi @alex-spies, I've created a changelog YAML for you. |
@@ -14,6 +14,43 @@ foo bar | null | null | |||
; | |||
|
|||
|
|||
shadowing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added analogous csv tests to LOOKUP, ENRICH, DISSECT, GROK and EVAL, since they should handle attribute shadowing identically.
@@ -39,7 +38,7 @@ public class Lookup extends UnaryPlan { | |||
/** | |||
* References to the input fields to match against the {@link #localRelation}. | |||
*/ | |||
private final List<NamedExpression> matchFields; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are always attributes (unresolved or resolved); the full generality of NamedExpression
(including Alias
etc.) makes reasoning about the match fields later down the road needlessly hard.
List<Attribute> fieldsAddedFromRight = removeCollisionsWithMatchFields(rightOutput, matchFieldSet, matchFieldNames); | ||
yield mergeOutputAttributes(makeNullable(makeReference(fieldsAddedFromRight)), leftOutput); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using mergeOutputAttributes
correctly here is the crux of this PR.
I'm a little conflicted on whether some logical plans should have unit tests. Arguably, we do catch inconsistent logical plan outputs either via the dependency checker or in csv tests; on the other hand, computing the output esp. for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
} | ||
} | ||
return results; | ||
return attributes.stream().filter(attr -> matchFields.contains(attr) || matchFieldNames.contains(attr.name()) == false).toList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tend to avoid using streams, except tests, as it's an unnecessary runtime complication.
public HashJoinExec(PlanStreamInput in) throws IOException { | ||
super(Source.readFrom(in), in.readPhysicalPlanNode()); | ||
this.joinData = new LocalSourceExec(in); | ||
this.matchFields = in.readNamedWriteableCollectionAsList(NamedExpression.class); | ||
this.matchFields = (List<Attribute>) (List) in.readNamedWriteableCollectionAsList(NamedExpression.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we haven't cut a release of with this and we don't allow these constructs in serverless yet I think you can change this to in.readNamedWriteableCollectionAsList(Attribute.class)
. There isn't any code that'll ever send anything that isn't an Attribute
.
Also, if these were always Attribute
subclasses in practice then this'd be safe here too - it's just on the read side.
Fix #109392
This makes attribute shadowing of LOOKUP consistent with ENRICH, DISSECT/GROK and EVAL.