-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-15658][table-planner-blink] Fix duplicate field names exception when join on the same key multiple times #11011
Conversation
…n when join on the same key multiple times
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Automated ChecksLast check on commit e0a7512 (Tue Feb 04 06:38:44 UTC 2020) Warnings:
Mention the bot in a comment to re-run the automated checks. Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commandsThe @flinkbot bot supports the following commands:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @wuchong , I've always felt that the methods of the ProjectionCodeGenerator
is a little redundant.
What about add method to ProjectionCodeGenerator
:
def generateProjection(
name: String,
inputType: RowType,
inputMapping: Array[Int]): GeneratedProjection = {
val ctx = CodeGeneratorContext.apply(new TableConfig)
val outputType = RowType.of(inputMapping.map(inputType.getChildren.get):_ *)
generateProjection(ctx, name, inputType, outputType, inputMapping)
}
RowType returnType = RowType.of(keyFieldTypes, keyFieldNames); | ||
// do not provide field names for the result key type, | ||
// because we may have duplicate key fields and the field names may conflict | ||
RowType returnType = RowType.of(keyFieldTypes); | ||
RowType inputType = RowType.of(inputFieldTypes, rowType.getFieldNames()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rowType.toRowType()
However, the outputType is still required to build the key BaseRowTypeInfo, so the new method doesn't help much here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Thanks for the reveiwing @JingsongLi . Merging... |
…n when join on the same key multiple times (#11011)
…n when join on the same key multiple times (apache#11011)
What is the purpose of the change
The exmaple raised in the JIRA that batch can run successfully, but streaming will fail. The root cause is that the join key may be duplicate and the constructed key RowType will have duplicate field names.
Brief change log
We do not provide field names for the result key type, because we may have duplicate key fields and the field names may conflict.
Verifying this change
Added an IT case to verify this, the IT case will fail without this fix.
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (yes / no)Documentation