Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-6810] Disable CalcRemoveRule to fix trivial projections #8035

Merged
merged 1 commit into from
Mar 12, 2019

Conversation

kanterov
Copy link
Member

@kanterov kanterov commented Mar 12, 2019

Trivial programs project precisely their input fields, without dropping
or re-ordering them. It turns out that CalcRemoveRule eliminates them,
even if there is a field aliasing. It results in an unexpected
resulting schema of SQL query.

It isn't clear whether it's an issue in Calcite or Beam rules, as a
workaround we disable CalcRemoveRule.


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

Post-Commit Tests Status (on master branch)

Lang SDK Apex Dataflow Flink Gearpump Samza Spark
Go Build Status --- --- --- --- --- ---
Java Build Status Build Status Build Status Build Status
Build Status
Build Status
Build Status Build Status Build Status
Python Build Status
Build Status
--- Build Status
Build Status
Build Status --- --- ---

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

Trivial programs project precisely their input fields, without dropping
or re-ordering them. It turns out that CalcRemoveRule eliminates them,
even if there is a field aliasing. It results in an unexpected
resulting schema of SQL query.

It isn't clear whether it's an issue in Calcite or Beam rules, as a
workaround we disable CalcRemoveRule.
@kanterov
Copy link
Member Author

R: @amaliujia

@kanterov kanterov force-pushed the kanterov_fix_trivial_projections branch from 5bc1ae7 to aefbe14 Compare March 12, 2019 11:15
@amaliujia
Copy link
Contributor

LGTM. Thanks for finding this issue.

The issue is not because of BeamSQL as this rule is implemented in Calcite. It might not be a bug. Maybe same other rules needed to help alias not dropped.

We are lacking of a rich sets of queries to verify and protect semantic correctness for the long time. The rich query set should define what an implementation behave if it uses Calcite as its SQL dialect. I have no idea if there is such set exists.

@kanterov
Copy link
Member Author

@amaliujia thanks! Agree, I discovered this issue by an internal test suite.

I was looking into CALCITE-1584, it seems that Drill had similar issues, but I didn't check how they fixed it. Let's follow up and see how we can fix that in the right way.

I was surprised to read:

Calcite doesn't promise to retain field names. 95% of the time it can and does, but 5% of the time it really can't retain field names without seriously compromising the Volcano engine.

@kanterov kanterov merged commit 30633b2 into apache:master Mar 12, 2019
@kanterov kanterov deleted the kanterov_fix_trivial_projections branch March 12, 2019 21:35
@amaliujia amaliujia mentioned this pull request Jul 15, 2019
3 tasks
pl04351820 pushed a commit to pl04351820/beam that referenced this pull request Dec 20, 2023
FBO Sphinx (missing ref links), static analysis, autocompletion.

Also, document 'v1' entities, rather than 'v1beta1'.

Closes apache#8035.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants