-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Push join into TableScan through Project #21607
base: master
Are you sure you want to change the base?
Push join into TableScan through Project #21607
Conversation
928fc95
to
465170a
Compare
Another function can be selected for tests which does not imply such complications. |
@lukasz-stec @raunaqmorarka please review. |
465170a
to
173acd3
Compare
60fa72c
to
5406688
Compare
@@ -309,6 +309,8 @@ public PostgreSqlClient( | |||
.add(new RewriteIn()) | |||
.withTypeClass("integer_type", ImmutableSet.of("tinyint", "smallint", "integer", "bigint")) | |||
.withTypeClass("numeric_type", ImmutableSet.of("tinyint", "smallint", "integer", "bigint", "decimal", "real", "double")) | |||
.withTypeClass("string_type", ImmutableSet.of("char", "varchar")) | |||
.map("reverse(value: string_type)").to("REVERSE(value)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it work on columns which are mapped to varchar i.e money being mapped to varchar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not have sense to use it for money :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The serious answer is that we cannot control it.
Also, in most cases it does not corrupt data - it has an immediate trino-runtime, remote-compile-time exception.
The best idea I came so far is to cover it with join.pushdown.disabled-functions=reverse,cast
config/session property.
Also it's a wider issue - it does not relate to the reverse(value)
expression itself, but to the external scope as well, when different type of return is expected. e.g. reverse(value) = value
#21939 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, in most cases it does not corrupt data - it has an immediate trino-runtime, remote-compile-time exception.
Mostly yes - I was thinking on re-writing RewriteVaraible
rule which rewrites the column which has a native mapping and the column which supports pushdown operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some JDBC connector like PostgreSQL - we try to map unknown column type or special column type like money as varchar - so in this case we should somehow restrict the expression rewrite - I think it can be fixed by avoid rewriting the variables which are synthetically mapped.
It's tricky, as information about initial remote type is already lost, at the moment we try to rewrite for pushdown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RewriteVaraible
Thank you for pointing out, will take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this will be happening in this case as in all other cases, take this predicate pushdown, for example:
TestTable table = new TestTable(
onRemoteDatabase(),
"test_money_functions",
"(col money, col2 varchar)",
List.of("8, 'test'", "9, 'test'");
"SELECT col FROM " + table.getName() + " WHERE col LIKE '%$8%';
Connector attempts to push down LIKE on money
and fails:
Caused by: org.postgresql.util.PSQLException: ERROR: operator does not exist: money ~~ character varying
Hint: No operator matches the given name and argument types. You might need to add explicit type casts.
Position: 74
Can you elaborate on this point?
What are the edge cases? What's the issue with deriving a JDBC type? |
Wrt to type derivation we have two issues
|
core/trino-main/src/main/java/io/trino/sql/planner/iterative/rule/PushJoinIntoTableScan.java
Outdated
Show resolved
Hide resolved
Instead of a dedicated rule to pull a projection above join before pushing down the join into the connector, we should consider adding a generic rule for doing that optimization regardless of where it occurs in the plan. |
But pushing join before projection as a generic rule might affect the overall plan right ? What if its a cross join or if its estimate result in more number of rows than probe or build side ? |
We could make it a cost-aware decision. In practice, joins that explode data (in particular, cross joins that produce significant amount of data) are not that common, though. |
Few thoughts on idea proposed by @martint: Emphasis: regardless of where it occurs in the planInstead of
Should we consider only pulling
Or in any position?
Emphasis: before pushing down the join into the connectorThis generic rule would not attempt to call
Is my reasoning correct here? |
5406688
to
9f57619
Compare
cb68b27
to
1349edb
Compare
Extend rule to support join pushdown in cases where join conditions contain Calls causing Project nodes to appear over scans. Fix failing tests
1349edb
to
9904e6f
Compare
Rather, it would be PushXXXBelowProject, and it should generally be done:
It can't. But that's a case where pushing down the join into the connector first requires pushing down the projection. That should be handled normally by an invocation of |
The problem I was trying to solve in this PR (pushing Join through Project into TableScan to enable join pushdown with function calls in the join condition) is solved by #22203 (amazing work @Praveen2112!). With this new projection pushdown framework, all the test cases I added for PostgreSQL joins with |
Does it still makes sense to continue with
These points seem relevant. I can repurpose this PR or create a new one, I would appreciate some help with crafting a new problem statement though. |
Description
Add a ruleset to push join into TableScan through Project.
Draft todos:
PushJoinIntoTableScan
- extendPushJoinIntoTableScan
with new behaviorUPPER/LOWER COLLATE C
as well as other standard collations are not producing results consistent with Trino:PostgreSQL:
Trino:
Therefore the test use
REVERSE
as a reference function call which does not exhibit inconsistent behavior.Additional context and related issues
Why push though projection is needed?
To enable pushdown of joins with function calls in join conditions (UPPER/LOWER/CASTS etc.)
Why projection is not consumed by connector?
It is currently unsupported for JDBC connectors. There was an attempt #19740, but it didn't make it due to reasons (edge cases, fact that it needs JDBC type to be derived from Trino type when mapping).
What if there is filter below projection?
This rule will not work.
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: