Support pushing dereferences within lambdas into table scan #21957

zhaner08 · 2024-05-13T19:00:22Z

Description

This extends the enhancement discussed in #3925, and depends on and extends the original PR #4270, which @Desmeister is currently rebasing.

The issue and discussion have been idle for years, yet this kind of optimization can be critical for anyone with a highly nested schema who uses Unnest. I would like to use this PR to formally restart the discussion on how the community wants to eventually support this, and whether this is the right direction. (I have a working version locally, not this one, that speeds up the query while reducing the actual data processed.)

From my understanding of the previous discussions, this should be done through the steps below:

1. Convert non-replicate symbol dereferencing involved with Unnest into lambda functions with subscript expressions for each of the Unnests
   - Done through "Prune unnest mappings using lambda expressions" #4270, which is currently not merged
2. Push the lambda function down
   - Type 1: The lambda function is already directly above the TableScan. In this case, this rule helps push the dereferencing down further; for connectors that don't support dereferencing, the rule preserves the lambda expression to remove columns
   - Type 2: The lambda functions are not at the ~leaf. This will be handled by PushDownDereferenceThroughUnnest and many other expression-specific rules. PushDownDereferenceThroughUnnest currently handles only replicated symbols, not unnest symbols. To support unnest symbols, I believe at least a new expression has to be created, or the subscript expression has to be extended; otherwise I don't see an easy way to represent the dereferences so that they can be pushed further down through other unnests. I need more guidance on how this could be done, or whether it is possible with what we have now. That is why this PR does not handle complex cases like nested Unnest and only pushes lambdas down through projects and filters in a limited way.
3. Pushing dereferencing into TableScan
   - This is partly implemented by this PR. I extended the existing visitFunctionCall in ConnectorExpressionTranslator to create a new connector expression (it can be merged with the existing FieldDereference expression if possible), and then pass those into the existing applyProjection method to let connectors decide how to handle them. In this PR, only HiveMetadata has an implementation that handles them; other connectors simply ignore them. For Hive, applyProjection creates new projections and a HiveColumnHandle with an extended HiveColumnProjectionInfo.
4. Pushing dereferences into file readers
   - This is done by this PR. We need a representation of dereferencing into an array (or potentially a map). Currently everything is represented by plain arrays of strings (names) or arrays of integers (indexes), and with only these we cannot pass down any dereferencing that is more complex. I cherry-picked the Subfields classes from Presto, since they are already established and similar methods are already implemented for the Parquet reader. Depending on how the community wants to represent this, we can swap it for another representation, as long as it can support anything more complex than simple structs.
5. Readers skip column readings
   - This is done by this PR. For Parquet, the file schema is pruned to contain only the needed columns, and the other columns are returned as empty blocks. This reduces the actual data scanned and also the data going through local and remote exchanges.
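
As an illustration of the last two steps (passing subfield paths to the reader and pruning the file schema), here is a minimal Java sketch. It is not the code in this PR: `SchemaNode`, `SchemaPruner`, and the list-of-names path encoding are hypothetical stand-ins for the Parquet schema types and the Subfields representation, and arrays are not modeled separately from structs.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy schema: a node is a leaf (no children) or a struct with named children.
// A path step into an array field simply descends into its element struct.
record SchemaNode(String name, Map<String, SchemaNode> children)
{
    static SchemaNode leaf(String name)
    {
        return new SchemaNode(name, Map.of());
    }

    static SchemaNode struct(String name, SchemaNode... fields)
    {
        Map<String, SchemaNode> map = new LinkedHashMap<>();
        for (SchemaNode field : fields) {
            map.put(field.name(), field);
        }
        return new SchemaNode(name, map);
    }
}

final class SchemaPruner
{
    private SchemaPruner() {}

    // Keeps only the parts of root that are reachable through at least one requested path.
    // Each path is a list of field names below root, e.g. ["array11", "struct1", "data4"].
    static SchemaNode prune(SchemaNode root, List<List<String>> paths)
    {
        Map<String, SchemaNode> kept = new LinkedHashMap<>();
        for (SchemaNode child : root.children().values()) {
            // Remainders of all requested paths that start with this child
            List<List<String>> remainders = paths.stream()
                    .filter(path -> !path.isEmpty() && path.get(0).equals(child.name()))
                    .map(path -> path.subList(1, path.size()))
                    .toList();
            if (remainders.isEmpty()) {
                continue; // no path references this field, so it is dropped
            }
            // An empty remainder means the whole subtree was requested
            boolean wholeSubtree = remainders.stream().anyMatch(List::isEmpty);
            kept.put(child.name(), wholeSubtree ? child : prune(child, remainders));
        }
        return new SchemaNode(root.name(), kept);
    }
}
```

For the sample query in this description, the requested paths under array1 would be data2, array11.struct1.data4 and array11.struct1.data5; any other field under array1 would be dropped from the pruned schema.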

This PR is written to minimize the impact on existing features, so that I can fully validate the performance impact while gathering feedback and direction from the community. Implementations are therefore usually wrapped in an if instead of fully refactoring the existing method.

I believe that if this is the right direction, the changes can be contributed in the phases below:

1. Replace the existing Array<dereferences> in HiveColumnProjectionInfo with Subfields or something similar, and make sure all methods that used to depend on Array<dereferences> now depend on the new representation
2. Fully integrate the newly added optimization rule with the existing applyProjection method (or not? It could simply remain a non-iterative visitor at the very end, as it is now.)
3. Instead of just pruning schemas, also prune the output symbols/types of the TableScan (currently it keeps the original symbols and just returns empty blocks, to minimize changes)
4. Remove the lambda expression if the translations are supported by the connector. The current overhead should be small, but the risk of wrongly removing the lambda expression while a connector is not correctly pruning nested columns is large, so this PR still keeps the lambda expression after the pushdown.
5. Support dereference pushdown of unnest symbols through ~all kinds of expressions. I have added two rules to support pushing down through project and filter; we can probably live with those in the short term, but eventually we have to address things like how to push down through unnest or other complex expressions

The change has been fully validated, except for rebasing onto the latest Trino release, which could produce a lot of conflicts due to the AST/IR refactoring.

trino:default> ***BEFORE*** with tmp as (
            -> SELECT
            ->     a1.data2 as d2,
            ->     a1.array11 as nestedarray
            -> FROM 
            ->     default.test_unnest_unnest_prunning_parquet
            ->     CROSS JOIN UNNEST(default.test_unnest_unnest_prunning_parquet.array1) t (a1)
            ->     where id>0
            -> )
            ->  SELECT
            ->      d2,
            ->     array2.struct1.data4,
            ->     array2.struct1.data5
            -> FROM 
            ->     tmp
            ->     CROSS JOIN UNNEST(tmp.nestedarray) t (array2);
  d2  | data4 | data5 
------+-------+-------
 -10- |   100 | -100- 
 -10- |   101 | -101- 
 -11- |   110 | -110- 
 -11- |   111 | -111- 
 -20- |   200 | -200- 
 -20- |   201 | -201- 
 -21- |   210 | -210- 
 -21- |   211 | -211- 
(8 rows)

Query 20240518_032355_00008_qhz93, FINISHED, 1 node
https://localhost:8080/ui/query.html?20240518_032355_00008_qhz93
Splits: 1 total, 1 done (100.00%)
CPU Time: 0.0s total,    80 rows/s, 16.5KB/s, 10% active
Per Node: 0.0 parallelism,     1 rows/s,   413B/s
Parallelism: 0.0
Peak Memory: 542B
1.02 [2 rows, 423B] [1 rows/s, 413B/s]


trino:default> ***After*** with tmp as (
            -> SELECT
            ->     a1.data2 as d2,
            ->     a1.array11 as nestedarray
            -> FROM 
            ->     default.test_unnest_unnest_prunning_parquet
            ->     CROSS JOIN UNNEST(default.test_unnest_unnest_prunning_parquet.array1) t (a1)
            ->     where id>0
            -> )
            ->  SELECT
            ->      d2,
            ->     array2.struct1.data4,
            ->     array2.struct1.data5
            -> FROM 
            ->     tmp
            ->     CROSS JOIN UNNEST(tmp.nestedarray) t (array2);
  d2  | data4 | data5 
------+-------+-------
 -10- |   100 | -100- 
 -10- |   101 | -101- 
 -11- |   110 | -110- 
 -11- |   111 | -111- 
 -20- |   200 | -200- 
 -20- |   201 | -201- 
 -21- |   210 | -210- 
 -21- |   211 | -211- 
(8 rows)

Query 20240518_032332_00007_qhz93, FINISHED, 1 node
https://localhost:8080/ui/query.html?20240518_032332_00007_qhz93
Splits: 1 total, 1 done (100.00%)
CPU Time: 0.0s total,    80 rows/s,   14KB/s, 9% active
Per Node: 0.0 parallelism,     1 rows/s,   344B/s
Parallelism: 0.0
Peak Memory: 542B
1.04 [2 rows, 359B] [1 rows/s, 344B/s]

Bytes scanned decreased from 423B to 359B for the sample query. We've seen large performance improvements in production queries.

Additional context and related issues

I would really appreciate any comments or feedback. Without a clear direction, I can't extend this further without risking throwing everything away later. Any of the components should be easy to plug in once we have a clear idea of how we want to do it.

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(X) Release notes are required, with the following suggested text:

# Section
* Improve query performance for dereferences on unnest symbols

pettyjamesm · 2024-05-14T19:14:40Z

core/trino-spi/src/main/java/io/trino/spi/subfield/Subfield.java

+// Picked from Presto
+public class Subfield
+{
+    public interface PathElement


Nit: could be sealed

zhaner08 (May 22, 2024): Addressed in rev2
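
For readers unfamiliar with the suggestion, here is a minimal sketch of a sealed `PathElement` hierarchy, assuming Java 17 or later. The two element kinds and the `render` helper are illustrative only; they are not the exact list of path elements in this PR or in Presto's class.

```java
// Hypothetical shape: sealing restricts implementations to the listed types,
// so code that inspects a PathElement knows every possible case.
sealed interface PathElement
        permits NestedField, AllSubscripts {}

// A named field inside a struct, e.g. ".data4"
record NestedField(String name)
        implements PathElement {}

// "Every element" of an array or map, e.g. "[*]"
record AllSubscripts()
        implements PathElement {}

final class PathElements
{
    private PathElements() {}

    static String render(PathElement element)
    {
        // Java 17 compatible; on Java 21 an exhaustive switch over the sealed type could replace this
        if (element instanceof NestedField field) {
            return "." + field.name();
        }
        if (element instanceof AllSubscripts) {
            return "[*]";
        }
        throw new IllegalArgumentException("Unexpected element: " + element);
    }
}
```

Adding a new element kind then requires touching the `permits` clause, which makes every place that enumerates the cases easy to find.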

pettyjamesm · 2024-05-14T19:32:18Z

...main/src/main/java/io/trino/sql/planner/iterative/rule/PushSubscriptLambdaIntoTableScan.java

+        // As a result, only support limited cases now which symbol reference has to be uniquely referenced
+        ImmutableList.Builder<Expression> expressionsBuilder = ImmutableList.<Expression>builder()
+                .addAll(project.getAssignments().getExpressions());
+        List<Expression> expressions = expressionsBuilder.build();


Could just be: List<Expression> expressions = ImmutableList.copyOf(project.getAssignments().getExpressions());

zhaner08 (May 22, 2024): Addressed in rev2

pettyjamesm · 2024-05-14T19:54:29Z

...main/src/main/java/io/trino/sql/planner/iterative/rule/PushSubscriptLambdaIntoTableScan.java

+
+        partialTranslations = partialTranslations.entrySet().stream().filter(entry -> {
+            ArrayFieldDereference arrayFieldDereference = (ArrayFieldDereference) entry.getValue();
+            return arrayFieldDereference.getTarget() instanceof Variable


Nit: Could be return arrayFieldDereference.getTarget() instanceof Variable variable && symbolReferenceNamesCount.get(variable.getTarget().getName()) == 1;

zhaner08 (May 22, 2024): Addressed in rev2
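
The suggestion relies on pattern matching for `instanceof` (Java 16 or later), which binds the cast result in the same expression as the type check. A minimal sketch with hypothetical stand-in types follows; `Expr`, `Variable`, `Constant` and the reference-count map are assumptions for illustration, not the SPI classes.

```java
import java.util.Map;

// Hypothetical stand-ins for the expression types inspected by the rule
interface Expr {}

record Variable(String name)
        implements Expr {}

record Constant(Object value)
        implements Expr {}

final class UniqueReferenceCheck
{
    private UniqueReferenceCheck() {}

    // True only when target is a Variable that is referenced exactly once.
    // The pattern variable is in scope on the right-hand side of && because
    // that side is evaluated only if the instanceof test succeeded.
    static boolean isUniquelyReferencedVariable(Expr target, Map<String, Integer> referenceCounts)
    {
        return target instanceof Variable variable
                && referenceCounts.getOrDefault(variable.name(), 0) == 1;
    }
}
```

This removes the separate cast and keeps the type check and the count check in a single return statement.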

pettyjamesm · 2024-05-14T20:00:34Z

plugin/trino-hive/src/main/java/io/trino/plugin/hive/parquet/ParquetPageSourceFactory.java

+                        combinedPrunedTypes = combinedPrunedTypes.union(prunedType);
+                    }
+                }
+                return Optional.ofNullable(combinedPrunedTypes) // Should never be null since subfields is non-empty.


If combinedPrunedTypes should never be null, then you can just use Optional.of(combinedPrunedTypes).

zhaner08 (May 22, 2024): Addressed in rev2
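
The difference between the two factory methods can be shown with a small standalone sketch (the class and method names are made up for the demonstration). `Optional.of` fails fast on null, so a violated "never null" assumption surfaces at the point of the bug, while `Optional.ofNullable` silently turns it into an empty result.

```java
import java.util.Optional;

final class OptionalFactories
{
    private OptionalFactories() {}

    // Optional.of throws NullPointerException when given null
    static boolean ofRejectsNull()
    {
        try {
            Optional.of((String) null);
            return false;
        }
        catch (NullPointerException e) {
            return true;
        }
    }

    // Optional.ofNullable maps null to Optional.empty(), hiding the broken invariant
    static boolean ofNullableHidesNull()
    {
        return Optional.ofNullable((String) null).isEmpty();
    }
}
```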

zhaner08 · 2024-05-29T18:27:32Z

@martint please take a look when you get a chance; even a high-level comment would be helpful.

github-actions · 2024-06-20T17:02:42Z

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

Praveen2112 · 2024-06-27T14:23:36Z

.idea/icon.png

This change is not required, right?

Praveen2112 · 2024-06-27T14:24:14Z

core/trino-main/src/main/java/io/trino/SystemSessionProperties.java

+                booleanProperty(
+                        ENABLE_PUSH_SUBSCRIPT_LAMBDA_INTO_SCAN,
+                        "Enable Push Subscript Lambda Into Scan feature",
+                        false,


Can we have a config object to toggle this?

Praveen2112 · 2024-06-27T14:26:52Z

core/trino-spi/src/main/java/io/trino/spi/expression/ArrayFieldDereference.java

+    {
+        super(type);
+        this.target = requireNonNull(target, "target is null");
+        this.elementFieldDereferences = requireNonNull(elementFieldDereference, "elementFieldDereference is null");


ImmutableList.copyOf(requireNonNull(elementFieldDereference, "elementFieldDereference is null"))

Can we add a verification of the type as well? i.e. type instanceof ArrayType
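
Both suggestions can be sketched together. This is a simplified illustration with hypothetical types, not the PR's class: `DataType`, `ArrayDataType` and `BigintDataType` stand in for the SPI types, the `String` element type is a placeholder, and `List.copyOf` from the JDK is used in place of Guava's `ImmutableList.copyOf` to keep the example dependency-free.

```java
import java.util.List;

import static java.util.Objects.requireNonNull;

// Hypothetical, simplified stand-ins for the SPI type classes
interface DataType {}

record ArrayDataType(DataType elementType)
        implements DataType {}

record BigintDataType()
        implements DataType {}

record ArrayFieldDereferenceSketch(DataType type, List<String> elementFieldDereferences)
{
    ArrayFieldDereferenceSketch
    {
        requireNonNull(type, "type is null");
        // Verify the declared type: dereferencing array elements only makes sense for arrays
        if (!(type instanceof ArrayDataType)) {
            throw new IllegalArgumentException("type must be an array type, got: " + type);
        }
        // Defensive immutable copy: later changes to the caller's list cannot affect this object
        elementFieldDereferences = List.copyOf(requireNonNull(elementFieldDereferences, "elementFieldDereferences is null"));
    }
}
```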

Praveen2112 · 2024-06-27T14:28:19Z

core/trino-spi/src/main/java/io/trino/spi/expression/ArrayFieldDereference.java

+    @Override
+    public List<? extends ConnectorExpression> getChildren()
+    {
+        return singletonList(target);


Shouldn't we pass the elementFieldDereferences here? They are not constants/literals, so any logic that runs on the children should be applied to the elementFieldDereferences as well.

Praveen2112 · 2024-06-27T14:29:17Z

core/trino-spi/src/main/java/io/trino/spi/expression/ArrayFieldDereference.java

+    @Override
+    public String toString()
+    {
+        return format("(%s).#%s", target, elementFieldDereferences);


Shouldn't the pattern be like #index_1#index_2#...?
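
To make the question concrete, the sketch below contrasts what `format` produces when `%s` receives a list with one possible reading of the chained `#index_1#index_2` form. The helper names and the use of integer indexes are assumptions for illustration; the actual element type in the PR may differ.

```java
import java.util.List;
import java.util.stream.Collectors;

final class DereferenceFormat
{
    private DereferenceFormat() {}

    // Same format string as in the diff: %s renders the list via List.toString()
    static String bracketed(String target, List<Integer> indexes)
    {
        return String.format("(%s).#%s", target, indexes);
    }

    // One possible chained rendering: a "#" before every index
    static String chained(String target, List<Integer> indexes)
    {
        return indexes.stream()
                .map(index -> "#" + index)
                .collect(Collectors.joining("", "(" + target + ")", ""));
    }
}
```

With target `a` and indexes 1 and 2, `bracketed` yields `(a).#[1, 2]` and `chained` yields `(a)#1#2`.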

Praveen2112 · 2024-06-27T14:30:45Z

core/trino-spi/src/main/java/io/trino/spi/subfield/Subfield.java

+import static java.util.Objects.requireNonNull;
+
+// Class to represent subfield. Direct referenced from Presto
+public class Subfield


Subfield as in row type or ?

martint (Jul 25, 2024): We don't need a dedicated abstraction in the SPI to represent this. This information is available (and should be encoded) in the structure of ConnectorExpressions passed to the connector APIs. We can have utilities in the plugin toolkit module to extract the necessary info to make it easier for connector implementers.

zhaner08 (Aug 23, 2024): This is already represented as ArrayFieldDereference in this PR. In that case, won't we still need to extract it into a form that readers can use, regardless of table format?
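
A rough sketch of the kind of extraction utility discussed in this thread: it walks a chain of field dereferences and returns the base column plus the index path, which a connector could then translate into whatever its reader needs. All types here (`Expr`, `Variable`, `FieldDereference`, `Call`, `ColumnPath`) are simplified, hypothetical stand-ins rather than the SPI classes, and array element dereferences are not covered.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified stand-ins for connector expressions
interface Expr {}

record Variable(String name)
        implements Expr {}

record FieldDereference(Expr target, int field)
        implements Expr {}

record Call(String function, List<Expr> arguments)
        implements Expr {}

// Extracted result: base column name plus field indexes from outermost struct to innermost field
record ColumnPath(String column, List<Integer> fieldIndexes) {}

final class DereferencePaths
{
    private DereferencePaths() {}

    // Handles chains such as FieldDereference(FieldDereference(Variable("a"), 2), 0),
    // returning column "a" with indexes [2, 0]. Any other shape yields Optional.empty().
    static Optional<ColumnPath> extract(Expr expression)
    {
        Deque<Integer> indexes = new ArrayDeque<>();
        Expr current = expression;
        while (current instanceof FieldDereference dereference) {
            // The outermost dereference is applied last, so it goes to the end of the path
            indexes.addFirst(dereference.field());
            current = dereference.target();
        }
        if (current instanceof Variable variable) {
            return Optional.of(new ColumnPath(variable.name(), List.copyOf(indexes)));
        }
        return Optional.empty();
    }
}
```

Under this approach the expression tree stays the single source of truth, and any reader-specific representation is derived from it inside the connector.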

Praveen2112 · 2024-06-27T14:39:47Z

core/trino-main/src/main/java/io/trino/sql/planner/ConnectorExpressionTranslator.java

@@ -675,6 +678,32 @@ protected Optional<ConnectorExpression> visitFunctionCall(FunctionCall node, Voi
                return translateLike(node);
            }

+            // Very narrow case that only tries to extract a particular type of lambda expression
+            // TODO: Expand the scope
+            if ("transform".equals(functionName)) {


Does it work only for transform? Don't we have to extend it to other functions like subscript?

zhaner08 (Aug 23, 2024): transform for now

Praveen2112 · 2024-06-27T14:59:18Z

.../java/io/trino/sql/planner/iterative/rule/PushSubscriptLambdaThroughFilterIntoTableScan.java

+ *
+ * TODO: Remove lambda expression after subfields are pushed down
+ */
+public class PushSubscriptLambdaThroughFilterIntoTableScan


If we could push the projections through the filter, PushProjectionsIntoTableScan could take care of them, right?

zhaner08 (Aug 23, 2024): Not necessarily; there could be project->filter->scan at the very beginning of planning for very simple queries

zhaner08 · 2024-07-08T16:47:02Z

Thanks @Praveen2112 for taking a look at my PR. I will definitely revise it based on those comments, but before that, I really want to get confirmation on how we are going to represent Subfields across connectors, so that I can start rewriting all the relevant parts with the new representation. Currently it works very awkwardly: if Subfields exist, they are respected; otherwise the dereference indexes and names are respected, which is the real limitation in Trino at the moment. Since that cleanup will be large, I do not want to start it before the community agrees on it.

github-actions · 2024-08-15T17:03:04Z

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions · 2024-09-17T17:28:09Z

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions · 2024-10-09T17:02:57Z

Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.

Support pushing dereferences within lambdas into table scan

e339078

cla-bot bot added the cla-signed label May 13, 2024

zhaner08 requested a review from martint May 13, 2024 19:00

github-actions bot added the delta-lake (Delta Lake connector) and hive (Hive connector) labels May 13, 2024

zhaner08 self-assigned this May 13, 2024

zhaner08 requested a review from pettyjamesm May 14, 2024 19:58

pettyjamesm reviewed May 14, 2024

Extending support of subscript lambda pushdwon to support projects and filters, with bug fixes, style fixes and unit tests

432eb41

zhaner08 changed the title ~~[WIP] Support pushing dereferences within lambdas into table scan~~ Support pushing dereferences within lambdas into table scan May 22, 2024

zhaner08 added the performance label May 24, 2024

github-actions bot added the stale label Jun 20, 2024

Praveen2112 reviewed Jun 27, 2024

github-actions bot removed the stale label Jun 27, 2024

github-actions bot added the stale label Aug 15, 2024

github-actions bot removed the stale label Aug 26, 2024

zhaner08 mentioned this pull request Aug 28, 2024

Support pushing dereferences within lambdas into table scan #23148

Open

github-actions bot added the stale label Sep 17, 2024

github-actions bot closed this Oct 9, 2024


