Implement dynamic row filtering #22411
I'm concerned about adding yet more system properties and config for this. I'm fine having a kill switch if the feature has problems, but I'd make them hidden, and generally we should remove them when we think the feature is working.
Since this is a new implementation, we need the properties so that we can easily root-cause any potential issues. They also make it easy to write tests for the feature. The selectivity threshold is something a user might legitimately want to tune for their workload. Making the properties hidden just hinders their use; I would like to keep them as normal properties for now, as there isn't anything harmful about them.
Until now, dynamic filters have been pushed into connectors, which use them to filter data at whatever granularity they support (e.g. partition, bucket, file, split, row group). This change adds evaluation of dynamic filters in the engine, on worker nodes, after the usual static filter (if any) has been evaluated in ScanFilterProject.
Non-selective dynamic filters are automatically detected and removed from execution, so that the overhead of executing these filters is low when they are not useful.
BenchmarkInCodeGenerator columnarEvaluationEnabled

(hitRate)  (inListCount)  (type)  Mode  Cnt  Before Score     After Score     Units
0.1        2              bigint  avgt  12    9.638 ± 0.265   9.138 ± 0.709   us/op
0.1        4              bigint  avgt  12   10.549 ± 0.682   8.410 ± 0.060   us/op
0.1        25             bigint  avgt  12   30.833 ± 4.390   8.967 ± 0.346   us/op
0.1        100            bigint  avgt  12   33.023 ± 5.527   8.691 ± 0.328   us/op
0.1        1000           bigint  avgt  12   34.606 ± 6.841   8.438 ± 0.097   us/op
0.1        10000          bigint  avgt  12   32.668 ± 4.724   8.450 ± 0.121   us/op
How much TPC-DS data?
It's scale factor 1000 (1 TB).
Description
Dynamic row filtering performs fine-grained filtering of rows in the scan operator,
thus greatly improving performance of some queries.
Until now, dynamic filters have been pushed into connectors, which use them for
partition, bucket, split and row-group/stripe pruning. This change adds evaluation of
dynamic filters in the engine, on worker nodes, after the usual static filter (if any) has been
evaluated in ScanFilterProject.
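
To make the evaluation order concrete, here is a minimal sketch of row-level filtering on a single bigint column: the static predicate runs first, and only rows that pass it are checked against the values collected from the join's build side. The class `RowFilterSketch`, the method `selectPositions` and the plain `Set<Long>` representation of the dynamic filter are all hypothetical and chosen for illustration; they are not the classes added by this PR.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.function.LongPredicate;

// Hypothetical illustration only; none of these names come from the Trino code base.
final class RowFilterSketch
{
    private RowFilterSketch() {}

    // Returns the positions in `column` whose value passes the static filter
    // and is also contained in the dynamic filter's value set.
    static int[] selectPositions(long[] column, LongPredicate staticFilter, Set<Long> dynamicFilterValues)
    {
        int[] selected = new int[column.length];
        int count = 0;
        for (int position = 0; position < column.length; position++) {
            long value = column[position];
            // The static filter is checked first; the dynamic filter only sees surviving rows.
            if (staticFilter.test(value) && dynamicFilterValues.contains(value)) {
                selected[count++] = position;
            }
        }
        return Arrays.copyOf(selected, count);
    }
}
```

Returning selected positions rather than copying rows is an assumption of this sketch; it simply keeps the example short.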
Non-selective dynamic filters are automatically detected and removed from execution,
so that the overhead of executing these filters is low when they are not useful.
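
One way such detection could work is sketched below, assuming the filter is judged by the fraction of input rows it retains once enough rows have been observed. The class `SelectivityTracker`, its parameters `selectivityThreshold` and `minSampleRows`, and the exact disabling rule are assumptions made for illustration; the PR's actual heuristic may differ.

```java
// Hypothetical illustration only; names and the disabling rule are assumptions.
final class SelectivityTracker
{
    private final double selectivityThreshold; // assumed: maximum retained fraction for the filter to stay enabled
    private final long minSampleRows;          // assumed: rows to observe before making a decision
    private long inputRows;
    private long outputRows;
    private boolean enabled = true;

    SelectivityTracker(double selectivityThreshold, long minSampleRows)
    {
        this.selectivityThreshold = selectivityThreshold;
        this.minSampleRows = minSampleRows;
    }

    boolean isEnabled()
    {
        return enabled;
    }

    // Records the result of filtering one page and disables the filter
    // if it has retained too large a fraction of the rows seen so far.
    void record(long pageInputRows, long pageOutputRows)
    {
        if (!enabled) {
            return;
        }
        inputRows += pageInputRows;
        outputRows += pageOutputRows;
        if (inputRows >= minSampleRows && (double) outputRows / inputRows > selectivityThreshold) {
            enabled = false;
        }
    }
}
```

In this sketch a disabled filter never re-enables itself, which keeps the steady-state cost at a single boolean check per page; a real implementation might revisit that choice.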
Additional context and related issues
Fixes #13305
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: