Some tables have string-typed partition keys, so users write queries with predicates like:
WHERE CAST(part_col AS DATE) > DATE '1992-01-01'
TupleDomains are derived from such predicates. However, connectors don't actually consume the CAST(part_col AS DATE) > DATE '1992-01-01' predicate, so it stays in ScanFilterProject as a remaining predicate.
This significantly reduces the cache hit ratio, because predicates on part_col often differ from query to query.
Hence, to improve the subquery cache for such cases, we need to:
Introduce a new method such as TupleDomain<ColumnHandle> ConnectorPageSourceProvider#getSplitPredicate, which returns a TupleDomain<ColumnHandle> that describes the split. prunePredicate could use this method internally, so it fits in naturally. In particular, this method will return the partition column value as a Domain#singleValue.
Enhance CommonPlanAdaptation.PlanSignatureWithPredicate to also contain predicates that couldn't be translated into a TupleDomain but reference scan columns, e.g. CAST(part_col AS DATE) > DATE '1992-01-01'.
Enhance CacheDriverFactory to run ExpressionInterpreter on the non-TupleDomain predicates, using NullableValues from getSplitPredicate. This way such predicates can be simplified.
Put the simplified non-TupleDomain predicates into CacheSplitId.
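
To illustrate the intent of the steps above, here is a minimal, self-contained sketch in plain Java. It does not use any Trino classes: `ResidualPredicate`, `simplify`, and `cacheSplitId` are hypothetical names invented for this example, the split predicate is modeled as a simple map from column name to its single string value, and NULL handling is ignored. The idea shown is that once the split pins part_col to one value, the CAST predicate folds to a constant, so queries with different date literals can share a cache key for the same split.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.function.Predicate;

public class SplitPredicateSketch {
    // Hypothetical stand-in for a predicate that could not be turned into a TupleDomain,
    // e.g. CAST(part_col AS DATE) > DATE '1992-01-01'.
    static final class ResidualPredicate {
        final String column;               // scan column the predicate references
        final String text;                 // textual form, used when it cannot be simplified
        final Predicate<String> evaluator; // evaluates the predicate for one column value

        ResidualPredicate(String column, String text, Predicate<String> evaluator) {
            this.column = column;
            this.text = text;
            this.evaluator = evaluator;
        }
    }

    // Builds CAST(column AS DATE) > DATE 'literal' for an ISO-8601 date literal.
    static ResidualPredicate castDateGreaterThan(String column, String literal) {
        LocalDate bound = LocalDate.parse(literal);
        return new ResidualPredicate(
                column,
                "CAST(" + column + " AS DATE) > DATE '" + literal + "'",
                value -> LocalDate.parse(value).isAfter(bound));
    }

    // Assumed analogue of interpreting the predicate with values from getSplitPredicate:
    // if the split pins the column to a single value, fold the predicate to "true"/"false";
    // otherwise keep the original predicate text.
    static String simplify(ResidualPredicate predicate, Map<String, String> splitSingleValues) {
        String value = splitSingleValues.get(predicate.column);
        if (value == null) {
            return predicate.text;
        }
        return String.valueOf(predicate.evaluator.test(value));
    }

    // Hypothetical cache key: the split identity plus the simplified residual predicate.
    static String cacheSplitId(String splitId, String simplifiedPredicate) {
        return splitId + "|" + simplifiedPredicate;
    }

    public static void main(String[] args) {
        Map<String, String> split = Map.of("part_col", "1995-06-01");
        ResidualPredicate q1 = castDateGreaterThan("part_col", "1992-01-01");
        ResidualPredicate q2 = castDateGreaterThan("part_col", "1993-07-15");
        // Both predicates fold to the same constant for this split, so the keys match.
        System.out.println(cacheSplitId("split-7", simplify(q1, split)));
        System.out.println(cacheSplitId("split-7", simplify(q2, split)));
    }
}
```

In this sketch the two queries differ only in the date literal, yet both produce the same key for a split whose part_col value is '1995-06-01'. A predicate on a column that the split does not pin is left unsimplified, so it still distinguishes cache entries.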