Dynamic partition pruning #1102

Merged Jun 22, 2023 (35 commits)

sarahyurick (Collaborator) commented Mar 28, 2023

WIP

Update: Should be ready to go, in combination with #1160!

codecov-commenter commented Mar 28, 2023

Codecov Report

Merging #1102 (6c32d17) into main (c9eeb2c) will increase coverage by 0.11%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #1102      +/-   ##
==========================================
+ Coverage   81.48%   81.60%   +0.11%     
==========================================
  Files          78       78              
  Lines        4509     4511       +2     
  Branches      828      828              
==========================================
+ Hits         3674     3681       +7     
+ Misses        653      644       -9     
- Partials      182      186       +4     

| Impacted Files | Coverage Δ |
|---|---|
| dask_sql/context.py | 94.00% <100.00%> (+0.04%) ⬆️ |

... and 1 file with indirect coverage changes

sarahyurick (Collaborator, Author) commented

Waiting on https://github.com/rapidsai/ops/issues/2644 to resolve some datatype issues for the gpuCI tests.

sarahyurick marked this pull request as ready for review May 10, 2023 18:38
sarahyurick marked this pull request as draft May 15, 2023 22:32
sarahyurick marked this pull request as ready for review May 30, 2023 22:23

jdye64 (Collaborator) left a comment

Thanks for the changes, @sarahyurick. Looks good to me.

Comment on lines 105 to 107

    if dask_config.get("sql.dynamic_partition_pruning"):
        self.context.apply_dynamic_partition_pruning()

Collaborator:

Sometimes configs can be passed through the c.sql method as well: https://github.com/dask-contrib/dask-sql/blob/main/dask_sql/context.py#L468.

Unless you need this during context initialization, it might make sense to set this variable somewhere in _get_ral before the query is parsed.

Collaborator (Author):

Thanks! Done with 182fe8e - with this, DPP can be configured either globally (default True) or on a per-query basis with the config_options parameter.


ayushdg (Collaborator) left a comment

Changes generally lgtm! Thanks a lot for this.

Just a couple of minor suggestions.

Would it be possible to follow up this PR with some tests, both for the config option and for DPP itself?

ayushdg merged commit f8bf06c into dask-contrib:main Jun 22, 2023
19 checks passed