Preserve order in left join for cudf-polars #16268

Merged

Conversation

wence-
Contributor

@wence- wence- commented Jul 12, 2024

Description

Unlike all other joins, polars provides an ordering guarantee for left joins. By default libcudf does not, so we need to order the gather maps in this case.
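
To illustrate what "ordering the gather maps" means, here is a minimal pure-Python sketch. It is not the code in this PR, which operates on device columns through pylibcudf. The function name `order_left_join_maps` and the sample data are hypothetical, and breaking ties by the right-hand index is an assumption about the desired order among multiple matches.

```python
def order_left_join_maps(left_map, right_map):
    """Reorder paired gather maps so output rows follow left-table order.

    left_map[i] and right_map[i] are the row indices joined to form output
    row i. None in right_map marks a left row with no match. Ties on the
    left index (one left row matching several right rows) are broken by
    the right index, which is an assumption made for this sketch.
    """
    def key(i):
        r = right_map[i]
        return (left_map[i], -1 if r is None else r)

    order = sorted(range(len(left_map)), key=key)
    return [left_map[i] for i in order], [right_map[i] for i in order]


# Gather maps as an unordered join might return them (hypothetical data).
left_map = [2, 0, 1, 0]
right_map = [5, 3, None, 1]

ordered_left, ordered_right = order_left_join_maps(left_map, right_map)
print(ordered_left)
print(ordered_right)
```

Gathering the left and right tables with the reordered maps then yields rows in left-table order, at the cost of an extra sort that the other join types do not need.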

While here: ordering the maps would need yet another hard-coded int32 for something that should be size_type, so expose type_to_id in Cython and plumb it through.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@wence- wence- requested a review from a team as a code owner July 12, 2024 11:24
@github-actions github-actions bot added labels Python (Affects Python cuDF API), cudf.polars (Issues specific to cudf.polars), pylibcudf (Issues specific to the pylibcudf package) Jul 12, 2024
@wence- wence- added labels improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change) Jul 12, 2024
@wence- wence- force-pushed the wence/fea/polars-left-join-order branch from a89561e to 7bd3aa5 Compare July 16, 2024 14:44
Instead use new exposed pylibcudf SIZE_TYPE_ID.
We can use the constructor that takes an && reference to a
device_uvector. This means we don't accidentally overflow size_type if
the gather map is too large for a column, and avoids hardcoding the
type id for the entries.
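
The overflow concern can be sketched as follows, assuming size_type is a signed 32-bit integer (as the int32 remark in the description suggests). `SIZE_TYPE_MAX` and `checked_row_count` are hypothetical names for illustration, not part of libcudf or pylibcudf.

```python
# Assumption: size_type is a signed 32-bit integer, so a column can hold
# at most 2**31 - 1 rows.
SIZE_TYPE_MAX = 2**31 - 1


def checked_row_count(num_rows: int) -> int:
    """Return num_rows if it fits in size_type, else raise OverflowError."""
    if not 0 <= num_rows <= SIZE_TYPE_MAX:
        raise OverflowError(
            f"{num_rows} rows do not fit in a column (max {SIZE_TYPE_MAX})"
        )
    return num_rows


print(checked_row_count(1_000))  # fits comfortably

try:
    checked_row_count(2**31)  # one past the largest representable count
except OverflowError as exc:
    print("rejected:", exc)
```

A buffer's length can exceed this limit, so a conversion that checks the size rather than narrowing it silently is the safer choice.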
@wence- wence- force-pushed the wence/fea/polars-left-join-order branch from 8a86fd2 to 652276b Compare July 19, 2024 14:10
@wence-
Contributor Author

wence- commented Jul 19, 2024

Fixed merge conflicts, ready for another look.

@wence-
Contributor Author

wence- commented Jul 19, 2024

/merge

@rapids-bot rapids-bot bot merged commit dc62177 into rapidsai:branch-24.08 Jul 19, 2024
85 checks passed
@wence- wence- deleted the wence/fea/polars-left-join-order branch July 19, 2024 20:50
Labels
cudf.polars (Issues specific to cudf.polars), improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change), pylibcudf (Issues specific to the pylibcudf package), Python (Affects Python cuDF API)
Projects
Status: Done

3 participants