-
Notifications
You must be signed in to change notification settings - Fork 884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW] Split out cudf::distinct_count from drop_duplicates.cu #6822
[REVIEW] Split out cudf::distinct_count from drop_duplicates.cu #6822
Conversation
@@ -0,0 +1,196 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not new code but was just moved from the drop_duplicates.cu
source file here to its own source file.
@@ -151,44 +173,6 @@ column_view get_unique_ordered_indices(cudf::table_view const& keys, | |||
} | |||
} | |||
|
|||
cudf::size_type distinct_count(table_view const& keys, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code was moved to the new distinct_count.cu
source file.
@davidwendt did you do any experimentation with removing pragmas from thrust to see what happens if things aren't inlined in this situation and the performance implications? |
Unroll may have implications for |
Codecov Report
@@ Coverage Diff @@
## branch-0.17 #6822 +/- ##
============================================
Coverage 81.94% 81.94%
============================================
Files 96 96
Lines 16164 16164
============================================
Hits 13246 13246
Misses 2918 2918 Continue to review full report at Codecov.
|
rerun tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmake lgtm
One of our longest compile times is for
drop_duplicates.cu
measured currently at about 15 minutes. This file contains two cudf APIs:cudf::drop_duplicates
andcudf::distinct_count
. They share no code so it is reasonable to move thedistinct_count
to its own source file. Both files individually have a compile time around 7 minutes.Also, analyzing the
drop_duplicates
source code I found that the table row-comparator is used in twothrust::copy_if()
calls which inlines this large comparator function many times. And, the firstcopy_if
has a lambda with two calls to the comparator meaning it is inlined twiced for each time it is inlined bycopy_if
. This is the main cause of the increased compile time and code size. I found a way to combine the logic for both thecopy_if
lambdas and further reduce the lambda to include only one copy of the comparator. Thecopy_if
itself will still inline this many times but should reduce it by a factor of 3.The
drop_duplicates
does not have a gbenchmark so I added one in this PR to measure before and after this change. No measurable effect was found so I'm including the change in this PR.The
distinct_count
API passes the row-comparator to thethrust::count_if
but only once so the same improvement does not apply there.Note that no new function has been added (or removed). This is simply an improvement to compile time and size for this source file. The compile time for
drop_duplicates.cu
is now less than 5 minutes.