Refactor batch coalesce to be based solely on batch data size #1133
#1116 placed a 2GB limit on the GPU batch target size, which prevents us from trying to coalesce batches that could exceed the 2GB element limit for any single column. This PR refactors batch coalesce so it is based solely on data size rather than needing to also track string sizes. This significantly simplifies the coalesce logic and also removes the need for the plugin to traverse the column hierarchy to calculate individual column sizes when the batch is single-buffer based (e.g. a compressed batch or a `contiguous_split` batch).

Once cudf's `ColumnVector#getDeviceMemorySize` is updated to account for nested columns, even normal GPU batches will only need the plugin to traverse the top-level columns, with no need to check for string/list types.

Note that the original batch coalesce code checked for batches containing `RapidsHostColumnVector`, but those column types should never be seen by a batch coalesce. These types of batches only exist specifically between a partition and the shuffle, and there should never be a coalesce in-between. This PR also removes the implicit size fallback if the column type is ignored.
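
For reviewers who want a picture of what "based solely on data size" means, here is a minimal sketch. It is not the plugin's code: the class `SizeBasedCoalesce`, the method `coalesce`, and the use of plain byte counts in place of real batches are all assumptions made for illustration. In the plugin, the per-batch size would come from the batch's single device buffer or from summing the device memory size of its columns.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in the plugin.
class SizeBasedCoalesce {
    /**
     * Groups incoming batch sizes (in bytes) into runs whose combined size
     * stays within targetBytes. Each inner list stands for the batches that
     * would be concatenated into one output batch.
     */
    static List<List<Long>> coalesce(List<Long> batchSizes, long targetBytes) {
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : batchSizes) {
            // Flush the pending group if adding this batch would exceed the target.
            if (!current.isEmpty() && currentBytes + size > targetBytes) {
                groups.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            // A single batch larger than the target still forms its own group.
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }
}
```

The only input to the decision is the byte count of each batch, so no per-column or per-type bookkeeping (such as separate string size tracking) is needed in this sketch.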