Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ArrayData::try_new/new_unchecked should take NullBuffer or BooleanBuffer for nulls to preserve offsets #4810

Closed
jhorstmann opened this issue Sep 13, 2023 · 2 comments
Labels
enhancement Any new improvement worthy of a entry in the changelog

Comments

@jhorstmann
Copy link
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Since the introduction of BooleanBuffer the validity and data buffers of an array could have differing offsets. For example the following test passes:

let values = BooleanBuffer::from(vec![true, false, true, false]).slice(1, 3);
let nulls = BooleanBuffer::from(vec![true, false, true, false, true]).slice(2, 3);

let array = BooleanArray::new(values, Some(NullBuffer::new(nulls)));

let data = array.into_data();
assert_eq!(data.offset(), 1);
assert_eq!(data.nulls().map(|n| n.offset()), Some(2));

However there seems to be no constructor of ArrayData that allows creating such data directly. Both try_new and new_unchecked take an untyped Buffer for the null_bit_buffer.

Describe the solution you'd like

The null_bit_buffer parameter should be changed to be of type Option<NullBuffer>. This should also simplify some of the validation logic in try_new and allow to remove the null_count parameter from new_unchecked.

Describe alternatives you've considered

The parameter could also be of type Option<BooleanBuffer> and keep the null_count parameter, but using NullBuffer is probably the better encapsulation.

Additional context

@jhorstmann jhorstmann added the enhancement Any new improvement worthy of a entry in the changelog label Sep 13, 2023
@tustvold
Copy link
Contributor

tustvold commented Sep 13, 2023

ArrayData::try_new and ArrayData::new_unchecked simply call through to ArrayDataBuilder which provides support for setting nulls directly. Perhaps callsites that wish to preserve nulls could just be altered to use ArrayDataBuilder directly?

Edit: or even better the array constructors

@tustvold
Copy link
Contributor

Just doing some backlog pruning, and I believe this functionality is already provided by ArrayDataBuilder

@tustvold tustvold closed this as not planned Won't fix, can't repro, duplicate, stale Mar 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Any new improvement worthy of a entry in the changelog
Projects
None yet
Development

No branches or pull requests

2 participants