Automatically map top-level functions over DataTree objects? #9106

Open
TomNicholas opened this issue Jun 12, 2024 · 0 comments
Labels
enhancement topic-DataTree Related to the implementation of a DataTree class

Comments

Contributor

TomNicholas commented Jun 12, 2024

Is your feature request related to a problem?

Sometimes you might want to map one of the xarray top-level functions (especially xr.concat or xr.merge) over DataTree objects.

Whilst this could potentially be done manually, we could also imagine generalizing top-level functions to handle this out of the box.

Describe the solution you'd like

We would like the following to work:

xr.concat([dt1, dt2], dim='time')

It would return a single DataTree, with xr.concat applied to the sets of datasets in corresponding nodes.

Describe alternatives you've considered

We could instead leave xarray's top-level functions unchanged but still ensure that it's relatively easy to achieve this using map_over_subtree, i.e.

concat_datatrees = datatree.map_over_subtree(xr.concat)
dt_concatenated = concat_datatrees([dt1, dt2], dim='time')

This would still require generalizing map_over_subtree to understand iterables of DataTree objects though (see zarr-developers/VirtualiZarr#84 (comment)).

Finally we could just not support this at all, in which case the only way for users to concatenate contents of datatrees node-wise is via something like

ds_concatenated = xr.concat([tree[node].ds for tree in [dt1, dt2]], dim="time")

but called for every node in the tree.

Additional context

See zarr-developers/VirtualiZarr#84 (comment) for an example of wanting to do this in VirtualiZarr (cc @jonas-spaeth).

This was actually already something we partly discussed in the datatree design meeting (#8747), but I forgot what the conclusion was (do you remember @keewis @flamingbear @owenlittlejohns?).

@TomNicholas TomNicholas added enhancement topic-DataTree Related to the implementation of a DataTree class labels Jun 12, 2024
@TomNicholas TomNicholas added this to To do in DataTree integration via automation Jun 12, 2024