Handle concatenation with cat #73

Closed
rafaqz opened this issue Aug 28, 2022 · 4 comments

Comments

@rafaqz
Collaborator

rafaqz commented Aug 28, 2022

Currently cat is very slow, and may even give the wrong answer?

See: rafaqz/Rasters.jl#298

It could be more efficient to allocate the final array and then broadcast into views of it, instead of copying each disk array to memory separately.
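
As a rough illustration of the preallocate-and-broadcast idea, here is a minimal sketch using plain in-memory arrays. The function name `cat_into_views` is hypothetical and not part of DiskArrays.jl; it assumes all inputs share the same element type, number of dimensions, and sizes along the non-concatenated dimensions.

```julia
# Hypothetical helper (not part of DiskArrays.jl): concatenate along `dim`
# by allocating the destination once and broadcasting into views of it.
function cat_into_views(arrays::AbstractVector{<:AbstractArray{T,N}}, dim::Int) where {T,N}
    first_size = size(first(arrays))
    total = sum(size(a, dim) for a in arrays)
    out_size = ntuple(d -> d == dim ? total : first_size[d], N)
    out = Array{T,N}(undef, out_size)
    offset = 0
    for a in arrays
        len = size(a, dim)
        rng = (offset + 1):(offset + len)
        # Full extent in every dimension except `dim`, where we use `rng`.
        inds = ntuple(d -> d == dim ? rng : (1:first_size[d]), N)
        # One broadcast per source array; throws if the shapes do not match.
        view(out, inds...) .= a
        offset += len
    end
    return out
end

a = rand(2, 3); b = rand(4, 3)
result = cat_into_views([a, b], 1)
```

With a disk-backed source, each broadcast would presumably turn into one read per array rather than element-wise access, which is the point of the suggestion.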

@meggart
Owner

meggart commented Aug 31, 2022

+100 for defining a lazy DiskArray version of cat. As you mentioned, as a first step it would be good to move the relevant types from DiskArrayTools.jl into this package and then try to emulate cat from Base as well as we can. What these types currently do not support is diagonal concatenation, as in this example:

julia> cat(true, trues(2,2), trues(4)', dims=(1,2))
4×7 Matrix{Bool}:
 1  0  0  0  0  0  0
 0  1  1  0  0  0  0
 0  1  1  0  0  0  0
 0  0  0  1  1  1  1

but this might not be the most relevant method?

@meggart
Owner

meggart commented Aug 31, 2022

As a short demonstration of how these types work: the constructors do not have a dims keyword. You simply pass an n-d array of DiskArrays and the stacking happens according to the dimensions of that array. For stacks, new dimensions are created:

julia> using DiskArrayTools

julia> a = rand(5,6);

julia> stack_1d = diskstack([a,a,a])
Disk Array with size 5 x 6 x 3

julia> stack_2d = diskstack(fill(a,2,3))
Disk Array with size 5 x 6 x 2 x 3

julia> DiskArrayTools.eachchunk(stack_2d)
1×1×2×3 DiskArrays.GridChunks{4}:
[:, :, 1, 1] =
 (1:5, 1:6, 1:1, 1:1)

[:, :, 2, 1] =
 (1:5, 1:6, 2:2, 1:1)

[:, :, 1, 2] =
 (1:5, 1:6, 1:1, 2:2)

[:, :, 2, 2] =
 (1:5, 1:6, 2:2, 2:2)

[:, :, 1, 3] =
 (1:5, 1:6, 1:1, 3:3)

[:, :, 2, 3] =
 (1:5, 1:6, 2:2, 3:3)

It works in a similar way for concatenation. Here, arrays can have different lengths along the concatenated dimensions, but the tiles themselves have to form a grid:

julia> a = rand(3); b=rand(4);

julia> a_concat = ConcatDiskArray([a,b])
Disk Array with size 7

julia> DiskArrayTools.eachchunk(a_concat)
2-element DiskArrays.GridChunks{1}:
 (1:3,)
 (4:7,)

julia> a = rand(2,3); b = rand(2,1); c = rand(4,3); d = rand(4,1);

julia> a_concat_2d = ConcatDiskArray([[a] [b]; [c] [d]])
Disk Array with size 6 x 4

julia> DiskArrayTools.eachchunk(a_concat_2d)
2×2 DiskArrays.GridChunks{2}:
 (1:2, 1:3)  (1:2, 4:4)
 (3:6, 1:3)  (3:6, 4:4)

Of course, this will work in higher-dimensional cases as well.
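
To make the "tiles have to form a grid" condition concrete, here is a small sketch in plain Julia that computes the index ranges each tile would occupy in the concatenated result. `tile_ranges` is a hypothetical helper written for illustration, not the DiskArrayTools implementation, and it assumes the tile array uses 1-based indexing.

```julia
# Hypothetical helper: given an N-d array of tiles, return the index ranges
# each tile occupies in the concatenated array, or throw if the tiles do not
# line up on a grid.
function tile_ranges(tiles::AbstractArray{<:AbstractArray,N}) where {N}
    # Along each dimension, read the tile lengths off the first line of tiles.
    lengths = ntuple(N) do d
        [size(tiles[ntuple(k -> k == d ? i : 1, N)...], d) for i in axes(tiles, d)]
    end
    # Every tile must agree with those lengths, otherwise there is no grid.
    for I in CartesianIndices(tiles), d in 1:N
        size(tiles[I], d) == lengths[d][I[d]] ||
            throw(ArgumentError("tiles do not form a grid"))
    end
    stops = map(cumsum, lengths)
    return map(CartesianIndices(tiles)) do I
        ntuple(d -> (stops[d][I[d]] - lengths[d][I[d]] + 1):stops[d][I[d]], N)
    end
end

a = rand(2,3); b = rand(2,1); c = rand(4,3); d = rand(4,1);
ranges = tile_ranges([[a] [b]; [c] [d]])
```

For the 2×2 example above this gives the same ranges as the `eachchunk` output, and swapping `b` for an array with three rows makes the grid check fail.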

@rafaqz
Collaborator Author

rafaqz commented Aug 31, 2022

Perfect. We will also need to handle the non-lazy case of concatenating a disk array with other kinds of AbstractArray. Maybe by broadcasting each array into views of a preallocated destination array, or by wrapping the other arrays in some way so they can coexist with the lazy disk array?

@meggart
Owner

meggart commented Sep 1, 2022

The types here work with any array (in the example I pass plain arrays), so you can freely mix Disk- and non-DiskArrays, but they will be lazy by default. You can always call Array(...) to materialize the result.
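
The lazy-by-default behaviour can be sketched with a toy wrapper in plain Julia. `LazyVCat` is a hypothetical type made up for this illustration; it is not how ConcatDiskArray is implemented, only handles vectors, and assumes 1-based parts.

```julia
# Hypothetical toy type: a lazy 1-d concatenation of arbitrary vectors.
# Nothing is copied until the result is indexed or materialized.
struct LazyVCat{T,A<:Tuple} <: AbstractVector{T}
    parts::A
end

# The element type is the promoted element type of all parts.
function LazyVCat(parts::AbstractVector...)
    T = promote_type(map(eltype, parts)...)
    return LazyVCat{T,typeof(parts)}(parts)
end

Base.size(v::LazyVCat) = (sum(length, v.parts),)

function Base.getindex(v::LazyVCat{T}, i::Int) where {T}
    @boundscheck checkbounds(v, i)
    # Walk the parts until the index falls inside one of them.
    for p in v.parts
        i <= length(p) && return convert(T, p[i])
        i -= length(p)
    end
    throw(BoundsError(v, i))
end

lazy = LazyVCat(1:3, [4.0, 5.0])   # mixes a range and a Vector
materialized = Array(lazy)          # copies into memory only here
```

Because the wrapper only needs `size` and `getindex`, any AbstractVector can be mixed in, and `Array(...)` falls back to the generic copy path.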


2 participants