
Matmul on 4d tensors with cuda backend #672

Closed
balisujohn opened this issue Dec 29, 2023 · 8 comments

Comments

@balisujohn
Contributor

balisujohn commented Dec 29, 2023

Thanks to all contributors for your work on ggml so far.

I have tensors with shape [18,18,16,4] and [18,64,16,4] and I am trying to multiply them together to get a tensor of shape [64,18,16,4]. It seems like the resulting tensor in the computational graph has the right shape, but I get this error

GGML_ASSERT: /home/john/ggml/src/ggml-cuda.cu:7387: src0->ne[3] == 1

I am using a version of ggml effectively the same as this commit. Is there a canonical way to broadcast matmul the first 2 dimensions of 4d tensors with ggml using the CUDA backend, or would I need to add this?

@balisujohn
Contributor Author

(It occurs to me I might be able to consolidate dimensions 3 and 4 to make them into 3d tensors before multiplying. I'll give that a try)

@ggerganov
Owner

Yes, you should be able to ggml_reshape_3d to achieve this.

However, where is this assert? I cannot seem to find it. The only one I can find is in ggml_cuda_cpy:

GGML_ASSERT(src0->ne[3] == 1);

@balisujohn
Contributor Author

balisujohn commented Dec 29, 2023

It looks like the assert was actually triggered by a ggml_cont call: I have a 4d tensor that is the result of a ggml_permute, so it is not contiguous. If I try ggml_reshape_3d I get

GGML_ASSERT: /home/john/ggml/src/ggml.c:6882: ggml_is_contiguous(a)

And if I try ggml_cont_3d, I get

GGML_ASSERT: /home/john/ggml/src/ggml-cuda.cu:7387: src0->ne[3] == 1

So this seems like a bit of a catch-22. Is there a way to get around this?

If I try to use the raw 4d tensor in the ggml_mul_mat directly, I get

GGML_ASSERT: /home/john/ggml/src/ggml.c:6496: !ggml_is_transposed(a)

Which is what motivated the ggml_cont in the first place.

@balisujohn
Contributor Author

OK, so the deal is that GGML_OP_CONT results in ggml_cuda_cpy being called in ggml-cuda.cu, and ggml_cuda_cpy asserts that both tensors it operates on are at most 3d. Would there be interest in making ggml_cuda_cpy support 4d tensors? I might be willing to work on it myself.

@balisujohn
Contributor Author

balisujohn commented Dec 31, 2023

(It seems like 4d cont is implemented in ggml.c, so this is mostly a task of porting the existing for-loop implementation into a CUDA kernel.)

@slaren
Collaborator

slaren commented Dec 31, 2023

Yes, it would be great to fix these issues in the CUDA backend. Most of the code in the CUDA backend was written specifically for llama.cpp, and over time it has been expanded to be more general, but there is still a lot of work to do.

@balisujohn
Contributor Author

I think I got it to work with ggml_view_3d, so I'm closing this issue.

@balisujohn
Contributor Author

Reopening pending a PR to fix this.

@balisujohn balisujohn reopened this Jan 8, 2024
3 participants