add some new ops, fix some operators and add batch operations to certain operators. #747

leejet · 2024-02-25T13:10:27Z

I tested it in the sd.cpp project, and everything works well.

FSSRepo · 2024-02-27T14:17:28Z

@leejet I believe the corresponding Metal kernels should be added before merging this, to maintain compatibility. That said, it won't help much in practice: sd.cpp doesn't benefit from the Metal backend, because matrix operations are very slow there.

leejet · 2024-02-27T16:21:47Z

OK, I will find time to add support for the Metal backend.

leejet · 2024-03-02T11:55:00Z

@FSSRepo I have implemented Metal backend support for ggml_arange and ggml_timestep_embedding, and tested it with the test-arange and test-timestep-embedding test cases to ensure it works correctly. Could you find some time to review and merge this PR?

FSSRepo · 2024-03-02T11:56:23Z

OK, I will test this PR with the latest sd.cpp.

slaren · 2024-03-02T13:29:04Z

Does arange affect the performance significantly? Are there any downsides to constructing the tensor in a CPU buffer and copying it to the backend as an input?
Does timestep_embedding have any use outside of SD? And again, would it hurt performance significantly if it was implemented on the CPU and copied to the backend as an input?

leejet · 2024-03-02T14:52:11Z

These two operations won't significantly improve performance, but in stable video diffusion they have to be performed multiple times within the UNet, and their parameters may change with each execution. If I don't execute these operations as part of the graph computation, I have to remember the tensor produced by each operation together with its parameters, and then copy the data in after ggml_gallocr_alloc_graph, which makes the process more complex. Implementing them as ggml operators makes the code more elegant.

slaren

It would be good to have tests for these operators in test-backend-ops.

src/ggml-cuda.cu

src/ggml.c

tests/test-timestep_embedding.cpp

ggerganov

After adding tests to test-backend-ops we can merge

include/ggml/ggml.h

src/ggml-cuda.cu

src/ggml-metal.m

src/ggml-metal.metal

ggerganov · 2024-03-02T15:44:32Z

src/ggml.c

                    }
                }
            }
-            float variance = sum2 / (ne00 * ne01 * step);
+            float variance = sum2;


ggerganov · 2024-03-02

What was the reason for moving this normalization inside the loop?

leejet · 2024-03-03

Previously, the code calculated the sum of (v * v) and then divided by (ne00 * ne01 * step). I was concerned that in certain situations this might lead to overflow or precision issues, so I changed it to accumulate (v * v) / (ne00 * ne01 * step) directly to obtain the variance.

ggerganov · 2024-03-03

It shouldn't cause problems, since we accumulate into double (i.e. ggml_float). I've introduced a separate accumulator per row and restored the normalization at the end, in order to avoid an extra division in the loop.
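
The two approaches discussed in this thread can be compared with a small standalone sketch. It is not the ggml.c code: the function names and the flat row-major layout are assumptions made for illustration, and plain `double` stands in for `ggml_float`.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

// Accumulate squared deviations in double, one accumulator per row,
// and divide once at the end (the approach that was restored).
static double variance_norm_at_end(const float *x, size_t rows, size_t cols, double mean) {
    assert(x != NULL && rows > 0 && cols > 0);
    double sum2 = 0.0;
    for (size_t r = 0; r < rows; r++) {
        double sumr = 0.0; // per-row accumulator
        for (size_t c = 0; c < cols; c++) {
            const double v = (double) x[r * cols + c] - mean;
            sumr += v * v;
        }
        sum2 += sumr;
    }
    return sum2 / (double) (rows * cols);
}

// Divide every term inside the loop (the variant that was questioned).
// It gives the same result up to rounding, at the cost of one division per element.
static double variance_norm_in_loop(const float *x, size_t rows, size_t cols, double mean) {
    assert(x != NULL && rows > 0 && cols > 0);
    const double n = (double) (rows * cols);
    double var = 0.0;
    for (size_t i = 0; i < rows * cols; i++) {
        const double v = (double) x[i] - mean;
        var += v * v / n;
    }
    return var;
}
```

With a double accumulator, float inputs would need to be extremely large or numerous before the running sum loses meaningful precision, so the single division at the end is the cheaper choice.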

Co-authored-by: Georgi Gerganov

leejet · 2024-03-03T05:23:55Z

I have modified the code based on the suggestions from the review and added tests to test-backend-ops. There are two uncertain points still awaiting feedback from the maintainer.

leejet added 6 commits February 25, 2024 12:20

cuda: fix group_norm

33ee175

cuda: add batch inference support for ggml_pad/ggml_upscale

4339ad4

add ggml_arrange

bfa2439

add ggml_timestep_embedding

04577c9

update ggml_arange/ggml_timestep_embedding tests

f3bdd01

cuda: fix im2col

4212b75

leejet added 4 commits March 2, 2024 18:13

add ggml_arange/ggml_timestep_embbeding support for metal backend

03d9d66

fix some bugs

b94c066

Merge branch 'master' into batch-inference

165540e

fix some bugs

9cc5cb2

slaren reviewed Mar 2, 2024

slaren requested a review from ggerganov March 2, 2024 15:24

ggerganov approved these changes Mar 2, 2024

leejet and others added 6 commits March 3, 2024 12:46

Update include/ggml/ggml.h

3727444

Co-authored-by: Georgi Gerganov

Update src/ggml-cuda.cu

f7398db

Co-authored-by: Georgi Gerganov

Update src/ggml-metal.m

43d0a29

Co-authored-by: Georgi Gerganov

Update src/ggml-metal.m

31b3c7a

Co-authored-by: Georgi Gerganov

Update src/ggml-metal.metal

a2561b6

Co-authored-by: Georgi Gerganov

modify according to the review comments

b8f313e

ggerganov added 2 commits March 3, 2024 09:25

ggml : fix compile warnings + code style

1199eed

ggml : normalize compute_forward calls + fix seg fault in debug

270e412

ggerganov approved these changes Mar 3, 2024

ggerganov requested a review from slaren March 3, 2024 07:44

minor

c210396

slaren approved these changes Mar 3, 2024

slaren merged commit 2746808 into ggerganov:master Mar 3, 2024
4 checks passed
