
[Question] How do I force a computation of a tensor/force a dependency between 2 tensors? #610

Closed
saharNooby opened this issue Nov 13, 2023 · 2 comments

Comments


saharNooby commented Nov 13, 2023

I will use ggml_map_custom1 to perform a complicated custom operation that is too costly to implement naively using ggml operations. This operation depends not only on the value of its argument struct ggml_tensor * a, but also requires 6 other tensors, one of which will be written to.

I plan to pass these additional 6 tensors in void * userdata. However, I expect that ggml will not see that these tensors need to be computed before the custom op. I need some way to force ggml to compute them first.

As an ugly crutch, I've considered "entangling" tensors through mathematically useless operations, but this crutch fails:

// Makes x dependent on y: when x is used in a computation, y will already be computed.
// No actual changes to either x or y happen.
static struct ggml_tensor * rwkv_make_dependency(struct ggml_context * ctx, struct ggml_tensor * x, struct ggml_tensor * y) {
    struct ggml_tensor * zero = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1);

    if (zero->data != NULL) {
        ggml_set_f32_1d(zero, 0, 0.0F);
    }

    // Take a single element of y.
    struct ggml_tensor * y_elem = ggml_view_1d(ctx, y, 1, sizeof(float));
    // Zero out the element, keeping the dependency.
    // We assume that ggml can't optimize away multiplication by zero.
    y_elem = ggml_mul_inplace(ctx, zero, y_elem);

    // Take a single element of x.
    struct ggml_tensor * x_elem = ggml_view_1d(ctx, x, 1, sizeof(float));
    // Make x_elem depend on y_elem, which depends on y.
    x_elem = ggml_add_inplace(ctx, x_elem, y_elem);

    // Restore the shape of x back.
    // FAILS HERE: expects x_elem to have n_elements >= x.
    return ggml_view_4d(
        ctx,
        x_elem,
        x->ne[0],
        x->ne[1],
        x->ne[2],
        x->ne[3],
        x->nb[1],
        x->nb[2],
        x->nb[3],
        0
    );
}

It would be great to have an answer to either of these 2 questions:

  • how do I force computation of a tensor before some other operation happens?
  • how do I pass arbitrary shaped tensors to ggml_map_custom1, while having them properly computed?

For context: the operation is wkv in the RWKV v5 model. The current implementation uses matmuls and is too slow and memory-hungry for serious usage.

slaren (Collaborator) commented Nov 13, 2023

The short answer is that you cannot do what you want with the current custom op API, but you can kind of hack it to make it work for your case.

You can simulate a ggml_map_custom6 by adding the dependencies manually to src, such as:

struct ggml_tensor * t = ggml_map_custom1(ctx, dep1, fun, ...);
t->src[1] = dep2;
t->src[2] = dep3;
t->src[3] = dep4;
t->src[4] = dep5;
t->src[5] = dep6;

void fun(struct ggml_tensor * dst, const struct ggml_tensor * dep1, int ith, int nth, void * userdata) {
    struct ggml_tensor * dep2 = dst->src[1];
    struct ggml_tensor * dep3 = dst->src[2];
    // etc
}

The size of src is determined by GGML_MAX_SRC, and you may need to increase it if you have more than 6 dependencies.

To propagate the op as a dependency in the graph, you should write the result to the tensor returned by ggml_map_customN (which is also passed as dst to the op function). There are no restrictions on the shapes of the tensors, but the shape of the dst tensor is taken from the shape of the first dependency. If that doesn't work for you, you can use the inplace version and create a new tensor with the shape that you need, such as:

struct ggml_tensor * t = ggml_map_custom1_inplace(ctx, ggml_new_tensor_4d(...), ...);

The tensor returned by ggml_map_customN_inplace is a view of the first input tensor.

It is also possible to use ggml_build_forward_expand to immediately add an op and all its dependencies to the graph.

saharNooby (Author) commented

Thank you very much! It works. I had to up GGML_MAX_SRC to 8.
