Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ggml : improve custom operator support #294

Closed
ggerganov opened this issue Jun 25, 2023 · 4 comments · Fixed by #422
Closed

ggml : improve custom operator support #294

ggerganov opened this issue Jun 25, 2023 · 4 comments · Fixed by #422
Labels
enhancement New feature or request good first issue Good for newcomers refactoring Refactoring

Comments

@ggerganov
Copy link
Owner

We have rudimentary support for adding F32 custom ggml operators in user space via:

ggml/include/ggml/ggml.h

Lines 1173 to 1240 in a1d0ea7

// custom operators
typedef void (*ggml_unary_op_f32_t) (const int, float *, const float *);
typedef void (*ggml_binary_op_f32_t)(const int, float *, const float *, const float *);
typedef void (*ggml_custom1_op_f32_t)(struct ggml_tensor *, const struct ggml_tensor *);
typedef void (*ggml_custom2_op_f32_t)(struct ggml_tensor *, const struct ggml_tensor *, const struct ggml_tensor *);
typedef void (*ggml_custom3_op_f32_t)(struct ggml_tensor *, const struct ggml_tensor *, const struct ggml_tensor *, const struct ggml_tensor *);
GGML_API struct ggml_tensor * ggml_map_unary_f32(
struct ggml_context * ctx,
struct ggml_tensor * a,
ggml_unary_op_f32_t fun);
GGML_API struct ggml_tensor * ggml_map_unary_inplace_f32(
struct ggml_context * ctx,
struct ggml_tensor * a,
ggml_unary_op_f32_t fun);
GGML_API struct ggml_tensor * ggml_map_binary_f32(
struct ggml_context * ctx,
struct ggml_tensor * a,
struct ggml_tensor * b,
ggml_binary_op_f32_t fun);
GGML_API struct ggml_tensor * ggml_map_binary_inplace_f32(
struct ggml_context * ctx,
struct ggml_tensor * a,
struct ggml_tensor * b,
ggml_binary_op_f32_t fun);
GGML_API struct ggml_tensor * ggml_map_custom1_f32(
struct ggml_context * ctx,
struct ggml_tensor * a,
ggml_custom1_op_f32_t fun);
GGML_API struct ggml_tensor * ggml_map_custom1_inplace_f32(
struct ggml_context * ctx,
struct ggml_tensor * a,
ggml_custom1_op_f32_t fun);
GGML_API struct ggml_tensor * ggml_map_custom2_f32(
struct ggml_context * ctx,
struct ggml_tensor * a,
struct ggml_tensor * b,
ggml_custom2_op_f32_t fun);
GGML_API struct ggml_tensor * ggml_map_custom2_inplace_f32(
struct ggml_context * ctx,
struct ggml_tensor * a,
struct ggml_tensor * b,
ggml_custom2_op_f32_t fun);
GGML_API struct ggml_tensor * ggml_map_custom3_f32(
struct ggml_context * ctx,
struct ggml_tensor * a,
struct ggml_tensor * b,
struct ggml_tensor * c,
ggml_custom3_op_f32_t fun);
GGML_API struct ggml_tensor * ggml_map_custom3_inplace_f32(
struct ggml_context * ctx,
struct ggml_tensor * a,
struct ggml_tensor * b,
struct ggml_tensor * c,
ggml_custom3_op_f32_t fun);

The API has to be improved to support:

  • multi-threading
  • non-F32 types

We also need to provide a few examples of how to define custom operators.

This functionality will significantly improve adoption of ggml in third-party projects since currently, they need to rely that certain operators first get implemented into ggml before being able to use them

@ggerganov ggerganov added enhancement New feature or request good first issue Good for newcomers refactoring Refactoring labels Jun 25, 2023
@slaren
Copy link
Collaborator

slaren commented Jul 27, 2023

Looks like the custom1-3 versions already support any data type, as they pass the tensors to the function, and the types are never checked. So we could simply remove the f32 suffix from the names.

To support multi-threading, we could add a n_tasks parameter to the op, and ith, nth parameters to the callback, such as:

typedef void (*ggml_custom1_op_t)(struct ggml_tensor * dst , const struct ggml_tensor * a, int ith, int nth); 

GGML_API struct ggml_tensor * ggml_map_custom1( 
        struct ggml_context          * ctx, 
        struct ggml_tensor           * a, 
               int                     n_tasks, // clamped to the number of threads
               ggml_custom1_op_t       fun); 

Would a test work as an example?

@ggerganov
Copy link
Owner Author

and ith, nth parameters to the callback

Maybe we can pass the entire struct ggml_compute_params to the callback?

@slaren
Copy link
Collaborator

slaren commented Jul 28, 2023

My view on that is that we are already exposing way too many details of the internals of ggml in the public API (ie. ggml.h), and that's already slowing us down because it makes changing certain aspects of ggml really hard because it is too easy to break user code. Some of that user code is in our own examples. At some point we are going to have to define a public API and commit to supporting it in the long term (or at least until the next major release), but until that happens we shouldn't make the problem harder for ourselves.

In this case, ggml_compute_params has the ggml_task_type and the work buffer. I could see the task type being useful to allow users to have their own INIT and FINALIZE passes, but I am not sure that it is worth exposing that complexity to the user. The work buffer shouldn't be exposed to the user, it would require adding another interface to allow the users to specify how much work buffer they need, and they can simply allocate their own buffers if they need to anyway.

We should probably add a void * parameter that is passed to the callback, so that the users can use it to pass any custom data that they may need in the callback:

typedef void (*ggml_custom1_op_t)(struct ggml_tensor * dst , const struct ggml_tensor * a, int ith, int nth, void * userdata); 

GGML_API struct ggml_tensor * ggml_map_custom1( 
        struct ggml_context          * ctx, 
        struct ggml_tensor           * a, 
               int                     n_tasks, // clamped to the number of threads
               ggml_custom1_op_t       fun,
               void                  * userdata); 

@ggerganov
Copy link
Owner Author

My view on that is that we are already exposing way too many details of the internals of ggml in the public API

Good point - agree

We should probably add a void * parameter that is passed to the callback,

Yes, let's add it.

With the proposed changes, we only need a test or example to complete this.

CCLDArjun pushed a commit to CCLDArjun/ggml that referenced this issue Dec 18, 2023
…gerganov#154) (ggerganov#294)

* Use F16 for memory_k and memory_v

* add command line switch to use f16 instead of f32 for memory k+v

---------

Co-authored-by: Ty Everett <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers refactoring Refactoring
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants