ggml : improve custom operator support #294

ggerganov · 2023-06-25T09:25:30Z

We have rudimentary support for adding F32 custom ggml operators in user space via:

Lines 1173 to 1240 in a1d0ea7

 
 // custom operators 

 typedef void (*ggml_unary_op_f32_t) (const int, float *, const float *); 

 typedef void (*ggml_binary_op_f32_t)(const int, float *, const float *, const float *); 

 typedef void (*ggml_custom1_op_f32_t)(struct ggml_tensor *, const struct ggml_tensor *); 

 typedef void (*ggml_custom2_op_f32_t)(struct ggml_tensor *, const struct ggml_tensor *, const struct ggml_tensor *); 

 typedef void (*ggml_custom3_op_f32_t)(struct ggml_tensor *, const struct ggml_tensor *, const struct ggml_tensor *, const struct ggml_tensor *); 

 GGML_API struct ggml_tensor * ggml_map_unary_f32( 

 struct ggml_context * ctx, 

 struct ggml_tensor * a, 

 ggml_unary_op_f32_t fun); 

 GGML_API struct ggml_tensor * ggml_map_unary_inplace_f32( 

 struct ggml_context * ctx, 

 struct ggml_tensor * a, 

 ggml_unary_op_f32_t fun); 

 GGML_API struct ggml_tensor * ggml_map_binary_f32( 

 struct ggml_context * ctx, 

 struct ggml_tensor * a, 

 struct ggml_tensor * b, 

 ggml_binary_op_f32_t fun); 

 GGML_API struct ggml_tensor * ggml_map_binary_inplace_f32( 

 struct ggml_context * ctx, 

 struct ggml_tensor * a, 

 struct ggml_tensor * b, 

 ggml_binary_op_f32_t fun); 

 GGML_API struct ggml_tensor * ggml_map_custom1_f32( 

 struct ggml_context * ctx, 

 struct ggml_tensor * a, 

 ggml_custom1_op_f32_t fun); 

 GGML_API struct ggml_tensor * ggml_map_custom1_inplace_f32( 

 struct ggml_context * ctx, 

 struct ggml_tensor * a, 

 ggml_custom1_op_f32_t fun); 

 GGML_API struct ggml_tensor * ggml_map_custom2_f32( 

 struct ggml_context * ctx, 

 struct ggml_tensor * a, 

 struct ggml_tensor * b, 

 ggml_custom2_op_f32_t fun); 

 GGML_API struct ggml_tensor * ggml_map_custom2_inplace_f32( 

 struct ggml_context * ctx, 

 struct ggml_tensor * a, 

 struct ggml_tensor * b, 

 ggml_custom2_op_f32_t fun); 

 GGML_API struct ggml_tensor * ggml_map_custom3_f32( 

 struct ggml_context * ctx, 

 struct ggml_tensor * a, 

 struct ggml_tensor * b, 

 struct ggml_tensor * c, 

 ggml_custom3_op_f32_t fun); 

 GGML_API struct ggml_tensor * ggml_map_custom3_inplace_f32( 

 struct ggml_context * ctx, 

 struct ggml_tensor * a, 

 struct ggml_tensor * b, 

 struct ggml_tensor * c, 

 ggml_custom3_op_f32_t fun);

The API has to be improved to support:

multi-threading
non-F32 types

We also need to provide a few examples of how to define custom operators.

This functionality will significantly improve adoption of ggml in third-party projects since currently, they need to rely that certain operators first get implemented into ggml before being able to use them

The text was updated successfully, but these errors were encountered:

slaren · 2023-07-27T17:23:15Z

Looks like the custom1-3 versions already support any data type, as they pass the tensors to the function, and the types are never checked. So we could simply remove the f32 suffix from the names.

To support multi-threading, we could add a n_tasks parameter to the op, and ith, nth parameters to the callback, such as:

typedef void (*ggml_custom1_op_t)(struct ggml_tensor * dst , const struct ggml_tensor * a, int ith, int nth); 

GGML_API struct ggml_tensor * ggml_map_custom1( 
        struct ggml_context          * ctx, 
        struct ggml_tensor           * a, 
               int                     n_tasks, // clamped to the number of threads
               ggml_custom1_op_t       fun);

Would a test work as an example?

ggerganov · 2023-07-28T17:24:45Z

and ith, nth parameters to the callback

Maybe we can pass the entire struct ggml_compute_params to the callback?

slaren · 2023-07-28T18:18:51Z

My view on that is that we are already exposing way too many details of the internals of ggml in the public API (ie. ggml.h), and that's already slowing us down because it makes changing certain aspects of ggml really hard because it is too easy to break user code. Some of that user code is in our own examples. At some point we are going to have to define a public API and commit to supporting it in the long term (or at least until the next major release), but until that happens we shouldn't make the problem harder for ourselves.

In this case, ggml_compute_params has the ggml_task_type and the work buffer. I could see the task type being useful to allow users to have their own INIT and FINALIZE passes, but I am not sure that it is worth exposing that complexity to the user. The work buffer shouldn't be exposed to the user, it would require adding another interface to allow the users to specify how much work buffer they need, and they can simply allocate their own buffers if they need to anyway.

We should probably add a void * parameter that is passed to the callback, so that the users can use it to pass any custom data that they may need in the callback:

typedef void (*ggml_custom1_op_t)(struct ggml_tensor * dst , const struct ggml_tensor * a, int ith, int nth, void * userdata); 

GGML_API struct ggml_tensor * ggml_map_custom1( 
        struct ggml_context          * ctx, 
        struct ggml_tensor           * a, 
               int                     n_tasks, // clamped to the number of threads
               ggml_custom1_op_t       fun,
               void                  * userdata);

ggerganov · 2023-07-28T18:46:32Z

My view on that is that we are already exposing way too many details of the internals of ggml in the public API

Good point - agree

We should probably add a void * parameter that is passed to the callback,

Yes, let's add it.

With the proposed changes, we only need a test or example to complete this.

…gerganov#154) (ggerganov#294) * Use F16 for memory_k and memory_v * add command line switch to use f16 instead of f32 for memory k+v --------- Co-authored-by: Ty Everett <[email protected]>

ggerganov added enhancement New feature or request good first issue Good for newcomers refactoring Refactoring labels Jun 25, 2023

slaren mentioned this issue Jul 30, 2023

ggml : add ggml_map_custom1-3 #422

Merged

slaren closed this as completed in #422 Aug 1, 2023

ggerganov mentioned this issue Aug 7, 2023

ggml : sync (custom ops) ggerganov/llama.cpp#2537

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ggml : improve custom operator support #294

ggml : improve custom operator support #294

ggerganov commented Jun 25, 2023

slaren commented Jul 27, 2023 •

edited

Loading

ggerganov commented Jul 28, 2023

slaren commented Jul 28, 2023 •

edited

Loading

ggerganov commented Jul 28, 2023

ggml : improve custom operator support #294

ggml : improve custom operator support #294

Comments

ggerganov commented Jun 25, 2023

slaren commented Jul 27, 2023 • edited Loading

ggerganov commented Jul 28, 2023

slaren commented Jul 28, 2023 • edited Loading

ggerganov commented Jul 28, 2023

slaren commented Jul 27, 2023 •

edited

Loading

slaren commented Jul 28, 2023 •

edited

Loading