
ggml : add simple example #713

Merged: 11 commits into ggerganov:master on Feb 28, 2024
Conversation

@FSSRepo (Collaborator) commented on Jan 29, 2024

This PR only aims to add a new example that simply performs a matrix multiplication, solely for the purpose of demonstrating a basic usage of ggml and backend handling. The code is commented to help understand what each part does.

$$ \begin{bmatrix} 2 & 8 \\ 5 & 1 \\ 4 & 2 \\ 8 & 6 \\ \end{bmatrix} \times \begin{bmatrix} 10 & 9 & 5 \\ 5 & 9 & 4 \\ \end{bmatrix} = \begin{bmatrix} 60 & 90 & 42 \\ 55 & 54 & 29 \\ 50 & 54 & 28 \\ 110 & 126 & 64 \\ \end{bmatrix} $$

@Green-Sky (Contributor) commented:

will this add to/replace the example in ggml.h itself?

ggml/include/ggml/ggml.h (lines 32 to 174 at b2a5c34):

// For example, here we define the function: f(x) = a*x^2 + b
//
// {
// struct ggml_init_params params = {
// .mem_size = 16*1024*1024,
// .mem_buffer = NULL,
// };
//
// // memory allocation happens here
// struct ggml_context * ctx = ggml_init(params);
//
// struct ggml_tensor * x = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1);
//
// ggml_set_param(ctx, x); // x is an input variable
//
// struct ggml_tensor * a = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1);
// struct ggml_tensor * b = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1);
// struct ggml_tensor * x2 = ggml_mul(ctx, x, x);
// struct ggml_tensor * f = ggml_add(ctx, ggml_mul(ctx, a, x2), b);
//
// ...
// }
//
// Notice that the function definition above does not involve any actual computation. The computation is performed only
// when the user explicitly requests it. For example, to compute the function's value at x = 2.0:
//
// {
// ...
//
// struct ggml_cgraph * gf = ggml_new_graph(ctx);
// ggml_build_forward_expand(gf, f);
//
// // set the input variable and parameter values
// ggml_set_f32(x, 2.0f);
// ggml_set_f32(a, 3.0f);
// ggml_set_f32(b, 4.0f);
//
// ggml_graph_compute_with_ctx(ctx, gf, n_threads);
//
// printf("f = %f\n", ggml_get_f32_1d(f, 0));
//
// ...
// }
//
// The actual computation is performed in the ggml_graph_compute() function.
//
// The ggml_new_tensor_...() functions create new tensors. They are allocated in the memory buffer provided to the
// ggml_init() function. You have to be careful not to exceed the memory buffer size. Therefore, you have to know
// in advance how much memory you need for your computation. Alternatively, you can allocate a large enough memory
// and after defining the computation graph, call the ggml_used_mem() function to find out how much memory was
// actually needed.
//
// The ggml_set_param() function marks a tensor as an input variable. This is used by the automatic
// differentiation and optimization algorithms.
//
// The described approach allows the user to define the function graph once and then compute its forward or backward graphs
// multiple times. All computations will use the same memory buffer allocated in the ggml_init() function. This way
// the user can avoid the memory allocation overhead at runtime.
//
// The library supports multi-dimensional tensors - up to 4 dimensions. The FP16 and FP32 data types are first class
// citizens, but in theory the library can be extended to support FP8 and integer data types.
//
// Each tensor operation produces a new tensor. Initially the library was envisioned to support only the use of unary
// and binary operations. Most of the available operations fall into one of these two categories. With time, it became
// clear that the library needs to support more complex operations. The way to support these operations is not clear
// yet, but a few examples are demonstrated in the following operations:
//
// - ggml_permute()
// - ggml_conv_1d_1s()
// - ggml_conv_1d_2s()
//
// For each tensor operator, the library implements a forward and backward computation function. The forward function
// computes the output tensor value given the input tensor values. The backward function computes the adjoint of the
// input tensors given the adjoint of the output tensor. For a detailed explanation of what this means, take a
// calculus class, or watch the following video:
//
// What is Automatic Differentiation?
// https://www.youtube.com/watch?v=wG_nF1awSSY
//
//
// ## Tensor data (struct ggml_tensor)
//
// The tensors are stored in memory via the ggml_tensor struct. The structure provides information about the size of
// the tensor, the data type, and the memory buffer where the tensor data is stored. Additionally, it contains
// pointers to the "source" tensors - i.e. the tensors that were used to compute the current tensor. For example:
//
// {
// struct ggml_tensor * c = ggml_add(ctx, a, b);
//
// assert(c->src[0] == a);
// assert(c->src[1] == b);
// }
//
// The multi-dimensional tensors are stored in row-major order. The ggml_tensor struct contains fields for the
// number of elements in each dimension ("ne") as well as the number of bytes ("nb", a.k.a. stride). This allows
// to store tensors that are not contiguous in memory, which is useful for operations such as transposition and
// permutation. All tensor operations have to take the stride into account and not assume that the tensor is
// contiguous in memory.
//
// The data of the tensor is accessed via the "data" pointer. For example:
//
// {
// const int nx = 2;
// const int ny = 3;
//
// struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, nx, ny);
//
// for (int y = 0; y < ny; y++) {
// for (int x = 0; x < nx; x++) {
// *(float *) ((char *) a->data + y*a->nb[1] + x*a->nb[0]) = x + y;
// }
// }
//
// ...
// }
//
// Alternatively, helper functions such as ggml_get_f32_1d() and ggml_set_f32_1d() can be used.
//
// ## The matrix multiplication operator (ggml_mul_mat)
//
// TODO
//
//
// ## Multi-threading
//
// TODO
//
//
// ## Overview of ggml.c
//
// TODO
//
//
// ## SIMD optimizations
//
// TODO
//
//
// ## Debugging ggml
//
// TODO
//
//

@slaren (Collaborator) commented on Jan 29, 2024

This is a very useful example to have, good job!

In the future I hope we can replace all the backend-specific initialization with the backend registry instead. It should already be possible, but I think it needs some generic function to initialize the "best" backend available.

Also, fair warning: the ggml-alloc API is going to change very soon. There is already a wrapper for backwards compatibility and I am not going to add another, which means all applications will need slight changes to adapt to the new ggml-alloc API. It will be a fairly minor change for most applications, though.

@slaren (Collaborator) commented on Jan 29, 2024

> will this add to/replace the example in ggml.h itself?

Not really; it is still possible to use ggml with that API, since ggml-backend and ggml-alloc are completely optional. But it may be better to rename this example to simple-backend or similar to make that clear.

@ggerganov (Owner) commented:

Thanks for this @FSSRepo !

It would be useful to change the dimensions of the matrices to highlight which dimension the reduction runs along. Most people are used to multiplying row by column and might be surprised by the numbers.

To disambiguate, you can use (2x4) * (2x3) = (4x3) for example

@FSSRepo (Collaborator, Author) commented on Feb 27, 2024

This is ready

@FSSRepo FSSRepo merged commit d9cd6b5 into ggerganov:master Feb 28, 2024
4 checks passed