
ggml : add simple example #713

Merged: 11 commits into ggerganov:master on Feb 28, 2024
Conversation

@FSSRepo (Collaborator) commented on Jan 29, 2024

This PR only aims to add a new example that simply performs a matrix multiplication, solely for the purpose of demonstrating a basic usage of ggml and backend handling. The code is commented to help understand what each part does.

$$ \begin{bmatrix} 2 & 8 \\ 5 & 1 \\ 4 & 2 \\ 8 & 6 \\ \end{bmatrix} \times \begin{bmatrix} 10 & 9 & 5 \\ 5 & 9 & 4 \\ \end{bmatrix} = \begin{bmatrix} 60 & 90 & 42 \\ 55 & 54 & 29 \\ 50 & 54 & 28 \\ 110 & 126 & 64 \\ \end{bmatrix} $$

@Green-Sky (Contributor) commented:

will this add to/replace the example in ggml.h itself?

ggml/include/ggml/ggml.h (lines 32 to 174 at b2a5c34):

// For example, here we define the function: f(x) = a*x^2 + b
//
// {
// struct ggml_init_params params = {
// .mem_size = 16*1024*1024,
// .mem_buffer = NULL,
// };
//
// // memory allocation happens here
// struct ggml_context * ctx = ggml_init(params);
//
// struct ggml_tensor * x = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1);
//
// ggml_set_param(ctx, x); // x is an input variable
//
// struct ggml_tensor * a = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1);
// struct ggml_tensor * b = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1);
// struct ggml_tensor * x2 = ggml_mul(ctx, x, x);
// struct ggml_tensor * f = ggml_add(ctx, ggml_mul(ctx, a, x2), b);
//
// ...
// }
//
// Notice that the function definition above does not involve any actual computation. The computation is performed only
// when the user explicitly requests it. For example, to compute the function's value at x = 2.0:
//
// {
// ...
//
// struct ggml_cgraph * gf = ggml_new_graph(ctx);
// ggml_build_forward_expand(gf, f);
//
// // set the input variable and parameter values
// ggml_set_f32(x, 2.0f);
// ggml_set_f32(a, 3.0f);
// ggml_set_f32(b, 4.0f);
//
// ggml_graph_compute_with_ctx(ctx, gf, n_threads);
//
// printf("f = %f\n", ggml_get_f32_1d(f, 0));
//
// ...
// }
//
// The actual computation is performed in the ggml_graph_compute() function.
//
// The ggml_new_tensor_...() functions create new tensors. They are allocated in the memory buffer provided to the
// ggml_init() function. You have to be careful not to exceed the memory buffer size. Therefore, you have to know
// in advance how much memory you need for your computation. Alternatively, you can allocate a large enough memory
// and after defining the computation graph, call the ggml_used_mem() function to find out how much memory was
// actually needed.
//
// The ggml_set_param() function marks a tensor as an input variable. This is used by the automatic
// differentiation and optimization algorithms.
//
// The described approach allows the user to define the function graph once and then compute its forward or backward graphs
// multiple times. All computations will use the same memory buffer allocated in the ggml_init() function. This way
// the user can avoid the memory allocation overhead at runtime.
//
// The library supports multi-dimensional tensors - up to 4 dimensions. The FP16 and FP32 data types are first class
// citizens, but in theory the library can be extended to support FP8 and integer data types.
//
// Each tensor operation produces a new tensor. Initially the library was envisioned to support only the use of unary
// and binary operations. Most of the available operations fall into one of these two categories. With time, it became
// clear that the library needs to support more complex operations. The way to support these operations is not clear
// yet, but a few examples are demonstrated in the following operations:
//
// - ggml_permute()
// - ggml_conv_1d_1s()
// - ggml_conv_1d_2s()
//
// For each tensor operator, the library implements a forward and backward computation function. The forward function
// computes the output tensor value given the input tensor values. The backward function computes the adjoint of the
// input tensors given the adjoint of the output tensor. For a detailed explanation of what this means, take a
// calculus class, or watch the following video:
//
// What is Automatic Differentiation?
// https://www.youtube.com/watch?v=wG_nF1awSSY
//
//
// ## Tensor data (struct ggml_tensor)
//
// The tensors are stored in memory via the ggml_tensor struct. The structure provides information about the size of
// the tensor, the data type, and the memory buffer where the tensor data is stored. Additionally, it contains
// pointers to the "source" tensors - i.e. the tensors that were used to compute the current tensor. For example:
//
// {
// struct ggml_tensor * c = ggml_add(ctx, a, b);
//
// assert(c->src[0] == a);
// assert(c->src[1] == b);
// }
//
// The multi-dimensional tensors are stored in row-major order. The ggml_tensor struct contains fields for the
// number of elements in each dimension ("ne") as well as the number of bytes ("nb", a.k.a. stride). This allows
// to store tensors that are not contiguous in memory, which is useful for operations such as transposition and
// permutation. All tensor operations have to take the stride into account and not assume that the tensor is
// contiguous in memory.
//
// The data of the tensor is accessed via the "data" pointer. For example:
//
// {
// const int nx = 2;
// const int ny = 3;
//
// struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, nx, ny);
//
// for (int y = 0; y < ny; y++) {
// for (int x = 0; x < nx; x++) {
// *(float *) ((char *) a->data + y*a->nb[1] + x*a->nb[0]) = x + y;
// }
// }
//
// ...
// }
//
// Alternatively, helper functions such as ggml_get_f32_1d() and ggml_set_f32_1d() can be used.
//
// ## The matrix multiplication operator (ggml_mul_mat)
//
// TODO
//
//
// ## Multi-threading
//
// TODO
//
//
// ## Overview of ggml.c
//
// TODO
//
//
// ## SIMD optimizations
//
// TODO
//
//
// ## Debugging ggml
//
// TODO
//
//

@slaren (Collaborator) commented on Jan 29, 2024

This is a very useful example to have, good job!

In the future I hope we can replace all the backend-specific initialization with the backend registry instead. It should already be possible, but I think it needs some generic function to initialize the "best" backend available.

Also, fair warning: the ggml-alloc API is going to change very soon. There is already a wrapper for backwards compatibility and I am not going to add another, which means all applications will need slight changes to adapt to the new ggml-alloc API. It will be a fairly minor change for most applications, though.

@slaren (Collaborator) commented on Jan 29, 2024

> will this add to/replace the example in ggml.h itself?

Not really; it is still possible to use ggml with that API, since ggml-backend and ggml-alloc are completely optional. But it may be better to rename this example to simple-backend or similar to make that clear.

@ggerganov (Owner) commented:

Thanks for this @FSSRepo !

It would be useful to change the dimensions of the matrices to highlight which dimension the reduction runs along. Most people are used to multiplying row by column and might be surprised by the numbers.

To disambiguate, you can use (2x4) * (2x3) = (4x3) for example

@FSSRepo (Collaborator, Author) commented on Feb 27, 2024

This is ready

@FSSRepo FSSRepo merged commit d9cd6b5 into ggerganov:master Feb 28, 2024
4 checks passed