-
Notifications
You must be signed in to change notification settings - Fork 928
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GGML takes down the whole process even for recoverable errors #123
Comments
The current "error handling" logic in
The logic is that when you build in I don't think it is a good idea to change this behaviour now. The computation of the required memory size for a context has to be implemented in another way, without relying on error codes. For the moment, I think the |
This is the kind of thing I'm talking about then, not the normal I'm not talking about something like "What if we're in So a condition that indicates any sort of bug inside GGML it makes perfect sense to fail in the current way. However: struct ggml_tensor * ggml_add(ctx, a, b) {
GGML_ASSERT(ggml_are_same_shape(a, b));
/* make the tensor */
return new_tensor;
} If the application passes in What I'm proposing is changing user-facing functions like that to just return I think this could be done with almost no actual code changes, it would mostly just be a change to the Is that still something you're opposed to? |
I guess the specific example can become: struct ggml_tensor * ggml_add(ctx, a, b) {
if (!ggml_are_same_shape(a, b)) {
GGML_ASSERT(false, "ggml_add: input tensors are not the same shape");
return nullptr;
}
/* make the tensor */
return new_tensor;
} And then But by default, i.e. without explicitly disabling it, |
Yes, that's pretty similar to what I was thinking of but I think it can be even less invasive of a change. This is pseudocode and the name of the defines/etc can be whatever you prefer: #ifndef GGML_RECOVERABLE
// Exactly the same as now
#define GGML_ASSERT(x) \
do { \
if (!(x)) { \
fprintf(stderr, "GGML_ASSERT: %s:%d: %s\n", __FILE__, __LINE__, #x); \
abort(); \
} \
} while (0)
#else
#define GGML_ASSERT(x) \
if (!(x)) { \
snprintf(ctx->last_error, \
"GGML_ASSERT: %s:%d: %s\n", __FILE__, __LINE__, #x); \
return NULL; \
}
#endif In real life you'd probably want to have a slightly different name for the fallible assert, like That's just pseudocode of the basic idea I'm talking about: a real implementation would take into account stuff like the buffer not having enough space, etc. (I'm pretty sure that's not a good use of Those new fields in the context could also be gated by that
Right, and you also don't want to have to have to change stuff like whisper, Does this sound like something you'd be okay with in principle? I can spend some time making a proof of concept if so. |
Yes, let's give this idea a try and see how it goes |
Great! Do you want the pull here (as opposed to the llama.cpp repo)? Also, do you have any preference for the names of the defines? Specifically, the one that is used to guard user-facing operations like tensor creation and the one that's used to turn this feature on. |
In the |
From the perspective of application code consuming the GGML library, it's really not ideal to have the whole process just die even if it's something as simple as not enough memory to create a tensor.
It might be difficult to get rid of all
assert
s but I think it would be relatively simple to very significantly limit the surface area where that kind of thing can happen. i.e. tensor creation functions could simply returnNULL
for recoverable errors like wrong dimension, not enough memory, etc.Most of these operations require a context, so a fairly simple way to communicate more detail about the problem back to the user would be to have something like an
errno
field in the context which could hold a code for that error. You could also have a buffer there if you wanted to hold a more user-friendly description of what happened (just whatever theassert
prints out now). This would require allocating... I don't know, like 128-256 bytes of extra data in the context but that doesn't seem likely to be a big issue.Through use of a
#define
toggle it should be possible to preserve the current behavior (if you want the ability to not have to check for NULL from stuff like tensor operations) but allow something like a language binding to handle failure in a more graceful way.I'm not just creating this issue to complain about the problem. If a change like that would be accepted, I can implement it myself and submit a pull.
(Also, is it preferable to submit GGML changes here or in the
llama.cpp
repo?)The text was updated successfully, but these errors were encountered: