GGML GPT-2 consistently dies at around 825 tokens with: ggml_new_object: not enough space in the context's memory pool #480

Closed
rmc135 opened this issue Aug 26, 2023 · 1 comment · Fixed by #486

rmc135 commented Aug 26, 2023

Firstly, thanks to GG and contributors for a great library/utility.

When generating with gpt-2, ggml aborts at around 824 or 825 tokens, reporting an error and then dumping core.

I would expect a problem (hopefully not a fatal error and core dump) when the total token count reaches the context size, but failing at 824 or 825 tokens seems an odd threshold.

The same error is referenced in the llama.cpp repo, but possibly for a different reason: ggerganov/llama.cpp#2404

REPRODUCE:

Clean build, CPU only, Ubuntu 22: git pull && rm -Rf build && mkdir build && cd build && cmake .. && make

with ggml-model-f16.bin (gpt2-xl), e.g. bin/gpt-2 -m ~/gpt-2/models/1558M/ggml-model-f16.bin -n ...
-n 823: ok (run completes without error)
-n 824: ggml_new_object: not enough space in the context's memory pool (needed 268457104, available 268435456)
-n 825: ggml_new_object: not enough space in the context's memory pool (needed 268457104, available 268435456)

with ggml-model-f32.bin (gpt2-xl):
-n 823: ok
-n 824: ok
-n 825: ggml_new_object: not enough space in the context's memory pool (needed 268457104, available 268435456)

Note: I had to repeat some runs several times as ggml will stop prematurely if an <|endoftext|> token is generated. Getting to 823+ tokens can take a few tries.

slaren (Collaborator) commented Aug 26, 2023

You can try increasing the buffer size here:

static size_t buf_size = 256u*1024*1024;

If you want a more reliable solution for different context sizes, you can use the allocator in ggml-alloc.h instead. @ggerganov should we update the examples to use the allocator?
