GPTNeoX model with 16k context results in context-size related issues: `ggml_new_tensor_impl: not enough space in the context's memory pool`, and instant core dump with fp16
#225
Comments
It seems to be a calculation error with signed and unsigned integers. Change
to
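The exact lines being changed are elided above, but the failure mode the comment describes is a classic one: a buffer-size product evaluated in 32-bit arithmetic that wraps once the context length grows large enough. Below is a minimal, hypothetical sketch of how this can happen, not the actual ggml code; the dimensions (32 layers, 4096 embedding, 16384 context) are illustrative values chosen so the product lands exactly on 2^32.

```c
#include <stdint.h>

/* Illustrative dimensions for a hypothetical 16k-context GPTNeoX model
   (not taken from the actual model file). */

/* Buggy version: the KV-cache element count is accumulated in 32-bit
   arithmetic. 2 * 32 * 16384 * 4096 == 2^32, which wraps to 0, so the
   context's memory pool is sized far too small and later tensor
   allocations fail with "not enough space in the context's memory pool". */
static uint32_t kv_elems_32(int n_layer, int n_ctx, int n_embd) {
    return 2u * (uint32_t)n_layer * (uint32_t)n_ctx * (uint32_t)n_embd;
}

/* Fixed version: widen to 64 bits before multiplying. */
static uint64_t kv_elems_64(int n_layer, int n_ctx, int n_embd) {
    return 2ull * (uint64_t)n_layer * (uint64_t)n_ctx * (uint64_t)n_embd;
}
```

This also explains why 4k and 8k context models worked: at 8k the same product is 2^31, which still fits in 32 bits (though it would already overflow a *signed* 32-bit int, and a negative size compared against an unsigned `mem_size` is exactly the kind of signed/unsigned mix-up the comment points at).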
Working on q8_0 quantization:
Thanks so much! I will test and close this shortly.
Hey guys
Today I was doing quants of a new GPTNeoX model called Literature-7B-16384
I tried making GGMLs through the usual process:
Both steps completed fine. But the models can't be used.
Trying to use the fp32:
Trying an fp16 conversion instead is even more spectacular:
And then trying a quantised version made from either fp32 or fp16 gives the same errors as with the fp32:
I tried various `-n` values with both files, but that made no difference. I assume it's because some support needs to be added for the unusually large context size? I have previously tested GPTNeoX models with 4k and 8k context and they seemed to work.
I don't know if this is a bug or a feature request, but I thought I'd let you guys know. Let me know if you'd like me to upload the fp16, fp32 or q4_0 GGMLs anywhere for inspection.
Thanks in advance!