Cerebras 2.7B yields garbage tokens after quantizing to 4 bits #54
Comments
It’s a known bug; ggerganov tweeted about it.
Does this happen only for GPT-2-based models?
I think it is just an issue with Cerebras, but I am not sure.
I am using Cerebras too. It would be great if this could be fixed. The Cerebras models are excellent.
I think I have figured out this issue: the f16-to-f32 tables were not properly initialized in the quantize examples. This can be fixed by adding code like the following to main() in quantize.cpp.
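A minimal sketch of the idea, not the exact patch (the authoritative fix is in the PR referenced below). It assumes the ggml API of that era, where `ggml_init_params` holds `mem_size`, `mem_buffer`, and `no_alloc`; the dummy init/free pair exists only to trigger the table initialization:

```cpp
#include "ggml.h"

int main(int argc, char ** argv) {
    // ggml builds its internal f16 <-> f32 conversion tables inside
    // ggml_init(); the q4_0/q4_1 quantization routines read those
    // tables, so quantizing before any ggml_init() call produces
    // garbage values.
    // Creating and immediately freeing a dummy context forces the
    // tables to be initialized.
    {
        struct ggml_init_params params = { 0, NULL, false };
        struct ggml_context * ctx = ggml_init(params);
        ggml_free(ctx);
    }

    // ... the rest of the existing quantize main() ...

    return 0;
}
```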
Please refer to my PR #77.
I'm getting garbage-looking tokens after quantizing an f16 Cerebras model (a typical invocation is sketched below), e.g.:

```
&>,32>G$F7"=%0.173)@++*$16*:=!32%;:2@$5")0!!DGDA(:F*G$!")=9&9D69C9H-4.>&<A+1>.;6D7^C
```
The f16 model loads and works fine.
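For reference, a typical quantize invocation for these GPT-2-style models, assuming the gpt-2 quantize example from this repo was used (the binary name and paths are illustrative, not taken from the original report; the final argument selects the type, 2 = q4_0, 3 = q4_1):

```sh
./bin/gpt-2-quantize models/cerebras-gpt-2.7B/ggml-model-f16.bin \
                     models/cerebras-gpt-2.7B/ggml-model-q4_0.bin 2
```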