MPT quantize does not include quantization version #168
Comments
Thanks for the heads-up. I'm happy to redo them with the fixed quantization code.
Going to bed now, but as soon as the fix is in I can redo them tomorrow.
Yes, this was merged while I was working on the MPT model integration - I'll contribute a fix.
Also worth noting: the code that loads the model needs to account for this (e.g. it needs to modulo the ftype by …)
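(For context, a minimal sketch of what that loader-side adjustment might look like, assuming the `GGML_QNT_VERSION_FACTOR` constant from `ggml.h`; the struct and function names are illustrative, not the actual MPT example code.)

```cpp
// Hedged sketch of loader-side handling: the stored ftype packs the
// quantization version and the tensor format together, so the loader
// has to split them apart before using either value.
#include "ggml.h"
#include <cstdint>

struct mpt_hparams_example {
    int32_t ftype = 0; // value read directly from the model file
};

// Returns the quantization version and strips it out of hparams.ftype.
static int32_t unpack_qnt_version(mpt_hparams_example & hparams) {
    const int32_t qntvr = hparams.ftype / GGML_QNT_VERSION_FACTOR;
    hparams.ftype %= GGML_QNT_VERSION_FACTOR;
    return qntvr;
}
```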
@lukasmoellerch I sent #165 to remove global variables. Can you also please include those changes in your fix? Also …
Added quantization version support to MPT and Replit models |
Thanks very much for the fix. I've updated my three repos: |
Hi there!

The concept of quantization versions was recently introduced by commit effcfa6, which encodes the version in the model's ftype. However, it looks like this change didn't make it over to the MPT quantizer:
ggml/examples/mpt/quantize.cpp, line 89 (at commit 2a75bd4)
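(For reference, a hedged sketch of the write the MPT quantizer appears to be missing, assuming the `GGML_QNT_VERSION` and `GGML_QNT_VERSION_FACTOR` constants from `ggml.h`; the helper name and stream handling are illustrative.)

```cpp
// Sketch of the missing step: fold the current quantization version into
// ftype before writing it to the output file, instead of writing ftype raw.
#include "ggml.h"
#include <cstdint>
#include <fstream>

static void write_ftype_with_version(std::ofstream & fout, int32_t ftype) {
    const int32_t ftype_dst = GGML_QNT_VERSION * GGML_QNT_VERSION_FACTOR + ftype;
    fout.write(reinterpret_cast<const char *>(&ftype_dst), sizeof(ftype_dst));
}
```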
This means that the MPT models recently uploaded by @TheBloke at https://huggingface.co/TheBloke/MPT-7B-GGML unfortunately do not include the quantization version. The effect of this is mitigated by the MPT example not currently checking the version.
We're checking the version in llm, which is why this has come up. We may add an option to disable the check or override the perceived version, but I figured that it was worth reporting upstream before it proliferates.
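(A rough sketch of the kind of consumer-side check being described, assuming `GGML_QNT_VERSION` from `ggml.h`; this is illustrative only and not the actual llm code, which is written in Rust.)

```cpp
// Illustrative check: reject a model whose quantization version does not
// match the one this build understands, after unpacking it from ftype.
#include "ggml.h"
#include <cstdint>
#include <cstdio>

static bool check_qnt_version(int32_t qntvr) {
    if (qntvr != GGML_QNT_VERSION) {
        fprintf(stderr, "unsupported quantization version %d (expected %d)\n",
                qntvr, GGML_QNT_VERSION);
        return false;
    }
    return true;
}
```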