
MPT quantize does not include quantization version #168

Closed
philpax opened this issue May 19, 2023 · 7 comments
@philpax
Contributor

philpax commented May 19, 2023

Hi there!

The concept of quantization versions was recently introduced in commit effcfa6, which encodes the version in the model's ftype.

However, it looks like this change didn't make it over to the MPT quantizer:

fout.write((char *)&ftype, sizeof(hparams.ftype));
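For context, a minimal sketch of what the versioned write presumably looks like, following the convention from effcfa6 (GGML_QNT_VERSION and GGML_QNT_VERSION_FACTOR are the constants defined in ggml.h; this is an illustration, not the exact patch):

// Sketch: fold the quantization version into the ftype before writing,
// as effcfa6 does for the other quantizers.
const int32_t ftype_dst = GGML_QNT_VERSION * GGML_QNT_VERSION_FACTOR + ftype;
fout.write((const char *) &ftype_dst, sizeof(ftype_dst));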

This means that the MPT models recently uploaded by @TheBloke at https://huggingface.co/TheBloke/MPT-7B-GGML unfortunately do not include the quantization version. The impact is mitigated by the fact that the MPT example does not currently check the version.

We're checking the version in llm, which is why this has come up. We may add an option to disable the check or override the perceived version, but I figured that it was worth reporting upstream before it proliferates.

@TheBloke
Contributor

Thanks for the heads up. I'm happy to redo them with the fixed quantization code.

@TheBloke
Contributor

TheBloke commented May 19, 2023

Going to bed now, but as soon as the fix is in I can redo them tomorrow.

@lukasmoellerch
Contributor

Yes, this was merged while I was working on the MPT model integration. I'll contribute a fix.

@philpax
Contributor Author

philpax commented May 19, 2023

Also worth noting: the code that loads the model needs to account for this (e.g. it needs to take the ftype modulo GGML_QNT_VERSION_FACTOR).
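A minimal sketch of what that loader-side handling might look like, assuming the GGML_QNT_VERSION / GGML_QNT_VERSION_FACTOR constants from ggml.h (the error handling here is illustrative, not the project's actual code):

// Sketch: split the stored ftype into the quantization version
// and the actual ftype, inside the model loader.
const int32_t qntvr = hparams.ftype / GGML_QNT_VERSION_FACTOR;
hparams.ftype %= GGML_QNT_VERSION_FACTOR;

if (qntvr > GGML_QNT_VERSION) {
    fprintf(stderr, "%s: unsupported quantization version %d\n", __func__, qntvr);
    return false;
}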

@marella
Contributor

marella commented May 20, 2023

@lukasmoellerch I sent #165 to remove global variables. Could you please also include those changes in your fix?

Also, max_seq_len is read from the model file but doesn't seem to be used anywhere. Is that expected, or is it supposed to be used as n_ctx?

@ggerganov
Owner

Added quantization version support to MPT and Replit models

philpax closed this as completed May 20, 2023
@TheBloke
Contributor

Thanks very much for the fix. I've updated my three repos:
https://huggingface.co/TheBloke/MPT-7B-GGML
https://huggingface.co/TheBloke/MPT-7B-Instruct-GGML
https://huggingface.co/TheBloke/MPT-7B-Storywriter-GGML
