./bin/stablelm -> invalid quantization type 0 (f32) #140

Closed
lefnire opened this issue May 10, 2023 · 1 comment

Comments


lefnire commented May 10, 2023

I followed examples/stablelm step by step. At the 4-bit quantization step, I get the following error:

stablelm_model_quantize: loading model from './stablelm-base-alpha-3b/ggml-model-f16.bin'
stablelm_model_quantize: n_vocab = 50688
stablelm_model_quantize: n_ctx   = 4096
stablelm_model_quantize: n_embd  = 4096
stablelm_model_quantize: n_head  = 32
stablelm_model_quantize: n_layer = 16
stablelm_model_quantize: ftype   = 1
ggml_common_quantize_0: invalid quantization type 0 (f32)
stablelm_model_quantize: failed to quantize model './stablelm-base-alpha-3b/ggml-model-f16.bin'
main: failed to quantize model from './stablelm-base-alpha-3b/ggml-model-f16.bin'

I'm a newbie; my guess was that the error means stablelm-quantize expected f32 instead of f16, so I tried the same steps but used python3 ../examples/stablelm/convert-h5-to-ggml.py ./stablelm-base-alpha-3b/ 0 (0 instead of 1), but no cigar.
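
For reference, the sequence in question looks roughly like this (the output filename and the final type argument to stablelm-quantize are illustrative assumptions; the exact invocation is whatever the examples/stablelm README documents):

# convert the Hugging Face checkpoint to a ggml f16 model (mode 1; mode 0 produces f32)
python3 ../examples/stablelm/convert-h5-to-ggml.py ./stablelm-base-alpha-3b/ 1

# quantize the f16 model to 4 bits; the last argument selects the target quantization
# type (q4_0 is shown here only as an assumed example value)
./bin/stablelm-quantize ./stablelm-base-alpha-3b/ggml-model-f16.bin ./stablelm-base-alpha-3b/ggml-model-q4_0.bin q4_0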

ggerganov (Owner) commented

The README has been updated to reflect the new quantization usage - it should work now
