Will the new K quantizations be added anytime soon for StarCoder? #278

Closed
richardr1126 opened this issue Jun 22, 2023 · 3 comments

@richardr1126

I just quantized WizardCoder-15b to q4_0, using the StarCoder example in the repo's examples folder.

I really want to be able to use a 3-bit k-quant (Q3_K_M) with my model. Will this ever be possible, or is it not possible with the StarCoder models?

@richardr1126 changed the title from "Will the new K quantizations from llama.cpp be added anytime soon?" to "Will the new K quantizations be added anytime soon for StarCoder?" on Jul 25, 2023
@saharNooby

+1 for the question! k_quants.h and k_quants.c are missing in the main ggml repo. Is there a reason why this repo does not support k-quants yet?

(BTW, I tried copy-pasting the k-quants files into rwkv.cpp, but inference failed with an assertion failure at ggml.c:6161: !ggml_is_transposed(a). I did not dig deeper.)

@Green-Sky
Contributor

FYI, StarCoder can now be used with llama.cpp: ggerganov/llama.cpp#3187
I did not check whether it supports k-quants.

@richardr1126
Author

> FYI, starcoder can now be used with llama.cpp ggerganov/llama.cpp#3187 i did not check if it supports k-quants

Confirmed: StarCoder k-quants do work with llama.cpp.
