StableLM example #96
I do not see an issue in that specific path, but this looks wrong: i11 is incremented twice (lines 5774 to 5775 in 2b07ae8).
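To illustrate the kind of bug described above, here is a minimal sketch of a strided copy that tracks the destination indices by hand. The ggml code at the referenced lines is not reproduced here; `copy_strided_2d` and its parameters are hypothetical, and only the index names `i10`/`i11` echo the comment. The point is that the row index must advance exactly once each time the column index wraps; a second increment in the same branch would skip every other destination row.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not ggml code: copy a rows x cols source whose rows
 * are src_row_stride elements apart into a contiguous destination.
 * i10 is the destination column index, i11 the destination row index. */
static void copy_strided_2d(const float *src, size_t src_row_stride,
                            float *dst, size_t rows, size_t cols) {
    assert(cols > 0 && src_row_stride >= cols);

    size_t i10 = 0;
    size_t i11 = 0;

    for (size_t i01 = 0; i01 < rows; i01++) {
        for (size_t i00 = 0; i00 < cols; i00++) {
            dst[i11 * cols + i10] = src[i01 * src_row_stride + i00];

            /* Advance the destination position with a carry. */
            if (++i10 == cols) {
                i10 = 0;
                i11++; /* exactly once per wrap; a second i11++ here is the bug */
            }
        }
    }

    /* After a full copy the destination position sits at the start of row `rows`. */
    assert(i10 == 0 && i11 == rows);
}
```

Calling it on a 2 x 3 block stored with a row stride of 4 should produce the six payload values back to back, with the padding element of each source row dropped.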
This looks exciting and I can't wait. By the way, I don't know if you've already seen this, but there was a previous attempt at a NeoX implementation in ggml. I thought I'd link it for reference just in case: https://github.com/NolanoOrg/cformers/blob/master/cformers/cpp/ggml.c#L7292
Yes, I know. I took the QKV unpacking from their repo, but I think there has to be a better way to do it.
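For readers unfamiliar with what "QKV unpacking" involves, the following is a rough sketch, not the code from either repository. It assumes the fused projection output for one token is laid out head by head as `[q | k | v]`, each part `head_dim` floats long, which is how GPT-NeoX style checkpoints are commonly described; `unpack_qkv` and its parameter names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: split one token's fused QKV vector into separate
 * q, k and v vectors of n_head * head_dim floats each.
 * Assumed input layout: [head0: q|k|v][head1: q|k|v]... */
static void unpack_qkv(const float *qkv, size_t n_head, size_t head_dim,
                       float *q, float *k, float *v) {
    assert(qkv && q && k && v);

    for (size_t h = 0; h < n_head; h++) {
        /* Start of this head's fused q|k|v slice. */
        const float *head = qkv + h * 3 * head_dim;

        for (size_t d = 0; d < head_dim; d++) {
            q[h * head_dim + d] = head[d];
            k[h * head_dim + d] = head[head_dim + d];
            v[h * head_dim + d] = head[2 * head_dim + d];
        }
    }
}
```

This version copies element by element for clarity. A "better way" in a tensor library would more likely express the same split as strided views over the fused tensor, avoiding the copies altogether.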
@slaren Yup, that was the issue.
I cannot see why, but multi-threading does not work.
…d updated quantizers and quantization handling for gpt neox gpt 2 and gptj
For what it's worth, StableLM 3B works for me, but StableLM 7B doesn't.
Seems to be working on my side, with both the F16 and the quantized model. Can you double-check using latest master?
Still not loading on latest master for me.
Taken from https://huggingface.co/cakewalk/ggml-q4_0-stablelm-tuned-alpha-7b/blob/main/ggml-model-stablelm-tuned-alpha-7b-q4_0.bin. (I can't verify which conversion/quantization this used; I don't have enough RAM to convert 7B myself.)
Edit: I just realized there were changes to the quantize code here two days ago, so I'm guessing older quantizations won't work.
How can I use StableLM to get embeddings back?
Usage: https://github.com/ggerganov/ggml/tree/stablelm/examples/stablelm
TODO:
ggml_forward_dup_xxx() bug