gpt2 error #371
Hi, I have fine-tuned GPT-2 with Hugging Face and PyTorch. I ran gpt-2/convert-h5-to-ggml.py, but when I try to run the model I get this: |
In f32 I get this:
|
@stellanhaglund you need to modify the conversion script for it to work with huggingface pytorch models. For fp32 you need to transpose the relevant tensors. In the fp16 model, you get that error because you need to convert the tensor handled at ggml/examples/gpt-2/convert-h5-to-ggml.py, line 119 (commit f6365c0).
|
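For context, here is a minimal sketch of what such a transposition could look like in a conversion script; the tensor names and the exact set of weights to transpose are assumptions based on the Hugging Face GPT-2 Conv1D layout, not taken from this thread:
```python
import numpy as np

# Hugging Face GPT-2 stores these projections as Conv1D modules, whose
# weight matrices are transposed relative to the plain Linear layout the
# ggml example expects (assumption based on the HF GPT-2 implementation).
TRANSPOSED_SUFFIXES = (
    "attn.c_attn.weight",
    "attn.c_proj.weight",
    "mlp.c_fc.weight",
    "mlp.c_proj.weight",
)

def maybe_transpose(name: str, data: np.ndarray) -> np.ndarray:
    # Only the 2-D projection weights get transposed; biases and
    # embeddings keep their original layout.
    if name.endswith(TRANSPOSED_SUFFIXES):
        return np.ascontiguousarray(data.T)
    return data
```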
I'm using that script |
Ok, so I added this for the f32.
And it works, but the model doesn't seem very good anymore; it just outputs the same word over and over. |
And from what I can tell, in the conversion script wpe is converted to f16, but somehow it expects another size. |
You also need to add:
|
If you mean this, I didn't remove it, just added the two other ones. |
It should be "c_proj.weight" and not "mlp.c_proj.weight" for it to work with huggingface models |
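To see why the suffix matters, here is a tiny illustration (the layer names are hypothetical): matching against the longer "mlp.c_proj.weight" suffix silently skips the attention-side projection, while "c_proj.weight" matches both.
```python
names = ["h.0.attn.c_proj.weight", "h.0.mlp.c_proj.weight"]

# The longer suffix misses the attention projection entirely:
print([n.endswith("mlp.c_proj.weight") for n in names])  # [False, True]

# The shorter suffix matches both projections:
print([n.endswith("c_proj.weight") for n in names])      # [True, True]
```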
I just tried it and the model outputs only weird stuff, while with the same prompt the Python version works well, so I think something is off.
What could be causing the wrong size of the wpe tensor in the f16 version? I can see that it's converted to f16, but something about its size is not correct.
|
Retry generation with latest master - I just pushed a bug fix |
Thanks! 🙏
I will. Can that also solve the f16 issue I have, where the sizes don't match?
|
You'll need to convert the tensors as explained by @MrGrayCode. The convert script is very hacky atm and, as you can see, it requires many manual adjustments. We'll improve the support with time, but for now it's a bit tedious and most things won't work straight out of the box. |
From what I can tell, in the f16 version the wpe section is already being converted to f16, but there is still something about the size that's not correct.
I can rewrite it to match better, but the problem is that I don't really have a reference for how it should work yet.
|
I just did |
Just added some more prints and get this:
That should mean: |
Looks like the example is expecting wpe to be f32 (line 170 at commit 5bef75b).
|
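A minimal sketch of how the conversion script could keep wpe in f32 even when the rest of the model is converted to f16; the name check and the ftype convention here are assumptions for illustration, not the script's actual code:
```python
import numpy as np

def convert_tensor(name: str, data: np.ndarray, ftype: int) -> np.ndarray:
    # ftype == 1 means "write an f16 model"; even then, keep tensors the
    # C++ example loads as f32 (here, the position embeddings) in f32.
    if ftype == 1 and not name.endswith("wpe.weight"):
        return data.astype(np.float16)
    return data.astype(np.float32)
```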
You can try using this script: https://gist.github.com/MrGrayCode/f8c795f1e2744dc5d5c7d8917c7846e0 which works for the huggingface gpt2 model conversion. Make sure that you are using the correct files for |
Thanks a lot! 🙏
Could it be that the tokenizer is getting modified a bit? These are the interesting parts:
|
I had to do this:
instead of:
|
Same thing happens with a Hugging Face GPT-J 4-bit model:
main: seed = 1695659205
gptj_model_load: loading model from 'ggml-model-f16.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx = 2048
gptj_model_load: n_embd = 4096
gptj_model_load: n_head = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot = 64
gptj_model_load: ftype = 1
gptj_model_load: qntvr = 0
gptj_model_load: ggml ctx size = 12438.93 MB
gptj_model_load: memory_size = 896.00 MB, n_mem = 57344
gptj_model_load: tensor 'transformer.h.0.attn.k_proj.weight' has wrong size in model file
main: failed to load model from 'ggml-model-f16.bin'
|
Apparently, it expects double the tensor's size; I added some verbosity here: https://github.com/ggerganov/ggml/blob/master/examples/gpt-j/main.cpp#L333. The original tensor has 8388608 while ggml expects 16777216:
gptj_model_load: tensor 'transformer.h.0.attn.k_proj.weight' has wrong size in model file (16777216, 8388608)
main: failed to load model from 'ggml-model-f16.bin'
I assume this might be related to the model being 4-bit, but I'm not yet sure what to touch. |