
gpt2 error #371

Closed · stellanhaglund opened this issue Jul 11, 2023 · 21 comments

@stellanhaglund

Hi, I have fine-tuned GPT-2 with Hugging Face and PyTorch.
I ran gpt-2/convert-h5-to-ggml.py, but when I try to run the model I get this:

main: seed = 1689063614
gpt2_model_load: loading model from 'models/gpt2-train/ggml-model.bin'
gpt2_model_load: n_vocab = 50259
gpt2_model_load: n_ctx   = 1024
gpt2_model_load: n_embd  = 768
gpt2_model_load: n_head  = 12
gpt2_model_load: n_layer = 12
gpt2_model_load: ftype   = 1
gpt2_model_load: qntvr   = 0
gpt2_model_load: ggml tensor size = 240 bytes
gpt2_model_load: ggml ctx size = 384.78 MB
gpt2_model_load: memory size =    72.00 MB, n_mem = 12288
gpt2_model_load: tensor 'model/wpe' has wrong size in model file: got 3145728, expected 1572864
main: failed to load model from 'models/gpt2-train/ggml-model.bin'
@stellanhaglund
Author

In f32 I get this:

main: seed = 1689072653
gpt2_model_load: loading model from 'models/gpt2-train/ggml-model-f32.bin'
gpt2_model_load: n_vocab = 50259
gpt2_model_load: n_ctx   = 1024
gpt2_model_load: n_embd  = 768
gpt2_model_load: n_head  = 12
gpt2_model_load: n_layer = 12
gpt2_model_load: ftype   = 0
gpt2_model_load: qntvr   = 0
gpt2_model_load: ggml tensor size = 240 bytes
gpt2_model_load: ggml ctx size = 694.02 MB
gpt2_model_load: memory size =    72.00 MB, n_mem = 12288
gpt2_model_load: tensor 'model/h0/attn/c_attn/w' has wrong shape in model file: got [768, 2304], expected [2304, 768]

@ebeyabraham
Contributor

ebeyabraham commented Jul 13, 2023

@stellanhaglund you need to modify the conversion script for it to work with Hugging Face PyTorch models. For fp32, you need to transpose model/h0/attn/c_attn/w the way the conversion script does it here: https://github.com/ggerganov/ggml/blob/f6365c0605ac86c6ab106cda0e8d6650e54097a7/examples/gpt-2/convert-h5-to-ggml.py#L130C5-L132C32

For the fp16 model, you get that error because you need to convert model/wpe to fp16, similar to this:

if name[-7:] == ".weight" and n_dims == 2:
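
A minimal sketch of what that looks like inside the per-tensor loop of convert-h5-to-ggml.py (the name, data, and ftype variables follow the upstream script, which imports numpy as np; data.ndim is the n_dims of the snippet above):

    if name.endswith("attn.c_attn.weight"):
        # HF's Conv1D stores weights as [in, out]; the ggml example wants [out, in]
        data = data.transpose()

    # fp16 conversion as gated in the script: only 2-D ".weight" tensors
    # are cast to float16, everything else is written as float32
    ftype_cur = 0
    if ftype == 1 and name[-7:] == ".weight" and data.ndim == 2:
        data = data.astype(np.float16)
        ftype_cur = 1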

@stellanhaglund
Author

I'm using that script.

@stellanhaglund
Author

OK, so I added this for the f32:

    if name.endswith('attn.c_attn.weight'):
        data = data.transpose()

    if name.endswith('mlp.c_fc.weight'):
        data = data.transpose()

And it works, but the model doesn't seem right anymore; it just outputs the same word over and over.

@stellanhaglund
Author

And from what I can tell, the conversion script does convert wpe to f16, but somehow the loader expects a different size.

@ebeyabraham
Contributor

> OK, so I added this for the f32:
>
>     if name.endswith('attn.c_attn.weight'):
>         data = data.transpose()
>
>     if name.endswith('mlp.c_fc.weight'):
>         data = data.transpose()
>
> And it works, but the model doesn't seem right anymore; it just outputs the same word over and over.

You also need to add:

if name.endswith("c_proj.weight"):
    data = data.transpose()
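
Combined with the earlier two, the transposes can be collapsed into a single check (a sketch over the same name/data variables; "c_proj.weight" matches both attn.c_proj and mlp.c_proj):

    # all HF GPT-2 Conv1D weights that need transposing for the ggml example
    # (str.endswith accepts a tuple of suffixes)
    if name.endswith(("attn.c_attn.weight", "mlp.c_fc.weight", "c_proj.weight")):
        data = data.transpose()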

@stellanhaglund
Author

@ebeyabraham
Contributor

It should be "c_proj.weight" and not "mlp.c_proj.weight" for it to work with Hugging Face models.

@stellanhaglund
Author

stellanhaglund commented Jul 13, 2023 via email

@ggerganov
Owner

Retry generation with the latest master - I just pushed a bug fix.

@stellanhaglund
Author

stellanhaglund commented Jul 14, 2023 via email

@ggerganov
Owner

You'll need to convert the tensors as explained by @MrGrayCode

The convert script is very hacky at the moment and, as you can see, it requires many manual adjustments.
We'll improve the support with time, but for now it's a bit tedious and most things won't work straight out of the box.

@stellanhaglund
Author

stellanhaglund commented Jul 14, 2023 via email

@stellanhaglund
Author

I just did sys.getsizeof(data) on the data for the wpe section and got 1572992.
The error message says "got 3145728, expected 1572864"; I don't understand where it gets 3145728 from 🤷‍♂️

@stellanhaglund
Author

Just added some more prints and get:

Processing variable: wpe.weight with shape:  (1024, 768)
  Converting to float16
<class 'numpy.float16'>

That should mean 1024 * 768 * 2, so 1,572,864 bytes.
ftype gets set to 1, so everything looks correct on the conversion side?

@stellanhaglund
Author

Looks like the example is expecting wpe to be f32:

ctx_size += n_ctx*n_embd*ggml_type_sizef(GGML_TYPE_F32); // wpe
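
That explains the numbers: 3,145,728 is exactly 1024 × 768 × 4 (wpe as f32, which is what the example allocates), and 1,572,864 is 1024 × 768 × 2 (the f16 tensor actually in the file). A hedged fix on the conversion side (a sketch over the same variables as above; "wpe.weight" is the assumed Hugging Face name for the tensor written as model/wpe):

    # keep the position embeddings in float32 so the on-disk size matches
    # what examples/gpt-2/main.cpp allocates for model/wpe
    if name == "wpe.weight":
        data = data.astype(np.float32)
        ftype_cur = 0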

@ebeyabraham
Contributor

ebeyabraham commented Jul 14, 2023

You can try using this script: https://gist.github.com/MrGrayCode/f8c795f1e2744dc5d5c7d8917c7846e0, which works for Hugging Face GPT-2 model conversion.

Make sure that you are using the correct files for vocab.json and added_tokens.json.
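
A quick sanity check (the paths are placeholders) that the two files line up with the n_vocab = 50259 the loader reports:

    import json

    # hypothetical paths; point these at the tokenizer files saved with the model
    with open("models/gpt2-train/vocab.json") as f:
        encoder = json.load(f)
    with open("models/gpt2-train/added_tokens.json") as f:
        encoder_added = json.load(f)

    # base GPT-2 vocab is 50257; two added special tokens give 50259
    print(len(encoder), len(encoder_added), len(encoder) + len(encoder_added))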

@stellanhaglund
Author

Thanks a lot! 🙏
It works, as my changes did, but this is what the output looks like:

[SingleEP2830acl2006!),"HQ!),046IntegADD2006!),"HQ!),046234!),046995Fair"},%%.""StreamerBotFSopp Battlefield morpabsorFI!) RiftLock!).?00007!) SteadEPLock!).?00007!) SteadEP2200IED046 Battlefield!]628177estychristFSopp Battlefield morp995////////HandleDeveloper%234!),046 ZuckerADD2006!),"HQ!),046 ramp Albania"",,413Bott$.""413Bott 1988""413Bottash"""}],"!)------------------------------------------------""unc"}],",,137""RepHQ"}],"!)------------------------------------------------""2006 Diablo"}],",,..""RepHQ"}],"!)------------------------------------------------""NodeCU"}],",,137""RepHQ"}],"!)------------------------------------------------""Node960"}],",,..""RepHQ"}],"!)------------------------------------------------""ENTION?""}],",,137""657HQ!),AMES"}],"!)------------------------------------------------""ENTIONCommission"}],",,..""657HQ!),AMES Discover DL??

Could it be that the tokenizer is getting modified a bit? I'm using this Colab:
https://colab.research.google.com/drive/13dZVYEOMhXhkXWfvSMVM1TTtUDrT6Aeh

The interesting parts:

tokenizer = GPT2Tokenizer.from_pretrained('gpt2', bos_token='<|startoftext|>', eos_token='<|endoftext|>', pad_token='<|pad|>')

class GPT2Dataset(Dataset):

  def __init__(self, txt_list, tokenizer, gpt2_type="gpt2", max_length=768):

    self.tokenizer = tokenizer
    self.input_ids = []
    self.attn_masks = []

    for txt in txt_list:

      encodings_dict = tokenizer('<|startoftext|>'+ txt + '<|endoftext|>', truncation=True, max_length=max_length, padding="max_length")

      self.input_ids.append(torch.tensor(encodings_dict['input_ids']))
      self.attn_masks.append(torch.tensor(encodings_dict['attention_mask']))
    
  def __len__(self):
    return len(self.input_ids)

  def __getitem__(self, idx):
    return self.input_ids[idx], self.attn_masks[idx]
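
For what it's worth, this tokenizer setup is where the 50259 comes from: <|endoftext|> already exists in the base GPT-2 vocab, so only <|startoftext|> and <|pad|> are actually added. A minimal check (assuming the same setup as the Colab):

    from transformers import GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained(
        "gpt2",
        bos_token="<|startoftext|>",
        eos_token="<|endoftext|>",
        pad_token="<|pad|>",
    )
    print(len(tokenizer))  # 50257 + 2 = 50259, matching n_vocab above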

@stellanhaglund
Author

I had to do this:

tokenizer = GPT2Tokenizer.from_pretrained(dir)

for i in range(hparams["vocab_size"]):
    text = tokenizer.decode([i]).encode('utf-8')
    fout.write(struct.pack("i", len(text)))
    fout.write(text)

instead of:

for key in encoder:
    text = bytearray([byte_decoder[c] for c in key])
    fout.write(struct.pack("i", len(text)))
    fout.write(text)

for key in encoder_added:
    text = bytearray([byte_decoder[c] for c in key])
    fout.write(struct.pack("i", len(text)))
    fout.write(text) 
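
A commented version of that replacement (fout, hparams, and tokenizer come from the conversion script):

    import struct

    # write one length-prefixed token string per id, in id order; using
    # tokenizer.decode covers the base vocab and any tokens added at
    # fine-tuning time (<|startoftext|>, <|pad|>) in a single pass
    for i in range(hparams["vocab_size"]):
        text = tokenizer.decode([i]).encode("utf-8")
        fout.write(struct.pack("i", len(text)))
        fout.write(text)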

@webpolis

The same thing happens with a Hugging Face GPT-J 4-bit model:

main: seed = 1695659205
gptj_model_load: loading model from 'ggml-model-f16.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: ftype   = 1
gptj_model_load: qntvr   = 0
gptj_model_load: ggml ctx size = 12438.93 MB
gptj_model_load: memory_size =   896.00 MB, n_mem = 57344
gptj_model_load: tensor 'transformer.h.0.attn.k_proj.weight' has wrong size in model file
main: failed to load model from 'ggml-model-f16.bin'
gptj_model_load: 

@webpolis

webpolis commented Sep 25, 2023

Apparently it's duplicating the tensor's size; I added some verbosity here:

https://github.com/ggerganov/ggml/blob/master/examples/gpt-j/main.cpp#L333

The original tensor has 8388608 while ggml expects 16777216:

gptj_model_load: tensor 'transformer.h.0.attn.k_proj.weight' has wrong size in model file (16777216, 8388608)
main: failed to load model from 'ggml-model-f16.bin'
gptj_model_load:

I assume this might be related to the model being 4-bit, but I'm not yet sure what to touch.
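
The arithmetic at least is consistent with that guess (an interpretation, not confirmed above): with n_embd = 4096, k_proj.weight is a 4096 × 4096 matrix, and a 4-bit checkpoint packs two values per byte, which halves the stored size:

    # hedged arithmetic check for transformer.h.0.attn.k_proj.weight
    n_embd = 4096
    full = n_embd * n_embd   # 16777216 -> what the loader expects
    packed = full // 2       # 8388608  -> 4-bit storage, two values per byte
    print(full, packed)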
