
gpt2 error #371

Closed · stellanhaglund opened this issue Jul 11, 2023 · 21 comments

@stellanhaglund

Hi, I have fine-tuned GPT-2 with Hugging Face and PyTorch.
I ran gpt-2/convert-h5-to-ggml.py, but when I try to run the model I get this:

main: seed = 1689063614
gpt2_model_load: loading model from 'models/gpt2-train/ggml-model.bin'
gpt2_model_load: n_vocab = 50259
gpt2_model_load: n_ctx   = 1024
gpt2_model_load: n_embd  = 768
gpt2_model_load: n_head  = 12
gpt2_model_load: n_layer = 12
gpt2_model_load: ftype   = 1
gpt2_model_load: qntvr   = 0
gpt2_model_load: ggml tensor size = 240 bytes
gpt2_model_load: ggml ctx size = 384.78 MB
gpt2_model_load: memory size =    72.00 MB, n_mem = 12288
gpt2_model_load: tensor 'model/wpe' has wrong size in model file: got 3145728, expected 1572864
main: failed to load model from 'models/gpt2-train/ggml-model.bin'
@stellanhaglund
Author

In f32 I get this:

main: seed = 1689072653
gpt2_model_load: loading model from 'models/gpt2-train/ggml-model-f32.bin'
gpt2_model_load: n_vocab = 50259
gpt2_model_load: n_ctx   = 1024
gpt2_model_load: n_embd  = 768
gpt2_model_load: n_head  = 12
gpt2_model_load: n_layer = 12
gpt2_model_load: ftype   = 0
gpt2_model_load: qntvr   = 0
gpt2_model_load: ggml tensor size = 240 bytes
gpt2_model_load: ggml ctx size = 694.02 MB
gpt2_model_load: memory size =    72.00 MB, n_mem = 12288
gpt2_model_load: tensor 'model/h0/attn/c_attn/w' has wrong shape in model file: got [768, 2304], expected [2304, 768]

@ebeyabraham
Contributor

ebeyabraham commented Jul 13, 2023

@stellanhaglund you need to modify the conversion script for it to work with Hugging Face PyTorch models. For fp32, you need to transpose model/h0/attn/c_attn/w the way the conversion script does it here: https://github.com/ggerganov/ggml/blob/f6365c0605ac86c6ab106cda0e8d6650e54097a7/examples/gpt-2/convert-h5-to-ggml.py#L130C5-L132C32

For the fp16 model, you get that error because you need to convert model/wpe to fp16, similar to this:

if name[-7:] == ".weight" and n_dims == 2:
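
A minimal sketch of what that looks like inside the per-tensor loop of convert-h5-to-ggml.py (the name, data, and ftype variables follow the upstream script, which imports numpy as np; data.ndim is the n_dims of the snippet above):

    if name.endswith("attn.c_attn.weight"):
        # HF's Conv1D stores weights as [in, out]; the ggml example wants [out, in]
        data = data.transpose()

    # fp16 conversion as gated in the script: only 2-D ".weight" tensors
    # are cast to float16, everything else is written as float32
    ftype_cur = 0
    if ftype == 1 and name[-7:] == ".weight" and data.ndim == 2:
        data = data.astype(np.float16)
        ftype_cur = 1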

@stellanhaglund
Author

I'm using that script.

@stellanhaglund
Author

OK, so I added this for the f32:

    if name.endswith('attn.c_attn.weight'):
        data = data.transpose()

    if name.endswith('mlp.c_fc.weight'):
        data = data.transpose()

And it works, but the model doesn't seem right anymore; it just outputs the same word over and over.

@stellanhaglund
Author

And from what I can tell, the conversion script does convert wpe to f16, but somehow the loader expects a different size.

@ebeyabraham
Contributor

> OK, so I added this for the f32:
>
>     if name.endswith('attn.c_attn.weight'):
>         data = data.transpose()
>
>     if name.endswith('mlp.c_fc.weight'):
>         data = data.transpose()
>
> And it works, but the model doesn't seem right anymore; it just outputs the same word over and over.

You also need to add:

if name.endswith("c_proj.weight"):
    data = data.transpose()
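
Combined with the earlier two, the transposes can be collapsed into a single check (a sketch over the same name/data variables; "c_proj.weight" matches both attn.c_proj and mlp.c_proj):

    # all HF GPT-2 Conv1D weights that need transposing for the ggml example
    # (str.endswith accepts a tuple of suffixes)
    if name.endswith(("attn.c_attn.weight", "mlp.c_fc.weight", "c_proj.weight")):
        data = data.transpose()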

@stellanhaglund
Author

@ebeyabraham
Contributor

It should be "c_proj.weight" and not "mlp.c_proj.weight" for it to work with Hugging Face models.

@stellanhaglund
Author

stellanhaglund commented Jul 13, 2023 via email

@ggerganov
Owner

Retry generation with the latest master - I just pushed a bug fix.

@stellanhaglund
Author

stellanhaglund commented Jul 14, 2023 via email

@ggerganov
Owner

You'll need to convert the tensors as explained by @MrGrayCode

The convert script is very hacky at the moment and, as you can see, it requires many manual adjustments.
We'll improve the support with time, but for now it's a bit tedious and most things won't work straight out of the box.

@stellanhaglund
Author

stellanhaglund commented Jul 14, 2023 via email

@stellanhaglund
Author

I just did sys.getsizeof(data) on the data for the wpe section and got 1572992.
The error message says "got 3145728, expected 1572864"; I don't understand where it gets 3145728 from 🤷‍♂️

@stellanhaglund
Author

Just added some more prints and get:

Processing variable: wpe.weight with shape:  (1024, 768)
  Converting to float16
<class 'numpy.float16'>

That should mean 1024 * 768 * 2, so 1,572,864 bytes.
ftype gets set to 1, so everything looks correct on the conversion side?

@stellanhaglund
Author

Looks like the example is expecting wpe to be f32:

ctx_size += n_ctx*n_embd*ggml_type_sizef(GGML_TYPE_F32); // wpe
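
That explains the numbers: 3,145,728 is exactly 1024 × 768 × 4 (wpe as f32, which is what the example allocates), and 1,572,864 is 1024 × 768 × 2 (the f16 tensor actually in the file). A hedged fix on the conversion side (a sketch over the same variables as above; "wpe.weight" is the assumed Hugging Face name for the tensor written as model/wpe):

    # keep the position embeddings in float32 so the on-disk size matches
    # what examples/gpt-2/main.cpp allocates for model/wpe
    if name == "wpe.weight":
        data = data.astype(np.float32)
        ftype_cur = 0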

@ebeyabraham
Contributor

ebeyabraham commented Jul 14, 2023

You can try using this script: https://gist.github.com/MrGrayCode/f8c795f1e2744dc5d5c7d8917c7846e0, which works for Hugging Face GPT-2 model conversion.

Make sure that you are using the correct files for vocab.json and added_tokens.json.
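
A quick sanity check (the paths are placeholders) that the two files line up with the n_vocab = 50259 the loader reports:

    import json

    # hypothetical paths; point these at the tokenizer files saved with the model
    with open("models/gpt2-train/vocab.json") as f:
        encoder = json.load(f)
    with open("models/gpt2-train/added_tokens.json") as f:
        encoder_added = json.load(f)

    # base GPT-2 vocab is 50257; two added special tokens give 50259
    print(len(encoder), len(encoder_added), len(encoder) + len(encoder_added))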

@stellanhaglund
Author

Thanks a lot! 🙏
It works, as my changes did, but this is what the output looks like:

[SingleEP2830acl2006!),"HQ!),046IntegADD2006!),"HQ!),046234!),046995Fair"},%%.""StreamerBotFSopp Battlefield morpabsorFI!) RiftLock!).?00007!) SteadEPLock!).?00007!) SteadEP2200IED046 Battlefield!]628177estychristFSopp Battlefield morp995////////HandleDeveloper%234!),046 ZuckerADD2006!),"HQ!),046 ramp Albania"",,413Bott$.""413Bott 1988""413Bottash"""}],"!)------------------------------------------------""unc"}],",,137""RepHQ"}],"!)------------------------------------------------""2006 Diablo"}],",,..""RepHQ"}],"!)------------------------------------------------""NodeCU"}],",,137""RepHQ"}],"!)------------------------------------------------""Node960"}],",,..""RepHQ"}],"!)------------------------------------------------""ENTION?""}],",,137""657HQ!),AMES"}],"!)------------------------------------------------""ENTIONCommission"}],",,..""657HQ!),AMES Discover DL??

Could it be that the tokenizer is getting modified a bit? I'm using this Colab:
https://colab.research.google.com/drive/13dZVYEOMhXhkXWfvSMVM1TTtUDrT6Aeh

The interesting parts:

tokenizer = GPT2Tokenizer.from_pretrained('gpt2', bos_token='<|startoftext|>', eos_token='<|endoftext|>', pad_token='<|pad|>')

class GPT2Dataset(Dataset):

  def __init__(self, txt_list, tokenizer, gpt2_type="gpt2", max_length=768):

    self.tokenizer = tokenizer
    self.input_ids = []
    self.attn_masks = []

    for txt in txt_list:

      encodings_dict = tokenizer('<|startoftext|>'+ txt + '<|endoftext|>', truncation=True, max_length=max_length, padding="max_length")

      self.input_ids.append(torch.tensor(encodings_dict['input_ids']))
      self.attn_masks.append(torch.tensor(encodings_dict['attention_mask']))
    
  def __len__(self):
    return len(self.input_ids)

  def __getitem__(self, idx):
    return self.input_ids[idx], self.attn_masks[idx]
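
For what it's worth, this tokenizer setup is where the 50259 comes from: <|endoftext|> already exists in the base GPT-2 vocab, so only <|startoftext|> and <|pad|> are actually added. A minimal check (assuming the same setup as the Colab):

    from transformers import GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained(
        "gpt2",
        bos_token="<|startoftext|>",
        eos_token="<|endoftext|>",
        pad_token="<|pad|>",
    )
    print(len(tokenizer))  # 50257 + 2 = 50259, matching n_vocab above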

@stellanhaglund
Author

I had to do this:

tokenizer = GPT2Tokenizer.from_pretrained(dir)

for i in range(hparams["vocab_size"]):
    text = tokenizer.decode([i]).encode('utf-8')
    fout.write(struct.pack("i", len(text)))
    fout.write(text)

instead of:

for key in encoder:
    text = bytearray([byte_decoder[c] for c in key])
    fout.write(struct.pack("i", len(text)))
    fout.write(text)

for key in encoder_added:
    text = bytearray([byte_decoder[c] for c in key])
    fout.write(struct.pack("i", len(text)))
    fout.write(text) 
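
A commented version of that replacement (fout, hparams, and tokenizer come from the conversion script):

    import struct

    # write one length-prefixed token string per id, in id order; using
    # tokenizer.decode covers the base vocab and any tokens added at
    # fine-tuning time (<|startoftext|>, <|pad|>) in a single pass
    for i in range(hparams["vocab_size"]):
        text = tokenizer.decode([i]).encode("utf-8")
        fout.write(struct.pack("i", len(text)))
        fout.write(text)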

@webpolis

The same thing happens with a Hugging Face GPT-J 4-bit model:

main: seed = 1695659205
gptj_model_load: loading model from 'ggml-model-f16.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: ftype   = 1
gptj_model_load: qntvr   = 0
gptj_model_load: ggml ctx size = 12438.93 MB
gptj_model_load: memory_size =   896.00 MB, n_mem = 57344
gptj_model_load: tensor 'transformer.h.0.attn.k_proj.weight' has wrong size in model file
main: failed to load model from 'ggml-model-f16.bin'
gptj_model_load: 

@webpolis

webpolis commented Sep 25, 2023

Apparently it's duplicating the tensor's size; I added some verbosity here:

https://github.com/ggerganov/ggml/blob/master/examples/gpt-j/main.cpp#L333

The original tensor has 8388608 while ggml expects 16777216:

gptj_model_load: tensor 'transformer.h.0.attn.k_proj.weight' has wrong size in model file (16777216, 8388608)
main: failed to load model from 'ggml-model-f16.bin'
gptj_model_load:

I assume this might be related to the model being 4-bit, but I'm not yet sure what to touch.
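
The arithmetic at least is consistent with that guess (an interpretation, not confirmed above): with n_embd = 4096, k_proj.weight is a 4096 × 4096 matrix, and a 4-bit checkpoint packs two values per byte, which halves the stored size:

    # hedged arithmetic check for transformer.h.0.attn.k_proj.weight
    n_embd = 4096
    full = n_embd * n_embd   # 16777216 -> what the loader expects
    packed = full // 2       # 8388608  -> 4-bit storage, two values per byte
    print(full, packed)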
