
Odd behaviour with GPT2 after bfc6d42 #385

Closed
smspillaz opened this issue Jul 13, 2023 · 4 comments · Fixed by #386
Labels
bug Something isn't working

Comments

smspillaz (Contributor)

New behaviour after bfc6d42

  $ ./bin/gpt-2 -m ./models/gpt-2-117M/ggml-model.bin -p 'The meaning of life is:' --top_k 1
  main: seed = 1689160875
  gpt2_model_load: loading model from './models/gpt-2-117M/ggml-model.bin'
  gpt2_model_load: n_vocab = 50257
  gpt2_model_load: n_ctx   = 1024
  gpt2_model_load: n_embd  = 768
  gpt2_model_load: n_head  = 12
  gpt2_model_load: n_layer = 12
  gpt2_model_load: ftype   = 1
  gpt2_model_load: qntvr   = 0
  gpt2_model_load: ggml tensor size = 240 bytes
  gpt2_model_load: ggml ctx size = 384.77 MB
  gpt2_model_load: memory size =    72.00 MB, n_mem = 12288
  gpt2_model_load: model size  =   239.08 MB
  extract_tests_from_file : No test file found.
  test_gpt_tokenizer : 0 tests failed out of 0 tests.
  main: prompt: 'The meaning of life is:'
  main: number of tokens in prompt = 6, first 8 tokens: 464 3616 286 1204 318 25 
  
  The meaning of life is: life
  
  
  "I'm
  
  "I'm
  
  "I'm
  
  "I'm

Old behaviour at bfc6d42^

  $ ./bin/gpt-2 -m ../../ggml/build/models/gpt-2-117M/ggml-model.bin -p 'The meaning of life is:' --top_k 1
  main: seed = 1689160978
  gpt2_model_load: loading model from '../../ggml/build/models/gpt-2-117M/ggml-model.bin'
  gpt2_model_load: n_vocab = 50257
  gpt2_model_load: n_ctx   = 1024
  gpt2_model_load: n_embd  = 768
  gpt2_model_load: n_head  = 12
  gpt2_model_load: n_layer = 12
  gpt2_model_load: ftype   = 1
  gpt2_model_load: qntvr   = 0
  gpt2_model_load: ggml tensor size = 240 bytes
  gpt2_model_load: ggml ctx size = 384.77 MB
  gpt2_model_load: memory size =    72.00 MB, n_mem = 12288
  gpt2_model_load: model size  =   239.08 MB
  extract_tests_from_file : No test file found.
  test_gpt_tokenizer : 0 tests failed out of 0 tests.
  main: prompt: 'The meaning of life is:'
  main: number of tokens in prompt = 6, first 8 tokens: 464 3616 286 1204 318 25 
  
  The meaning of life is: to live in a world of abundance, and to live in a world of abundance.
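Both runs pass `--top_k 1`, which is what makes this comparison meaningful: with top-k restricted to a single candidate, sampling degenerates to a plain argmax over the logits, so the output is fully determined by the model's computation and any divergence points to a numerical change rather than sampling noise. A minimal Python sketch of the idea (an illustration only, not the actual sampling code in the gpt-2 example):

```python
import numpy as np

def sample_top_k(logits, top_k):
    # Keep only the top_k highest logits and sample among them.
    # With top_k == 1 this degenerates to argmax (greedy decoding),
    # so repeated runs on identical logits always pick the same token.
    idx = np.argsort(logits)[-top_k:]
    probs = np.exp(logits[idx] - logits[idx].max())
    probs /= probs.sum()
    return int(np.random.choice(idx, p=probs))

logits = np.array([0.1, 2.5, 0.3, 1.7])
print(sample_top_k(logits, 1))  # always index 1, the argmax
```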

I haven't been able to figure out exactly why this happens, but I did bisect it down to that commit. One thing I noticed when poking around in the debugger is that the logits for the first predicted token are correct, but the logits for the second predicted token differ.
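The debugger observation above suggests a simple way to narrow down such regressions: dump the per-step logits from the good build (bfc6d42^) and the bad build (bfc6d42) and find the first generation step where they diverge. A hypothetical sketch of that comparison (the arrays are stand-ins for logits dumped from the two builds, e.g. loaded via `np.load`; this is not part of the ggml codebase):

```python
import numpy as np

# Stand-ins for per-step logit dumps from the two builds; in practice these
# would be loaded from files written by instrumented runs of each binary.
good = np.array([[0.1, 2.5, 0.3], [1.0, 0.2, 0.4]])
bad  = np.array([[0.1, 2.5, 0.3], [1.0, 0.9, 0.4]])

def first_divergence(a, b, atol=1e-5):
    # Return the index of the first step whose logits differ, or None.
    for step, (x, y) in enumerate(zip(a, b)):
        if not np.allclose(x, y, atol=atol):
            return step
    return None

print(first_divergence(good, bad))  # 1: first step matches, second differs
```

This mirrors the symptom reported here: step 0 agrees, step 1 diverges, pointing at state carried over between evaluations (e.g. the KV memory) rather than the first forward pass.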

ggerganov added the bug label on Jul 13, 2023

ggerganov (Owner)

I can reproduce the issue and confirm that bfc6d42 is the cause. We should fix this.

smspillaz changed the title from "Odd behaviour with GPT2 after" to "Odd behaviour with GPT2 after bfc6d42" on Jul 14, 2023

ggerganov (Owner)

Thank you very much for spotting this regression!

I hope to implement a better CI soon so that bugs like this can be caught earlier. In the meantime, if you spot anything weird in the results, let us know.

smspillaz (Contributor, Author)

Ah, thanks for finding the source of the bug.

By the way, I found this because my own unit tests were failing :) . I'd be happy to contribute some testing infrastructure to GGML; there's a related issue for that, see #344.

goerch (Contributor) commented on Jul 23, 2023

@smspillaz: @ggerganov just merged this PR. I'd be happy to coordinate efforts!
