
Use custom GPT-J checkpoint #488

Open
mariecwhite opened this issue Aug 28, 2023 · 1 comment

Comments

@mariecwhite

I would like to run the ggml/gpt-j version on the MLPerf benchmark. Is it possible to use a fine-tuned GPT-J checkpoint listed here: https://github.com/mlcommons/inference/blob/master/language/gpt-j/README.md#download-gpt-j-model? The pre-trained version used in MLPerf is EleutherAI/gpt-j-6B which is the same as what is used in ggml.

@maxng07 commented Sep 7, 2023

Hi, have you tried it out on your end?
I was looking into running benchmark tests last week, after I converted and quantized the 7B Code Llama for accuracy, so your post definitely interests me.

What I did was download the PyTorch checkpoint, which is approx. 24 GB, from the link you shared. I used one of the older convert-h5-to-ggml.py scripts (https://github.com/ggerganov/ggml/blob/master/examples/gpt-j/convert-h5-to-ggml.py), since the newer GGUF convert tool produced an int error. The script itself ran without problems. However, because my machine has only 16 GB of RAM, it ran out of memory and the job was killed. I'm pretty positive it will work on a machine with more than 24 GB of RAM.
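For context on why 16 GB isn't enough: GPT-J-6B has roughly 6.05 billion parameters, and the ~24 GB of shards on disk are consistent with fp32 weights, which the converter has to materialize in RAM before writing anything out. A back-of-envelope sketch (my own numbers, not measurements from this run):

```python
# Rough memory estimate for loading GPT-J-6B during conversion.
# ~6.05e9 parameters; the checkpoint on disk is fp32 (4 bytes/param),
# so just holding the weights exceeds a 16 GB machine before any
# Python/tokenizer overhead is counted.
params = 6.05e9
fp32_bytes = params * 4   # weights as stored in the .bin shards
fp16_bytes = params * 2   # if the tool down-converts to fp16 on the fly

print(f"fp32: {fp32_bytes / 1e9:.1f} GB")  # → fp32: 24.2 GB
print(f"fp16: {fp16_bytes / 1e9:.1f} GB")  # → fp16: 12.1 GB
```

Either way the fp32 load alone blows past 16 GB, which matches the OOM kill below.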

Please keep me posted on the status of the conversion and of the MLPerf testing. I'm interested in MLPerf too, as I'm going to run something similar. Either post back here or DM me privately.

See below:
root@master:~/oldllama.cpp/models/gpt-j/checkpoint-final# python3 convert.py ./ 1
Loading checkpoint shards: 33%|██████████████████████████ | 1/3 [00:24<00:49, 24.94s/it]
Killed

root@master:~/oldllama.cpp/models/gpt-j/checkpoint-final# ls -l
total 23759020
-rw-r--r-- 1 root root        3110 Jul 20 22:11 README.md
-rw-r--r-- 1 root root        4346 Jun 30 22:06 added_tokens.json
-rw-r--r-- 1 root root        1000 Jun 30 22:06 config.json
-rw-r--r-- 1 root root        5509 Sep  6 08:11 convert.py
-rw-r--r-- 1 root root         124 Jun 30 22:06 generation_config.json
-rw-r--r-- 1 root root      456318 Jun 30 22:06 merges.txt
-rw-r--r-- 1 root root 10004248818 Jun 30 22:22 pytorch_model-00001-of-00003.bin
-rw-r--r-- 1 root root  9983934481 Jun 30 22:22 pytorch_model-00002-of-00003.bin
-rw-r--r-- 1 root root  4332935279 Jun 30 22:14 pytorch_model-00003-of-00003.bin
-rw-r--r-- 1 root root       25834 Jun 30 22:06 pytorch_model.bin.index.json
-rw-r--r-- 1 root root         462 Jun 30 22:06 special_tokens_map.json
-rw-r--r-- 1 root root         810 Jun 30 22:06 tokenizer_config.json
-rw-r--r-- 1 root root     6571300 Jun 30 22:06 trainer_state.json
-rw-r--r-- 1 root root      999186 Jun 30 22:06 vocab.json

root@master:~/oldllama.cpp/models/gpt-j/checkpoint-final# python3 convert.py ./ 1 pytorch_model-00001-of-00003.bin
Loading checkpoint shards: 33%|██████████████████████████ | 1/3 [00:20<00:41, 20.91s/it]Killed
