
Support for RedPajama #134

Closed · wants to merge 6 commits

Conversation

@amirza1 commented May 6, 2023

This adds support for RedPajama, which is GPT-NeoX with use_parallel_residual=False.
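
For context, that flag changes how the attention and MLP branches of each transformer block are combined. A minimal sketch of the difference, in plain NumPy with placeholder branches (illustration only, not the ggml code):

import numpy as np

# Placeholder branches; a real block uses multi-head attention,
# a GELU MLP, and learned LayerNorm parameters.
def ln(x):   return (x - x.mean()) / (x.std() + 1e-5)
def attn(x): return 0.5 * x
def mlp(x):  return 2.0 * x

def gpt_neox_block(x, use_parallel_residual):
    if use_parallel_residual:
        # Standard GPT-NeoX: both branches read the same block input.
        return x + attn(ln(x)) + mlp(ln(x))
    # RedPajama (use_parallel_residual=False): sequential residuals;
    # the MLP sees the post-attention hidden state.
    h = x + attn(ln(x))
    return h + mlp(ln(h))

x = np.random.randn(8)
print(gpt_neox_block(x, True))
print(gpt_neox_block(x, False))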

@Green-Sky (Contributor)

Just tested, and it works. But I feel like adding more and more small variations of the same code is kind of bad; we should merge stablelm and redpajama (and dolly?) and call it gptneox :)

@amirza1 (Author) commented May 7, 2023 via email

@ggerganov (Owner)

Yup, will combine them soon. Just need some time to test the Dolly model and make sure the inference is correct; currently I can't convert the model on macOS since there is no bfloat16 Python support.
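
A possible workaround (an assumption on my part, not something confirmed in this thread) is to upcast bfloat16 tensors to float32 in PyTorch before handing them to NumPy:

import torch

# Sketch only; assumes the checkpoint is a plain PyTorch state dict
# and "pytorch_model.bin" is a placeholder path.
state = torch.load("pytorch_model.bin", map_location="cpu")
for name, t in state.items():
    if t.dtype == torch.bfloat16:
        t = t.to(torch.float32)  # NumPy has no bfloat16, so upcast first
    arr = t.numpy()              # now safe to convert and write out as f32/f16
    print(name, arr.dtype, arr.shape)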

@keldenl commented May 8, 2023

@amirza1 awesome stuff! I'm uploading ggml models on Hugging Face: https://huggingface.co/huggersbro/RedPajama-INCITE-Chat-3B-v1-GGML (should be good to link since it's open source, right?)

@mverrilli (Contributor) commented May 8, 2023

Hi @ggerganov

> currently I can't convert the model on macOS since there is no bfloat16 Python support

I liked @keldenl's idea so I posted the ggml bins here if that helps you test:
https://huggingface.co/mverrilli/dolly-v2-3b-ggml/tree/main
https://huggingface.co/mverrilli/dolly-v2-12b-ggml/tree/main

Let me know if I can assist.

@keldenl commented May 8, 2023

Here's the 3B instruct model: https://huggingface.co/keldenl/RedPajama-INCITE-Instruct-3B-v1-GGML/

@amirza1 @ggerganov should we link these ggml models in the README like gpt-2 (since this has an Apache 2.0 license) as an alternative option (i.e., "or you can get the ggml directly")?

Update: Here's the 7B instruct model https://huggingface.co/keldenl/RedPajama-INCITE-Instruct-7B-v0.1-GGML

Only the 7B chat model is left; I'll upload it later tonight.

@mudler (Contributor) commented May 9, 2023

> Hi @ggerganov
>
> > currently I can't convert the model on macOS since there is no bfloat16 Python support
>
> I liked @keldenl's idea so I posted the ggml bins here if that helps you test:
> https://huggingface.co/mverrilli/dolly-v2-3b-ggml/tree/main
> https://huggingface.co/mverrilli/dolly-v2-12b-ggml/tree/main
>
> Let me know if I can assist.

I gave it a shot and tried the Q5 model locally; no luck so far:

main: seed = 1683660204
dollyv2_model_load: loading model from '/models/ggml-dolly-q5_0.bin' - please wait ...
dollyv2_model_load: n_vocab = 50280
dollyv2_model_load: n_ctx   = 2048
dollyv2_model_load: n_embd  = 4096
dollyv2_model_load: n_head  = 32
dollyv2_model_load: n_layer = 32
dollyv2_model_load: n_rot   = 32
dollyv2_model_load: ftype   = 8
dollyv2_model_load: ggml ctx size = 8596.22 MB
dollyv2_model_load: memory_size =  1024.00 MB, n_mem = 65536
dollyv2_model_load: unknown tensor 'gpt_neox.embed_in.weight' in model file
main: failed to load model from '/models/ggml-dolly-q5_0.bin'

@mverrilli (Contributor)

@mudler This works fine for me.

This is the mverrilli/dolly-v2-12b-ggml/ggml-model-q5_0.bin model, correct?

Maybe check your hash? SHA256: 79280421cc792330eaa56621060b8e2fb48ef570ace4572a91a1cf0e18ce7f38
I verified mine matches what's on HF.

There isn't a lot of error handling in the examples. Do you have enough RAM to load the model?
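
For reference, a generic streaming hash check in Python (not from this thread; the file name is a placeholder):

import hashlib

def sha256_of(path, chunk_size=1 << 20):
    # Stream in chunks so multi-GB model files don't need to fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

expected = "79280421cc792330eaa56621060b8e2fb48ef570ace4572a91a1cf0e18ce7f38"
print(sha256_of("ggml-model-q5_0.bin") == expected)  # placeholder path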

@mudler (Contributor) commented May 10, 2023

> @mudler This works fine for me.
>
> This is the mverrilli/dolly-v2-12b-ggml/ggml-model-q5_0.bin model, correct?
>
> Maybe check your hash? SHA256: 79280421cc792330eaa56621060b8e2fb48ef570ace4572a91a1cf0e18ce7f38
> I verified mine matches what's on HF.
>
> There isn't a lot of error handling in the examples. Do you have enough RAM to load the model?

I've downloaded the 7B model (https://huggingface.co/mverrilli/dolly-v2-7b-ggml/blob/main/ggml-model-q5_0.bin):

~/_git/LocalAI
base ❯ sha256sum models/ggml-dolly-q5_0.bin            
9926cddcccd5c4d61a43ec05c8999147ae3c1deac7af636d3ffc618d7d30514b  models/ggml-dolly-q5_0.bin

Note that I have 64 GB of RAM, so that shouldn't be the issue.

@mverrilli (Contributor)

@mudler The hash matches mine, and after re-pulling master and rebuilding, it is working for me. I don't want to clutter up this PR any further; if you want to create a new issue, I can work through it with you.

@ggerganov (Owner)

I've been a bit busy these days; I will start looking into the newly proposed models here soon.

Please check whether #139 works with RedPajama. If it does, I think we should merge it instead of adding a new example, in order to reduce code duplication.

@ggerganov (Owner)

I've decided to merge #139.
I haven't tested RedPajama yet, so if anyone can give it a try using the latest master, please report whether it works correctly.
I will close this PR; feel free to report any issues with the new gpt-neox example.

ggerganov closed this May 13, 2023

@Green-Sky (Contributor)

Reconverted, and it works 👍

$ bin/gpt-neox -m ../examples/gpt-neox/models/RedPajama-INCITE-Base-3B-v1/ggml-model-f16.bin
main: seed = 1683973640
gpt_neox_model_load: loading model from '../examples/gpt-neox/models/RedPajama-INCITE-Base-3B-v1/ggml-model-f16.bin' - please wait ...
gpt_neox_model_load: n_vocab = 50432
gpt_neox_model_load: n_ctx   = 2048
gpt_neox_model_load: n_embd  = 2560
gpt_neox_model_load: n_head  = 32
gpt_neox_model_load: n_layer = 32
gpt_neox_model_load: n_rot   = 80
gpt_neox_model_load: par_res = 0
gpt_neox_model_load: ftype   = 1
gpt_neox_model_load: ggml ctx size = 7376.40 MB
gpt_neox_model_load: memory_size =   640.00 MB, n_mem = 65536
gpt_neox_model_load: ................................................ done
gpt_neox_model_load: model size =  5296.58 MB / num tensors = 388
main: number of tokens in prompt = 1
main: token[0] =   4553, After

After all, the first thing he did when he got to the table was to ask me for a light, and then he asked for a beer. It's the least I could do."

"I see," she said, and she did. He was a good-looking man, a little taller than she was, and a little broader too. He had dark, wavy hair and blue eyes and an open, easy manner. "Well," she said, "I have to go to the bathroom, and it's getting late. I'll see you tomorrow night."

"Thanks," he said. "I look forward to it. I'll send the bill to you."

"I'll put it on my account," she said. "The bill will be coming from me."

"Oh," he said, and smiled. "You're very careful with your money."

She had to smile too. "You're probably right," she said.


main: mem per token = 16137296 bytes
main:     load time =  1440.80 ms
main:   sample time =    20.09 ms
main:  predict time = 26467.06 ms / 132.34 ms per token
main:    total time = 28234.08 ms

@NancyAurum

That commit has a bug: it calls

const int64_t t_main_start_us = ggml_time_us();

but never calls ggml_time_init(), so timer_freq is left uninitialized and there is a division by zero. Calling ggml_time_init() once at startup initializes timer_freq and avoids the crash.
