Something magic in here... #10
trinhdoduyhungss started this conversation in General
-
Please tell the model size and data type (FP16, FP32, Q4_0, Q4_1). This is a known issue: #8. I've bumped the memory this morning; please try the latest commit and check whether it works. If it does not work, then that's a bug in
-
I tried to run your code on my computer (Windows). Everything is smooth and it works well: even though my local machine has only 16 GB of RAM (just 8 GB available), the 14B Raven model still runs (although slowly, about 2 minutes to answer "What is your name?"). However, the magic starts here: when I pushed the weights to my server (Linux) with 126 GB of RAM (100 GB available) and a 32-core CPU, I got this error:
```
Loading 20B tokenizer
System info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Loading RWKV model
ggml_new_tensor_impl: not enough space in the context's memory pool (needed 10662867344, available 10662862009)
Segmentation fault (core dumped)
```
What is happening here?
I tried downloading the model again, converting it to ggml, and quantizing it once more on the server, then running it again, but I still got the error above. Hmm, I have no idea...