
Assertion ggml_nelements(a) == ne0*ne1*ne2 when loading TheBloke/Llama-2-70B-GGML/llama-2-70b.ggmlv3.q2_K.bin #2445

Closed
xvolks opened this issue Jul 29, 2023 · 3 comments


xvolks commented Jul 29, 2023

Loading the Llama 2 70B model from TheBloke with rustformers/llm seems to work, but it fails at inference.

llama.cpp raises an assertion regardless of the use_gpu option:

Loading of model complete
Model size = 27262.60 MB / num tensors = 723
[2023-07-29T14:24:19Z INFO  actix_server::builder] starting 10 workers
[2023-07-29T14:24:19Z INFO  actix_server::server] Actix runtime found; starting in Actix runtime
GGML_ASSERT: llama-cpp/ggml.c:6192: ggml_nelements(a) == ne0*ne1*ne2

This might be related to the model files, but the models from TheBloke are usually reliable.

Running on a MacBook Pro M1 Max with 32 GB RAM.
macOS 14.0.0 (23A5301g)
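
For context, the assertion at llama-cpp/ggml.c:6192 is the element-count check in ggml's 3-D reshape: ggml_reshape_3d(ctx, a, ne0, ne1, ne2) requires ggml_nelements(a) == ne0*ne1*ne2, i.e. a reshape must preserve the total number of elements. A minimal sketch of the invariant, assuming ggml's public C API (the shapes below are invented for illustration, not taken from the 70B model):

```c
// Sketch of the invariant behind the assertion (assumes ggml's public C API;
// the tensor shapes here are made up for illustration).
#include "ggml.h"

int main(void) {
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16*1024*1024,   // small scratch arena
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // 8192 * 64 = 524288 elements
    struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 8192, 64);

    // OK: 128 * 64 * 64 == 524288 == ggml_nelements(a)
    struct ggml_tensor * ok = ggml_reshape_3d(ctx, a, 128, 64, 64);
    (void) ok;

    // Would abort with the exact assertion from this issue:
    // GGML_ASSERT: ggml_nelements(a) == ne0*ne1*ne2
    // struct ggml_tensor * bad = ggml_reshape_3d(ctx, a, 128, 64, 63);

    ggml_free(ctx);
    return 0;
}
```

A mismatch like this usually means the graph-building code computed tensor dimensions that disagree with what the loaded weights actually contain, which fits the observation below that rolling back llama.cpp made the same model work.
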

@dillfrescott

A similar error happened to me too. It's not the model; it's something in llama.cpp. I rolled back to yesterday's commit and it worked fine.


github-actions bot commented Apr 9, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions bot closed this as completed Apr 9, 2024

oldmanjk commented Jun 1, 2024

Same issue when loading DeepSeek-V2-Chat. Reopen? @ggerganov
$ ./imatrix --seed 0 --threads 24 --threads-batch 32 --file [FILE] --flash-attn --model ggml-model-f32.gguf -o [OUTPUT] --no-ppl

GGML_ASSERT: ggml.c:5715: ggml_nelements(a) == ne0*ne1
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No stack.
The program is not being run.
Aborted (core dumped)

Edit - Trying again as the root user produced this extra output:

[New LWP 1533186]
[New LWP 1533187]
[New LWP 1533188]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007702834ea42f in __GI___wait4 (pid=1535286, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30	../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0  0x00007702834ea42f in __GI___wait4 (pid=1535286, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30	in ../sysdeps/unix/sysv/linux/wait4.c
#1  0x0000635da3c593fb in ggml_print_backtrace ()
#2  0x0000635da3c83415 in ggml_reshape_2d ()
#3  0x0000635da3ca0ec2 in llm_build_kqv(ggml_context*, llama_model const&, llama_hparams const&, llama_cparams const&, llama_kv_cache const&, ggml_cgraph*, ggml_tensor*, ggml_tensor*, ggml_tensor*, ggml_tensor*, int, int, float, std::function<void (ggml_tensor*, char const*, int)> const&, int) ()
#4  0x0000635da3ca3797 in llm_build_kv(ggml_context*, llama_model const&, llama_hparams const&, llama_cparams const&, llama_kv_cache const&, ggml_cgraph*, ggml_tensor*, ggml_tensor*, ggml_tensor*, ggml_tensor*, ggml_tensor*, ggml_tensor*, int, int, int, float, std::function<void (ggml_tensor*, char const*, int)> const&, int) [clone .constprop.0] ()
#5  0x0000635da3d1dfae in llm_build_context::build_deepseek2() ()
#6  0x0000635da3caa8cf in llama_build_graph(llama_context&, llama_batch const&, bool) ()
#7  0x0000635da3cc7f49 in llama_new_context_with_model ()
#8  0x0000635da3d40578 in llama_init_from_gpt_params(gpt_params&) ()
#9  0x0000635da3c55d48 in main ()
[Inferior 1 (process 1533172) detached]
Aborted

Edit - -ngl 0 changes nothing
Edit - -b 256 changes nothing
Edit - disabling flash attention fixed it
