
local-ai: ggml_cuda_init: failed to initialize CUDA: CUDA driver is a stub library #320145

Closed
teto opened this issue Jun 15, 2024 · 2 comments · Fixed by #324387

Comments

@teto
Member

teto commented Jun 15, 2024

Describe the bug

Not sure if the culprit is local-ai itself or one of its dependencies (llama-cpp/gpt4all), but I can't leverage GPU inference on my NVIDIA RTX 3060, I think because of this error: ggml_cuda_init: failed to initialize CUDA: CUDA driver is a stub library.

➜ nix run github:teto/nixpkgs/local-ai-with-cuda#local-ai -- --debug
11:51PM INF loading environment variables from file envFile=/home/teto/.config/localai.env
11:51PM INF Setting logging to info
11:51PM INF Starting LocalAI using 4 threads, with models path: /home/teto/models
11:51PM INF LocalAI version: v2.16.0 ()
WARNING: failed to read int from file: open /sys/class/drm/card0/device/numa_node: no such file or directory
11:51PM INF Preloading models from /home/teto/models

  Model name: mistral
11:51PM ERR error establishing configuration directory watcher error="unable to establish watch on the LocalAI Configuration Directory: no such file or directory"
11:51PM INF core/startup process completed!
11:51PM INF LocalAI API is listening! Please connect to the endpoint for API documentation. endpoint=https://0.0.0.0:11111
....
ggml_cuda_init: failed to initialize CUDA: CUDA driver is a stub library
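
One way to check whether the stub is baked into the backend binary's RUNPATH, rather than coming from the environment, is to inspect the extracted backend directly. A sketch; the /tmp path is the one LocalAI extracts to at runtime (taken from the LD_DEBUG trace further down) and may differ per run:

  # Print the RUNPATH of the extracted llama-cpp backend and check which
  # libcuda the dynamic linker actually resolves. A .../cuda_cudart-*/lib/stubs
  # entry in the RUNPATH would be the smoking gun.
  bin=/tmp/localai/backend_data/backend-assets/grpc/llama-cpp-avx2
  patchelf --print-rpath "$bin"
  ldd "$bin" | grep -i cuda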

nvidia-smi output looks ok:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |

Following advice from the nixpkgs CUDA Matrix room, I re-ran the previous command with LD_DEBUG=libs and ended up with

10:21PM DBG GRPC(mistral-7b-openorca.Q6_K.gguf-127.0.0.1:37653): stderr     633360:	 search path=/nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib/glibc-hwcaps/x86-64-v3:/nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib/glibc-hwcaps/x86-64-v2:/nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib:/nix/store/zx9yfgv4ag607b8m3dgcp5p94b6vd13c-cuda_cudart-12.2.140-lib/lib/stubs/glibc-hwcaps/x86-64-v3:/nix/store/zx9yfgv4ag607b8m3dgcp5p94b6vd13c-cuda_cudart-12.2.140-lib/lib/stubs/glibc-hwcaps/x86-64-v2:/nix/store/zx9yfgv4ag607b8m3dgcp5p94b6vd13c-cuda_cudart-12.2.140-lib/lib/stubs		(RUNPATH from file /tmp/localai/backend_data/backend-assets/grpc/llama-cpp-avx2)

if that helps
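
For reference, that was roughly the following invocation, filtered down to the libcuda lookups (a sketch; the grep is just to keep the firehose of LD_DEBUG output manageable):

  # Trace the dynamic linker's library searches and keep only the libcuda
  # lines; a hit inside a .../lib/stubs directory confirms the wrong pick.
  LD_DEBUG=libs nix run github:teto/nixpkgs/local-ai-with-cuda#local-ai -- --debug 2>&1 | grep -i libcuda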

Steps To Reproduce

Steps to reproduce the behavior:
I've pushed a branch on top of master with one commit enabling CUDA and unfree by default, so one can test with:

  1. run nix run github:teto/nixpkgs/local-ai-with-cuda#local-ai -- --debug on a cuda machine
  2. send a request (you can do this from the browser at localhost:8080)
  3. check the output

Expected behavior

GPU inference should work. It used to, by the way, but I would have a hard time tracking down when it broke: NVIDIA drivers are hit and miss, and there seem to have been a lot of NVIDIA-related changes in nixpkgs recently.

Additional context

I found a related issue, horovod/horovod#3831, but I'm not sure what to do with it; I think we want to keep using shared libraries.

➜ ls -l /run/opengl-driver/lib | head
lrwxrwxrwx - root  1 janv.  1970 d3d -> /nix/store/fh3p3s1gg0ick2f295zfwi2jlr78166r-mesa-24.1.1-drivers/lib/d3d
dr-xr-xr-x - root  1 janv.  1970 dri
lrwxrwxrwx - root  1 janv.  1970 gbm -> /nix/store/w7fcnyxkxara9fixrmigzrir3k8fbdb3-nvidia-x11-550.90.07-6.8.12/lib/gbm
lrwxrwxrwx - root  1 janv.  1970 nvidia -> /nix/store/w7fcnyxkxara9fixrmigzrir3k8fbdb3-nvidia-x11-550.90.07-6.8.12/lib/nvidia
lrwxrwxrwx - root  1 janv.  1970 systemd -> /nix/store/w7fcnyxkxara9fixrmigzrir3k8fbdb3-nvidia-x11-550.90.07-6.8.12/lib/systemd
dr-xr-xr-x - root  1 janv.  1970 vdpau
lrwxrwxrwx - root  1 janv.  1970 libcuda.so -> /nix/store/w7fcnyxkxara9fixrmigzrir3k8fbdb3-nvidia-x11-550.90.07-6.8.12/lib/libcuda.so
lrwxrwxrwx - root  1 janv.  1970 libcuda.so.1 -> /nix/store/w7fcnyxkxara9fixrmigzrir3k8fbdb3-nvidia-x11-550.90.07-6.8.12/lib/libcuda.so.1
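
As a possible workaround (untested): since the trace above says the stub comes from a DT_RUNPATH entry, and glibc consults LD_LIBRARY_PATH before DT_RUNPATH, pointing LD_LIBRARY_PATH at the real driver directory should shadow the stub:

  # Untested sketch: LD_LIBRARY_PATH is searched before DT_RUNPATH, so the
  # driver's real libcuda.so.1 should win over the cudart stub.
  LD_LIBRARY_PATH=/run/opengl-driver/lib nix run github:teto/nixpkgs/local-ai-with-cuda#local-ai -- --debug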

Feel free to close this if it's the wrong place to submit it, but I would appreciate any workaround/tip.

Notify maintainers

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

[user@system:~]$ nix-shell -p nix-info --run "nix-info -m"
output here

Add a 👍 reaction to issues you find important.

@teto
Member Author

teto commented Jun 22, 2024

Using llama-cpp directly works fine, so I suspect a problem in how local-ai is built. I will stick to llama-cpp for now.

teto mentioned this issue Jun 22, 2024
@SomeoneSerge
Contributor

ggml_cuda_init: failed to initialize CUDA: CUDA driver is a stub library

This is ${cudaPackages.cuda_cudart.stubs}/lib/libcuda.so being loaded instead of ${addDriverRunpath.driverLink}/lib/libcuda.so
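
A quick way to verify that on an affected machine (a sketch, not the proper nixpkgs-side fix, which would be to give these vendored backend binaries the driver runpath at build time): strip the stubs directory out of the backend's RUNPATH with patchelf so resolution falls through to ${addDriverRunpath.driverLink}/lib:

  # Sketch: drop any .../stubs entry from the RUNPATH of the extracted
  # backend so the dynamic linker falls through to /run/opengl-driver/lib.
  # Note: LocalAI re-extracts this file on each run, so the edit is transient.
  bin=/tmp/localai/backend_data/backend-assets/grpc/llama-cpp-avx2
  old=$(patchelf --print-rpath "$bin")
  new=$(printf '%s' "$old" | tr ':' '\n' | grep -v '/stubs' | paste -sd: -)
  patchelf --set-rpath "$new" "$bin"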
