
version libcudnn_ops_infer.so.8 not defined in file libcudnn_ops_infer.so.8 with link time reference #104591

Open
Mycatinjuly opened this issue Jul 4, 2023 · 20 comments
Labels
module: cuda Related to torch.cuda, and CUDA support in general triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@Mycatinjuly

Mycatinjuly commented Jul 4, 2023

Issue description

When I load one torch model, it works, but when I load two torch models, it shows:

Could not load library libcudnn_cnn_infer.so.8. Error: /usr/local/cuda/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8: symbol _ZN5cudnn3ops18GetInternalStreamsEP12cudnnContextiPP11CUstream_st, version libcudnn_ops_infer.so.8 not defined in file libcudnn_ops_infer.so.8 with link time reference
Please make sure libcudnn_cnn_infer.so.8 is in your library path!

But I can find the libcudnn_cnn_infer.so.8 file at this path: /usr/local/cuda/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8
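A quick way to see which copy of the library the dynamic loader would actually resolve (a diagnostic sketch, not from the original report; `ctypes.util.find_library` consults the same search paths as the loader):

```python
import ctypes.util
import os

# Diagnostic sketch: which libcudnn_ops_infer would the loader pick,
# and what does LD_LIBRARY_PATH currently contain?
candidate = ctypes.util.find_library("cudnn_ops_infer")
print("loader would pick:", candidate)
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
```

If this prints a path under /usr/local/cuda, the system-wide cuDNN is shadowing the one the pip wheels expect.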

Code example

from TTS.api import TTS
from faster_whisper import WhisperModel

# WHISPER_MODEL_PATH and audio_file_path are defined elsewhere
tts = TTS(model_name="tts_models/multilingual/multi-dataset/your_tts", progress_bar=True, gpu=True)
tts_clone = TTS(model_name="voice_conversion_models/multilingual/vctk/freevc24", progress_bar=True, gpu=True)
whisper_model = WhisperModel(WHISPER_MODEL_PATH, device="cuda", compute_type="float16")
segments, info = whisper_model.transcribe(audio_file_path, beam_size=5, word_timestamps=True)

Error information

Could not load library libcudnn_cnn_infer.so.8. Error: /usr/local/cuda/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8: symbol _ZN5cudnn3ops18GetInternalStreamsEP12cudnnContextiPP11CUstream_st, version libcudnn_ops_infer.so.8 not defined in file libcudnn_ops_infer.so.8 with link time reference
Please make sure libcudnn_cnn_infer.so.8 is in your library path!

Then when I load just one TTS model, it works:

from TTS.api import TTS
from faster_whisper import WhisperModel

# WHISPER_MODEL_PATH and audio_file_path are defined elsewhere
tts = TTS(model_name="tts_models/multilingual/multi-dataset/your_tts", progress_bar=True, gpu=True)
#tts_clone = TTS(model_name="voice_conversion_models/multilingual/vctk/freevc24", progress_bar=True, gpu=True)
whisper_model = WhisperModel(WHISPER_MODEL_PATH, device="cuda", compute_type="float16")
segments, info = whisper_model.transcribe(audio_file_path, beam_size=5, word_timestamps=True)

System Info

[screenshot of system info]

cc @ptrblck

@ptrblck
Collaborator

ptrblck commented Jul 6, 2023

Based on your torch.__version__ output it seems you've installed the pip wheels via pip install torch, which ship with their own CUDA dependencies:

pip list | grep torch
(tmp) pbialecki@ptrblck-srv:~$ pip install torch
Collecting torch
  Downloading torch-2.0.1-cp310-cp310-manylinux1_x86_64.whl (619.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 619.9/619.9 MB 58.4 MB/s eta 0:00:00
...
Collecting nvidia-cudnn-cu11==8.5.0.96 (from torch)
  Downloading nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 557.1/557.1 MB 58.5 MB/s eta 0:00:00
Collecting nvidia-cublas-cu11==11.10.3.66 (from torch)
  Downloading nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 317.1/317.1 MB 58.5 MB/s eta 0:00:00
...
Successfully installed MarkupSafe-2.1.3 cmake-3.26.4 filelock-3.12.2 jinja2-3.1.2 lit-16.0.6 mpmath-1.3.0 networkx-3.1 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-cupti-cu11-11.7.101 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.2.10.91 nvidia-cusolver-cu11-11.4.0.1 nvidia-cusparse-cu11-11.7.4.91 nvidia-nccl-cu11-2.14.3 nvidia-nvtx-cu11-11.7.91 sympy-1.12 torch-2.0.1 triton-2.0.0 typing-extensions-4.7.0

Smoke test:

(tmp) pbialecki@ptrblck-srv:~$ python -c "import torch; print(torch.__version__); print(torch.backends.cudnn.version())"
2.0.1+cu117
8500

(tmp) pbialecki@ptrblck-srv:~$ ls /home/pbialecki/miniforge3/envs/tmp/lib/python3.10/site-packages/nvidia/cudnn/lib/
__init__.py              libcudnn_adv_train.so.8  libcudnn_cnn_train.so.8  libcudnn_ops_train.so.8  __pycache__
libcudnn_adv_infer.so.8  libcudnn_cnn_infer.so.8  libcudnn_ops_infer.so.8  libcudnn.so.8

However, your error message points to the system-wide library install, which should not be used.
Are you LD_PRELOAD'ing the library for some reason?
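A sketch of how to locate the wheel-bundled cuDNN directory from within any environment, without importing torch itself (this is a diagnostic I'm adding here, assuming the `nvidia-cudnn` wheel layout shown in the listing above):

```python
import importlib.util
import os

# Sketch: find the cuDNN libraries that ship with the pip wheels --
# the copy PyTorch should actually be using, not the system-wide one.
try:
    spec = importlib.util.find_spec("nvidia.cudnn")
except ModuleNotFoundError:
    spec = None

if spec is not None and spec.submodule_search_locations:
    lib_dir = os.path.join(list(spec.submodule_search_locations)[0], "lib")
    print("wheel-bundled cuDNN:", lib_dir)
else:
    print("the nvidia-cudnn wheel is not installed in this environment")
```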

@janeyx99 janeyx99 added module: cuda Related to torch.cuda, and CUDA support in general triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module labels Jul 6, 2023
@zhaoxin111

I encountered the same problem, how did you solve it?

@xu-zhang-handsome

bro, me too, how did you solve it?

@jS5t3r

jS5t3r commented Oct 20, 2023

SYSTRAN/faster-whisper#516

You need to install cuDNN 8:

conda install -c conda-forge cudnn

@aradhyamathur

The problem persists even after conda install cudnn.

@jS5t3r

jS5t3r commented Oct 26, 2023

@aradhyamathur What about https://anaconda.org/conda-forge/cudatoolkit-dev? Does the problem still persist?

And have you tried cuDNN version 8?

@bakermanbrian

Having a similar issue. I'm trying to get Faster Whisper to run from a Docker build.

I'm trying to use the docker image:
pytorch/pytorch:2.1.0-cuda11.8-cudnn8-runtime

Unfortunately, getting this libcudnn_ops_infer.so.8 issue as well. Anyone know how I might add the necessary additional libraries? I can't use the official Nvidia docker image it seems (was too large for my smaller system to handle).

@Sere1nz

Sere1nz commented Nov 1, 2023

In the terminal, run export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/zzm/anaconda3/envs/ttskit-new/lib/python3.8/site-packages/nvidia/cudnn/lib
Replace /home/zzm/anaconda3/envs/ttskit-new with your environment path, and don't forget to change the Python version as well.
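Rather than hand-editing the environment path and Python version, the export line can be generated for the current environment (a sketch I'm adding, assuming the pip-wheel layout `site-packages/nvidia/cudnn/lib`):

```python
import os
import sysconfig

# Sketch: compute the export command for the *current* environment instead
# of hard-coding "/home/zzm/anaconda3/envs/ttskit-new" and "python3.8".
site_packages = sysconfig.get_paths()["purelib"]
cudnn_lib = os.path.join(site_packages, "nvidia", "cudnn", "lib")
print(f"export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:{cudnn_lib}")
```

Paste the printed line into your shell (or shell profile) to apply it.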

@bakermanbrian

My problem is I'm building then running this directly in a Google Cloud VM. Do you know if there's any way to do this via my Docker file?

@aradhyamathur

@jS5t3r it seems the issue was perhaps coming from conda, I'm not really sure; creating a new env and installing with pip worked for me.

@bakermanbrian

I updated my Dockerfile like so (I got the LD_LIBRARY_PATH by testing on my VM, running python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))')

FROM pytorch/pytorch:2.1.0-cuda11.8-cudnn8-runtime

RUN pip install nvidia-cublas-cu11 nvidia-cudnn-cu11

RUN pip install -r requirements.txt

ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/conda/lib/python3.10/site-packages/nvidia/cublas/lib:/opt/conda/lib/python3.10/site-packages/nvidia/cudnn/lib

Got a bit further, but got this error:

Could not load library libcudnn_cnn_infer.so.8. Error: /opt/conda/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn_cnn_infer.so.8: undefined symbol: _ZN11nvrtcHelper4loadEb, version libcudnn_ops_infer.so.8

Seems like it's close, just missing one thing on compatibility.

@jS5t3r

jS5t3r commented Nov 8, 2023

(quoting @bakermanbrian's Dockerfile and error above)

conda install -c anaconda cudatoolkit
conda install -c anaconda cudnn
conda install -c conda-forge cudatoolkit-dev

@bakermanbrian

Appreciate the suggestion! Unfortunately, this made my container size too large for my VM to handle; is there an alternative that doesn't install as much? I'm still confused why the original pytorch/pytorch:2.1.0-cuda11.8-cudnn8-runtime doesn't accomplish what I need.

@FredHaa

FredHaa commented Nov 29, 2023

In the PyTorch image, the correct path to libcudnn_ops_infer.so.8 is /opt/conda/lib/python3.10/site-packages/torch/lib.

I got it working using ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/conda/lib/python3.10/site-packages/torch/lib
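That path can also be resolved programmatically, which avoids hard-coding /opt/conda or the Python version if the base image changes (a sketch I'm adding; run it inside the container):

```python
import importlib.util
import os

# Sketch: resolve torch's bundled lib directory (where the PyTorch Docker
# image ships libcudnn_ops_infer.so.8) without hard-coding the prefix.
spec = importlib.util.find_spec("torch")
if spec is not None and spec.submodule_search_locations:
    torch_lib = os.path.join(list(spec.submodule_search_locations)[0], "lib")
    print(f"ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:{torch_lib}")
else:
    print("torch is not installed in this environment")
```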

@lifeiteng

LD_LIBRARY_PATH=/YOUR_PATH/anaconda3/lib/python3.11/site-packages/nvidia/cudnn/lib:$LD_LIBRARY_PATH worked for me. pip install torch installs cuDNN itself (site-packages/nvidia/cudnn), but the program tries to load cuDNN from /usr/local/cuda/lib64.

I remember that in earlier years, both PyTorch and TensorFlow loaded CUDA and cuDNN from the system.
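When the environment variable can't be set before the process starts, the same effect can be had from inside Python (an alternative sketch, not from the thread, assuming the pip-wheel layout):

```python
import ctypes
import os
import sysconfig

# Alternative sketch: dlopen the wheel-bundled cuDNN with RTLD_GLOBAL
# before anything else loads the system copy from /usr/local/cuda/lib64.
# Its symbols then satisfy libraries loaded afterwards, similar to
# prepending the directory to LD_LIBRARY_PATH.
cudnn_lib = os.path.join(sysconfig.get_paths()["purelib"], "nvidia", "cudnn", "lib")
so_path = os.path.join(cudnn_lib, "libcudnn.so.8")
if os.path.exists(so_path):
    ctypes.CDLL(so_path, mode=ctypes.RTLD_GLOBAL)
else:
    print("no wheel-bundled cuDNN found at", cudnn_lib)
```

This must run before the first import that touches cuDNN (e.g. before faster_whisper loads a model).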

@ynlgcn

ynlgcn commented Apr 14, 2024

[broken image attachment]
What should I do?

@HydrogenSulfate

In the terminal, export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/zzm/anaconda3/envs/ttskit-new/lib/python3.8/site-packages/nvidia/cudnn/lib. Replace /home/zzm/anaconda3/envs/ttskit-new with your environment path, and don't forget to change the Python version as well.

It really works!

@zxxf18

zxxf18 commented Jun 6, 2024

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/zzm/anaconda3/envs/ttskit-new/lib/python3.8/site-packages/nvidia/cudnn/lib

Maybe use export LD_LIBRARY_PATH=/home/zzm/anaconda3/envs/ttskit-new/lib/python3.8/site-packages/nvidia/cudnn/lib:$LD_LIBRARY_PATH instead (prepend rather than append).
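The order matters because the dynamic linker scans LD_LIBRARY_PATH left to right and loads the first match; a small sketch (with a placeholder path) makes the difference concrete:

```python
import os

# Sketch: the loader takes the FIRST matching directory in LD_LIBRARY_PATH,
# so the wheel's cuDNN directory must appear before /usr/local/cuda/lib64.
wheel_dir = "/path/to/site-packages/nvidia/cudnn/lib"  # placeholder path
system_dir = "/usr/local/cuda/lib64"

appended = f"{system_dir}:{wheel_dir}"   # system copy is found first
prepended = f"{wheel_dir}:{system_dir}"  # wheel copy is found first
print(prepended.split(":")[0])  # → /path/to/site-packages/nvidia/cudnn/lib
```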

@ovenKiller

In the terminal, export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/zzm/anaconda3/envs/ttskit-new/lib/python3.8/site-packages/nvidia/cudnn/lib. Replace /home/zzm/anaconda3/envs/ttskit-new with your environment path, and don't forget to change the Python version as well.

Thank you. Although that didn't work, I tried export LD_LIBRARY_PATH=/home/zzm/anaconda3/envs/ttskit-new/lib/python3.8/site-packages/nvidia/cudnn/lib, and it works. I guess the old path misled the program.

@tianhao-stan-wu

LD_LIBRARY_PATH=/YOUR_PATH/anaconda3/lib/python3.11/site-packages/nvidia/cudnn/lib:$LD_LIBRARY_PATH worked for me. pip install torch installs cuDNN itself (site-packages/nvidia/cudnn), but the program tries to load cuDNN from /usr/local/cuda/lib64.

I remember that in earlier years, both PyTorch and TensorFlow loaded CUDA and cuDNN from the system.

This worked for me! Setting the path before $LD_LIBRARY_PATH is crucial.
