
[Detector Support]: TensorRT CUDA Version Problem #11518

Closed
JoshuaPK opened this issue May 25, 2024 Discussed in #11424 · 1 comment

Comments

@JoshuaPK

Discussed in #11424

Originally posted by JoshuaPK May 18, 2024

Describe the problem you are having

I am trying to set up Frigate using a TensorRT detector with CUDA. I have configured and verified the CUDA driver, libraries, and container tools with my RTX 3050. When I start Frigate, it gets to the point where it starts to generate yolov7-320.trt, and then fails with a CUDA driver/runtime version error (error 35, cudaErrorInsufficientDriver).
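
Error 35 is cudaErrorInsufficientDriver: the CUDA runtime inside the container could not find a driver at least as new as it requires. Under a container runtime this often means the NVIDIA driver libraries were never mounted into the container at all, not that the host driver is actually old. A minimal sketch of the version check the runtime effectively performs (the function name and version strings here are illustrative, not Frigate or CUDA code):

```python
def driver_supports_runtime(driver_cuda: str, runtime_cuda: str) -> bool:
    """Return True if the driver's reported CUDA version is at least as
    new as the runtime's, i.e. the pairing avoids error 35
    (cudaErrorInsufficientDriver)."""
    def parse(version: str) -> tuple:
        return tuple(int(part) for part in version.split("."))
    return parse(driver_cuda) >= parse(runtime_cuda)

# nvidia-smi below reports "CUDA Version: 12.4" for driver 550.54.15,
# so a 12.x runtime in the container should be satisfied -- if the
# driver libraries are visible inside the container at all.
print(driver_supports_runtime("12.4", "12.2"))  # True: driver is new enough
print(driver_supports_runtime("11.8", "12.2"))  # False: error-35 territory
```

Since the host driver here (550.54.15, CUDA 12.4) is recent, the mismatch most likely comes from the container not seeing the driver libraries rather than the driver being too old.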

Version

Unsure; haven't gotten that far yet

Frigate config file

mqtt:
  enabled: False

cameras:
  dummy_camera: # <--- this will be changed to your actual camera later
    enabled: False
    ffmpeg:
      inputs:
        - path: rtsp://127.0.0.1:554/rtsp
          roles:
            - detect

docker-compose file or Docker CLI command

version: "3.9"
services:
  frigate:
    container_name: frigate
    #privileged: true # this may not be necessary for all setups
    restart: unless-stopped
    image: ghcr.io/blakeblackshear/frigate:stable-tensorrt
    shm_size: "4gb" # update for your cameras based on calculation above
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              count: 1
              capabilities: [gpu] 
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /srv/vcs/frigate/config:/config
      - /srv/vcs/frigate/media:/media/frigate
      - type: tmpfs # Optional: 1GB of memory, reduces SSD/SD Card wear
        target: /tmp/cache
        tmpfs:
          size: 1000000000
    ports:
      - "5000:5000"
      - "8554:8554" # RTSP feeds
      - "8555:8555/tcp" # WebRTC over tcp
      - "8555:8555/udp" # WebRTC over udp
    environment:
      FRIGATE_RTSP_PASSWORD: "password"

Relevant log output

Frigate:

[frigate] | Creating yolov7-320.cfg and yolov7-320.weights
[frigate] | 
[frigate] | Done.
[frigate] | 2024-05-18 16:01:34.222576297  [INFO] Starting go2rtc healthcheck service...
[frigate] | 
[frigate] | Generating yolov7-320.trt. This may take a few minutes.
[frigate] | 
Traceback (most recent call last):
  File "/usr/local/src/tensorrt_demos/yolo/onnx_to_tensorrt.py", line 214, in <module>
    main()
  File "/usr/local/src/tensorrt_demos/yolo/onnx_to_tensorrt.py", line 202, in main
    engine = build_engine(
  File "/usr/local/src/tensorrt_demos/yolo/onnx_to_tensorrt.py", line 112, in build_engine
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network(*EXPLICIT_BATCH) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
TypeError: pybind11::init(): factory function returned nullptr
[frigate] | [05/18/2024-16:01:37] [TRT] [W] Unable to determine GPU memory usage
[frigate] | [05/18/2024-16:01:37] [TRT] [W] Unable to determine GPU memory usage
[frigate] | [05/18/2024-16:01:37] [TRT] [W] CUDA initialization failure with error: 35. Please check your CUDA installation: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
[frigate] | Loading the ONNX file...
[frigate] | Available tensorrt models:
ls: cannot access '*.trt': No such file or directory

nvidia-smi:

Sat May 18 16:31:13 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3050        Off |   00000000:01:00.0 Off |                  N/A |
| 34%   39C    P0             N/A /   70W |       0MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Operating system

Other Linux

Install method

Docker Compose

Coral version

CPU (no coral)

Any other information that may be helpful

I am using Podman instead of Docker on Almalinux 9.4.
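
A likely culprit under Podman: the `deploy.resources.reservations.devices` stanza is a Docker-specific GPU request that Podman's compose tooling may silently ignore, so the NVIDIA driver libraries are never injected into the container and CUDA initialization fails with error 35. A hedged alternative (this assumes nvidia-container-toolkit with CDI support, where the device spec is generated with `nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml`; not a verified fix for this setup) is to request the GPU as a CDI device instead:

```yaml
# Sketch: replace the deploy/resources GPU reservation with a CDI
# device request, which Podman passes through as --device.
services:
  frigate:
    devices:
      - nvidia.com/gpu=all   # or nvidia.com/gpu=0 to pin a single GPU
```

A quick way to confirm the container can see the GPU at all is `podman run --rm --device nvidia.com/gpu=all <image> nvidia-smi`; if that fails, the problem is in the container/GPU wiring rather than in Frigate.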

@NickM-27
Sponsor Collaborator

issues are for feature requests

@NickM-27 closed this as not planned on May 25, 2024