
Author, do you have a complete single-file Python version of the code that loads a TensorRT .engine model and runs instance segmentation inference? That is, a simplified version of the official inference code that can run from one file without pulling in many other Python files or libraries #13055

Closed
1 task done
yxl23 opened this issue May 31, 2024 · 2 comments
Labels
question (Further information is requested), Stale

Comments

@yxl23

yxl23 commented May 31, 2024

Search before asking

Question

Author, do you have a complete single-file Python version of the code that loads a TensorRT .engine model and runs instance segmentation inference? That is, a simplified version of the official inference code that can run from one file without pulling in many other Python files or libraries.

Additional

No response

yxl23 added the question (Further information is requested) label on May 31, 2024
@glenn-jocher
Member

Hello,

Thank you for reaching out with your query. Currently, we don't have a single-file Python script specifically for running inference with a TensorRT .engine model that minimizes external dependencies. However, you can achieve this by using the tensorrt library in Python to load and run inference with the .engine file.

Here’s a basic outline of what the code could look like:

import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit
import numpy as np

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
engine_file_path = 'path_to_your_engine_file.engine'

def load_engine(engine_file_path):
    with open(engine_file_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        return runtime.deserialize_cuda_engine(f.read())

def main():
    engine = load_engine(engine_file_path)
    context = engine.create_execution_context()

    # Allocate buffers and create a stream.
    inputs, outputs, bindings, stream = [], [], [], cuda.Stream()
    for binding in engine:
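        # Note: assumes a static-shape engine; max_batch_size is 1 for explicit-batch engines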
        size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        # Allocate host and device buffers
        host_mem = cuda.pagelocked_empty(size, dtype)
        device_mem = cuda.mem_alloc(host_mem.nbytes)
        # Append the device buffer to device bindings.
        bindings.append(int(device_mem))
        # Append to the appropriate list.
        if engine.binding_is_input(binding):
            inputs.append({'host': host_mem, 'device': device_mem})
        else:
            outputs.append({'host': host_mem, 'device': device_mem})

    # Prepare input data; replace this dummy array with a real preprocessed image
    input_data = np.random.rand(*engine.get_binding_shape(0)).astype(inputs[0]['host'].dtype)
    np.copyto(inputs[0]['host'], input_data.ravel())
    cuda.memcpy_htod_async(inputs[0]['device'], inputs[0]['host'], stream)

    # Run inference
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    cuda.memcpy_dtoh_async(outputs[0]['host'], outputs[0]['device'], stream)
    stream.synchronize()

    # Output data will be in outputs[0]['host']
    print("Inference output:", outputs[0]['host'])

if __name__ == '__main__':
    main()
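
One caveat worth flagging: the binding-oriented calls above (get_binding_shape, get_binding_dtype, binding_is_input, max_batch_size) were deprecated in TensorRT 8.5 and removed in TensorRT 10. On newer TensorRT builds the buffer loop would use the name-based tensor API instead; a rough, untested sketch of the equivalent loop:

    for i in range(engine.num_io_tensors):
        name = engine.get_tensor_name(i)
        size = trt.volume(engine.get_tensor_shape(name))  # static shapes assumed
        dtype = trt.nptype(engine.get_tensor_dtype(name))
        host_mem = cuda.pagelocked_empty(size, dtype)
        device_mem = cuda.mem_alloc(host_mem.nbytes)
        context.set_tensor_address(name, int(device_mem))  # replaces the bindings list
        if engine.get_tensor_mode(name) == trt.TensorIOMode.INPUT:
            inputs.append({'host': host_mem, 'device': device_mem})
        else:
            outputs.append({'host': host_mem, 'device': device_mem})

    # ...and the inference launch becomes:
    context.execute_async_v3(stream_handle=stream.handle)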

This script is a simplified example and assumes you have a working TensorRT and PyCUDA setup. You may need to adjust the data handling and buffer management for your specific model inputs and outputs; a YOLOv5 segmentation engine, for instance, produces two outputs (the detection tensor and the prototype masks), so read back every entry in outputs rather than just the first.
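
On the data-handling side, here is a minimal preprocessing sketch for a YOLOv5 model, assuming the default 640x640 export size, reusing the numpy import from the script above, and adding OpenCV for image loading. Note it pads to the bottom-right rather than centering the letterbox as the official LoadImages pipeline does:

import cv2  # assumed extra dependency

def preprocess(image_path, input_hw=(640, 640)):
    # Load an image and convert it to the letterboxed NCHW float32 layout YOLOv5 expects
    img = cv2.imread(image_path)
    r = min(input_hw[0] / img.shape[0], input_hw[1] / img.shape[1])
    nh, nw = int(round(img.shape[0] * r)), int(round(img.shape[1] * r))
    canvas = np.full((input_hw[0], input_hw[1], 3), 114, dtype=np.uint8)  # gray padding
    canvas[:nh, :nw] = cv2.resize(img, (nw, nh))
    chw = canvas[:, :, ::-1].transpose(2, 0, 1)  # BGR -> RGB, HWC -> CHW
    return np.ascontiguousarray(chw, dtype=np.float32)[None] / 255.0  # batch dim, scale to [0, 1]

The returned array can be flattened straight into inputs[0]['host'] in place of the dummy input_data above; remember to undo the same scaling and padding when mapping boxes and masks back to the original image.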

For more comprehensive guidance on exporting and running YOLOv5 models with TensorRT, please refer to our documentation on model export and inference.
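
For reference, the repository's own entry points already wrap all of the above; assuming a CUDA-capable GPU and placeholder paths, something like this exports a segmentation model to an engine and runs the official segmentation inference on it:

python export.py --weights yolov5s-seg.pt --include engine --device 0
python segment/predict.py --weights yolov5s-seg.engine --source path/to/images --device 0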

Contributor

github-actions bot commented Jul 1, 2024

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

github-actions bot added the Stale label on Jul 1, 2024
github-actions bot closed this as not planned on Jul 12, 2024