TF TensorRT misconfigured #1323

Open
maciejskorski opened this issue Nov 18, 2023 · 0 comments
Labels
bug bug & failures with existing packages help wanted

Contributor

maciejskorski commented Nov 18, 2023

🐛 Bug

TensorFlow's TensorRT integration (TF-TRT) seems to be linked against the wrong TensorRT version.

To Reproduce

On several recent images (including gcr.io/kaggle-gpu-images/python, tag latest, image ID 311277776c9b, created 7 days ago, 47.2GB), the TensorRT version TensorFlow was linked against differs noticeably from the version loaded at runtime: 8.4 vs 8.6.

# Import the private TF-TRT helper submodule explicitly instead of relying on attribute access.
from tensorflow.compiler.tf2tensorrt import _pywrap_py_utils

linked_trt_ver = _pywrap_py_utils.get_linked_tensorrt_version()
print(f"Linked TRT ver: {linked_trt_ver}")
loaded_trt_ver = _pywrap_py_utils.get_loaded_tensorrt_version()
print(f"Loaded TRT ver: {loaded_trt_ver}")
# Linked TRT ver: (8, 4, 3)
# Loaded TRT ver: (8, 6, 1)

That is what Python reports. The system-level TensorRT packages are indeed at 8.6:

dpkg -l | grep TensorRT
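
For a scripted version of this check, a minimal sketch follows. It assumes a Debian/Ubuntu-based image where `dpkg` is available; the helper names `parse_dpkg_listing` and `installed_tensorrt_packages` are made up for illustration and are not part of any library.

```python
import subprocess
from typing import Dict


def parse_dpkg_listing(listing: str, keyword: str = "TensorRT") -> Dict[str, str]:
    """Map package name -> version for `dpkg -l` rows that mention `keyword`."""
    versions = {}
    for line in listing.splitlines():
        # Case-insensitive match, comparable to `grep -i`.
        if keyword.lower() not in line.lower():
            continue
        # `dpkg -l` columns: status, name, version, architecture, description.
        fields = line.split(None, 4)
        if len(fields) >= 3:
            versions[fields[1]] = fields[2]
    return versions


def installed_tensorrt_packages() -> Dict[str, str]:
    """Run `dpkg -l` and return the TensorRT-related packages it lists."""
    result = subprocess.run(
        ["dpkg", "-l"], capture_output=True, text=True, check=True
    )
    return parse_dpkg_listing(result.stdout)
```

Calling `installed_tensorrt_packages()` inside the container should list the same packages as the shell command above, in a form that can be compared against the loaded version reported by TensorFlow.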

So the minimal compatibility rules are formally satisfied: the major version is the same (8), and the loaded version is more recent than the linked one.

However, the linking does not work properly in practice. In these recent containers, even a minimal TF-TRT sample crashes:

import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants

from tensorflow.keras.applications.resnet50 import ResNet50
tf_model_dir = './models/tf_model'
model = ResNet50(include_top=True, weights='imagenet')
model.save(tf_model_dir)

converter = trt.TrtGraphConverterV2(  
   input_saved_model_dir=tf_model_dir,
)
converter.convert()

MAX_BATCH_SIZE = 1
def input_fn():
   img = tf.random.normal((MAX_BATCH_SIZE, 224, 224, 3), dtype=tf.float32)
   return (img, )

# Dump a Python traceback if the process receives a fatal signal.
import faulthandler
faulthandler.enable()
converter.build(input_fn=input_fn)  # SEGMENTATION FAULT can happen with misconfigured software!
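
Since a segmentation fault takes down the whole interpreter (or notebook kernel), it can help to run the reproduction in a child process and inspect how it ended. The following is only a sketch: `run_isolated` is a hypothetical helper name, and it relies on POSIX behaviour, where `subprocess` reports a process killed by a signal as a negative return code.

```python
import signal
import subprocess
import sys
from typing import Optional, Tuple


def run_isolated(code: str, timeout: Optional[float] = None) -> Tuple[str, str]:
    """Run `code` in a fresh interpreter; return (status, stderr)."""
    proc = subprocess.run(
        # `-X faulthandler` enables the traceback dump on fatal signals.
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code is the negated signal number.
        name = signal.Signals(-proc.returncode).name
        return f"killed by {name}", proc.stderr
    return f"exited with code {proc.returncode}", proc.stderr
```

Passing the script above as a string (or wrapping it with `open(path).read()`) would then report `killed by SIGSEGV` instead of silently killing the session, with the faulthandler traceback available in the returned stderr.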

Expected behavior

The linked and installed TensorRT versions should be aligned so that the sample code above runs.

Additional context

See the NVIDIA installation guidelines

Conditions from `tensorrt

        "Loaded TensorRT %s but linked TensorFlow against TensorRT %s. A few "
        "requirements must be met:\n"
        "\t-It is required to use the same major version of TensorRT during "
        "compilation and runtime.\n"
        "\t-TensorRT does not support forward compatibility. The loaded "
        "version has to be equal or more recent than the linked version.",
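
The two requirements quoted in that message can be written as a small check. This is a sketch of the rules as worded above, not the actual TensorFlow implementation; `trt_compat_violations` is a hypothetical name.

```python
from typing import List, Tuple

Version = Tuple[int, int, int]


def trt_compat_violations(linked: Version, loaded: Version) -> List[str]:
    """Return the quoted compatibility rules that a version pair violates."""
    problems = []
    # Rule 1: same major version at compile time and at runtime.
    if loaded[0] != linked[0]:
        problems.append(
            f"major version mismatch: linked {linked[0]}, loaded {loaded[0]}"
        )
    # Rule 2: no forward compatibility, so loaded must be >= linked.
    # Tuples compare lexicographically: major, then minor, then patch.
    if loaded < linked:
        problems.append(f"loaded {loaded} is older than linked {linked}")
    return problems
```

For the versions reported in this issue, `trt_compat_violations((8, 4, 3), (8, 6, 1))` returns an empty list, which matches the observation that the documented rules pass even though the build still crashes.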
@maciejskorski maciejskorski added bug bug & failures with existing packages help wanted labels Nov 18, 2023