[Bug] [Relax] Segfault error when parse the Relax IR #17239

Cookiee235 · 2024-08-03T08:52:05Z

Actual behavior

[16:49:04] /software/tvm-lunder/src/runtime/logging.cc:390: TVM_LOG_DEBUG enables VLOG statements in 'ir/transform.cc' up to level 1
[16:49:04] /software/tvm-lunder/src/runtime/logging.cc:390: TVM_LOG_DEBUG enables VLOG statements in 'relay/ir/transform.cc' up to level 1
Segmentation fault (core dumped)

Steps to reproduce

from tvm.script import ir as I
from tvm.script import tir as T
from tvm.script import relax as R

@I.ir_module
class Module:
    @T.prim_func(private=True)
    def multiply_by_two(A: T.Buffer((16,), "float32")):
        for i in range(16):
            A[i] = A[i] * T.float32(2)

    @R.function
    def main(A: R.Tensor((16,), dtype="float32")) -> R.Tensor((16,), dtype="float32"):
        cls = Module
        args: R.Tuple(R.Tensor((16,), dtype="float32")) = (A,)
        gv1: R.Tensor((16,), dtype="float32") = R.call_tir_inplace(cls.multiply_by_two, args, out_sinfo=R.Tensor((16,), dtype="float32"), inplace_indices=[0])
        return gv1
m = Module

cc @Lunderberg @junrushao

The text was updated successfully, but these errors were encountered:

Lunderberg · 2024-08-05T16:02:25Z

Looks like this is a combination of a couple of factors.

Like R.call_tir, the arguments in R.call_tir_inplace must be an in-line relax::Tuple (See the discussion in [Unity][Transform] Handle relax.Var as call_tir args when lowering #15916 for discussion on this requirement.)
The argument provided to R.call_tir_inplace is wrapped into an in-line tuple if it is not already an in-line tuple. While the normalization of R.call_tir_inplace would handle this case if the expression is generated through C++, this wrapping generates a tuple of var-to-tuple (R.tuple(args)), circumventing the normalization.
The error-checking by R.call_tir_inplace can be triggered by multiple conditions (argument is not a tensor, argument doesn't have known shape, argument's known shape does not match output shape), but the error message attempts to access the argument's known shape, triggering a segfault if it doesn't actually exist.

Lunderberg · 2024-08-05T16:28:20Z

I've submitted separate PR #17242 which should provide a better error message (instead of a segfault) when this occurs.

For (2), we may be able to improve it by checking isinstance(args.struct_info, TupleStructInfo) rather than isinstance(args. relax.Tuple). This way, a tuple that was defined earlier in the function wouldn't be modified, and would produce an error message at an earlier point. The normalization (which would produce the required in-line Tuple) is suppressed during TVMScript parsing, since TVMScript is frequently used for writing test cases that violate Relax assumptions. Maybe we should tie the normalization to the existing check_well_formed flag, so those tests usually disable the well-formed checks as well.

Prior to this commit, the different `R.call_tir*` variations would wrap the arguments into an in-line `relax.Tuple`, if it is not already a `relax.Tuple`. While this allows a tensor to be passed into these functions as a single argument (`R.call_tir(func, arg, ...)` instead of `R.call_tir(func, [arg], ...)`), the wrapped Relax variable may already refer to a tuple. This use of a variable to refer to an argument tuple rather than an in-line argument tuple is not allowed by Relax. (See discussion on apache#15916 for details.) However, by wrapping a variable `args: R.Tuple(R.Tensor, R.Tensor, ...)` into a tuple-of-tuples, the error occurs after the expression has already been generated, and refers to an expression `R.Tuple(R.Tuple(R.Tensor, R.Tensor, ...))` that doesn't appear anywhere in the user's input. This can make debugging difficult (see apache#17239 for an example). This commit updates the argument-handling in `R.call_tir` to only generate an in-line `relax.Tuple` if the arguments do not already have `relax.TupleStructInfo`. If the argument was provided as a Relax variable bound to a tuple of arguments, it will still produce an error. However, that error will occur much earlier, and will explicitly state that the argument must be a `relax.Tuple` instead of a `relax.Var`.

Lunderberg · 2024-08-05T16:56:51Z

And #17243 should address (2) by improving the error message.

Cookiee235 · 2024-08-06T14:11:52Z

@Lunderberg Thanks for your fixing. Segfault is a dangerous behavior, fixing it with an error message is a good strategy. Also thanks for your efforts in improving the well-formed checking for relay IR which is also very meaningful.

…17243) * [Relax] Avoid wrapping TupleStructInfo into a Tuple for R.call_tir Prior to this commit, the different `R.call_tir*` variations would wrap the arguments into an in-line `relax.Tuple`, if it is not already a `relax.Tuple`. While this allows a tensor to be passed into these functions as a single argument (`R.call_tir(func, arg, ...)` instead of `R.call_tir(func, [arg], ...)`), the wrapped Relax variable may already refer to a tuple. This use of a variable to refer to an argument tuple rather than an in-line argument tuple is not allowed by Relax. (See discussion on #15916 for details.) However, by wrapping a variable `args: R.Tuple(R.Tensor, R.Tensor, ...)` into a tuple-of-tuples, the error occurs after the expression has already been generated, and refers to an expression `R.Tuple(R.Tuple(R.Tensor, R.Tensor, ...))` that doesn't appear anywhere in the user's input. This can make debugging difficult (see #17239 for an example). This commit updates the argument-handling in `R.call_tir` to only generate an in-line `relax.Tuple` if the arguments do not already have `relax.TupleStructInfo`. If the argument was provided as a Relax variable bound to a tuple of arguments, it will still produce an error. However, that error will occur much earlier, and will explicitly state that the argument must be a `relax.Tuple` instead of a `relax.Var`. * lint fixes

Cookiee235 added needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it type: bug labels Aug 3, 2024

Lunderberg mentioned this issue Aug 5, 2024

[Relax] Avoid wrapping TupleStructInfo into a Tuple for R.call_tir #17243

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug] [Relax] Segfault error when parse the Relax IR #17239

[Bug] [Relax] Segfault error when parse the Relax IR #17239

Cookiee235 commented Aug 3, 2024

Lunderberg commented Aug 5, 2024

Lunderberg commented Aug 5, 2024

Lunderberg commented Aug 5, 2024

Cookiee235 commented Aug 6, 2024

[Bug] [Relax] Segfault error when parse the Relax IR #17239

[Bug] [Relax] Segfault error when parse the Relax IR #17239

Comments

Cookiee235 commented Aug 3, 2024

Actual behavior

Steps to reproduce

Lunderberg commented Aug 5, 2024

Lunderberg commented Aug 5, 2024

Lunderberg commented Aug 5, 2024

Cookiee235 commented Aug 6, 2024