Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

call/callbr's function type should not be a pointer type #189

Closed
RyanGlScott opened this issue May 16, 2022 · 1 comment · Fixed by #207
Closed

call/callbr's function type should not be a pointer type #189

RyanGlScott opened this issue May 16, 2022 · 1 comment · Fixed by #207
Labels

Comments

@RyanGlScott
Copy link
Contributor

If you parse an invoke instruction, such as these one found in crux-llvm's invoke-test.ll test case, the resulting IR will be:

...
  (Invoke
     (FunTy
        (PrimType (Integer 32))
        [ PtrTo (PtrTo (PrimType (Integer 8))) ]
        False)
     (ValSymbol
        (Symbol
           "_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17h6e1c450892b1f754E"))
     [ Typed
         { typedType = PtrTo (PtrTo (PrimType (Integer 8)))
         , typedValue = ValIdent (Ident "arg0")
         }
     ]
     (Named (Ident "bb1"))
     (Named (Ident "cleanup")))
...

The thing that is peculiar about this IR is that the argument type is a raw FunTy. Reading the LLVM Language Reference Manual entry for invoke, however, suggests that this should instead be a PtrTo:

Syntax:

<result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
              [operand bundles] to label <normal label> unwind label <exception label>

...

Arguments:

This instruction requires several arguments:

...

  1. fnptrval: An LLVM value containing a pointer to a function to be invoked. In most cases, this is a direct function invocation, but indirect invoke’s are just as possible, calling an arbitrary pointer to function value.

...

Making this a PtrTo would also match the treatment for call and callbr, two closely related instructions. In fact, in the source of of LLVM, the classes for call, callbr, and invoke are all subclasses of the same CallBase class, so they all store their function type in the same way.

@RyanGlScott
Copy link
Contributor Author

After some further investigation (prompted by the opaque pointer switchover—see #177), I've actually come to the opposite conclusion from this issue's title. Namely, invoke should not have a pointer type, and the fact that we give call and callbr pointer types is wrong. The thing that threw me off was the LLVM Language Reference Manual's description for the call instruction:

<result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
           <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]

I had mistakenly assumed that fnptrval must have the same type as fnty, but this is not the case! As an experiment, consider this program, which has been carefully written to ensure that the compiled bitcode will contain a call instruction containing a fnty:

#include <stdio.h>

void f() {
  int (*p)(const char*, ...) = &printf;
  p("%d\n", 0);
}

What should the fnty be when we invoke call on p? I had assumed that it would be a pointer type, but in fact, it is a raw function type:

define dso_local void @f() #0 !dbg !17 {
entry:
  %p = alloca ptr, align 8
  call void @llvm.dbg.declare(metadata ptr %p, metadata !21, metadata !DIExpression()), !dbg !28
  store ptr @printf, ptr %p, align 8, !dbg !28
  %0 = load ptr, ptr %p, align 8, !dbg !29
  %call = call i32 (ptr, ...) %0(ptr noundef @.str, i32 noundef 0), !dbg !29
  ret void, !dbg !30
}

In particular, i32 (ptr, ...) is a function type representing p's function signature: it takes a pointer as its first argument, zero or more variadic arguments afterwards, and returns an i32. (If we disable opaque pointers, then the type would be i32 (i8*, ...).) The type of %0 is a pointer to a function of type i32 (ptr, ...), but the fnty itself is not a pointer type.

llvm-pretty-bc-parser (as well as downstream libraries like crucible-llvm) have propped up the mistaken belief that the fnty in a call/callbr should be a pointer to a function type, rather than a direct function type. We have managed to get away with this up until now, but this will cause issues when transitioning over to opaque pointers. This is because there are some operations that extract the pointee type of a pointer, which is not possible to do with an opaque pointer. See, for instance, this code in crucible-llvm.

Long story short: in order to support opaque pointers in the future, we will need to change the fnty convention used within llvm-pretty-bc-parser and crucible-llvm. Unless I am missing something, this change can be done indepedently of implementing opaque pointers themselves.

@RyanGlScott RyanGlScott changed the title invoke's function should have a pointer type call/callbr's function should not be a pointer type Mar 20, 2023
@RyanGlScott RyanGlScott changed the title call/callbr's function should not be a pointer type call/callbr's function type should not be a pointer type Mar 20, 2023
RyanGlScott added a commit that referenced this issue Mar 20, 2023
See `Note [Typing function applications]` for the rationale.

This fixes #189. This is also an important step towards being able to support
opaque pointers, as described in #177.
RyanGlScott added a commit to GaloisInc/crucible that referenced this issue Mar 20, 2023
There are a handful of places in `crucible-llvm` where we assume that the
`Call` and `CallBr` instructions have pointer types. As noted in
GaloisInc/llvm-pretty-bc-parser#189, however, this is subtly incorrect. This
patch:

* Bumps the `llvm-pretty-bc-parser` submodule to bring in the changes from
  GaloisInc/llvm-pretty-bc-parser#207, which corrects this mistake, and
* Changes the parts of `crucible-llvm` that match on pointer type to no longer
  do so.
RyanGlScott added a commit to GaloisInc/crucible that referenced this issue Mar 20, 2023
There are a handful of places in `crucible-llvm` where we assume that the
`Call` and `CallBr` instructions have pointer types. As noted in
GaloisInc/llvm-pretty-bc-parser#189, however, this is subtly incorrect. This
patch:

* Bumps the `llvm-pretty-bc-parser` submodule to bring in the changes from
  GaloisInc/llvm-pretty-bc-parser#207, which corrects this mistake.
* Bumps the `llvm-pretty` submodule to be in tandem with
  `llvm-pretty-bc-parser`.
* Changes the parts of `crucible-llvm` that match on pointer type to no longer
  do so.
RyanGlScott added a commit to GaloisInc/llvm-pretty that referenced this issue Mar 21, 2023
Previously, the code in `ppCallSym` assumed that the function type stored in a
`call` instruction was a pointer type, but this should actually be a raw
function type. See the discussion at
GaloisInc/llvm-pretty-bc-parser#189 (comment)
for more on this.  We now assume the convention of raw function types in `call`
instructions, as well as the related `callbr` and `invoke` instructions.

This addresses one part of #102.
RyanGlScott added a commit that referenced this issue Mar 21, 2023
See `Note [Typing function applications]` for the rationale.

This fixes #189. This is also an important step towards being able to support
opaque pointers, as described in #177.
RyanGlScott added a commit to GaloisInc/llvm-pretty that referenced this issue Apr 3, 2023
Previously, the code in `ppCallSym` assumed that the function type stored in a
`call` instruction was a pointer type, but this should actually be a raw
function type. See the discussion at
GaloisInc/llvm-pretty-bc-parser#189 (comment)
for more on this.  We now assume the convention of raw function types in `call`
instructions, as well as the related `callbr` and `invoke` instructions.

This addresses one part of #102.
RyanGlScott added a commit that referenced this issue Apr 3, 2023
See `Note [Typing function applications]` for the rationale. I have added a
`T189.ll` test case to ensure that we actually practice what we preach in that
Note.

This fixes #189. This is also an important step towards being able to support
opaque pointers, as described in #177.
RyanGlScott added a commit that referenced this issue Apr 3, 2023
See `Note [Typing function applications]` for the rationale. I have added a
`T189.ll` test case to ensure that we actually practice what we preach in that
Note.

This fixes #189. This is also an important step towards being able to support
opaque pointers, as described in #177.
RyanGlScott added a commit to GaloisInc/crucible that referenced this issue Apr 3, 2023
There are a handful of places in `crucible-llvm` where we assume that the
`Call` and `CallBr` instructions have pointer types. As noted in
GaloisInc/llvm-pretty-bc-parser#189, however, this is subtly incorrect. This
patch:

* Bumps the `llvm-pretty-bc-parser` submodule to bring in the changes from
  GaloisInc/llvm-pretty-bc-parser#207, which corrects this mistake.
* Bumps the `llvm-pretty` submodule to be in tandem with
  `llvm-pretty-bc-parser`. Note that because this brings in the changes from
  GaloisInc/llvm-pretty#104, I opted to explicitly disallow LLVM's opaque
  pointers in `crucible-llvm` for now, but this is something that will likely
  change in subsequent patches.
* Changes the parts of `crucible-llvm` that match on pointer types to no longer
  do so.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant