Fix SVF integration #412

lzaoral · 2021-10-20T19:41:39Z

No description provided.

lzaoral · 2021-10-22T19:08:15Z

@mchalupa This is the current state of the few failing tests:

funcptr12 -- With SVF the definition of struct callbacks cb ends up being undefined:
```
@cb = dso_local global %struct.callbacks undef, align 8
```
Thus, lli segfaults when it tries to call the function whose address is undefined as well.
bitcast5 -- This one doesn't fail in the CI but does on my machine as apparently Ubuntu doesn't build it's packages with -fstack-clash-protection -fcf-protection as opposed to Arch. lli crashes with *** stack smashing detected ***: terminated.

In fact, the original code's behaviour is really undefined so it's just a coincidence that it emerged when the code was sliced by SVF. By manual inspection, SVF's analysis seems to be correct as only instructions having no impact on the slicing criterion were removed.

The problematic code:

dg/tests/slicing/sources/bitcast5.c

Lines 4 to 6 in a9c6101

char a[] = "Hello, world";

int *p = (int *) a + 3;

*p = 0;

Assume that we are on a platform where sizeof(int) == 4 * sizeof(char) holds. Apart from the violation of strict aliasing rules [1] (same for bitcast4), even if we compile with -fno-strict-aliasing, we try to zero out the 12th to 16th element of the array of size 13 as type cast has higher precedence than addition [2].

EDIT: Fixed by Refactor slicing test runner and get rid of UB in tests #413.
EDIT2: There was actually even another problem because taking a pointer to an address that's not aligned is also UB.

OT: I suggest that we should check that every test input is UB-free. Otherwise, there is no guarantee what's going to happen when we interpret the sliced off code with lli assuming that llvm-slicer did nothing wrong.

[1] https://gist.github.com/shafik/848ae25ee209f698763cffee272a58f8#what-does-the-c11-standard-say
[2] https://en.cppreference.com/w/c/language/operator_precedence
dynalloc5 -- Following line is sliced off, thus, the assertion i->a == 13 fails:

dg/tests/slicing/sources/dynalloc5.c

Line 12 in a9c6101

i->a = 13;

vararg3 -- After slicing, only empty main remains:

define dso_local i32 @main() #0 {
entry:
  ret i32 0
}

vararg4 -- The variadic function call that sets value of a is completely sliced-off.

dg/tests/slicing/sources/vararg4.c

Lines 20 to 25 in a9c6101

int main(void) {

int a = 0;

setv(1, foo, &a);

test_assert(a == 17);

return 0;

}

mchalupa · 2021-11-01T06:18:17Z

The failing tests show that the points-to sets computed with -pta svf are most probably incomplete. Comparing the output of

llvm-pta-dump testX.bc

with and without -pta svf would probably give a hint where is the problem.

lzaoral · 2021-11-01T14:53:55Z

Might be unrelated but llvm-pta-dump without any flags prints the following warning:

$ build/tools/llvm-pta-dump funcptr12.ll
Pointer Subgraph is broken (right after building)!
Invalid number of edges:
PSNodeType::STORE with ID 7
  - operands: [5 PSNodeType::FUNCTION, 6 PSNodeType::CONSTANT]
(Non-entry node has no predecessors)
Invalid number of edges:
PSNodeType::STORE with ID 10
  - operands: [8 PSNodeType::FUNCTION, 9 PSNodeType::CONSTANT]
(Non-entry node has no predecessors)
Unreachable node:
PSNodeType::STORE with ID 7
  - operands: [5 PSNodeType::FUNCTION, 6 PSNodeType::CONSTANT]
Unreachable node:
PSNodeType::STORE with ID 10
  - operands: [8 PSNodeType::FUNCTION, 9 PSNodeType::CONSTANT]
INFO: Pointer analysis took 0 sec 7 ms
...

lzaoral · 2021-11-01T15:56:10Z

`funcptr12`

It looks like the SVF analysis might be somewhat incomplete, the last six instructions are not present in the output. However, they are still present in the final sliced code which has the both foo functions completely sliced off even though SVF correctly computed that main has a pointer to fun2.

Default:

foo2::  %3 = alloca i32*, align 8
  -> foo2::  %3 = alloca i32*, align 8
foo2::  %4 = alloca i32*, align 8
  -> foo2::  %4 = alloca i32*, align 8
foo2::  %5 = load i32*, i32** %4, align 8
  -> main::  %3 = alloca i32, align 4
main::  %1 = alloca i32, align 4
  -> main::  %1 = alloca i32, align 4
main::  %2 = alloca i32, align 4
  -> main::  %2 = alloca i32, align 4
main::  %3 = alloca i32, align 4
  -> main::  %3 = alloca i32, align 4
main::  %4 = alloca i32*, align 8
  -> main::  %4 = alloca i32*, align 8
main::  %5 = alloca i64, align 8
  -> main::  %5 = alloca i64, align 8
main::  %6 = alloca i32* (i32*, i32*)*, align 8
  -> main::  %6 = alloca i32* (i32*, i32*)*, align 8
main::  %7 = load i32* (i32*, i32*)*, i32* (i32*, i32*)** getelementptr inbounds (%struct.callbacks, %struct.callbacks* @cb, i32 0, i32 1), align 8
  -> fun 'foo2'
main::  %8 = ptrtoint i32* (i32*, i32*)* %7 to i64
  -> fun 'foo2'
main::  %9 = load i64, i64* %5, align 8
  -> fun 'foo2'
main::  %10 = inttoptr i64 %9 to i32* (i32*, i32*)*
  -> fun 'foo2'
main::  %11 = load i32* (i32*, i32*)*, i32* (i32*, i32*)** %6, align 8
  -> fun 'foo2'
main::  %12 = call i32* %11(i32* %2, i32* %3)
  -> main::  %3 = alloca i32, align 4
main::  %13 = load i32*, i32** %4, align 8
  -> main::  %3 = alloca i32, align 4

SVF:

foo1::  %3 = alloca i32*, align 8
  -> foo1::  %3 = alloca i32*, align 8
foo1::  %4 = alloca i32*, align 8
  -> foo1::  %4 = alloca i32*, align 8
foo2::  %3 = alloca i32*, align 8
  -> foo2::  %3 = alloca i32*, align 8
foo2::  %4 = alloca i32*, align 8
  -> foo2::  %4 = alloca i32*, align 8
main::  %1 = alloca i32, align 4
  -> main::  %1 = alloca i32, align 4
main::  %2 = alloca i32, align 4
  -> main::  %2 = alloca i32, align 4
main::  %3 = alloca i32, align 4
  -> main::  %3 = alloca i32, align 4
main::  %4 = alloca i32*, align 8
  -> main::  %4 = alloca i32*, align 8
main::  %5 = alloca i64, align 8
  -> main::  %5 = alloca i64, align 8
main::  %6 = alloca i32* (i32*, i32*)*, align 8
  -> main::  %6 = alloca i32* (i32*, i32*)*, align 8
main::  %7 = getelementptr %struct.callbacks, %struct.callbacks* @cb, i32 0, i32 1
  -> @cb = dso_local global %struct.callbacks { i32* (i32*, i32*)* @foo1, i32* (i32*, i32*)* @foo2 }, align 8
main::  %8 = load i32* (i32*, i32*)*, i32* (i32*, i32*)** %7, align 8
  -> fun 'foo2'

EDIT:

It looks like SVF does not properly handle the ptrtoint -> inttoptr conversion. Consider the following reduced case

#include <stdint.h>

int main(void)
{
    int a;
    uintptr_t ptr = (uintptr_t) &a;
    * (int *) ptr = 1;
}

and corresponding PAG:

So we don't actually know after the conversion where the pointer points to.

While this is something that can surely be improved on SVF's side, I think this is a bug in the slicer. It seems to be too aggressive to slice away both func1 and func2 in the functptr12 test when the analysis says that the pointer we get after inttoptr can point to literally anything.

lzaoral · 2021-11-06T11:25:56Z

`dynalloc5`

The reduced case:

#include <stddef.h>

extern void* malloc(size_t);
extern int assert(_Bool);

struct foo {
  int a;
};

void* (*malloc2)(size_t) = malloc;

int main(void)
{
    struct foo *ptr = malloc2(sizeof(struct foo));
    ptr->a = 1;
    assert(ptr->a == 1);
}

and corresponding PAG:

llvm-slicer -c assert --anotate=pta,slicer -pta=svf:

; ModuleID = 'test.ll'
source_filename = "test.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

%struct.foo = type { i32 }

@malloc2 = dso_local global i8* (i64)* @malloc, align 8

; -- Generated by llvm-slicer --
;   * slicing criteria: ''
;   * legacy slicing criteria: 'assert'
;   * legacy secondary slicing criteria: ''
;   * forward slice: '0'
;   * remove slicing criteria: '0'
;   * undefined functions behavior: 'read args'
;   * pointer analysis: ;   * PTA field sensitivity: full

declare i8* @malloc(i64) #0

; Function Attrs: noinline nounwind optnone sspstrong uwtable
define dso_local i32 @main() #1 {
  ; PTR:   %1 = alloca %struct.foo*, align 8 + ?
  %1 = alloca %struct.foo*, align 8
  ; PTR: malloc + ?
  %2 = load i8* (i64)*, i8* (i64)** @malloc2, align 8
  %3 = call i8* %2(i64 4)
  %4 = bitcast i8* %3 to %struct.foo*
  store %struct.foo* %4, %struct.foo** %1, align 8
  ; x   %5 = load %struct.foo*, %struct.foo** %1, align 8
  ; x   %6 = getelementptr inbounds %struct.foo, %struct.foo* %5, i32 0, i32 0
  ; x   store i32 1, i32* %6, align 4
  %7 = load %struct.foo*, %struct.foo** %1, align 8
  %8 = getelementptr inbounds %struct.foo, %struct.foo* %7, i32 0, i32 0
  ; unknown
  %9 = load i32, i32* %8, align 4
  ; unknown
  %10 = icmp eq i32 %9, 1
  ; unknown
  ; SLICING CRITERION
  %11 = call i32 @assert(i1 zeroext %10)
  ; x   ret i32 0
}

declare i32 @assert(i1 zeroext) #0

I think that this is once again an error in the slicer as the PAG clearly shows that both registers %5 and %7 point to the same object and apart from that the subsequent GEPs are identical. Thus, the %9 will contain the 1 stored by the store that was sliced away.

However, If you use a plain malloc instead of the alias, the result is correct.

lzaoral · 2021-11-06T17:14:22Z

`vararg{3,4}`

Call graph produced by SVF is wrong as it does not seem to support function pointers passed as arguments to variadic functions. In vararg3 the foo function is not marked as being call from the variadic function setv (the graph for vararg4 is almost the same):

Thus, the whole setv function gets sliced away.

mchalupa · 2021-11-09T16:20:37Z

It looks like SVF does not properly handle the ptrtoint -> inttoptr conversion. Consider the following reduced case
#include <stdint.h>

int main(void)
{
    int a;
    uintptr_t ptr = (uintptr_t) &a;
    * (int *) ptr = 1;
}
and corresponding PAG: So we don't actually know after the conversion where the pointer points to.

While this is something that can surely be improved on SVF's side, I think this is a bug in the slicer. It seems to be too aggressive to slice away both func1 and func2 in the functptr12 test when the analysis says that the pointer we get after inttoptr can point to literally anything.

Yes, the slicer should say that %5 actually points to unknown memory. The problem may be here: https://github.com/mchalupa/dg/blob/master/include/dg/llvm/PointerAnalysis/SVFPointerAnalysis.h#L69 . We test only the black hole object, but the inttoptr is assigned from a dummy object.

mchalupa · 2021-11-09T16:28:21Z

dynalloc5

...

I think that this is once again an error in the slicer as the PAG clearly shows that both registers %5 and %7 point to the same object and apart from that the subsequent GEPs are identical. Thus, the %9 will contain the 1 stored by the store that was sliced away.

However, If you use a plain malloc instead of the alias, the result is correct.

From the points-to sets provided by SVF, we can find out that the pointer points to some dummy object, but, SVF does not propagate the information about what instruction allocated this dummy memory object: https://github.com/SVF-tools/SVF/blob/master/lib/WPA/Andersen.cpp#L644
And that is why it is not reflected in the DG's points-to set. Maybe we could get this information from PAG, as you suggest. (But I am not sure if it can be done in general, I mean without any complicated analysis).

mchalupa · 2021-11-09T16:33:13Z

vararg{3,4}

Call graph produced by SVF is wrong as it does not seem to support function pointers passed as arguments to variadic functions. In vararg3 the foo function is not marked as being call from the variadic function setv (the graph for vararg4 is almost the same):

I'm not sure what we can do here. We can always use our lazy-callgraph which overapproximates the called functions, but for that it would be good to detect that the call graph (or just the points-to set of the function pointer) is incomplete. That is, for any variadic function, we would need to check if some of its actual arguments can point to a function and for such functions we use information from our call graph (possibly filtered with the information about actual arguments).

If `buildSymbolTableInfo` is not called on the built `SVFModule`, `PAGBuilder` tries to dereference a `nullptr` which results in a segmentation violation. Related UBSAN report: SVF/lib/SVF-FE/ICFGBuilder.cpp:43:59: runtime error: member call on null pointer of type 'struct SVFModule' 0 0x55ce7156a1b0 in SVF::ICFGBuilder::build(SVF::SVFModule*) SVF/lib/SVF-FE/ICFGBuilder.cpp:43 1 0x55ce7136cbe5 in SVF::PAG::PAG(bool) SVF/lib/Graphs/PAG.cpp:367 2 0x55ce7120a576 in SVF::PAG::getPAG(bool) (dg/build/tools/llvm-slicer+0xc73576) 3 0x55ce7120a9d6 in SVF::PAGBuilder::PAGBuilder() (dg/build/tools/llvm-slicer+0xc739d6) 4 0x55ce7120f6d8 in dg::SVFPointerAnalysis::run() (dg/build/tools/llvm-slicer+0xc786d8) 5 0x55ce7120ffab in dg::llvmdg::LLVMDependenceGraphBuilder::_runPointerAnalysis() (dg/build/tools/llvm-slicer+0xc78fab) 6 0x55ce71213d56 in dg::llvmdg::LLVMDependenceGraphBuilder::constructCFGOnly() (dg/build/tools/llvm-slicer+0xc7cd56) 7 0x55ce712273ad in Slicer::buildDG(bool) (dg/build/tools/llvm-slicer+0xc903ad) 8 0x55ce711d5ef4 in main dg/tools/llvm-slicer.cpp:248

* Apply relevant clang-tidy suggestions * Fix typos * Simplify some if-else expressions

lzaoral · 2021-11-18T16:52:15Z

Thanks @mchalupa. The funcptr12 and vararg{3,4} tests now pass thanks to your suggestion to use dg's lazy call graph!

lzaoral mentioned this pull request Oct 20, 2021

Refactor slicing test runner and get rid of UB in tests #413

Merged

lzaoral force-pushed the fix-svf-integration branch from 765b38d to a0d0481 Compare October 21, 2021 18:05

mchalupa mentioned this pull request Nov 9, 2021

Add missing SVF initialization step #414

Closed

lzaoral added 4 commits November 18, 2021 17:49

CI: update SVF to latest upstream release

31561fd

tests/slicing/test-runner.py: fix integration with SVF

16e8e40

dg/llvm/pta/SVFPointerAnalysis.h: refactor

5a1a85c

* Apply relevant clang-tidy suggestions * Fix typos * Simplify some if-else expressions

lzaoral force-pushed the fix-svf-integration branch from a0d0481 to 338843b Compare November 18, 2021 16:49

SVF: Use our CallGraph when SVF's is not sufficient

01c0d7d

lzaoral force-pushed the fix-svf-integration branch from 338843b to 01c0d7d Compare November 18, 2021 16:52

mchalupa mentioned this pull request Dec 13, 2022

Please update SVFPointerAnalysis.h in dg due to SVF updated. #447

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix SVF integration #412

Fix SVF integration #412

lzaoral commented Oct 20, 2021

lzaoral commented Oct 22, 2021 •

edited

Loading

mchalupa commented Nov 1, 2021

lzaoral commented Nov 1, 2021

lzaoral commented Nov 1, 2021 •

edited

Loading

lzaoral commented Nov 6, 2021

lzaoral commented Nov 6, 2021

mchalupa commented Nov 9, 2021

mchalupa commented Nov 9, 2021

`dynalloc5`

mchalupa commented Nov 9, 2021

`vararg{3,4}`

lzaoral commented Nov 18, 2021

Fix SVF integration #412

Are you sure you want to change the base?

Fix SVF integration #412

Conversation

lzaoral commented Oct 20, 2021

lzaoral commented Oct 22, 2021 • edited Loading

mchalupa commented Nov 1, 2021

lzaoral commented Nov 1, 2021

lzaoral commented Nov 1, 2021 • edited Loading

funcptr12

lzaoral commented Nov 6, 2021

dynalloc5

lzaoral commented Nov 6, 2021

vararg{3,4}

mchalupa commented Nov 9, 2021

mchalupa commented Nov 9, 2021

dynalloc5

mchalupa commented Nov 9, 2021

vararg{3,4}

lzaoral commented Nov 18, 2021

lzaoral commented Oct 22, 2021 •

edited

Loading

lzaoral commented Nov 1, 2021 •

edited

Loading

`funcptr12`

`dynalloc5`

`vararg{3,4}`

`dynalloc5`

`vararg{3,4}`