Use Map-based data structure for LLVM allocation histories. #550

brianhuffman · 2020-10-01T00:19:20Z

Instead of recording allocations sequentially, we now store
collections of allocations together in a map indexed by block
ID. This makes lookups much faster in the common case where
the block ID we are checking is concrete.

Fixes #549.
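
For readers skimming the thread, here is a minimal sketch of the idea, not the actual crucible-llvm types. The names `BlockId`, `AllocInfo`, `SeqLog`, `MapLog`, `recordAlloc`, `lookupSeq`, and `lookupMap` are hypothetical and exist only for this illustration. It contrasts a sequential log, where a lookup scans every entry, with a map keyed by concrete block ID.

```haskell
import qualified Data.Map.Strict as Map
import Numeric.Natural (Natural)

-- Hypothetical stand-ins for illustration; not the crucible-llvm definitions.
type BlockId = Natural

data AllocInfo = AllocInfo
  { allocSize    :: Integer
  , allocMutable :: Bool
  } deriving (Eq, Show)

-- Sequential representation: most recent allocation first.
newtype SeqLog = SeqLog [(BlockId, AllocInfo)]

-- Lookup walks the list, so it is linear in the number of allocations.
lookupSeq :: BlockId -> SeqLog -> Maybe AllocInfo
lookupSeq b (SeqLog entries) = lookup b entries

-- Map-based representation: allocations indexed by concrete block ID.
newtype MapLog = MapLog (Map.Map BlockId AllocInfo)

emptyMapLog :: MapLog
emptyMapLog = MapLog Map.empty

recordAlloc :: BlockId -> AllocInfo -> MapLog -> MapLog
recordAlloc b info (MapLog m) = MapLog (Map.insert b info m)

-- Lookup is logarithmic when the block ID is a concrete number.
lookupMap :: BlockId -> MapLog -> Maybe AllocInfo
lookupMap b (MapLog m) = Map.lookup b m
```

The sketch only covers the concrete case. A symbolic block ID cannot be used as a map key, so a real implementation still needs a fallback path for it.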

robdockins

A couple of minor comments about the data structures involved, but this otherwise looks OK to me.

robdockins · 2020-10-02T00:20:10Z

crucible-llvm/src/Lang/Crucible/LLVM/MemModel/MemLog.hs

+
+instance Semigroup (MemAllocs sym) where
+ (MemAllocs lhs_allocs) <> (MemAllocs rhs_allocs)
+ | Just (lhs_head_allocs, Allocations lhs_m) <- List.unsnoc lhs_allocs


If this semigroup operation is called a lot, it may be worth optimizing access to the tail of the list by using a double-ended data structure such as Data.Sequence.

In practice, I believe this semigroup operation is really only used to cons new operations onto the front, so a list probably isn't too bad. Changing the datatype would also require changing all the functions that traverse MemAllocs, which currently use pattern matching on the list constructors. On the other hand, making that traversal code more abstract wouldn't be a bad thing.

In either case, if we switch MemAllocs to use Data.Sequence, we should probably do MemWrites as well.
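
As a rough illustration of the Data.Sequence option discussed here, the sketch below shows a semigroup that merges the last chunk of the left log with the first chunk of the right log when both are maps. The `Entry` and `Log` types are hypothetical simplifications, not the PR's `MemAllocs`. With `Seq`, inspecting either end is constant time, whereas `List.unsnoc` on a list is linear in its length.

```haskell
import qualified Data.Map.Strict as Map
import Data.Sequence (Seq, ViewL(..), ViewR(..), (|>), (><))
import qualified Data.Sequence as Seq

-- Hypothetical, simplified log entry: either a batch of allocations
-- keyed by block ID, or a single free event.
data Entry
  = Allocs (Map.Map Int String)
  | Free Int
  deriving (Eq, Show)

newtype Log = Log (Seq Entry)
  deriving (Eq, Show)

instance Semigroup Log where
  Log l <> Log r =
    case (Seq.viewr l, Seq.viewl r) of
      -- Adjacent allocation batches are fused into one map.
      -- Map.union is left-biased, so entries from the left log win on a clash.
      (l' :> Allocs a, Allocs b :< r') ->
        Log ((l' |> Allocs (Map.union a b)) >< r')
      -- Otherwise the two sequences are simply concatenated.
      _ -> Log (l >< r)

instance Monoid Log where
  mempty = Log Seq.empty
```

Whether this pays off depends on how often the operation appends to a long left-hand log; for the cons-onto-the-front pattern described above, a plain list behaves about as well.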

I'd be in favor, generally, of making those traversals and such more abstract to provide us with more flexibility in the future with representation choices. I don't know if it should be part of this PR or not.

Revision 151e1f9 makes MemAllocs into an abstract datatype, so now all the traversals are done using the abstract interface provided by Lang.Crucible.LLVM.MemModel.MemLog.
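
To illustrate what an abstract interface buys, here is a small sketch under stated assumptions: the type `AllocLog` and the functions `emptyLog`, `allocLog`, `isAllocated`, and `totalSize` are invented for this example and are not the functions exported by Lang.Crucible.LLVM.MemModel.MemLog. Callers use only the functions, so the representation behind the newtype can change without touching them.

```haskell
import qualified Data.Map.Strict as Map

-- In a real module the export list would name the type without its
-- constructor, for example:
--   module AllocLogSketch (AllocLog, emptyLog, allocLog, isAllocated, totalSize) where
-- so that client code cannot pattern match on the representation.

-- Hypothetical log: block ID mapped to allocation size in bytes.
newtype AllocLog = AllocLog (Map.Map Int Integer)

emptyLog :: AllocLog
emptyLog = AllocLog Map.empty

-- Record an allocation of the given size for a block ID.
allocLog :: Int -> Integer -> AllocLog -> AllocLog
allocLog b size (AllocLog m) = AllocLog (Map.insert b size m)

-- Query whether a block ID has been allocated.
isAllocated :: Int -> AllocLog -> Bool
isAllocated b (AllocLog m) = Map.member b m

-- A traversal expressed through the interface rather than by matching
-- on list constructors.
totalSize :: AllocLog -> Integer
totalSize (AllocLog m) = sum (Map.elems m)
```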

crucible-llvm/src/Lang/Crucible/LLVM/MemModel/MemLog.hs

andreistefanescu

This is a substantial improvement to how allocation info is stored. Let's take this opportunity to add a few tests.

crucible-llvm/src/Lang/Crucible/LLVM/MemModel/Generic.hs

brianhuffman · 2020-10-02T15:15:15Z

I could use some advice/suggestions for how to make some proper tests for this code.

brianhuffman · 2020-10-02T19:19:47Z

It looks like the macOS CI builds are failing in the same way the saw-script builds did recently:

inplace-ghc8.6.5.dylib: unknown file type, first eight bytes: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

I think @jared-w managed to fix the problem for saw-script by changing some cache settings; we probably need to do the same for the crucible repo.

hazelweakly · 2020-10-02T20:01:07Z

Unfortunately, the only reliable fix I've found so far is to invalidate the cache by changing the cache key in the GitHub Actions file and checking whether the build recovers. I've already narrowed the cache as far as I can.

The next thing on my list is to try cabal-cache, which has more advanced caching logic than just grabbing all of dist-newstyle.

andreistefanescu · 2020-10-08T21:02:29Z

Regarding tests, take a look at testMemWritesIndexed in crucible/crucible-llvm/test/Tests.hs (line 479 in b8a466a):

 testMemWritesIndexed = testCase "indexed memory writes" $ withMem BigEndian $ \sym mem0 -> do

Let's add a few tests using doAlloc and/or the allocation API.

robdockins · 2020-10-30T20:51:14Z

I agree that adding tests for this is worthwhile, but I don't think we should let this PR linger unmerged.

brianhuffman · 2020-11-07T04:07:06Z

9a304ce adds a few simple tests alongside testMemWritesIndexed, as suggested.

brianhuffman marked this pull request as ready for review October 1, 2020 03:50

brianhuffman requested review from andreistefanescu and robdockins October 1, 2020 03:50

brianhuffman mentioned this pull request Oct 1, 2020

Update crucible submodule. GaloisInc/saw-script#854

Closed

robdockins approved these changes Oct 2, 2020

andreistefanescu suggested changes Oct 2, 2020

brianhuffman force-pushed the MemAllocs branch from fd5d97f to 151e1f9 Compare October 8, 2020 19:48

Brian Huffman added 8 commits November 6, 2020 18:11

Factor out function isAllocatedGeneric.

8daeca4

Give local functions more sensible names.

c69cad9

Document stronger invariant for MemAllocs type.

017973f

Make MemAllocs into an abstract datatype.

ceb2be8

All operations that traverse a MemAllocs log are now implemented within the Lang.Crucible.LLVM.MemModel.MemLog module.

Update doc-string for MemAllocs to omit internal details.

697f8df

Internal invariants of this abstract datatype are moved to an ordinary comment because they are not relevant to the exported API.

Remove commented-out line.

5a3064e

Add some simple alloc/free tests for the llvm memory model.

9a304ce

brianhuffman force-pushed the MemAllocs branch from dd4cb31 to 9a304ce Compare November 7, 2020 04:00

andreistefanescu approved these changes Nov 7, 2020

brianhuffman requested a review from andreistefanescu November 7, 2020 13:50

andreistefanescu approved these changes Nov 7, 2020

brianhuffman merged commit 4ec5674 into master Nov 9, 2020

travitch deleted the MemAllocs branch December 16, 2021 18:52
