-
Notifications
You must be signed in to change notification settings - Fork 343
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Tree Borrows Provenance GC compact the tree #3837
Make Tree Borrows Provenance GC compact the tree #3837
Conversation
pub fn can_be_replaced_by_child(self, child: Self) -> bool { | ||
match self.inner.partial_cmp(&child.inner) { | ||
Some(Ordering::Less) | Some(Ordering::Equal) => true, | ||
Some(Ordering::Greater) | None => false, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the magic function. I think that this is sound, i.e. if it returns true
, then the child can indeed replace the parent.
Here is an example: The parent is Active
, and the child is Frozen
. On a foreign write, both go to Disabled
. On a foreign read, both go to Frozen
. On a child read, both remain; and the Frozen
tag blocks child writes. So the Active
has no effect and can be compacted by replacing it with its child.
pub fn can_be_replaced_by_child(self, child: Self) -> bool { | ||
match self.inner.partial_cmp(&child.inner) { | ||
Some(Ordering::Less) | Some(Ordering::Equal) => true, | ||
Some(Ordering::Greater) | None => false, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't think this function is complete. For once, if the parent is ReservedIM
, and the child is Reserved
, we can still replace the parent by the child, whereas partial_cmp
says that they are incomparable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the handling of Reserved {conflicted: true}
parents is very subtle:
- Usually, it's not OK to replace a conflicted
Reserved
by anActive
. But aReserved
never hasActive
children anyways (as allActive
haveActive
parents), and further, since we know the parent is not protected, we know that the conflictedness does not matter anyway - This exposes another way the method is incomplete: a conflicted
Reserved
parent can be replaced by a non-conflictedReserved
child, since the parent's conflictedness no longer matters, as the parent is not protected. But this is not howpartial_cmp
works.
(Note that I am highlighting all these cases to show that the reasoning here is indeed subtle in many cases)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
partial_cmp
on permissions denotes reachability, and is not by definition tied to the amount of UB that the permission can produce. I like the idea of can_be_replaced_by_child
but I think the above implementation suggests that the replaceability derives from partial_cmp
when it's actually very dependent on invariants of the tree. I think it should be decoupled from partial_cmp
and hand-written to explicitly show all compactable cases with their justification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example the fact that you can compact an Active
parent with a Frozen
child is entirely dependent on whether the Active
is protected, so there are at minimum many unwritten assumptions in can_be_replaced_by_child
about the fact that protected tags can't be GC'd and the valid parent-child combinations.
If we make the match
explicit we can add a sanity assert(!parent_prot)
in the parent: Active, child: Frozen
case.
Alright, the diff had cut off the comment above so I hadn't seen that the parent not being protected is at least documented. I still think we should decouple replaceability from reachability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we do not remove protected parents. Such parents should never be GC'd anyways since they still have a protector end write coming.
Note that I do think there's a deeper connection between reachability and replaceability: Permissions that come "later" always have "less" permissions than those that come earlier. It should always be sound to replace a node by one that's further ahead in the state machine (protectors notwithstanding). While this is not "by definition," it is the underlying principle making TB work. But I agree that it should probably be decoupled in code.
if global.borrow().protected_tags.contains_key(&child.tag) { | ||
// todo think really hard whether we can't just handle this as we would regularly | ||
return None; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this check is necessary. I reckon that it is not, i.e. we can merge protected tags up. Part of the intuition is:
- Almost all the protector does is make more transitions UB. Since we do not remove the protector, these same transitions remain UB
- The one exception is that
Reserved
becomes conflicted on foreign accesses -- but this should not cause any issues, since the foreign read still happens similarly on the child. - We still know the parent is not protected
We should probably have a definition of Let
|
The tests are failing for several reasons, all related to the fact that it's a "GC-stress" mode test where the GC runs every tick. First, The other tests because the GC messes with the tree structure, combined with a suboptimal tree visiting order when checking accesses. To check an access, one currently starts at the root, and first checks that all nodes from there downwards support a child access. If the violating node (i.e. one not supporting the access) is such a parent, the error message will point this out (and print information about both tags). But quite often, this parent is simply the immediate parent of the node that was accessed, and not used otherwise, so the GC has merged it with the child. Since then it's not a violating parent but the node itself, the error message changes. And this makes the UI tests fail. My solution would be to change the iteration order to go upwards but not downwards, which would be resilient to GC tree compactings. The comments currently in code seem to imply that the current iteration order has a deeper meaning, but sadly without going any deeper. @Vanille-N do you know more? |
Sure. The current iteration order is of course a heuristic (TB doesn't specifically require an order to these) in an attempt to quickly identify the "root cause" of the issue, with a priority for more straightforward errors. Insufficient permissions are more straightforward than protector violations, so we check parents before children. As for the iteration order (top-down), the reasoning is that if we have I recognize that the combination of this iteration order and this tree compacting strategy can lead to nondeterministic error messages. Fixing that should take absolute priority over whatever diagnostics heuristics is implemented, and any iteration order will probably still give good enough messages. |
Finally, from what I understand (I haven't looked at the detail of your compaction logic yet), you seem to be keeping the child and deleting the parent. This is not the only compacting logic possible. The one I had in mind originally was to do the opposite : look at the parent to determine if we can delete the child. This removes the constraint of "if the parent is a singleton" because it preserves the ancestry relation even in branching trees.
into
I don't doubt that your bottom-up compacting can merge some configurations that a top-down approach cannot, and maybe those are more frequent, but the point is that the iteration order that I chose is compatible with the compacting logic that I initially had in mind. |
Also since wide trees cannot get very deep, it's probably fine to have a compacting logic that can't handle branching if we observe that issues come mostly from concrete examples where the tree is deep. |
Compared with that approach, doing it like I do here has several advantages:
Also note that the number of additional nodes that your approach would remove, but mine would not, is bounded: My approach produces trees at most twice as large as yours. The reason is that in a binary tree, the number of interior nodes is at most the number of leaves. So the efficiency gains offered by an even more collapsing GC are constant. |
Note that further optimizations are possible elsewhere to handle wide trees. For instance, those implemented in this branch, which makes reads and writes skip more subtrees. But I did not see them cause a huge speedup, and they also affect diagnostics in rare cases, so I did not PR them yet. |
Is this the right check? "Remain in simulation" is very weak. For instance, juts making |
Such an implementation would fail the second condition (e.g. if the parent is Note that whether or not something causes UB is the only observable action here. So formally, the parent must simulate the child, including the child's steps to UB. |
In preparation for rust-lang#3837, the tree traversal needs to be made bottom-up, because the current top-down tree traversal, coupled with that PR's changes to the garbage collector, can introduce non-deterministic error messages if the GC removes a parent tag of the accessed tag that would have triggered the error first. This is a breaking change for the diagnostics emitted by TB. The implemented semantics stay the same.
96e6c48
to
8d9e92b
Compare
In preparation for rust-lang#3837, the tree traversal needs to be made bottom-up, because the current top-down tree traversal, coupled with that PR's changes to the garbage collector, can introduce non-deterministic error messages if the GC removes a parent tag of the accessed tag that would have triggered the error first. This is a breaking change for the diagnostics emitted by TB. The implemented semantics stay the same.
8d9e92b
to
2f98d27
Compare
Your comment says they have to satisfy one of the conditions, and the first one is trivially satisfied. So it's okay to violate the second condition, the entire soundness criterion you are suggesting is still satisfied as far as I can tell. |
I mean you either move along or cause UB, you can't have both at the same time, so that's why it's "either." |
Ah, "move along" implies "without UB". :) |
☔ The latest upstream changes (presumably #3847) made this pull request unmergeable. Please resolve the merge conflicts. |
In preparation for rust-lang#3837, the tree traversal needs to be made bottom-up, because the current top-down tree traversal, coupled with that PR's changes to the garbage collector, can introduce non-deterministic error messages if the GC removes a parent tag of the accessed tag that would have triggered the error first. This is a breaking change for the diagnostics emitted by TB. The implemented semantics stay the same.
Make TB tree traversal bottom-up In preparation for #3837, the tree traversal needs to be made bottom-up, because the current top-down tree traversal, coupled with that PR's changes to the garbage collector, can introduce non-deterministic error messages if the GC removes a parent tag of the accessed tag that would have triggered the error first. This is a breaking change for the diagnostics emitted by TB. The implemented semantics stay the same.
2f98d27
to
3acb87a
Compare
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no longer cause a stack overflow. One test that motivated it was the test `fill::horizontal_line` in `tiny_skia`. But not causing stack overflows was not a large improvents, since it did not fix the fundamental issue: The tree was too large. The test now ran, but it required gigabytes of memory and hours of time, whereas it finishes within seconds in Stacked Borrows. The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate a slice in chunks. That iterator is written to reborrow at each call to `next`, which creates a linear tree with a bunch of intermediary nodes, which also fragments the `RangeMap` for that allocation. The solution is to now compact the tree, so that these interior nodes are removed. Care is taken to not remove nodes that are protected, or that otherwise restrict their children.
3acb87a
to
1cd7317
Compare
I squashed all the comments here. The reason is that I made some changes to the overall garbage collector infrastructure in the first commit here, that I undid in one of the later commits (because turns out it's unnecessary), but when rebasing to the master branch where the GC also changed, git got confused, and I could not be bothered to correctly replay my changes as I knew that a later commit would undo them. |
a395a0d
to
203a7a3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still have to check the tests
// Active can be replaced by Frozen, since it is not protected | ||
(Active, Frozen) => true, | ||
(Active, Disabled) => true, | ||
// Frozen can only be replaced by Disabled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Frozen can only be replaced by Disabled | |
// Frozen can only be replaced by Disabled (and itself). |
(Active, ReservedIM) => false, | ||
(Active, ReservedFrz { .. }) => false, | ||
(Active, Active) => true, | ||
// Active can be replaced by Frozen, since it is not protected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Active can be replaced by Frozen, since it is not protected | |
// Active can be replaced by Frozen, since it is not protected. |
(Frozen, Frozen) => true, | ||
(Frozen, Disabled) => true, | ||
(Frozen, _) => false, | ||
// Disabled can not be replaced by anything else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Disabled can not be replaced by anything else | |
// Disabled can not be replaced by anything else. |
@@ -128,6 +128,22 @@ impl LocationState { | |||
Ok(transition) | |||
} | |||
|
|||
/// Like `perform_access`, but ignores the diagnostics, and also is pure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Like `perform_access`, but ignores the diagnostics, and also is pure. | |
/// Like `perform_access`, but ignores the concrete error cause and also uses state-passing | |
/// rather than a mutable reference. |
/// should have no children, but this is not checked, so that nodes | ||
/// whose children were rotated somewhere else can be deleted without | ||
/// having to first modify them to clear that array. | ||
/// otherwise (i.e. the GC should have marked it as removable). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last line doesn't sound like a sentence...?
Looks good, thanks. :) @bors r+ |
☀️ Test successful - checks-actions |
Follow-up on #3833 and #3835. In these PRs, the TB GC was fixed to no longer cause a stack overflow. One test that motivated it was the test
fill::horizontal_line
intiny-skia
. But not causing stack overflows was not a large improvents, since it did not fix the fundamental issue: The tree was too large. The test now ran, but it required gigabytes of memory and hours of time (only for it to be OOM-killed 🤬), whereas it finishes within 24 seconds in Stacked Borrows. With this merged, it finishes in about 40 seconds under TB.The problem in that test was that it used
slice::chunked
to iterate a slice in chunks. That iterator is written to reborrow at each call tonext
, which creates a linear tree with a bunch of intermediary nodes, which also fragments theRangeMap
for that allocation.The solution is to now compact the tree, so that these interior nodes are removed. Care is taken to not remove nodes that are protected, or that otherwise restrict their children.
I am currently only 99% sure that this is sound, and I do also think that this could compact even more. So @Vanille-N please also have a look at whether I got the compacting logic right.
For a more visual comparison, here is a gist of what the tree looks like at one point during that test, with and without compacting.
This new GC requires a different iteration order during accesses (since the current one can make the error messages non-deterministic), so it is rebased on top of #3843 and requires that PR to be merged first.