Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory model bug for data-race free programs #1354

Closed
syg opened this issue Nov 20, 2018 · 6 comments
Closed

Memory model bug for data-race free programs #1354

syg opened this issue Nov 20, 2018 · 6 comments
Labels

Comments

@syg
Copy link
Contributor

syg commented Nov 20, 2018

@conrad-watt, in his work on adopting the JS memory model for wasm, has found a bug in the memory model that causes data-race free programs to be not sequentially consistent.

The Bug

Consider the following example,

Example 1

           Agent 1                          Agent 2
           -------                          -------
[W1] Atomics.store(x, 0, 1);  ||  [W2] Atomics.store(x, 0, 2);
[Wy] Atomics.store(y, 0, 1);  ||  [Ry] if (Atomics.load(y, 0) === 1) {
                              ||  [R1]   print(x[0]);
                              ||  [R2]   print(x[0]);
                              ||       }

The current memory model admits an execution where 2,1 is printed. Let sw, hb, and mo abbreviate synchronizes-with, happens-before, and memory-order respectively.

[Wy] <sw [Ry], therefore [Wy] <hb [Ry]
[W1] <hb [Wy]
[W2] <hb [Ry] <hb [R1] <hb [R2]
[W1] <mo [Wy] <mo [W2] <mo [Ry] <mo [R1] <mo [R2]

[R1] reads-from [W2]
[R2] reads-from [W1]

Note that there is no data race because [W1] and [W2], while racing, are not a data race since both are SC atomic.

The above is a valid but non-SC execution because "Coherent Reads" only says that a read must read the most recent write event on some range with respect to happens-before. So while there is a demonstrable total memory order as written above, it is still valid for [R2] to read from a memory-order-older [W1].

Strong Fix

It is natural to try to fix the above by disallowing reads of stale memory-order write events, which is a stronger condition than only disallowing reads of stale happens-before write events.

This fix might be too strong, however. Consider a related example where W1 and W2 are changed to non-atomic writes.

Example 2

           Agent 1                          Agent 2
           -------                          -------
[W1'] x[0] = 1;                ||  [W2'] x[0] = 2;
[Wy ] Atomics.store(y, 0, 1);  ||  [Ry ] if (Atomics.load(y, 0) === 1) {
                               ||  [R1 ]   print(x[0]);
                               ||  [R2 ]   print(x[0]);
                               ||       }

Under the strong fix, there is no valid execution where 2,1 could be printed. That is, both Examples 1 and 2 cannot print 2,1.

The pros are:

  • Easier-to-reason-about meaning of SC atomics as full SC fences.
  • Why work harder to make programs with data races better behaved?

The cons are:

  • Maybe tying our hands for weaker architectures?
  • Maybe tying our hands for optimizations? (Though I can't think of any optimizations this would enable.)
  • Stronger than needed for fixing DRF-SC.
  • @conrad-watt makes a future proofing web compat hazard: by restricting the set of allowed behavior now, we'd be hard pressed to weaken it in the future.

Weak Fix

A weaker fix is to disallow Example 1 to print 2,1 while allowing Example 2 to print 2,1.

The pros are:

  • Minimum fix needed for DRF-SC.
  • Can always strengthen in the future, as by not guaranteeing SC atomics to be fences, nobody can depend on them being such.

The cons are:

  • Kind of unintuitive. We've already banned other optimizations as being too weird.

Not Currently Manifest on the Web

This bug is currently, AFAICT, not manifest on the web as the engines do compile SC atomics as fences currently.

Compiler folks, thoughts?

@syg syg added the spec bug label Nov 20, 2018
@ajklein
Copy link
Contributor

ajklein commented Nov 20, 2018

CC @binji @dtig

@syg
Copy link
Contributor Author

syg commented Nov 20, 2018

For the record, after discussion with @conrad-watt and some thinking, I favor the weak fix. The strong fix seems to have a lot of unintended consequences around strengthening the guarantees of data racy programs.

@binji
Copy link

binji commented Nov 20, 2018

The weak fix seems like a good approach. Agreed it's not intuitive, but many behaviors are unintuitive with unordered accesses.

@conrad-watt
Copy link
Contributor

conrad-watt commented Nov 21, 2018

So to make things explicit, the weakest possible strengthening fix ("Weak Fix") would involve replacing the second bullet point of the Sequentially Consistent Atomics condition with the following, or something equivalent:

  • For each pair of Shared Data Block events (R, W) in execution.[[ReadsFrom]], if (W, R) is in memory-order, then the following is true.

    • For each event E in SharedDataBlockEventSet(execution), if the pairs (W, E) and (E, R) are in memory-order, and E is a WriteSharedMemory or ReadModifyWriteSharedMemory event, then all of the following are true.

      • If (W, R) is in execution.[[SynchronizesWith]], then E and R do not have equal ranges.
      • If (E, R) is in execution.[[HappensBefore]], and both W.[[Order]] and E.[[Order]] are SeqCst, then W and E do not have equal ranges.
      • If (W, E) is in execution.[[HappensBefore]], and both E.[[Order]] and R.[[Order]] are SeqCst, then E and R do not have equal ranges.

Note that the first inner bullet point is precisely the old condition, and the latter two are the strengthening additions.

@syg has prompted me to think about whether the "If (-, -) is in execution.[[HappensBefore]]" pre-conditions of the last two bullets are necessary. Removing these strengthens the condition, but potentially leads to a simpler model. To be clear, even this strengthening would be weaker than the "Strong Fix". It's effectively a third flavour in between the Weak and Strong fixes that I have to think more carefully about.

I'm still in general biased towards the weakest possible fix, because it absolutely minimizes the chance that there will be a problem in future.

@waldemarhorwat
Copy link

Fascinating. I'm still mulling over the implications of the options as well.

@syg
Copy link
Contributor Author

syg commented Mar 16, 2022

The fix has been merged years ago, forgot to close.

@syg syg closed this as completed Mar 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants