Memory model bug for data-race free programs #1354

syg · 2018-11-20T20:15:51Z

@conrad-watt, in his work on adopting the JS memory model for wasm, has found a bug in the memory model that causes data-race free programs to be not sequentially consistent.

The Bug

Consider the following example,

Example 1

           Agent 1                          Agent 2
           -------                          -------
[W1] Atomics.store(x, 0, 1);  ||  [W2] Atomics.store(x, 0, 2);
[Wy] Atomics.store(y, 0, 1);  ||  [Ry] if (Atomics.load(y, 0) === 1) {
                              ||  [R1]   print(x[0]);
                              ||  [R2]   print(x[0]);
                              ||       }

The current memory model admits an execution where 2,1 is printed. Let sw, hb, and mo abbreviate synchronizes-with, happens-before, and memory-order respectively.

[Wy] <sw [Ry], therefore [Wy] <hb [Ry]
[W1] <hb [Wy]
[W2] <hb [Ry] <hb [R1] <hb [R2]
[W1] <mo [Wy] <mo [W2] <mo [Ry] <mo [R1] <mo [R2]

[R1] reads-from [W2]
[R2] reads-from [W1]

Note that there is no data race because [W1] and [W2], while racing, are not a data race since both are SC atomic.

The above is a valid but non-SC execution because "Coherent Reads" only says that a read must read the most recent write event on some range with respect to happens-before. So while there is a demonstrable total memory order as written above, it is still valid for [R2] to read from a memory-order-older [W1].

Strong Fix

It is natural to try to fix the above by disallowing reads of stale memory-order write events, which is a stronger condition than only disallowing reads of stale happens-before write events.

This fix might be too strong, however. Consider a related example where W1 and W2 are changed to non-atomic writes.

Example 2

           Agent 1                          Agent 2
           -------                          -------
[W1'] x[0] = 1;                ||  [W2'] x[0] = 2;
[Wy ] Atomics.store(y, 0, 1);  ||  [Ry ] if (Atomics.load(y, 0) === 1) {
                               ||  [R1 ]   print(x[0]);
                               ||  [R2 ]   print(x[0]);
                               ||       }

Under the strong fix, there is no valid execution where 2,1 could be printed. That is, both Examples 1 and 2 cannot print 2,1.

The pros are:

Easier-to-reason-about meaning of SC atomics as full SC fences.
Why work harder to make programs with data races better behaved?

The cons are:

Maybe tying our hands for weaker architectures?
Maybe tying our hands for optimizations? (Though I can't think of any optimizations this would enable.)
Stronger than needed for fixing DRF-SC.
@conrad-watt makes a future proofing web compat hazard: by restricting the set of allowed behavior now, we'd be hard pressed to weaken it in the future.

Weak Fix

A weaker fix is to disallow Example 1 to print 2,1 while allowing Example 2 to print 2,1.

The pros are:

Minimum fix needed for DRF-SC.
Can always strengthen in the future, as by not guaranteeing SC atomics to be fences, nobody can depend on them being such.

The cons are:

Kind of unintuitive. We've already banned other optimizations as being too weird.

Not Currently Manifest on the Web

This bug is currently, AFAICT, not manifest on the web as the engines do compile SC atomics as fences currently.

Compiler folks, thoughts?

The text was updated successfully, but these errors were encountered:

ajklein · 2018-11-20T20:31:39Z

CC @binji @dtig

syg · 2018-11-20T21:35:30Z

For the record, after discussion with @conrad-watt and some thinking, I favor the weak fix. The strong fix seems to have a lot of unintended consequences around strengthening the guarantees of data racy programs.

binji · 2018-11-20T22:26:30Z

The weak fix seems like a good approach. Agreed it's not intuitive, but many behaviors are unintuitive with unordered accesses.

conrad-watt · 2018-11-21T17:04:29Z

So to make things explicit, the weakest possible strengthening fix ("Weak Fix") would involve replacing the second bullet point of the Sequentially Consistent Atomics condition with the following, or something equivalent:

For each pair of Shared Data Block events (R, W) in execution.[[ReadsFrom]], if (W, R) is in memory-order, then the following is true.
- For each event E in SharedDataBlockEventSet(execution), if the pairs (W, E) and (E, R) are in memory-order, and E is a WriteSharedMemory or ReadModifyWriteSharedMemory event, then all of the following are true.
  - If (W, R) is in execution.[[SynchronizesWith]], then E and R do not have equal ranges.
  - If (E, R) is in execution.[[HappensBefore]], and both W.[[Order]] and E.[[Order]] are SeqCst, then W and E do not have equal ranges.
  - If (W, E) is in execution.[[HappensBefore]], and both E.[[Order]] and R.[[Order]] are SeqCst, then E and R do not have equal ranges.

Note that the first inner bullet point is precisely the old condition, and the latter two are the strengthening additions.

@syg has prompted me to think about whether the "If (-, -) is in execution.[[HappensBefore]]" pre-conditions of the last two bullets are necessary. Removing these strengthens the condition, but potentially leads to a simpler model. To be clear, even this strengthening would be weaker than the "Strong Fix". It's effectively a third flavour in between the Weak and Strong fixes that I have to think more carefully about.

I'm still in general biased towards the weakest possible fix, because it absolutely minimizes the chance that there will be a problem in future.

waldemarhorwat · 2018-11-26T22:28:37Z

Fascinating. I'm still mulling over the implications of the options as well.

…ntially consistent. (tc39#1354)

syg · 2022-03-16T21:59:17Z

The fix has been merged years ago, forgot to close.

syg added the spec bug label Nov 20, 2018

syg added a commit to syg/ecma262 that referenced this issue Nov 29, 2018

Normative: Fix memory model so that data race free programs are seque…

7ff98e9

…ntially consistent. (tc39#1354)

syg mentioned this issue Nov 29, 2018

Normative: Fix memory model so DRF-SC holds #1362

Closed

syg added a commit to syg/ecma262 that referenced this issue Nov 29, 2018

Normative: Fix memory model so that data race free programs are seque…

4e6f8f2

…ntially consistent. (tc39#1354)

syg added a commit to syg/ecma262 that referenced this issue Nov 29, 2018

Normative: Fix memory model so that data race free programs are seque…

19c4434

…ntially consistent. (tc39#1354)

conrad-watt mentioned this issue Apr 17, 2019

Normative: fix axiomatic model to be SC-DRF, and allow ARMv8 compilation #1511

Merged

syg closed this as completed Mar 16, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Memory model bug for data-race free programs #1354

Memory model bug for data-race free programs #1354

syg commented Nov 20, 2018 •

edited

Loading

ajklein commented Nov 20, 2018

syg commented Nov 20, 2018

binji commented Nov 20, 2018

conrad-watt commented Nov 21, 2018 •

edited

Loading

waldemarhorwat commented Nov 26, 2018

syg commented Mar 16, 2022

Memory model bug for data-race free programs #1354

Memory model bug for data-race free programs #1354

Comments

syg commented Nov 20, 2018 • edited Loading

The Bug

Strong Fix

Weak Fix

Not Currently Manifest on the Web

ajklein commented Nov 20, 2018

syg commented Nov 20, 2018

binji commented Nov 20, 2018

conrad-watt commented Nov 21, 2018 • edited Loading

waldemarhorwat commented Nov 26, 2018

syg commented Mar 16, 2022

syg commented Nov 20, 2018 •

edited

Loading

conrad-watt commented Nov 21, 2018 •

edited

Loading