Corrigendum

For:
Ori Lahav, Viktor Vafeiadis, Jeehoon Kang, Chung-Kil Hur, and Derek Dreyer. 2017. Repairing sequential consistency in C/C++11. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). Association for Computing Machinery, New York, NY, USA, 618–632.
https://doi.org/10.1145/3062341.3062352

December 2025.
Written by Ori Lahav.


A mistake was identified by Ariel Ben-Yehuda in December 2024 (see https://github.com/rust-lang/unsafe-code-guidelines/issues/548#issuecomment-2549734323 ). We summarize the issue and its fix below.

Summary of the Issue

The paper claims that register promotion is sound. This claim is incorrect as stated.

Counterexample (Variant of Figure 4)

Consider the following program, together with the annotated outcome (a = 2 && b = 0 && c = 0):

Thread 1 Thread 2 Thread 3
e1: a := x_acq // 2
e2: b := y_sc  // 0
e3: x :=_sc 1
e4: unused := z_na
e5: x :=_sc 2
e6: y :=_sc 1
e7: c := x_sc  // 0

In the model as given in the paper, this outcome is disallowed, since it can only be justified by a PSC cycle:

e2 -> e6 (via rb)
e6 -> e7 (via sb)
e7 -> e3 (via rb)
e3 -> e2 (via sb|!=loc from e3 to e4,
          hb from e4 to e1,
          and sb|!=loc from e1 to e2)

Together these edges form a PSC cycle, which is forbidden by the PSC acyclicity requirement.

Why This Contradicts Register Promotion

However, since z is accessed by only one thread, it should be possible to promote it into a register (say d), yielding the program:

Thread 1 Thread 2 Thread 3
e1: a := x_acq // 2
e2: b := y_sc  // 0
e3: x :=_sc 1
unused := d
e5: x :=_sc 2
e6: y :=_sc 1
e7: c := x_sc  // 0

In this transformed program, the annotated behavior becomes allowed. In particular, the PSC edge from e3 to e2 can no longer be derived, since the first sb|!=loc relation used in the original argument is no longer present.

This is a real issue: compilers do perform such promotions and related transformations for local (non-atomic) reads.

Where the Proof Goes Wrong

In the full version of the paper, the issue can be traced to Lemma I.12, whose proof states that the result “easily follows from our definitions”.

Possible Repair

Our suggested repair is to give up elimination of SC accesses (see Section 2.1.1) and to remove the sb|!=loc restriction from PSC, i.e., to use:

(sb \ rmw) ; hb ; (sb \ rmw)

instead of:

sb|!=loc ; hb ; sb|!=loc

Removing rmw edges is needed since the formal model in the paper (starting from Section 3) represents RMWs using two events (see Z6.U in Figure 7).

This change means that the model no longer supports elimination of repeated SC accesses, e.g.:

x :=_sc 1 ; x :=_sc 2  ~>  x :=_sc 2
a := x_sc ; b := x_sc  ~>  a := x_sc ; b := a

To the best of our knowledge, such transformations are not performed by mainstream compilers. Moreover, the C/C++ committee was not favorable toward sb|!=loc, and the solution adopted in the C++20 standard already uses sb rather than sb|!=loc in the corresponding constraints (see https://en.cppreference.com/w/cpp/atomic/memory_order.html ).

With sb \ rmw instead of sb|!=loc, Lemma I.12 should hold and register promotion is sound again. A mechanized proof would, of course, be beneficial.

Side Note

While the C++ committee largely followed the approach proposed in this paper, the C++20 memory model still exhibits issues at the time of writing this correction. These stem from additional simplifications made relative to the paper’s model. In particular, the standard appears to rely on PSC_strong as described in Remark 4; however, that remark already notes that this choice breaks the standard mapping to x86. This issue is known to Hans Boehm, and work is ongoing to align the standard with the paper’s model (see https://cplusplus.github.io/LWG/issue3941 ).

Imprint / Data Protection