GPS: Navigating weak memory with ghosts, protocols, and separation

Aaron Turon, Viktor Vafeiadis, Derek Dreyer

*Manuscript*.

[ abstract
| project page ]

*Abstract:*
Weak memory models axiomatize the unexpected behavior that one can
expect to observe in multi-threaded programs running on modern
hardware. In so doing, however, they complicate the already-difficult
task of reasoning about correctness of concurrent code. Worse, they
render impotent the sophisticated formal methods that have been
developed to tame concurrency, which almost universally assume a
strong (i.e., sequentially consistent) memory model.

This work introduces **GPS**, the first program logic to provide
a full-fledged suite of modern verification techniques—including
ghost state, rely-guarantee "protocols", *and* separation
logic—for high-level, structured reasoning about weak memory. We
demonstrate the effectiveness of GPS by applying it to challenging
examples drawn from the Linux kernel as well as lock-free data
structures. We also define the semantics of GPS and prove its
soundness directly in terms of the axiomatic C11 weak memory model.

Freeze after writing: quasi-deterministic parallel programming with LVars

Lindsey Kuper, Aaron Turon, Neelakantan R. Krishnaswami, Ryan R. Newton

*to appear in POPL 2014*.

[ abstract
| pdf | technical appendix ]

*Abstract:*
Deterministic-by-construction parallel programming models offer the advantages
of parallel speedup while avoiding the nondeterministic, hard-to-reproduce bugs
that plague fully concurrent code. A principled approach to
deterministic-by-construction parallel programming with shared state is offered
by

*LVars*: shared memory locations whose semantics are defined in terms of
an application-specific lattice. Writes to an LVar take the least upper bound
of the old and new values with respect to the lattice, while reads from an LVar
can observe only that its contents have crossed a specified threshold in the
lattice. Although it guarantees determinism, this interface is quite limited.

We extend LVars in two ways. First, we add the ability to "freeze" and then
read the contents of an LVar directly. Second, we add the ability to attach
event handlers to an LVar, triggering a callback when the LVar's value changes.
Together, handlers and freezing enable an expressive and useful style of
parallel programming. We prove that in a language where communication takes
place through these extended LVars, programs are at worst *quasi-deterministic*: on
every run, they either produce the same answer or raise an error. We
demonstrate the viability of our approach by implementing a library for Haskell
supporting a variety of LVar-based data structures, together with a case
study that illustrates the programming model and yields promising parallel
speedup.

Modular rollback through control logging:

Twin pearls in continuation, and a monadic third

Olin Shivers, Aaron Turon, Conor McBride

*Submitted*.

[ abstract
| pdf ]

*Abstract:*
We present a technique, based on the use of first-class control operators,
enabling programs to maintain and invoke rollback logs for sequences of
reversible effects.
Our technique is modular, in that it provides complete separation
between some library of effectful operations, and a client, "driver"
program which invokes and rolls back sequences of these operations.
In particular, the checkpoint mechanism,
which is entirely encapsulated within the effect library,
logs not only the library's effects, but also the client's control state.
Thus, logging and rollback can be almost completely transparent to the
client code.

This separation of concerns manifests itself nicely when we must
implement software with sophisticated error handling.
We illustrate with two examples that exploit the architecture
to disentangle some core parsing task from its error management.
The parser code is completely separate from the error-correction code,
although the two components are deeply intertwined at run time.

Unifying refinement and Hoare-style reasoning
in a logic for higher-order concurrency

Aaron Turon, Derek Dreyer, and Lars Birkedal

*in ICFP 2013*.

[ abstract
| pdf ]

*Abstract:*
Modular programming and modular verification go hand in hand, but most
existing logics for concurrency ignore two crucial forms of
modularity: *higher-order functions*, which are essential for building
reusable components, and *granularity abstraction*, a key technique
for hiding the intricacies of fine-grained concurrent data structures
from the clients of those data structures. In this paper, we present
CaReSL, the first logic to support the use of granularity abstraction
for modular verification of higher-order concurrent programs. After
motivating the features of CaReSL through a variety of illustrative
examples, we demonstrate its effectiveness by using it to tackle a
significant case study: the first formal proof of (partial)
correctness for Hendler et al.'s "flat combining" algorithm.

Logical relations for fine-grained concurrency

Aaron Turon, Jacob Thamsborg, Amal Ahmed, Lars Birkedal, and Derek Dreyer

*in POPL 2013*.

[ abstract
| pdf | technical appendix ]

*Abstract:*
Fine-grained concurrent data structures (or FCDs) reduce the
granularity of critical sections in both time and space, thus making
it possible for clients to access different parts of a mutable data
structure in parallel. However, the tradeoff is that the
implementations of FCDs are very subtle and tricky to reason about
directly. Consequently, they are carefully designed to be contextual
refinements of their coarse-grained counterparts, meaning that their
clients can reason about them as if all access to them were
sequentialized.

In this paper, we propose a new semantic model, based
on Kripke logical relations, that supports direct proofs of contextual
refinement in the setting of a type-safe high-level language. The key
idea behind our model is to provide a simple way of expressing the
``local life stories'' of individual pieces of an FCD's hidden state
by means of protocols that the threads concurrently accessing that
state must follow. By endowing these protocols with a simple yet
powerful transition structure, as well as the ability to assert
invariants on both heap states and specification code, we are able to
support clean and intuitive refinement proofs for the most
sophisticated types of FCDs, such as conditional compare-and-set
(CCAS).

Superficially substructural types

Neelakantan R. Krishnaswami, Aaron Turon, Derek Dreyer, and Deepak Garg

*in ICFP 2012*.

[ abstract
| pdf | technical appendix ]

*Abstract:*
Many substructural type systems have been proposed for controlling
access to shared state in higher-order languages. Central to these
systems is the notion of a

*resource*, which may be split into
disjoint pieces that different parts of a program can manipulate
independently without worrying about interfering with one another.
Some systems support a

*logical* notion of resource (such as
permissions), under which two resources may be considered disjoint even
if they govern the

*same* piece of state. However, in nearly
all existing systems, the notions of resource and disjointness are
fixed at the outset, baked into the model of the language, and fairly
coarse-grained in the kinds of sharing they enable.

In this paper, inspired by recent work on "fictional disjointness"
in separation logic, we propose a simple and flexible way of enabling
any module in a program to create its own custom type of splittable
resource (represented as a commutative monoid), thus providing
fine-grained control over how the module's private state is shared
with its clients. This functionality can be incorporated into an
otherwise standard substructural type system by means of a new typing
rule we call *the sharing rule*, whose soundness we prove
semantically via a novel resource-oriented Kripke logical relation.

Reagents: expressing and composing fine-grained concurrency

Aaron Turon

*in PLDI 2012*.

[ abstract
| pdf | Keynote slides | pdf slides ]

*Abstract:*
Efficient communication and synchronization is crucial for fine-grained
parallelism. Libraries providing such features, while indispensable, are
difficult to write, and often cannot be tailored or composed to meet the needs
of specific users. We introduce *reagents*, a set of combinators for
concisely expressing concurrency algorithms. Reagents scale as well as their
hand-coded counterparts, while providing the composability existing libraries
lack.

Scalable Join Patterns

Aaron Turon, Claudio Russo

*in OOPSLA 2011*.

[ abstract
| pdf | Keynote slides | pdf slides ]

*Abstract:*
Coordination can destroy scalability in parallel programming. A
comprehensive library of synchronization primitives is therefore an
essential tool for exploiting parallelism. Unfortunately, such
primitives do not easily combine to yield scalable solutions to more
complex problems. We demonstrate that a concurrency library based on
Fournet and Gonthier's join calculus can provide declarative and
scalable coordination. By declarative, we mean that the
programmer needs only to write down the constraints of a coordination
problem, and the library will automatically derive a correct solution.
By scalable, we mean that the derived solutions deliver robust
performance both as the number of processors increases, and as the
complexity of the coordination problem grows. We validate our claims
empirically on seven coordination problems, comparing our generic
solution to specialized algorithms from the literature.

Modular rollback through control logging

A pair of twin functional pearls

Olin Shivers, Aaron Turon

*in ICFP 2011*.

[ abstract
| pdf | Keynote slides | pdf slides ]

*Abstract:*
We present a technique, based on the use of first-class control
operators, permitting programs to maintain and invoke rollback
logs for sequences of reversible effects. Our technique is modular,
in that it provides for complete separation between some library of
effectful operations, and a client, "driver" program which invokes
and rolls back sequences of these operations. In particular, the
checkpoint mechanism, which is entirely encapsulated within the
effect library, logs not only the library's effects, but also the client's
control state. Thus, logging and rollback can be almost completely
transparent to the client code.

This separation of concerns manifests itself nicely when we
must implement software with sophisticated error handling. We
illustrate with two examples that both exploit the architecture to
disentangle some core parsing task from its error management. The
parser code is completely separate from the error-correction code,
although the two components are deeply intertwined at run time.

A resource analysis of the pi-calculus

Aaron Turon, Mitchell Wand

*In MFPS 2011*.

[ abstract
| pdf | slides ]

*Abstract:*
We give a new treatment of the pi-calculus based on the semantic
theory of separation logic, continuing a research program begun by
Hoare and O'Hearn. Using a novel resource model that distinguishes
between public and private ownership, we refactor the operational
semantics so that sending, receiving, and allocating are commands that
influence owned resources. These ideas lead naturally to two
denotational models: one for safety and one for liveness. Both models
are fully abstract for the corresponding observables, but more
importantly both are very simple. The close connections with the
model theory of separation logic (in particular, with Brookes's action
trace model) give rise to a logic of processes and resources.

A separation logic for refining concurrent objects

Aaron Turon, Mitchell Wand

*in POPL 2011*.

[ abstract
| pdf ]

*Abstract:*
Fine-grained concurrent data structures are crucial for gaining performance from multiprocessing, but their design is a subtle art. Recent literature has made large strides in verifying these data structures, using either atomicity refinement or separation logic with rely-guarantee reasoning. In this paper we show how the ownership discipline of separation logic can be used to enable atomicity refinement, and we develop a new rely-guarantee method that is localized to the definition of a data structure. We present the first semantics of separation logic that is sensitive to atomicity, and show how to control this sensitivity through ownership. The result is a logic that enables compositional reasoning about atomicity and interference, even for programs that use fine-grained synchronization and dynamic memory allocation.

All-Termination(T)

Panagiotis Manolios, Aaron Turon

*in TACAS 2009*.

[ abstract
| pdf | slides ]

*Abstract:* We introduce the
All-Termination(*T*) problem: given a termination
solver *T* and a collection of functions *F*, find
every subset of the formal parameters to *F* whose
consideration is sufficient to show, using *T*,
that *F* terminates. An important and motivating
application is enhancing theorem proving systems by constructing
the set of strongest induction schemes for *F*,
modulo *T*. These schemes can be derived from the set of
termination cores, the minimal sets returned by
All-Termination(*T*), without any reference to an explicit
measure function. We study the All-Termination(*T*) problem
as applied to the size-change termination analysis (SCT), a
PSpace-complete problem that underlies many termination
solvers. Surprisingly, we show that All-Termination(SCT) is also
PSpace-complete, even though it substantially generalizes SCT
. We develop a practical algorithm for All- Termination(SCT),
and show experimentally that on the ACL2 regression suite (whose
size is over 100MB) our algorithm generates stronger induction
schemes on 90% of multiargument functions.

Regular expression derivatives reexamined

Scott Owens, John Reppy, Aaron Turon

In *JFP*, March 2009, vol 19, issue 02, pp. 173-190

[ abstract
| pdf ]

*Abstract:*
The derivative of a set of strings

*S* with respect to a
symbol

*a* is the set of strings generated by stripping the
leading

*a* from the strings in

*S* that start
with

*a*. For regular sets of strings, i.e., sets defined by
regular expressions (REs), the derivative is also a regular set. In a
1964 paper, Janusz Brzozowski presented an elegant method for directly
constructing a recognizer from a regular expression based on
regular-expression derivatives (Brzozowski, 1964). His approach is
elegant and easily supports extended regular expressions; i.e., REs
extended with Boolean operations such as complement. Unfortunately, RE
derivatives have been lost in the sands of time, and few computer
scientists are aware of them. Recently, we independently developed two
scanner generators, one for PLT Scheme and one for Standard ML, using
RE derivatives. Our experiences with this approach have been quite
positive: the implementation techniques are simple, the generated
scanners are usually optimal in size, and the extended RE language
allows for more compact scanner specifications. Of special interest
is that the implementation techniques are well-suited to functional
languages that provide good support for symbolic term manipulation
(e.g., inductive datatypes and pattern matching).

The purpose of this paper is largely educational. Our positive
experience with RE derivatives leads us to believe that they deserve
the attention of the current generation of functional programmers,
especially those implementing RE recognizers. We begin with a review
of background material in Section 2, introducing the notation and
definitions of regular expressions and their recognizers. Section 3
gives a fresh presentation of Brzozowski's work, including DFA
construction with RE derivatives. In addition to reexamining
Brzozowski's work, we also report on the key implementation challenges
we faced in Section 4, including new techniques for handling large
character sets such as Unicode (Unicode Consortium, 2003). Section 5
reports our experience in general and includes an empirical comparison
of the derivative-based scanner generator for SML/NJ with the more
traditional tool it replaces. We conclude with a review of related
work and a summary.

Metaprogramming with Traits

John Reppy, Aaron Turon

In ECOOP, July 2007.

[ abstract
| pdf ]

*Abstract:*
In many domains, classes have highly regular internal structure. For
example, so-called business objects often contain boilerplate code for
mapping database fields to class members. The boilerplate code must be
repeated per-field for every class, because existing mechanisms for
constructing classes do not provide a way to capture and reuse such
member-level structure. As a result, programmers often resort to ad
hoc code generation. This paper presents a lightweight mechanism for
specifying and reusing member-level structure in Java programs. The
proposal is based on a modest extension to traits that we have termed
trait-based metaprogramming. Although the semantics of the mechanism
are straightforward, its type theory is difficult to reconcile with
nominal subtyping. We achieve reconciliation by introducing a hybrid
structural/nominal type system that extends Java's type system. The
paper includes a formal calculus defined by translation to
Featherweight Generic Java.

A foundation for trait-based metaprogramming

John Reppy, Aaron Turon

In FOOL/WOOD, January 2006.

[ abstract
| pdf
| slides
| tech report ]

*Abstract:*
Schaerli et al. introduced traits as reusable units of behavior
independent of the inheritance hierarchy. Despite their relative
simplicity, traits offer a surprisingly rich calculus. Trait calculi
typically include operations for resolving conflicts when composing
two traits. In the existing work on traits, these operations (method
exclusion and aliasing) are shallow, i.e., they have no effect on the
body of the other methods in the trait. In this paper, we present a
new trait system, based on the Fisher-Reppy trait calculus, that adds
deep operations (method hiding and renaming) to support conflict
resolution. The proposed operations are deep in the sense that they
preserve any existing connections between the affected method and the
other methods of the trait. Our system uses Riecke-Stone dictionaries
to support these features. In addition, we define a more fine-grained
mechanism for tracking trait types than in previous systems. The
resulting calculus is more flexible and expressive, and can serve as
the foundation for trait-based metaprogramming, an idiom we introduce.