-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathgcverify.tex
101 lines (88 loc) · 6.65 KB
/
gcverify.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
\subsection{Collector Verification in Boogie}
We have implemented and verified our algorithm in Boogie,
including initialization (Initialize), the GC (GarbageCollect), the allocator (Alloc),
and the mutator operations (ReadField, WriteField, and Eq),
and all the lower-level operations required to implement them (some of these appear
in Figure~\ref{fig:VerifiedGC}; others are omitted from the figure to save space).
To make the verification as realistic as possible,
our Boogie code implements everything in terms of individual CPU operations,
such as load, store, atomic increment/decrement, and CAS (compare-and-swap);
in contrast to some previous work~\cite{gont96},
we do not assume any built-in higher-level operations.
To ease verification, we make some simplifications:
we use a naive allocator (sequential search for free space),
we assume a sequentially consistent memory model,
and we assume that all objects have the same number of fields.
(Except for the assumption of sequential consistency, none of these substantially alter the nature of the proof.)
Overall, our implementation consists of about 2100 lines of Boogie code (included in supplemental material),
and it takes Boogie/Z3 about 80 minutes to verify.
Our verification takes advantage of all techniques in \civl: refinement, assertions, reduction, and linearity.
Refinement gives us extremely simple high-level action specifications for Initialize, ReadField, WriteField, Eq, and Alloc,
shown in their entirety in Figure~\ref{fig:VerifiedGC}'s dashed boxes.
(Initialize and GarbageCollect have empty actions; the GarbageCollector itself is just an internal implementation detail inside Initialize,
which serves only to set up the global GC invariant needed by the other high-level actions.)
Crucially, ReadField, WriteField, Eq, and Alloc appear atomic to mutators, even though internally,
WriteField and Alloc involve many interleaved operations on shared GC data structures.
Figure~\ref{fig:VerifiedGC} shows only shows phases 5 and 6, the two most abstract phases of refinement;
phases 1-4 fill in the implementation details,
such as implementing the set of gray objects as an explicit stack (an array of elements with a pointer to the stack top, in phase 4),
handshaking (phase 3), locks (phase 2), and wrapping the primitive CPU operations in left/right/non-moving atomic actions (phase 1).
Ultimately, the phases are built on trusted CPU-level atomic actions, such as reading and writing roots directly:
\begin{verbatim}
procedure PrimitiveWriteRoot(i:idx, v:int)
atomic [assert rootAddr(i); root[i] := v;];
procedure PrimitiveReadRoot(i:idx) returns (v:int)
atomic [assert rootAddr(i); v := root[i];];
\end{verbatim}
We write the highest-level action specifications in terms of an abstract view of memory,
as in earlier work on sequential garbage collector verification~\cite{mccr07,hawb09}.
(Abstract memory is infinite and eternal: once allocated, an abstract object lives forever.
Deallocation is an underlying implementation detail, not exposed in the abstract interface.)
Our abstract view describes a machine as consisting of just three variables:
abstract memory memAbs:[obj][fld]obj, mapping object identifiers and fields to other objects,
abstract root values rootAbs:[idx]obj, mapping root names to objects, and
allocSet:[obj]bool, the set of objects allocated so far.
At this high layer of abstraction, we use Boogie's hiding to hide all other variables
(such as the concrete root set, ``root'', the concrete memory, ``mem'', and the colors, ``Color'', used by lower-level procedures).
All operations are relative to root names of type idx.
ReadField, for example, reads an object field from an object pointed to by root x into a root y.
The predicates rootAddr and tidOwns establish that x and y are valid root names, owned by a particular mutator tid.
(We assume that each root is private to a single mutator stack or register file;
sharing between mutator threads takes place through shared pointers to objects.)
The predicates fieldIndex(f) and memAddrAbs(o) establish that x.f is a valid field of a valid object.
Allocation establishes memAddrAbs(o) for newly allocated objects so that they may be used by subsequent ReadField and WriteField operations.
It also establishes o's unique identity by ensuring that it did not previously belong to the allocated object set.
In addition to atomic action specifications, the verification establishes invariants using assertion reasoning
(omitted from Figure~\ref{fig:VerifiedGC} to save space).
For example, Initialize establishes a global mapping toAbs:[int]obj from physical memory mem and abstract memory memAbs:
\begin{verbatim}
(forall x:int, f:fld ::
memAddr(x) && toAbs[x] != nil && fieldIndex(f)
==> toAbs[mem[x][f]] == memAbs[toAbs[x]][f])
\end{verbatim}
and Mark maintains the no-black-to-white invariant:
\begin{verbatim}
(forall x:int, f:fld ::
memAddr(x) && Black(Color[x]) && fieldIndex(f)
&& memAddr(mem[x][f]) ==>
Gray(Color[mem[x][f]]) || Black(Color[mem[x][f]]))
\end{verbatim}
Finally, linearity plays a key role in establishing mutual exclusion.
The GC thread has its own thread id gcTid, and each mutator has its own thread id.
The Initialize procedure consumes gcTid (written here as ``consume'')
and borrows all the mutator thread ids (written as ``linear'', as in Section~\ref{sec:overview}),
so that it's clear that no other concurrent actions are allowed during initialization.
This allows all the internal initialization actions to be both-movers, without requiring any explicit locking.
Initialize must consume gcTid because it passes gcTid to the newly spawned GarbageCollector thread;
since gcTid is consumed, it's impossible to call Initialize twice in an attempt to spawn two parallel GC threads
(which naturally expresses how the algorithm is only safe for a single GC thread).
Because \civl's linearity is based on sets of values,
we can represent thread identifiers as sets that can be subdivided into subsets
(similar to how fractional permissions may be divided into fractions).
During root scanning, each mutator thread places a fraction of its thread id in a global variable,
and reclaims the fraction from the global variable after root scanning completes;
a collector invariant tracks that the global variable contains non-empty fractions from all mutators during root scanning.
Thus, during root scanning, \civl's rules for linearity prove that no interference occurs
between the collector and any mutator operations that require the whole mutator thread id
(``mutatorTidWhole'', used in Figure~\ref{fig:VerifiedGC}'s ReadField, WriteField, Alloc,
and most other mutator operations).