Cover art for The Part Time Parliament by Leslie Lamport

The Part Time Parliament

1 viewer

The Part Time Parliament Lyrics

The Part-Time Parliament
Leslie Lamport
This article appeared in ACM Transactions on Computer Systems 16, 2 (May 1998), 133-169. Minor corrections were made
on 29 August 2000.The Part-Time Parliament
LESLIE LAMPORT
Digital Equipment Corporation
Recent archaeological discoveries on the island of Paxos reveal that the parliament functioned despite the peripatetic propensity of its part-time legislators. The legislators maintained consistent
copies of the parliamentary record, despite their frequent forays from the chamber and the forgetfulness of their messengers. The Paxon parliament’s protocol provides a new way of implementing
the state-machine approach to the design of distributed systems.
Categories and Subject Descriptors: C2.4 [Computer-Communications Networks]: Distributed
Systems—Network operating systems; D4.5 [Operating Systems]: Reliability—Fault-tolerance;
J.1 [Administrative Data Processing]: Government
General Terms:Design, Reliability
Additional Key Words and Phrases:State machines, three-phase commit, voting
This submission was recently discovered behind a filing cabinet in the TOCS editorial
office. Despite its age, the editor-in-chief felt that it was worth publishing. Because the
author is currently doing field work in the Greek isles and cannot be reached, Iwas asked
to prepare it for publication.
The author appears to be an archeologist with only a passing interest in computer science. This is unfortunate; even though the obscure ancient Paxon civilization he describes
is of little interest to most computer scientists, its legislative system is an excellent model
for how to implement a distributed computer system in an asynchronous environment.
Indeed, some of the refinements the Paxons made to their protocol appear to be unknown
in the systems literature.
The author does give a brief discussion of the Paxon Parliament’s relevance to distributed computing in Section 4. Computer scientists will probably want to read that
section first. Even before that, they might want to read the explanation of the algorithm
for computer scientists by Lampson [1996]. The algorithm is also described more formally
by De Prisco et al. [1997]. Ihave added further comments on the relation between the
ancient protocols and more recent work at the end of Section 4.
Keith Marzullo
University of California, San Diego
Authors’ address:Systems Research Center, Digital Equipment Corporation, 130 Lytton Avenue,
Palo Alto, CA 94301.
Permission to copy without fee all or part of this material is granted provided that the copies are
not made or distributed for direct commercial advantage, the ACM copyright notice and the title
of the publication and its date appear, and notice is given that copying is by permission of the
Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or
specific permission.
c 1998 ACM 0000-0000/98/0000-0000 $00.002 · Leslie Lamport
1 The Problem
1.1 The Island of Paxos
Early in this millennium, the Aegean island of Paxos was a thriving mercantile center.1
Wealth led to political sophistication, and the Paxons replaced their ancient
theocracy with a parliamentary form of government. But trade came before civic
duty, and no one in Paxos was willingto devote his life to Parliament. The Paxon
Parliament had to function even though legislators continually wandered in and out
of the parliamentary Chamber.
The problem of governing with a part-time parliament bears a remarkable correspondence to the problem faced by today’s fault-tolerant distributed systems, where
legislators correspond to processes and leaving the Chamber corresponds to failing.
The Paxons’ solution may therefore be of some interest to computer scientists. I
present here a short history of the Paxos Parliament’s protocol, followed by an even
shorter discussion of its relevance for distributed systems.
Paxon civilization was destroyed by a foreign invasion, and archeologists have just
recently begun to unearth its history. Our knowledge of the Paxon Parliament is
therefore fragmentary. Although the basic protocols are known, we are ignorant of
many details. Where such details are of interest, I will take the liberty of speculating
on what the Paxons might have done.
1.2 Requirements
Parliament’s primary task was to determine the law of the land, which was defined
by the sequence of decrees it passed. A modern parliament will employ a secretary
to record its actions, but no one in Paxos was willingto remain in the Chamber
throughout the session to act as secretary. Instead, each Paxon legislator maintained a ledger in which he recorded the numbered sequence of decrees that were
passed. For example, legislator Λ˘ινχ∂’s ledger had the entry
155: The olive tax is 3 drachmas per ton
if she believed that the 155th decree passed by Parliament set the tax on olives to 3
drachmas per ton. Ledgers were written with indelible ink, and their entries could
not be changed.
The first requirement of the parliamentary protocol was the consistency of ledgers,
meaning that no two ledgers could contain contradictory information. If legislator
Φισ∂ρ had the entry
132: Lamps must use only olive oil
in his ledger, then no other legislator’s ledger could have a different entry for decree
132. However, another legislator might have no entry in his ledger for decree 132 if
he hadn’t yet learned that the decree had been passed.
Consistency of ledgers was not sufficient, since it could be trivially fulfilled by
leaving all ledgers blank. Some requirement was needed to guarantee that decrees
1
It should not be confused with the Ionian island of Paxoi, whose name is sometimes corrupted
to Paxos.The Part-Time Parliament · 3
were eventually passed and recorded in ledgers. In modern parliaments, the passing
of decrees is hindered by disagreement among legislators. This was not the case
in Paxos, where an atmosphere of mutual trust prevailed. Paxon legislators were
willingto pass any decree that was proposed. However, their peripatetic propensity
posed a problem. Consistency would be lost if one group of legislators passed the
decree
37: Painting on temple walls is forbidden
and then left for a banquet, whereupon a different group of legislators entered
the Chamber and, knowingnothingabout what had just happened, passed the
conflictingdecree
37: Freedom of artistic expression is guaranteed
Progress could not be guaranteed unless enough legislators stayed in the Chamber for a long enough time. Because Paxon legislators were unwilling to curtail
their outside activities, it was impossible to ensure that any decree would ever be
passed. However, legislators were willing to guarantee that, while in the Chamber,
they and their aides would act promptly on all parliamentary matters. This guarantee allowed the Paxons to devise a parliamentary protocol satisfyingthe following
progress condition.
If a majority of the legislators2
were in the Chamber and no one entered
or left the Chamber for a sufficiently longperiod of time then any decree
proposed by a legislator in the Chamber would be passed, and every
decree that had been passed would appear in the ledger of every legislator
in the Chamber.
1.3 Assumptions
The requirements of the parliamentary protocol could be achieved only by providing
the legislators with the necessary resources. Each legislator received a sturdy ledger
in which to record the decrees, a pen, and a supply of indelible ink. Legislators might
forget what they had been doing if they left the Chamber,3
so they would write
notes in the back of the ledgers to remind themselves of important parliamentary
tasks. An entry in the list of decrees was never changed, but notes could be crossed
out. Achieving the progress condition required that legislators be able to measure
the passage of time, so they were given simple hourglass timers.
Legislators carried their ledgers at all times, and could always read the list of
decrees and any note that had not been crossed out. The ledgers were made of the
finest parchment and were used for only the most important notes. A legislator
would write other notes on a slip of paper, which he might (or might not) lose if
he left the Chamber.
The acoustics of the Chamber were poor, makingoratory impossible. Legislators
could communicate only by messenger, and were provided with funds to hire as
2
In translating the progress condition, I have rendered the Paxon word µαδζ∂ωριτ˘ιστ as majority
of the legislators. Alternative translations of this word have been proposed and are discussed in
Section 2.2.
3
In one tragic incident, legislator T ωυγ developed irreversible amnesia after being hit on the
head by a falling statue just outside the Chamber.4 · Leslie Lamport
many messengers as they needed. A messenger could be counted on not to garble
messages, but he might forget that he had already delivered a message, and deliver
it again. Like the legislators they served, messengers devoted only part of their
time to parliamentary duties. A messenger might leave the Chamber to conduct
some business—perhaps taking a six-month voyage—before delivering a message.
He might even leave forever, in which case the message would never be delivered.
Although legislators and messengers could enter and leave at any time, when
inside the Chamber they devoted themselves to the business of Parliament. While
they remained in the Chamber, messengers delivered messages in a timely fashion
and legislators reacted promptly to any messages they received.
The official records of Paxos claim that legislators and messengers were scrupulously honest and strictly obeyed parliamentary protocol. Most scholars discount
this as propaganda, intended to portray Paxos as morally superior to its eastern
neighbors. Dishonesty, although rare, undoubtedly did occur. However, because it
was never mentioned in official documents, we have little knowledge of how Parliament coped with dishonest legislators or messengers. What evidence has been
uncovered is discussed in Section 3.3.5.
2 The Single-Decree Synod
The Paxon Parliament evolved from an earlier ceremonial Synod of priests that
was convened every 19 years to choose a single, symbolic decree. For centuries, the
Synod had chosen the decree by a conventional procedure that required all priests
to be present. But as commerce flourished, priests began wandering in and out of
the Chamber while the Synod was in progress. Finally, the old protocol failed, and
a Synod ended with no decree chosen. To prevent a repetition of this theological
disaster, Paxon religious leaders asked mathematicians to formulate a protocol for
choosingthe Synod’s decree. The protocol’s requirements and assumptions were essentially the same as those of the later Parliament except that instead of containing
a sequence of decrees, a ledger would have at most one decree. The resulting Synod
protocol is described here; the Parliamentary protocol is described in Section 3.
Mathematicians derived the Synod protocol in a series of steps. First, they proved
results showingthat a protocol satisfyingcertain constraints would guarantee consistency and allow progress. A preliminary protocol was then derived directly from
these constraints. A restricted version of the preliminary protocol provided the
basic protocol that guaranteed consistency, but not progress. The complete Synod
protocol, satisfyingthe consistency and progress requirements, was obtained by
restrictingthe basic protocol.4
The mathematical results are described in Section 2.1, and the protocols are
described informally in Sections 2.2–2.4. A more formal description and correctness
proof of the basic protocol appears in the appendix.
4
The complete history of the Synod protocol’s discovery is not known. Like modern computer
scientists, Paxon mathematicians would describe elegant, logical derivations that bore no resemblance to how the algorithms were actually derived. However, it is known that the mathematical
results (Theorems 1 and 2 of Section 2.1) really did precede the protocol. They were discovered
when mathematicians, in response to the request for a protocol, were attempting to prove that a
satisfactory protocol was impossible.The Part-Time Parliament · 5
2.1 Mathematical Results
The Synod’s decree was chosen through a series of numbered ballots, where a ballot
was a referendum on a single decree. In each ballot, a priest had the choice only
of votingfor the decree or not voting.5
Associated with a ballot was a set of
priests called a quorum. A ballot succeeded iff (if and only if) every priest in the
quorum voted for the decree. Formally, a ballot B consisted of the followingfour
components. (Unless otherwise qualified, set is taken to mean finite set.
6
)
Bdec A decree (the one beingvoted on).
Bqrm A nonempty set of priests (the ballot’s quorum).
Bvot A set of priests (the ones who cast votes for the decree).7
Bbal A ballot number.
A ballot B was said to be successful iff Bqrm ⊆ Bvot, so a successful ballot was one
in which every quorum member voted.
Ballot numbers were chosen from an unbounded ordered set of numbers. If
B
bal > Bbal, then ballot B
was said to be later than ballot B. However, this
indicated nothingabout the order in which ballots were conducted; a later ballot
could actually have taken place before an earlier one.
Paxon mathematicians defined three conditions on a set B of ballots, and then
showed that consistency was guaranteed and progress was possible if the set of
ballots that had taken place satisfied those conditions. The first two conditions
were simple; they can be stated informally as follows.
B1(B) Each ballot in B has a unique ballot number.
B2(B) The quorums of any two ballots in B have at least one priest in common.
The third condition was more complicated. One Paxon manuscript contained the
following, rather confusing, statement of it.
B3(B) For every ballot B in B, if any priest in B’s quorum voted in an earlier
ballot in B, then the decree of B equals the decree of the latest of those
earlier ballots.
Interpretation of this cryptic text was aided by the manuscript pictured in Figure 1,
which illustrates condition B3(B) with a set B of five ballots for a Synod consisting
of the five priests A, B, Γ, ∆, and E. This set B contains five ballots, where for
each ballot, the set of voters is the subset of the priests in the quorum whose names
are enclosed in boxes. For example, ballot number 14 has decree α, a quorum
containingthree priests, and a set of two voters. Condition B3(B) has the form
“for every B in B: . . . ”, where “...” is a condition on ballot B. The conditions for
the five ballots B of Figure 1 are as follows.
5
Like some modern nations, Paxos had not fully grasped the nature of Athenian democracy.
6
Although Paxon mathematicians were remarkably advanced for their time, they obviously had
no knowledge of set theory. I have taken the liberty of translating the Paxon’s more primitive
notation into the language of modern set theory.
7
Only priests in the quorum actually voted, but Paxon mathematicians found it easier to convince
people that the protocol was correct if, in their proof, they allowed any priest to vote in any ballot.6 · Leslie Lamport
# decree quorum and voters
2 α ABΓ ∆
5 β A B Γ E
14 α B ∆ E
27 β A Γ ∆
29 β B Γ ∆
Fig. 1. Paxon manuscript showing a set B, consisting of five ballots, that satisfies conditions
B1(B)–B3(B). (Explanatory column headings have been added.)
2. Ballot number 2 is the earliest ballot, so the condition on that ballot is trivially
true.
5. None of ballot 5’s four quorum members voted in an earlier ballot, so the
condition on ballot 5 is also trivially true.
14. The only member of ballot 14’s quorum to vote in an earlier ballot is ∆, who
voted in ballot number 2, so the condition requires that ballot 14’s decree must
equal ballot 2’s decree.
27. (This is a successful ballot.) The members of ballot 27’s quorum are A, Γ, and
∆. Priest A did not vote in an earlier ballot, the only earlier ballot Γ voted in
was ballot 5, and the only earlier ballot ∆ voted in was ballot 2. The latest of
these two earlier ballots is ballot 5, so the condition requires that ballot 27’s
decree must equal ballot 5’s decree.
29. The members of ballot 29’s quorum are B, Γ, and ∆. The only earlier ballot
that B voted in was number 14, priest Γ voted in ballots 5 and 27, and ∆ voted
in ballots 2 and 27. The latest of these four earlier ballots is number 27, so the
condition requires that ballot 29’s decree must equal ballot 27’s decree.
To state B1(B)–B3(B) formally requires some more notation. A vote v was
defined to be a quantity consistingof three components: a priest v
pst, a ballot
number v
bal, and a decree v
dec. It represents a vote cast by priest v
pst for decree
v
dec in ballot number v
bal. The Paxons also defined null votes to be votes v with
v
bal = −∞ and v
dec = blank, where −∞ < ∞ for any ballot number b, and
blank is not a decree. For any priest p, they defined nullp to be the unique null
vote v with v
pst = p.
Paxon mathematicians defined a total orderingon the set of all votes, but part
of the manuscript containingthe definition has been lost. The remainingfragment
indicates that, for any votes v and v
, if v
bal < v
bal then v< b, or to be nullp
if there was no such vote. Since nullp is smaller than any real vote cast by p, this
means that MaxVote(b, p, B) is the largest vote in the set
{v ∈ Votes(B):(v
pst = p) ∧ (v
bal < b)}∪{nullp}
For any nonempty set Q of priests, MaxVote(b, Q, B) was defined to equal the
maximum of all votes MaxVote(b, p, B) with p in Q.
Conditions B1(B)–B3(B) are stated formally as follows.8
B1(B)

= ∀B,B
∈ B : (B
= B

) ⇒ (Bbal
= B

bal)
B2(B)

= ∀B,B
∈ B : Bqrm ∩B
qrm
= ∅
B3(B)

= ∀B ∈ B : (MaxVote(Bbal, Bqrm, B)
bal
= −∞) ⇒
(Bdec = MaxVote(Bbal, Bqrm, B)
dec)
Although the definition of MaxVote depends upon the orderingof votes, B1(B)
implies that MaxVote(b, Q, B)
dec is independent of how votes with equal ballot
numbers were ordered.
To show that these conditions imply consistency, the Paxons first showed that
B1(B)–B3(B) imply that, if a ballot B in B is successful, then any later ballot in
B is for the same decree as B.
Lemma If B1(B), B2(B), and B3(B) hold, then
((Bqrm ⊆ Bvot)∧(B

bal > Bbal)) ⇒ (B

dec = Bdec)
for any B, B
in B.
Proof of Lemma
For any ballot B in B, let Ψ(B, B) be the set of ballots in B later than B for a
decree different from B’s:
Ψ(B, B)

= {B
∈ B : (B

bal > Bbal)∧(B

dec
= Bdec)}
To prove the lemma, it suffices to show that if Bqrm ⊆ Bvot then Ψ(B, B) is empty.
The Paxons gave a proof by contradiction. They assumed the existence of a B with
Bqrm ⊆ Bvot and Ψ(B, B)
= ∅, and obtained a contradiction as follows.9
1. Choose C ∈ Ψ(B, B) such that Cbal = min{B
bal : B
∈ Ψ(B, B)}.
Proof: C exists because Ψ(B, B) is nonempty and finite.
2. Cbal > Bbal
Proof: By 1 and the definition of Ψ(B, B).
3. Bvot ∩ Cqrm
= ∅
Proof: By B2(B) and the hypothesis that Bqrm ⊆ Bvot.
8
I use the Paxon mathematical symbol ∆
=, which meant equals by definition.
9
Paxon mathematicians always provided careful, structured proofs of important theorems. They
were not as sophisticated as modern mathematicians, who can omit many details and write
paragraph-style proofs without ever making a mistake.8 · Leslie Lamport
4. MaxVote(Cbal, Cqrm, B)
bal ≥ Bbal
Proof: By 2, 3 and the definition of MaxVote(Cbal, Cqrm, B).
5. MaxVote(Cbal, Cqrm, B) ∈ Votes(B)
Proof: By 4 (which implies that MaxVote(Cbal, Cqrm, B) is not a null vote)
and the definition of MaxVote(Cbal, Cqrm, B).
6. MaxVote(Cbal, Cqrm, B)
dec = Cdec.
Proof: By 5 and B3(B).
7. MaxVote(Cbal, Cqrm, B)
dec
= Bdec
Proof: By 6, 1, and the definition of Ψ(B, B).
8. MaxVote(Cbal, Cqrm, B)
bal > Bbal
Proof: By 4, since 7 and B1(B) imply that MaxVote(Cbal, Cqrm, B)
bal
= Bbal.
9. MaxVote(Cbal, Cqrm, B) ∈ Votes(Ψ(B, B))
Proof: By 7, 8, and the definition of Ψ(B, B).
10. MaxVote(Cbal, Cqrm, B)
bal < Cbal
Proof: By definition of MaxVote(Cbal, Cqrm, B).
11. Contradiction
Proof: By 9, 10, and 1.
End Proof of Lemma
With this lemma, it was easy to show that, if B1–B3 hold, then any two successful
ballots are for the same decree.
Theorem 1. If B1(B), B2(B), and B3(B) hold, then
((Bqrm ⊆ Bvot)∧(B

qrm ⊆ B

vot)) ⇒ (B

dec = Bdec)
for any B, B
in B.
Proof of Theorem
If B
bal = Bbal, then B1(B) implies B
= B. If B
bal
= Bbal, then the theorem
follows immediately from the lemma.
End Proof of Theorem
The Paxons then proved a theorem assertingthat if there are enough priests in
the Chamber, then it is possible to conduct a successful ballot while preserving B1–
B3. Although this does not guarantee progress, it at least shows that a balloting
protocol based on B1–B3 will not deadlock.
Theorem 2. Let b be a ballot number and Q a set of priests such that b>Bbal
and Q∩Bqrm
= ∅ for all B ∈ B. If B1(B), B2(B), and B3(B) hold, then there is a
ballot B
with B
bal = b and B
qrm = B
vot = Q such that B1(B∪{B
}), B2(B∪{B
}),
and B3(B∪{B
}) hold.
Proof of Theorem
Condition B1(B∪{B
}) follows from B1(B), the choice of B
bal, and the assumption
about b. Condition B2(B∪{B
}) follows from B2(B), the choice of B
qrm, and the
assumption about Q. If MaxVote(b, Q, B)
bal = −∞ then let B
dec be any decree,
else let it equal MaxVote(b, Q, B)
dec. Condition B3(B∪{B
}) then follows from
B3(B).
End Proof of TheoremThe Part-Time Parliament · 9
2.2 The Preliminary Protocol
The Paxons derived the preliminary protocol from the requirement that conditions
B1(B)–B3(B) remain true, where B was the set of all ballots that had been or were
beingconducted. The definition of the protocol specified how the set B changed,
but the set was never explicitly calculated. The Paxons referred to B as a quantity
observed only by the gods, since it might never be known to any mortal.
Each ballot was initiated by a priest, who chose its number, decree, and quorum.
Each priest in the quorum then decided whether or not to vote in the ballot. The
rules determininghow the initiator chose a ballot’s number, decree, and quorum,
and how a priest decided whether or not to vote in a ballot were derived directly
from the need to maintain B1(B)–B3(B).
To maintain B1, each ballot had to receive a unique number. By remembering
(with notes in his ledger) what ballots he had previously initiated, a priest could
easily avoid initiatingtwo different ballots with the same number. To keep different
priests from initiatingballots with the same number, the set of possible ballot
numbers was partitioned amongthe priests. While it is not known how this was
done, an obvious method would have been to let a ballot number be a pair consisting
of an integer and a priest, using a lexicographical ordering, where
(13, Γρα˘ι) < (13, Λινσ˘ι) < (15, Γρα˘ι)
since Γ came before Λ in the Paxon alphabet. In any case, it is known that every
priest had an unbounded set of ballot numbers reserved for his use.
To maintain B2, a ballot’s quorum was chosen to contain a µαδζ∂ωριτ˘ιστ of
priests. Initially, µαδζ∂ωριτ˘ιστ just meant a simple majority. Later, it was observed that fat priests were less mobile and spent more time in the Chamber than
thin ones, so a µαδζ∂ωριτ˘ιστ was taken to mean any set of priests whose total
weight was more than half the total weight of all priests, rather than a simple majority of the priests. When a group of thin priests complained that this was unfair,
actual weights were replaced with symbolic weights based on a priest’s attendance
record. The primary requirement for a µαδζ∂ωριτ˘ιστ was that any two sets containinga µαδζ∂ωριτ˘ιστ of priests had at least one priest in common. To maintain
B2, the priest initiatinga ballot B chose Bqrm to be a majority set.
Condition B3 requires that if MaxVote(b, Q, B)
dec is not equal to blank, then
a ballot with number b and quorum Q must have decree MaxVote(b, Q, B)
dec. If
MaxVote(b, Q, B)
dec equals blank, then the ballot can have any decree. To maintain B3(B), before initiatinga new ballot with ballot number b and quorum Q, a
priest p had to find MaxVote(b, Q, B)
dec. To do this, p had to find MaxVote(b, q, B)
for each priest q in Q.
Recall that MaxVote(b, q, B) is the vote with the largest ballot number less than
b amongall the votes cast by q, or nullq if q did not vote in any ballot numbered
less than b. Priest p obtains MaxVote(b, q, B) from q by an exchange of messages.
Therefore, the first two steps in the protocol for conductinga single ballot initiated
by p are:10
10Priests p and q could be the same. For simplicity, the protocol is described with p sending
messages to himself in this case. In reality, a priest could talk to himself without the use of
messengers.10 · Leslie Lamport
(1) Priest p chooses a new ballot number b and sends a NextBallot(b) messag e to
some set of priests.
(2) A priest q responds to the receipt of a NextBallot(b) message by sending a
LastVote(b, v) message to p, where v is the vote with the largest ballot number
less than b that q has cast, or his null vote nullq if q did not vote in any ballot
numbered less than b.
Priest q must use notes in the back of his ledger to remember what votes he had
previously cast.
When q sends the LastVote(b, v) message, v equals MaxVote(b, q, B). But the
set B of ballots changes as new ballots are initiated and votes are cast. Since priest
p is going to use v as the value of MaxVote(b, q, B) when choosinga decree, to keep
B3(B) true it is necessary that MaxVote(b, q, B) not change after q has sent the
LastVote(b, v) message. To keep MaxVote(b, q, B) from changing, q must cast no
new votes with ballot numbers between v
bal and b. By sendingthe LastVote(b, v)
message, q is promisingnot to cast any such vote. (To keep this promise, q must
record the necessary information in his ledger.)
The next two steps in the ballotingprotocol (begun in step 1 by priest p) are:
(3) After receivinga LastVote(b, v) message from every priest in some majority
set Q, priest p initiates a new ballot with number b, quorum Q, and decree d,
where d is chosen to satisfy B3. He then records the ballot in the back of his
ledger and sends a BeginBallot(b, d) message to every priest in Q.
(4) Upon receipt of the BeginBallot(b, d) message, priest q decides whether or not
to cast his vote in ballot number b. (He may not cast the vote if doingso
would violate a promise implied by a LastVote(b

, v
) message he has sent for
some other ballot.) If q decides to vote for ballot number b, then he sends a
Voted(b, q) message to p and records the vote in the back of his ledger.
The execution of step 3 is considered to add a ballot B to B, where Bbal = b,
Bqrm = Q, Bvot = ∅ (no one has yet voted in this ballot), and Bdec = d. In step 4,
if priest q decides to vote in the ballot, then executingthat step is considered to
change the set B of ballots by adding q to the set Bvot of voters in the ballot B ∈ B.
A priest has the option not to vote in step 4, even if castinga vote would not
violate any previous promise. In fact, all the steps in this protocol are optional.
For example, a priest q can ignore a NextBallot(b) message instead of executing
step 2. Failure to take an action can prevent progress, but it cannot cause any
inconsistency because it cannot make B1(B)–B3(B) false. Since the only effect not
receiving a message can have is to prevent an action from happening, message loss
also cannot cause inconsistency. Thus, the protocol guarantees consistency even if
priests leave the chamber or messages are lost.
Receivingmultiple copies of a message can cause an action to be repeated. Except
in step 3, performingthe action a second time has no effect. For example, sending
several Voted(b, q) messages in step 4 has the same effect as sending just one.
The repetition of step 3 is prevented by usingthe entry made in the back of the
ledger when it is executed. Thus, the consistency condition is maintained even if a
messenger delivers the same message several times.
Steps 1–4 describe the complete protocol for initiatinga ballot and votingon it.The Part-Time Parliament · 11
All that remains is to determine the results of the ballotingand announce when a
decree has been selected. Recall that a ballot is successful iff every priest in the
quorum has voted. The decree of a successful ballot is the one chosen by the Synod.
The rest of the protocol is:
(5) If p has received a Voted(b, q) message from every priest q in Q (the quorum
for ballot number b), then he writes d (the decree of that ballot) in his ledger
and sends a Success(d) message to every priest.
(6) Upon receivinga Success(d) message, a priest enters decree d in his ledger.
Steps 1–6 describe how an individual ballot is conducted. The preliminary protocol
allows any priest to initiate a new ballot at any time. Each step maintains B1(B)–
B3(B), so the entire protocol also maintains these conditions. Since a priest enters
a decree in his ledger only if it is the decree of a successful ballot, Theorem 1 implies
that the priests’ ledgers are consistent. The protocol does not address the question
of progress.
In step 3, if the decree d is determined by condition B3, then it is possible that
this decree is already written in the ledger of some priest. That priest need not be
in the quorum Q; he could have left the Chamber. Thus, consistency would not be
guaranteed if step 3 allowed any greater freedom in choosing d.
2.3 The Basic Protocol
In the preliminary protocol, a priest must record (i) the number of every ballot
he has initiated, (ii) every vote he has cast, and (iii) every LastVote message he
has sent. Keepingtrack of all this information would have been difficult for the
busy priests. The Paxons therefore restricted the preliminary protocol to obtain
the more practical basic protocol in which each priest p had to maintain only the
followinginformation in the back of his ledger:
lastTried[p] The number of the last ballot that p tried to initiate, or −∞ if there
was none.
prevVote[p] The vote cast by p in the highest-numbered ballot in which he voted,
or −∞ if he never voted.
nextBal[p] The largest value of b for which p has sent a LastVote(b, v) message,
or −∞ if he has never sent such a message.
Steps 1–6 of the preliminary protocol describe how a single ballot is conducted by
its initiator, priest p. The preliminary protocol allows p to conduct any number
of ballots concurrently. In the basic protocol, he conducts only one ballot at a
time—ballot number lastTried[p]. After p initiates this ballot, he ignores messages
that pertain to any other ballot that he had previously initiated. Priest p keeps all
information about the progress of ballot number lastTried[p] on a slip of paper. If
he loses that slip of paper, then he stops conductingthe ballot.
In the preliminary protocol, each LastVote(b, v) message sent by a priest q represents a promise not to vote in any ballot numbered between v
bal and b. In the
basic protocol, it represents the stronger promise not to cast a new vote in any
ballot numbered less than b. This stronger promise might prevent him from casting
a vote in step 4 of the basic protocol that he would have been allowed to cast in
the preliminary protocol. However, since the preliminary protocol always gives q12 · Leslie Lamport
the option of not castinghis vote, the basic protocol does not require him to do
anythingnot allowed by the preliminary protocol.
Steps 1–6 of the preliminary protocol become the followingsix steps for conductinga ballot in the basic protocol. (All information used by p to conduct the ballot,
other than lastTried[p], prevVote[p], and nextBal[p], is kept on a slip of paper.)
(1) Priest p chooses a new ballot number b greater than lastTried[p], setslastTried[p]
to b, and sends a NextBallot(b) message to some set of priests.
(2) Upon receipt of a NextBallot(b) message from p with b > nextBal[q], priest q
sets nextBal[q] to b and sends a LastVote(b, v) message to p, where v equals
prevVote[q]. (A NextBallot(b) message is ignored if b ≤ nextBal[q].)
(3) After receivinga LastVote(b, v) message from every priest in some majority
set Q, where b = lastTried[p], priest p initiates a new ballot with number b,
quorum Q, and decree d, where d is chosen to satisfy B3. He then sends a
BeginBallot(b, d) message to every priest in Q.
(4) Upon receipt of a BeginBallot(b, d) message with b = nextBal[q], priest q casts
his vote in ballot number b, sets prevVote[q] to this vote, and sends a Voted(b, q)
message to p. (A BeginBallot(b, d) message is ignored if b
= nextBal[q].)
(5) If p has received a Voted(b, q) message from every priest q in Q (the quorum
for ballot number b), where b = lastTried[p], then he writes d (the decree of
that ballot) in his ledger and sends a Success(d) message to every priest.
(6) Upon receivinga Success(d) message, a priest enters decree d in his ledger.
The basic protocol is a restricted version of the preliminary protocol, meaning
that every action allowed by the basic protocol is also allowed by the preliminary
protocol. Since the preliminary protocol satisfies the consistency condition, the
basic protocol also satisfies that condition. Like the preliminary protocol, the basic
protocol does not require that any action ever be taken, so it does not addresses
the question of progress.
The derivation of the basic protocol from B1–B3 made it obvious that the consistency condition was satisfied. However, some similarly “obvious” ancient wisdom
had turned out to be false, and skeptical citizens demanded a more rigorous proof.
Their Paxon mathematicians’ proof that the protocol satisfies the consistency condition is reproduced in the appendix.
2.4 The Complete Synod Protocol
The basic protocol maintains consistency, but it cannot ensure any progress because
it states only what a priest may do; it does not require him to do anything. The
complete protocol consists of the same six steps for conductinga ballot as the basic
protocol. To help achieve progress, it includes the obvious additional requirement
that priests perform steps 2–6 of the protocol as soon as possible. However, to
meet the progress condition, it is necessary that some priest be required to perform
step 1, which initiates a ballot. The key to the complete protocol lay in determining
when a priest should initiate a ballot.
Never initiatinga ballot will certainly prevent progress. However, initiatingtoo
may ballots can also prevent progress. If b is larger than any other ballot number,
then the receipt of a NextBallot(b) message by priest q in step 2 may elicit aThe Part-Time Parliament · 13
promise that prevents him from votingin step 4 for any previously initiated ballot.
Thus, the initiation of a new ballot can prevent any previously initiated ballot from
succeeding. If new ballots are continually initiated with increasing ballot numbers
before the previous ballots have a chance to succeed, then no progress might be
made.
Achievingthe progress condition requires that new ballots be initiated until one
succeeds, but that they not be initiated too frequently. To develop the complete
protocol, the Paxons first had to know how longit took messengers to deliver
messages and priests to respond. They determined that a messenger who did not
leave the Chamber would always deliver a message within 4 minutes, and a priest
who remained in the Chamber would always perform an action within 7 minutes
of the event that caused the action.11 Thus, if p and q were in the Chamber when
some event caused p to send a message to q, and q responded with a reply to p, then
p would receive that reply within 22 minutes if neither messenger left the Chamber.
(Priest p would send the message within 7 minutes of the event, q would receive
the message within 4 more minutes, he would respond within 7 minutes, and the
reply would reach p within 4 more minutes.)
Suppose that only a single priest p was initiatingballots, and that he did so
by sendinga message to every priest in step 1 of the protocol. If p initiated a
ballot when a majority set of priests was in the chamber, then he could expect
to execute step 3 within 22 minutes of initiatingthe ballot, and to execute step 5
within another 22 minutes. If he was unable to execute the steps by those times,
then either some priest or messenger left the Chamber after p initiated the ballot,
or a larger-numbered ballot had previously been initiated by another priest (before
p became the only priest to initiate ballots). To handle the latter possibility, p had
to learn about any ballot numbers greater than lastTried[p] used by other priests.
This could be done by extendingthe protocol to require that if a priest q received
a NextBallot(b) or a BeginBallot(b, d) message from p with b < nextBal[q], then he
sent p a message containing nextBal[q]. Priest p would then initiate a new ballot
with a larger ballot number.
Still assumingthat p was the only priest initiatingballots, suppose that he were
required to initiate a new ballot iff (i) he had not executed step 3 or step 5 within
the previous 22 minutes, or (ii) he learned that another priest had initiated a
higher-numbered ballot. If the Chamber doors were locked with p and a majority
set of priests inside, then a decree would be passed and recorded in the ledgers of
all priests in the Chamber within 99 minutes. (It could take 22 minutes for p to
start the next ballot, 22 more minutes to learn that another priest had initiated
a larger-numbered ballot, then 55 minutes to complete steps 1–6 for a successful
ballot.) Thus, the progress condition would be met if only a single priest, who did
not leave the chamber, were initiatingballots.
The complete protocol therefore included a procedure for choosinga single priest,
called the president, to initiate ballots. In most forms of government, choosing a
11I am assuming a value of 30 seconds for the δζ∂ιφ˘ι, the Paxon unit of time. This value is within
the range determined from studies of hourglass shards. The reaction time of priests was so long
because they had to respond to every message within 7 minutes (14 δζ∂ιφ˘ι), even if a number of
messages arrived simultaneously.14 · Leslie Lamport
president can be a difficult problem. However, the difficultly arises only because
most governments require that there be exactly one president at any time. In the
United States, for example, chaos would result if some people thought Bush had
been elected president while others thought that Dukakis had, since one of them
might decide to sign a bill into law while the other decided to veto it. However, in the
Paxon Synod, havingmultiple presidents could only impede progress; it could not
cause inconsistency. For the complete protocol to satisfy the progress condition, the
method for choosingthe president needed only to satisfy the following presidential
selection requirement:
If no one entered or left the Chamber, then after T minutes exactly one
priest in the Chamber would consider himself to be the president.
If the presidential selection requirement were met, then the complete protocol would
have the property that if a majority set of priests were in the chamber and no one
entered or left the Chamber for T + 99 minutes, then at the end of that period
every priest in the Chamber would have a decree written in his ledger.
The Paxons chose as president the priest whose name was last in alphabetical
order amongthe names of all priests in the Chamber, though we don’t know exactly
how this was done. The presidential selection requirement would have been satisfied
if a priest in the Chamber sent a message containing his name to every other priest
at least once every T −11 minutes, and a priest considered himself to be president
iff he received no message from a “higher-named” priest for T minutes.
The complete Synod protocol was obtained from the basic protocol by requiring
priests to perform steps 2–6 promptly, addinga method for choosinga president who
initiated ballots, and requiringthe president to initiate ballots at the appropriate
times. Many details of the protocol are not known. I have described simple methods
for selectinga president and for decidingwhen the president should initiate a new
ballot, but they are undoubtedly not the ones used in Paxos. The rules I have given
require the president to keep initiatingballots even after a decree has been chosen,
thereby ensuringthat priests who have just entered the Chamber learn about the
chosen decree. There were obviously better ways to make sure priests learned about
the decree after it had been chosen. Also, in the course of selectinga president,
each priest probably sent his value of lastTried[p] to the other priests, allowingthe
president to choose a large enough ballot number on his first try.
The Paxons realized that any protocol to achieve the progress condition must
involve measuringthe passage of time.12 The protocols given above for selecting
a president and initiatingballots are easily formulated as precise algorithms that
set timers and perform actions when time-outs occur—assumingperfectly accurate
timers. A closer analysis reveals that such protocols can be made to work with
timers havinga known bound on their accuracy. The skilled glass blowers of Paxos
had no difficulty constructingsuitable hourglass timers.
Given the sophistication of Paxon mathematicians, it is widely believed that
they must have found an optimal algorithm to satisfy the presidential selection
requirement. We can only hope that this algorithm will be discovered in future
12However, many centuries were to pass before a rigorous proof of this result was given.[Fischer
et al. 1985]The Part-Time Parliament · 15
excavations on Paxos.
3 The Multi-Decree Parliament
When Parliament was established, a protocol to satisfy its consistency and progress
requirements was derived from the Synod protocol. The derivation and properties of
the original parliamentary protocol are described in Sections 3.1 and 3.2. Section 3.3
discusses the further evolution of the protocol.
3.1 The Protocol
Instead of passingjust one decree, the Paxon Parliament had to pass a series of
numbered decrees. As in the Synod protocol, a president was elected. Anyone who
wanted a decree passed would inform the president, who would assign a number
to the decree and attempt to pass it. Logically, the parliamentary protocol used a
separate instance of the complete Synod protocol for each decree number. However,
a single president was selected for all these instances, and he performed the first
two steps of the protocol just once.
The key to derivingthe parliamentary protocol is the observation that, in the
Synod protocol, the president does not choose the decree or the quorum until step 3.
A newly elected president p can send to some set of legislators a single message that
serves as the NextBallot(b) message for all instances of the Synod protocol. (There
are an infinite number of instances—one for each decree number.) A legislator q
can reply with a single message that serves as the LastVote messages for step 2 of
all instances of the Synod protocol. This message contains only a finite amount of
information, since q can have voted in only a finite number of instances.
When the new president has received a reply from every member of a majority
set, he is ready to perform step 3 for every instance of the Synod protocol. For
some finite number of instances (decree numbers), the choice of decree in step 3 will
be determined by B3. The president immediately performs step 3 for each of those
instances to try passingthese decrees. Then, whenever he receives a request to pass
a decree, he chooses the lowest-numbered decree that he is still free to choose, and
he performs step 3 for that decree number (instance of the Synod protocol) to try
to pass the decree.
The followingmodifications to this simple protocol lead to the actual Paxon
Parliament’s protocol.
—There is no reason to go through the Synod protocol for a decree number whose
outcome is already known. Therefore, if a newly elected president p has all
decrees with numbers less than or equal to n written in his ledger, then he sends
a NextBallot(b, n) message that serves as a NextBallot(b) message in all instances
of the Synod protocol for decree numbers larger than n. In his response to this
message, legislator q informs p of all decrees numbered greater thann that already
appear in q’s ledger (in addition to sending the usual LastVote information for
decrees not in his ledger), and he asks p to send him any decrees numbered n or
less that are not in his ledger.
—Suppose decrees 125 and 126 are introduced late Friday afternoon, decree 126 is
passed and is written in one or two ledgers, but before anything else happens, the
legislators all go home for the weekend. Suppose also that the following Monday,16 · Leslie Lamport
∆φωρκ is elected the new president and learns about decree 126, but she has no
knowledge of decree 125 because the previous president and all legislators who
had voted for it are still out of the Chamber. She will hold a ballot that passes
decree 126, which leaves a gap in the ledgers. Assigning number 125 to a new
decree would cause it to appear earlier in the ledger than decree 126, which had
been passed the previous week. Passingdecrees out of order in this way might
cause confusion—for example, if the citizen who proposed the new decree did so
because he knew decree 126 had already passed. Instead, ∆φωρκ would attempt
to pass
125: The ides of February is national olive day
a traditional decree that made absolutely no difference to anyone in Paxos. In
general, a new president would fill any gaps in his ledger by passing the “oliveday” decree.
The consistency and progress properties of the parliamentary protocol follow
immediately from the correspondingproperties of the Synod protocol from which
it was derived. To our knowledge, the Paxons never bothered writing a precise
description of the parliamentary protocol because it was so easily derived from the
Synod protocol.
3.2 Properties of the Protocol
3.2.1 The Ordering of Decrees Ballotingcould take place concurrently for many
different decree numbers, with ballots initiated by different legislators—each thinkinghe was president when he initiated the ballot. We cannot say precisely in what
order decrees would be passed, especially without knowinghow a president was
selected. However, there is one important property about the orderingof decrees
that can be deduced.
A decree was said to to be proposed when it was chosen by the president in step 3
of the correspondinginstance of the Synod protocol. The decree was said to be
passed when it was written for the first time in a ledger. Before a president could
propose any new decrees, he had to learn from all the members of a majority set
what decrees they had voted for. Any decree that had already been passed must
have been voted for by at least one legislator in the majority set. Therefore, the
president must have learned about all previously passed decrees before initiating
any new decree. The president would not fill a gap in the ledgers with an important
decree—that is, with any decree other than the “olive-day” decree. He would also
not propose decrees out of order. Therefore, the protocol satisfied the following
decree-ordering property.
If decrees A and B are important and decree A was passed before decree
B was proposed, then A has a lower decree number than B.
3.2.2 Behind Closed Doors Although we don’t know the details involved in
choosinga new president, we do know exactly how Parliament functioned when
the president had been chosen and no one was enteringor leavingthe Chamber.
Upon receivinga request to pass a decree—either directly from a citizen or relayed
from another legislator—the president assigned the decree a number and passed it
with the following exchange of messages. (The numbers refer to the corresponding
steps in the Synod protocol.)The Part-Time Parliament · 17
(3) The president sent a BeginBallot message to each legislator in a quorum.
(4) Each legislator in the quorum sent a Voted message to the president.
(5) The president sent a Success message to every legislator.
This is a total of three message delays and about 3N messages, assuming a
parliament of N legislators and a quorum of about N/2. Moreover, if Parliament
was busy, the president would combine the BeginBallot message for one decree with
the Success message for a previous one, for a total of only 2N messages per decree.
3.3 Further Developments
Governingthe island turned out to be a more complex task than the Paxons realized.
A number of problems arose whose solutions required changes to the protocol. The
most important of these changes are described below.
3.3.1 Picking a President The president of parliament was originally chosen
by the method that had been used in the Synod, which was based purely on the
alphabetical orderingof names. Thus, when legislator Ωκι returned from a sixmonth vacation, he was immediately made president—even though he had no idea
what had happened in his absence. Parliamentary activity came to a halt while
Ωκι, who was a slow writer, laboriously copied six months worth of decrees to bring
his ledger up to date.
This incident led to a debate about the best way to choose a president. Some
Paxons urged that once a legislator became president, he should remain president
until he left the Chamber. An influential group of citizens wanted the richest
legislator in the Chamber to be president, since he could afford to hire more scribes
and other servants to help him with the presidential duties. They argued that once
a rich legislator had brought his ledger up to date, there was no reason for him
not to assume the presidency. Others, however, argued that the most upstanding
citizen should be made president, regardless of wealth. Upstanding probably meant
less likely to be dishonest, although no Paxon would publicly admit the possibility
of official malfeasance. Unfortunately, the outcome of this debate is not known; no
record exists of the presidential selection protocol that was ultimately used.
3.3.2 Long Ledgers As the years progressed and Parliament passed more and
more decrees, Paxons had to pore over an ever longer list of decrees to find the
current olive tax or what color goat could be sold. A legislator who returned to the
Chamber after an extended voyage had to do quite a bit of copyingto bringhis
ledger up to date. Eventually, the legislators were forced to convert their ledgers
from lists of decrees into law books that contained only the current state of the law
and the number of the last decree whose passage was reflected in that state.
To learn the current olive tax, one looked in the law book under “taxes”; to learn
what color goat could be sold, one looked under “mercantile law”. If a legislator’s
ledger contained the law through decree 1298 and he learned that decree 1299 set
the olive tax to 6 drachmas per ton, he just changed the entry for the olive-tax law
and noted that his ledger was complete through decree 1299. If he then learned
about decree 1302, he would write it down in the back of the ledger and wait until
he learned about decrees 1300 and 1301 before incorporatingdecree 1302 into the
law book.18 · Leslie Lamport
To enable a legislator who had been gone for a short time to catch up without
copyingthe entire law book, legislators kept a list of the past week’s decrees in the
back of the book. They could have kept this list on a slip of paper, but it was
convenient for a legislator to enter decrees in the back of the ledger as they were
passed and update the law book only two or three times a week.
3.3.3 Bureaucrats As Paxos prospered, legislators became very busy. Parliament could no longer handle all details of government, so a bureaucracy was established. Instead of passinga decree to declare whether each lot of cheese was fit
for sale, Parliament passed a decree appointinga cheese inspector to make those
decisions.
It soon became evident that selectingbureaucrats was not as simple as it first
seemed. Parliament passed a decree making ∆˘ικστρα the first cheese inspector.
After some months, merchants complained that ∆˘ικστρα was too strict and was
rejectingperfectly good cheese. Parliament then replaced him by passingthe decree
1375: Γωυδα is the new cheese inspector
But ∆˘ικστρα did not pay close attention to what Parliament did, so he did not learn
of this decree right away. There was a period of confusion in the cheese market when
both ∆˘ικστρα and Γωυδα were inspectingcheese and makingconflictingdecisions.
To prevent such confusion, the Paxons had to guarantee that a position could
be held by at most one bureaucrat at any time. To do this, a president included
as part of each decree the time and date when it was proposed. A decree making
∆˘ικστρα the cheese inspector might read
2716: 8:30 15 Jan 72—∆˘ικστρα is cheese inspector for 3 months
This declares his term to begin either at 8:30 on 15 January or when the previous
inspector’s term ended—whichever was later. His term would end at 8:30 on 15
March, unless he explicitly resigned by asking the president to pass a decree like
2834: 9:15 3 Mar 72—∆˘ικστρα resigns as cheese inspector
A bureaucrat was appointed for a short term, so he could be replaced quickly—
for example, if he left the island. Parliament would pass a decree to extend the
bureaucrat’s term if he was doinga satisfactory job.
A bureaucrat needed to tell time to determine if he currently held a post. Mechanical clocks were unknown on Paxos, but Paxons could tell time accurately to
within 15 minutes by the position of the sun or the stars.13 If ∆˘ικστρα’s term
began at 8:30, he would not start inspecting cheese until his celestial observations
indicated that it was 8:45.
It is easy to make this method of appointingbureaucrats work if higher-numbered
decrees always have later proposal times. But what if Parliament passed the decrees
2854: 9:45 9 Apr 78—Φρανσζ is wine taster for 2 months
2855: 9:20 9 Apr 78—Πνυλ˘ι is wine taster for 1 month
13Cloudy days are rare in Paxos’s balmy climate.The Part-Time Parliament · 19
that were proposed between 9:30 and 9:35 by different legislators who both thought
they were president? Such out-of-order proposal times are easily prevented because
the parliamentary protocol satisfies the followingproperty.
If two decrees are passed by different presidents, then one of the presidents proposed his decree after learningthat the other decree had been
proposed.
To see that this property is satisfied, suppose that ballot number b was successful
for decree D, ballot number b

was successful for decree D
, and b prevBal[p].
– Send a LastVote(nextBal[p], v) message to priest owner(nextBal[p]), where
v
pst = p, v
bal = prevBal[p], and v
dec = prevDec[p].The Part-Time Parliament · 27
Receive LastVote(b, v) Message
If b = lastTried[p] and status[p] = trying, then
– Set prevVotes[p] to the union of its original value and {v}.
Start Polling Majority Set Q
Enabled when status[p] = trying and Q ⊆ {v
pst : v ∈ prevVotes[p]}, where Q is a
majority set.
– Set status[p] to polling.
– Set quorum[p] to Q.
– Set voters[p] to ∅.
– Set decree[p] to a decree d chosen as follows: Let v be the maximum element
of prevVotes[p]. If v
bal
= −∞ then d = v
dec, else d can equal any decree.
– Set B to the union of its former value and {B}, where Bdec = d, Bqrm = Q,
Bvot = ∅, and Bbal = lastTried[p].
Send BeginBallot Message
Enabled when status[p] = polling.
– Send a BeginBallot(lastTried[p], decree[p]) message to any priest in quorum[p].
Receive BeginBallot(b, d) Message
If b = nextBal[p] > prevBal[p] then
– Set prevBal[p] to b.
– Set prevDec[p] to d.
– If there is a ballot B in B with Bbal = b [there will be], then choose any such
B [there will be only one] and let the new value of B be obtained from its old
value by setting Bvot equal to the union of its old value and {p}.
Send Voted Message
Enabled whenever prevBal[p]
= −∞.
– Send a Voted(prevBal[p], p) message to owner(prevBal[p]).
Receive Voted(b, q) Message
If b = lastTried[p] and status[p] = polling, then
– Set voters[p] to the union of its old value and {q}
Succeed
Enabled whenever status[p] = polling, quorum[p] ⊆ voters[p], and outcome[p] =
blank.
– Set outcome[p] to decree[p].
Send Success Message
Enabled whenever outcome[p]
= blank.
– Send a Success(outcome[p]) message to any priest.
Receive Success(d) Message
If outcome[p] = blank, then28 · Leslie Lamport
– Set outcome[p] to d.
This algorithm is an abstract description of the real protocol performed by Paxon
priests. Do the algorithm’s actions accurately model the actions of the real priests?
There were three kinds of actions that a priest could perform “atomically”: receivinga message, writinga note or ledger entry, and sendinga message. Each of these
is represented by a single action of the algorithm, except that Receive actions both
receive a message and set a variable. We can pretend that the receipt of a message
occurred when a priest acted upon the message; if he left the Chamber before actingupon it, then we can pretend that the message was never received. Since this
pretense does not affect the consistency condition, we can infer the consistency of
the basic Synod protocol from the consistency of the algorithm.
A2 Proof of Consistency
To prove the consistency condition, it is necessary to show that whenever outcome[p]
and outcome[q] are both different from blank, they are equal. A rigorous correctness proof requires a complete description of the algorithm. The description given
above is almost complete. Missingis a variable M whose value is the multiset of
all messages in transit.15 Each Send action adds a message to this multiset and
each Receive action removes one. Also needed are actions to represent the loss
and duplication of messages, as well as a Forget action that represents a priest
losinghis slip of paper.
With these additions, we get an algorithm that defines a set of possible behaviors,
in which each change of state corresponds to one of the allowed actions. The Paxons
proved correctness by findinga predicate I such that
(1) I is true initially.
(2) I implies the desired correctness condition.
(3) Each allowed action leaves I true.
The predicate I was written as a conjunction I1∧...∧I7, where I1–I5 were in turn
the conjunction of predicates I1(p)–I5(p) for all priests p. Although most variables
are mentioned in several of the conjuncts, each variable except status[p] is naturally
associated with one conjunct, and each conjunct can be thought of as a constraint
on its associated variables. The definitions of the individual conjuncts of I are given
below, where a list of items marked by ∧ symbols denotes the conjunction of those
items. The variables associated with a conjunct are listed in bracketed comments.
I1(p)

= [Associated variable: outcome[p] ]
(outcome[p]
= blank) ⇒ ∃B ∈ B : (Bqrm ⊆ Bvot)∧(Bdec = outcome[p])
I2(p)

= [Associated variable: lastTried[p] ]
∧ owner(lastTried[p]) = p
∧ ∀B ∈ B : (owner(Bbal) = p) ⇒
∧ Bbal ≤ lastTried[p]
∧ (status[p] = trying) ⇒ (Bbal < lastTried[p])
15A multiset is a set that may contain multiple copies of the same element.The Part-Time Parliament · 29
I3(p)

= [Associated variables: prevBal[p], prevDec[p], nextBal[p] ]
∧ prevBal[p] = MaxVote(∞, p, B)
bal
∧ prevDec[p] = MaxVote(∞, p, B)
dec
∧ nextBal[p] ≥ prevBal[p]
I4(p)

= [Associated variable: prevVotes[p] ]
(status[p]
= idle) ⇒
∀v ∈ prevVotes[p] : ∧ v = MaxVote(lastTried[p], vpst, B)
∧ nextBal[v
pst] ≥ lastTried[p]
I5(p)

= [Associated variables: quorum[p], voters[p], decree[p] ]
(status[p] = polling) ⇒
∧ quorum[p] ⊆ {v
pst : v ∈ prevVotes[p]}
∧ ∃B ∈ B : ∧ quorum[p] = Bqrm
∧ decree[p] = Bdec
∧ voters[p] ⊆ Bvot
∧ lastTried[p] = Bbal
I6

= [Associated variable: B]
∧ B1(B)∧B2(B)∧B3(B)
∧ ∀B ∈ B : Bqrm is a majority set
I7

= [Associated variable: M]
∧ ∀NextBallot(b) ∈ M : (b ≤ lastTried[owner(b)])
∧ ∀LastVote(b, v) ∈ M : ∧ v = MaxVote(b, vpst, B)
∧ nextBal[v
pst] ≥ b
∧ ∀BeginBallot(b, d) ∈ M : ∃B ∈ B : (Bbal = b)∧(Bdec = d)
∧ ∀Voted(b, p) ∈ M : ∃B ∈ B : (Bbal = b)∧(p ∈ Bvot)
∧ ∀Success(d) ∈ M : ∃p : outcome[p] = d
= blank
The Paxons had to prove that I satisfies the three conditions given above. The
first condition, that I holds initially, requires checkingthat each conjunct is true for
the initial values of all the variables. While not stated explicitly, these initial values
can be inferred from the variables’ descriptions, and checkingthe first condition is
straightforward. The second condition, that I implies consistency, follows from I1,
the first conjunct of I6, and Theorem 1. The hard part was provingthe third
condition, the invariance of I, which meant provingthat I is left true by every
action. This condition is proved by showingthat, for each conjunct of I, executing
any action when I is true leaves that conjunct true. The proofs are sketched below.
I1(p) B is changed only by addinga new ballot or addinga new priest to Bvot for
some B ∈ B, neither of which can falsify I1(p). The value of outcome[p] is changed
only by the Succeed and Receive Success Message actions. The enablingcondition and I5(p) imply that I1(p) is left true by the Succeed action. The enabling
condition, I1(p), and the last conjunct of I7 imply that I1(p) is left true by the
Receive Success Message action.30 · Leslie Lamport
I2(p) This conjunct depends only on lastTried[p], status[p], and B. Only the Try
New Ballot action changes lastTried[p], and only that action can set status[p] to
trying. Since the action increases lastTried[p] to a value b with owner(b) = p,
it leaves I2(p) true. A completely new element is added to B only by a Start
Polling action; the first conjunct of I2(p) and the specification of the action imply
that addingthis new element does not falsify the second conjunct of I2(p). The
only other way B is changed is by adding a new priest to Bvot for some B ∈ B,
which does not affect I2(p).
I3(p) Since votes are never removed from B, the only action that can change
MaxVote(∞, p, B) is one that adds to B a vote cast by p. Only a Receive
BeginBallot Message action can do that, and only that action changes prevBal[p]
and prevDec[p]. The BeginBallot conjunct of I7 implies that this action actually
does add a vote to B, and B1(B) (the first conjunct of I6) implies that there is only
one ballot to which the vote can be added. The enablingcondition, the assumption that I3(p) holds before executingthe action, and the definition of MaxVote
then imply that the action leaves the first two conjuncts of I3(p) true. The third
conjunct is left true because prevBal[p] is changed only by setting it to nextBal[p],
and nextBal[p] is never decreased.
I4(p) This conjunct depends only upon the values of status[p], prevVotes[p],
lastTried[p], nextBal[q] for some priests q, and B. The value of status[p] is changed
from idle to notidle only by a Try New Ballot action, which sets prevVotes[p] to
∅, making I4(p) vacuously true. The only other actions that change prevVotes[p]
are the Forget action, which leaves I4(p) true because it sets status[p] to idle,
and the Receive LastVote Message action. It follows from the enablingcondition and the LastVote conjunct of I7 that the Receive LastVote Message action preserves I4(p). The value of lastTried[p] is changed only by the Try New
Ballot action, which leaves I4(p) true because it sets status[p] to trying. The
value of nextBal[q] can only increase, which cannot make I4(p) false. Finally,
MaxVote(lastTried[p], vpst, B) can be changed only if v
pst is added to Bvot for
some B ∈ B with Bbal < lastTried[p]. But v
pst is added to Bvot (by a Receive
BeginBallot Message action) only if nextBal[v
pst] = Bbal, in which case I4(p)
implies that Bbal ≥ lastTried[p].
I5(p) The value of status[p] is set to polling only by the Start Polling action.
This action’s enablingcondition guarantees that the first conjunct becomes true,
and it adds the ballot to B that makes the second conjunct true. No other action
changes quorum[p], decree[p], orlastTried[p] while leaving status[p] equal to polling.
The value of prevVotes[p] cannot be changed while status[p] = polling, and B is
changed only by addingnew elements or by addinga new priest to Bvot. The only
remainingpossibility for falsifying I5(p) is the addition of a new element to voters[p]
by the Receive Voted Message action. The Voted conjunct of I7, B1(B) (the
first conjunct of I6), and the action’s enablingcondition imply that the element
added to voters[p] is in Bvot, where B is the ballot whose existence is asserted in
I5(p).
I6 Since Bbal and Bqrm are never changed for any B ∈ B, the only way B1(B),
B2(B), and the second conjunct of I6 can be falsified is by addinga new ballot toThe Part-Time Parliament · 31
B, which is done only by the Start Polling Majority Set Q action when status[p]
equals trying. It follows from the second conjunct of I2(p) that this action leaves
B1(B) true; and the assertion, in the enablingcondition, that Q is a majority set
implies that the action leaves B2(B) and the second conjunct of I6 true. There
are two possible ways of falsifying B3(B): changing MaxVote(Bbal, Bqrm, B) by
addinga new vote to B, and addinga new ballot to B. A new vote is added
only by the Receive BeginBallot Message action, and I3(p) implies that the
action adds a vote later than any other vote cast by p in B, so it cannot change
MaxVote(Bbal, Bqrm, B) for any B inB. Conjunct I4(p) implies that the new ballot
added by the Start Polling action does not falsify B3(B).
I7 I7 can be falsified either by addinga new message to M or by changing the
value of another variable on which I7 depends. Since lastTried[p] and nextBal[p]
are never decreased, changing them cannot make I7 false. Since outcome[p] is never
changed if its value is not blank, changing it cannot falsify I7. Since B is changed
only by addingballots and addingvotes, the only change to it that can make I7
false is the addition of a vote by v
pst that makes the LastVote(b, v) conjunct false by
changing MaxVote(b, vpst, B). This can happen only if v
pst votes in a ballot B with
Bbal < b. But v
pst can vote only in ballot number nextBal[v
pst], and the assumption
that this conjunct holds initially implies that nextBal[v
pst] ≥ b. Therefore, we need
check only that every message that is sent satisfies the condition in the appropriate
conjunct of I7.
NextBallot : Follows from the definition of the Send NextBallot Message action
and the first conjunct of I2(p).
LastVote: The enablingcondition of the Send LastVote Message action and
I3(p) imply that MaxVote(nextBal[p], p, B) = MaxVote(∞, p, B), from which it
follows that the LastVote message sent by the action satisfies the condition in I7.
BeginBallot : Follows from I5(p) and the definition of the Send BeginBallot
Message action.
Voted : Follows from I3(p), the definition of MaxVote, and the definition of the
Send Voted Message action.
Success : Follows from the definition of Send Success Message.
ACKNOWLEDGMENTS
Daniel Duchamp pointed out to me the need for a new state-machine implementation. Discussions with Mart´ın Abadi, Andy Hisgen, Tim Mann, and Garret Swart
led me to Paxos. Λων´ιδας Γ κ´ιµπας provided invaluable assistance with the Paxon
dialect.
References
Bernstein, P. A., Hadzilacos, V., and Goodman, N. 1987. Concurrency Control and Recovery in Database Systems. Addison-Wesley, Reading, Massachusetts.
De Prisco, R., Lampson, B., and Lynch, N. 1997. Revisiting the Paxos algorithm. In
M. Mavronicolas and P. Tsigas (Eds.), Proceedings of the 11th International Workshop on Distributed Algorithms (WDAG 97), Volume 1320 of Lecture Notes in Computer32 · Leslie Lamport
Science, Saarbruken, Germany, pp. 111–125. Springer-Verlag.
Dijkstra, E. W. 1974. Self-stabilizing systems in spite of distributed control. Commun.
ACM 17, 11 (Nov.), 643–644.
Dwork, C., Lynch, N., and Stockmeyer, L. 1988. Consensus in the presence of partial
synchrony. Journal of the ACM 35, 2 (April), 288–323.
Fekete, A., Lynch, N., and Shvartsman, A. 1997. Specifying and using a partitionable
group communication service. In Proceedings of the Sixteenth Annual ACM Symposium on
Principles of Distributed Computing, pp. 53–62. ACM Press.
Fischer, M. J., Lynch, N., and Paterson, M. S. 1985. Impossibility of distributed consensus
with one faulty process. Journal of the ACM 32, 2 (April), 374–382.
Gray, C. G. and Cheriton, D. R. 1989. Leases:An efficient fault-toerant mechanism for distributed file cache consistency. In Proceedings of the Twelfth ACM Symposium on Operating
Systems Principles, New York, pp. 202–210. ACM.
Keidar, I. and Dolev, D. 1996. Efficient message ordering in dynamic networks. In Proceedings
of the 15th Annual ACM Symposium on Principles of Distributed Computing. ACM.
Ladin, R., Liskov, B., Shrira, L., and Ghemawat, S. 1992. Providing high availability using
lazy replication. ACM Transactions on Computer Systems 10, 4 (Nov.), 360–391.
Lamport, L. 1978. Time, clocks, and the ordering of events in a distributed system. Commun.
ACM 21, 7 (July), 558–565.
Lamport, L. 1984. Using time instead of timeout for fault-tolerant distributed systems. ACM
Trans. on Programm. Lang. Syst. 6, 2 (April), 254–280.
Lampson, B. W. 1996. How to build a highly available system using consensus. In O. Babaoglu
and K. Marzullo (Eds.), Distributed Algorithms, Volume 1151 of Lecture Notes in Computer Science, Berlin, pp. 1–17. Springer-Verlag.
Oki, B. M. and Liskov, B. H. 1988. Viewstamped replication:A new primary copy method to
support highly-available distributed systems. In Proceedings of the Seventh Annual ACM
Symposium on Principles of Distributed Computing, pp. 8–17. ACM Press.
Schneider, F. B. 1990. Implementing fault-tolerant services using the state machine approach:
A tutorial. ACM Computing Surveys 22, 4 (Dec.), 299–319.
Skeen, M. D. 1982. Crash recovery in a distributed database system. Ph. D. thesis, University
of California, Berkeley.
Received January 1990; Accepted March 1998

How to Format Lyrics:

  • Type out all lyrics, even repeating song parts like the chorus
  • Lyrics should be broken down into individual lines
  • Use section headers above different song parts like [Verse], [Chorus], etc.
  • Use italics (<i>lyric</i>) and bold (<b>lyric</b>) to distinguish between different vocalists in the same song part
  • If you don’t understand a lyric, use [?]

To learn more, check out our transcription guide or visit our transcribers forum

About

Have the inside scoop on this song?
Sign up and drop some knowledge

Q&A

Find answers to frequently asked questions about the song and explore its deeper meaning

Credits
Tags
Comments