Understanding “Impossibility of Distributed Consensus with One Faulty Process”

Consensus Protocol

Let us start by integrating some unfamiliar definitions in this paper. By reading through these concepts, you should be able to grasp the big picture of how the consensus protocol works. I added some illustrations to further understand how these concepts are linked together to form a protocol.

FIGURE 1. Concept of process, and message system.
FIGURE 1. Concept of process, and message system.
FIGURE 2. Concept of the configuration and sequence.
FIGURE 2. Concept of the configuration and sequence.
  • consensus protocol: denoted by P (capital). This paper assumes that P is an asynchronous system of N processes (N ≥ 2).
  • process: denoted by p (lower case). A process can be understood as a node, which has its internal storage, and state that is determined by inputs and a transition function. See Figure 1 below.
  • input register: process p has one input register x_p with values either of b or 0 or 1.
  • output register: process p has one output register y_p with values either of b or 0 or 1. process p has its y_p value as b initially, but as soon as it becomes 0 or 1, it cannot be changed (thus, output register is “write once”).
  • internal state: The values in x_p and y_p comprise the internal state.
  • initial state: initial state has its y_p value “b” but x_p value can vary.
  • decision state: a state in which y_p has a value either of 0 or 1. once a process becomes the decision state, y_p cannot be changed anymore.
  • transition function: transition function determines y_p (output). but cannot change y_p anymore once p becomes decision state.
  • message: a message is a pair (p, m), where p is the destination process and m is a message value from a fixed universe M.
  • message buffer: a FIFO queue of messages that have been sent but not yet received. message buffer supports two operations:
    1. send(p, m): send message m to the process p. place (p, m) in the message buffer.
    2. receive(p): deletes some message (p, m) from the buffer and returns m OR returns null and leaves the buffer unchanged.
  • configuration: configuration consists of 1. internal state of each process and 2. contents of the message buffer. See Figure 2.
  • initial configuration: a configuration in which each process starts at an initial state and the message buffer is empty.
  • step: a step takes one configuration C to another configuration C’.
  • event: event is denoted by e. e = (p, m), which means a receipt of m by p. e(C) denotes the resulting configuration, and we say that e can be applied to C.
  • sequence: sequence σ is a set of multiple events.
  • schedule: finite or infinite sequence σ that can be applied starting from C.
  • resulting configuration: σ(C).
  • reachable: if σ(C), σ(C) is reachable from C.
  • accessible: a configuration reachable from some initial configuration is accessible.

So, with the background knowledge above, we are going to prove that:

Theorem: There is always a sequence of events in an asynchronous distributed system such that the group of processes never reach consensus

by showing that there is a case in which protocol remains forever indecisive.

Lemma 1

LEMMA 1. Commutativity property of schedules. Suppose that from some configuration C, the schedules σ1, σ2 lead to configurations C1, C2, respectively. If the sets of processes taking steps in σ1 and σ2, respectively, are disjoint, then σ2 can be applied to C1 and σ1 can be applied to C2, and both lead to the same configuration C3.

i.e., σ1(σ2(C)) = σ2(σ1(C)), if the sets of processes taking steps in σ1 and σ2 are disjoint.

PROOF: The result follows at once from the system definition, since σ1 and σ2 do not interact.

Source: Impossibility of Distributed Consensus with One Faulty Process, p.377
Source: Impossibility of Distributed Consensus with One Faulty Process, p.377

So, LEMMA 1 proves that schedules are applicable without considering an order, resulting in the same resulting configuration. This “commutativity” property of schedule is used to prove Lemma 3.

Lemma 2

We need to understand what the “valency” is before we step into the proof of lemma 2. Here is a table for the grasp.

  • univalent: if all reachable configurations have decision value v (thus in a decision state), we just call the configuration univalent.
  • 0-valent: univalent, with all of the reachable configurations having decision value 0.
  • 1-valent: univalent, with all of the reachable configurations having decision value 1.
  • bivalent: if all reachable configurations do not have the same decision value v, we just call the configuration bivalent.

LEMMA 2. Consensus protocol P has a bivalent initial configuration.

PROOF: We use contrary proof. Assume that P must have only univalent initial configurations. So, P must have both 0-valent and 1-valent initial configurations.

Let us call two initial configurations adjacent if they differ only in the initial value x_p of a single process p.

(REMINDER: initial configuration: a configuration in which each process starts at an initial state and the message buffer is empty. initial state: initial state has its y_p value “b” but x_p value can vary.)

If we list all the possible initial configurations, there must be two adjacent initial configurations, 0-valent and 1-valent respectively, and the only difference between them is the x_p value of a single process p.

Now we consider applying sequence σ to both initial configuration n and n+1 assuming that the process p does not take any step.

FIGURE 3. initial configurations and contradiction.
FIGURE 3. initial configurations and contradiction.

If sequence σ can be applied to initial configuration n, then it is also applicable to initial configuration n+1 because they are identical except the x_p value of a single process p, and now p is halted by the assumption. σ(config n) = σ(config n+1). If the decision value of configuration C is 1, then config n is bivalent. Reversely, if the decision value of configuration is 0, then config n+1 is bivalent. This contradicts the assumed nonexistence of a bivalent initial configuration. End of proof.

So, in short, LEMMA 2 proves that there exists a bivalent initial configuration.

Lemma 3

LEMMA 3. Let C be a bivalent configuration of P, and let e = (p, m) be an event that is applicable to C. Let C be the set of configurations reachable from C without applying e, and let D = e(C) = {e(E) | E ∈ C and e is applicable to E}. Then, D contains a bivalent configuration.

PROOF: Since e is applicable to C, then by the definition of C and the fact that messages can be delayed arbitrarily (by LEMMA 1), e is applicable to every E ∈ C.

We also use contrary proof here again. Now assume that D contains no bivalent configurations, so every configuration F ∈ D is univalent.

FIGURE 4. Reachable configurations from C.
FIGURE 4. Reachable configurations from C.

If F is a configuration reachable from E, then F is also either 0-valent or 1-valent since D contains no bivalent configuration by the assumption. As e is applicable to every E ∈ C, by an easy induction, there exist E_n and E_n+1 such that F_n = e(E_n) and F_n+1 = e(E_n+1).

FIGURE 5. Case 1, when p’ ≠ p.
FIGURE 5. Case 1, when p’ ≠ p.

Let event e = (p, m) and e’ = (p’, m’).

CASE 1: If p’ ≠ p, then F_n+1 = e’(F_n) by LEMMA 1. This is impossible, since any successor of a 0-valent configuration is 0-valent. i.e., F_n(0-valent configuration) → F_n+1(1-valent configuration) is impossible.

CASE 2: If p’ = p, then consider any finite deciding schedule from E_n in which p takes no steps:

FIGURE 6. Case 2, when p’ = p.
FIGURE 6. Case 2, when p’ = p.

Let σ the schedule, and let configuration A = σ(E_n). By LEMMA 1, σ is applicable to F_n and F_n+1, and it leads to 0-valent configuration G_n = σ(F_n) and 1-valent G_n+1 = σ(F_n+1). Also by LEMMA 1, e(A) = G_n and e(e’(A)) = G_n+1. So, configuration A is bivalent, but this is impossible since the schedule to A is deciding by the assumption. So A must be univalent. In this case, we reach a contradiction. End of proof.

So, LEMMA 3 proves that there exists a bivalent configuration that can be reached from the bivalent configuration.

Integration

Let us integrate the lemmas we have just proved so far and build a conclusion.

(LEMMA 1: schedules are applicable without considering an order, resulting in the same resulting configuration. i.e., commutativity property of schedules.)

LEMMA 2: there exists a bivalent initial configuration.

LEMMA 3: there exists a bivalent configuration that can be reached from the bivalent configuration, proved with LEMMA 1.

Considering LEMMA 2 and LEMMA 3, we can finally prove that there is a case in which protocol remains forever indecisive.

If there is a schedule that runs from bivalent configuration (whose existence is proved by LEMMA 2 and 3) to a univalent configuration, there must be a some single step that goes from a bivalent to a univalent configuration. Such a step determines the eventual decision value. It is always possible to run the asynchronous system in a way that avoids such steps by failing single process.

References:

FEEDBACK IS ALWAYS WELCOMED! Please DM(@ysjlitt twitter) me if you find any errors in this article.

Subscribe to imlearning.eth
Receive the latest updates directly to your inbox.
Verification
This entry has been permanently stored onchain and signed by its creator.