Abstract:
Herlihy’s consensus hierarchy ranks the power of various synchronization primitives for solving consensus in a model where asynchronous processes communicate through shared memory, and may fail by halting. This paper revisits the consensus hierarchy in a model with crash-recovery failures, where the specification of consensus, called recoverable consensus (RC) in this paper, is weakened by allowing non-terminating executions when a process fails infinitely often. Two variations of this model are considered: with independent process failures, and with simultaneous (i.e., system-wide) process failures. We prove several fundamental results: (i) any synchronization primitive at level 2 in the conventional consensus hierarchy remains at level 2 in the RC hierarchy if failures are simultaneous; (ii) any commutative or overwriting primitive (including Test-And-Set, Fetch-And-Add, and Fetch-And-Store) at level 2 in the conventional consensus hierarchy drops to level 1 in the RC hierarchy if failures are independent, unless the number of such failures is bounded; and (iii) there exists a primitive at level 2 in the conventional consensus hierarchy that remains at level 2 in the RC hierarchy. Result (iii) implies that result (ii) cannot be generalized to all primitives at level 2 in the conventional consensus hierarchy. To our knowledge, this collection of results exhibits the first separation between the simultaneous and independent crash-recovery failure models with respect to the computability of consensus.