Title: An introduction to snapshot algorithms in distributed computing
Abstract:Recording on-the-fly global states of distributed executions is an important paradigm when one is interested in analysing, testing, or verifying properties associated with these executions. Since Chan...Recording on-the-fly global states of distributed executions is an important paradigm when one is interested in analysing, testing, or verifying properties associated with these executions. Since Chandy and Lamport`s (1985) seminal paper on this topic, this problem is called the snapshot problem. Unfortunately, the lack of both a globally shared memory and a global clock in a distributed system, added to the fact that transfer delays in these systems are finite but unpredictable, makes this problem non-trivial. This paper first discusses issues which have to be addressed to compute distributed snapshots in a consistent way. Then several algorithms which determine on-the-fly such snapshots are presented for several types of networks (according to the properties of their communication channels, namely, FIFO, non-FIFO, and causal delivery).Read More