Fault-Tolerance: Checkpointing in Distributed Asynchronous Systems

Research Master's in Computer Science - November 2010

Achour Mostéfaoui

Irisa/Ifsic, Université de Rennes

http://www.irisa.fr/asap/

Checkpointing Distributed Computations 1

Some Failure Types

• Software errors

• Process failures

⋆ Crash failure

⋆ Send/Receive omission

⋆ Arbitrary (Byzantine)

• Link failures

⋆ Omission/duplication failure

• Clock/Performance failures

How to Tolerate Failures?

• Debugging/Validation.

• Duplication of processors and memories.

• Fault-tolerant software.

⋆ Data replication.

⋆ Fault-tolerant services (consensus, NBAC, etc.)

⋆ Checkpointing/Rollback-Recovery.

Computation Model

Asynchronous Distributed System

• Set of processes: P1, . . . , Pn.

• No shared memory.

• No global clock.

• Fail-stop processors (crash failures).

What Is Rollback?

[Figure: three timelines of a process P, with events e1 to e5 and the saved state σ4.]

• Failure occurrence: restart at a safe state.

• Necessity to save safe states.

Local States and Local Checkpoints

• Local History of Pi: ei,1 ei,2 · · · ei,s · · ·

• Local State:

⋆ initial state of Pi: σi,0

⋆ σi,s is obtained by applying the event ei,s to the local state σi,s−1

• Local Checkpoint:

A local checkpoint C is a recorded state (snapshot) of a process.

A local state is not necessarily recorded as a local checkpoint, so the set of local checkpoints is only a subset of the set of local states.
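As a minimal sketch of these definitions, the local states can be obtained by folding the local history, and the local checkpoints are the recorded subset. The integer state, the `apply_event` function and the choice of which states are recorded are illustrative assumptions, not part of the slides.

```python
from itertools import accumulate

def local_states(initial_state, events, apply_event):
    """Return [sigma_0, sigma_1, ...] where sigma_s = apply_event(sigma_{s-1}, e_s)."""
    return list(accumulate(events, apply_event, initial=initial_state))

# Toy process (assumption): the state is an integer and each event adds a value to it.
states = local_states(0, [1, 2, 3], lambda state, event: state + event)

# Only some local states are saved as local checkpoints (here: indices 0 and 2).
checkpoints = {s: states[s] for s in (0, 2)}
```

Every checkpoint is a local state, but states 1 and 3 are never recorded, so they cannot be restored after a failure.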

Distributed Computation: Example

[Figure: space-time diagram of processes Pi, Pj and Pk, with local checkpoints Ci,0 to Ci,3, Cj,0 to Cj,3 and Ck,0 to Ck,3, checkpoint intervals Ij,1 and Ik,1 to Ik,3, and messages m1 to m7.]

A State of a Distributed Computation?

• Reminder: there is no global memory and no global clock.

• It is impossible to take a global snapshot instantaneously.

• Illustration of Chandy-Lamport (1985)

• A global checkpoint is a set of local checkpoints, one from each process.

A pair of Mutually Consistent Checkpoints

[Figure: processes Pi and Pj with a pair of mutually consistent local checkpoints.]

A Missing Message

[Figure: processes Pi and Pj with a pair of local checkpoints and a missing message.]

• The message is missing ⇒ it must be recorded.

An Orphan Message

[Figure: processes Pi and Pj with a pair of local checkpoints and an orphan message.]

• The two local checkpoints are definitely inconsistent.

Consistency Rules of a Global Checkpoint (CL 1985)

Let {C1,x1, C2,x2, ..., Cn,xn} be a global checkpoint. Each pair (Ci,xi, Cj,xj) of local checkpoints must satisfy:

R1: ∀m, send(m) ∈ Ci,xi ⇒ delivery(m) ∈ Cj,xj or m is recorded (no missing message).

R2: ∀m, send(m) ∉ Ci,xi ⇒ delivery(m) ∉ Cj,xj (no orphan message).

Consistent Global Checkpoints

• Consistent Global Checkpoint

A global checkpoint is consistent if it contains no message whose delivery is recorded while its sending is not (no orphan message).

• Message recording can easily overcome the problem of missing messages.
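The two rules can be checked mechanically. The sketch below assumes a hypothetical encoding that is not in the slides: each message records the checkpoint interval in which it is sent and the one in which it is delivered, and an event of interval s belongs to checkpoint C_{i,x} when s ≤ x. The names `Message`, `orphans`, `in_transit` and `is_consistent` are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Message:
    name: str
    sender: str
    send_interval: int   # send(m) occurs in interval I_{sender, send_interval}
    receiver: str
    recv_interval: int   # delivery(m) occurs in interval I_{receiver, recv_interval}

def orphans(global_ckpt, messages):
    """Messages delivered inside the receiver's checkpoint but sent after the sender's (violates R2)."""
    return [m.name for m in messages
            if m.recv_interval <= global_ckpt[m.receiver]
            and m.send_interval > global_ckpt[m.sender]]

def in_transit(global_ckpt, messages):
    """Messages sent inside the sender's checkpoint but not yet delivered (must be recorded, R1)."""
    return [m.name for m in messages
            if m.send_interval <= global_ckpt[m.sender]
            and m.recv_interval > global_ckpt[m.receiver]]

def is_consistent(global_ckpt, messages):
    """A global checkpoint is consistent when it has no orphan message."""
    return not orphans(global_ckpt, messages)

msgs = [
    Message("m1", "Pi", 1, "Pj", 1),
    Message("m2", "Pj", 2, "Pi", 1),
    Message("m3", "Pi", 1, "Pj", 3),
]
```

With this data, the global checkpoint {Pi: 1, Pj: 1} is inconsistent because m2 is an orphan, whereas {Pi: 1, Pj: 2} is consistent and only requires m3 to be recorded as in transit.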

Meaning of a Global Checkpoint

[Figure: space-time diagram of processes Pi, Pj and Pk, with local checkpoints Ci,0 to Ci,3, Cj,0 to Cj,3 and Ck,0 to Ck,3, checkpoint intervals Ij,1 and Ik,1 to Ik,3, and messages m1 to m7.]

• Does a consistent global checkpoint represent a real state of the computation?

Why Compute Global Checkpoints?

• Rollback recovery.

• Stable/Unstable properties detection

• Monitoring

• etc.

Characteristics of a Global Checkpoint

• To be as close as possible to a real state (monitoring)

• To be the most recent possible (rollback)

• To have as few forced checkpoints as possible (properties detection)

• etc.

Example: Rollback Recovery

[Figure: space-time diagram of processes Pi, Pj and Pk, with local checkpoints Ci,0 to Ci,3, Cj,0 to Cj,3 and Ck,0 to Ck,2, checkpoint intervals Ij,1, Ik,1 and Ik,2, and messages m1 to m4, m6 and m7.]

Limits of Chandy-Lamport’s Consistency Rules

• Chandy-Lamport result applies to global checkpoints:

⋆ Considering a global checkpoint, one can say whether it is consistent or not.

• Question:

⋆ Considering a single local checkpoint taken by a process, how can one know whether there exists a consistent global checkpoint to which it belongs?

Consistent Global Checkpoints

Checkpointing protocols must ensure that each local checkpoint belongs to at least one global checkpoint with:

1 No orphan messages.

2 No missing messages.

Limits of Chandy-Lamport’s Consistency Rules: Example

[Figure: space-time diagram of processes Pi, Pj and Pk, with local checkpoints Ci,0 to Ci,3, Cj,0 to Cj,3 and Ck,0 to Ck,3, checkpoint intervals Ij,1 and Ik,1 to Ik,3, and messages m1 to m7.]

• Hidden dependencies.

Theorem of Netzer and Xu (1995)

• Considering a subset of local checkpoints (possibly one local checkpoint), one can say whether this subset can be extended to form a consistent global checkpoint.

Z-Paths

A relation exists from local checkpoint A to local checkpoint B if there exists a Z-path from A to B.

[Figure: a Z-path from local checkpoint A to local checkpoint B.]

Z-Paths

A Z-path exists from local checkpoint A to local checkpoint B if and only if:

• A precedes B within the same process, or

• a sequence of messages [m1, m2, . . . , mq] (q ≥ 1) exists such that:

1. A precedes send(m1) in the same process, and

2. for each mi, i < q, delivery(mi) is in the same checkpoint interval as send(mi+1) or in an earlier one, and

3. delivery(mq) precedes B in the same process.

Causal Z-Paths, Z-Patterns and Z-Cycles

• A Z-path is causal iff, for each mi, i < q, we have delivery(mi) →hb send(mi+1), where →hb denotes the happened-before relation.

• A Z-path has a Z-pattern iff ∃i such that send(mi+1) →hb delivery(mi).

• A Z-cycle is a Z-path going from a local checkpoint C to the same local checkpoint C.
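The Z-path definition translates directly into a graph search over messages. The sketch below is one possible encoding under stated assumptions: a checkpoint is a pair (process, index), an event of interval s occurs between checkpoints s-1 and s, and the names `Msg`, `z_path_exists` and `in_z_cycle` are hypothetical.

```python
from collections import deque, namedtuple

# send_iv / recv_iv: checkpoint interval of the send / delivery event (assumed encoding)
Msg = namedtuple("Msg", "name sender send_iv receiver recv_iv")

def z_path_exists(a, b, msgs):
    """True if a Z-path leads from checkpoint a = (proc, x) to checkpoint b = (proc, y)."""
    (pa, xa), (pb, xb) = a, b
    if pa == pb and xa < xb:           # A precedes B within the same process
        return True
    # Condition 1: m1 is sent by A's process after A.
    frontier = deque(m for m in msgs if m.sender == pa and m.send_iv > xa)
    seen = set(frontier)
    while frontier:
        m = frontier.popleft()
        # Condition 3: the last message is delivered before B.
        if m.receiver == pb and m.recv_iv <= xb:
            return True
        # Condition 2: the next message is sent by the receiver of m,
        # in the same interval as the delivery of m or in a later one.
        for n in msgs:
            if n not in seen and n.sender == m.receiver and m.recv_iv <= n.send_iv:
                seen.add(n)
                frontier.append(n)
    return False

def in_z_cycle(c, msgs):
    """A Z-cycle is a Z-path from a checkpoint to itself."""
    return z_path_exists(c, c, msgs)

# Pj sends m1 after C_{j,1}; Pi receives it in its first interval, the same
# interval in which it sent m2, which Pj delivers before C_{j,1}.
msgs = [Msg("m1", "Pj", 2, "Pi", 1), Msg("m2", "Pi", 1, "Pj", 1)]
```

In this example C_{j,1} lies on a Z-cycle (m1 followed by m2), while C_{i,1} does not.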

Causal Z-Path: Example

[Figure: space-time diagram of processes Pi, Pj and Pk, with local checkpoints Ci,0 to Ci,3, Cj,0 to Cj,3 and Ck,0 to Ck,3, checkpoint intervals Ij,1 and Ik,1 to Ik,3, and messages m1 to m7, highlighting a causal Z-path.]

Z-Cycle: Example

[Figure: space-time diagram of processes Pi, Pj and Pk, with local checkpoints Ci,0 to Ci,3, Cj,0 to Cj,3 and Ck,0 to Ck,3, checkpoint intervals Ij,1 and Ik,1 to Ik,3, and messages m1 to m7, highlighting a Z-cycle.]

Basic Theorem

• A local checkpoint C is Useless if it cannot belong to any consistent global checkpoint.

• Netzer-Xu Theorem (1995): A local checkpoint C is useless iff it is involved in a Z-cycle.

Basic Checkpoints

• Some local states of each process, called local checkpoints, are saved on stable storage:

⋆ periodically,

⋆ upon reception of a signal,

⋆ according to the value of a predicate,

⋆ according to the OS convenience (light load, etc.),

⋆ etc.

Uncoordinated Checkpointing: Example 1

Each process has its own checkpointing policy.

[Figure: processes Pi and Pj taking checkpoints independently.]

Risk: domino effect.
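The domino effect can be illustrated with a small rollback computation. The sketch below is a simplification under stated assumptions: every process starts from its most recent checkpoint, each message is a hypothetical tuple (sender, send interval, receiver, delivery interval), and a receiver is rolled back whenever it holds an orphan message. The function name `recovery_line` is illustrative.

```python
def recovery_line(latest, messages):
    """Roll processes back until the chosen checkpoints contain no orphan message.

    latest:   dict process -> index of its most recent checkpoint
    messages: iterable of (sender, send_iv, receiver, recv_iv) tuples, where an
              event of interval s occurs between checkpoints s-1 and s
    """
    line = dict(latest)
    changed = True
    while changed:
        changed = False
        for sender, send_iv, receiver, recv_iv in messages:
            # Orphan: delivered inside the receiver's checkpoint, sent after the sender's.
            if send_iv > line[sender] and recv_iv <= line[receiver]:
                line[receiver] = recv_iv - 1   # roll back to just before the delivery
                changed = True
    return line

# Zigzag pattern: each rollback creates a new orphan on the other process.
domino = [
    ("Pi", 3, "Pj", 2),
    ("Pj", 2, "Pi", 2),
    ("Pi", 2, "Pj", 1),
    ("Pj", 1, "Pi", 1),
]
```

With the zigzag pattern, starting from {Pi: 2, Pj: 2} cascades all the way back to the initial checkpoints {Pi: 0, Pj: 0}. With only the first message, a single rollback of Pj to checkpoint 1 suffices.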

Forced Checkpoints

[Figure: processes Pi, Pj and Pk with forced checkpoints.]

• To ensure that each local checkpoint belongs to at least one consistent global checkpoint, some processes may have to take additional checkpoints (forced checkpoints).

Checkpointing Protocols

• Coordinated Checkpointing

⋆ only Forced checkpoints, no Domino effect.

• Uncoordinated Checkpointing

⋆ only Basic checkpoints, possibly: Domino effect.

• Communication Induced Checkpointing

⋆ Forced + Basic checkpoints, no Domino effect.

Chandy-Lamport’s Coordinated Protocol (1985)

This protocol is based on the use of control messages: markers.

[Figure: a marker travelling on a channel between Pi and Pj.]

• A marker is a message that can neither overtake nor be overtaken by any other message sent on the same unidirectional channel.

Communication-Induced Checkpointing

• No communication ⇒ no Z-cycles.

• Avoid Z-cycles formed by application messages.

⋆ Detect Z-cycles: control information carried by messages

⋆ Break Z-cycles: forced checkpoints

Breaking a Z-Cycle

[Figure: processes Pi and Pj; a forced checkpoint breaks a Z-cycle.]

Main Idea of the Protocol

• Idea: associate a Lamport timestamp with each local checkpoint.

• Theorem: if, for any pair of checkpoints Cj,y and Ck,z, the existence of a Z-path from Cj,y to Ck,z implies Cj,y.t < Ck,z.t, then no checkpoint can be involved in a Z-cycle.

A Protocol: Second Step

Each message carries the value of its sender’s clock at sending time.

• Init: cl_i := 0

• Upon taking a local checkpoint:
cl_i := cl_i + 1;
<take a local checkpoint timestamped with the current value of cl_i>

• When Pi sends a message m:
m.cl := cl_i; send(m, m.cl)

• Upon reception of a message (m, m.cl):
cl_i := max(cl_i, m.cl)
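The four rules above can be sketched as a small class. The class and method names (`TimestampedProcess`, `take_checkpoint`, `send`, `receive`) are illustrative, and storing only the timestamp of each checkpoint is a simplifying assumption.

```python
class TimestampedProcess:
    """Process Pi that timestamps its local checkpoints with a Lamport-style clock."""

    def __init__(self, name):
        self.name = name
        self.cl = 0              # Init: cl_i := 0
        self.checkpoints = []    # timestamps of the local checkpoints taken so far

    def take_checkpoint(self):
        self.cl += 1             # cl_i := cl_i + 1
        self.checkpoints.append(self.cl)
        return self.cl

    def send(self, payload):
        return (payload, self.cl)        # the message carries m.cl = cl_i

    def receive(self, message):
        payload, m_cl = message
        self.cl = max(self.cl, m_cl)     # cl_i := max(cl_i, m.cl)
        return payload

pi, pj = TimestampedProcess("Pi"), TimestampedProcess("Pj")
pi.take_checkpoint()
pi.take_checkpoint()
message = pi.send("hello")
pj.receive(message)
next_timestamp = pj.take_checkpoint()
```

Here Pj's next checkpoint gets timestamp 3, which is greater than the timestamp 2 of the checkpoint of Pi that causally precedes it.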

A Protocol: First Step

Use a Lamport clock to timestamp checkpoints.

[Figure: space-time diagram of Pi, Pj and Pk exchanging messages m1 to m5 and m7, with each checkpoint labelled by its Lamport timestamp.]

• Does not take into account hidden dependencies (non-causal Z-paths).

Hidden Dependencies

• Timestamps of messages increase along causal Z-paths.

• Timestamps of messages should increase along all Z-paths.

To Checkpoint or Not to Checkpoint

[Figure: two scenarios involving processes Pi, Pj and Pk, checkpoints Ci,x, Cj,y and Ck,z, and messages m1 and m2. Case a: m1.t ≤ m2.t. Case b: m1.t > m2.t.]

General Structure of the Protocol

• Init:
cl_i := 0; ...

• When Pi takes a local checkpoint:
cl_i := cl_i + 1; <reset the data structures>
<take a local checkpoint timestamped with cl_i>

• When Pi sends a message m:
m.cl := cl_i; send(m, m.cl, ...)

• Upon reception of a message (m, m.cl, ...):
if <condition> then <take a checkpoint>; (* forced *)

A First Condition to Checkpoint

• sent_to_i[k] has the value true iff Pi has sent a message to Pk since its last checkpoint.

• min_to_i[k] keeps the timestamp of the first message Pi sent to Pk since Pi's last checkpoint.

C ≡ (∃k : sent_to_i[k] ∧ m1.t > min_to_i[k])
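A minimal sketch of a process that evaluates condition C on each reception is given below. It assumes that processes are numbered 0 to n-1, that both arrays are reset at every checkpoint, and that the clock is merged with the message timestamp after the test. The names `CicProcess` and `must_checkpoint` are hypothetical.

```python
import math

class CicProcess:
    """Process Pi applying condition C before delivering a message."""

    def __init__(self, pid, n):
        self.pid, self.n = pid, n
        self.cl = 0
        self.checkpoints = []            # list of (timestamp, forced?) pairs
        self._reset()

    def _reset(self):
        self.sent_to = [False] * self.n  # sent_to_i[k]
        self.min_to = [math.inf] * self.n  # min_to_i[k]

    def take_checkpoint(self, forced=False):
        self.cl += 1
        self.checkpoints.append((self.cl, forced))
        self._reset()                    # both arrays describe the current interval only

    def send(self, dest, payload):
        if not self.sent_to[dest]:       # remember the first timestamp sent to dest
            self.sent_to[dest] = True
            self.min_to[dest] = self.cl
        return (payload, self.cl)

    def must_checkpoint(self, m_t):
        """Condition C: exists k such that sent_to[k] and m.t > min_to[k]."""
        return any(self.sent_to[k] and m_t > self.min_to[k] for k in range(self.n))

    def receive(self, message):
        payload, m_t = message
        if self.must_checkpoint(m_t):
            self.take_checkpoint(forced=True)   # forced checkpoint before delivery
        self.cl = max(self.cl, m_t)
        return payload

p = CicProcess(pid=0, n=3)
p.take_checkpoint()          # basic checkpoint, timestamp 1
p.send(2, "m2")              # m2.t = 1, so min_to[2] = 1
p.receive(("m1", 1))         # m1.t <= m2.t: no forced checkpoint
after_first = list(p.checkpoints)
p.receive(("m1'", 5))        # m1.t > m2.t: forced checkpoint
```

The first reception corresponds to the case m1.t ≤ m2.t and is delivered directly, while the second one triggers a forced checkpoint with timestamp 2 before delivery.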

Refining the Condition (1)

[Figure: two scenarios (a and b) with processes Pi, Pj and Pk, checkpoints Cj,y and Ck,z, messages m1 and m2, and additional messages µ1 and µ2.]

Refining the Condition (2)

• cl_i(k) = value of Pk's local clock as perceived by Pi (Pi can obtain this knowledge with a classical piggybacking technique).

(m1.t ≤ m2.t) ∨ P, where P ≡ (Ci,y.t ≤ m1.t ≤ cl_i(k) < Ck,z.t).

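Does cl_i(k) Refer to a Correct Value?

[Figure: scenarios a and b with processes Pi, Pj and Pk, checkpoints Ci,x, Cj,y and Ck,z, messages m1 and m2, and messages µ1 and µ2; a third diagram shows a causal Z-cycle between Pi and Pk involving Ci,x, Ck,z, m2 and µ.]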
Final Condition

C ≡ (∃k : sent_to_i[k] ∧ (m1.t > min_to_i[k]) ∧ (m1.t > cl_i(k) ∨ C1)),

where C1 is the condition that detects causal Z-cycles.

Particular Cases

• C′′ ≡ ∃k : (sent_to_i[k] ∧ (m.lc > min_to_i[k]))

• C′′′ ≡ (m.lc > min_i)

Consistent Global Checkpoints

Checkpointing protocols must ensure that each local checkpoint belongs to at least one global checkpoint with:

1 No orphan messages.

2 No missing messages.

Question: how can “no missing messages” be ensured at a reasonable cost?

Chandy-Lamport’s Coordinated Protocol: Example

[Figure: processes Pi and Pj with recorded local states σi and σj and messages m1 and m2, in two scenarios.]

• Which messages are in-transit (must be recorded)?

Chandy-Lamport’s Recording Rule

Upon reception by Pi of a marker sent by Pj:

• If Pi has not yet taken a local checkpoint:

⋆ no message is in transit with respect to this pair of local checkpoints.

• If Pi has already taken a local checkpoint:

⋆ all the messages received from Pj after σi and before the reception of the marker are in transit.
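The recording rule can be simulated on two processes connected by FIFO channels. This is a toy sketch under stated assumptions: the application state is the sum of the values received, channels are `deque` objects delivered by hand, and a process that receives its first marker takes its checkpoint and forwards markers at once. All names (`SnapshotProcess`, `deliver`, `demo`) are illustrative.

```python
from collections import deque

MARKER = object()   # control message; FIFO channels guarantee it is never overtaken

class SnapshotProcess:
    def __init__(self, pid, peers):
        self.pid = pid
        self.peers = list(peers)
        self.state = 0           # toy application state: sum of received values
        self.checkpoint = None   # recorded local state (sigma_i)
        self.open = {}           # sender -> messages received since sigma_i, marker not yet seen
        self.in_transit = {}     # sender -> recorded in-transit messages

    def take_checkpoint(self, net):
        self.checkpoint = self.state
        self.open = {p: [] for p in self.peers}
        for p in self.peers:
            net[(self.pid, p)].append(MARKER)

    def on_receive(self, sender, msg, net):
        if msg is MARKER:
            if self.checkpoint is None:       # first marker: checkpoint now,
                self.take_checkpoint(net)     # so nothing is in transit from sender
            self.in_transit[sender] = self.open.pop(sender)
        else:
            self.state += msg
            if sender in self.open:           # received after sigma_i, before the marker
                self.open[sender].append(msg)

def deliver(net, procs, src, dst):
    procs[dst].on_receive(src, net[(src, dst)].popleft(), net)

def demo():
    net = {("i", "j"): deque(), ("j", "i"): deque()}
    procs = {"i": SnapshotProcess("i", ["j"]), "j": SnapshotProcess("j", ["i"])}
    net[("i", "j")].append(5)          # Pi sends 5 to Pj
    net[("j", "i")].append(7)          # Pj sends 7 to Pi
    procs["i"].take_checkpoint(net)    # Pi initiates and sends a marker
    deliver(net, procs, "i", "j")      # Pj delivers 5
    deliver(net, procs, "i", "j")      # Pj receives the marker and checkpoints
    deliver(net, procs, "j", "i")      # Pi delivers 7 (after sigma_i)
    deliver(net, procs, "j", "i")      # Pi receives Pj's marker
    return procs

procs = demo()
```

In this run Pi records the state 0, Pj records the state 5, and the value 7 is recorded by Pi as in transit on the channel from Pj, while the channel from Pi to Pj is recorded as empty.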

Timestamps in a Checkpoint Interval (1)

[Figure: two scenarios involving processes Pi, Pj and Pk, checkpoints Ci,x, Cj,y and Ck,z, and messages m1 and m2. Case a: m1.t ≤ m2.t. Case b: m1.t > m2.t.]

Timestamps in a Checkpoint Interval (2)

[Figure: a checkpoint interval of Pi.]

Max_Rec_i ≤ cl_i ≤ Min_Sent_i

(initially, Max_Rec_i = −∞ and Min_Sent_i = +∞)

In-Transit vs. Orphan (1)

[Figure: processes Pi and Pj, shown for the computation and for the reversed computation; a message that is an orphan in one is in transit in the other.]
Remark: In-Transit vs. Orphan (2)

• Let us consider the computation where all messages are reversed.

• Ensuring that each local checkpoint belongs to at least one orphan-free global checkpoint of the reversed computation is equivalent to ensuring that each local checkpoint belongs to at least one missing-free global checkpoint of the original computation.

Max_Sent_i ≤ cl_i ≤ Min_Rec_i

Recording Messages vs. Checkpointing

• Question: Can a recorded message be missing wrt a global checkpoint?

Max_Sent_NR_i ≤ cl_i ≤ Min_Rec_NR_i

No Orphan and No Missing Messages (R1 and R2)

Within each interval the following invariant must be preserved:

max(Max_Rec_i, Max_Sent_NR_i) ≤ cl_i ≤ min(Min_Sent_i, Min_Rec_NR_i)
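The invariant is easy to test once the five variables of an interval are grouped together. The sketch below assumes that NR stands for "not recorded"; the class name `IntervalVars` and its methods are hypothetical. The sample values mirror the two "Sketch of a Protocol" examples of these slides.

```python
import math
from dataclasses import dataclass

@dataclass
class IntervalVars:
    """Per-interval variables of Pi (NR is assumed to mean 'not recorded')."""
    max_rec: float = -math.inf      # largest timestamp received
    max_sent_nr: float = -math.inf  # largest timestamp sent and not recorded
    cl: float = 0                   # local clock cl_i
    min_sent: float = math.inf      # smallest timestamp sent
    min_rec_nr: float = math.inf    # smallest timestamp received and not recorded

    def bounds(self):
        """Lower and upper bound that the invariant imposes on cl_i."""
        return (max(self.max_rec, self.max_sent_nr),
                min(self.min_sent, self.min_rec_nr))

    def feasible(self):
        """True if some value of cl_i can satisfy the invariant."""
        low, high = self.bounds()
        return low <= high

    def invariant_holds(self):
        low, high = self.bounds()
        return low <= self.cl <= high

# Example 1: Pi (cl = 2) receives m1 (timestamp 5), then m2 (timestamp 8).
ex1_after_m2 = IntervalVars(max_rec=8, cl=5, min_rec_nr=5)
ex1_m1_recorded = IntervalVars(max_rec=8, cl=8, min_rec_nr=8)

# Example 2: Pi (cl = 2) sends m1 (timestamp 2), then receives m2 (timestamp 8).
ex2_after_m2 = IntervalVars(max_rec=8, max_sent_nr=2, cl=2, min_sent=2, min_rec_nr=8)
ex2_m1_recorded = IntervalVars(max_rec=8, cl=2, min_sent=2, min_rec_nr=8)
```

In the first example, recording m1 makes the bounds coincide at 8, so raising cl_i to 8 restores the invariant. In the second, recording m1 does not help because Min_Sent_i stays at 2, which leaves a forced checkpoint as the only option.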

Sketch of a Protocol (1)

[Figure: the values Max_Sent_NR_i, Max_Rec_i, cl_i, Min_Sent_i and Min_Rec_NR_i placed in increasing order along an axis.]
Sketch of a Protocol (2)

What to do (upon a send or a receive operation) in order to maintain the invariant?

• Increase the value of cli.

• Record some sent or received messages.

• Take a forced checkpoint.

Sketch of a Protocol: Example 1

[Figure: Pi, with cl_i = 2, receives m1 (timestamp 5) and then m2 (timestamp 8).]

Variable         Before m1   Before m2   After m2
Max_Rec_i        −∞          5           8
Max_Sent_NR_i    −∞          −∞          −∞
cl_i             2           5           ?
Min_Sent_i       +∞          +∞          +∞
Min_Rec_NR_i     +∞          5           5
Sketch of a Protocol: Example 1

Variable         Before m2   Record m1?   Ckpt before m2?
Max_Rec_i        5           8            −∞
Max_Sent_NR_i    −∞          −∞           −∞
cl_i             5           8            >5
Min_Sent_i       +∞          +∞           +∞
Min_Rec_NR_i     5           8            +∞

There is a choice: take a checkpoint or record m1.

Sketch of a Protocol: Example 2

[Figure: Pi, with cl_i = 2, sends m1 (timestamp 2) and then receives m2 (timestamp 8).]

Variable         Before m1   Before m2   After m2
Max_Rec_i        −∞          −∞          8
Max_Sent_NR_i    −∞          2           2
cl_i             2           2           ?
Min_Sent_i       +∞          2           2
Min_Rec_NR_i     +∞          +∞          8
Sketch of a Protocol: Example 2

Variable         Before m2   Record m1?   Ckpt before m2?
Max_Rec_i        −∞          8            −∞
Max_Sent_NR_i    2           −∞           −∞
cl_i             2           ?            >2
Min_Sent_i       2           2            +∞
Min_Rec_NR_i     +∞          8            +∞

There is no choice: a checkpoint must be taken.
