Distributed Computing with Byzantine Players
Franck Petit / Sébastien Tixeuil Firstname.Lastname@lip6.fr
Motivation
Approach
•
Faults and attacks occur in the network•
The network’s user must not notice something wrong happened•
A small number of faulty components•
Masking approach to fault/attack tolerancePrinciple
CPU
CPU
CPU
Comparator Temperature
Pressure
Throttle
Problems
•
Replicated input sensors may not give the same data•
Faulty input sensor or processor may not fail gracefully•
The system might not be tolerant to software bugsTelling Truth from Lies
The Island of Liars and Truth-tellers
•
An island is populated by two tribes•
Members of one tribe consistently lie•
Members of the other tribe always tell the truth•
Tribe members can recognize oneanother, but an external observer can’t
Puzzle 1
•
You meet a man and ask him if he is a truth-teller, but fail to hear the answer•
You inquire: “Did you say you are a truth- teller?”•
He responds: “No, I did not.”•
To which tribe does the man belong ?Puzzle II
•
You meet a person on the island.•
What single question can you ask him/her to determine whether he/she is a liar or a truth-teller?Puzzle III
•
You meet two people A and B on the island•
A says: “Both of us are from the liar tribe.”•
Which tribe is A from ?•
What about B ?Puzzle IV
•
You meet two people, C and D on the island.•
C says: “Exactly one of us is from the liars tribe.”•
Which tribe is D from ?Puzzle V
•
You meet two people E and F on the island•
E says: “It is not the case that both of us are from truth-tellers tribe.”•
Which tribe is E from?•
What about F?Puzzle VI
•
You meet two people G and H on the island•
G says: “We are from different tribes.”•
H says: “G is from the liars tribe.”•
Which tribes are G and H from ?Puzzle VII
•
You meet three people A, B, and C•
You ask A: ”how many among you aretruth-tellers?”, but don’t hear the answer
•
You ask B: “What did A say?”, hear “one.”•
C says: “B is a liar.”•
Which tribes are B and C from?Puzzle VII
A
B B
C C
C C
0 1 1 2 1 2 2 3
The Island of Selective Liars
•
Inhabitants lie consistently on Tuesdays, Thursdays, and Saturdays. However, they always say the truth on the remaining days.•
You ask: “What is today?” ”Tomorrow?”•
Responses: “Saturday.”, “Wednesday.”•
What is the current day ?The Island of Random Liars
•
A new Island has three tribes•
truth-tellers•
consistent liars•
randomly lie or tell the truth•
How to identify three representants ofeach tribe standing in a line with only three yes/no questions?
Byzantine Generals
Settings
•
Byzantine generals are camping outside an enemy city•
Generals can communicate by sending messengers•
Generals must decide upon common plan of action•
Some of the Generals can be traitorsGoal
•
All loyal generals decide upon the same plan of action•
A small number of traitors cannot cause the loyal generals to adopt a bad planTwo Generals Paradox
G1 G2
Unreliable communication media Besieged
city
Besieged city
Two Generals Paradox
G1 G2
Unreliable communication media Besieged
city
Besieged city
Attack at noon ?
Two Generals Paradox
G1 G2
Unreliable communication media Besieged
city
Besieged city
Attack at noon ?
Two Generals Paradox
G1 G2
Unreliable communication media Besieged
city
Besieged city
Two Generals Paradox
G1 G2
Unreliable communication media Besieged
city
Besieged city
Ack !
Two Generals Paradox
G1 G2
Unreliable communication media Besieged
city
Besieged city
Ack !
Two Generals Paradox
G1 G2
Unreliable communication media Besieged
city
Besieged city
The Byzantine Generals Problem
L1
L2 Besieged
city G
The (simple) Byzantine Generals Problem
•
Generals lead n divisions of the Byzantine army•
The divisions communicate via reliable messengers•
The generals must agree on a plan(“attack” or “retreat”) even if some of them are killed by enemy spies
Oral Model
•
A1: Every message that is sent is delivered correctly•
A2: The receiver of a message knows who sent it•
A3: The absence of a message can be detectedSolution?
plan: array of {A,R}; finalPlan: {A,R}
1: plan[myID] := ChooseAorR()
2: for all other G send(G, myID, plan[myID]) 3: for all other G receive(G, plan[G])
4: finalPlan := majority(plan)
Reliable Networks
Alice:A
Bob:R Charlie:A
Reliable Networks
Alice:A
Bob:R Charlie:A
Reliable Networks
Alice:A
Bob:R Charlie:A
(A,A,R) (A,A,R)
(A,A,R)
Reliable Networks
Alice:A:A
Bob:R:A Charlie:A
(A,A,R) (A,A,R)
(A,A,R)
Crashing Networks
Alice:A
Bob:R Charlie:A
Crashing Networks
Alice:A
Bob:R Charlie:A
A
Crashing Networks
Alice:A
Bob:R Charlie:A
A
A
Crashing Networks
Alice:A
Bob:R Charlie:A
R
R
(R,A,-) (A,A,R)
Crashing Networks
Alice:A
Bob:R:R Charlie:A:A
R
R
(R,A,-) (A,A,R)
The Byzantine Generals Problem
•
A general and n-1 lieutenants lead n divisions of the Byzantine army•
The divisions communicate via messengers that can be captured or delayed•
The generals must agree on a plan(“attack” or “retreat”) even if some of
them are traitors that want to prevent agreement
The Byzantine Generals Problem
•
A commanding general must sent an order to his n-1 lieutenants generals such that•
IC1: all loyal lieutenants obey the same order•
IC2: if the commanding general is loyal, then every loyal lieutenant obeys theorder he sends
Oral Model
•
A1: Every message that is sent is delivered correctly•
A2: The receiver of a message knows who sent it•
A3: The absence of a message can be detected3k+1 nodes are
necessary (oral model)
Commander
Lieutenant 1 Lieutenant 2
Attack Attack
Retreat
3k+1 nodes are
necessary (oral model)
Commander
Lieutenant 1 Lieutenant 2
Retreat Retreat
Attack
3k+1 nodes are
necessary (oral model)
Commander
Lieutenant 1 Lieutenant 2
Attack Retreat
Retreat
3k+1 nodes are
necessary (oral model)
Commander
Lieutenant 1 Lieutenant 2
Retreat Attack
Attack
3k+1 nodes are
necessary (oral model)
Commander
Lieutenant 1 Lieutenant 2
Attack Retreat
Retreat
Retreat Attack
3k+1 nodes are
sufficient (oral model)
Commander
Lieutenant 1 Lieutenant 2
v v
Lieutenant 3 v
3k+1 nodes are
sufficient (oral model)
Commander
Lieutenant 2 Lieutenant 1
(v)
Lieutenant 3 (v)
v
v
3k+1 nodes are
sufficient (oral model)
Commander
Lieutenant 2 Lieutenant 1
(v)
Lieutenant 3 (v,v)
3k+1 nodes are
sufficient (oral model)
Commander
Lieutenant 2 Lieutenant 1
(v)
Lieutenant 3 (v,v)
v v
3k+1 nodes are
sufficient (oral model)
Commander
Lieutenant 2 Lieutenant 1
(v,v)
Lieutenant 3 (v,v)
3k+1 nodes are
sufficient (oral model)
Commander
Lieutenant 2 Lieutenant 1
(v,v)
Lieutenant 3 (v,v)
x
x
3k+1 nodes are
sufficient (oral model)
Commander
Lieutenant 2 Lieutenant 1
(v,v,x)
Lieutenant 3 (v,v,x)
3k+1 nodes are
sufficient (oral model)
Commander
Lieutenant 2 Lieutenant 1
(v,v,x)
Lieutenant 3 (v,v,x)
Maj(v,v,x)
Maj(v,v,x)
3k+1 nodes are
sufficient (oral model)
Commander
Lieutenant 1 Lieutenant 2
x z
Lieutenant 3 y
3k+1 nodes are
sufficient (oral model)
Commander
Lieutenant 1 (x,y,z)
Lieutenant 2 (x,y,z)
Lieutenant 3 (x,y,z)
3k+1 nodes are
sufficient (oral model)
Commander
Lieutenant 1 (x,y,z)
Lieutenant 2 (x,y,z)
Lieutenant 3 (x,y,z)
Min(x,y,z) Min(x,y,z)
Min(x,y,z)
Written Model
•
A1-A3: Same as before•
A4:•
A loyal general’s signature cannot beforged, and any alteration of the contents of his signed messages can be detected
•
Anyone can verify the authenticity of a general’s signaturek+2 nodes are sufficient (written model)
Commander
Lieutenant 1 Lieutenant 2
Attack:C Attack:C
k+2 nodes are sufficient (written model)
Commander
Lieutenant 1 Attack:C:L1 Lieutenant 2
k+2 nodes are sufficient (written model)
Commander
Lieutenant 1 Attack:C:L2 Lieutenant 2
k+2 nodes are sufficient (written model)
Commander
Lieutenant 1 Lieutenant 2
Attack:C Retreat:C
k+2 nodes are sufficient (written model)
Commander
Lieutenant 1 Attack:C:L1 Lieutenant 2
k+2 nodes are sufficient (written model)
Commander
Lieutenant 1 Retreat:C:L2 Lieutenant 2
k+2 nodes are sufficient (written model)
Commander
Lieutenant 1 Lieutenant 2
Attack:C
Retreat:C:L2
Retreat:C Attack:C:L1
Arbitrary Networks
Topology Discovery
•
Given•
asynchronous network•
up to k Byzantine nodes•
each node knows its immediate neighbors identifiers•
Goal•
each node must discover the complete network topologyWeak Topology Discovery
•
Termination•
either all non-faulty processes determine the system topology or at least one detects fault•
Safety•
for each non-faulty process, the determined topology is subset of actual•
Validity•
fault detected only if it indeed existsWeak Topology
Discovery
Weak Topology
Discovery
Weak Topology Discovery
•
Bounds•
cannot determine presence of edge if two adjacent nodes are faulty•
cannot be (completely) solved if network is less than k+1 connectedStrong Topology Discovery
•
Termination•
all non-faulty processes determine the system topology•
Safety•
for each non-faulty process thedetermined topology is subset of actual
Strong Topology
Discovery
Strong Topology
Discovery
Strong Topology
Discovery
Strong Topology Discovery
k=1
A
Strong Topology Discovery
k=1
A
Strong Topology Discovery
k=1
A
Strong Topology Discovery
k=1
A
Strong Topology
Discovery
Strong Topology
Discovery
Strong Topology Discovery
•
Bounds•
cannot determine presence of edge if one neighbor is faulty•
cannot be solved if network is less than 2k+1 connectedSolutions Preliminaries
•
Main idea•
Menger’s theorem: if a graph is kconnected then for any two vertices
there exists two internally node-disjoint paths connecting them
•
a single (non-source) node cannotcompromise info if it travels over two node-disjoint paths
Solutions Preliminaries
•
Common Features•
every solution essentially involvesflooding each node’s neighbor info to the other nodes
•
solutions differ on how the nodesforward neighborhood info received from other nodes
A Naive Solution
•
Store traveled path in message, forward message that contains simple path to all outgoing links•
Solves strong (and weak) topology discovery problemsA Naive Solution
h
a
g
b
f d
c
e
A Naive Solution
•
Store traveled path in message, forward message that contains simple path to all outgoing links•
Solves strong (and weak) topology discovery problems•
requires exponential number of messagesDetector
•
Basic design•
propagate neighbor info message for each process exactly once (first time)•
if receive different info for same process, signal fault•
since network is k+1 connected, info about non-faulty nodes reaches every nodeDetector
•
Handling fake nodes•
faulty process may send info about non- existent (fake) nodes thus compromising safety and termination•
only faulty nodes can be connected to fake nodes ? (discovered network isless that k+1 connected)
Detector
•
Handling fake nodes•
faulty process may send info about non- existent (fake) nodes thus compromising safety and termination•
when the network is not completelydiscovered yet, it may also be less than k+1 connected, problems with validity
Detector
h
a
g
b
f d
c
e
Detector
•
Neighborhood closure•
connect all nodes whose neighbor information is not received•
the connectivity of this graph is no less than the actual topology•
if the connectivity of this graph falls below k+1, signal faultDetector
a b
d c
e
d c
e
Detector
a b
d c
e
d c
e
a,b,d
Detector
a b
d c
e
d c
e
a,b,d
Detector
a b
d c
e
d c
e
a b
Detector
a b
d c
e
d c
e
a b
b,c
Detector
a b
d c
e
d c
e
a b
b,c
Detector
a b
d c
e
d c
e
a b
b,c
Detector
a b
d c
e
d c
e
a b
Detector
a b
d c
e
d c
e
a b
a,c,d
Detector
a b
d c
e
d c
e
a b
a,c,d
Detector
a b
d c
e
d c
e
a b
a,c,d
Detector
a b
d c
e
d c
e
a b
Handling Faults
a b
d c
e
d c
e
a b
Handling Faults
a b
d c
e
d c
e
a b
Handling Faults
a b
d c
e
d c
e
a b
???
Handling Faults
a b
d c
e
d c
e
a b
Handling Faults
a b
d c
e
d c
e
a b
Handling Faults
a b
d c
e
d c
e
a b
Handling Faults
a b
d c
e
d c
e
a b
Handling Faults
a b
d c
e
d c
e
a b
Handling Faults
a b
d c
e
d c
e
a b
Detector
•
Definition•
Solution is adjacent-edge complete if non- faulty nodes discover all non-faulty nodes and their adjacent edgesDetector
•
Theorem•
Detector is an adjacent-edge complete solution to the weak topology discoveryproblem if the connectivity of the system exceeds the maximum number of faults
Explorer
•
Main idea•
collect node’s neighbor information such that the info goes along more than twice as many node disjoint paths as maxnumber of faulty nodes
Explorer
•
Confirmed neighbor information•
k+1 disjoint paths from source•
non-intersecting paths from k+1 confirmed neighborsExplorer
a b
d c
e
Explorer
a b
d c
e
a<b,c>
Explorer
a b
d c
e
a<b,c>
a<b,c>
Explorer
a b
d c
e
a<b,c>
a<b,c>
Explorer
a b
d c
e
a<b,c>
a<b,c>
Explorer
a b
d c
e
a<b,c>b:a<b,c>
c:a<b,c>
Explorer
a b
d c
e
a<b,c>
b:a<b,c>
c:a<b,c>
Explorer
a b
d c
e
b:a<b,c>
c:a<b,c>
b,d:a<b,c>
Explorer
a b
d c
e
b:a<b,c>
a<b,c>
Explorer
•
Definition•
Solution is two-adjacent edge complete if non-faulty nodes discover all non-faulty nodes and edges adjacent to two non- faulty nodesExplorer
•
Theorem•
(generalized) Explorer is a two-adjacent- edge complete solution to the strongtopology discovery problem in case the
graph connectivity is more than twice the number of faults
Composing Detector and Explorer
•
Observation•
Detector uses less messages when there are no faults•
Idea•
run Detector, if a node discovers fault, invoke Explorer•
requires 2k+1 connected topologiesMalice in Online Video Games
Some materials of this part are from Wei Tsang Ooi, National University of Singapore
Online Games
•
First Person Shooter (FPS)•
Real-time Strategy (RTS)•
Role playing Game (RPG)•
Massively MultiplayerOnline Real-Time Game (MMORTG)
•
Sports, puzzlesOnline Games
•
First Person Shooter (FPS)•
Real-time Strategy (RTS)•
Role playing Game (RPG)•
Massively MultiplayerOnline Real-Time Game (MMORTG)
•
Sports, puzzlesOnline Games
•
First Person Shooter (FPS)•
Real-time Strategy (RTS)•
Role playing Game (RPG)•
Massively MultiplayerOnline Real-Time Game (MMORTG)
•
Sports, puzzlesOnline Games
•
First Person Shooter (FPS)•
Real-time Strategy (RTS)•
Role playing Game (RPG)•
Massively MultiplayerOnline Real-Time Game (MMORTG)
•
Sports, puzzlesAge of Empire Series
http://compactiongames.about.com/library/games/screenshots/blscreens-ageofkings.htm
8
E-sport
•
International competitions (ESWC, WCG, WSVG)•
Prizes over $1 million•
Professional leagues•
Professional players with sponsors, coachs ...•
In some countries, e-players are really famous
Architectures
•
Client Server•
safer, server is trustable•
“easy” to design•
“centralized”•
expensive, not scalable, faults ?•
Peer-to-peer•
scalable•
cheap•
autonomous•
difficult to design, cheating is easierPeer-to-Peer Architecture
Collect Simulate
Render Wait
Game States Collect
Simulate Render
Wait Game
States
5
How to order events?
9
Bucket Synchronization
13
Player A
The game is divided into rounds (e.g. 25 rounds per second).
Player C Player B
14
Player A
Players are expected to send an event in each round (e.g. 25 updates per second).
Player C Player B
15
Player A
Players are expected to send an event in each round (e.g. 25 updates per second).
Player C Player B
16
Player A
Every round has a “bucket” that collects the events.
Player C Player B
17
Events generated in the same round go into the same
bucket of a future round. We know which round an event is generated in, based on time-stamp.
Player A
Player C Player B
18
Events generated in the same round go into the same
bucket of a future round. We know which round an event is generated in, based on time-stamp.
Player A
Player C Player B
18
Player A
In each round, the events in the bucket are processed (in order of time-stamp).
Player C Player B
19
Player A
Two parameters needed : round length and lag. Lag depends on maximum network latency among the players; round length depends on rendering speed.
Player C Player B
round length
lag
20
Player A
Players periodically ping among themselves and
exchange latency information. They also measure the average time taken to render a frame and exchange that information to estimate lag and round length.
Player C Player B
round length
lag
21
Player A
If events from another player is lost (or late), we can
predict its update (e.g. using dead reckoning) when possible.
Player C Player B
22
Player A
Alternative is to ensure every event is received before executing the bucket. What are the
drawbacks?
Player C Player B
23
Cheating
Online Cheat
•
First major online cheat: Diablo 1997•
FPS: aim bot, aim proxy (Quake, Counter strike)•
RTS: maphack (Warcraft, Age of Empires)•
1999-2000: awareness of industry•
recently: gaming bots in MMORTG (World of Warcraft)Look-Ahead Cheat
30
Player C (or a bot) can peek at A’s and B’s actions first, before deciding his/her moves.
Player A
Player C Player B
31
Player C (or a bot) can peek at A’s and B’s actions first, before deciding his/her moves.
Player A
Player C Player B
31
Player C (or a bot) can peek at A’s and B’s actions first, before deciding his/her moves.
Player A
Player C Player B
31
Time-stamp Cheat
35
Player C (or a bot) can put in an earlier time- stamp in its messages.
Player A
Player C Player B
36
Suppress-Update Cheat
37
If dead reckoning is used, Player C can stop sending update and let others predict its position. C then sends an update at
appropriate time to “surprise” other players.
Player A
Player C Player B
38
A C
39
A
C stops sending update.
A predicts C’s position.
40
A
C stops sending update.
A predicts C’s position.
41
A
C stops sending update.
A predicts C’s position.
42
A
C sends an update and shoots A.
43
(or A proceeds data of C in its bucket)
Cheater! Not my fault! My
packets were dropped..
44
Inconsistent Cheat
45
Player C can send different events to different players.
Player A
Player C Player B
46
With synchronous simulations, we can compare periodically the
game states to detect inconsistency cheats
47
Dealing with cheaters:
1. Prevent cheats (hard) 2. Detect cheats (easier)
32
Binaries Protection
•
Avoid client-side modifications•
avoid unauthorized behaviors•
ensure clients follow the same protocol•
[Munch06] proposes to execute dynamic verifications named mobile agentsDetection Mechanisms
•
Sometimes it is not possible to prevent cheating•
Keep log and verify afterwards [Kabus05]•
Runtime verification of rules [Delap04]•
Detection against Prevention•
Latency constraint are very high, prevention needs many message exchanges impacting this latencyProtocols
•
Enforcing fairness in spite of various latencies•
[Aggarwal05] on dead-reckoning•
[Guo03] removing unfair advantage of low delay•
Synchronisation protocols•
[Baughman01] lockstep protocol•
[GD04] lockstep with improvementsExample: Lockstep Protocol
•
Each client sends to every other a commit of its update•
When every client has received everyother update, they send the clear update
•
The game view is updated and broadcast•
Performance issue: a late message freezes all messagesDefeating Maphack
•
In RTS, maphack is to be avoided•
Game clients are not trustable•
Any information that leaked may be revealed•
Zero-Knowledge ProtocolsDefeating Maphack
•
Consider two players such that:•
Player 1 has value A•
Player 2 has value B•
Question: How to know whether A=B without revealing A or B if A!=B•
Bad solution: exchange hash(A) and hash (B) and then compareDefeating Maphack
•
Let f and g be two commutativecryptographic functions respectively known only to P1 and P2
•
f(g(A)) = g(f(A)) for any ADefeating Maphack
•
P1 computes f(A)•
P1 sends f(A) to P2•
P1 computes f(g(B))•
P1 sends f(g(B)) to P2•
if f(g(B))=g(f(A)) then A=B•
P2 computes g(B)•
P2 sends g(B) to P1•
P2 computes g(f(A))•
P2 sends g(f(A)) to P1•
if f(g(B))=g(f(A)) then A=BConclusion
Conclusion
•
Goal: mask faults and attacks to the user•
Basic principle: redundancy and majority•
not necessary to identify who misbehaves•
most people must be reliable•
protocols are much easier withcryptography (but how is crypto set up?)