Introduction to Peer-to-Peer Systems
Based on materials by:
- Franck Petit, Lip6
- Yang Guo and Christoph Neumann, Corporate Research, Thomson Inc.
- Jon Crowcroft, Univ. of Cambridge
Fabien Mathieu (LIAFA, INRIA)
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
History, motivation and evolution
1999: Napster, the first widely used P2P application
P2P represented ~65% of Internet traffic at the end of 2006
1999: Napster, the first widely used P2P application
The application:
• A P2P application for the distribution of MP3 files
– Each user can contribute their own content
How it works:
• Central index server
– Maintains a list of all active peers and their available content
• Distributed storage and download
– Client nodes also act as file servers
– All downloaded content is shared
History, motivation and evolution - Napster (cont'd)
• Initial join
– Peers connect to the Napster server
– They transmit the current listing of their shared files to the server
(Figure: peers joining via the central index server)
History, motivation and evolution - Napster (cont'd)
• Content search
– Peers send a search request to the Napster server
– The Napster server checks its database and returns the list of matching peers
(Figure: 1) query, 2) answer, via the central index server)
History, motivation and evolution - Napster (cont'd)
• File retrieval
– The requesting peer contacts the peer holding the file directly and downloads it (see the sketch below)
(Figure: 1) request, 2) download, directly between the two peers)
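A minimal sketch of this central-index pattern (all class and function names here are illustrative, not Napster's actual protocol): the server only answers join and search requests, while the file transfer itself happens directly between peers.

# Illustrative sketch of a Napster-style central index (not the real protocol).

class IndexServer:
    def __init__(self):
        self.index = {}          # filename -> set of peer addresses

    def join(self, peer_addr, shared_files):
        # Initial join: the peer announces its shared files.
        for name in shared_files:
            self.index.setdefault(name, set()).add(peer_addr)

    def search(self, filename):
        # Content search: return the peers that advertise this file.
        return sorted(self.index.get(filename, set()))

# File retrieval then happens peer-to-peer, without the server:
server = IndexServer()
server.join("10.0.0.1:6699", ["songA.mp3", "songB.mp3"])
server.join("10.0.0.2:6699", ["songB.mp3"])
print(server.search("songB.mp3"))   # ['10.0.0.1:6699', '10.0.0.2:6699']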
History, motivation and evolution - File Download
• Napster was the first simple but successful P2P application. Many others followed…
P2P File Download Protocols:
• 1999: Napster
• 2000: Gnutella, eDonkey
• 2001: Kazaa
• 2002: eMule, BitTorrent
Why is P2P so successful?
• Scalable: it's all about sharing resources
– No need to provision servers or bandwidth
– Each user brings their own resources
– e.g., resistant to flash crowds (a flash crowd is a crowd of users all arriving at the same time)
Resources could be:
• Files to share
• Upload bandwidth
• Disk storage
• …
Why is P2P so successful? (cont’d)
• Cheap: no infrastructure needed
• Everybody can bring their own content (at no cost)
– Homemade content
– Niche content
– Illegal content
– But also legal content
– …
• High availability: content accessible most of the time (replication)
Remark
• P2P traffic has been decreasing since 2008
– Growth of web-based streaming (YouTube)
– Online storage (MegaUpload)
– Ergonomics
– Legal issues
• But next-generation (hybrid) P2P is on the move
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Definitions of P2P
• Common sense: BitTorrent, eMule, UUSee, PPLive, Skype
• DADVSI (French law): allows one to get illegal content
• Network I: anything outside the client/server paradigm
• Network II: a distributed system made of end users that share resources
Definitions of P2P
Peer-to-peer systems
• Assumed to have no a priori distinguished roles
• So no single point of bottleneck or failure
• However, this means they need distributed algorithms for
– Connection protocol
– Service discovery (name, address, route, metric, etc.)
– Neighbour status tracking
– Application-layer routing (possibly based on content, interest, etc.)
– Resilience, handling link and node failures
– etc.
Overlays and peer-to-peer systems
• P2P systems rely on an overlay structure
• Focus is at the application level
P2P Overlay (cont'd)
(Figures: the same network seen as underlay and as overlay, with the source node present in both views)
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Taxonomy of Applications
Applications:
• Parallel Computation (GRID Computing)
– Intensive Computing (same task x times)
– Component Computing (y small tasks)
• Resource Management
– Content Sharing
– Shared File System
– Distributed Database
• Collaboration Applications
– Chatting
– Shared Applications (Games, Specific…)
Taxonomy of Applications
• P2P file download
– Napster, Gnutella, KaZaA, eDonkey, BitTorrent…
• P2P communication
– VoIP, Skype, Messaging, …
• P2P Video-on-Demand
• P2P computation
– GRID, scientific computation
(SETI@home, XtreemOS, BOINC, …)
• P2P streaming
– PPLive, End System Multicast (ESM), …
• P2P gaming
– WoW, City of Heroes, …
History, motivation and evolution - Applications
• P2P is not restricted to file download!
P2P Protocols:
• 1999: Napster, End System Multicast (ESM)
• 2000: Gnutella, eDonkey
• 2001: Kazaa
• 2002: eMule, BitTorrent
• 2003: Skype
• 2004: PPLive
• 2006: TVKoo, TVAnts, PPStream, SopCast, Video-on-Demand, Gaming
(Figure: timeline of the above protocols, grouped by application type: file download, streaming, telephony, Video-on-Demand, gaming)
Technical taxonomies
• Indexing / distribution
• Structured / unstructured
• Push / pull / publish-subscribe
• …
Examples
• Unstructured P2P overlays
– Generally a random overlay
– Used for content download, telephony, streaming
• Structured P2P overlays
– Distributed Hash Tables (DHTs)
– Used for node localization, content download, streaming
To summarize
• P2P gathers many techniques
• Not black or white: all shades of grey
• Hint: P2P is all about answering the question
Who gets what from whom?
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Unstructured P2P overlays
• Unstructured P2P overlays do not really care how the overlay is constructed
– Peers are organized in a random graph topology
• e.g., a new node randomly chooses three existing nodes as neighbors
• Flat or hierarchical
– Build your P2P service on top of this graph
• Several proposals
– Gnutella
– KaZaA/FastTrack*
– BitTorrent*
Unstructured P2P overlays (cont'd)
• Unstructured P2P overlays are just a framework; you can build many applications on top of them
• "Pure" P2P
• Unstructured P2P overlays pros & cons
– Pros
• Very flexible: copes with dynamicity
• Supports complex queries (contrary to structured overlays)
– Cons
• Content search is difficult: there is a tradeoff between generated traffic (overhead) and the horizon of the partial view that can be searched
One example of usage of unstructured overlays
• Typical problem in unstructured overlays: how to do content search and queries?
– Flooding
– Limited scope: send only to a subset of your neighbors
– Time-To-Live (TTL): limit the number of hops per message
(Figure: example of flooding, similar to Gnutella: the query "Britney Spears" is flooded, a peer finds a matching entry, notifies the requester, and the upload starts; see the sketch below)
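A minimal sketch of TTL-limited flooding over a random overlay (illustrative only; Gnutella's real message format, duplicate detection and forwarding rules differ):

# Illustrative TTL-limited flooding over a random overlay graph.
# 'graph' maps a peer to its neighbors; each peer has a local set of files.

def flood_search(graph, files, start, wanted, ttl=3):
    hits = set()
    seen = {start}
    frontier = [start]
    while frontier and ttl >= 0:
        next_frontier = []
        for peer in frontier:
            if wanted in files.get(peer, set()):
                hits.add(peer)           # peer would notify the requester here
            for neighbor in graph.get(peer, []):
                if neighbor not in seen: # do not forward the same query twice
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
        ttl -= 1                         # each hop consumes one TTL unit
    return hits

graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
files = {"D": {"britney_spears.mp3"}}
print(flood_search(graph, files, "A", "britney_spears.mp3", ttl=2))  # {'D'}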
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Structured P2P overlays
• Motivation
– Locate content efficiently
• Solution: DHT (Distributed Hash Table)
– Particular nodes hold particular keys
– Locate a key: route the search request to the particular node that holds the key
– Representative solutions
• CAN, Pastry/Tapestry, Chord, etc.
Challenges for structured P2P overlays
• Load balance
– Spreading keys evenly over the nodes
• Decentralization
– No node is more important than any other
• Scalability
– Lookups must be efficient even in large systems
• Peer dynamics
– Nodes may come and go, may fail
– Make sure that “the/a” node responsible for a key can always be found
Content-Addressable Network (CAN)
• Maps keys onto a 2-dimensional coordinate space
• Each node is responsible for a fraction of the space
• A pair of coordinates is obtained by applying two hash functions to the value to be stored
• Routing is done by forwarding to the neighbor that is closest in Cartesian distance to the destination
(Figure: the coordinate space [0, 2^m) x [0, 2^m), with a stored point (x, y))
CAN example
(Figure: a 2-D space partitioned among 10 nodes)
Routing table for node 2:
Node  Range
2     (2, 4) to (4, 8)
1     (0, 4) to (2, 8)
3     (4, 7) to (5, 8)
4     (4, 6) to (5, 7)
6     (0, 2) to (2, 4)
The entry hashed to (7, 1) is stored at node 10.
Search path from node 2 to (7, 1): 2 → 4 → 9 → 10 (see the sketch below)
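A minimal sketch of CAN-style greedy routing, assuming each node knows the zone centers of its neighbors (illustrative; zone splitting, wrap-around on the torus and neighbor maintenance are omitted):

import math

# Illustrative CAN-style greedy routing (not a full CAN implementation):
# each node forwards the query to the neighbor whose zone center is
# closest, in Euclidean distance, to the target point.

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def can_route(neighbors, centers, start, target, max_hops=32):
    path, current = [start], start
    for _ in range(max_hops):
        closer = min(neighbors[current], key=lambda n: dist(centers[n], target))
        if dist(centers[closer], target) >= dist(centers[current], target):
            break                     # no neighbor is closer: current owns the zone
        current = closer
        path.append(current)
    return path

# Hypothetical 4-node layout, each node owning one quadrant of an 8 x 8 space.
centers = {"A": (2, 2), "B": (6, 2), "C": (2, 6), "D": (6, 6)}
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
print(can_route(neighbors, centers, "A", (7, 7)))   # ['A', 'B', 'D']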
CAN joins and departures
• The joining node A picks a random point (X, Y) and contacts a node already in the overlay
– A join message is routed to B, the node responsible for that point
– B divides its space with A and shares its routing information appropriately
– A and B contact all their neighbors with the updated information
• A departing node contacts its neighbors and finds one that can manage its space
– That node sends updated routing information to its neighbors
CAN features
• Simple to understand and thus simple to add features to:
– Overlapping regions for redundancy
– D-dimensional space (not just 2)
– Dynamic division of space for load balancing
• Hop count: O(D · n^(1/D))
• Routing table size: O(D)
• Possible overload
Tapestry/Pastry
• Both are based on the Plaxton mesh
– Nodes act as routers, clients and servers simultaneously
– Objects are stored on servers and searched for by clients
• Objects and nodes have unique, location-independent names
– A random fixed-length bit sequence in some base
– Typically a hash of the hostname or of the object name
– A hash function h assigns each node and key an m-bit identifier
• What Plaxton does:
Key = "LetItBe"  -> h -> ID = 60;   IP = "157.25.10.1"  -> h -> ID = 101
Plaxton Mesh Routing
• Routing is done incrementally, digit by digit
– In each step, the message is sent to a node whose address is one digit closer to the destination
• Example: node 0325 sends M to node 4598
– Possible route: 0325 → B4F8 → 9098 → 7598 → 4598
Plaxton Mesh Routing Tables
• Each node has a neighbor map
– Used to find a node whose address has one more digit in common with the destination
• Size of the map: b · log_b(N), where b is the base used for the addresses
Plaxton Mesh Routing Tables
• The x's are wildcards: they can be filled with any digit
– Each slot can hold several addresses, for redundancy
• Example: node 3642 receives a message for node 2342
– Digits in common with the destination: XX42
• Look into the second column
– M must be sent to a node one digit "closer" to the destination
• Any host with an address like X342 (see the sketch below)
(Figure: the routing table of node 3642)
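A minimal sketch of this digit-by-digit routing (illustrative only; real Tapestry/Pastry tables, their suffix vs. prefix conventions, and failure handling are more involved). Here each hop picks, among the known nodes, one that shares at least one more trailing digit with the destination:

# Illustrative digit-by-digit routing in the style of a Plaxton mesh.
# Each hop increases the number of trailing digits shared with the destination.
# Real Tapestry/Pastry routing tables only keep a few candidates per slot.

def shared_suffix_len(a, b):
    n = 0
    while n < len(a) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def plaxton_route(nodes, start, dest):
    path, current = [start], start
    while current != dest:
        k = shared_suffix_len(current, dest)
        # Pick a known node that shares at least one more trailing digit.
        candidates = [n for n in nodes if shared_suffix_len(n, dest) > k]
        if not candidates:
            break                 # no closer node known
        current = min(candidates, key=lambda n: shared_suffix_len(n, dest))
        path.append(current)
    return path

nodes = ["0325", "B4F8", "9098", "7598", "4598"]
print(plaxton_route(nodes, "0325", "4598"))
# ['0325', 'B4F8', '9098', '7598', '4598']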
Chord Protocol
• A consistent hash function (SHA-1) assigns each node and key an m-bit identifier
• Identifiers are ordered on an identifier circle modulo 2^m, called a Chord ring
• successor(k) = the first node whose identifier is >= the identifier of k in identifier space
Key = "LetItBe"  -> SHA-1 -> ID = 60;   IP = "157.25.10.1"  -> SHA-1 -> ID = 101
Consistent Hashing (cont.)
N32"
N123" K20"
K5"
Circular 7-bit ID space
IP=“198.10.10.1” 0
K101"
SHA1(Key=“LetItBe”)
A key is stored at its successor: node with next higher ID
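A minimal sketch of consistent hashing on such a ring (illustrative; real Chord uses 160-bit SHA-1 identifiers and a distributed lookup rather than a globally known, sorted node list):

import hashlib

# Illustrative consistent hashing on a small ring (m = 7 bits, as in the figure).
M = 7

def ring_id(name):
    # Hash a node address or a key name to an m-bit identifier.
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** M)

def successor(node_ids, key_id):
    # A key is stored at the first node whose ID is >= the key ID (mod 2^m).
    for nid in sorted(node_ids):
        if nid >= key_id:
            return nid
    return min(node_ids)          # wrap around the ring

nodes = [ring_id(ip) for ip in ["198.10.10.1", "157.25.10.2", "10.0.0.3"]]
key = ring_id("LetItBe")
print(key, "is stored at node", successor(nodes, key))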
Theorem
• For any set of N nodes and K keys, with high probability:
1. Each node is responsible for at most (1+ε)K/N keys.
2. When an (N+1)st node joins or leaves the network, responsibility for O(K/N) keys changes hands.
where ε = O(log N)
Simple Key Location Scheme
(Figure: a lookup for key K45 follows successor pointers around the ring, N1 → N8 → N14 → N38 → N42 → N48, until it reaches the successor of 45)
Scalable Lookup Scheme
• finger[k] = first node that succeeds (n + 2^(k-1)) mod 2^m
Finger table for N8 (on a ring with nodes N1, N8, N14, N21, N32, N38, N42, N48, N51, N56):
N8 + 1   ->  N14
N8 + 2   ->  N14
N8 + 4   ->  N14
N8 + 8   ->  N21
N8 + 16  ->  N32
N8 + 32  ->  N42
Lookup Using Finger Table
(Figure: lookup(54) issued at N8 jumps along finger pointers across the ring until it reaches N56, the successor of key 54)
Scalable Lookup Scheme
• Each node forwards the query at least halfway along the remaining distance to the target
• Theorem: with high probability, the number of nodes that must be contacted to find a successor in an N-node network is O(log N)
Chord joins and departures
• The joining node gets an ID N and contacts a node A already in the ring
– A searches for N's predecessor B
– N becomes B's successor
– N contacts B, gets its successor pointer and finger table, and updates its finger table values
– N contacts its successor and inherits the relevant keys
• A departing node contacts its predecessor, notifies it of its departure, and sends it the new successor pointer
Chord features
• Correct in the face of routing failures, since routing can always fall back on successor pointers
• Replication can easily reduce the risk of losing data
• Caching can easily place data along likely search paths for future queries
• Simple to understand
Scalable Lookup Scheme
// ask node n to find the successor of id
n.find_successor(id)
  if (id belongs to (n, successor])
    return successor;
  else
    n' = closest_preceding_node(id);
    return n'.find_successor(id);

// search the local table for the highest predecessor of id
n.closest_preceding_node(id)
  for i = m downto 1
    if (finger[i] belongs to (n, id))
      return finger[i];
  return n;
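For comparison, a single-process Python sketch of the same finger-table lookup (illustrative: the whole ring is simulated locally, whereas real Chord nodes only know their own finger table and make remote calls):

# Illustrative single-process simulation of Chord's finger-table lookup.
M = 6                                   # identifier space: 0 .. 2^m - 1

def in_interval(x, a, b, inclusive_right=False):
    """True if x lies in the circular interval (a, b) (or (a, b])."""
    if a < b:
        return a < x < b or (inclusive_right and x == b)
    return x > a or x < b or (inclusive_right and x == b)

class Node:
    def __init__(self, ident, ring):
        self.id, self.ring = ident, sorted(ring)
        succ = lambda k: next((n for n in self.ring if n >= k % 2**M), self.ring[0])
        self.successor = succ((self.id + 1) % 2**M)
        self.finger = [succ((self.id + 2**i) % 2**M) for i in range(M)]

    def closest_preceding_node(self, key):
        for f in reversed(self.finger):
            if in_interval(f, self.id, key):
                return f
        return self.id

    def find_successor(self, key, nodes):
        if in_interval(key, self.id, self.successor, inclusive_right=True):
            return self.successor
        return nodes[self.closest_preceding_node(key)].find_successor(key, nodes)

ring = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
nodes = {i: Node(i, ring) for i in ring}
print(nodes[8].find_successor(54, nodes))   # 56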
Pastry: Routing procedure
if (destination D is within range of our leaf set)
  forward to the numerically closest member of the leaf set
else
  let l = length of the prefix shared with D
  let d = value of the l-th digit in D's address
  if (R[l][d] exists)
    forward to R[l][d]
  else
    forward to a known node that
    (a) shares at least as long a prefix with D, and
    (b) is numerically closer to D than the present node
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Full hybrid model (Servers, Leechers, Seeders)
(Figure: central servers (C) serving a swarm of leechers (L) and seeders)
Notation
• Servers / leechers / seeders: …
• Population size: …
• Upload: …
• Download: …
The Bandwidth Conservation Law
• What you download is uploaded somewhere
• Assuming unbounded download capacities
• … : efficiency parameter
• Ideal systems: … (see the sketch below)
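The formulas on this slide did not survive extraction; a plausible reconstruction under assumed notation (n peers of average upload u and achieved download d, server capacity u_C, efficiency η in [0, 1]) is:

% Conservation: everything that is downloaded was uploaded by a server or a peer,
% so with unbounded download capacities the achievable average download rate is
n\, d \;=\; \eta\,\bigl(u_C + n\, u\bigr)
\quad\Longleftrightarrow\quad
d \;=\; \eta\Bigl(\frac{u_C}{n} + u\Bigr).
% \eta \in [0,1] is the efficiency parameter; ideal systems have \eta = 1.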
The Bandwidth Conservation Law
• Assume bandwidth is evenly distributed
• Works for fixed-size or open systems
• Works for streaming or elastic rate
• Can be upgraded at will (efficiency, variable d_max, …)
Example #1: streaming
• For a given stream rate, let … and … (see the sketch below)
• If … , the stream rate can be achieved, otherwise…
• … : normalized capacity of the L/S (leecher/seeder) system
• Case 1:
– The system is scalable (it works independently of the population size)
– Servers are unnecessary (except for bootstrap/security)
• Case 2:
– … is bounded by …
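Continuing with the same assumed notation (stream rate r, n peers of average upload u, server capacity u_C, efficiency η), a plausible reconstruction of the missing condition is:

% Normalized capacity of the leecher/seeder system (assumed definition):
\rho \;=\; \frac{\eta\, u}{r}.
% Case 1: \rho \ge 1. The stream rate is sustained by the peers alone, for any
% population size n (scalable; servers only needed for bootstrap/security).
% Case 2: \rho < 1. Servers must fill the gap, so the served population is bounded:
n \;\le\; \frac{\eta\, u_C}{\,r - \eta\, u\,}.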
Joost (the old version)
• The monitored upload/download ratio gives …
• If … (no seeders), P2P increases the server capacity by 20%
• If … , this becomes 50% (most likely behavior)
• If … , 100%
• …
• True scalability can be achieved only if … all the time!
• Conclusion: Joost was probably hybrid (not scalable)
Example #2: filesharing
• Goal: download a file of size …
• Peers enter at rate λ as leechers
• A peer needs to download the file, then seeds
• After some time, a seeder leaves
• This is a dynamic system
• Focus on the stationary regime
Stationary regime
(Figure: leechers arrive at rate λ, become seeders once the download completes, and eventually leave; see the sketch below)
Overprovisioning condition
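The slide's own formulas did not survive extraction; a hedged sketch of the stationary-regime balance, under assumed notation (file size F, arrival rate λ, average download time T_d, average seeding time T_s, peer upload u, server capacity u_C, efficiency η), is:

% Little's law in the stationary regime: with arrival rate \lambda,
L = \lambda\, T_d \quad\text{(leechers)}, \qquad S = \lambda\, T_s \quad\text{(seeders)}.
% Bandwidth conservation: the aggregate download rate \lambda F must be
% covered by servers, leechers and seeders,
\lambda\, F \;\le\; \eta\,\bigl(u_C + (L + S)\, u\bigr).
% Overprovisioning condition (scalability without servers, u_C = 0):
\eta\, u\,(T_d + T_s) \;\ge\; F.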
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
One-to-one transmission
• Time needed for data to be transmitted from one peer to another
• For stripes (continuous streams of data): latency
• For chunks (atomic units of data):
NegoTime + latency + ChunkSize/Bandwidth
• For "big" chunks, this is approximately
ChunkSize/Bandwidth
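A small worked example with made-up numbers (illustrative, not from the slides): a 1 MB chunk (about 8 Mbit) over a 10 Mbit/s upload link with 50 ms latency and 20 ms of negotiation gives

t \;=\; \text{NegoTime} + \text{latency} + \frac{\text{ChunkSize}}{\text{Bandwidth}}
  \;=\; 0.02\,\mathrm{s} + 0.05\,\mathrm{s} + \frac{8\,\mathrm{Mbit}}{10\,\mathrm{Mbit/s}}
  \;=\; 0.87\,\mathrm{s}.

The bandwidth term (0.8 s) dominates, which is why the "big chunk" approximation ChunkSize/Bandwidth is reasonable.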
P2P broadcasting delays
• Play-out delay:
– Gap between the actual event and its display
– For live content only
• Start-up delay:
– Time needed to start watching
– For any multimedia content
• Propagation delay:
– Time for data to reach all peers after injection
– Absolute lower bound for the play-out delay
The logarithmic bound
• Homogeneous case: all … values are equal
– Stripes: …
– Big chunks: … (see the sketch below)
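A hedged reconstruction of the classical bound behind this slide, under assumed notation (n peers, per-hop latency ℓ, chunk time c = ChunkSize/Bandwidth): with homogeneous capacities, each peer that already holds a piece can serve one more peer per transmission time, so the served population can at best double each round.

% Stripes: each hop essentially only costs the latency, so
D_{\text{stripes}} \;\approx\; \ell \cdot \log n .
% Big chunks: a peer can serve one copy per chunk time, so the number of
% copies at best doubles every c, and
D_{\text{chunks}} \;\gtrsim\; c \cdot \lceil \log_2 n \rceil .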
The constant bound
• Delay grows logarithmically with the population size
• If the population size is bounded, so is the delay
– When not scalable, you necessarily have …
– Even if scalable, you may want a high …
To go further…
• DHTs
– Philippe Gauron's PhD thesis (and references therein)
http://tel.archives-ouvertes.fr/docs/00/14/09/01/PDF/thesePhilippeGauron.pdf
• Content distribution
– Habilitation thesis (and references therein)
http://gang.inria.fr/~fmathieu/pub/files/mathieu09autour.pdf