Introduction to Peer-to-Peer Systems
Based on materials by:
- Franck Petit, Lip6
- Yang Guo and Christoph Neumann, Corporate Research, Thomson Inc.
- Jon Crowcroft, Univ. of Cambridge
Fabien Mathieu (LIAFA, INRIA)
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
History, motivation and evolution
1999: Napster, the first widely used P2P application
P2P represented ~65% of Internet traffic at the end of 2006
1999: Napster, the first widely used P2P application
The application:
• A P2P application for the distribution of MP3 files
– Each user can contribute their own content
How it works:
• Central index server
– Maintains a list of all active peers and their available content
• Distributed storage and download
– Client nodes also act as file servers
– All downloaded content is shared
History, motivation and evolution - Napster (cont'd)
• Initial join
– Peers connect to the Napster server
– They transmit the current listing of their shared files to the server
(Figure: peers joining via the central index server)
History, motivation and evolution - Napster (cont'd)
• Content search
– Peers send a search request to the Napster server
– The Napster server checks its database and returns the list of matching peers
(Figure: 1) query, 2) answer, via the central index server)
History, motivation and evolution - Napster (cont'd)
• File retrieval
– The requesting peer contacts the peer holding the file directly and downloads it (see the sketch below)
(Figure: 1) request, 2) download, directly between the two peers)
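A minimal sketch of this central-index pattern (all class and function names here are illustrative, not Napster's actual protocol): the server only answers join and search requests, while the file transfer itself happens directly between peers.

# Illustrative sketch of a Napster-style central index (not the real protocol).

class IndexServer:
    def __init__(self):
        self.index = {}          # filename -> set of peer addresses

    def join(self, peer_addr, shared_files):
        # Initial join: the peer announces its shared files.
        for name in shared_files:
            self.index.setdefault(name, set()).add(peer_addr)

    def search(self, filename):
        # Content search: return the peers that advertise this file.
        return sorted(self.index.get(filename, set()))

# File retrieval then happens peer-to-peer, without the server:
server = IndexServer()
server.join("10.0.0.1:6699", ["songA.mp3", "songB.mp3"])
server.join("10.0.0.2:6699", ["songB.mp3"])
print(server.search("songB.mp3"))   # ['10.0.0.1:6699', '10.0.0.2:6699']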
History, motivation and evolution - File Download
• Napster was the first simple but successful P2P application. Many others followed…
P2P File Download Protocols:
• 1999: Napster
• 2000: Gnutella, eDonkey
• 2001: Kazaa
• 2002: eMule, BitTorrent
Why is P2P so successful?
• Scalable: it's all about sharing resources
– No need to provision servers or bandwidth
– Each user brings their own resources
– e.g., resistant to flash crowds (a flash crowd is a crowd of users all arriving at the same time)
Resources could be:
• Files to share
• Upload bandwidth
• Disk storage
• …
Why is P2P so successful? (cont’d)
• Cheap: no infrastructure needed
• Everybody can bring their own content (at no cost)
– Homemade content
– Niche content
– Illegal content
– But also legal content
– …
• High availability: content accessible most of the time (replication)
Remark
• P2P traffic has been decreasing since 2008
– Growth of web-based streaming (YouTube)
– Online storage (MegaUpload)
– Ergonomics
– Legal issues
• But next-generation (hybrid) P2P is on the move
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Definitions of P2P
• Common sense: BitTorrent, eMule, UUSee, PPLive, Skype
• DADVSI (French law): allows one to get illegal content
• Network I: anything outside the client/server paradigm
• Network II: a distributed system made of end users that share resources
Definitions of P2P
Peer-to-peer systems
• Assumed to have no a priori distinguished roles
• So no single point of bottleneck or failure
• However, this means they need distributed algorithms for
– Connection protocol
– Service discovery (name, address, route, metric, etc.)
– Neighbour status tracking
– Application-layer routing (possibly based on content, interest, etc.)
– Resilience, handling link and node failures
– etc.
Overlays and peer-to-peer systems
• P2P systems rely on an overlay structure
• Focus is at the application level
P2P Overlay (cont'd)
(Figures: the same network seen as underlay and as overlay, with the source node present in both views)
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Taxonomy of Applications
Applications:
• Parallel Computation (GRID Computing)
– Intensive Computing (same task x times)
– Component Computing (y small tasks)
• Resource Management
– Content Sharing
– Shared File System
– Distributed Database
• Collaboration Applications
– Chatting
– Shared Applications (Games, Specific…)
Taxonomy of Applications
• P2P file download
– Napster, Gnutella, KaZaA, eDonkey, BitTorrent…
• P2P communication
– VoIP, Skype, Messaging, …
• P2P Video-on-Demand
• P2P computation
– GRID, scientific computation
(SETI@home, XtreemOS, BOINC, …)
• P2P streaming
– PPLive, End System Multicast (ESM), …
• P2P gaming
– WoW, City of Heroes, …
History, motivation and evolution - Applications
• P2P is not restricted to file download!
P2P Protocols:
• 1999: Napster, End System Multicast (ESM)
• 2000: Gnutella, eDonkey
• 2001: Kazaa
• 2002: eMule, BitTorrent
• 2003: Skype
• 2004: PPLive
• 2006: TVKoo, TVAnts, PPStream, SopCast, Video-on-Demand, Gaming
(Figure: timeline of the above protocols, grouped by application type: file download, streaming, telephony, Video-on-Demand, gaming)
Technical taxonomies
• Indexing / distribution
• Structured / unstructured
• Push / pull / publish-subscribe
• …
Examples
• Unstructured P2P overlays
– Generally a random overlay
– Used for content download, telephony, streaming
• Structured P2P overlays
– Distributed Hash Tables (DHTs)
– Used for node localization, content download, streaming
To summarize
• P2P gathers many techniques
• Not black or white: all shades of grey
• Hint: P2P is all about answering the question
Who gets what from whom?
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Unstructured P2P overlays
• Unstructured P2P overlays do not really care how the overlay is constructed
– Peers are organized in a random graph topology
• e.g., a new node randomly chooses three existing nodes as neighbors
• Flat or hierarchical
– Build your P2P service on top of this graph
• Several proposals
– Gnutella
– KaZaA/FastTrack*
– BitTorrent*
Unstructured P2P overlays (cont'd)
• Unstructured P2P overlays are just a framework; you can build many applications on top of them
• "Pure" P2P
• Unstructured P2P overlays pros & cons
– Pros
• Very flexible: copes with dynamicity
• Supports complex queries (contrary to structured overlays)
– Cons
• Content search is difficult: there is a tradeoff between generated traffic (overhead) and the horizon of the partial view that can be searched
One example of usage of unstructured overlays
• Typical problem in unstructured overlays: how to do content search and queries?
– Flooding
– Limited scope: send only to a subset of your neighbors
– Time-To-Live (TTL): limit the number of hops per message
(Figure: example of flooding, similar to Gnutella: the query "Britney Spears" is flooded, a peer finds a matching entry, notifies the requester, and the upload starts; see the sketch below)
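A minimal sketch of TTL-limited flooding over a random overlay (illustrative only; Gnutella's real message format, duplicate detection and forwarding rules differ):

# Illustrative TTL-limited flooding over a random overlay graph.
# 'graph' maps a peer to its neighbors; each peer has a local set of files.

def flood_search(graph, files, start, wanted, ttl=3):
    hits = set()
    seen = {start}
    frontier = [start]
    while frontier and ttl >= 0:
        next_frontier = []
        for peer in frontier:
            if wanted in files.get(peer, set()):
                hits.add(peer)           # peer would notify the requester here
            for neighbor in graph.get(peer, []):
                if neighbor not in seen: # do not forward the same query twice
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
        ttl -= 1                         # each hop consumes one TTL unit
    return hits

graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
files = {"D": {"britney_spears.mp3"}}
print(flood_search(graph, files, "A", "britney_spears.mp3", ttl=2))  # {'D'}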
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Structured P2P overlays
• Motivation
– Locate content efficiently
• Solution: DHT (Distributed Hash Table)
– Particular nodes hold particular keys
– Locate a key: route the search request to the particular node that holds the key
– Representative solutions
• CAN, Pastry/Tapestry, Chord, etc.
Challenges for structured P2P overlays
• Load balance
– Spreading keys evenly over the nodes
• Decentralization
– No node is more important than any other
• Scalability
– Lookups must be efficient even in large systems
• Peer dynamics
– Nodes may come and go, may fail
– Make sure that “the/a” node responsible for a key can always be found
Content-Addressable Network (CAN)
• Maps keys onto a 2-dimensional coordinate space
• Each node is responsible for a fraction of the space
• A pair of coordinates is obtained by applying two hash functions to the value to be stored
• Routing is done by forwarding to the neighbor that is closest in Cartesian distance to the destination
(Figure: the coordinate space [0, 2^m) x [0, 2^m), with a stored point (x, y))
CAN example
(Figure: a 2-D space partitioned among 10 nodes)
Routing table for node 2:
Node  Range
2     (2, 4) to (4, 8)
1     (0, 4) to (2, 8)
3     (4, 7) to (5, 8)
4     (4, 6) to (5, 7)
6     (0, 2) to (2, 4)
The entry hashed to (7, 1) is stored at node 10.
Search path from node 2 to (7, 1): 2 → 4 → 9 → 10 (see the sketch below)
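A minimal sketch of CAN-style greedy routing, assuming each node knows the zone centers of its neighbors (illustrative; zone splitting, wrap-around on the torus and neighbor maintenance are omitted):

import math

# Illustrative CAN-style greedy routing (not a full CAN implementation):
# each node forwards the query to the neighbor whose zone center is
# closest, in Euclidean distance, to the target point.

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def can_route(neighbors, centers, start, target, max_hops=32):
    path, current = [start], start
    for _ in range(max_hops):
        closer = min(neighbors[current], key=lambda n: dist(centers[n], target))
        if dist(centers[closer], target) >= dist(centers[current], target):
            break                     # no neighbor is closer: current owns the zone
        current = closer
        path.append(current)
    return path

# Hypothetical 4-node layout, each node owning one quadrant of an 8 x 8 space.
centers = {"A": (2, 2), "B": (6, 2), "C": (2, 6), "D": (6, 6)}
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
print(can_route(neighbors, centers, "A", (7, 7)))   # ['A', 'B', 'D']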
CAN joins and departures
• The joining node A picks a random point (X, Y) and contacts a node already in the overlay
– A join message is routed to B, the node responsible for that point
– B divides its space with A and shares its routing information appropriately
– A and B contact all their neighbors with the updated information
• A departing node contacts its neighbors and finds one that can manage its space
– That node sends updated routing information to its neighbors
CAN features
• Simple to understand and thus simple to add features to:
– Overlapping regions for redundancy
– D-dimensional space (not just 2)
– Dynamic division of space for load balancing
• Hop count: O(D · n^(1/D))
• Routing table size: O(D)
• Possible overload
Tapestry/Pastry
• Both are based on the Plaxton mesh
– Nodes act as routers, clients and servers simultaneously
– Objects are stored on servers and searched for by clients
• Objects and nodes have unique, location-independent names
– A random fixed-length bit sequence in some base
– Typically a hash of the hostname or of the object name
– A hash function h assigns each node and key an m-bit identifier
• What Plaxton does:
Key = "LetItBe"  -> h -> ID = 60;   IP = "157.25.10.1"  -> h -> ID = 101
Plaxton Mesh Routing
• Routing is done incrementally, digit by digit
– In each step, the message is sent to a node whose address is one digit closer to the destination
• Example: node 0325 sends M to node 4598
– Possible route: 0325 → B4F8 → 9098 → 7598 → 4598
Plaxton Mesh Routing Tables
• Each node has a neighbor map
– Used to find a node whose address has one more digit in common with the destination
• Size of the map: b · log_b(N), where b is the base used for the addresses
Plaxton Mesh Routing Tables
• The x's are wildcards: they can be filled with any digit
– Each slot can hold several addresses, for redundancy
• Example: node 3642 receives a message for node 2342
– Digits in common with the destination: XX42
• Look into the second column
– M must be sent to a node one digit "closer" to the destination
• Any host with an address like X342 (see the sketch below)
(Figure: the routing table of node 3642)
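A minimal sketch of this digit-by-digit routing (illustrative only; real Tapestry/Pastry tables, their suffix vs. prefix conventions, and failure handling are more involved). Here each hop picks, among the known nodes, one that shares at least one more trailing digit with the destination:

# Illustrative digit-by-digit routing in the style of a Plaxton mesh.
# Each hop increases the number of trailing digits shared with the destination.
# Real Tapestry/Pastry routing tables only keep a few candidates per slot.

def shared_suffix_len(a, b):
    n = 0
    while n < len(a) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def plaxton_route(nodes, start, dest):
    path, current = [start], start
    while current != dest:
        k = shared_suffix_len(current, dest)
        # Pick a known node that shares at least one more trailing digit.
        candidates = [n for n in nodes if shared_suffix_len(n, dest) > k]
        if not candidates:
            break                 # no closer node known
        current = min(candidates, key=lambda n: shared_suffix_len(n, dest))
        path.append(current)
    return path

nodes = ["0325", "B4F8", "9098", "7598", "4598"]
print(plaxton_route(nodes, "0325", "4598"))
# ['0325', 'B4F8', '9098', '7598', '4598']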
Chord Protocol
• A consistent hash function (SHA-1) assigns each node and key an m-bit identifier
• Identifiers are ordered on an identifier circle modulo 2^m, called a Chord ring
• successor(k) = the first node whose identifier is >= the identifier of k in identifier space
Key = "LetItBe"  -> SHA-1 -> ID = 60;   IP = "157.25.10.1"  -> SHA-1 -> ID = 101
Consistent Hashing (cont.)
N32"
N123" K20"
K5"
Circular 7-bit ID space
IP=“198.10.10.1” 0
K101"
SHA1(Key=“LetItBe”)
A key is stored at its successor: node with next higher ID
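A minimal sketch of consistent hashing on such a ring (illustrative; real Chord uses 160-bit SHA-1 identifiers and a distributed lookup rather than a globally known, sorted node list):

import hashlib

# Illustrative consistent hashing on a small ring (m = 7 bits, as in the figure).
M = 7

def ring_id(name):
    # Hash a node address or a key name to an m-bit identifier.
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** M)

def successor(node_ids, key_id):
    # A key is stored at the first node whose ID is >= the key ID (mod 2^m).
    for nid in sorted(node_ids):
        if nid >= key_id:
            return nid
    return min(node_ids)          # wrap around the ring

nodes = [ring_id(ip) for ip in ["198.10.10.1", "157.25.10.2", "10.0.0.3"]]
key = ring_id("LetItBe")
print(key, "is stored at node", successor(nodes, key))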
Theorem
• For any set of N nodes and K keys, with high probability:
1. Each node is responsible for at most (1+ε)K/N keys.
2. When an (N+1)st node joins or leaves the network, responsibility for O(K/N) keys changes hands.
where ε = O(log N)
Simple Key Location Scheme
(Figure: a lookup for key K45 follows successor pointers around the ring, N1 → N8 → N14 → N38 → N42 → N48, until it reaches the successor of 45)
Scalable Lookup Scheme
• finger[k] = first node that succeeds (n + 2^(k-1)) mod 2^m
Finger table for N8 (on a ring with nodes N1, N8, N14, N21, N32, N38, N42, N48, N51, N56):
N8 + 1   ->  N14
N8 + 2   ->  N14
N8 + 4   ->  N14
N8 + 8   ->  N21
N8 + 16  ->  N32
N8 + 32  ->  N42
Lookup Using Finger Table
(Figure: lookup(54) issued at N8 jumps along finger pointers across the ring until it reaches N56, the successor of key 54)
Scalable Lookup Scheme
• Each node forwards the query at least halfway along the remaining distance to the target
• Theorem: with high probability, the number of nodes that must be contacted to find a successor in an N-node network is O(log N)
Chord joins and departures
• The joining node gets an ID N and contacts a node A already in the ring
– A searches for N's predecessor B
– N becomes B's successor
– N contacts B, gets its successor pointer and finger table, and updates its finger table values
– N contacts its successor and inherits the relevant keys
• A departing node contacts its predecessor, notifies it of its departure, and sends it the new successor pointer
Chord features
• Correct in the face of routing failures, since routing can always fall back on successor pointers
• Replication can easily reduce the risk of losing data
• Caching can easily place data along likely search paths for future queries
• Simple to understand
Scalable Lookup Scheme
// ask node n to find the successor of id
n.find_successor(id)
  if (id belongs to (n, successor])
    return successor;
  else
    n' = closest_preceding_node(id);
    return n'.find_successor(id);

// search the local table for the highest predecessor of id
n.closest_preceding_node(id)
  for i = m downto 1
    if (finger[i] belongs to (n, id))
      return finger[i];
  return n;
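For comparison, a single-process Python sketch of the same finger-table lookup (illustrative: the whole ring is simulated locally, whereas real Chord nodes only know their own finger table and make remote calls):

# Illustrative single-process simulation of Chord's finger-table lookup.
M = 6                                   # identifier space: 0 .. 2^m - 1

def in_interval(x, a, b, inclusive_right=False):
    """True if x lies in the circular interval (a, b) (or (a, b])."""
    if a < b:
        return a < x < b or (inclusive_right and x == b)
    return x > a or x < b or (inclusive_right and x == b)

class Node:
    def __init__(self, ident, ring):
        self.id, self.ring = ident, sorted(ring)
        succ = lambda k: next((n for n in self.ring if n >= k % 2**M), self.ring[0])
        self.successor = succ((self.id + 1) % 2**M)
        self.finger = [succ((self.id + 2**i) % 2**M) for i in range(M)]

    def closest_preceding_node(self, key):
        for f in reversed(self.finger):
            if in_interval(f, self.id, key):
                return f
        return self.id

    def find_successor(self, key, nodes):
        if in_interval(key, self.id, self.successor, inclusive_right=True):
            return self.successor
        return nodes[self.closest_preceding_node(key)].find_successor(key, nodes)

ring = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
nodes = {i: Node(i, ring) for i in ring}
print(nodes[8].find_successor(54, nodes))   # 56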
Pastry: Routing procedure
if (destination D is within range of our leaf set)
  forward to the numerically closest member of the leaf set
else
  let l = length of the prefix shared with D
  let d = value of the l-th digit in D's address
  if (R[l][d] exists)
    forward to R[l][d]
  else
    forward to a known node that
    (a) shares at least as long a prefix with D, and
    (b) is numerically closer to D than the present node
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
Full hybrid model (Servers, Leechers, Seeders)
(Figure: central servers (C) serving a swarm of leechers (L) and seeders)
Notation
• Servers / leechers / seeders: …
• Population size: …
• Upload: …
• Download: …
The Bandwidth Conservation Law
• What you download is uploaded somewhere
• Assuming unbounded download capacities
• … : efficiency parameter
• Ideal systems: … (see the sketch below)
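The formulas on this slide did not survive extraction; a plausible reconstruction under assumed notation (n peers of average upload u and achieved download d, server capacity u_C, efficiency η in [0, 1]) is:

% Conservation: everything that is downloaded was uploaded by a server or a peer,
% so with unbounded download capacities the achievable average download rate is
n\, d \;=\; \eta\,\bigl(u_C + n\, u\bigr)
\quad\Longleftrightarrow\quad
d \;=\; \eta\Bigl(\frac{u_C}{n} + u\Bigr).
% \eta \in [0,1] is the efficiency parameter; ideal systems have \eta = 1.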
The Bandwidth Conservation Law
• Assume bandwidth is evenly distributed
• Works for fixed-size or open systems
• Works for streaming or elastic rate
• Can be upgraded at will (efficiency, variable d_max, …)
Example #1: streaming
• For a given stream rate, let … and … (see the sketch below)
• If … , the stream rate can be achieved, otherwise…
• … : normalized capacity of the L/S (leecher/seeder) system
• Case 1:
– The system is scalable (it works independently of the population size)
– Servers are unnecessary (except for bootstrap/security)
• Case 2:
– … is bounded by …
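Continuing with the same assumed notation (stream rate r, n peers of average upload u, server capacity u_C, efficiency η), a plausible reconstruction of the missing condition is:

% Normalized capacity of the leecher/seeder system (assumed definition):
\rho \;=\; \frac{\eta\, u}{r}.
% Case 1: \rho \ge 1. The stream rate is sustained by the peers alone, for any
% population size n (scalable; servers only needed for bootstrap/security).
% Case 2: \rho < 1. Servers must fill the gap, so the served population is bounded:
n \;\le\; \frac{\eta\, u_C}{\,r - \eta\, u\,}.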
Joost (the old version)
• The monitored upload/download ratio gives …
• If … (no seeders), P2P increases the server capacity by 20%
• If … , this becomes 50% (most likely behavior)
• If … , 100%
• …
• True scalability can be achieved only if … all the time!
• Conclusion: Joost was probably hybrid (not scalable)
Example #2: filesharing
• Goal: download a file of size …
• Peers enter at rate λ as leechers
• A peer needs to download the file, then seeds
• After some time, a seeder leaves
• This is a dynamic system
• Focus on the stationary regime
Stationary regime
(Figure: leechers arrive at rate λ, become seeders once the download completes, and eventually leave; see the sketch below)
Overprovisioning condition
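The slide's own formulas did not survive extraction; a hedged sketch of the stationary-regime balance, under assumed notation (file size F, arrival rate λ, average download time T_d, average seeding time T_s, peer upload u, server capacity u_C, efficiency η), is:

% Little's law in the stationary regime: with arrival rate \lambda,
L = \lambda\, T_d \quad\text{(leechers)}, \qquad S = \lambda\, T_s \quad\text{(seeders)}.
% Bandwidth conservation: the aggregate download rate \lambda F must be
% covered by servers, leechers and seeders,
\lambda\, F \;\le\; \eta\,\bigl(u_C + (L + S)\, u\bigr).
% Overprovisioning condition (scalability without servers, u_C = 0):
\eta\, u\,(T_d + T_s) \;\ge\; F.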
Roadmap
• To begin with…
– Why P2P?
– Definition and key concepts
– Taxonomies
• Content indexing
– Unstructured solutions
– Structured solutions
• Content distribution
– Bandwidth provisioning
– Delay issues
One-to-one transmission
• Time needed for data to be transmitted from one peer to another
• For stripes (continuous streams of data): latency
• For chunks (atomic units of data):
NegoTime + latency + ChunkSize/Bandwidth
• For "big" chunks, this is approximately
ChunkSize/Bandwidth
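A small worked example with made-up numbers (illustrative, not from the slides): a 1 MB chunk (about 8 Mbit) over a 10 Mbit/s upload link with 50 ms latency and 20 ms of negotiation gives

t \;=\; \text{NegoTime} + \text{latency} + \frac{\text{ChunkSize}}{\text{Bandwidth}}
  \;=\; 0.02\,\mathrm{s} + 0.05\,\mathrm{s} + \frac{8\,\mathrm{Mbit}}{10\,\mathrm{Mbit/s}}
  \;=\; 0.87\,\mathrm{s}.

The bandwidth term (0.8 s) dominates, which is why the "big chunk" approximation ChunkSize/Bandwidth is reasonable.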
P2P broadcasting delays
• Play-out delay:
– Gap between the actual event and its display
– For live content only
• Start-up delay:
– Time needed to start watching
– For any multimedia content
• Propagation delay:
– Time for data to reach all peers after injection
– Absolute lower bound for the play-out delay
The logarithmic bound
• Homogeneous case: all … values are equal
– Stripes: …
– Big chunks: … (see the sketch below)
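A hedged reconstruction of the classical bound behind this slide, under assumed notation (n peers, per-hop latency ℓ, chunk time c = ChunkSize/Bandwidth): with homogeneous capacities, each peer that already holds a piece can serve one more peer per transmission time, so the served population can at best double each round.

% Stripes: each hop essentially only costs the latency, so
D_{\text{stripes}} \;\approx\; \ell \cdot \log n .
% Big chunks: a peer can serve one copy per chunk time, so the number of
% copies at best doubles every c, and
D_{\text{chunks}} \;\gtrsim\; c \cdot \lceil \log_2 n \rceil .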
The constant bound
• Delay grows logarithmically with the population size
• If the population size is bounded, so is the delay
– When not scalable, you necessarily have …
– Even if scalable, you may want a high …
To go further…
• DHTs
– Philippe Gauron's PhD thesis (and references therein)
http://tel.archives-ouvertes.fr/docs/00/14/09/01/PDF/thesePhilippeGauron.pdf
• Content distribution
– Habilitation thesis (and references therein)
http://gang.inria.fr/~fmathieu/pub/files/mathieu09autour.pdf