for Knowledge Management
Kazuhiro Kojima
AIST, 1-2-1 Namiki, Tsukuba, Ibaraki, Japan
k.kojima@aist.go.jp
Abstract. Locating contents is an essential function, but it presents a very difficult and challenging problem for large-scale Peer-to-Peer (P2P) systems. Many P2P systems, protocols, architectures, and search strategies have been proposed for this problem. In this paper, we focus on the self-organization of a community structure based on user preferences for P2P systems. We propose the following methods to improve P2P search performance: 1) Extended Pong, 2) Pong Proxy, 3) QRP with Firework, 4) Backward Learning, and 5) Community Self-Organization Algorithm. We evaluate the performance of the self-organized community network through simulations. The results show that the self-organized community network maintains a high query hit rate without overflow. As an example of implementation, we propose the Self-Organizable P2P Search Engine.
1 Introduction
Peer-to-Peer (P2P) protocols and applications such as Gnutella [1] and Freenet [2] extend the power of Internet users. The current major P2P application is sharing of illegal contents. In the future, P2P systems will be applied to distributed search engines, knowledge management, and myriad Internet applications. However, P2P systems present many problems, one of which is locating contents.
In Gnutella protocol v0.4, a flooding algorithm [1] is used to locate contents. A peer sends a Query tagged with a maximum Time-To-Live (TTL) to all its neighbors on the overlay network. Each peer forwards the Query to its neighbors. When the Query string partially or exactly matches a file name stored locally, the peer replies with a QueryHit. Though this algorithm is very simple and robust even when peers are joining or leaving the system, it lacks scalability. A system comprising a large number of peers is inundated with Queries even if peers do not forward Queries that they have forwarded previously.
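The flooding behavior described above can be illustrated with a minimal sketch. It is not the Gnutella implementation: the overlay is a hypothetical adjacency dictionary, `has_match` is a stand-in for the local file-name match, and, as a simplification, a peer also sends the Query back over the link it arrived on. The message counter shows how many copies are sent even though duplicates are never re-forwarded.

```python
from collections import deque

def flood_query(graph, origin, ttl, has_match):
    """Flood a query from `origin`; return (peers with a hit, messages sent)."""
    seen = {origin}              # peers that already handled this query
    hits, messages = [], 0
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if peer != origin and has_match(peer):
            hits.append(peer)    # this peer would reply with a QueryHit
        if remaining == 0:
            continue             # TTL exhausted: do not forward further
        for nbr in graph[peer]:
            messages += 1        # every forwarded copy costs one message
            if nbr not in seen:  # duplicates are received but not re-forwarded
                seen.add(nbr)
                queue.append((nbr, remaining - 1))
    return hits, messages

# Hypothetical 5-peer overlay; peer 4 holds the requested file.
overlay = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: [3]}
hits, messages = flood_query(overlay, origin=0, ttl=3, has_match=lambda p: p == 4)
```

With a TTL of 3 the query reaches peer 4; with a TTL of 2 it stops one hop short, which is the trade-off between reach and load that the TTL controls.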
This paper proposes an approach to locating contents: the self-organization of communities based on users' preferences. It is very natural to introduce a community structure into P2P systems. Communities in the real world and cyberspace are formed by people who have similar preferences. For example, they form mailing lists, chat rooms, and Bulletin Board Systems (BBSs). The members of each community share common knowledge with each other. We propose a new P2P protocol, TellaGate, that can self-organize communities and
[Figure 1: (a) Architecture of the Self-Organizable P2P Search Engine. Diagram labels: Browser, Document DB, Search Engine, Routing Engine, Connection Engine, TCP/IP LAN or WAN, Self-Organizable Search Engine. (b) Screenshot of the TellaGate Browser.]
Fig. 1. Self-Organizable P2P Search Engine and TellaGate Browser. (a) The proposed Self-Organizable P2P Search Engine comprises 4 modules: i. Document DB, ii. Search Engine, iii. Routing Engine, and iv. Connection Engine.
The remainder of this paper proceeds as follows: Section 2 proposes a Self-Organizable P2P Search Engine; Section 3 gives the full details of the TellaGate protocol; Section 4 shows through simulations that a self-organized community network maintains a high query hit rate without overflow; and Section 5 concludes the discussion and proposes future work.
2 Self-Organizable P2P Search Engine
We show the architecture of the Self-Organizable P2P Search Engine in Fig. 1(a). The proposed system comprises the following 4 modules.
i. Document DB
The shared documents are stored in the Document DB. Each document has a Globally Unique Identifier (GUID).
ii. Search Engine
The Search Engine provides a full-text search service. Keywords are extracted from the shared documents using text mining techniques.
iii. Routing Engine
The Routing Engine resolves message routing according to the Query Routing Protocol with Firework (Sec. 3.2).
iv. Connection Engine
The Connection Engine controls the TCP/IP connections with other peers according to the Community Self-Organization Algorithm (Sec. 3.3).
In Fig. 1(a), the user accesses the system through a browser, as shown in Fig. 1(b). The TCP connections between peers are reconfigured by the TellaGate protocol, so that community structures are self-organized on overlay networks.
To implement the Self-Organizable P2P Search Engine, we propose a new P2P protocol: the TellaGate protocol. In the next section, we give the full details of
[Figure 2: message layouts with byte-offset marks.
(a) Gnutella header. Fields: Descriptor ID, Payload Descriptor, TTL, Hop, Payload Length. Offset marks: 0, 15, 16, 17, 18, 19, 22.
(b) Pong. Fields: port, IP Address, Number of Files Shared, Number of Kilobytes Shared. Offset marks: 0, 1, 2, 5, 6, 10, 11, 13.
(c) Extended Pong. Fields: port, IP Address, PeerDigest. Offset marks: 0, 1, 2, 5, 6, 32005.]
Fig. 2. Structure of message: (a), (b), and (c) are the Gnutella header, Pong, and proposed extended Pong, respectively. (a) In a Ping, the Payload Length is 0. That is, a Ping comprises only a header. (c) The proposed extended Pong includes a PeerDigest.
3 TellaGate Protocol
There are evident communities in some conventional client-server web applications such as mailing lists, chat rooms, and BBSs. These communities are maintained by centralized servers. On the other hand, decentralized and distributed P2P systems have no such servers. In a social network, a so-called acquaintance network [3], communities are frequently self-organized by local interactions. This study implements local interactions in a decentralized P2P system using 1) Ping-Pong and 2) Bloom filters [4]. Next, we briefly explain these key technologies.
3.1 Key Technologies
Ping-Pong In Gnutella protocol v0.4, a peer uses a Ping to probe the network actively for other peers. Gnutella protocol messages comprise a header and a payload. A Ping has only a header portion, as shown in Fig. 2(a). A peer receiving a Ping forwards it to its neighbors and responds with a Pong, which contains its own IP address and port number and the amount of data it is sharing on the Gnutella network. The payload part of a Pong message is shown in Fig. 2(b). The number of Pings increases exponentially through replication and forwarding, thereby flooding the network with Ping-Pongs.
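As an illustration of the Pong payload in Fig. 2(b), the following sketch packs and unpacks the four fields into 14 bytes (offsets 0 to 13). The field widths (2-byte port, 4-byte IPv4 address, two 4-byte counters) and the little-endian byte order for the integers are assumptions made for this example; the function names are hypothetical.

```python
import socket
import struct

# Assumed layout: port (2 bytes), IPv4 address (4 bytes),
# number of files shared (4 bytes), number of kilobytes shared (4 bytes).
PONG_FORMAT = "<H4sII"

def pack_pong(port, ip, files_shared, kbytes_shared):
    """Build a Pong payload from its four fields."""
    return struct.pack(PONG_FORMAT, port, socket.inet_aton(ip),
                       files_shared, kbytes_shared)

def unpack_pong(payload):
    """Recover (port, ip, files_shared, kbytes_shared) from a Pong payload."""
    port, raw_ip, files_shared, kbytes_shared = struct.unpack(PONG_FORMAT, payload)
    return port, socket.inet_ntoa(raw_ip), files_shared, kbytes_shared

payload = pack_pong(6346, "192.168.0.1", 12, 3400)
fields = unpack_pong(payload)
```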
Bloom Filter and QRP A Bloom filter [4] is a quick and space-efficient data structure for representing a set of N elements to support membership queries. An array of M bits and K independent hash functions are used for a Bloom filter. A Bloom filter may generate false positives. According to the analysis in [5], the false positive rate is given as
$$ f = \left(1 - e^{-K/r}\right)^{K}, \qquad (1) $$
where r represents the number of bits per element, M/N. Some exemplary values are:
M/N = 12, K = 8: f = 0.00314
M/N = 16, K = 11: f = 0.000458
We can choose a reasonable value of M/N and K by considering the expected false positive rate.
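Eq. (1) is easy to evaluate numerically when choosing M/N and K. The short sketch below (the function name is ours) reproduces the two exemplary values above and the setting M/N = 4 with four hash functions used later in the simulations.

```python
import math

def false_positive_rate(bits_per_element, num_hashes):
    """Eq. (1): f = (1 - exp(-K / r)) ** K, with r = M / N."""
    return (1.0 - math.exp(-num_hashes / bits_per_element)) ** num_hashes

f_12_8 = false_positive_rate(12, 8)     # about 0.00314
f_16_11 = false_positive_rate(16, 11)   # about 0.000458
f_4_4 = false_positive_rate(4, 4)       # about 0.160
```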
Rohrs proposed a Query Routing Protocol (QRP) [6] for Gnutella networks. In QRP, keywords are hashed and embedded in a Bloom filter. The Bloom filter of each peer is sent and forwarded to its neighbors so that the Bloom filter is propagated throughout the network. Queries are routed according to the QRP without a flooding algorithm.
3.2 Improvements
The above subsection briefly explained Ping-Pong, Bloom filters, and QRP. We extend these important elements to improve the P2P system performance.
Extended Pong We extend the Pong message so that we can use Bloom filters to self-organize communities. The Extended Pong (ExPong) includes a Bloom filter, as shown in Fig. 2(c). In web cache systems [7], a Bloom filter is called a Summary Cache [8] or Cache Digest [9]. In this paper, the Bloom filter of the ExPong is called a PeerDigest. The PeerDigest of Peer i is represented as $digest_i$.
We restrict the shared contents to document files having extensions such as .html, .ps, and .pdf. Shared .pdf or .ps files are transformed into text files by pdftotext or ps2text. Keywords are extracted from these text files using text mining techniques [10]. It is considered that $digest_i$ shows the preference of Peer i because these keywords are embedded into the PeerDigest.
PeerDigest Cache and Pong Proxy In Gnutella protocol v0.6 [1], a hierarchical structure is defined as UltraPeer and LeafPeer. The Gnutella backbone is dominated by connections of UltraPeers because a LeafPeer connects only to UltraPeers. An UltraPeer serves as a proxy or server like Clip2 Reflector or Napster, thereby reducing redundant Pings and Queries.
In this paper, we propose a Pong Proxy instead of a hierarchical structure to reduce Ping-Pongs. Each peer has two PeerDigest Caches: $cache_l$ ($l = 1, 2$). Pairs ($address_j$, $digest_j$) are cached into $cache_l$, where $address_j$ is $ip_j$:$port_j$ of Peer j. The first and second neighbors of Peer i are represented as $nbr_{i,l}$ ($l = 1, 2$). The index l of $cache_l$ and $nbr_{i,l}$ indicates the degree of neighbors. If Peer j receives a Ping from Peer i, Peer j responds with an ExPong including $digest_j$. Moreover, Peer j responds with ExPongs including $digest_k$ ($k \in nbr_{j,1}$). Peer i caches $digest_j$ and the $digest_k$s into $cache_{i,1}$ and $cache_{i,2}$, respectively. In this manner, Peer j plays the role of Pong Proxy.
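The Ping-ExPong exchange with a Pong Proxy can be sketched as follows. This is an illustrative model only: the `Peer` class, its method names, and the use of a Python integer as the digest are assumptions, and skipping the requester's own address (and addresses already in the first cache) when filling the second cache is a choice made for this example rather than something the protocol description specifies.

```python
class Peer:
    def __init__(self, address, digest):
        self.address = address   # "ip:port" string
        self.digest = digest     # own PeerDigest (an int used as a bit array)
        self.cache1 = {}         # first neighbors:  address -> digest
        self.cache2 = {}         # second neighbors: address -> digest

    def handle_ping(self):
        """Reply as a Pong Proxy: own ExPong plus one ExPong per first neighbor."""
        own = (self.address, self.digest)
        proxied = list(self.cache1.items())
        return own, proxied

    def handle_expongs(self, own, proxied):
        """Cache the responder in cache1 and its neighbors in cache2."""
        address, digest = own
        self.cache1[address] = digest
        for addr, dig in proxied:
            # Assumption: ignore ourselves and peers we already know directly.
            if addr != self.address and addr not in self.cache1:
                self.cache2[addr] = dig

peer_i = Peer("10.0.0.1:6346", 0b0011)
peer_j = Peer("10.0.0.2:6346", 0b0110)
peer_j.cache1 = {"10.0.0.3:6346": 0b1100, "10.0.0.1:6346": 0b0011}
peer_i.handle_expongs(*peer_j.handle_ping())
```

After the exchange, Peer i knows the digest of its first neighbor j and of j's other neighbor without having pinged that second neighbor itself.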
If each peer sends Pings to all first neighbor peers, as in Gnutella v0.4,
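[Figure 3: Peer A sends the Query q to Peer B, whose neighbors are Peers C, D, E, and F. At Peer B: check(idx_C, q) = 1, check(idx_E, q) = 0, check(idx_F, q) = 1, check(idx_D, q) = 1. Arrows are labeled q, qh, and qe.]
Fig. 3. QRP with Firework and Backward Learning. Peer A sends the Query to Peer B. Query routing is decided by $check(digest_j, q)$ ($j \in nbr_{B,1}$) at Peer B. q, qh, and qe show the Query, QueryHit, and QueryError, respectively.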
size is usually very large. We also improve the Ping-ExPong protocol in the following respect. Each peer sends Pings to MaxCon peers that are selected randomly from $cache_1$. The peer responds with MaxCon ExPongs that are selected randomly from $cache_1$. Hence, the upper bound for the total size of ExPongs that a peer will receive is given as
$$ \text{upper bound} = MaxCon\,(1 + MaxCon) \times Size_{ExPong}. \qquad (2) $$
It is fair to add that the Gnutella peer also sends and forwards Pings to MaxCon peers in the simulations of Section 4.
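As a worked instance of Eq. (2), the sketch below plugs in the values used in the simulations of Section 4 (MaxCon = 4 and an ExPong size of 32027 bytes from Table 1). The function name is ours.

```python
def expong_upper_bound(max_con, expong_size):
    """Eq. (2): each of MaxCon pinged peers returns its own ExPong
    plus MaxCon proxied ExPongs."""
    return max_con * (1 + max_con) * expong_size

bound_bytes = expong_upper_bound(max_con=4, expong_size=32027)
bound_mbytes = bound_bytes / 1_000_000   # roughly 0.64 Mbytes
```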
QRP with Firework Thanks to the Pong Proxy, each peer has information about the contents shared by its neighbors. We use the PeerDigests for query routing.
In Fig. 3, Peer A sends a Query q, including search keywords, to Peer B. If Peer B has no content for the Query, Peer B must forward the Query. Routing of the Query is implemented as follows: Peer B checks whether the keywords of q are in $digest_j$ ($j \in nbr_{B,1}$) or not. This check is given as
$$ check(digest_j, q) = \{true, false\}. \qquad (3) $$
If the keywords exist in $digest_j$ ($j \in nbr_{B,1}$), Peer B forwards the Query to Peer j. In Fig. 3, Peer B forwards the Query to Peers C, D, and F. If $check(digest_j, q)$ is false for all first neighbors, the Query is forwarded to a Peer j ($\in nbr_1$), which is selected randomly. This process is equivalent to a Random Walk Search.
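The routing rule around Eq. (3) can be sketched as below. The hash construction (salted SHA-1, K = 4, a 65536-bit digest) is a stand-in chosen for this example and is not the hashing used by TellaGate; the function names and keywords are hypothetical. The peer names mirror Fig. 3.

```python
import hashlib
import random

def bit_positions(keyword, m, k=4):
    """K bit indices for a keyword (stand-in hash: salted SHA-1)."""
    return [int.from_bytes(hashlib.sha1(f"{i}:{keyword}".encode()).digest()[:4], "big") % m
            for i in range(k)]

def make_digest(keywords, m):
    digest = 0                       # Python int used as an m-bit array
    for kw in keywords:
        for pos in bit_positions(kw, m):
            digest |= 1 << pos
    return digest

def check(digest, query_keywords, m):
    """Eq. (3): true if every bit of every query keyword is set in the digest."""
    return all((digest >> pos) & 1
               for kw in query_keywords for pos in bit_positions(kw, m))

def route(neighbor_digests, query_keywords, m, rng=random):
    """Forward to all matching first neighbors, else to one random neighbor."""
    targets = [peer for peer, dig in neighbor_digests.items()
               if check(dig, query_keywords, m)]
    if targets:
        return targets                               # replicate the Query
    return [rng.choice(sorted(neighbor_digests))]    # random-walk fallback

M = 1 << 16
neighbors_of_b = {
    "C": make_digest(["jazz"], M),
    "D": make_digest(["jazz", "blues"], M),
    "E": make_digest(["soccer"], M),
    "F": make_digest(["jazz"], M),
}
forward_to = route(neighbors_of_b, ["jazz"], M)
```

Because a digest is a Bloom filter, `check` can return a false positive; that is the case the QueryError handling is meant to correct.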
F. If Peer F has no content for the Query and is in Dead Lock, Peer F returns a QueryError qe to Peer B. According to the QueryError, Peer B modifies $digest_F$ as
$$ digest_F^{(m)} = 0, \quad m = h_k(q) \ (k = 1, \ldots, K), $$
where $digest^{(m)}$ is the m-th bit of the digest and h is a hash function. On the other hand, if Peer C has the content sought by the Query, Peer C replies to Peer B with a QueryHit qh. The QueryHit qh is sent back to Peer A by way of Peer B. According to the QueryHit, each intermediate peer and the originator routed by the random walk strategy modifies the digest as
$$ digest_j^{(m)} = 1, \quad m = h_k(q) \ (k = 1, \ldots, K), $$
where j ($\in nbr_{i,1}$) shows the next or last intermediate peer.
3.3 Community Self-Organization Algorithm
This subsection presents the Community Self-Organization Algorithm (CSOA). The above section mentioned that the PeerDigest shows the respective preferences of peers. Therefore, we can use the PeerDigest to self-organize communities based on the respective users' preferences.
We must define the similarity of the preference between Peer i and Peer j, $sim_{ij}$. That similarity is defined as
$$ sim_{ij} = count(digest_i \wedge digest_j), \qquad (4) $$
where $\wedge$ is an AND operator and count() is a function that counts the bits set to 1.
Self-organization is implemented as follows. Firstly, Peer i calculates the similarity $sim_{ij}$ between itself and the second neighboring Peers j ($\in nbr_{i,2}$). Secondly, Peer i selects the Peer j that has the largest value of $sim_{ij}$. Thirdly, if the number of current connections $Con_i$ is greater than the maximum connections $MaxCon_i$, then Peer i randomly selects a Peer k ($\in nbr_{i,1}$) to disconnect. Finally, Peer i disconnects Peer k and connects to Peer j. The pseudocode of this self-organization algorithm is shown in Fig. 4. This algorithm is very simple and requires only local information, that is, the PeerDigests of the first and second neighbor peers. Moreover, this local information is obtained by local interactions: Ping-ExPong and Pong Proxy.
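One CSOA step can be sketched in a few lines, assuming digests are Python integers and the two PeerDigest caches are dictionaries from address to digest. The function names are ours, and excluding the newly connected peer from the random disconnection is an assumption of this sketch; the description above only says that Peer k is chosen at random from the first neighbors.

```python
import random

def similarity(digest_i, digest_j):
    """Eq. (4): number of 1 bits in the bitwise AND of two PeerDigests."""
    return bin(digest_i & digest_j).count("1")

def csoa_step(own_digest, cache1, cache2, max_con, rng=random):
    """Connect to the most similar second neighbor; drop a random first
    neighbor if the connection limit is exceeded. Returns (joined, dropped)."""
    if not cache2:
        return None, None
    best = max(cache2, key=lambda addr: similarity(own_digest, cache2[addr]))
    cache1[best] = cache2.pop(best)          # connect to Peer j
    dropped = None
    if len(cache1) > max_con:                # Con_i > MaxCon_i
        # Assumption: never drop the peer we have just connected to.
        dropped = rng.choice([addr for addr in cache1 if addr != best])
        del cache1[dropped]                  # disconnect Peer k
    return best, dropped

own = 0b111100
first = {"a": 0b000011}
second = {"b": 0b110000, "c": 0b111000, "d": 0b000001}
joined, dropped = csoa_step(own, first, second, max_con=1)
```

In this toy run, peer "c" shares three bits with the local digest, so it replaces the dissimilar first neighbor "a".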
4 Simulations
Simulating complex systems such as Gnutella is not an easy task. We provided
calculate sim_ij for all Peers j ∈ nbr_{i,2}
nbr_{i,2} = sort nbr_{i,2} according to sim_ij
Peer j = top of nbr_{i,2}
connect to Peer j
add Peer j to nbr_{i,1} and cache_{i,1}
remove Peer j from nbr_{i,2} and cache_{i,2}
if (Con_i > MaxCon_i)
    Peer k = select from nbr_{i,1} at random
    disconnect Peer k
    remove Peer k from nbr_{i,1} and cache_{i,1}
Fig. 4. Pseudocode of the proposed Community Self-Organization Algorithm. In the following simulations of Section 4, MaxCon = 4.
- Initial Topology
  The initial topology of the network is given as a random network, as shown in Fig. 10(a). The network comprises 500 peers. Each peer has four random outgoing links; therefore $MaxCon_i = 4$. The number of peers is constant in the following simulations. We also simulated cases wherein the network is growing, and obtained identical results.
- Community
  All peers are classified into five communities. Each peer is assigned randomly to one of these communities. There are no explicit community structures in the initial network topology, as shown in Fig. 10(a).
- Contents and Keywords
  Each keyword is an integer value for simplicity [2]. The number of keywords, N, is 500. The keywords are divided into five classes: $KC_i$ ($i = 0, \ldots, 4$). Class $KC_i$ comprises its 100 inherent keywords and 20 keywords that belong to the class $KC_{i+1}$ ($KC_5 = KC_0$). The content associated with each keyword is an integer equal to the keyword.
- PeerDigest
  The hash functions are built by first calculating the MD5 signature of the keyword string, which yields 128 bits, then dividing the 128 bits into four 32-bit unsigned integer values, and finally taking the modulus of each 32-bit unsigned integer value by the PeerDigest size M [8]. MD5 is a cryptographic message digest algorithm that hashes arbitrary-length strings to 128 bits. The bits per element M/N is set to four in the following simulations; therefore, with these four hash functions (K = 4), the theoretical false positive rate from Eq. (1) is 0.160.
- Local Storage
  In real P2P systems, most users have no shared local storage; they are called
  $$ \text{local storage capacity factor} = \frac{LSC}{NC}, \qquad (5) $$
  where LSC and NC are the local storage capacity and the number of contents assigned to a community, respectively. If this factor is at least 1, each peer can store all contents shared by its own community on the local storage. If the factor is less than 1, LRU cache replacement is used for stored content.
- Uploader and Downloader
  All local storage locations are initially empty. Contents are provided by specific peers called Upload-Only Members (UOMs). The remaining peers are Download-Only Members (DOMs). We assumed that the number of UOMs is 10% of all peers. In Fig. 10(a), UOM and DOM peers are shown as big and small circles, respectively.
- Message Size and Buffer
  The TTLs of Ping and Query are set to five. We define the size of each message as shown in Table 1. According to the above assumption, $MaxCon_i = 4$, the upper bound of Eq. (2) is 0.64 Mbytes. The message buffer, that is, the number of messages that Peer i can process in a single time step, is represented as $mb_i$. Each peer has the same size mb. If the number of messages received is greater than the size of mb, the overflowed messages are dropped.
  Message Type    Message Size [byte]
  Ping            22
  Pong            35
  ExPong          32027
  Query           50
  QueryHit        100
  QueryError      50
  Table 1. Message size.
- Time Step
  Each peer generates a Query every 10 time steps and a Ping every 1000 time steps. CSOA is applied every 1000 time steps.
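The PeerDigest construction described in the list above (MD5 signature split into four 32-bit values, each taken modulo M) can be sketched as follows. The big-endian interpretation of the four words, the decimal-string encoding of integer keywords, and all function names are assumptions of this example. With N = 500 and M/N = 4, the digest has M = 2000 bits.

```python
import hashlib

N_KEYWORDS = 500                 # N in the simulations
DIGEST_BITS = 4 * N_KEYWORDS     # M, from M/N = 4

def hash_positions(keyword, m=DIGEST_BITS):
    """Return four bit indices derived from the MD5 signature of the keyword."""
    signature = hashlib.md5(str(keyword).encode("ascii")).digest()  # 128 bits
    # Split into four 32-bit unsigned integers; big-endian is an assumption.
    words = [int.from_bytes(signature[i:i + 4], "big") for i in range(0, 16, 4)]
    return [w % m for w in words]

def build_peer_digest(keywords, m=DIGEST_BITS):
    digest = 0                   # Python int used as an M-bit array
    for keyword in keywords:
        for pos in hash_positions(keyword, m):
            digest |= 1 << pos
    return digest

def may_contain(digest, keyword, m=DIGEST_BITS):
    """Bloom filter membership test: no false negatives, some false positives."""
    return all((digest >> pos) & 1 for pos in hash_positions(keyword, m))

# Embed all 500 keywords, then probe 5000 integers that were never inserted.
digest = build_peer_digest(range(N_KEYWORDS))
non_members = range(N_KEYWORDS, N_KEYWORDS + 5000)
observed_rate = sum(may_contain(digest, k) for k in non_members) / 5000
```

The value of `observed_rate` can be compared with the theoretical false positive rate of 0.160; it is an empirical estimate and will not match exactly.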
4.1 Performance
We compared the performance of TellaGate with Gnutella protocol v0.4. Respective performances were evaluated by the success rate of search, the network load, the average HOP of successful Queries, and the average number of QueryHits. We ran several simulations under various parameter values, that is, the local storage capacity factor (greater than 0 and at most 1.0) and the message buffer mb ($10 \le mb \le 100$). Due to space limitations, this paper describes the results of
[Figure 5: success rate (0 to 1) versus time steps (0 to 30000) for Gnutella and TellaGate. Panels: (a) local storage capacity factor 1.0, mb = 100; (b) local storage capacity factor 0.25, mb = 100; (c) local storage capacity factor 0.25, mb = 20.]
Fig. 5. Performance: success rate.
[Figure 6: messages in Gbytes (0 to 5) versus time steps (0 to 30000) for Gnutella and TellaGate. Panels: (a) local storage capacity factor 1.0, mb = 100; (b) local storage capacity factor 0.25, mb = 100; (c) local storage capacity factor 0.25, mb = 20.]
Fig. 6. Performance: network load.
Results represent an average of 10 simulations; they are shown in Figs. 5-8. In those figures, the results of Gnutella v0.4 and TellaGate are shown by the broken line with open squares and the solid line with filled squares, respectively.
The success rates converge to 1.0 for Gnutella v0.4 and TellaGate with a local storage capacity factor of 1.0 and mb = 100, as shown in Fig. 5(a). Moreover, the average HOP converges to 0, as shown in Fig. 7(a), because each peer has local storage of sufficient capacity to store the contents and a message buffer that is sufficient to receive the messages. However, Gnutella v0.4 induces a flood of Ping-Pongs and Queries in the early time steps, as shown in Fig. 6(a). The network load of Gnutella v0.4 converges to a lower value after the peak. When the size of the local storage is limited, the network load of Gnutella v0.4 converges to a higher value, as shown in Fig. 6(b). When the size of the message buffer is also limited, many Queries are dropped; the average number of QueryHits decreases and the success rate converges to the lower value of 0.68, as shown in Fig. 8(c) and Fig. 5(c).
TellaGate does not engender a rapid increase of the network load. The network loads of TellaGate converge to a lower value, as shown in Fig. 6. In Fig. 5, even though the sizes of the local storage and message buffer are limited, the success rate of TellaGate converges to 1.0.
We also simulated TellaGate without the CSOA to clarify the reason why the self-organized community network retains the high success rate value: the
[Figure 7: average HOP (0 to 5) versus time steps (0 to 30000) for Gnutella and TellaGate. Panels: (a) local storage capacity factor 1.0, mb = 100; (b) local storage capacity factor 0.25, mb = 100; (c) local storage capacity factor 0.25, mb = 20.]
Fig. 7. Performance: average value of HOP.
[Figure 8: average number of QueryHit (0 to 45) versus time steps (0 to 30000) for Gnutella and TellaGate. Panels: (a) local storage capacity factor 1.0, mb = 100; (b) local storage capacity factor 0.25, mb = 100; (c) local storage capacity factor 0.25, mb = 20.]
Fig. 8. Performance: average number of QueryHit.
of Queries is implemented by QRP with Firework. The results for the success rate, network load, and average number of QueryHits are shown in Fig. 9. These three values of TellaGate without the CSOA are lower than the values of the original TellaGate.
According to these results, we inferred the following answer to the question above. For TellaGate without the CSOA, the probability that the first neighbors have the same preference is low; thereby, the routing strategy is equivalent to a Random Walk Search. If the local storage capacity factor is at least 1, the search strategy is improved gradually to QRP with Firework by backward learning based on QueryError and QueryHit. However, if the factor is less than 1, as the contents on the local storage are replaced by the LRU replacement policy, the PeerDigest will be inconsistent with the current contents shared by the distant peers. Many Queries are dropped because of the TTL limitation.
On the other hand, in the self-organized community network shown in Fig. 10(b), each peer has several first neighbors that have identical preferences. As QRP with Firework replicates and forwards a Query based on Eq. (3), the probability that the Query is replicated is higher than in the random network. Hence, the probability of locating contents will be high. This effect, which is induced by the community structure and the replication of Queries, is called the Com-
[Figure 9: TellaGate with CSOA versus without CSOA over time steps (0 to 30000). Panels: (a) success rate (0 to 1); (b) network load, messages in Gbytes (0 to 1); (c) average number of QueryHit (0 to 15).]
Fig. 9. The effect of community structure: CSOA is the Community Self-Organization Algorithm.
5 Conclusion
This paper proposed the Self-Organizable P2P Document Search Engine and a new P2P protocol: the TellaGate protocol. Simulation results show that the self-organized community network maintains a high query hit rate without overflow. However, the above simulations did not consider that peers often go offline. Therefore, future work will address simulations under such dynamic conditions.
The proposed Community Self-Organization Algorithm is based on local interactions. Recently, Davidsen and co-workers proposed a model of acquaintance networks [19]. In their model, the network is self-organized from an initial random network to a Small World network [17][18] characterized as having: 1) short path length, 2) high clustering [17], and 3) scale-free or exponential link distributions [18]. The CSOA proposed in this paper and Davidsen's algorithm are based on similar local interactions; indeed, the Self-Organized Community Network has a short path length ($L \simeq 3.0$) and a high clustering ($C \simeq 0.14$) property. However, in Davidsen's model, not every node has an internal state such as a PeerDigest. How do Small World properties influence the self-organization processes? Future work will address this question.
References
1. RFC-Gnutella. http://rfc-gnutella.sourceforge.net/
2. I. Clarke, O. Sandberg, B. Wiley, and T.W. Hong: Freenet: A distributed anonymous information storage and retrieval system. Proc. of the ICSI Workshop on Design Issues in Anonymity and Unobservability (2000).
3. Stanley Milgram: The Small-World Problem. Psychology Today 1 (1967) 60–67.
4. B. Bloom: Space/time trade-offs in hash coding with allowable errors. Communications of the ACM 13(7) (1970) 422–426.
5. M. Mitzenmacher: Compressed Bloom Filters. Proc. of the 20th ACM Symposium on Principles of Distributed Computing (2001).
6. Christopher Rohrs: Query Routing for the Gnutella Network. http://rfc-gnutella.sourceforge.net.
Fig. 10. Network topology (a) initial and (b) after self-organization (30 CSOAs were applied). The network topology is illustrated using the spring embedded model [12]. All peers are classified into five communities. The gray scale of each circle represents its community. Big and small circles represent UOMs and DOMs, respectively. In this figure, the number of UOMs is 50 peers and the number of DOMs is 450 peers.
8. L. Fan, P. Cao, J. Almeida, and A. Broder: Summary cache: A scalable wide-area Web cache sharing protocol. IEEE/ACM Transactions on Networking 8(3) (2000) 281–293.
9. A. Rousskov and D. Wessels: Cache digests. Computer Networks and ISDN Systems 30(22-23) (1998) 2155–2168.
10. Y. Matsuo and M. Ishizuka: Keyword Extraction from a Document using Word Co-occurrence Statistical Information (in Japanese). Trans. of the Japanese Society for Artificial Intelligence 17(3) (2002) 217–223.
11. E. Adar and B.A. Huberman: Free Riding on Gnutella. First Monday 5(10) (2000).
12. Graphviz. http://www.research.att.com/sw/tools/graphviz.
13. B. Zhao, J. Kubiatowicz, and A. Joseph: Tapestry: An Infrastructure for Wide-area Fault-tolerant Location and Routing. University of California, Berkeley, Computer Science Division, Technical Report UCB/CSD-01-1141 (2001).
14. A. Rowstron and P. Druschel: Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems. Proc. of the 18th IFIP/ACM International Conference on Distributed Systems Platforms (2001).
15. I. Stoica, R. Morris, D. Karger, M.F. Kaashoek, and H. Balakrishnan: Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. Proc. of the ACM SIGCOMM Conference (2001).
16. S. Ratnasamy, P. Francis, M. Handley, and R. Karp: A Scalable Content-Addressable Network. Proc. of the ACM SIGCOMM Conference (2001).
17. D.J. Watts: Small Worlds. Princeton Univ. Press (1999).
18. R. Albert, H. Jeong, and A.-L. Barabási: Error and attack tolerance of complex networks. Nature 406 (2000) 378–381.
19. Jörn Davidsen, Holger Ebel, and Stefan Bornholdt: Emergence of a Small World from Local Interactions: Modeling Acquaintance Networks. Physical Review Letters 88(12), 128701 (2002).