JITeR: Just-in-time application-layer routing

(1)

ContentslistsavailableatScienceDirect

Computer Networks

journalhomepage:www.elsevier.com/locate/comnet

JITeR: Just-in-time application-layer routing

Alysson Bessani

^a

, Nuno F. Neves

^a

, Paulo Veríssimo

^b

, Wagner Dantas

^c

, Alexandre Fonseca

^d

, Rui Silva

^d

, Pedro Luz

^d

, Miguel Correia

^d^,^∗

aLaSIGE/FCUL, Portugal

bUni.Lu, Luxembourg

cUFSC, Brazil

dINESC-ID/IST, Portugal

a rt i c l e i nf o

Article history:

Received 13 October 2015 Revised 4 March 2016 Accepted 12 May 2016 Available online 13 May 2016 Keywords:

Overlay networks Multihoming Timely communication Critical

Information infrastructures

a b s t ra c t

Thepaperaddressestheproblemofprovidingmessagelatencyandreliabilityassurancesforcontroltraf- ﬁcinwide-areaIPnetworks.Thisisanimportantproblemforcloudservicesandothergeo-distributed informationinfrastructuresthatentailinter-datacenterreal-timecommunication.Wepresentthedesign and validationofJITeR (Just-In-TimeRouting),analgorithmthattimelyroutesmessagesatapplication- layerusingoverlaynetworkingandmultihoming,leveragingthenaturalredundancyofwide-areaIPnet- works.WeimplementedaprototypeofJITeRthatweevaluatedexperimentallybyplacingnodesinsev- eralregionsofAmazonEC2.Wealsopresentascenario-based(geo-distributedutilitynetwork)evalua- tioncomparingJITeRwithalternativeoverlay/multihomingroutingalgorithmsthatshowsthatitprovides bettertimelinessandreliabilityguarantees.

1. Introduction

An increasing number of applications over wide-area IP networks exhibit timeliness requirements. Many of them can be served by protocols that provide those guarantees most of the time,forexamplebyguaranteeingagivenaveragethroughputand accepting occasional violations of deadlines (e.g., video streaming).However, other applications exhibit more stringent requirements,i.e.,theneed thatsome oftheir messagesmeetindividual deadlinesin thepresenceoffaults likecongestionandomissions.

Whilstsolutionsexisttotheproblemwithinover-provisioneddat- acenternetworks,weknowofnosolutionforgenericwide-areaIP networks.

Thispaperaddressestheproblemofprovidinglatencyandreli- abilityassurancesforcontroltraﬃc – notforall traﬃc– inwide- area IP networks. Twoexamples show therelevance of thework forwhatwedesignatebygeo-distributedinformationinfrastructures (GDII).

First,cloudserviceswithsoftreal-timerequirementsoftenspan multiple datacenters in different geographical locations, imply- ing deadline propagationamongst them,hampered by the wide- areaIPnetworkinterconnects[58].Anexamplearecloud services that need to exchange control traﬃc such as Google’s Megastore

∗ Corresponding author.

E-mail address: miguel.p.correia@tecnico.ulisboa.pt (M. Correia).

coordination serviceaccess[8]andAmazon’s Dynamo failure de- tectionprotocol[17].

The second example is the context of critical information in- frastructures,suchaspowergeneration,transportanddistribution.

Such cyber-physicalsystemsare spreadoverlarge geographical ar- eas and controlled remotely from command and control centers usingSCADA/PCS¹ systems,overcommunicationnetworksusually basedonwide-areaIPnetworks[10,20,31,37].DespitetheuseofIP, the timelinessof critical remote commandsis essential to main- tain the integrity of the infrastructure (e.g., to avoid power out- ages). Severalexamples ofsuch applicationsandcommandswere describedinprojectCRUTIAL[23].

We present the design and validation of a novel algorithm for Just-In-Time Routing, JITeR (pronounced “jitter”),which routes deadline constrained messages –control messages– atapplication layer, using an overlay network createdon top ofa multihomed communicationinfrastructure,leveraging thenaturalcommunica- tionredundancythat exists ingeo-distributed informationinfras- tructures [37].JITeR uses a setofnodes locatedin differentsites of the GDII to route messages among them,instead of following the routesimposed by the network-level routing. For instance,if the network-level routing makes a message sent by node ra to noder_bpassthroughtheautonomoussystemAS1,ramaysendthe

1Supervisory Control and Data Acquisition (SCADA) and Process Control Systems (PCS).

(2)

messagetorcandthisnodesendittor_b,lettingthemessagepass alternativelythroughAS2andAS3.

Ourscheme isbased ontwo important assumptions.The ﬁrst isthat amongsta collectionofalternative overlayroutesbetween twosites,therewillbeasubsetwhichwillbefastenoughtoper- formreliableandtimely communicationinthepresenceoffaults.

The secondis thatit isusedtosend onlycontrol trafficandthat thattrafficconsumesnegligibleresources(e.g.,bandwidth)incom- parisontotherestofthetraffic.

The key objective of our work is to devise a practical and non-intrusive solution to achieve timely andreliable communication withhigh probability in current GDIIs, takingthree require- mentsintoconsideration:(1)CompatibilitywithcurrentGDIIs:JITeR should allowseamlessintegrationwithcurrentGDIIs,withoutre- quiringmajor changestothe operationandorganizationofexist- ing networks; (2) No wide-area IP network changes: the solution should not requireany special support fromthe underlying network (e.g., resource reservation). Timeliness should be obtained on topof best-effort communicationchannelssuch asthose provided by IP-based networks, and therefore, JITeR cannot ensure strict hard real-time properties (e.g.,like insmall-scale real-time networks);and(3)Costconsciousness:JITeRshoulduseredundancy parsimoniously,avoidingexpensivesolutionsliketraﬃcﬂooding.

Unlikeprevious worksonoverlay networks, whichaimto improve reliability by detecting anddeviating communication from faulty and congested paths, JITeR aims to provide soft real-time communicationbysecuringindividual messagedeliverydeadlines with high probability. To achieve this goal, JITeR uses temporal andspatialredundancyjudiciously. Inanutshell,eachmessageis sentthroughseveraloverlaychannels,possiblyfromdifferentser- vice providers: one base channel plus a few backup channels.In thepresenceofdelaysoromissions,themessagemayberetrans- mitted. Anovel channelscheduling policy inthe JITeRalgorithm, which we call just-in-time, selects the baseand backupchannels not to be the fastest, butthe ones which match each message’s deadlineneeds(somefaster,someslower).Thereadermaywonder thatitisimpossibletoguaranteereal-timebehavioronbest-effort IP networks.As amatter offact,notevenhard real-timesystems have100%coverage:theyhavetoachievesuﬃcientcoverage[53].

WeimplementedJITeRandevaluateditexperimentallybyplac- ing nodes in several regions of the Amazon AWS cloud offering (in theEC2service).Moreover,we evaluatedthestrategy bysim- ulation of scenarios over a realistic model of a wide-area utility network (the Italian electric power infrastructure) with both accidental andmaliciousfaults (DDoSattacks).Its effectiveness and costswerecomparedwithseveralotheroverlay/multihomingrout- ing algorithmsin theliterature.The evaluationshowedthat JITeR providesbettertimelinessandreliabilityguaranteesthantheother schemes, very close to those of a ﬂooding strategy but sending muchfewermessages.

Thepaperprovidesthefollowingcontributions:

1. JITeR,the ﬁrst(to the bestofour knowledge)wide-areaIP overlaynetwork algorithmandarchitecture withtheobjec- tiveofprovidingmessagelatencyandreliabilityguarantees;

2. Acomparativeanalysisofseveraloverlaynetworksproposed inthe literature, showingthat JITeRprovides better timeliness;

3. The description and modelling of a representative critical information infrastructure, the Italian power system utility network,whichwebelievetobeofuseasabenchmarkfor futurestudies.

4. An evaluation of JITeR and other techniques in providing timely communication between Amazon EC2 regions. This analysisalsoshowsthecommunicationlatencyandpathdi- versitybetweentheEC2availabilityzones.

2. Relatedwork

Thereis a vastbibliography on thetopicof thepaper,so this section isnecessarily a summary.It presentswork along thefol- lowingaxes: thepropertieswe wantto provide(QoS,timeliness);

existingchallenges(networkfailures),thetechniquesweuse(overlay networks, multihoming); the scenarios we consider (clouds, criticalinformationinfrastructures).

QoSinmultimedianetworks. Weassumethatthenetworkprovides only a best-effort service, with no latency and bandwidth guarantees.It ispossibletohavetheseguarantees, e.g.,by usingDiff- Serv[38]orATM[16].However,theseservicesarenotprovidedby the generality ofInternet serviceproviders (ISPs), especially ina geo-distributedcontext,whichwouldconstraintheapplicabilityof oursolution.Itisalsopossibletouseleasedlines,whicharerather expensive. On the other hand,many ISPs provide phone and TV overIP,whichhavetimelinessrequirements.Theseprovidersusu- ally employ fast convergence mechanisms for sub-second recov- eryfrom link androuter failures: bidirectionalforwarding detec- tion[33],statefulswitchover[12],andfasthellopackets[11].Nev- ertheless,deadlinesarefrequentlymissed,causingimagefreezesof severalseconds.Weproposeasolutionthatdoesnotrequiresuch guarantees fromthe network,only plainInternet-like IP network service.

Peer-to-peer networks have been proposed as a solution to streammultimediaoverbest-effort networkssuchastheInternet.

CoopNet is one of the ﬁrst of thisline of research [39]. It focus on live streaming and leverages the notion of multiple descrip- tioncoding,i.e.,ofencodingaudioandvideointoseparatestreams.

Zigzag provides scalable single-source media streaming using an application-layermulticasttree[51].Promiseisapeer-to-peerme- dia streaming system that supports multiple sendersand allows one recipient to receive media from several senders [27]. These systemsaimtoprovideQoSguarantees,butnotspeciﬁcallythede- liveryofmessagesbeforeacertain,possiblyshort,deadline.More- overthesesystemstoleratesomelevelofpacketloss,whereaswe areinterestedindeliveringallmessages.Ontheotherhand,these systemshandlemuchmoretraﬃc,astheyaretargetedatcontinu- ousmedia(audio,video).

Timelinessin networks. When the web started to be adopted for commercialpurposes,itbecameclearthattherearelimitsonthe timeusersarewillingtoacceptforrepliestoarrive,i.e.,thatthere aretimeliness(maximumlatency)requirementsforwebcommuni- cation.Contentdistributionnetworks(CDNs)likeAkamai[19]ap- pearedasasolutiontothisproblem[7,40].Theapproachconsists essentially in placing content geographically near its consumers, reducing the latency. This solution is unfeasible for the applications we envision asthey do not have content to distribute, but controlmessagesthathavetobesentoveradistancethatcannot be reduced.There are commonissues though,e.g., ensuring path diversity[7].

Recentlytherehasbeensomeworkontheproblemofensuring timelinessindatacenternetworks(e.g.,[52,57]). Thefundamental difference ofthese worksin relationto ours is that they require modiﬁcations of the network devices. Consequently, these solu- tionscannotbeeasilyadaptedtopublic/legacynetworks,likethe Internetandother WANs.However,they increase thesigniﬁcance ofourwork,sincetheymotivatetheneedforsolvingthedeadline propagationprobleminthegeo-distributedinter-datacenter communication,forapplications thatspanmultipledatacenters,with- outmodifyingthenetwork.

Network failuresin IP backbones. Network backbonefailures may adversely affect IP routing and delay or disrupt information in-

(3)

Table 1

List of the strategies evaluated. OV and MH means that an overlay or multihoming are used.

Strategies Technique Basic idea Reference

Best-Path OV Send message through the best overlay channel. In case of failure, retransmit message

at most 3 times using a timeout of 3 seconds using the best overlay channel. RON [4]

JITeR⁰ OV/MH JITeRwithout backup channels. This work

JITeR¹ OV/MH JITeRwith 1 backup channel. This work

Flooding OV/MH Send message a single time using all available overlay channels. N/A Multi-Path OV Send message through 2 overlay channels: the direct channel and a randomly chosen

overlay channel (direct or not).

Mesh-routing [49] with variation in [6]

Hybrid OV Send message through one direct channel. If failure, send message via 4 randomly

chosen overlay channels (direct or not). SOSR [25]

Round-Robin MH Send message in a circular fashion alternating among all existent access links.

Retransmit 3 times at most using a timeout of 3 seconds.

N/A Primary-Backup MH Send message always through one speciﬁc access link until it fails. In case of failure

and if there are redundant links, pick another one. Retransmission scheme as RR.

Used in the Italian backbone

frastructure communication. Studies on the impact of failures in IP backbones have shown that they occur daily, beingoften the resultof problems either at or under the IP level [35]. Some of theseproblemscanleadtonetworkinstabilityperiodsanddisrupt applications [30]. Other failures come from interference of mis- conﬁguredorobsoleterouting protocolsrunningincustomer net- worksconnectedto theISPbackbone[55].Evenstablecorerout- ing infrastructures, BGP-based, are prone to failures [46,48]. Ma- licious faults can also happen as a result of acts of hacktivism, cyber-crimeorcyber-terrorism[56],such asdistributeddenial-of- service(DDoS) attacks,inwhichthe attacker(s)usea largenum- berof computers(bots orzombies) to generate traﬃc andcause congestion[29,36].Depending on thecapabilities ofthe attacker, the rate of messages delayed and lost can be alarmingly high.

Research on this matter is vast and many ways of countering thoseattackshaveappeared[44],butaﬁnal solutionisstilltobe found.

Overlay routing. Nodes of an overlay network relay messages through thevirtual paths among them,according to application- level criteria. This ﬁts quite well with applications with spe- ciﬁcrequirements,hard tosatisfy by normalwide-areanetworks.

Therefore,over the years application-aware overlay routing solu- tions have beenproposed, withvarious virtual channel selection schemes[2,4,6,49,50].However,tothebestofourknowledgenone oftheseworksaimstoprovidelatencyguarantees,whichistheob- jectiveofourwork.Themostcommonobjectiveofoverlayrouting algorithmsistodeviatetraﬃcfromchannelsthatarefaultyorcon- gested.Forinstance,RONmonitorsthenetworktodecidetoroute messages directly or through an overlay channel [4]. The works nearestto ours havetheobjective ofimprovingend-to-end communication latency, not of attaining individual message delivery deadlines[2,49].Spines,usesadenseoverlaynetworkwithseveral overlaynodesper-channel,recoveringmissingpacketsinaper-hop basis[2].Thisrecoveryisattemptedonlyonce,sinceSpinestargets videotransmission,inwhichmissingpacketsareundesirable, but acceptableto a certainlevel. Mesh-routing usesXML routersand theDiversity ControlProtocolformulticastingdata[49].Although timelinessappearstobearequisite,itisnotclearwhetherorhow itwouldbeachievedinstringentscenarioswith,forexample,per- sistentpacketlossescausedbylong-termcongestions[6].OverQoS exploresthecontrolled lossvirtuallinkabstractiontoprovidesta- tisticallossandbandwidthguarantees[50].Hanetal.introduced theidea of topology-awareoverlay routing to improve thediver- sityof paths when detouringtraﬃc to escape congestionorfail- ures[26]. However, the topology may change dueto changes at network(IP)layerrouting,sowedonotcreateanoverlaynetwork basedonthetopology,butinsteadselectoverlaychannelsdynam- icallytakingdiversityintoaccount.

Table 1shows a comparison of RON, Mesh-routing and other overlayroutingstrategieswithJITeR.

Multihoming. Multihoming allowshoststoaccessaWAN through twoormoreredundantlinks,toresistnetworkfailureslikethose exempliﬁedaboveandimprovepropertiessuchasavailabilityand performance[9].Informationinfrastructurestakeholdersfrequently deployIP connectionscontractedwithmorethanoneISP[1].The workclosest toours inthesense ofexploitingoverlaysandmul- tihomingisMONET[5],butitsobjectiveistomaskfaultsandim- provetheavailability perceivedbywebclients.Onthecontraryto JITeR,MONETdoesnottakelatencyexplicitlyintoconsideration,as itdoesnotaimtomeetmessagedeadlines.

The initialideasofJITeR appearedin aworkshopposition pa- persomeyearsago[15].Thatpaperalsoexploredoverlaynetworks andmultihoming. Thepresentpaperthoroughlyimprovesonthat earlier version in several ways: the algorithm was improved, is now formalized andits properties discussed; we compare it analytically withother representative routing strategies; we imple- mented it andhave experimental results; the simulations isway more completeand realistic (e.g.,underlying network routing ef- fectsareaccountedfor,morestrategiesarecompared)andprovide moreresults.

CloudcommunicationandSDNs. Thepopularityofsoftwaredefined networks (SDNs) promises to increase the possibility to control the network fabric by applications. In particular, recently Google showedhowitmanagesitsdedicatedinter-datacenterbackbone(a WAN)usingSDNtechnologytoachieve impressivelevelsofband- widthutilizationwithoutsacrificingapplicationSLAs(includingla- tency)[32].Theirdesignisbasedonacentralizedtrafficengineer- ing algorithm that controls the network. Although not explicitly designedfortimeliness,thissolutioncansolveoratleastalleviate theneed foran applicationlevelsolution likeJITeR.However, we donotenvisionsolutionslikethatbeingusedforsystemsspanning multiple administrative domains (without centralized control) or forcriticalinformation infrastructures(GDIIs)that usually do not ownthecommunicationbackbone(e.g.,powergridoperators).

Critical information infrastructure communication. Ratatoskr [54] exploits several channels and retransmissions to provide communication timeliness in critical infrastructures, but on the contrary of JITeR it does not try to enforce deadlines explicitly andit isbased ona publish-subscribe middleware,GridStat [24]. Esposito etal. presenta broadcast protocolfor publish-subscribe systems [22].This protocol aims to support reliable communication among the nodes of an overlay network based on network coding and gossiping [21]. Although the paper mentions timeliness as a desirable property, the protocol neither considers the

(4)

Facility 2

Facility 1

Facility 3

Facility 4

LAN LAN

LAN

WAN

ISP_c

ISP_a ISP_b

ISPd JITeR Node

JITeR Node

Fig. 1. WAN-of-LANs model of an information infrastructure using JITeR.

existence of deadlines nor takes into account communication delays.Theprotocolissimilartoﬂoodingasitaimstodeliverthe messages toall nodes.Todaiis apeer-to-peer datadissemination scheme forlarge-scale complex criticalinfrastructures [10]. Simi- larly toJITeR,itleveragesan overlaynetworkinordertoimprove communication in GDIIs. However, its main goals are to provide reliability, scalability, andresilience(using semi-activereplication for this purpose), whereas JITeR’s main goals are timeliness and reliability,asitisfocusedondeliveringcontrolmessagesonly.

3. TheJITERalgorithm

This section presents JITeR, a channel selection scheme that leverages the available connection redundancy and diversity to providetimelyandreliabledeliveryofcriticalmessageswithhigh probability.

3.1. Designrationale

WAN-of-LANsstructure. Thedesign rationale ofJITeR isdriven by the architecture of modern GDIIs.GDIIs are geo-distributed over several facilities, followinga WAN-of-LANs model(see Fig. 1). In thismodel,dependingon thekindofinfrastructure (e.g.,utilities, cloudproviders),facilitiescanbeanyof:clouddatacenters,corpo- rateoﬃces,substations,SCADAcommandandcontrolcenters,etc.

Facilitiesgenerallyhavehighconnectivitylinks,theLAN-typepart, interconnected by apoint-to-point wide-area network,the WAN- type part. EachLAN (orset thereof) ofa facilityis logically con- nectedtotheWANthroughaJITeRnode,whichexecutestheover- lay andmultihomingchannel selectionalgorithm. Ifthecompany hastoofewfacilities,helperJITeRnodescanbeplacedsomewhere else,e.g.,incloudservices.JITeRnodes,however,areneitheraccess routersnorusedtosendall thefacilities’traffic,onlytime-critical controlmessages.Thattrafficisassumedtohavenegligibleimpact intermsofbandwidth.The JITeRarchitecturemakesnomodifica- tiontotheexistingWANnetwork,andpreserveslegacyfeaturesof internalsubnetworksandsystems,onlyrequiringtheintroduction oftheJITeRnodes.

Stable overlay network. Contrary to classical uses of overlay networks, e.g., in the context of peer-to-peer applications, whose

granularityisoftenatthelevelofindividualhosts,andwhosedy- namicsisquitehigh,overlaynetworkinginGDIIsisbestperformed attheinter-facilitylevel,i.e.,amongstJITeRnodes.Thesearepretty stableintimeaswell(itisunlikelytohaveGDIIfacilitiesjoinand leave the systemfrequently) and thiscomes forfree as a design principle which presents several advantages. In consequence, we assume each node knowsall the other nodesof theoverlay network. This allows aggressive re-routing and monitoring policies, fundamentalforprovidingthestrongtimelinesspropertiesdesired.

Finally,aJITeRnodecanbereplicatedforfaulttoleranceandscal- abilityreasons.

One-hopsourceoverlayrouting. JITeRnodesdefineanoverlaynet- work atop a general IP network, andrun the JITeR algorithm to selectoverlay channelsthat are expectedto providetimely communication.TheJITeRalgorithmisaone-hopsourceroutingscheme. Theoverlayrouteofeachmessageisdefinedatthesender(source routing),based on the local knowledge of the state ofthe links, andis composedofatmostone intermediaterelaying JITeRnode (one-hop).Theoptionofhavingasinglehop,i.e.,asingleinterme- diatenode,isduetoitssimplicityandtheconclusionofGummadi etal.[25]thatthereisnoconsiderablebenefitinusingmorehops.

Proactive monitoring. JITeR does monitoring proactively in order to to deal withnetwork changes. JITeR nodes may be connected to the WAN via multihoming, i.e., by two or more access links provided by distinct ISPs, allowing messages to be transmitted throughseveraloverlaychannels.Multihomingcanonlyguarantee fault toleranceeffectively ifthe ISPs’networksshare a minimum amountofresources. GDIIsnormallytryto ensurethislink inde- pendencein abest-effort andad-hocmanner whenselecting the ISPs.TheJITeRalgorithmusesrouteinspection mechanismstoas- sessthedegreeofindependenceamongtheoverlaychannels[13], andkeepsametricofthestatusandqualityofthelinksbetween pairsofJITeRnodes,measuredbythelatencyortransmissiontime² (TXT).Bothkindsofinformationareupdatedperiodicallyandused to guiderouting decisions. Infact there is a tradeoff involved:if

2This time includes not only the physical-level transmission delay at the sender, but also other delays such as transmission delays, propagation delays, queuing and processing delays at routers

(5)

theperiodicityishighthealgorithmadaptsslowlytothenetwork conditions;iflowmoremessagesaresent.

Deadline-aware multichannel transmission. Since IP offers only best-effort communication guarantees, JITeR has to use temporal andspatialredundancytoobtainthedesiredmessagelatencyand reliability assurances. Each message is transmitted using one or moretries, untileitheran ACKisreceived orthemessagerecep- tiondeadlineisreached.

Ineach try,themessageissentthroughone basechannelplus Bbackupchannels.Thesechannels areselected inawaythat im- provesthenumberofpossibleretransmissionsofthemessagebe- forethe deadline. Foreach message,JITeR startsby using a base channelthatdoesnotofferthebestTXTbutisfastenoughtostill permit messages to arrive in time andbe retransmitted through otherchannels.This approachleavesthe bestTXTchannels tobe used:(1) forretransmissions, asthe available time fordelivering themessagebecomesshorter;(2)fortransmittingothermessages withashorterdeadline.Thissolutionachievessomeloadbalancing amongmessages withdifferentdeadlinesscheduled fortransmis- sionwithin agivenshort interval.Messages withstringentdead- linesaretransmittedthroughthefastestchannels,whilemessages allowinglargerdeliverytimesgothroughslowerchannels.

Thebackupchannelsareselectedinsuchawaythattheyhave aslittlecorrelationaspossiblewiththebasechannelandbetween themselves(basedonmonitoringinformationabouttheusedlinks androuters),whilestill beingable todeliverthe messagewithin thetimeconstraints.Asaresult,themessagesarenotsentasfast aspossible,butfastenough,orjustintime.

Immediate and incremental deployment. From the deployment pointof view,using baseline IP protocols ensures immediatede- ployment, without the need for global changes at the network level.In consequence, GDII stakeholdersdo not need to add the JITeR nodesimmediately to all facilities: deployment maybe incremental.

3.2.Systemmodel

The system is composed by a set of JITeR nodes R=

{

^r0,...,r_n₋₁

}

^.^A^JITeR^logical^overlay^channelinterconnectstwoJITeR nodes,andisimplementedeitherbyadirectchannelorbyaone- hopindirectchannel.Inthissense,eachpairofnodesr_i,r_j(withi= j)isconnectedbyatleastonedirectchannelc_ij,providedby routing across the network IP level. When two nodes are connected byan indirectchannel,anotherJITeRnodeworksasarelay.Inthis sense,anindirectchannelconnectingr_itor_jpassingthroughr_k,is thecompositionoftwodirectchannelsc_ikandc_kj,anddenotedby c_ikj.

The notion of channel is, however, deeper than this. A channel is an abstraction deﬁned at each JITeR node. The algorithm takesadvantage ofmultihoming, soeachchannel beginswiththe connectiontoone oftheISPs thenode isconnectedto, fromthe ISPsetI=

{

^p0,...,p_n₋₁

}

^.^Whenever^needed, ^weûseâsuperscript toindicate theISP. Forexample, ifthedirect channel isc_{i j}^p,then whenr_isendsamessagethroughthischannelthemessageisfirst passedtoISPp.Therecipientr_jtypicallyreceivesthemessagefrom adifferentISPq (themessagecanpassthroughseveralISPs/au- tonomous systems, see Section 5.3);if it replies to the message, itsends the reply through q. Forindirectchannels,we employa superscriptwiththeISPs towhich thesender andtherelay pass themessages.Forinstance,ifthechannel isc_{ik j}^pq, thenthesender r_i passesthe messagesto ISPpandthe relay r_k passesthe messages to ISPq.If p=q we abuseof the notationand write only oneletter.

The algorithm is used to exchange control messages m=

^data,d

∈M,wheredataisthecontentofthemessageanddthe deadline by which it has to be delivered, relative to the instant whenitwassent(itisatimeinterval,notaninstant).Thesemes- sages are assumedto be sporadic(do not createcongestion)and small(ﬁt inone IP datagram).In thisworkwe disregard the delays thatoccur within theLANsofthe facilities(typicallysmaller than 0.2ms, much less thanthe values ina WAN), andwe also ignoretheprocessingdelaysintheJITeRnodes,sincethesedelays areusuallymuchsmallerthantheWANtransmissiontimes.Inthe caseswherethisisnottrue,onecanassumeaboundonthesetwo delaysTextra,anduseittoupdatethedeadlinedaccordingly(one woulduse adeadline d=d−2×Textra).A messageissaid to be sent whenit is placed in theexternal transmission queue atthe senderJITeRnode,andissaidtobedeliveredwhenitisputinthe internal transmission queue of the destination JITeR node, to be forwardedtoitsdestinationbythereceivernode.

Weassume that channelsare lossyandasynchronous,i.e., they can lose messages andthere are no boundson the communica- tiondelayswithintheWANlinks. Wealsoassumethat corrupted messagesare detectedanddropped (e.g.,usingcyclicredundancy checks or message authentication codes), allowing only correct messagestobedelivered.

Channel correlation. Pairs of channels are assigned correlation numbersinthe range[0,1].0means thatthereisno correlation atall,i.e.,thattherearenotransmissionmedia,equipmentorad- ministrationcommontobothchannels;1meansthat theseitems arecommontobothchannels.

There are severalpossibilities for the data employed to com- putethecorrelation.Apracticalsolutionistousethesetofrouters commontobothchannels,whichcanbeobtainedbyrunningtools liketraceroute.Weconsiderthateachchannelcischaracterizedin terms of a set ofrouters rout such that, given the set of routers of another channel c, the correlation of the two channels can be computedfrom routand rout usingthe Sørensen-Dice coeﬃ- cient:s=2

|

^rout^rout

|

/(

|

^rout

|

+

|

^rout

|

)^.^Data^about^routes^has^to beupdatedperiodicallytoaccountforroutechanges.

Analternative solutionisto considerdiversityinterms ofau- tonomous systems.The conceptofAShasbeenevolving[45],but it suggestsa setof routersunderthe sametechnical administra- tion,soitcanbe usedasaunitofdiversity.Thismakessense es- peciallyinworld-widenetworks,asourexperimentswithAmazon EC2 show that pairs of regions are connected by a considerable numberofASs. Thecorrelation ofASscanbe calculatedsimilarly towhatwasgivenforrouters.

Athirdsolutionwouldbetocomputechannelcorrelationbased onpathavailabilityhistory,asproposedbyZhangandPerrig[59].

Channel latency. Each node keeps information about the trans- missiontime (TXT) ofeveryoverlay channel.Theactual measure- mentmethod ofTXTisindependentfromthe algorithm,butcur- rently we approximate it as half of the round-trip-time (TXT= RTT/2). Measuring one-way delays is known to be diﬃcult as it requires strict time synchronization between hosts [41]. For the sake ofexample,intheexperimentsforthispaperwe haveused a method similar to the TCP protocol[42], estimatingTXT using an exponential weightedmoving average:TXT=(¹−

α

)×TXT+

α

×TXT_measured. If no normal traﬃc is being exchanged between the nodes, then messages are exchanged periodically to support these measurements. The protocol used to send these messages is UDP insteadof ICMP, because JITeR sends messages over UDP.

Moreover,itwasrecentlyshownthattheuseofICMPforthispur- poseisproblematicdueatleastinparttostrangeprocessingthat some routersdoto packets [43]. Ifround-trip-time measurement messagesfora directchannel takemore thana certainthreshold

(6)

Fig. 2. The Overlay Channel (OC) matrix of a node r i. A row j of the matrix contains the channels for sending messages from r ito r jordered by TXT . Each matrix entry can represent either a direct channel (e.g., OC [ j , 0]) or an indirect channel (e.g., OC [ j , 1]).

TXT_max to be answered more than Od_max times, we assume this channel tohaveTXT=+∞,andthealgorithmstopsusingit.No- ticethatsincethesemeasurementsaredoneperiodically,ifachan- nelisreestablished,itwillagainbeconsideredfortransmission.

If a message takesmore than the current estimate TXT to be delivered (orisnot delivered)then thereisa communicationtim- ingfault.Moreformally,givenanytwonodesr_iandr_jandthees- timate TXT_ij of the transmission time between them maintained at r_i, there is a communication timing fault if r_i sends a mes- sagetor_jatinstantt_send andthemessageisnotreceivedatr_jby t_send+TXT_{i j}.

3.3. Supportingdatastructures

Every node r_i keepsa copy of a system-wide matrix DC that stores dataabout all direct channelsc^p_jk, forr_j,r_k∈R and p∈I. EachentryDC[j,k,p]hastwoﬁelds:txt isan estimateoftheTXT of the channel c^p_jk; and rout is the vector with the routers that messages traverse onthis channel (asexplained above).³ Noder_i periodicallyupdatestheentriesDC[i,∗,∗]withthenewvalueses- timatedlocally,andthendisseminatesthissubmatrixtotheother nodes. Whenevera node r_k receives such a submatrix from r_i,it updatestheentriescorrespondingtoDC[i,∗,∗]initsownmatrix.

Itisimportanttonoticethat differentnodesdonotneedto have exactly the sameDC matrix, butonly convergeto thesame, just likeindistancevectorroutingprotocols.

ThedataintheDCmatrixisusedtopopulatetheoverlaychan- nel matrixOC (seea representationin Fig.2). Anode r_istores in theOCmatrixdataaboutalloverlaychannels,directandindirect, availabletoconnectitselfandallothernodesr_j.Eachentryofthe matrixrepresentsanoverlaychannelandcontainsastructurewith thefollowingﬁelds:txt istheTXTofthechannel(the sumofthe TXTsofthetwo sub-channelsifit isanindirectchannel);relayis the node that relays the message (or NULL for direct channels);

isp1istheISPtowhichr_iisconnected;isp2istheISPtowhichthe relaynode isconnected(orNULLfordirectchannels); androutis thevector withtheroutersthat havetobetraversed. Theentries OC[j, ∗] in the array correspond to theoverlay channels towards

3Although there is at most one relay per channel, there can be an arbitrary of (network-layer) routers per channel.

thedestinationnode r_j,andtheyareorderedfromthelowestTXT to the highest, which places slow(i.e., high-latency) channels in thelastcolumns.

3.4.Thealgorithm

Thealgorithm worksin a loop.Each cycleconsistsinsending themessagethoughoneormorechannels,thenwaitingforanac- knowledgment.Iftheacknowledgmentisnot receiveduntila cer- taintimeout,anewiterationoftheloopisexecuted.

Whennoder_iwantstosendamessagetor_j,thealgorithmse- lectsfromrowjoftheOCmatrixanumberofchannelsaccordingly tothefollowingrules:

Basechannel:Thebasechannelischosentomaximizethepos- siblenumberofretransmissionsofthemessagethrough different channelsbefore the deadline expires,andto allow some level of loadbalancing among messages with different deadlines,leaving thebestchannelsforthemessageswithshortestdeadline. There- fore,the basechannel is not theone that provides the bestTXT, onlyaTXTthatisenoughforthemessagetobedeliveredintime.

Backup channels: Atotal of B backupchannels are selected in a way that: (i) minimizes the correlation with the base channel andbetweenthemselves;(ii)preservestheir abilitytodeliverthe messagebeforethedeadline.

Uponatransmission,atimeoutof2×TXTsetsthewaitingtime foranacknowledgmentofthemessagedelivery.Ifnoacknowledg- mentisreceived,thentheselectionprocessisexecutedagaintoal- lowfortheretransmissionofthemessage,possiblythroughother channels.

Algorithm 1 contains the pseudo-code executed by node r_i when a message m=

^data,d

^is ^to ^be ^sent ^to ^node ^rj. The al- gorithmuses primitivesend(^TYPE,m,c)^to ^transmit^message^m ^of type

TYPE

throughanoverlaychannelc(

TYPE

isoneof

DATA

or

ACK

^). ^The destination node isimplicit in c, andcan be obtained withfunction destination(c). Functionchannel(OC[j,i])returns the overlaychannel of thecorresponding entryof OC,whilefunction correlation() gives the correlation betweenthe routes of two OC entries.ThenumberofchannelsinanOCmatrixrowjisdenoted by#OC[j,∗].The deadlineofmessagem relativetothe instantin whichthemessagewassentisintheﬁeldm.d.

Themainpartofthealgorithmisinprocedure jiter_send(Lines 3-21).Thisprocedureisexecutedintwocases:(i)whenr_ireceives

(7)

Algorithm1 JITeRalgorithmexecutedbynoder_i. 1: uponthereisamessagemtodelivertor_j: 2: jiter_send(^m,r_j)

3: procedure jiter_send(^m,r_j)

4: if(^OC^[^j,0].txt>m.d)^then

5: signalNOT_ENOUGH_TIME(m) {notimetosend,exit}

6: endif

7: bc←0 {thebasechannel}

8: elapsed←OC[j,0].txt {overallusedtime}

9: whilebc+1<#OC[j,∗]do

10: if (^elapsed+2×OC[j,bc+1].txt≤m.d)^then

11: elapsed←elapsed+2×OC[j,bc+1].txt 12: bc←bc+1

13: else

14: exitloop {notimetousemorechannels}

15: endif 16: endwhile

17: send(^DATA,m,channel(^OC^[^j,bc])) ^{send^to^base^channel}

18: P←setofBbackupchannelswhere (i) OC[j,∗].txt≤m.d and

(ii)correlation(OC[j,bc],OC[j,∗])^and^between^themselves^is thelowest

19:

∀

^p∈P:send(^DATA,m,p) ^{send^to^backup^channels}

20: m.d←m.d−2×OC[j,bc].txt 21: start_timer(^m,r_j,2×OC[j,bc].txt)

22: uponexpirestimerformtoreachr_j:

23: jiter_send(^m,r_j) ^{retry^to^send^m^to^rj} 24: upon(^DATA,m,c)^is^received:

25: if(destination(^c)=r_i)^then

26: send(ÂCK,m,c) ^{^c^:îs^the^backward^channelôf^c}

27: delivermtoitsﬁnaldestination 28: else

29: send(^DATA,m,destination(^c)) ^{relay^m^to^itsdestination}

30: endif

31: upon(^ACK,m,c)^is^received:

32: if(destination(^c)=r_i)^then

33: stop_timer(^m)

34: signalOK_DELIVERED(m) 35: else

36: send(^ACK,m,destination(^c)) ^{relay^m^to^itsdestination}

37: endif

amessagem tobe transmittedthroughtheWAN (Lines1-2); (ii) whenthetimerthatisstartedwhenthemessageissent(Line21) expires(Lines22-23).Noder_iﬁrstchecksifthefastestchannelhas aTXTlow enoughtoallowthemessagetobereceived beforethe deadline,i.e.,ifTXTisatleastequaltothetime availableto send themessage(Line4).IftheTXTisnotlowenough,anerrorsignal israised(Line5).

Otherwise, thealgorithmwillﬁndthe basechannelby testing theOC[j,∗]entriesinascending TXTorder(Lines7-16). Thealgo- rithm predicts the maximum possible number of future retransmissions ofm through different basechannels based on the ref- erencedeadline valuem.d.Itprogressivelytakesadvantage ofthe distinct overlay channels available in the entries of OC, each of whichwithlatencycostOC[j,∗].txt.Thealgorithmchoosesasbase channel theone withthe highestTXT,fromthose that inprinci- pleallowthetransmissionofthemessage,andsendsthemessage throughthatchannel(Lines17).

Next,asetPofBbackupchannelsiscreated(Line18).Backup channelssatisfy twoconditions: (i)they expectedlyallowthede- liveryofmbefore itsdeadline;and(ii)theyhavethe lowestcor-

relationwiththebasechannelandbetweenthemselves.Themes- sageis sentthrough thebackupchannels(Line19), andthen the algorithmupdatesthemessagedeadline andstartsa timer(Lines 20-21). If thetimer expires,the message isresent (Lines 22-23).

Noticethatthemessagecanberesentwithoutneedduetoalate ACK,butthisisnotproblematic.⁴

When thenode receives a

DATA

^message^m^,^it ^either^delivers

ittoitsﬁnaldestinationintheLAN,sendinganACKtothesender node (Lines26 and27),orrelays m toits destination JITeR node (Line29),dependingonitsdestination.Ifr_ireceivesan

ACK

^mes-

sagefromnode r_j,the timeris stoppedandthealgorithm termi- nateswithsuccess(Lines33-34).Noticethattheerrorsignaledin Line5canbepessimistic,asthemessagemaybedeliveredbutthe acknowledgmentlostorreceivedlater,afterthesignalisraised.

3.5. Properties

Themainpropertythatthealgorithmguaranteesistimelymes- sagedeliverywithhighprobability,despitecommunicationtiming faults. Consider thatbc istheindexof thebasechannel (starting with0)whena messagemisfirst transmitted(Algorithm1,Line 17)andthat Bisthe numberofadditionalbackupchannelsused oneachtry(Line18).Thealgorithmtransmitsmthrough1+Bdi- verse channelsin each try (executionof jiter_send), andatmost bc+1triesareexecutedtosendm.Itmeansthatthesystemtoler- atesatmost(¹+B)×(^bc+1)−1communicationtimingfaultson overlay channels.This expression reflects themain idea of JITeR: toexplore spaceandtimeredundancyto increasetheprobability oftimelymessagetransmission.Naturally,theeffectivenessofthe redundancy employed is dependent on the characteristics of the network. Ifthe networkis very unstable(the channels’ TXTesti- mates do not reflectthe actual transmission times) and/or ifthe linksexhibit a highcorrelation,neither JITeRnor anyother routing strategy can effectively ensure timely communication. How- ever, given ISP diversity andconsidering the naturalredundancy ofwell-engineerednetworks,ouralgorithmwillexplorebothquite effectively,evenunderharshscenariosofdisastersorDDoSattacks, asweshowinSection5.2.

3.6. Anexample

Thissectionpresentsan exampletoshow:(1)thewayaJITeR nodeselectschannelstotransmitamessage;(2)thatbadchannels arenotused;and(3)thatloadbalancingisachievedandthebest channelsareleftforthemessageswithshortestdeadline.Wecon- sideranoverlaywithfournodes{r_i,r_j,r_k,r_l}andB=1.Noder_iis connectedto twoISPs, pandq,andsends twomessages m₁ and m₂ tonode r_kinarow,withdeadlinesrespectivelyof30ms and 70ms. Fig.3 representsthe lineofthe OCmatrixwithchannels connectingr_itor_j(linejofFig.2).Eachcolumncorresponds toa channel.Thechannelidwasaddedtohelpourexplanation.

Message m₁ hasa deadline of30 ms. Lines 7-16 ofthe algo- rithmconcludethattosendm₁thereistimetousechannels0and 1 (2×11+8≤30), but not channel 2 (2×14+2×11+8>30).

Therefore, the ﬁrst base channel used for m₁ is channel 1. For backupchannelit choosesthelesscorrelatedchannel that allows achievingthe deadline, channel 5,which hascorrelation s=0.33 withchannel 1.Channel9isexcludedbecauseitsTXTistoohigh.

Ifanacknowledgmentisnotreceiveduntil2×TXT₂=22ms,then r_ipicks the secondbasechannel (channel0) andanotherbackup channelandretransmitsthemessage.

4Instead of setting the timeout to 2 ×TXT , line 21 might add to that value four times an estimative of the deviation, similarly to what is done for TCP’s retransmission timer [42] . However, this would waste time that could be used to resend the message through a different channel and improve the chances of reception on time.

(8)

Fig. 3. Line of the OC matrix at Jiter node r iwith the channels connecting it to r j.

For m₂ something similar happens. The deadline is 70ms, so it hastime touse channels0 to 2(2×14+2×11+8≤70), but notchannel3,sotheﬁrstbasechannelis2.Forbackupchannelit chooses amongchannels1,3,7and9,all withcorrelations=0.5 withchannel2.

Thisexampleshowshownodesselectchannels.Italsodemon- strates that bad channels are never selected (channel 9 is never selectedtosendm₁).Finally,itshowsloadbalancinginaction:the two messages are sent through different base channels the ﬁrst time they are sent. Its is importantto recall that the OCmatrix isquite dynamic:ouraggressivemonitoringstrategyensures that the valuesof thistable (and the order ofthe channels) are con- stantly beingupdated, ensuring thesystem adapts to changes in thenetworkconditions.

4. JITERimplementation

Current implementation. We implementedthe JITeR nodesinJava (around 6000 lines of code). The prototype uses the JGroups toolkit⁵ to managemembership,i.e.,tokeep andupdate viewsof the JITeR nodesthat are active. The prototypeuses the functions ofthattoolkittoallownodestoenterthegroupofJITeRnodes,to leavethatgroup,andtoberemovedautomaticallyincasetheybe- comeinaccessible,similarly tomembership servicesinthe literature[3].Otherwisetheprototypedoesnotusethecommunication primitivesprovidedbyJGroups,butsendsmessagesontopofUDP.

The prototype alsoimplements the FloodingandPrimary-Backup strategies(seeTable1).

Innormaloperation thenodesconstantlymonitortheTXTbe- tweenthemselves,eitherexplicitlybysendingheartbeatmessages, orimplicitlybymeasuringthetimetakentogetrepliestothedata messages they send.Thediversity amongchannelsisperiodically assessedusinginformationaboutroutersobtainedusingtraceroute.

Scalabilityandnodereplication. TheJITeRnodeofafacilitycanbe replicatedtocopewithmoremessages(scalability)andtotolerate faults. The basic idea is to replicate each JITeR node in a set of hosts inthesamegeo-location. Eachreplica handlesafractionof the messages andavailability is ensured by the other replicas if someofthemcrash.

Theimplementationisbasicallythefollowing.Eachreplicahas softstatereﬂectingcacheddatastructuresthataremaintainedina coordinationservicessuchasZookeeper[28].Themembership(list ofreplicas)ofeachnodeisalsokeptinZookeeper,thatcantrivially detectreplicafailureandsupportreplicaadditionandremoval.⁶All replicaswillmonitordifferentchannelsofthenetworkinorderto update the data structures on Zookeeper, andwill read thisma- trixtomemoryperiodically(e.g.,everyfewseconds).Aclientthat wants to send amessage usingsuch replicatednode will choose

5http://www.jgroups.org/ .

6http://zookeeper.apache.org/doc/trunk/recipes.html .

oneofthereplicasrandomlyanduseitasitsJITeRnode.Thesame thinghappenswhenotherJITeRnodeswanttousethisnodeasa relay. Thissolutionallows scaling upaslong asZookeeperisnot thebottleneck,whichisunlikelyasitscaleswellwiththenumber ofreadoperations[28].

Accessandadmissioncontrol. JITeRimprovesthetimelinessofcon- troltraffic that we assume to be negligiblein comparisonto the overallnetworktraffic.However,inpracticeaccesscontrolandad- missioncontrolhavetobeimplementedtolimit,respectively,who cansendmessagesusingJITeRandhowmuchtraffic canbe sent.

Thereare severaloptions toimplementbothmechanisms.Forin- stance,accesscontrolcanbebasedonSOCKS5andadmissioncon- trolsimilartoATM’s[16].

5. Evaluation

ThissectionevaluatesJITeR analytically,usingsimulations,and experimentally. These three evaluations complement each other andshedlight onthe fundamentalcharacteristicofJITeR andre- latedstrategies.

5.1. Analyticalcomparisonofstrategies

This section compares analytically JITeR with other potential candidates to improve the timeliness of control traﬃc in wide- area IP networks, whether with overlays, multihoming or both.

Table1 describesthe strategies,where we consider two possible conﬁgurationsforJITeR:usingB=0orB=1backupoverlaychan- nels, dubbed JITeR⁰ and JITeR¹,respectively. Results are not pre- sentedforhighernumbersofbackupchannelsbecauseourexper- imentshavenotshownbeneﬁtsinrelationtoJITeR¹.

Thevariousstrategiescanberoughly characterizedintermsof threemetrics:thenumberoftimestheycanresendmessages;the numberofchannelstheyuse(overlaying); andthenumberofac- cess ISPs they exploit (multihoming). Notice that these numbers are not constant, asthey depend on the actual physical conﬁgu- rationof thenetwork (e.g.,the numberofISPs) andthe runtime conditions(e.g.,the messagedeadlinesandthe channels’TXTac- tually deﬁne how manyretransmissions are possiblein JITeR). In anycase, it is still possible to calculate a level of fault tolerance ofeachstrategy:thenumberoffaultychannels(channelsthatare interruptedorwithveryhighdelay)andaccessISPsthataretoler- ated.Thesemetricsmeanthat ifthat numberofchannelsorISPs fail altogether it is still possible for a message to be received in time, because there is still redundancyin the system (assuming thatfailuresareindependent).Theydonotmeanthatthemessage willactuallybereceivedbecausetheydisregardtemporaryissues, likeshortdurationcongestionthatcanaffecttheotherchannel/ISP anddelaythe message.It isalsopossibleto obtainvaluesofcost intermsofthenumberofextramessagestransmitted.

(9)

Table 2

Analytical comparison of the algorithms. Refer to Table 1 for the evaluated algorithms descriptions ( # ch is the number of overlay channels available).

Strategies Technique Features Fault tolerance Cost (extra messages sent)

#Res- #Channels #ISPs #faulty channels #faulty ISPs no one two ends used used tolerated tolerated faults omission omissions

Best-Path OV 3 4 1 3 0 1 2 3

JITeR ⁰ OV/MH bc bc + 1 # ISPs bc # ISPs −1 1 2 3

JITeR ¹ OV/MH bc 2(bc + 1) # ISPs 2(bc + 1)−1 # ISPs −1 2 2 4 Flooding OV/MH 0 # ch # ISPs all −1 # ISPs −1 # ch # ch # ch

Multi-Path OV 0 2 1 1 0 2 2 2

Hybrid OV 1 5 1 4 0 1 5 5

Round-Robin MH 3 # ISPs # ISPs # ISPs −1 # ISPs −1 1 2 3

Prim.-Backup MH 3 # ISPs # ISPs # ISPs −1 # ISPs −1 1 2 3

ThesemetricsandnumbersareshowninTable2.Thesymbol# means“numberof” andbcistheindexofthebasechannelwhen themessageisﬁrsttransmitted(Algorithm1,Line17),whichgives thenumberoftimesthemessagecanbesentthroughbasechan- nelsminusone.Notice that#Resendsaccountsfortriestoresend themessage;forinstanceJITeRresendsa messageuptobctimes.

Brieﬂy,thetable wasobtainedthe followingway.The 3rdto 5th columnscomefromTable1.The numberoffaultychannelstoler- ated(6th column) is equalto the number ofchannels used(4th column)minus1.The numberoffaultyaccessISPs tolerated(7th column)isthe numberofISPsused bythe scheme(5th column) minus 1.The cost withno faults (8th column) is the number of packets sent without faults (e.g., JITeR¹ always sends two pack- etsper application-level messageand ﬂooding sends asmany as the number of channels). With omissions this number increases (9th/10thcolumn).

ThetableclearlyshowsthatJITeR¹andJITeR⁰exploreallthedi- mensionsofdiversity,consideringtime,channelandaccessISPre- dundancy.Thissuggeststhattheyareabletoachievebettertimeli- nessthanalternativeschemesthatdonotexploreallthoseoptions.

FloodingexploitschannelandISPredundancytotheextreme,with theassociatedhighcostoftransmittingmessagesthroughallavail- able channels (#channels).The Best-Path, Multi-Path andHybrid strategiesexplore time and/or channel redundancy within a single ISP. Round-Robing and Primary-Backup explore time and ISP redundancy,butnotchannelredundancythroughotherfacilities.

5.2.Scenario-basedsimulation:criticalinformationinfrastructure

This section presents a simulation-based evaluation of the strategiesdescribedinTable1usingamodelofareal-worldcrit- ical information infrastructure. We developed a detailed model of the Italian power grid GDII using information publicly available[18,47].Weoptedforsimulationbecauseitwouldbevirtually impossibletohaveaccesstoanyGDIIproductionenvironment.

The evaluation aims to answer two important questions: (1) Givena setof messages withdifferent deadlines,to whatextent arethesemessagesreceived intime whenemployingthevarious strategies?(2)What are thetransmissioncosts incurredby these strategies?

Simulatednetwork environment. Simulations were carried out on the J-Sim simulator.⁷ The simulated network is based on a ISP backbonetopology that is used by the largest Italian powergrid company for control communication [47]. This topology is com- posedof31routersand51directchannels,asdepictedinFig.4(a).

Each router is capable of pushing data at 1 Gbps, and network

7http://sites.google.com/site/jsimoﬃcial/ .

Table 3

Power control messages of the simulation.

Message Deadline Period Description

CC-STATE 4s 4s CCs state info exchange SS-STATE 1s 2s SS state info to its CC SS-ALARM 1s sporadic SS alarm to its CC CC-CMD 2s sporadic CC command to a SS CC-HPCMD 1s sporadic CC high-priority com-

mand to a SS

channelsprovideapropagationdelayof50ms.Torepresentmul- tihoming, we replicate the network topology to create two fully decoupledISPbackbones.

Onthe underlyingnetwork topology,we considered 17candi- date gateways (the polygons in Fig. 4(b)), each corresponding to anItalianregion.Gatewaysarelocatedatacontrolcenter(CC)or asubstation(SS),andtheycansenddataat100Mbps.Fourspecial gateways(circlesinthefigure)arelocatedinthemainItaliancities in differentregions. Theyconnect regionalremote controller sta- tions,thusbeingresponsibleforpassingalongalltrafficrelatedto them(e.g.,receivemonitoringdatafromvariousSSsandtransmit commandstoreconfigurea SS).Theremainder gateways(squares inthe figure)areincharge oftraffic relative toa SS(e.g.,signal- ing messages sent by the SSto therespective CC).The gateways correspondtotheJITeRnodes,buttheycanrunanyofthestrate- giesofTable1.Tolimit thenumberoftransmittedmessages, we usedonly7ofthe17gatewaysinthesimulations:the4CCsand3 otherwellpositionednodes(darksquaresinFig.4(b)).Everynode hadaccesstotheWANthroughtworedundantlinks,eachonepro- videdbyadistinctISP.

Workload. The traffic of each CC and SS nodes involved in the setup was generated separately. We created 17 different traffic sources withthesame duration ofasimulation, totalizing90,134 messagessent.Thesametrafficsourceswereusedinallstrategies.

The workloadwasgeneratedbased oninformation abouttyp- ical powercontrol traﬃc [18].Traﬃc can be periodic orsporadic andhave different deadlines,according to the type of operation.

Fivedistinct traﬃc patternswere identiﬁed, asshowninTable3. Allthesepatternsareappliedinthesimulation,withperiodicmes- sagesstartingtobesentafterarandominitialdelay.

Faultloads. Thefaultloadsweregeneratedforthesimulationinter- valof5hours.Afaultloadhasatotalnumberoffaults f=148to be injected across allnetwork components. Faultsare injected in backbonerouters,linksandgatewayinterfaces(toemulatetheef- fect ofafailure ontheISPaccessrouters).Afaultnumberfc was deﬁnedforeachcomponentclass,followingthedistributionofun- plannedISPIPbackbonefailuresshownin[35].

(10)

Fig. 4. ISP network topology: reality and simulation.

• Faultload1(fault-free).Ideal,nofailures.

• Faultload 2 (accidental faults). A scenario where there are ac- cidentalproblemsin theWAN.Accidentalfaults aregenerated usingacombinationofthreenetworkfailure modelsfromthe literature. The model in [35] is used to determine the fail- urestartingtimeandlocalizationpernetworkcomponent, and the failure duration is based on the models in [14,34]. The resulting model states that (1) the starting time of failures is randomly picked following a Weibull distribution over the network-widesimulatedtimewindow[35]; (2)30% ofallfail- ureslastmorethan30secondsaccordingtoaParetotruncated distribution [14], whilethe remainder oneslast up to 30sec- onds following an Exponential distribution [34]; (3) for each classofnetwork devices,an individual elementisselectedac- cording to a Power-Law based distribution [35] to inject the fault.

• Faultload 3(crisis). Amore stringentscenario wherethe WAN is subject to accidental and malicious faults. This faultload is similar to faultload 2, but it also includes faults that have a longerdurationandaffectmorecomponentstosimulateDDoS attacks.AfteranalyzingdataaboutrealDDoSattacksinthelit- erature [29,36],a model wasbuilt withthe following characteristics:(1) theinitial time ofa single failure is obtainedby uniformlyselectingarandomnumberwithinthesimulationin- terval;(2)80%ofallfailureslastmorethan30secondsaccord- ing toa Pareto truncateddistribution,andthe remainder 20%

lastupto30secondsfollowinganExponentialdistribution;(3) foreachclassofnetworkcomponents,anelementisuniformly chosenforfaultinjection.

Simulation results. Table 4 shows the results of the simulations.

Twometrics are usedto evaluatethe strategiesofTable 1inthe simulations:thenumberofmisseddeadlines,whichisameasureof the effectivenessofthe schemetoachieve timely communication;

and thepercentage ofextra messages sent, which is a measure of cost.Thislast metricisthetotalnumberofmessagessent bythe schememinus thenumberofmessages transmittedbythe application,dividedbythenumberofmessagessent.

Table 4

Simulation results in terms of missed deadlines (effectiveness) and % extra messages sent (cost) considering fault-free (FF), accidental faults (AF) and crisis (C) scenarios. for the evaluated algorithms descriptions.

Strategy Missed deadlines % of extra messages sent

FF AF C FF AF C

B.P. 0 101 1 ,527 0 .00 0 .16 1 .76 JITeR ⁰ 0 0 97 0 .04 0 .96 49 .29 JITeR ¹ 0 0 37 100 .02 100 .41 167 .63 Flood. 0 0 5 2410.09 2410.09 2410 .09 M.P. 0 43 8 ,041 100 .00 100 .00 100 .00 Hybrid 0 111 8 ,627 0 .00 0 .21 26 .62%

R.R. 0 109 9 ,600 0 .00 0 .16 18 .67 P.B. 0 5 1 ,535 0 .00 0 .01 1 .78

The table shows that there were no deadlines missed in the simulations of any of the strategies in the failure-free (FF) scenario.However, afewoftheschemesincurredinadditionalcosts in terms of extra messages. Not surprisingly, Flooding was the most expensive scheme, as it sends messages through all channels. Its cost was severalorders ofmagnitude higher than those ofJITeR⁰ and JITeR¹.However, onthe contrary ofJITeR,Flooding does not need to keep information about the transmission time ofthe overlay channels, so there is a tradeoff involved. Compar- ing our strategies withthe others, one can conclude that apply- inganadditionalbackupchannelimpliedan increaseintheover- head,sinceJITeR¹ duplicates eachtransmitted messagebyalways exploringthebackupchannel(likeMulti-Path).JITeR⁰doesnotuse abackupchannel,sothecostisnegligible.

When accidental faults (AF) are considered, we observe that theHybridalgorithmhadthehighestamountofdeadlinesmissed.

Floodingmissed no deadlines,butalso didnot JITeR⁰ and JITeR¹ atmuchlowercosts.Interestingly,thenon-overlayprimary-backup schemethatisusedbymostGDIIsoutperformedsomeoftheover- lay strategies, conﬁrming that often it can cope with accidental faultscenarios.