HAL Id: inria-00111139
https://hal.inria.fr/inria-00111139
Submitted on 3 Nov 2006
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Increasing Data Resilience of Mobile Devices with a Collaborative Backup Service
Damien Martin-Guillerez, Michel Banâtre, Paul Couderc
To cite this version:
Damien Martin-Guillerez, Michel Banâtre, Paul Couderc. Increasing Data Resilience of Mobile Devices with a Collaborative Backup Service. [Research Report] 2006, pp.16. ⟨inria-00111139⟩
inria-00111139, version 1 - 3 Nov 2006
November 2006
Systèmes communicants
ACES project
Research report No. ????, November 2006, 16 pages
Abstract: Whoever has had his cellphone stolen knows how frustrating it is to be unable to get his contact list back. To avoid data loss when losing or destroying a mobile device like a PDA or a cellphone, data is usually backed up to a fixed station. However, in the time between the last backup and the failure, important data can have been produced and then lost.
To handle this issue, we propose a transparent collaborative backup system. Indeed, by saving data on other mobile devices between two connections to a global infrastructure, we can resist such scenarios.
In this paper, after a general description of such a system, we present a way to replicate data on mobile devices to attain a prerequired resilience for the backup.
Key-words: Data resilience, mobile computing, collaboration, backup, sensor networks, mobile ad hoc networks, data MULEs.
This work is partially funded by the ACI SI MoSAIC and the ReSIST Network of Excellence.
Authors' e-mail: {dmartin,banatre,pcouderc}@irisa.fr
Unité de recherche INRIA Rennes
[...] collaboratif pour terminaux mobiles
Résumé: Anyone who has already lost a mobile phone knows that, besides the material loss, losing the contact list is very annoying. To avoid any data loss when a mobile device such as a PDA or a mobile phone is destroyed or lost, data is usually backed up to a fixed station. However, data acquired since the last backup is permanently lost.
To protect this data, we propose using a collaborative backup system. Indeed, backing up important data on neighboring terminals over a wireless communication link would alleviate this problem.
Mots-clés: Fault tolerance, mobile computing, collaborative system, backup, sensor networks, mobile ad hoc networks
1 Introduction
The use of mobile computers, such as laptops, PDAs, mobile phones or digital cameras, has increased amazingly during the past years. Thus, the production of sensitive data on such devices has also increased. The loss of such data can have painful consequences for users: loss of phone numbers, loss of meeting dates, or deletion of important notes or pictures.
To reduce data loss, those devices usually have a synchronization-like mechanism, whose main issue is that you need to be near your computer. This leaves time periods during which a device failure means irreversible data loss. For example, if you take a note on your PDA during a meeting and your PDA gets lost, stolen or broken on your way back, then the note is definitely lost.
However, more and more mobile devices come with wireless connectivity like IEEE 802.11 or Bluetooth. Using neighbor devices to save data right after its production can decrease data loss, since data can then be restored either from a global-scale network like the Internet or directly from a backup device. Saving automatically on a global-scale network seems to be a viable assumption because of the growing number of wireless accesses to the Internet. Nevertheless, the required infrastructure for this kind of access is expensive (e.g. GPRS, UMTS). In such a situation, the use of neighbor peers to back up sensitive data is a way to decrease the cost of the backup.
We aim at designing and implementing a transparent collaborative backup service for mobile devices [9]. Such a service differs from existing works and thus needs to meet specific requirements we outline in section 2. Then, we analyze several issues specific to mobile device data and replication in section 3. Afterwards, we present a way to order replicas in that system in section 4 and ideas for backup terminals to manage replicas in section 5. Finally, after outlining works that are still pending in section 6, we present existing systems in section 7 and conclude in section 8.
2 Design overview
Our main purpose is to design an efficient backup system called MoSAIC [10] that can handle high mobility, which means that it needs to handle two scenarios:
- When connected to a global network like the Internet, the system must use this opportunity to save data on a resilient server.
- When disconnected from the global network, it must use neighbors to back up selected data (i.e. data of higher importance).
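The two scenarios can be read as a small decision rule. The following is only a sketch under stated assumptions: the names `Target` and `choose_target`, and the numeric `threshold` used to decide which data counts as "of higher importance", are hypothetical and do not come from the MoSAIC design.

```python
from enum import Enum


class Target(Enum):
    SERVER = "server"      # resilient Internet server
    NEIGHBOR = "neighbor"  # nearby backup peer
    NONE = "none"          # keep the data locally for now


def choose_target(internet_up: bool, neighbor_reachable: bool,
                  importance: float, threshold: float = 0.5) -> Target:
    """Pick where to back up one data item (threshold is an assumed knob)."""
    if internet_up:
        # Connected to the global network: always use the resilient server.
        return Target.SERVER
    if neighbor_reachable and importance >= threshold:
        # Disconnected: only selected (important) data goes to neighbors.
        return Target.NEIGHBOR
    return Target.NONE
```

For instance, `choose_target(False, True, 0.9)` selects a neighbor, while a low-importance item in the same situation stays on the device.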
Also, depending on data production (e.g. production rate, data importance), the system should adapt the level of replication. We especially want fair use of the system to avoid useless resource consumption. Moreover, the system needs to be protected against egoistic participants that back up their data but do not provide resources to others.
Furthermore, the system should avoid useless energy consumption. As the system should work on mobile systems, energy and other resources are quite scarce and should be used wisely.
We want the system to be as implicit for the user as possible. That means:
- very few actions are required from the user when performing the backup or the recovery (i.e. the backup needs to be a complete one and easy to restore),
- no prior trust relationship with other peers is required,
- no extra hardware is required.
As shown in figure 1, a client terminal can back up its data either to another terminal (the backup peer) or to an Internet server. Data can be transferred from the backup peer to the Internet server. The client terminal can then restore its data either from a backup peer or from the Internet server. We do not consider propagating backups through peers, for two reasons:
- Copying a backup across terminals costs energy and other resources. Just moving a replica (with deletion of the original one) costs communication resources (e.g. energy and time) and does not improve backup reliability.
- Only the owner of the data can know when it is necessary to start a replication. A replication issued by a backup terminal has a high probability of being useless.
That scheme also fits data MULE [19] networks well. Data MULEs are mobile wireless terminals or sensors that carry data from one location to another through the mobility of their carrier. For example, Burrell et al. [4] propose to use a sensor in a shovel or another tool that collects data from sensors in the vineyard, so that the computer at the farm will be able to analyze data brought back by the movement of the farmer.
In the same way, cattle health can be monitored using sensors that transmit data to a base station. Either health data, like temperature, or alerts can be issued by sensors. Those data, especially alerts, need to reach the base station even if the sensor fails. Using the proposed system can help those data to reach the base station. Sensors on birds that keep track of encounters can be used to monitor epidemics. During the encounters, the sensors can also use neighbors to save their data and optimize the reading by the observer (he just has to read one bird's sensor instead of all sensors).
Besides the two previous architectures, Banâtre et al. [1] propose to use collaborative robots to realize tasks without a centralized brain. In that system, data gathering and transmission are key points, and some information should be backed up to a local brain (i.e. a reliable storage close to the robots) or to a global server. The MoSAIC scheme can then be applied to increase data availability or to reduce the need for a global wireless coverage area.
Figure 1: Considered backup scenarios.
3 Data issues
3.1 Mobile device data
In this section, we look at data produced on classical mobile devices (i.e. PDAs and mobile phones) and at their attributes to understand their specific issues.
The first data attribute is, obviously, the size. The size depends highly on the type of data: it goes from less than 200 bytes for an SMS or a schedule entry to hundreds of megabytes for video captures. The second attribute is the production method. For instance, while a note can be created, updated or deleted, pictures are generally only created or deleted on those devices. Another attribute is the importance of data, which can range from high for notes taken during a meeting to very low for holiday pictures. Dependencies are also important: an e-mail can be useless until you have all preceding e-mails in a discussion thread. When a data item depends on preceding data, as in a discussion thread, we call that dependency a temporal dependency, as opposed to a spatial dependency, where a data item D depends on several others that can themselves depend on D. Finally, a last attribute that interests us is the lifetime of data. Indeed, some data, like a schedule entry, become less important once the time of the event has passed, even if it may still be important to save them.
In the case of data MULEs, data items are generally small entries (current temperature) or tracks of past events (encounters for epidemic monitoring). That is, we can consider that those data items are small entries with potential temporal dependencies.
So, mobile device data can be categorized by size, production method (creation only, read/write, append only), dependencies (temporal and spatial), lifetime and importance.
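These five attributes can be captured in a small record type. The sketch below is illustrative only: the names `Production` and `DataItem`, the field names, and the choice to express importance as a number between 0 and 1 are assumptions, not part of the paper.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Production(Enum):
    CREATE_ONLY = "creation only"   # e.g. pictures
    READ_WRITE = "read/write"       # e.g. notes
    APPEND_ONLY = "append only"     # e.g. logs of encounters


@dataclass
class DataItem:
    item_id: str
    size_bytes: int
    production: Production
    importance: float                      # assumed scale: 0 (low) to 1 (high)
    lifetime_s: Optional[float] = None     # None when the item never expires
    temporal_deps: List[str] = field(default_factory=list)  # older items needed first
    spatial_deps: List[str] = field(default_factory=list)   # mutually dependent items


# A meeting note: small, editable, highly important, no dependencies.
note = DataItem("note-1", 180, Production.READ_WRITE, importance=0.9)
```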
The size strongly affects the backup system, which must:
- Resist mobility or network problems during a transmission. The capacity of a transmission depends mainly on the bandwidth and the connection time. The MTU (Maximum Transmission Unit) can also be important. While bandwidth and MTU are generally easy to know, the connection time depends more on mobility.
- Avoid monopolizing the memory of one terminal. Memory consumption is a critical aspect of a mobile backup system. In the same way, when backing up a data item on a mobile terminal, the size of the item affects the length of the transmission and thus the energy consumed by the backup terminal. Therefore, deletion of replicas can be needed to free some space on terminals. It can be decided depending on the size of the replicas, on the arrival of a new version, on the number of replicas, etc.
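The relation between item size, bandwidth, connection time and MTU is simple arithmetic. The helper names `fits_in_contact` and `packet_count` are hypothetical, and the sketch ignores protocol overhead and retransmissions.

```python
import math


def fits_in_contact(size_bytes: int, bandwidth_bps: float,
                    contact_time_s: float) -> bool:
    """True if the item can be sent within one contact (no overhead modelled)."""
    return size_bytes * 8 <= bandwidth_bps * contact_time_s


def packet_count(size_bytes: int, mtu_bytes: int) -> int:
    """Number of MTU-sized packets needed to carry the item."""
    return math.ceil(size_bytes / mtu_bytes)
```

Under these assumptions, a 200-byte schedule entry fits in a single 1500-byte packet, whereas a 100-megabyte video cannot be transferred during a 10-second contact at 1 Mbit/s, which is one motivation for the fragmentation discussed in section 3.2.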
On the other hand, the production method affects the part of the data that needs to be saved (i.e. the entire file or only the new entry, etc.) and the dependencies (e.g. when backing up just an entry that depends on others). Moreover, dependencies affect the integrity of the backup and thus require the version tracking presented in section 3.3. Finally, we assign a priority to each data item according to its importance and try to save data with the highest priority first (section 4).
3.2 Dispersion of replicas
Since data size can be quite huge, there is a need for fragmentation of files. Moreover, the high probability that a terminal fails to restore a replica creates a need for a flexible replication scheme. Courtès et al. [6] have already looked at methods for redundancy and compression in that system. First, we consider that all the data items that have spatial dependencies are agglomerated into one data item (the priority of the new item is the highest of the agglomerated items) so that the only dependencies we consider are the temporal ones. Then, we consider the (n, k) replication scheme (as in Rabin's information dispersal algorithm [16]) that fragments the data into n fragments of which only k are required to reconstruct the data. We also consider delta-compression, which saves only the differences between an old version and a new version of the same file. While the (n, k) replication scheme creates loose spatial dependencies, delta-compression creates strict temporal dependencies. Simple replication is just an (n, k) replication scheme with k = 1. When delta-compression is used, replication is applied to the generated delta.
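To make the "any k of n fragments" property concrete, here is a deliberately simple stand-in: k data chunks plus one XOR parity chunk, i.e. an (n, k) scheme restricted to n = k + 1. This is not Rabin's information dispersal algorithm, which supports arbitrary n; the function names `encode` and `decode` are hypothetical.

```python
from functools import reduce
from typing import Dict, List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal chunks and append one XOR parity chunk."""
    chunk_len = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")      # zero padding, removed on decode
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    return chunks + [reduce(xor_bytes, chunks)]    # fragment index k is the parity


def decode(fragments: Dict[int, bytes], k: int, length: int) -> bytes:
    """Rebuild the data from any k of the k + 1 fragments."""
    missing = [i for i in range(k) if i not in fragments]
    if len(missing) > 1 or (missing and k not in fragments):
        raise ValueError("at least k fragments are required")
    if missing:
        # XOR of the parity and the k - 1 surviving chunks yields the lost chunk.
        rebuilt = reduce(xor_bytes, list(fragments.values()))
        fragments = {**fragments, missing[0]: rebuilt}
    return b"".join(fragments[i] for i in range(k))[:length]
```

With k = 1 this degenerates to two identical copies, which matches the remark that simple replication is an (n, k) scheme with k = 1.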
So, we now consider the following format for every data item to save:
- n fragments.
- Only k fragments are required to reconstruct the data item.
- The data item can have temporal dependencies on some other data (and then the priority of the old data should be increased if the priority of the new data item is higher).
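The format above, together with the rule that old data inherits the priority of a more important new item, could be sketched as follows. `BackupItem` and `raise_dependency_priorities` are hypothetical names, and the dependency graph is assumed to be stored as a dictionary keyed by item identifier.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BackupItem:
    item_id: str
    n: int                      # fragments produced
    k: int                      # fragments needed to reconstruct
    priority: float
    depends_on: List[str] = field(default_factory=list)  # temporal dependencies


def raise_dependency_priorities(items: Dict[str, BackupItem], new_id: str) -> None:
    """Give every item that new_id depends on at least new_id's priority."""
    target = items[new_id].priority
    stack, seen = list(items[new_id].depends_on), set()
    while stack:
        dep_id = stack.pop()
        if dep_id in seen:
            continue
        seen.add(dep_id)
        dep = items[dep_id]
        dep.priority = max(dep.priority, target)   # never lower a priority
        stack.extend(dep.depends_on)               # follow the chain of old versions
```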
3.3 Version tracking
Figure 2: Dependencies can prevent us from freeing some memory space when a new version of a data item arrives.
Given the considered propagation and dissemination schemes, some issues can appear regarding the arrival of a new version of a data item to back up. Firstly, in the presence of dependencies, the old version of a data item should be kept until all dependencies of the new version have been backed up to the resilient server (as shown in figure 2).
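The retention rule amounts to a subset test. In this sketch, `can_drop_old_version` is a hypothetical helper and both arguments are assumed to be sets of item identifiers known to the terminal.

```python
from typing import Set


def can_drop_old_version(new_version_deps: Set[str],
                         saved_on_server: Set[str]) -> bool:
    """The old version may be freed only once every dependency of the
    new version is known to be on the resilient server."""
    return new_version_deps <= saved_on_server
```

For example, if the new version is a delta that depends on items `{"v1", "v2"}` and only `"v1"` has reached the server, the old version must still be kept.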
Conflicts may appear in our system. As a matter of fact, when you back up the data of a mobile device on a fixed station, no conflict can occur since all versions of the same data item are on the same device (the mobile terminal). But, considering our propagation scheme, a conflict may appear (see figure 3) when a data item is backed up on another mobile device and an old version of this same data item is located on the Internet server: if a failure occurs, the client may restore the old version from the server and work on it, generating a conflict with the version backed up on the mobile device. When facing such a situation, our system must use conflict resolution mechanisms such as those in Coda [12] or Bayou [22].
Regarding those issues, the system must keep track of replicas' versions.
Figure 3: Conflict appearing during a restoration.
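One possible way to detect the conflict of figure 3 is to record, for each version, the version it was derived from, and to flag two versions as conflicting when neither descends from the other. This parent-link representation is an assumption made for illustration; the paper does not prescribe it, and it only detects conflicts, leaving resolution to mechanisms such as those of Coda or Bayou.

```python
from typing import Dict, Optional


def is_ancestor(parents: Dict[str, Optional[str]], old: str, new: str) -> bool:
    """True if `old` is `new` itself or one of its predecessors."""
    cur: Optional[str] = new
    while cur is not None:
        if cur == old:
            return True
        cur = parents.get(cur)
    return False


def in_conflict(parents: Dict[str, Optional[str]], a: str, b: str) -> bool:
    """Two versions conflict when neither one descends from the other."""
    return not is_ancestor(parents, a, b) and not is_ancestor(parents, b, a)


# v1 is on the server; v2 (derived from v1) only reached a backup peer.
# After a failure the client restores v1 and edits it, producing v2b.
history = {"v1": None, "v2": "v1", "v2b": "v1"}
```

Here `in_conflict(history, "v2", "v2b")` holds, while `v1` and `v2` are not in conflict.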
4 Estimating backup reliability
In this section, we look at how to estimate in real time the probability that a data item can be correctly restored and how we can use this estimation to order backups.
4.1 Reliability estimation
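For the moment, let us just consider the (n, k) replication scheme. Let $P_i$ be the probability of getting back replica $i$ and $P_i^l$ the probability of being able to get back at least $l$ replicas among the first $i$ ones. Then we can infer (from figure 4):
$$P_i^l = (1 - P_i) \cdot P_{i-1}^l + P_i \cdot P_{i-1}^{l-1} \qquad (1)$$
Figure 4: Graphical proof of equation (1).
In particular,
$$P_k^k = \prod_{l=1}^{k} P_l \qquad (2)$$
$$P_i^l = 0 \quad \text{if } i < l \qquad (3)$$
$$P_i^0 = 1 \qquad (4)$$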
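Equations (1) to (4) translate directly into a small dynamic-programming table. The function name `recovery_table` is hypothetical; the sketch assumes, as the text does, that the replicas are recovered independently of one another.

```python
from typing import List


def recovery_table(p: List[float], k: int) -> List[List[float]]:
    """table[i][l] = probability of getting back at least l of the first i
    replicas, where p[i - 1] is the recovery probability of replica i."""
    n = len(p)
    # Zero initialisation covers equation (3): no l >= 1 replicas out of fewer.
    table = [[0.0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        table[i][0] = 1.0                          # equation (4)
    for i in range(1, n + 1):
        for l in range(1, k + 1):
            # Equation (1): replica i is either lost or recovered.
            table[i][l] = ((1 - p[i - 1]) * table[i - 1][l]
                           + p[i - 1] * table[i - 1][l - 1])
    return table


# Three replicas, two of which are needed (n = 3, k = 2).
table = recovery_table([0.9, 0.8, 0.5], k=2)
```

With these example probabilities, the chance of recovering at least two of the three replicas is 0.85, and `table[2][2]` equals the product 0.9 × 0.8 = 0.72, as equation (2) requires.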
When backing up an additional replica, we can estimate its influence on the probability of getting back the entire data item. That is correct, of course, only if we save each replica on a different terminal, which means that all $P_i$ are independent. We can handle the case of two replicas being backed up on the same terminal by considering that they will have the same probability $P_i$, which is a viable assumption since they use the same transmission channel. Thus, if we save $m$ replicas onto the same terminal at the same time, equation (1) becomes:
$$P_{i+m}^l = (1 - P_{i+1}) \cdot P_i^l + P_{i+1} \cdot P_i^{l-m} \qquad (5)$$
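Equation (5) can be sketched as an update of one row of probabilities. The helper name `add_batch` is hypothetical, and the code makes explicit one convention that the equation leaves implicit: when $l - m \le 0$, the probability of getting back "at least $l - m$" replicas is taken to be 1.

```python
from typing import List


def add_batch(row: List[float], p_terminal: float, m: int) -> List[float]:
    """row[l] = probability of getting back at least l replicas so far.
    Returns the row after saving m replicas on one terminal that is
    recovered with probability p_terminal (all m are lost or kept together)."""
    updated = []
    for l in range(len(row)):
        prev = row[l - m] if l > m else 1.0        # assumed: P^(l-m) = 1 if l <= m
        updated.append((1 - p_terminal) * row[l] + p_terminal * prev)
    return updated
```

Starting from `[1.0, 0.0, 0.0]` (nothing saved, k = 2), saving two replicas on the same terminal with probability 0.9 gives 0.9 for recovering both, whereas two independent terminals at 0.9 each would give 0.81 for recovering both but 0.99 for recovering at least one.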
We consider that saving another replica later on an already used terminal is an independent event because too much time generally elapses between two encounters with the same terminal.
The last things that we must take into account are the temporal dependencies. The probability of correctly restoring a new data item depending on old ones is the probability of restoring the new data item multiplied by the probability of restoring the old data items.
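For a chain of versions, this multiplication can be written recursively. The sketch assumes that dependencies form a chain or a tree (no item is reached through two different paths) and that the individual recoveries are independent; `restore_probability` is a hypothetical name.

```python
from typing import Dict, List


def restore_probability(item: str, own: Dict[str, float],
                        deps: Dict[str, List[str]]) -> float:
    """Probability of restoring `item` together with everything it depends on.
    own[x] is the probability of getting back x's own fragments."""
    prob = own[item]
    for dep in deps.get(item, []):
        prob *= restore_probability(dep, own, deps)
    return prob


own = {"v1": 0.9, "v2": 0.8, "v3": 0.5}
deps = {"v2": ["v1"], "v3": ["v2"]}   # v3 is a delta on v2, itself a delta on v1
```

In this example the newest version `v3` can be restored with probability 0.5 × 0.8 × 0.9 = 0.36, which illustrates how delta-compression chains reduce resilience.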
Considering all those points, we can estimate the probability of a correct recovery during the backup itself.
4.2 Priority and replica scheduling algorithm
We said in section 3.1 that each data item is associated with a priority. That priority is supposed to be established by prior mechanisms (user intervention for instance) and is given as a desired backup resilience (e.g. a probability). We can classify the data to be backed up using a queue ordered by the priority of the data item minus the computed probability of successful backup.
Fp is a priority queue of data to save. The priority field of a data structure is the priority assigned to a data item. Algorithm 1 shows a general algorithm to order the data packets to save. First, if we have data to save, we try to use the terminal until it becomes unreachable. We pull off the queue the first data item which can be saved on terminal t. Then we save the next packet (index i) and recompute the probability of a successful backup. If the probability is not high enough, then the data item is re-enqueued in Fp.
OnMeeting(t)
    while Reachable(t) and DataToSave
        d ← Pull(Fp, CanSave, t)
        if not Exists(d) then break
        i ← NextPacket(d)
        p ← Save(t, GetPacket(d, i))
        proba ← RecomputeP(d, p, t, i)
        if proba < d.priority
            Push(Fp, d, d.priority − proba)
Algorithm 1: When meeting another terminal
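Algorithm 1 can be approximated in Python with the standard `heapq` module. Everything below is a sketch under assumptions: reachability is modelled as a fixed packet `budget`, the `CanSave` test is omitted (every item is assumed to fit on the terminal), the terminal's recovery probability `p_terminal` is given, and an item is not re-enqueued once it has no packets left. The names `Item`, `push` and `on_meeting` are hypothetical.

```python
import heapq
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Item:
    name: str
    priority: float      # desired backup resilience, as a probability
    k: int               # packets needed to rebuild the item
    packets_left: int    # packets not yet sent
    row: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.row:                     # nothing saved yet
            self.row = [1.0] + [0.0] * self.k


def batch_update(row: List[float], p: float, m: int) -> List[float]:
    # Equation (5): m packets on one terminal are lost or recovered together.
    return [(1 - p) * row[l] + p * (row[l - m] if l > m else 1.0)
            for l in range(len(row))]


def push(queue: List[Tuple[float, str, Item]], item: Item) -> None:
    deficit = item.priority - item.row[item.k]
    heapq.heappush(queue, (-deficit, item.name, item))   # largest deficit first


def on_meeting(queue: List[Tuple[float, str, Item]],
               p_terminal: float, budget: int) -> None:
    base: Dict[str, List[float]] = {}   # rows as they were before this meeting
    sent: Dict[str, int] = {}           # packets given to this terminal so far
    while budget > 0 and queue:
        _, _, item = heapq.heappop(queue)
        base.setdefault(item.name, item.row)
        sent[item.name] = sent.get(item.name, 0) + 1
        item.packets_left -= 1
        budget -= 1
        item.row = batch_update(base[item.name], p_terminal, sent[item.name])
        if item.row[item.k] < item.priority and item.packets_left > 0:
            push(queue, item)            # still below the desired resilience
```

Because the queue is re-ordered after each packet, a second item may overtake the first one as soon as the first item's deficit becomes smaller.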
RecomputeP in algorithm 1 can be done using equations (1) and (5). It needs to keep k entries (the values $(P_i^l)_{1 \le l \le k}$) and to recompute them each time a new packet is added (which thus takes k operations). Therefore, we have a realistic real-time algorithm to order replicas.
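The k-entry update can be done in place by walking the entries from $l = k$ down to $l = 1$, so that each step still reads the previous value of entry $l - 1$. This is a sketch of equation (1) only (one packet per terminal); `add_replica_in_place` is a hypothetical name, and the list holds k + 1 values because index 0 stores the constant of equation (4).

```python
from typing import List


def add_replica_in_place(row: List[float], p: float) -> None:
    """row[l] = probability of getting back at least l replicas; row[0] is 1.
    Updates the k entries row[1..k] after one more replica with recovery
    probability p has been saved on a new terminal."""
    for l in range(len(row) - 1, 0, -1):      # descending: row[l - 1] is still old
        row[l] = (1 - p) * row[l] + p * row[l - 1]


row = [1.0, 0.0, 0.0]                         # k = 2, nothing saved yet
for p in (0.9, 0.8, 0.5):
    add_replica_in_place(row, p)
```

After the three updates, `row[2]` is 0.85, the same value as the one obtained from the full recurrence with these probabilities, while only k + 1 numbers are stored per data item.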
5 Replica management by backup terminals
Using neighboring wireless appliances for data backups consumes resources, as mentioned in section 2. In this section, we concentrate on memory usage. If we assign a certain amount of memory to the system, free space can become a problem after a certain time. Firstly, an appliance can need more memory to perform its own tasks. Secondly, replicas more important than those stored on the appliance can be refused due to a lack of memory. So we must determine the criteria for managing replicas on backup terminals.
5.1 Detecting useless replicas
A replica becomes useless either when it has been saved on the destination server or when it is outdated. A replica being outdated means that either its data is no longer useful (like one-month-old temperatures if we just want less than one-week-old ones) or that it has been superseded by new data (like a schedule entry being replaced).
A terminal can know when a replica has been saved on the resilient server or has been updated when either:
1. the backup has been performed by the terminal,
2. the owner has notified the terminal,
3. the server has notified the terminal,
4. a notification has been issued by another terminal.
While cases 1, 2 and 3 can be handled when interacting with either the owner's terminal or the server, case 4 needs propagation and thus can waste communication resources.
The lifetime of a replica can be given by the owner when doing the backup. Besides deleting replicas after a certain time, we can easily piggyback messages saying that a replica is no longer needed on other transmissions, but an efficient protocol has to be designed to do so. Moreover, we must look at the cost of notification propagation and the related security problems.
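The three ways a replica becomes useless (saved on the server, superseded, or past its lifetime) can be combined in one predicate. The sketch assumes that the terminal keeps, from the notifications listed above, a set of (item, version) pairs known to be on the server and the latest version number it has heard of for each item; `Replica` and `is_useless` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class Replica:
    item_id: str
    version: int
    stored_at: float                     # seconds, on the terminal's clock
    lifetime_s: Optional[float] = None   # given by the owner, None if unknown


def is_useless(r: Replica, now: float,
               on_server: Set[Tuple[str, int]],
               latest: Dict[str, int]) -> bool:
    saved = (r.item_id, r.version) in on_server
    superseded = latest.get(r.item_id, r.version) > r.version
    expired = r.lifetime_s is not None and now - r.stored_at > r.lifetime_s
    return saved or superseded or expired
```

A one-month-old temperature reading with a one-week lifetime would be reported as useless by the `expired` test even if no notification ever reached the terminal.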
5.2 Criteria to free memory when needed
After a long disconnection time, memory usage can become a problem either for our system or for the classical usage of the terminal, even if outdated or backed-up replicas are deleted. To handle this issue, some replicas must be deleted based on partial information. When deleting such a replica, several criteria should be taken into account:
- The age of the replica can help the terminal to estimate whether it has been backed up or updated, or whether it has a good chance of being no longer relevant. The time period between connections of the owner of the replica and the mean time before a replica reaches the Internet in the system can be useful to estimate the lifetime.
- The backup resilience and its importance can be used to select the least relevant replicas. When a replica has reached a high resilience compared to its importance, deleting it can be painless. Of course, we want to be fair towards each user and prevent a user from declaring that all his data is important (we can also include more trusted terminals in the comparison).
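One way to combine these criteria is to rank replicas by how far their estimated resilience exceeds the requested importance, using age as a tie-breaker, and to delete from the top of that ranking until enough memory is freed. This ordering is only an illustrative policy, not the one chosen by the authors; it also assumes that the backup terminal has an estimate of each replica's resilience, and it does not address fairness between users. `StoredReplica` and `pick_victims` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoredReplica:
    name: str
    size_bytes: int
    age_s: float
    resilience: float    # assumed estimate of the item's current resilience
    importance: float    # resilience requested by the owner


def pick_victims(replicas: List[StoredReplica], bytes_needed: int) -> List[str]:
    """Names of the replicas to delete, least painful first."""
    ranked = sorted(replicas,
                    key=lambda r: (r.resilience - r.importance, r.age_s),
                    reverse=True)
    victims: List[str] = []
    freed = 0
    for r in ranked:
        if freed >= bytes_needed:
            break
        victims.append(r.name)
        freed += r.size_bytes
    return victims
```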