Information Storage, Caching, Sharing and Replication

Related Work

2.2 Information Storage, Caching, Sharing and Replication

By information storage, we refer to storing information persistently in a geographical area while relying only on MANETs. Due to the dynamics of the environment, this task is challenging and little research has been done in the domain. However, it is pertinent to mention that several works have already proposed infrastructure-dependent solutions that let mobile devices store and access geo-referenced information (i.e. information linked to geographical areas) [Pas97, DKM⁺03, Fit93, CCD⁺04]. These solutions are based on servers and on the internet connection of mobile devices. Works on information caching and sharing in MANETs aim at enabling mobile devices to cache and share information (of any type), but without taking persistence as one of their strong requirements. Finally, data replication aims at preserving the availability of data in spite of network partitions and failures.

2.2.1 Persistent Storage

Persistent storage in MANETs has only become a subject of interest for the research community in recent years. Therefore, only a few works tackle it and there is no precise categorisation of methods. The following paragraphs describe the most relevant works found in the literature.

In [KOK10, OHL⁺11], authors proposed a fully distributed ephemeral content sharing service which depends only on the neighbouring mobile devices, using principles of opportunistic networking. They called this approach Floating Content. A floating content item is attached to an anchor zone, which is defined by a geographical centre point P, a replication range r (radius) and an availability range a (radius). A content item is meaningful only inside its anchor zone and for a predefined lifetime TTL. All of these attributes are defined by the user creating the content.

In order to keep the content “floating” at its anchor zone despite the mobility of nodes, mobile nodes hosting content items try to replicate them onto other mobile nodes they encounter. For this, a 4-phase protocol is employed. First, mobile nodes continuously send discovery beacons to their neighbours. Second, a mobile node A receiving a beacon sends back a summary of its content items. Third, a mobile node B receiving a summary asks A, by sending it a message, for those items it considers relevant. Each content item is considered relevant with probability p_r(h), where h is the distance between B and the centre point P of the content item, and p_r(h) is defined as follows:

\[
p_r(h) =
\begin{cases}
1 & \text{if } h \le r, \\
R(h) & \text{if } r < h \le a, \\
0 & \text{otherwise,}
\end{cases}
\]

where R(h) ∈ [0,1] is some decreasing function that provides the probability of replication outside the replication range but within the availability range. Fourth, all the content items considered relevant by B are sent to B by A. Moreover, content items are deleted either after encountering another node or immediately after leaving the anchor zone. In both cases, a mobile node C decides to delete a content item with probability p_d(h), where h is the distance between C and the centre point P of the content item, and p_d(h) is defined as follows:

\[
p_d(h) =
\begin{cases}
0 & \text{if } h \le r, \\
D(h) & \text{if } r < h \le a, \\
1 & \text{otherwise,}
\end{cases}
\]

where D(h) ∈ [0,1]. When there is still a need for free space in the buffer of a node, the oldest content items are deleted. The previous replication and deletion mechanisms aim at reducing replications and increasing deletions as nodes get farther from the anchor zone. However, authors evaluated both mechanisms with R(h) = 0 and D(h) = 0. Authors claim that such an approach is also resistant to DoS attacks due to the fact that the traffic is only local (no messages sent by remote nodes). Finally, a content item can only be added, and not (explicitly) deleted, by a user.
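As an illustration, the distance-dependent replication and deletion decisions described above can be sketched in Python. The sketch assumes that replication is certain inside the replication range r and impossible beyond the availability range a, and that deletion behaves the opposite way. The linear ramps used by default for R(h) and D(h), and all function names, are assumptions made for this example and are not taken from [KOK10, OHL⁺11].

```python
import random


def ramp(h, r, a):
    """Linear ramp from 0 at h = r to 1 at h = a (illustrative assumption)."""
    if a <= r:
        return 1.0 if h > r else 0.0
    return min(1.0, max(0.0, (h - r) / (a - r)))


def p_replicate(h, r, a, R=None):
    """Probability that a node at distance h from P requests a content item."""
    if h <= r:
        return 1.0  # inside the replication range
    if h <= a:
        # between r and a: R(h), decreasing in h (linear by default here)
        return R(h) if R is not None else 1.0 - ramp(h, r, a)
    return 0.0  # beyond the availability range


def p_delete(h, r, a, D=None):
    """Probability that a node at distance h from P deletes a content item."""
    if h <= r:
        return 0.0
    if h <= a:
        return D(h) if D is not None else ramp(h, r, a)
    return 1.0


def decide(probability, rng=random):
    """Draw a yes/no decision that is True with the given probability."""
    return rng.random() < probability
```

Passing `R=lambda h: 0.0` and `D=lambda h: 0.0` plugs the constant choices R(h) = 0 and D(h) = 0 mentioned in the text into this sketch.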

In [LKG⁺09, LKG⁺10], authors study the storage capabilities of transient information in vehicular ad hoc networks. They define transient information as information produced by a spontaneous event, like a traffic jam or an accident, that will eventually disappear. Such events are called Events of Interest (EOI) and they are scoped to a limited geographical region called the Region of Interest (ROI). For the storage, authors propose that Vehicular Mesh Networks (VMesh), composed of cars which are in communication range, pick up an EOI whenever it happens. The transient information contained in an EOI is disseminated along the whole VMesh. Whenever the VMesh is leaving the ROI, it tries to pass the information to another VMesh that stays in the ROI. How the information is passed is not clearly defined by the authors, but it relies on UDP messages over 802.11b. Authors measured how long transient information can stay in a geographical region; they called this the Mean Time To Information Loss (MTTIL). Thus, transient information does not have a pre-defined lifetime.

In [LCM09b, LCM09a], authors present a geographical information dissemination mechanism for VANETs with the particularity of keeping the information persistent for some duration. A fixed number of replicas of the information are created at the source of the information and, using a geographical routing algorithm, they are routed towards their persistent area (i.e. where the information must stay). Each replica is associated with a home zone located inside the persistent area, which contains several home zones. The geographical routing algorithm, which is also proposed by the authors, makes use of infostations to route replicas close to their home zones. Once in an infostation, a replica is routed towards its home zone by jumping from one car to another, taking advantage of the planned routes of cars (information which is found in the navigation systems of cars). Once the replica is in its home zone, it tries to stay as close as possible to it by jumping to another car whose location and speed indicate that it is closer to the home zone or that it will reach it faster. A replica becomes aware of the presence and location of other cars through a beaconing mechanism in which each car periodically broadcasts its current location, speed and direction. Cars interested in receiving information periodically broadcast their interests. Whenever a car that receives the interests of another car has a replica that matches them, it sends the replica to that car. Moreover, cars that heard about a replica from infostations, or that received a replica satisfying a subscription, opportunistically reply to the subscriptions of other cars.
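The hand-over rule used inside the home zone (jump to a car that is closer to the home zone or that will reach it faster) can be sketched as follows. This is only a minimal illustration under stated assumptions: the `Beacon` fields, the use of a single home-zone centre point, the straight-line estimate of arrival time and the tie-breaking order are all hypothetical and are not specified in [LCM09b, LCM09a].

```python
import math
from dataclasses import dataclass


@dataclass
class Beacon:
    """Periodic beacon of a car: position and velocity (assumed format)."""
    car_id: str
    x: float
    y: float
    vx: float
    vy: float


def distance(b, home):
    return math.hypot(home[0] - b.x, home[1] - b.y)


def eta(b, home):
    """Straight-line time to reach home; infinite if the car is not approaching."""
    d = distance(b, home)
    if d == 0:
        return 0.0
    # speed component pointing towards the home-zone centre
    closing = ((home[0] - b.x) * b.vx + (home[1] - b.y) * b.vy) / d
    return d / closing if closing > 0 else math.inf


def pick_carrier(current, neighbours, home):
    """Keep the replica, or hand it to a neighbour that is closer or faster."""
    candidates = [
        n for n in neighbours
        if distance(n, home) < distance(current, home)
        or eta(n, home) < eta(current, home)
    ]
    if not candidates:
        return current
    # assumed preference: earliest arrival first, then smallest distance
    return min(candidates, key=lambda n: (eta(n, home), distance(n, home)))
```

For example, a stationary carrier 100 m from the home-zone centre would hand the replica to a car 200 m away that approaches at 20 m/s, because that car is expected to arrive first.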

In [CC07], authors propose a collaborative location-based annotation system called Ad-Loc. Persistent virtual notes are tied to geographical locations, which are called Areas of Relevance (AOR). In order to keep notes persistent, mobile devices constantly cache the notes according to one of the following caching policies: basic, publish, periodic, location-aware periodic, and all. In the basic policy, a mobile device broadcasts a note whenever the note matches a query previously received from another device. In the publish policy, a mobile device broadcasts a note just after it has been published. In the periodic policy, a mobile device periodically broadcasts notes, giving priority to the less recently cached ones. The location-aware periodic policy is similar to the periodic policy, but only notes having an AOR relevant to the current location of the mobile device are broadcast. When a mobile device wants to match some interests, it first checks its local buffers for notes that match them and then broadcasts a query (containing its interests) to its 1-hop neighbours. Neighbours that have a note matching the received query broadcast a reply message containing the note. Several replies might exist for a query, and they are sorted using a ranking function. An interesting point is that the AOR of a note may grow if the note is requested by a mobile device located outside the AOR. Therefore, the geographical scope of a note is dynamic: it tends to grow and never to shrink.

In [FSWF06, WSH⁺06], authors introduced the concept of Hovering Data Clouds (HDC). An HDC is a data or information structure existing in an ad hoc manner in a precise geographical location as a result of the context (i.e. the data results from an event that happens in the environment). Although HDC is presented as a general concept, authors focus on one application: detecting traffic jams. Authors describe precise pseudo-code for detecting a traffic jam. The algorithm makes use of HDC structures which are defined as part of the pseudo-code. Thus, HDCs are not part of an external independent middleware service (i.e. an independent algorithm); rather, they are defined inside the algorithm of the problem-specific solution.

2.2.2 Information Caching and Sharing

The works presented in this subsection aim at caching and sharing information between mobile nodes. However, in contrast to persistent storage, their algorithms do not focus on keeping the information persistent despite the dynamics of the environment. A survey on data sharing systems can be found in [MD06].

PeopleNet [MSN05] describes a mobile wireless virtual social network which mimics the way people seek information via social networking (i.e. asking their friends, neighbours, etc.). One interesting idea proposed by the authors is that of bazaars, which are geo-referenced communities of mobile devices. Each of these bazaars stores information items of a precise category like sports, travelling, sales, etc. When someone posts a query or an information item about, for instance, sports, the query/information item is routed towards the respective bazaar through the cellular infrastructure (i.e. additional software is added to the cellular infrastructure for the relaying). Once the query/information item is in the appropriate bazaar, it is propagated through the community of mobile devices using peer-to-peer communications. Thus, PeopleNet is a hybrid system that uses both infrastructure and ad hoc networking. When a query/information item arrives at a mobile device that holds a matching query or information item, the mobile device sends a reply to the device that initiated the respective query through the infrastructure (text message or email). Authors proposed two strategies to propagate a query/information item through a community of mobile devices. In the random spread strategy, when two mobile devices are in contact, they randomly exchange one query/information item for another. However, they keep the query/information item that they have exchanged. Thus, the buffer of mobile devices tends to fill up. If a mobile device needs to store a new query or information item and its buffer is full, it removes an older one from its buffer. In the random swap strategy, the behaviour is similar to the random spread strategy, with the difference that mobile devices do not keep the query/information item that they have exchanged. Thus, the population of queries/information items does not increase. The evaluations and the analytical study carried out by the authors showed that the random swap strategy performs better than the random spread strategy in terms of matching rate and delay.
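The difference between the two propagation strategies can be illustrated with a short Python sketch of a single contact between two devices. Buffers are modelled as plain lists ordered from oldest to newest; the function names, the `capacity` parameter and the choice of evicting the oldest entry are assumptions made for this example rather than details taken from [MSN05].

```python
import random


def _store(buffer, item, capacity):
    """Append item, evicting the oldest entry if the buffer is full."""
    if item in buffer:
        return
    if len(buffer) >= capacity:
        buffer.pop(0)
    buffer.append(item)


def contact(buf_a, buf_b, strategy, capacity, rng=random):
    """Exchange one randomly chosen item in each direction."""
    if not buf_a or not buf_b:
        return
    item_a = rng.choice(buf_a)
    item_b = rng.choice(buf_b)
    if strategy == "spread":
        # random spread: each side keeps its own copy, so copies multiply
        _store(buf_b, item_a, capacity)
        _store(buf_a, item_b, capacity)
    elif strategy == "swap":
        # random swap: items change hands, so the total population is constant
        buf_a.remove(item_a)
        buf_b.remove(item_b)
        buf_a.append(item_b)
        buf_b.append(item_a)
    else:
        raise ValueError("unknown strategy: " + strategy)
```

With one item per device, a "spread" contact leaves both devices holding both items, whereas a "swap" contact leaves each device holding only the other's item.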

In [PS01], authors proposed a system called 7DS whose main goal is to provide Internet access (to web pages) to hosts which are not necessarily connected to the Internet. The hosts might be mobile or fixed, and the system relies on peer-to-peer (P-P) and client-server (C-S) communications. When a host A is looking for a web page and is not connected to the Internet, it sends a query to other hosts using a 1-hop 802.11 (working in ad hoc mode) multicast message. If a host B of the multicast group has the requested web page in its cache, it sends a report. Host A waits for a while so that it can receive several reports. It then sends a GET message (HTTP) to one of the hosts that sent a report, and that host sends back the web page. This means that any host can act as an HTTP server. This way of querying is called “active query”. There is another querying method, called “passive query”, in which some hosts called 7DS-enabled servers, normally those without severe power constraints, periodically send advertisements. These advertisements contain the type of information or application that the 7DS-enabled server supports. A mobile host that receives an advertisement may decide, if it requires some data, to send a query directly to the 7DS-enabled server. In order to save energy, a host can work in one of three modes depending on its power level, ranging from fully collaborative to not participating at all.

Finally, other information sharing systems that make use of mobile ad hoc networking and/or Internet access are proposed in [HYK⁺10] and [KPB⁺04, CKP06].

2.2.3 Data-Centric Storage

In Subsection 2.1.3 we introduced the problem of gathering data about events in WSNs, and we presented several approaches to route events to nodes interested in these events (data-centric routing). In these approaches, however, interests must be propagated throughout the entire network, as events may be detected at any place, which is not known in advance. In this subsection, we review another way of tackling the problem of gathering events, called Data-Centric Storage (DCS). The key idea is to first map an event to a geographical location and then to store the event (i.e. all the data of an event) in a node or nodes located in or around the mapped geographical location. The mapping between events and locations is generally done through hash functions. When a node wants to gather events matching an interest, it applies the same hash function to the interest and routes the interest towards the resulting location instead of flooding the entire network.
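The mapping from an event or interest to a geographical location can be sketched as follows. This is a minimal illustration, assuming a rectangular deployment area, SHA-256 as the hash function and a simple nearest-node rule for choosing where to store; none of these choices, nor the function names, come from a specific DCS proposal.

```python
import hashlib
import math


def home_location(key, width, height):
    """Hash a key to a point (x, y) inside a width x height area."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # two independent 64-bit slices of the digest give the two coordinates
    x = int.from_bytes(digest[:8], "big") / 2**64 * width
    y = int.from_bytes(digest[8:16], "big") / 2**64 * height
    return (x, y)


def closest_node(nodes, location):
    """Return the id of the node nearest to location.

    nodes maps a node id to its (x, y) position.
    """
    return min(nodes, key=lambda n: math.dist(nodes[n], location))
```

Because the node that stores an event and the node that issues an interest both evaluate `home_location` on the same key, they obtain the same point without exchanging any message, which is what removes the need for flooding.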

This technique of geographical hash functions takes inspiration from Distributed Hash Tables (DHTs) like Chord, Tapestry, CAN and Pastry [RD01, RFH⁺01, SMK⁺01, ZKJ01]. In a DHT, a hash function is used to map data to nodes. In DCS, however, the idea is to map data to geographical locations instead of nodes. The advantage of this mapping approach is that geographical locations can be used as stable addresses (geographical locations are not going to move). Moreover, in WSNs a global addressing mechanism is generally not established due to the quantity of sensors (thousands to millions), and some sensors may move.

While most of the approaches presented in this subsection belong to the domain of WSNs, there are some works that tackle a similar problem in MANETs. In this case, we generalise the concept of events to that of any data, and that of interests to any query. In MANETs, mobility plays an important role because nodes that store events or data may leave their current geographical location and take the data with them. Thus, additional mechanisms are required to be resilient against mobility. Before reviewing the different works proposed for data-centric storage, it is important to clarify that events or data are structured as (key, value) pairs and that interests or queries are expressed as keys. A survey and a comparison of DCS approaches for WSNs and MANETs can be found in [DH12].

In [RKY⁺02, SRK⁺03], authors were the first to introduce DCS for WSNs by proposing the Geographical Hash Table (GHT). Whenever a node detects an event, which is encoded as a (key, value) pair, the node computes loc = H(key), where H is a hash function mapping the key to a geographical location loc. The location loc is known as the home location of the event. The node then routes the event towards loc using the GPSR position-based routing protocol [KK00]. It is important to notice that the event is routed to a precise location, in this case loc, instead of a precise node. The event will eventually get as close as possible to loc, and GPSR will switch to a variant of its perimeter mode. In this mode, the event is forwarded using the right-hand rule from one node to another, trying to reach loc. The event will eventually make a tour around loc and come back to the node that triggered the perimeter mode. This node is set as the home node, and all nodes that were visited in perimeter mode are defined as the home perimeter. While touring around loc, it may happen that a node other than the one that triggered the perimeter mode is closer to loc. If such is the case, this node becomes the home node and triggers the variant of perimeter mode again. The event is also stored in the home node and in the home perimeter nodes during the previously described process. In this way, whenever a node has an interest in gathering events having a key key′, the node computes loc′ = H(key′) and then routes the interest to loc′ using GPSR. The interest will eventually reach the home node or one of the nodes of the home perimeter. Such a node then sends back a response to the node that initiated the interest. In order to ensure resilience to failures and mobility, authors proposed a Perimeter Refresh Protocol (PRP). In PRP, the home node of an event periodically triggers the forwarding of the event throughout the home perimeter using the variant of perimeter mode. In this way, a node belonging to the home perimeter receives the event (refresh), a node that is no longer part of the home perimeter does not, and new nodes that are part of the home perimeter store the event. In addition to this refresh of an event and the update of the members of the respective home perimeter, a new home node may also be elected if there is a node located closer to the home location. A perimeter node resets a take-over timer T_t each time it receives a refresh packet. At the expiration of T_t, a node initiates a refresh round. A node also resets a death timer T_d after receiving a refresh packet; at the expiration of T_d, a node removes the event. In general, T_d > T_t > T_h, where T_h is the period at which the home node triggers refreshes. Finally, in order to avoid overloading the nodes around a certain location, a structured replication mechanism is proposed. The area around a home location is divided into a hierarchy of squares similar to the one proposed in the location service GLS [LJDC⁺00]. In this hierarchy, each n-order square contains exactly four (n−1)-order squares, forming what is called a quadtree. These squares are considered as mirrors in which events are stored. In this way, a node willing to store an event computes the closest mirror and sends the event towards that mirror. Whenever a node wants to submit an interest, the interest is sent to the 0-order mirror and is then propagated recursively towards the 1-order, 2-order, etc. mirrors. Matching events are then propagated back in a similar way but in the reverse direction.
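The timer behaviour of a home perimeter node under PRP can be sketched as a small state holder. This is a simplified illustration, not the protocol as specified in [RKY⁺02, SRK⁺03]: the class and method names, the explicit `now` argument and the returned action strings are assumptions of this example, and the sketch does not model the actual forwarding of refresh packets.

```python
from dataclasses import dataclass


@dataclass
class PerimeterNode:
    """Per-event timer state of a node on the home perimeter."""
    t_takeover: float  # T_t: silence after which the node starts a refresh
    t_death: float     # T_d: silence after which the node drops the event
    last_refresh: float = 0.0
    has_event: bool = True

    def on_refresh(self, now):
        """A refresh packet arrived: store the event and reset both timers."""
        self.last_refresh = now
        self.has_event = True

    def tick(self, now):
        """Return the action this node should take at time `now`."""
        if not self.has_event:
            return "idle"
        elapsed = now - self.last_refresh
        if elapsed >= self.t_death:
            self.has_event = False
            return "drop_event"
        if elapsed >= self.t_takeover:
            return "initiate_refresh"
        return "wait"
```

Choosing T_d larger than T_t gives a node the chance to take over and refresh the perimeter itself before it discards the event, which is consistent with the ordering of the timers given in the text.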

In [ARK⁺05], authors propose a variant of GHT called Cell Hash Routing (CHR). In this variant, the same principle of home node and home perimeter is applied to cells instead of nodes. The geographical space is divided into rectangular cells. Nodes know their location (e.g. via GPS) and are therefore able to determine to which cell they belong. The size of each cell is small enough to enable any node located in a cell to communicate with nodes in all eight neighbouring cells. Cells are numbered incrementally from left to right and from top to bottom. Thus, the result of H(key) is a cell number, where key is the key of some data. In this way, GPSR and PRP work on cells as if they formed a grid of fixed nodes. Inside a cell, all nodes may store data, which provides a certain level of redundancy.

Another interesting work is presented in [DMR06]. Authors proposed a variant of GHT for MANETs. First, instead of relying on a home node and home perimeter, the concept of a partial perimeter is used. Second, data is stored in server nodes which are not part of this partial perimeter. Some specific geographical
