Storage and Retrieval Algorithms

4.3 Storage Algorithm

As discussed in Section 4.1, this chapter presents a feasibility study of hovering information by proposing storage and retrieval algorithms. Once a mobile application has some information to store in a geographical region, it needs mechanisms to create the hoverinfo containing that information, to store it in the region, and to keep it persistent and accessible according to the life cycle defined by the application. We call storage algorithms the mechanisms related to the creation, storage and persistence of hovering information, and retrieval algorithms the mechanisms related to its accessibility. Although we distinguish between storage and retrieval algorithms, the persistence and accessibility of hovering information are strongly related, as we will see in the rest of the chapter. Retrieval algorithms therefore depend strongly on storage algorithms.

In this section, we propose a storage algorithm, called Self-Adaptive Active/Passive Replicas Storage Algorithm (SAPRESA), which aims at creating hoverinfos and storing them in geographical regions persistently, according to their life cycle. The algorithm has a flooding-like nature: it continuously tries to populate the anchor area of a hoverinfo with replicas by taking advantage of the broadcast nature of wireless channels. To avoid the high network overhead inherent to such an approach, we also propose mechanisms to control and reduce it, which makes the algorithm different from a typical flooding approach.

Before giving an overview of SAPRESA and describing its mechanisms in detail in the following subsections, it is important to highlight four aspects. First, when describing the mechanisms of SAPRESA, we often attribute to replicas the capability of performing actions (this is simply how we express our ideas in English). In reality, these actions are performed by the mobile nodes hosting the replicas. For instance, when we say that a replica replicates, we mean that its hosting mobile node creates a duplicate of the replica and broadcasts this duplicate to its 1-hop neighbouring nodes. We have chosen this way of explaining the algorithms and concepts in order to emphasise the autonomous nature of replicas. Second, all the mechanisms of SAPRESA (and those of the retrieval algorithms) described in this chapter use broadcast messages only. Thus, whenever the description of a mechanism does not state this explicitly, a broadcast message must be assumed. For instance, if we say that a replica sends an IPR message, a broadcast message is used (not a unicast or multicast message). Third, each replica contains the content of the hoverinfo to which it belongs plus additional data (the meta-data of the replica). This meta-data contains the attributes of the hoverinfo (anchor radius, anchor location, creation time, time-to-life, absolute time, etc.) and additional data used by SAPRESA (density estimation, PAR value, etc.). Fourth, the mechanisms of SAPRESA (unlike those of the retrieval algorithms) are generally described for a single hoverinfo with many replicas. For multiple hoverinfos, all these mechanisms are deployed independently: two replicas h_p^{H₁} and h_q^{H₂}, belonging to two different hoverinfos H₁ and H₂, may co-exist in the same mobile node, and each of them applies all the mechanisms of SAPRESA independently.

4.3.1 Overview

In this subsection we give an overview of the mechanisms that make up SAPRESA. Each of them is described in detail in the following subsections.

[Creation and storage of a hoverinfo] When a mobile application needs to store some information in its current surrounding geographical region, the mobile node on which the application is running stores the information as a hoverinfo. To do so, the mobile node creates the first replica of the hoverinfo and stores it in its local buffer, thereby creating the hoverinfo itself.

[Populating mechanism] The next step is to populate the entire anchor area with replicas in order to ensure persistence and accessibility. The populating mechanism therefore floods the mobile nodes inside the anchor and availability areas with replicas. At the end of this stage, all or most of these mobile nodes (provided the network is not partitioned) host a replica of the hoverinfo. If nothing else is done, the replicas will eventually leave the anchor area and never come back, or disappear because their hosting nodes fail or depart permanently. For this reason, we need a mechanism that continuously keeps the population of replicas at a suitable level, namely the level required to ensure persistence and accessibility.

[Inside proactive replication mechanism] The inside proactive replication mechanism continuously strives to keep the anchor area populated with enough replicas. The principle is that a replica located inside the anchor area periodically replicates onto its neighbouring nodes; in other words, its hosting mobile node broadcasts a message containing the replica to its 1-hop neighbours. Each node receiving the message stores the replica in its local buffer, provided it is not already stored there. Although this mechanism is quite effective thanks to its flooding nature, it introduces a high network overhead, which may even be counterproductive because of interference and collisions in the wireless channel.

[Active and passive replicas] To avoid this undesirable behaviour, each replica has a status, which can be active or passive, and only active replicas are allowed to replicate. By keeping the number of active replicas at a minimal but sufficient level, the storage algorithm avoids a high message exchange complexity while still ensuring persistence and accessibility.

[Density and probability of active replicas] To regulate the population of active replicas, each replica continuously estimates the density of active replicas by counting the duplicate replicas it receives, i.e. those sent by neighbouring active replicas. Based on this estimation, a replica computes or updates a probability of becoming an active replica, which is sent along with the duplicate of the replica when replicating. A node receiving the duplicate sets it as active or passive according to this probability. If the received replica is already stored, the node keeps or changes the status of the stored replica, to active or passive, with the same received probability. Thus, each replica located inside the anchor area continuously switches between passive and active (each time it receives a duplicate replica). The idea behind this probability is that the population of active replicas grows when there are not enough of them (low density of active replicas) and shrinks when there are too many (high density of active replicas), so that it reaches and maintains an ideal number of active replicas: the number required to ensure persistence and accessibility while keeping the network overhead low.
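The status update performed on reception of a duplicate replica can be sketched as follows. This is a minimal illustration, not the thesis implementation: the names `Replica`, `on_duplicate_received` and `buffer` are hypothetical, the probability of becoming active is assumed to arrive as a plain number in the duplicate's meta-data, and the position check (inside the anchor area) is left out for brevity.

```python
import random
from dataclasses import dataclass


@dataclass
class Replica:
    # Hypothetical minimal replica record: only the fields this sketch needs.
    hoverinfo_id: str
    active: bool = False


def on_duplicate_received(buffer, hoverinfo_id, par, rng=random):
    """Handle a duplicate replica sent by a neighbouring active replica.

    buffer maps hoverinfo ids to locally stored replicas; par is the
    probability of becoming active carried with the duplicate.
    """
    replica = buffer.get(hoverinfo_id)
    if replica is None:
        # Not stored yet: store the received replica in the local buffer.
        replica = Replica(hoverinfo_id)
        buffer[hoverinfo_id] = replica
    # Newly stored or already stored, the status is redrawn with probability par.
    replica.active = rng.random() < par
    return replica
```

Because the status is redrawn on every reception, a replica that hears many duplicates has many opportunities to turn passive before its own replication time arrives.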

[Inhibition phenomenon] However, not all active replicas replicate. Since an active replica (like a passive one) keeps or changes its status each time it receives a duplicate replica, only those active replicas that succeed in staying active until their replication time arrives (active replicas replicate periodically as a result of the inside proactive replication mechanism) actually replicate. We call this the inhibition phenomenon.

[Calibration mechanism] The first active replicas are created during the flooding process triggered by the populating mechanism. During this process, some of the replicas are set as calibrator replicas with a fixed probability and the others as passive replicas. The calibrator replicas replicate some time later. All replicas, calibrators and passives alike, then decide whether to become active or passive based on the number of duplicates received from the calibrator replicas.

[Leaving out reactive replication mechanism] While the inside proactive replication mechanism aims at keeping the anchor area populated with replicas to ensure persistence and accessibility, in scenarios with a low density of nodes there may not be enough replicas simply because there are not enough nodes. Under such conditions, the leaving out reactive replication mechanism comes into play: a replica, active or passive, that is leaving its anchor area may decide to replicate when it estimates that there are very few replicas, or none besides itself, inside the anchor area. More precisely, a leaving replica bases this decision on the density of active replicas, which is also used by the inside proactive replication mechanism as discussed above. As replicas continuously switch between active and passive, a very small population of active replicas indicates a very small population of replicas. Thus, a leaving replica, active or passive, replicates whenever its estimate of the density of active replicas is too low.

[Self-activation mechanism] Another critical situation that may arise in scenarios with a low density of nodes is having very few active replicas, or none at all. The accessibility degree of the hoverinfo then decreases, because inside proactive replications (the result of the inside proactive replication mechanism) become very rare or stop altogether. One way of recovering from such a situation is the leaving out reactive replication mechanism mentioned previously: a passive leaving replica replicates and creates new replicas in the neighbouring nodes. However, it may happen that no replica is created because there are no neighbouring nodes (low density scenario), and therefore no active replica is created either. To overcome this, each passive replica has a mechanism to self-activate, i.e. to become active, after a period during which it receives no duplicate replica while being inside the anchor area.
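The two low-density safeguards described above (leaving out reactive replication and self-activation) reduce to two small decisions, sketched below. The overview gives neither the threshold below which the density is "too low" nor the length of the silent period, so `low_density_threshold` and `silence_timeout` are hypothetical parameters, as are the record and function names.

```python
from dataclasses import dataclass


@dataclass
class ReplicaState:
    # Hypothetical per-replica state kept by the hosting node.
    active: bool
    inside_anchor: bool          # position status at the previous check
    last_duplicate_time: float   # time of the last received duplicate replica
    density_estimate: float      # current estimate of the density of active replicas


def leaving_should_replicate(state, now_inside, low_density_threshold):
    """Leaving out reactive replication: a replica (active or passive) that has
    just left the anchor area replicates if its density estimate is too low."""
    leaving = state.inside_anchor and not now_inside
    return leaving and state.density_estimate < low_density_threshold


def maybe_self_activate(state, now, silence_timeout):
    """Self-activation: a passive replica inside the anchor area becomes active
    after silence_timeout time units without receiving any duplicate."""
    silent_for = now - state.last_duplicate_time
    if not state.active and state.inside_anchor and silent_for >= silence_timeout:
        state.active = True
    return state.active
```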

[Death of a hoverinfo] Finally, each replica knows the time at which the hoverinfo was created and the time at which it must die (given as an absolute time or as a time-to-life duration). The first replica holds this data (creation and death time) as meta-data, which is created and initialised by the node that created the hoverinfo, and this meta-data is propagated with the replica each time it is replicated. Whenever a replica notices, after checking the clock of its hosting node, that the lifetime of the hoverinfo has reached its end, it removes itself from its hosting node. All existing replicas do the same in a non-synchronised way, resulting in the eventual death of the hoverinfo. While this is the normal way for a hoverinfo to die (disappear), the premature disappearance of its replicas (due to a very low density of nodes, failures, etc.) also causes the hoverinfo to disappear.

4.3.2 Storage, Populating and Removal

As already mentioned in the overview of the storage algorithm (cf. Subsection 4.3.1), when a mobile application needs to store a new hoverinfo, the mobile node on which the application is running creates a first replica of the hoverinfo and hosts it in its local buffer. At this point, the hoverinfo is alive and stored in its anchor area (we assume the mobile node is at the centre of the anchor area); the hoverinfo is thus persistent. However, its accessibility degree can be high or low depending on whether or not the only existing replica covers the whole anchor area (the area covered by a replica is defined as the area covered by the communication range of its hosting node, cf. Subsection 3.2.3).

The next stage of SAPRESA is to populate the entire anchor area with replicas in order to increase the accessibility degree and reinforce persistence. This populating mechanism is implemented as a scoped flooding, using the distance-based flooding scheme proposed in [NTCS99] to avoid broadcast storms. The node hosting the first replica broadcasts a duplicate of the replica to its neighbouring nodes, which receive this duplicate and store it if it is not already stored. Each receiving mobile node then schedules a re-broadcast of the received replica instead of re-broadcasting it immediately. In the meantime, the mobile node may receive duplicate replicas from neighbouring mobile nodes, which allows it to decide whether or not to go ahead with the re-broadcast. The re-broadcast is scheduled in a time slot chosen randomly out of CW_popu (populating contention window) slots, where each slot has a duration of ST_popu (populating slot duration) milliseconds. The re-broadcast is cancelled whenever the distance to the neighbouring node that sent the duplicate replica is below a distance threshold D_popu. Otherwise, the mobile node re-broadcasts a duplicate of the replica at the scheduled time. The distance threshold D_popu is set to 29.40% of the communication range of mobile nodes; this value comes from the results presented in [NTCS99]. The values of the parameters CW_popu and ST_popu are presented and discussed in Subsection 4.5.2, when tuning SAPRESA.

In order to bound the flooding to the anchor area (scoped flooding), only mobile nodes located inside the anchor area are considered for re-broadcasting a replica; the others simply host the replica and do not schedule any re-broadcast. In this way, at the end of the populating stage, most of the nodes inside the availability area¹ host a replica, provided the network is not partitioned.
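The scheduling and cancellation rules of the populating mechanism can be sketched as below. This is only an illustration under stated assumptions: positions are 2-D coordinates in the same unit as the communication range, slots are numbered from 0 and the re-broadcast fires at the start of the chosen slot, and all function and argument names are hypothetical. The 29.40% ratio is the threshold quoted above.

```python
import math
import random


def inside(position, centre, radius):
    """True if position lies within a disc of the given centre and radius."""
    return math.dist(position, centre) <= radius


def rebroadcast_delay_ms(cw_popu, st_popu_ms, rng=random):
    """Delay before re-broadcasting: a uniformly chosen slot out of CW_popu
    slots of ST_popu milliseconds each (assumed to fire at the slot start)."""
    return rng.randrange(cw_popu) * st_popu_ms


def keep_rebroadcast(own_pos, sender_pos, anchor_centre, anchor_radius,
                     comm_range, d_popu_ratio=0.294):
    """Decide whether a re-broadcast should remain scheduled after a
    duplicate replica was received from the node at sender_pos."""
    if not inside(own_pos, anchor_centre, anchor_radius):
        # Scoped flooding: nodes outside the anchor area only host the replica.
        return False
    d_popu = d_popu_ratio * comm_range
    # Distance-based scheme: cancel when the sender is closer than D_popu.
    return math.dist(own_pos, sender_pos) >= d_popu
```

For example, with a communication range of 100 units, D_popu is 29.4 units, so a duplicate heard from a node 10 units away cancels the pending re-broadcast, while one heard from 40 units away does not.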

Finally, a mobile node periodically checks whether any of the replicas stored in its buffer should be removed. The removal decision for a replica depends on its remaining lifetime, which is computed from the current time (clock of the mobile node), the creation time of the hoverinfo, and the absolute lifetime or time-to-life attribute of the hoverinfo. All these attributes are part of the meta-data of a replica. When a node removes a replica from its buffer, it cancels any pending event (see the following subsections) related to the removed replica.
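A periodic removal check along these lines might look as follows. The sketch makes several assumptions that the text does not settle: times are plain numbers on the node's clock, the absolute death time takes precedence over the time-to-life duration when both are set, a replica with neither attribute never expires, and pending events are kept in a dictionary keyed like the buffer. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplicaMeta:
    # Hypothetical subset of the replica meta-data used for removal.
    creation_time: float
    time_to_life: Optional[float] = None    # duration counted from creation
    absolute_time: Optional[float] = None   # absolute death time


def remaining_lifetime(meta, now):
    """Remaining lifetime of a replica according to the hosting node's clock."""
    if meta.absolute_time is not None:
        return meta.absolute_time - now
    if meta.time_to_life is not None:
        return meta.creation_time + meta.time_to_life - now
    return float("inf")  # assumption: no expiry attribute means no expiry


def purge_expired(buffer, pending_events, now):
    """Remove expired replicas and cancel their pending events.

    Returns the ids of the removed replicas.
    """
    expired = [rid for rid, meta in buffer.items()
               if remaining_lifetime(meta, now) <= 0]
    for rid in expired:
        del buffer[rid]
        pending_events.pop(rid, None)  # cancel timers tied to this replica
    return expired
```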

Figure 4.4 illustrates the populating mechanism described above. In Figure 4.4(a), the mobile node n₁ on which the mobile application is running creates and hosts a first replica h₁, thereby creating and storing the hoverinfo H. Figure 4.4(b) shows the end of the populating phase, at which all mobile nodes located inside the availability area, and thus the anchor area, host a replica.

Figure 4.4: Populating. (a) The first replica h₁ of the hoverinfo H is created and hosted by the node n₁. (b) The anchor area and availability area are populated by multiple replicas.

¹ Let us recall that the availability area is the union of the anchor area and a contour around the anchor area. This contour is a ring whose internal radius is the anchor radius and whose external radius is the sum of the anchor radius and the communication range of mobile nodes.

4.3.3 Active/Passive Replicas and Calibration

After the populating phase (cf. Subsection 4.3.2), the availability area, and thus the anchor area, is populated by several replicas of the hoverinfo. The next step is to ensure the persistence and accessibility of the hoverinfo by continuously replicating these replicas in a pro-active and re-active way, as described in Subsections 4.3.4 and 4.3.7. As the approach essentially relies on replication that takes advantage of the broadcast nature of wireless channels, we need a mechanism to regulate the number of replications (and thus of broadcast messages sent) in order to avoid a high network overhead. The regulating mechanism we propose in this thesis is based on the distinction between two types of replicas, active and passive, and it allows only active replicas to replicate.

More precisely, any replica has a status that can be either active or passive. SAPRESA regulates the number of messages by allowing only active replicas to replicate and by keeping the number of active replicas small. This number must be low enough to avoid introducing a high network overhead and high enough to ensure persistence and accessibility. The status of a replica is not permanent: it can change from active to passive or vice versa under precise conditions defined by the pro-active and re-active replication mechanisms (cf. Subsections 4.3.4 and 4.3.7). The populations of active and passive replicas therefore evolve constantly, as active replicas become passive and passive replicas become active.

The initial number of active replicas is defined after the populating phase through a calibration process. During the populating phase described in Subsection 4.3.2, each replica inside the anchor area is set as a calibrator replica with probability p_cali (calibrator replica probability, a parameter of SAPRESA) or as a passive replica with probability 1 − p_cali. Replicas located outside the anchor area are always set as passive replicas. In this way, at the end of the populating phase, the ratio of the number of calibrator replicas to the total number of replicas inside the anchor area is approximately p_cali. Afterwards, each replica located inside the anchor area, calibrator or not, schedules a calibration action in t_cali (calibration action timer) milliseconds. Moreover, each calibrator replica schedules a calibration message in t_send-cali (calibration message timer) milliseconds, where t_send-cali < t_cali. The aim of a calibration message is to inform the neighbouring replicas (i.e. the replicas stored in neighbouring nodes) of the presence of a calibrator replica. Each neighbouring replica, calibrator or not, counts the calibration messages it receives. When its calibration action timer t_cali expires, a replica, calibrator or not, stops counting calibration messages and computes PAR_init (initial probability of becoming active replica) from the number of calibration messages received. The replica then sets itself as active with probability PAR_init or as passive (even if it was already passive) with probability 1 − PAR_init.

More precisely, a replica h computes its initial probability of becoming active replica, PAR_init(h), from three quantities: m_cali(h), the number of calibration messages received by the replica h; ρ_IDEAL, the ideal density of active replicas parameter, which is described in detail in Subsection 4.3.5; and p_cali, the calibrator replica probability parameter mentioned above.
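The calibration steps can be sketched as below. Note that the exact expression for PAR_init(h) is not reproduced in this text, so the estimator in `par_init_estimate` is an illustrative assumption only: it treats m_cali / p_cali as an estimate of the number of neighbouring replicas and ρ_IDEAL as the desired number of active replicas among them, and it returns 1 when no calibration message was heard. It should not be read as the formula used by SAPRESA. All function names are hypothetical.

```python
import random


def initial_role(inside_anchor, p_cali, rng=random):
    """Role assigned during the populating phase: calibrator with
    probability p_cali inside the anchor area, passive otherwise."""
    if inside_anchor and rng.random() < p_cali:
        return "calibrator"
    return "passive"


def par_init_estimate(m_cali, p_cali, rho_ideal):
    """ILLUSTRATIVE ASSUMPTION, not the thesis formula.

    m_cali / p_cali estimates how many replicas are in the neighbourhood,
    since roughly a fraction p_cali of them sent a calibration message.
    """
    if m_cali == 0:
        return 1.0  # assumption: no calibrator heard, so become active
    estimated_replicas = m_cali / p_cali
    return min(1.0, rho_ideal / estimated_replicas)


def calibration_action(m_cali, p_cali, rho_ideal, rng=random):
    """Run when the calibration action timer t_cali expires: compute
    PAR_init and draw the initial status from it."""
    par_init = par_init_estimate(m_cali, p_cali, rho_ideal)
    status = "active" if rng.random() < par_init else "passive"
    return status, par_init
```

Whatever the actual expression is, the draw in `calibration_action` follows the text: the replica becomes active with probability PAR_init and passive with probability 1 − PAR_init.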

In order to reduce the chances of collisions between calibration messages, each calibrator replica randomly delays the broadcast of its calibration message. More precisely, a replica chooses at random (following a uniform distribution) a time slot out of CW_cali (calibration contention window) slots, where each slot has a duration of ST_cali (calibration slot duration) milliseconds. Thus, in reality, the replica schedules the calibration message t_send-cali milliseconds after the start of the time slot it has chosen.
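The resulting send time of a calibration message can be computed as in the short sketch below, which assumes that slots are numbered from 0 and that all durations are expressed in milliseconds; the function name is hypothetical.

```python
import random


def calibration_message_delay_ms(cw_cali, st_cali_ms, t_send_cali_ms, rng=random):
    """Delay before broadcasting a calibration message.

    A slot is drawn uniformly out of CW_cali slots of ST_cali milliseconds
    each, and the message is sent t_send-cali milliseconds after the start
    of that slot.
    """
    slot = rng.randrange(cw_cali)  # uniform over 0 .. cw_cali - 1
    return slot * st_cali_ms + t_send_cali_ms
```

With CW_cali = 4, ST_cali = 10 and t_send-cali = 3, the possible delays are 3, 13, 23 and 33 milliseconds.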

By applying this calibration, we aim at setting the number of active replicas to an appropriate initial value. The subsequent mechanisms, pro-active replication (cf. Subsection 4.3.4) and re-active replication (cf. Subsection 4.3.7), make this population of active replicas evolve over time. In order to keep the number of active replicas at the most appropriate value, two additional mechanisms continuously control the size of the population of active replicas by sensing their activity (i.e. the replications initiated by active replicas) and adapting the creation of active replicas to the sensed activity. These additional mechanisms are described in Subsections 4.3.5 and 4.3.6. Finally, the value

From the document: Hovering information: a self-organising, infrastructure-free information storage and retrieval service for mobile applications (pages 93-108)