
Data replication in grid environments

3.2 Data replication

3.2.4 Replication in data grids

Modern supercomputer systems connected through high-bandwidth networks have spurred a new class of data-intensive applications. In recent years the data requirements of both scientific and business applications have been increasing dramatically in both volume and scale. The amount of data generated by data-intensive applications continues to grow each year, and the aggregated data volume will reach the exabyte (1 million terabytes) scale by around 2015 [Particle physics data grid, 2009]. Good examples include high energy physics (HEP) experiments [Compact muon solenoid, 2009] and CERN's Large Hadron Collider (LHC) experiments [Large hadron collider, 2009], which have processed and produced hundreds of terabytes of data. In such applications, the required data files may be located in different geographically distributed systems, implying that the availability and consistency of data have to be maintained in wide-area environments where network latencies are generally long [Dullmann et al., 2001]. Therefore, there is a great need for an integrated architecture that facilitates the storage, processing, and management of data of both large scale and volume.

The data grid is becoming a promising infrastructure for data storage and the execution of data-intensive applications; it connects a collection of hundreds of geographically distributed computers and storage resources located in different parts of the world to facilitate the sharing of data and resources [Lamehamedi et al., 2003]. Some examples of data grids are the European DataGrid project [European datagrid, 2009], physics data grids [Particle physics data grid, 2009, GriPhyN, 2009], the LHC computing grid (LCG) project [Large hadron collider, 2006] for handling the massive amount of data produced by the LHC experiments at CERN, and the biomedical informatics research network (BIRN) [Biomedical informatics research network (BIRN), 2005].

Traditionally, grid resources (e.g., computational power, data storage, network bandwidth) are allocated to jobs by the workload scheduler according to the job requirements, the system load, and specified policies. In a data grid environment, an efficient scheduler must also take into account the location of the data required by the jobs, which has, in fact, a significant influence on system performance. The reason is that jobs may take a long time to finish because of long delays in fetching required data files from high-latency storage, or may simply hang due to data unavailability. Data replication, which involves the creation of identical copies of data files and their distribution over various sites, is an important technique for avoiding such situations. Appropriate placement of data files at different sites in the system not only reduces the data access time of the jobs and the bandwidth consumption, and consequently improves the job turnaround time [Stockinger et al., 2002], [Ranganathan and Foster, 2001], but also increases data availability in many applications [Hoschek et al., 2000], [Ranganathan et al., 2002], [Lei et al., 2008]. Recently there has been considerable interest in the area of dynamic replication in data grid environments.

Many research works have been carried out to provide the basic functionalities of replica management in data grid implementations [Chervenak et al., 2002], [Chervenak et al., 2005], [Gridpp, 2009]. These existing components and services can serve as the basis of higher-level replication services or management frameworks for grid environments, providing automated replica placement support and full replica management functionality for achieving better access performance, availability, and security of data. In this section, we review the existing work on data replication and placement in data grids.

In data-intensive applications where data are usually generated by an instrument at one site and are transferred to other storage sites in the data grid through data replication mechanisms, data consistency is not a major concern as there are no updates. However, as the application domain of data grids continues to expand, replication strategies need to address environments where update requests are frequent. Hence, replication strategies can be distinguished based on whether they are used for read-only or update requests.

Replication strategies for read-only requests In the context of replication strategies for read-only data environments, replication algorithms can be classified into three main categories.

Based on replica location: In this model, replica selection is decided based on the data grid structure and the replica locations. In [Ranganathan and Foster, 2001], the authors identify five replication strategies (best client, cascading replication, caching, caching plus cascading replication, and fast spread) with three different kinds of access patterns (random access, a small degree of temporal locality, and a small degree of geographical and temporal locality) for a hierarchical data grid. The simulation results show that significant savings in latency and bandwidth can be obtained if the access patterns contain a small degree of geographical locality.

74 Fundamentals of Grid Computing

In [Lin et al., 2006], the authors describe a hierarchical tree structure for the data grid, where data access requests are generated from the leaf nodes to upper nodes within a range limit. The goal of the proposed replica placement algorithm is to improve load balancing among the replica servers. The optimal locations for placing the replicas, which are bounded by the minimum number corresponding to the servers' workload capacity, are selected based on the usage frequency of users for a particular data file.
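A minimal sketch of such usage-driven placement in a tree, assuming a simple greedy bottom-up rule: a node receives a replica when the request load of its subtree would exceed a server's workload capacity. The tree layout, load figures, and the greedy rule itself are illustrative assumptions, not the exact algorithm of [Lin et al., 2006]:

```python
def place_replicas(tree, root, load, capacity):
    """Return the nodes that should host a replica (illustrative greedy rule)."""
    placement = []

    def subtree_load(node):
        children = tree.get(node, [])
        if not children:                    # leaf: its own request rate
            return load.get(node, 0)
        total = sum(subtree_load(c) for c in children)
        if total > capacity:
            placement.append(node)          # absorb this subtree's load locally
            return 0                        # nothing propagates upward
        return total

    if subtree_load(root) > 0:
        placement.append(root)              # the root always keeps the master copy
    return placement

# Hypothetical three-level grid: two leaves under "a" overload a capacity of 5,
# so "a" gets a replica and the root serves the remaining traffic.
tree = {"root": ["a", "b"], "a": ["l1", "l2"], "b": ["l3"]}
load = {"l1": 3, "l2": 4, "l3": 2}
chosen = place_replicas(tree, "root", load, capacity=5)
```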

In [Park et al., 2003], the authors propose a dynamic replication strategy, called BHR, to reduce data access time by avoiding network congestion in a data grid network. The basic idea of BHR is to benefit from "network-level locality," which means that the required file is located at a site that has broad bandwidth to the site of job execution.
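The network-level locality idea can be illustrated by a small selection routine that picks, among the sites holding a replica, the one with the widest bandwidth to the execution site; the site names and bandwidth table below are made up for the example:

```python
def select_replica(replica_sites, exec_site, bandwidth):
    """Pick the replica site with the highest bandwidth to exec_site.
    `bandwidth` maps (source, destination) pairs to link bandwidth;
    missing pairs default to 0 (no usable link)."""
    return max(replica_sites, key=lambda s: bandwidth.get((s, exec_site), 0))

# Hypothetical links: the "cern" site has the broader pipe to "lyon".
bw = {("cern", "lyon"): 10, ("fnal", "lyon"): 2}
best = select_replica(["cern", "fnal"], "lyon", bw)
```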

Based on cost estimation: In the cost estimation model, a replication decision is taken by evaluating the data access gains and the cost of placing a replica of file f at the new location n, which is calculated as:

cost(f, i, n) = f_transfer(bandwidth_{i,n}, size_f) + f_storage(f, n)   (3.1)

where i is the site holding the source copy. In [Lamehamedi et al., 2003], the authors propose a cost model based on runtime bandwidth, replica size, and accumulated read/write statistics. The model evaluates the data access gains obtained by creating a replica compared with the costs of creation and maintenance for the replica. A hybrid of tree and ring topologies of nodes was proposed to overlay replicas on the data grid and minimize the inter-replica communication cost.
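The decision implied by equation (3.1) can be sketched as follows; the concrete forms of f_transfer and f_storage, and the way the access gain is estimated, are placeholder assumptions rather than the cited cost model:

```python
def f_transfer(bandwidth, size):
    """Placeholder transfer cost: time to ship `size` over `bandwidth`."""
    return size / bandwidth

def f_storage(size, cost_per_unit=0.1):
    """Placeholder storage cost, proportional to replica size."""
    return size * cost_per_unit

def replication_cost(size, bandwidth):
    """Equation (3.1): transfer cost plus storage cost at the new location."""
    return f_transfer(bandwidth, size) + f_storage(size)

def should_replicate(expected_access_gain, size, bandwidth):
    """Replicate only when the estimated access gain outweighs the cost."""
    return expected_access_gain > replication_cost(size, bandwidth)
```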

In [Ranganathan et al., 2002], the benefit of creating a new replica is evaluated based on the storage and transfer costs as in equation (3.1). An approach is proposed to create replicas automatically in a decentralized fashion to maintain the desired data availability without consuming an undue amount of storage and bandwidth of the system.
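One way to picture such an availability target (an illustration, not taken from the cited work): if each site hosting a replica is assumed independently reachable with probability p, the smallest replica count k satisfying 1 - (1 - p)^k >= target bounds the storage spent on extra copies:

```python
def replicas_needed(p, target):
    """Smallest replica count k with availability 1 - (1 - p)**k >= target.
    Assumes independent site failures with per-site availability p."""
    k, unavailable = 0, 1.0
    while 1.0 - unavailable < target:
        unavailable *= (1.0 - p)   # probability that all k copies are down
        k += 1
    return k
```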

Based on economy policies: Several works [Carman et al., 2002], [Bell et al., 2003] have proposed economy-based policies for replication decisions. The basic idea behind economy-based replication is to apply the economic concepts of market behavior, where data files represent the goods. In this model, the investment, i.e., the replication decision, is determined by the difference between the price paid to buy a data file and the expected price, i.e., revenue, if the file is sold in the future. In [Carman et al., 2002], a function that calculates the revenue obtained over a future time period by selling a file F corresponding to the identifier f is defined as:

V(f, k, n) = Σ_{i=k}^{k+n} p_i δ(f, f_i) δ(s, s_i)   (3.2)

The function V(f, k, n) computes the revenue for the file F corresponding to the identifier f, starting from time t_k in the future and taking into account the next n file requests. p_i represents the price paid for the file, which can be calculated based on equation (3.1). s is the local storage element. δ is an indicator function that returns the value 1 if its arguments are equal and the value 0 if they differ. A replication decision is taken only when the replica, with its associated file purchase cost, proves potentially beneficial for the system in the long term.
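Equation (3.2) translates almost directly into code; the request-log representation below (one tuple of file identifier, storage element, and price per time step) is an assumed format for the sketch:

```python
def delta(a, b):
    """Equality indicator from equation (3.2): 1 if equal, 0 otherwise."""
    return 1 if a == b else 0

def revenue(f, s, k, n, requests):
    """Sum p_i * delta(f, f_i) * delta(s, s_i) for i = k .. k + n.
    `requests[i]` is an assumed (file_id, storage_element, price) tuple."""
    total = 0
    for i in range(k, min(k + n + 1, len(requests))):
        fi, si, pi = requests[i]
        total += pi * delta(f, fi) * delta(s, si)
    return total
```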

A predicting function E[V(f, k, n), r] is defined in [Bell et al., 2002], [Bell et al., 2003], [Capozza et al., 2002] to estimate V(f, k, n); it returns the number of times the file f will be requested in the next period of time considered, based on the latest r file requests. Simulation results obtained with OptorSim [Cameron et al., 2004b] present some specific realistic cases where the economic model shows tremendous performance improvements over traditional methods.

In [Lei et al., 2008], the authors propose two new metrics, namely the system file missing rate and the system bytes missing rate, to evaluate the reliability of the system. An on-line optimizer algorithm (MinDmr) is proposed to maximize the data availability based on these two metrics and the predicting function E[V(f, k, n), r] from [Bell et al., 2003].
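The two metrics can be sketched as simple ratios over a request log; the log and catalogue formats below are illustrative assumptions, not those of [Lei et al., 2008]:

```python
def missing_rates(requests, available, sizes):
    """Fraction of requested files, and of requested bytes, not served
    by any live replica. `requests` is a list of file ids, `available`
    the set of ids with a reachable replica, `sizes` an id -> bytes map."""
    missed = [f for f in requests if f not in available]
    file_missing_rate = len(missed) / len(requests)
    byte_missing_rate = (sum(sizes[f] for f in missed)
                         / sum(sizes[f] for f in requests))
    return file_missing_rate, byte_missing_rate
```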

Replication models for update requests In an environment where each replica is updated frequently, it is desirable to minimize the divergence between the source data and its replicas through a synchronization process. There are two main approaches to maintaining data consistency: (i) synchronous replication and (ii) asynchronous replication.

The synchronous strategy is also known as eager replication, while the asynchronous strategy is often referred to as lazy replication.

Synchronous: Ideally, replicas need to be kept consistent with the source data at all times. In synchronous replication, updates to any replica are immediately propagated to all other replicas within the same (distributed) transaction. Synchronous replication enforces mutual consistency of replicas. Although it provides strong consistency and a high degree of fault tolerance, it cannot scale up to large distributed environments like grid environments due to the complexity and cost of distributed transactions. Synchronous replication strategies, which rely on the read-one-write-all (ROWA) or read-one-write-all-available (ROWAA) protocol [Bernstein et al., 1987], are more suitable for database systems. A better solution for data grids, which scales up, is asynchronous replication.
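The ROWA rule behind synchronous replication can be illustrated with a toy in-memory replica group: a write must reach every replica before it succeeds, while a read may be served by any single replica (a sketch only; real ROWA runs inside a distributed transaction with failure handling):

```python
class RowaGroup:
    """Toy read-one-write-all replica group over in-memory dictionaries."""

    def __init__(self, n_replicas):
        self.replicas = [{} for _ in range(n_replicas)]

    def write(self, key, value):
        # Write-all: every replica applies the update before the write returns,
        # which is what enforces mutual consistency.
        for r in self.replicas:
            r[key] = value

    def read(self, key, replica=0):
        # Read-one: any single replica holds the current value.
        return self.replicas[replica][key]
```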

Asynchronous: In grid environments, data collections can be very large or frequently updated, and network or computational resources may be limited; propagating the updates to all replicas may be infeasible. In certain situations where exact data consistency is infeasible, propagating data updates to a large number of replicas every time a data item replica is modified greatly affects the performance of the entire system.

Asynchronous replication is more suitable for such environments: replicas can be updated in different transactions at different sites in an asynchronous fashion. While synchronous replication updates every replica before committing the original update transaction, asynchronous replication commits the original update transaction before updating is completed. According to where the original update transaction is completed, asynchronous replication strategies can follow a primary-copy (i.e., master-slave) or multi-master (i.e., update everywhere) approach.

In a primary-copy approach, all update transactions are forced to be executed and committed on the site holding the primary copy, and that site then propagates the updates using various methods. In a multi-master approach, the update transactions are executed and committed on a group of primary copies and then propagated to the other replicas.
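A toy sketch of the primary-copy approach: updates commit on the primary first and reach the secondaries only when propagation runs, so secondaries may briefly lag behind the source (the log-shipping mechanism here is one illustrative propagation method among the "various methods" mentioned above):

```python
class PrimaryCopy:
    """Toy primary-copy (master-slave) asynchronous replication."""

    def __init__(self, n_secondaries):
        self.primary = {}
        self.secondaries = [{} for _ in range(n_secondaries)]
        self.log = []                       # committed updates not yet shipped

    def update(self, key, value):
        self.primary[key] = value           # commit on the primary first
        self.log.append((key, value))       # defer propagation: lazy replication

    def propagate(self):
        # Ship the backlog of committed updates to every secondary.
        for key, value in self.log:
            for s in self.secondaries:
                s[key] = value
        self.log.clear()
```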

However, conflicting updates at different sites can introduce replica divergence in the multi-master approach. Manual or automatic reconciliation processes are used to avoid such situations.