Keeping up with storage: Decentralized, write-enabled dynamic geo-replication

Keeping up with storage: Decentralized, write-enabled dynamic geo-replication

aOntology Engineering Group, Universidad Polit´ecnica de Madrid, Spain

bIRISA/INSA Rennes, France

cIRISA/ENS Rennes, France

dINRIA Rennes, France

Email addresses:[email protected](Pierre Matri), [email protected](Mar´ıa S. P´erez),

Local vote summary

Global vote summary Noden1

Figure 1: Cluster-wide popular object identification overview.Sis the total number of sites in the cluster andNthe total number of nodes.

Merging Complete

Serving Clients

Figure 2: Timeline of vote summary states for a voting round lengtht.

Table 1: Experimental client preferred site settings

(eu-central-1)

(ap-southeast-2)

(ap-northeast-1)

All values Top-10% values

Figure 3: Popular object identification performance.

# total dynamic replicas

# correct dynamic replicas

Figure 4: Popular object replication performance.

False negative rate False positive rate

Figure 5: Client location error rates.

None DRep ARC

Figure 6: Latency over time with a 95% read/5% write workload.

95/5 65/35 50/50 35/65 5/95 0

5 10 15 20 25 30 35 40 45 50 55

(a) Workload read latency impact

95/5 65/35 50/50 35/65 5/95 0

20 40 60 80 100 120

(b) Workload write latency impact Figure 7: Latency impact of the workload.

[1] Microsoft Azure,https://azure.microsoft.com/en-us/

[2] Amazon Web Services,https://aws.amazon.com/(2016).

Keeping up with storage: Decentralized, write-enabled dynamic geo-replication

HAL Id: hal-01617658

https://hal.inria.fr/hal-01617658

Submitted on 16 Oct 2017

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Keeping up with storage: Decentralized, write-enabled dynamic geo-replication

Pierre Matri, María S. Pérez, Alexandru Costan, Luc Bougé, Gabriel Antoniu

To cite this version:

Pierre Matri, María S. Pérez, Alexandru Costan, Luc Bougé, Gabriel Antoniu. Keeping up with

storage: Decentralized, write-enabled dynamic geo-replication. Future Generation Computer Systems,

Elsevier, 2018, 86, pp.1093-1105. �10.1016/j.future.2017.06.009�. �hal-01617658�

Keeping up with Storage:

Decentralized, Write-Enabled Dynamic Geo-Replication

Pierre Matri

, Mar´ıa S. P´erez

, Alexandru Costan

, Luc Boug´e

, Gabriel Antoniu

Abstract

Large-scale experiments on various workloads show a read latency decrease of up to 42% compared to other state-of- the-art, caching-based solutions.

Keywords: cloud, replication, geo-replication, storage, fault-tolerance, consistency, database, key-value store

1. Introduction

However, designing geo-distributed applications is dif- ficult due to the high and often unpredictable latency between sites [3].

scientists worldwide in real-time. The users of com- mercial applications, such as Facebook, ever-increasing amounts of data that needs to be accessible worldwide.

Ensuring the lowest possible access time for users is crucial for the user experience.

A key factor impacting the performance of such ap- plications is data locality, i.e. the location of the data relatively to the application. Accessing remote data is orders of magnitude slower than using local data.

Although such remote accesses may be acceptable for rarely-accessed data (cold data), they hinder applica- tion performance for frequently-used data (hot data).

Dynamic replication [6] proposes to solve this issue

by dynamically replicating hot data as close as possible

to the applications that access it. This technique is lever-

aged in Content Delivery Networks (CDN) to cache im-

mutable data close to the final user [7, 8]. Similarily, it

is used in storage systems such as GFS [9] or HDFS [10]

• After briefly introducing the applications we target (Section 2), the related work (Section 3) and the storage systems we target (Section 4), we char- acterize the challenges of decentralizing write- enabled dynamic data replication (Section 5).

• We address these challenges with a decentral- ized data popularity measurement scheme (Sec- tion 6), which leverages existing state-of-the-art storage system architecture to identify hot data cluster-wide dynamically.

• Based on these popularity measurements, we de- scribe a dynamic data replication algorithm which dynamically creates and manages replicas of hot data as close as possible to the applications (Section 7).

• We enable clients to locate the closest of such data replicas using an approximate object loca- tion method (Section 8), which minimizes storage latency by avoiding communication with any ded- icated metadata node.

We discuss the effectiveness and applicability of our contribution (Section 11), and conclude on future work that further enhances our proposal (Section 12).

2. Large-scale, data-intensive applications

In this paper we target large-scale applications serv- ing large amounts of data to users around the world, while seeking to enable low-latency access for these users to potentially-mutable data. Examples of such ap- plications span multiple use-cases, among which:

To deliver the real-time promise of MonALISA, ensuring that the monitoring data that is needed by scientists around the world is located as close as possible to them is crucial.

Social networks. Business applications such as social networks ingest overwhelming amounts of data.

Facebook, for example, is expected to react 2 bil- lion active profiles in the next few weeks [16].

The real-time promise of such applications requires to keep up-to-date data as close as possible to the end- user. The global scale of these applications makes this di ffi cult while keeping costly bandwidth and storage us- age low. We will detail these challenges in Section 5.

These challenges are independent of the type of plat- forms such as compute grids for MonALISA or clouds for Facebook.

3. Related work

In the literature, dynamic replication stands as a topic

of interest for all applications requiring access to shared

data from many geo-distributed locations. Most of these contributions can be classified in two categories:

Yet, CDNs are targeted at serving content directly to the final user. In this paper, we focus on allow- ing a geo-distributed application to access a geo- distributed data source with the lowest possible la- tency.

Mutable data, centralized management. However, a range of applications rely on mutable data. This is for example the case in MonALISA monitor- ing of the CERN LHC experiment [4, 5]. In a web applications, this is observed with social net- work profile pages, status pages, comments on a

Replica selection algorithm Targeted work on replica

selection prove that adopting a relevant data loca-

tion algorithm can lead to significant performance

improvements. Mansouri et al. [27] propose a

distributed replication algorithm named Dynamic

Hierarchical Replication Algorithm (DHR), which

selects replica location based on multiple criteria

4. Background: The systems we target

Let us first briefly describe the key architectural prin- ciples that drive the design of a number of decentralized systems. Dynamo [13] has inspired the design of many of such systems, such as Voldemort [14], Cassandra [30]

or Riak [31]. In this paper we target this family of sys- tems, which are widely used in the industry today.

Data model. Dynamo is a key-value store, otherwise called distributed associative array. A key-value store keeps a collection of values, or data objects.

Each object is stored and retrieved using a key that uniquely identifies it.

DHT-based data distribution. Objects are distributed across the cluster using consistent hashing [32]

based on a distributed hash table (DHT), as in Chord [33]. Given a hash function h (x), the out- put range [h

, h

] of the function is treated as a circular space (h

sticking around to h

ring, r being the configured replication factor of the system (usually 2).

5. Our proposal in brief: outline and challenges

In this paper, we demonstrate that it is possible to in-

tegrate dynamic replication with the existing architec-

ture of these storage systems, which enables us to lever- age their existing, built-in algorithms to e ffi ciently han- dle read and writes in geo-distributed environments.

When trying to access an object, clients tentatively de- termine the location of its closest replica (either static or dynamic) and address requests directly to the node holding it. Such approach however raises a number of challenges.

We acknowledge that using client votes has been proposed before in the context of replicated relational databases, specifically to ensure data consistency [36].

5.1. Collecting and counting votes

The goal of dynamic replication is to improve storage operation latency for the clients. Therefore, we need to design an e ffi cient way to let the clients cast their votes for objects.

We address both these issues by mixing an approxi- mate frequency estimation algorithm with the existing, lightweight Gossip protocol provided by storage system (Section 6).

5.2. Tunable replication