A service mesh for collaboration between geo-distributed services: the replication case


HAL Id: hal-03282425

https://hal.inria.fr/hal-03282425

Submitted on 9 Jul 2021



Marie Delavergne, Ronan-Alexandre Cherrueau, Adrien Lebre

To cite this version:

Marie Delavergne, Ronan-Alexandre Cherrueau, Adrien Lebre. A service mesh for collaboration between geo-distributed services: the replication case. AMP 2021 - Workshop on Agility with Microservices Programming, Jun 2021, Online, France. pp.1-8. ⟨hal-03282425⟩


Marie Delavergne¹, Ronan-Alexandre Cherrueau¹, and Adrien Lebre¹

¹ LS2N, Inria, France

Abstract. Edge computing is becoming more and more present, with sites geo-distributed around the globe. Applications on these infrastructures must be able to manage the latency and disconnections inherent to their distribution. One way to deal with these concerns could be to deploy one entire instance of the application per site and use a service mesh to manage the collaboration between the geo-distributed instances.

More precisely, we propose to reify the location of application instances in REST requests and allow redirections between these requests thanks to a dedicated language and a service mesh allowing three types of collaborations. This paper focuses on the replication of a resource between multiple instances. Though it is still a work in progress, we demonstrated the relevance of our approach in the OpenStack ecosystem.

1 Introduction

Edge computing is gaining importance, with more and more small datacenters at the edge of the network. Nonetheless, many applications do not benefit from the geo-distribution of wide-area networks and are not designed to handle the high latencies and disconnections implied by this distribution [8]. To deal with these concerns, we advocate placing an instance of the application on each site. This way, each site is autonomous and remains fully functional when disconnected from the rest of the network [2]. Unfortunately, collaboration is still missing: instances can function by themselves, but they cannot collaborate with each other and so do not benefit from the geo-distribution.

To provide such a collaboration without changing the code, we propose to leverage the service mesh concept. Service meshes help cloud computing applications solve different problems with their built-in functionalities. For example, they provide load balancing to improve overall performance. More broadly, by intercepting communications, they ease operations such as traffic monitoring, access control, and fault tolerance [3]. In general, they are implemented with proxies deployed as sidecars of the services, without interfering with the service code, as they only act on requests passing from service to service.

Their ability to intercept and redirect communications offers an opportunity to orchestrate requests between endpoints of any instance of the same application.

In this paper, we propose Cheops, a service to use in combination with a service mesh to program on-demand collaborations between multiple instances of an application. To specify where a request will be executed at a fine-grained level, Cheops relies on the scope-lang proposal we initially developed [2].


Scope-lang extends the application's API and allows the user to specify where (on which services) the request is executed. The language has been designed to provide three types of collaborations between application instances. Sharing is when a resource needed by a service has been created on another instance. This is the basic collaboration, which allows resources to be shared between the instances. Replication allows operations on identical resources on different sites, to deal with the availability of these resources in case of network partitioning or to improve overall performance. Finally, cross allows a resource to span different sites. In this paper, we focus on replication (sharing has already been discussed [2], while cross is left as future work).

It is noteworthy that other frameworks or languages [4,7] have been proposed. However, they are invasive, as they require entangling geo-distribution with the business code. Non-invasive approaches generally follow the brokering approach: an entity is in charge of redirecting requests between the different instances. However, instances are not aware of each other. The goal of our proposal is to allow the instances to collaborate on demand as if they were a single entity.

In this paper, we focus on how replication is interpreted and executed thanks to Cheops. We first explain scope-lang and how our general model works to allow DevOps to specify the location of a request execution. Then, we dive into the replication collaboration between different instances of the same service. In particular, we describe how we manage replicas of a resource on different sites and how we handle disconnections, partitions or faults.

2 Scope-lang, a language to reify the geo-distribution of requests

In this section, we dive deeper into how scope-lang, Cheops and our general model work together to allow collaborations outside of the application, and so keep a clear separation of concerns.

2.1 General model

As a reminder, we have an entire instance of the application on each site. Scope-lang parameterizes Cheops on a per-request basis in order to orchestrate collaborations between instances.

To explain collaboration between instances of different services, let us take a look at how microservices-based applications work. Each service composing the application exposes endpoints to communicate with other services. These endpoints are linked to a specific part of the business logic they implement. When services call endpoints of other services, they form a workflow between services. For example, Fig. 1a shows an application App composed of two services s and t that expose endpoints e, f, g, h and one example of a workflow s.e → t.h. Fig. 1b shows the instantiation of the application App on two different sites and their corresponding service instances: s1 and t1 for App1; s2 and t2 for App2. A client (•) triggers the execution of the workflow s.e → t.h on App2. It addresses a request to the endpoint e of s2, which handles it and, in turn, contacts the endpoint h of t2.


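[Figure 1a: diagram of App with service s (endpoints e, f) and service t (endpoints g, h).]

(a) Application App made of two services s and t and four endpoints e, f, g, h. The s.e → t.h represents an example of a workflow.

[Figure 1b: diagram of App1 with service instances s1 (endpoints e, f) and t1 (endpoints g, h), and App2 with service instances s2 (endpoints e, f) and t2 (endpoints g, h); a client • addresses App2.]

(b) Two independent instances App1 and App2 of the App application. The • represents a client that executes the s.e → t.h workflow in App2.

Fig. 1: Microservices architecture of a cloud application.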

2.2 Scope-lang

To parameterize collaborations, we developed a domain-specific language called scope-lang. A scope-lang expression (referred to as the scope or σ in Fig. 2a) contains location information that defines, for each service involved in a workflow, in which instance the execution takes place. The scope “s:App1,t:App2” intuitively means that the request must be executed on the service s from App1 and the service t from App2. The scope “t:App1&App2” specifies that the request is executed on the service t of both App1 and App2. Users set the scope of a request to specify the collaboration between instances they want for a specific execution. The scope is then interpreted by a dedicated module called Cheops during the execution of the workflow to fulfill that collaboration. The main operation it performs is request forwarding. To be more precise, reverse proxies in front of each service instance (geo_s and geo_t in Fig. 2b) intercept the request and interpret its scope to forward the request. Where exactly the request is forwarded depends on the locations in the scope.

The reverse proxy uses a specific function R (see Fig. 2a) to resolve the service instance at the assigned location. R uses an internal registry. Building the registry through service discovery is a common pattern in service meshes [3].

In summary, scope-lang effectively parameterizes how Cheops will redirect the request. In the next section, we discuss how the replication is achieved.
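As an illustration of the interpretation described above, the following Python fragment is a minimal sketch of how a scope string could be parsed and resolved with R. The function names `parse_scope` and `resolve`, the textual scope syntax, and the registry contents (URLs included) are assumptions made for this example; they are not taken from the Cheops prototype.

```python
from typing import Dict, List, Tuple


def parse_scope(scope: str) -> Dict[str, List[str]]:
    """Parse a scope such as 's:App1,t:App1&App2' into
    a mapping from service name to the list of its locations."""
    result: Dict[str, List[str]] = {}
    for element in scope.split(","):
        service, _, loc = element.strip().partition(":")
        if not service.strip() or not loc.strip():
            raise ValueError(f"malformed scope element: {element!r}")
        # 'Loc & Loc' denotes multiple locations for the same service.
        result[service.strip()] = [site.strip() for site in loc.split("&")]
    return result


# Hypothetical registry, as a service discovery mechanism might populate it.
REGISTRY: Dict[Tuple[str, str], str] = {
    ("s", "App1"): "http://s.app1.example",
    ("t", "App1"): "http://t.app1.example",
    ("s", "App2"): "http://s.app2.example",
    ("t", "App2"): "http://t.app2.example",
}


def resolve(service: str, locations: List[str],
            registry: Dict[Tuple[str, str], str]) -> List[str]:
    """R[[s : Loc & Loc']] = R[[s : Loc]] and R[[s : Loc']]:
    resolve each location independently and return all instances."""
    return [registry[(service, site)] for site in locations]


scope = parse_scope("t:App1&App2")
targets = resolve("t", scope["t"], REGISTRY)
```

With the registry above, `targets` contains the two instances of t, one per site, which is where a reverse proxy would forward the request.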

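$$
\begin{array}{rcll}
App_i,\ App_j & ::= & & \text{application instance} \\
s,\ t & ::= & & \text{service} \\
s_i,\ t_j & ::= & & \text{service instance} \\
Loc & ::= & App_i & \text{single location} \\
 & | & Loc \,\&\, Loc & \text{multiple locations} \\
\sigma & ::= & s : Loc,\ \sigma & \text{scope} \\
 & | & s : Loc &
\end{array}
$$

$$
\begin{aligned}
R[\![\, s : App_i \,]\!] &= s_i \\
R[\![\, s : Loc \,\&\, Loc' \,]\!] &= R[\![\, s : Loc \,]\!] \ \text{and}\ R[\![\, s : Loc' \,]\!]
\end{aligned}
$$

(a) scope-lang expressions σ and the function R that resolves service instances from elements of the scope.

[Figure 2b: diagram of App_i with reverse proxies geo_s and geo_t in front of service instances s_i (endpoints e, f) and t_i (endpoints g, h). The client • sends a request with scope σ = s:App_i, t:App_i; the proxies forward it to R[[σ[s]]] and R[[σ[t]]].]

(b) Scope σ interpreted by the geo-distribution service mesh geo during the execution of the $s.e \xrightarrow{\sigma} t.h$ workflow in App_i. Reverse proxies perform request forwarding based on the scope and the R function.

Fig. 2: A service mesh to geo-distribute a cloud application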


3 Replication in Cheops

Replication is the ability to create and maintain identical resources on different sites: an operation on one replica should be propagated to the others, dealing with faults and disconnections and maintaining consistency based on our eventual consistency model. Other consistency policies [1, 10] could be envisioned, but they are left as future work as they do not change the general concept of scope-lang/Cheops. To get a better understanding of the point of replication, imagine a user who needs a huge resource (like an ISO image) both at home and at work. The resource can be replicated at creation on both sites, and this is the only time the entire resource goes through the network. This saves a lot of bandwidth, and is especially useful if there is a partition between both sites.

3.1 Replication model

Modular applications based on microservices usually expose a RESTful HTTP API. In most cases, they generate an identifier for each resource, which is used by the API to retrieve, update or delete it. When receiving a request to create replicas, Cheops unifies these identifiers with a data model called replicant.

A replicant is simply a meta-identifier we generate along with a mapping site → local identifier. A replicant can thus be implemented, for example, as: meta_identifier: [site_n: local_identifier_n, ...]. We only store the location (site) of the replica and not the service used, since the service can be deduced from the incoming request. This is subject to change depending on the evaluation of our prototype. We could also store the involved service and/or the type of resource involved.
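A minimal sketch of such a data model in Python is given below. The class name `Replicant`, its methods, and the use of a random UUID as meta-identifier are assumptions made for illustration; the text only prescribes a meta-identifier and a site → local identifier mapping.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Replicant:
    """A meta-identifier plus a mapping site -> local identifier."""
    meta_id: str
    # None means the replica has not been created on that site yet.
    replicas: Dict[str, Optional[str]] = field(default_factory=dict)

    @classmethod
    def new(cls, sites: List[str]) -> "Replicant":
        # Assumption: any collision-free generator would do here.
        return cls(meta_id=uuid.uuid4().hex,
                   replicas={site: None for site in sites})

    def set_local_id(self, site: str, local_id: str) -> None:
        """Record the identifier the service generated on a given site."""
        if site not in self.replicas:
            raise KeyError(f"{site} is not involved in this replication")
        self.replicas[site] = local_id

    def knows(self, identifier: Optional[str]) -> bool:
        """True if the identifier is the meta-identifier or a local one."""
        if identifier is None:
            return False
        return (identifier == self.meta_id
                or identifier in self.replicas.values())
```

Storing only the site in the mapping mirrors the choice described above; adding the service name or resource type would amount to extending the key of `replicas`.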

These replicants are stored in a database co-located with the Cheops agents. A copy of the replicant is stored on each site that hosts one of its replicas (the sites involved in the replication). Cheops has an API of its own that allows the user to check the state of operations and sites and to inspect replicants.

3.2 Architecture overview

Cheops agents are located on each instance site, with a reverse proxy beside every service transferring its requests to the agents. Agents communicate with each other and check each other's status via heartbeats. Our implementation of Cheops uses the Consul service mesh¹ and Envoy² as reverse proxy to intercept and redirect the requests when needed. It is also worth noting that Envoy intercepts inbound and outbound requests of the services, except for requests coming from Cheops agents.

In Fig. 3, to ease comprehension, we represent the reverse proxy and Cheops as a single entity that intercepts the request, as in Fig. 2b.

1 https://www.consul.io/

2 https://www.envoyproxy.io/


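[Figure 3: diagram of App1 and App2, each with a reverse proxy geo_t in front of its service instance (t1 and t2, endpoints g, h) and a database. The client • sends a request with scope σ = t:App1&App2, which is forwarded to R[[t:App1]] and R[[t:App2]]; arrows c1 and c2 go to the databases.]

Fig. 3: Modelling of replication by forwarding to multiple instances. The c1 arrows represent the updates made to the databases by the Cheops agent on App1; the c2 arrows represent those made by the Cheops agent on App2.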

3.3 CRUD execution workflow

First, to define the creation, update and delete workflows, we have to define what these operations do in our consistency model and what their boundaries are.

Creating replicated resources under eventual consistency implies that all replicas are identical at creation and will eventually be created. Updating resources created with replication under eventual consistency implies that all replicas will eventually be updated, whether or not the user specifies a scope in the request. The same holds for deletes.

The operation obviously begins when the user makes the request. For its end, however, we could consider that an operation ends either when a first response is returned to the user, or when the operation has been executed on every site. In an eventual consistency model, the latter can come much later than the first response. It is important to know what happens in case of failure (partition, disconnection, server failure) during the execution until the first response, but also afterwards, because the operation must be executed on all replicas at some point.

Creation The replication process to create a resource on App1 and App2 happens as follows:

1. A request for replication is addressed to the endpoint of a service of one application instance. For example in Fig. 3: $\bullet \xrightarrow{t:App_1\&App_2} t.g$, where g is the endpoint for the creation of the resource managed by the service t.

2. The scope is extracted in the Cheops agent and the R function (from scope-lang) is used to resolve the endpoints that will store replicas. In Fig. 3: R[[t:App1&App2]] is equivalent to R[[t:App1]] and R[[t:App2]]. Consequently, t1 and t2 will be used for the resource creation.

3. The meta-identifier is generated and the replicant is created using the meta-identifier and the locations of execution. For example, if the generation yielded 72, we have: {72 : [App1 : none, App2 : none]}.

4. Each request is forwarded to the corresponding Cheops agent on the involved sites and, simultaneously, a copy of the replicant is stored in the database on those sites. In Fig. 3: geo_t forwards the request to t1.g and t2.g and stores the replicant {72 : [App1 : none, App2 : none]} in the App1 and App2 databases. In the figure, this is represented by the c1 arrows going to the cylinders. The replicant on App1 (where the request was made) becomes the leader of the replicants. A log is created for future operations on replicas.

5. Each contacted service instance executes the request and returns the results to its local Cheops agent, which updates the replicant with the local identifier. In Fig. 3: t1 and t2 return their local identifiers, e.g., 6 and 42 respectively.

6. Cheops agents then propagate the updated information to the other agents involved. In parallel, they send the entire response to the Cheops agent that stores the leader replicant. In Fig. 3: the replicant is now {72 : [App1 : 6, App2 : 42]} in the databases of the App1 and App2 sites, thanks to the updates represented by the c1 and c2 arrows.

7. When the agent hosting the leader receives the first creation response, it transfers it to the user who asked for the replication, replacing the local ID with the replicant meta ID.
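The creation steps above can be sketched as the following single-process Python simulation. It is only an illustration under strong assumptions: the function `replicate_create`, the dictionaries standing in for the per-site databases, and the callables standing in for the service instances are hypothetical, forwarding is sequential rather than parallel, and neither the leader log nor failures are modelled.

```python
from typing import Any, Callable, Dict, List, Optional

# One replicant store per site: site -> {meta_id -> {site -> local_id}}
Databases = Dict[str, Dict[int, Dict[str, Optional[int]]]]


def replicate_create(sites: List[str],
                     create_on_site: Dict[str, Callable[[], Dict[str, Any]]],
                     databases: Databases,
                     new_meta_id: Callable[[], int]) -> Dict[str, Any]:
    """Simulate steps 3 to 7 of the creation workflow."""
    # Step 3: generate the meta-identifier and the empty replicant.
    meta_id = new_meta_id()
    # Step 4: store a copy of the replicant on every involved site.
    for site in sites:
        databases[site][meta_id] = {s: None for s in sites}
    first_response: Optional[Dict[str, Any]] = None
    for site in sites:
        # Step 5: the local service executes the request and returns
        # a response carrying the identifier it generated.
        response = create_on_site[site]()
        # Step 6: propagate the local identifier to every copy.
        for peer in sites:
            databases[peer][meta_id][site] = response["id"]
        if first_response is None:
            first_response = response
    if first_response is None:
        raise ValueError("no site given for the replication")
    # Step 7: answer with the meta-identifier instead of the local one.
    return {**first_response, "id": meta_id}
```

Called with two fake services returning the identifiers 6 and 42 and a generator yielding 72, the function leaves the mapping {App1: 6, App2: 42} under key 72 in both stores, as in the example of Fig. 3.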

Read The read process is straightforward: to access a specific resource, users must either be on a site that hosts one of its replicas or specify in the scope the location of the replica to read.
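The read rule can be illustrated with a short Python sketch. The function `resolve_read` and its parameters are hypothetical names chosen for this example; the replicant store is reduced to a dictionary from meta-identifier to the site → local identifier mapping.

```python
from typing import Dict, Optional, Tuple


def resolve_read(meta_id: int,
                 replicants: Dict[int, Dict[str, Optional[int]]],
                 local_site: str,
                 scope_site: Optional[str] = None) -> Tuple[str, int]:
    """Return the (site, local identifier) pair that serves a read.

    The location given in the scope takes precedence; otherwise the
    read is served by the site that received the request."""
    mapping = replicants[meta_id]
    target = scope_site if scope_site is not None else local_site
    local_id = mapping.get(target)
    if local_id is None:
        raise LookupError(f"no replica of {meta_id} available on {target}")
    return target, local_id
```

A user on a site without a replica who does not give a scope gets an error in this sketch, which corresponds to the requirement stated above.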

Update From now on, every update (or delete) request is filtered to check whether the given ID corresponds either to a replicant meta-identifier or to a local replica identifier. The process is quite similar to the creation, but it neither generates a new replicant nor changes an existing one. It only applies an update to the replicas.

1. A request for an update of a previously created replica is addressed to the endpoint of a service of one application instance.

2. Cheops checks whether the ID in the request exists in a replicant. If not, the request is sent back to the service to be executed. If it does, the request is transferred to the Cheops agent storing the replicant leader. This agent retrieves the corresponding replicant to find all the replicas (and thus sites) involved. The operation is stored in its log.

3. The request is copied as many times as necessary (with the corresponding local identifier) and sent to the Cheops agents of the involved sites.

4. Local Cheops agents send the request to the corresponding service on their site, which executes the request normally.

5. Each Cheops agent sends back the response to the Cheops agent hosting the replicant leader.

6. This agent sends back the response to the user, once again with the meta-identifier where the local identifier would be expected, to notify the user that the replicas were updated.
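Steps 2 and 3 of the update workflow can be sketched in Python as follows. The functions `find_replicant` and `plan_update`, the list used as log, and the shape of the log entries are assumptions for illustration only; the sketch does not model the transfer to the leader agent nor the actual HTTP requests.

```python
from typing import Any, Dict, List, Optional, Tuple

Replicants = Dict[int, Dict[str, Optional[int]]]


def find_replicant(identifier: int, replicants: Replicants) -> Optional[int]:
    """Return the meta-identifier of the replicant that contains the
    given ID, whether it is a meta-identifier or a local identifier."""
    for meta_id, mapping in replicants.items():
        if identifier == meta_id or identifier in mapping.values():
            return meta_id
    return None


def plan_update(identifier: int, payload: Any, replicants: Replicants,
                log: List[Tuple[int, str, Any]]
                ) -> Optional[Dict[str, Optional[int]]]:
    """Step 2: look the ID up and record the operation in the log.
    Step 3: return one (site -> local identifier) target per replica.
    None means the ID is unknown to Cheops and the request goes back
    to the local service unchanged."""
    meta_id = find_replicant(identifier, replicants)
    if meta_id is None:
        return None
    log.append((meta_id, "update", payload))
    return dict(replicants[meta_id])
```

Each entry of the returned mapping corresponds to one copy of the request, rewritten with the local identifier of the target site before being sent to that site's Cheops agent.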

Delete As for the update, a delete on replicas can be identified either by a local identifier or by the meta-identifier. The process is identical to that of the update.


3.4 Dealing with faults

We define a fault as: a partition of an involved site, or a failure from this site, whether it is shut down, out of order, or if the request cannot be executed for any reason (not enough memory to create a resource for example).

It is also important to mention that if the site where the user sent the request is faulty (does not work in any way), the request obviously cannot be executed. The user can then make the request to a more distant site.

Moreover, “during an operation” can refer to two distinct phases. As discussed before, the end of an operation can be seen either as the moment when a replica has been created/updated/deleted and the user has been notified, or as the moment when the operation has been applied to all replicas. “During an operation” thus spans from the user's request to one of these ends. In our consistency model, this makes no difference to the process.

If a site that is supposed to host a replica fails, the other Cheops agents are informed through its heartbeat (or rather the lack of it).

A site is considered to be eventually available again unless it is removed. If a site is removed from the system, every replicant that hosted a replica on this site must delete the site from its mapping.
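The two mechanisms described above, detection through missing heartbeats and cleanup of the mappings when a site is removed, can be sketched in Python. The function names and the timeout-based detection rule are assumptions for this example; the text does not specify how long an agent waits before considering a site unavailable.

```python
from typing import Dict, Optional, Set


def unavailable_sites(last_heartbeat: Dict[str, float],
                      now: float, timeout: float) -> Set[str]:
    """Sites whose last heartbeat is older than `timeout` seconds.
    Such sites are only considered temporarily unavailable."""
    return {site for site, seen in last_heartbeat.items()
            if now - seen > timeout}


def remove_site(replicants: Dict[int, Dict[str, Optional[int]]],
                site: str) -> None:
    """A site removed from the system is deleted from the mapping of
    every replicant that hosted a replica on it."""
    for mapping in replicants.values():
        mapping.pop(site, None)
```

Keeping the two functions separate reflects the distinction made in the text: a silent site stays in the mappings until it is explicitly removed.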

Faults during operations The operation will be applied eventually on all involved sites. This eventual consistency relies on a consensus protocol, in our case an implementation of Raft [6]. For example, the leader's log makes it possible to replay operations that have not yet been applied. It is the responsibility of the Cheops agent hosting the leader to ensure that operations are applied eventually.

Faults while there are replicas When a site hosting replicas fails without any particular operation running, no heartbeat is received by the other Cheops agents and the replica is considered temporarily unavailable.

If a site hosting a replica is partitioned but can still be used locally, only read queries can be made, and these reads might be stale. When the site rejoins the cluster, pending operations are applied on it thanks to the leader's log, so that it is up to date.
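The replay of pending operations on a rejoining site can be sketched as follows. This is not Raft: the class `LeaderLog` is a hypothetical, single-process stand-in that only illustrates the idea of tracking, per site, how many log entries have already been applied.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class LeaderLog:
    """Ordered operations kept by the agent hosting the leader."""
    entries: List[Any] = field(default_factory=list)
    # site -> number of log entries already applied on that site
    applied: Dict[str, int] = field(default_factory=dict)

    def append(self, operation: Any) -> None:
        self.entries.append(operation)

    def replay(self, site: str, apply: Callable[[Any], None]) -> int:
        """Apply, in order, every entry the site has not seen yet and
        return how many entries were replayed."""
        start = self.applied.get(site, 0)
        for index in range(start, len(self.entries)):
            apply(self.entries[index])
            # Advance after each success so that a failure in `apply`
            # leaves the remaining entries pending for the next replay.
            self.applied[site] = index + 1
        return len(self.entries) - start
```

A site that was partitioned during two updates and one delete would thus receive exactly these three operations, in order, when it rejoins.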

3.5 Proof of concept

Though Cheops is still a work in progress, we demonstrated the relevance of sharing a resource in a proof of concept (PoC) on OpenStack [2]. This PoC gives DevOps the ability to make multiple independent instances of OpenStack collaborative. Using our approach with OpenStack would make it possible to manage a geo-distributed infrastructure as a usual IaaS platform. This is a breakthrough, as several initiatives tried to propose a framework to manage edge infrastructures and processes [5, 9], but due to the difficulty of delivering software as complex and complete as OpenStack, their chance of adoption is definitely limited.


4 Conclusion

In this paper, we presented the replication mechanisms of Cheops and how scope-lang allows its parameterization. The ultimate goal of this project is to allow generic collaborations between multiple instances of the same application without applying intrusive changes to the business code. In particular, we presented the different workflows for the replication collaboration.

As future work, we identified other collaboration mechanisms that could be relevant; for example, our replication strategy could be extended to include a controller and propose an abstraction similar to the ReplicaSet and its controller in the Kubernetes ecosystem³. The point would be to add control loop capabilities to Cheops in order to maintain the desired number of replicas as the infrastructure changes. We could also propose different ways to keep replicas consistent, giving users more choice (e.g., the choice to change the location of a replica if its site fails).

Acknowledgment

We would like to thank Matthieu Juzdzewski and Arnaud Szymanek for their work on Cheops.

References

1. Akkoorath, D.D., et al.: Cure: Strong semantics meets high availability and low latency. In: 2016 IEEE 36th International Conference on Distributed Computing Systems (ICDCS). IEEE (2016)

2. Cherrueau, R.A., Delavergne, M., Lebre, A., Rojas Balderrama, J., Simonin, M.: Edge Computing Resource Management System: Two Years Later! Research Report RR-9336, Inria Rennes Bretagne Atlantique (2020)

3. Li, W., et al.: Service mesh: Challenges, state of the art, and future research opportunities. In: 2019 IEEE International Conference on Service-Oriented System Engineering (SOSE). pp. 122–1225 (2019)

4. Martin, B., Prosperi, L., Shapiro, M.: An environment for composable distributed computing. In: EuroDW 2020-14th EuroSys Doctoral Workshop (2020)

5. Mortazavi, S.H., Salehe, M., Gomes, C.S., Phillips, C., de Lara, E.: Cloudpath: A multi-tier cloud computing framework. In: Proceedings of the Second ACM/IEEE Symposium on Edge Computing. pp. 1–13 (2017)

6. Ongaro, D., Ousterhout, J.: In search of an understandable consensus algorithm. In: 2014 USENIX Annual Technical Conference (USENIX ATC 14) (2014)

7. Safina, L., Mazzara, M., Montesi, F., Rivera, V.: Data-driven workflows for microservices: Genericity in jolie. In: 2016 IEEE 30th International Conference on Advanced Information Networking and Applications (AINA). IEEE (2016)

8. Satyanarayanan, M.: The emergence of edge computing. Computer 50(1) (2017)

9. Wang, N., et al.: Enorm: A framework for edge node resource management. IEEE Transactions on Services Computing (2017)

10. Zhu, Y., Wang, Y.: Shaft: Supporting transactions with serializability and fault-tolerance in highly-available datastores. In: 2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS). pp. 717–724 (2015)

3 https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/