

In the document Fundamental limits of shared-cache networks (Page 37-43)

2.3 Thesis Outline and Summary of the Main Contributions

2.3.3 Coded Caching with Heterogeneous Quality-of-Service Requirements

depending on the type of device, type of subscription (e.g. base or premium), channel conditions and so on. That is why studying a caching model that captures the users' heterogeneity in terms of their required QoS is an important and pertinent problem.

For the standard K-user shared-link MAN setting [2] discussed in Section 1.2.1, we now use a layered coding approach to encode each file into H layers. While the first layer provides the video stream at base quality, the subsequent layers successively refine this quality. When a user requests a file at QoS level h ∈ [H], this means that this user must be successfully delivered the first h layers (out of H) of the requested file. The number of users requesting QoS h ∈ [H] is denoted by K_h, and we refer to K = (K_1, K_2, . . . , K_H) as the QoS profile. Moreover, assuming as always that each file has a normalized size of one unit, we will use r_h to denote the size of the first h layers of a file, which naturally yields r_H = 1. For this setting, assuming that the vector K is known during caching, we propose a general placement and delivery scheme that achieves

T(t, K) = …   (2.10)

We will prove that by minimizing (2.10) with respect to {x_{g,h}} and subject to the constraints in equations (2.11), (2.12), (2.13), we obtain the information-theoretically optimal normalized delivery time under the assumptions that the cache placement is uncoded and aware of the QoS profile K. An intriguing aspect of this result is the ability of the developed tight converse bound to provide useful information on how to design the optimal caching scheme. A sub-optimal choice of {x_{g,h}} which does not require knowledge of K during caching is also presented in our work. These results are fully described in Chapter 7 and resulted in the following publication:

[50] E. Parrinello, A. Ünsal, and P. Elia, "Optimal coded caching under statistical QoS information," in 2019 IEEE International Symposium on Information Theory (ISIT), 2019, pp. 2987–2991.
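To make the layered-QoS bookkeeping above concrete, the following minimal Python sketch computes the total volume of data that must reach the users in this setting, before any caching or coding gain. The specific layer sizes r_h and QoS profile K below are illustrative assumptions, not values from the thesis.

```python
# Illustrative layered-QoS bookkeeping (all numbers are assumptions).
H = 3                     # number of quality layers per file
r = [0.5, 0.8, 1.0]       # r[h-1] = size of the first h layers; r_H = 1
K_profile = [4, 3, 2]     # K_h = number of users requesting QoS level h

assert abs(r[-1] - 1.0) < 1e-12                      # full file has unit size
assert all(r[i] < r[i + 1] for i in range(H - 1))    # layers refine quality

# Total volume (in units of file) demanded across users: a QoS-h user
# must receive the first h layers, of combined size r_h.
total_demand = sum(K_profile[h] * r[h] for h in range(H))
print(round(total_demand, 6))   # 4*0.5 + 3*0.8 + 2*1.0 = 6.4
```

The scheme in Chapter 7 then determines how much of this demand can be offloaded to the caches via the parameters {x_{g,h}}.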

2.3.4 Heterogeneous Coded Distributed Computing

In the context of distributed computing, one of the most well-known frameworks is the so-called MapReduce model, which decomposes the computation into three distinct phases:

the map phase, the shuffle phase and the reduce phase. As argued in [16] and references therein, the shuffle phase — which consists of an exchange of messages between the computing nodes — constitutes the major bottleneck that prevents distributed computing from scaling with the number of nodes. The work in [16] proposed a new variant of the MapReduce framework, called Coded MapReduce, which employs coded caching in the shuffling phase in order to reduce the inter-node communication load. For a MapReduce-type distributed computing system with Λ nodes that have to work on a

dataset of N files to obtain K output functions4, Li et al. [16] showed that the optimal communication load, i.e. the minimum number of bits communicated by the nodes normalized by the total number of bits required by all the nodes, is given by

τ_CMR(γ, Λ) = (1/Λ) · Λ(1 − γ)/(Λγ) = (1 − γ)/(Λγ),   (2.14)

where γ is the fraction of the dataset that is stored by each computing node before the beginning of the map phase. The shuffling scheme that achieves this optimal load employs the same technique developed in [15] for the cache-aided device-to-device (D2D) network.
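For intuition, the load in (2.14) is easy to evaluate numerically; the values of Λ and γ below are illustrative assumptions.

```python
# Numerical look at the coded-MapReduce load in (2.14).
def tau_cmr(gamma: float, Lam: int) -> float:
    """Optimal communication load (1 - gamma) / (Lambda * gamma)."""
    return (1 - gamma) / (Lam * gamma)

Lam = 10
for gamma in (0.1, 0.2, 0.5):
    # Uncoded shuffling needs a load of (1 - gamma); coding divides that
    # by the computation load Lambda * gamma.
    print(gamma, tau_cmr(gamma, Lam))
```

With γ = 0.5 and Λ = 10, for example, the coded load is 0.1, a 5x reduction over the uncoded load of 0.5.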

Motivated by the fact that it is common to have computing nodes with different storage and processing capabilities (as is commonly the case with Amazon EC2 clusters), in Chapter 8 we address the problem of heterogeneous distributed computing. We will assume that each computing node λ has computational power characterized by the constant c_λ (0 ≤ c_λ ≤ 1), and we denote by L_λ the number of output functions that are assigned to it. We will see that, for each computing node λ, an intuitive yet powerful approach is to select the value L_λ to be proportional to the computational power c_λ of the node, so that the time that the system spends in the reduce phase is reduced compared to a uniform assignment of the output functions among the nodes.
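The proportional assignment can be sketched as follows; since the ideal shares K·c_λ/Σc are generally not integers, some rounding rule is needed, and the largest-remainders rule used here is our own assumption for the illustration, not necessarily the one adopted in the thesis.

```python
# Sketch: assign K output functions to nodes in proportion to their
# computational powers c_lambda (rounding rule is an assumption).
def assign_outputs(K: int, c: list) -> list:
    """Return L = (L_1, ..., L_Lambda) with sum(L) == K, L_i ~ K*c_i/sum(c)."""
    shares = [K * cl / sum(c) for cl in c]       # ideal fractional shares
    L = [int(s) for s in shares]                 # floor of each share
    # Hand the leftover functions to the largest fractional remainders.
    order = sorted(range(len(c)), key=lambda i: shares[i] - L[i], reverse=True)
    for i in order[: K - sum(L)]:
        L[i] += 1
    return L

# Three nodes with powers 0.2, 0.3, 0.5 splitting K = 12 output functions.
print(assign_outputs(12, [0.2, 0.3, 0.5]))   # [2, 4, 6]
```

A node with half the total power thus reduces half the output functions, which balances the per-node reduce time.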

With the vector L = (L_1, L_2, . . . , L_Λ) at hand, and assuming that we have the freedom to replicate the dataset t (referred to as the total computation load) times across the nodes, we will show that an achievable communication load is

τ(t, L) = ( Σ_{λ=1}^{Λ} L_λ(1 − γ_λ) ) / (Kt),   t ∈ [Λ],   (2.15)

where γ_λ has been defined in (2.2) and represents the fraction of the dataset that has to be stored in node λ. We will then prove that, for a fixed L and a fixed total computation load t, the above communication load is within a multiplicative gap of 2 from optimal under the assumptions of one-shot shuffling and homogeneous file assignment. A shuffling scheme is said to be "one-shot" if any node can recover each of its required data (intermediate values) from the intermediate values computed in the map phase and from at most one message transmitted by some other node. Furthermore, we say that a file assignment is homogeneous when all the files of the dataset are replicated among the nodes the same number of times. The achievable scheme and the technique used for the converse bound draw from those derived for the topology-aware shared-cache setting in Chapter 4, thus also showing the versatility of the techniques developed in this thesis. It is interesting to see that the allocation {γ_λ}_{λ=1}^{Λ} of the total computation load t across the nodes as a function of L is crucial in allowing a shuffle phase that serves t computing nodes at a time. Furthermore, under the aforementioned assumption of homogeneous file assignment, we will prove that the expression in (2.15) serves as a lower bound for the optimal communication load of the heterogeneous coded distributed computing problem with fixed arbitrary output function assignment L and

4 The choice of using the symbol K for the total number of output functions is motivated by the fact that, as will become clear in the corresponding chapter, the value of K is reminiscent of the total number of requested files in a shared-cache setting with Λ caches.

fixed arbitrary per-node computation loads {γ_λ}_{λ=1}^{Λ} (γ_λ ∈ R and 0 ≤ γ_λ ≤ 1) satisfying Σ_{λ=1}^{Λ} γ_λ = t. In conclusion, the proposed shuffle scheme exploits the heterogeneity of the output function assignment L by properly allocating the total computation load t across the computing nodes, thus allowing for a reduced communication load compared to the standard coded MapReduce communication load in (2.14). These results are presented in Chapter 8 and will also be part of the following publication:

[51] E. Parrinello and P. Elia, “New optimality results for heterogeneous coded distributed computing,” manuscript in preparation, 2021.
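As a sanity check, the heterogeneous load in (2.15) should reduce to the homogeneous load in (2.14) when all nodes are identical. The sketch below verifies this numerically; the per-node allocation γ_λ = t·L_λ/K used here is an assumption for the illustration (the thesis derives the γ_λ from (2.2)).

```python
# Evaluate the achievable load (2.15) under an assumed proportional
# allocation gamma_lambda = t * L_lambda / K (which sums to t).
def tau_het(t: int, L: list) -> float:
    K = sum(L)                                   # total output functions
    gammas = [t * Ll / K for Ll in L]            # assumed per-node loads
    return sum(Ll * (1 - g) for Ll, g in zip(L, gammas)) / (K * t)

# Homogeneous case: equal L_lambda must recover (2.14) with gamma = t/Lambda.
Lam, t = 10, 2
L_hom = [3] * Lam
gamma = t / Lam
print(abs(tau_het(t, L_hom) - (1 - gamma) / (Lam * gamma)) < 1e-12)  # True
```

In the heterogeneous case, skewing L while keeping Σ_λ γ_λ = t is exactly what lets the shuffle phase keep serving t nodes at a time.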

Topology-Agnostic Shared-Cache Networks

In this chapter we explore the fundamental limits of coded caching in the setting where a transmitter with potentially multiple (N0) antennas serves different users that are assisted by a smaller number of caches. Under the assumption of uncoded cache placement and that the topology of the network is not known during this cache placement phase, we derive the exact optimal worst-case normalized delivery time and DoF, for a broad range of cache occupancy vectors where each such occupancy vector describes how many users are helped by each cache. This is achieved by presenting an information-theoretic converse based on index coding that succinctly captures the impact of the cache occupancy vector, as well as by presenting a coded caching scheme that optimally adapts to the heterogeneity of the topology by exploiting the benefits of encoding across users that share the same cache.

3.1 System Model and Problem Formulation

We consider a shared-link configuration with a transmitting server having N_0 transmitting antennas and access to a library of N files W^(1), W^(2), . . . , W^(N), each of size equal to one unit of 'file', where this transmitter is connected to K receiving users and to Λ ≤ K helper nodes that will serve as caches which store content from the library. The communication process is split into the cache placement phase, the user-to-cache assignment phase, and the delivery phase.

Cache placement phase During this phase, helper nodes (caches) store content from the library without having knowledge of the users' requests. Each helper cache has size M ≤ N (γ ≜ M/N) units of file, and no coding is applied to the content stored at the helper caches; this corresponds to the common case of uncoded cache placement. We will denote by Z_λ the content stored by helper node λ during this phase, such that the vector Z = (Z_1, Z_2, . . . , Z_Λ) represents the overall cache placement. We assume that the cache placement algorithm is oblivious of the subsequent user-to-cache association, i.e. during the cache placement phase we have no knowledge of the number of users that will be assisted by each cache. In other words, during the cache placement phase the topology of the network is not yet known. This condition will be removed in the next chapter, where we will assume that the memory can be allocated to each cache as a function of the topology.
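A minimal sketch of the standard MAN-style uncoded placement over the Λ helper caches can make this phase concrete; the concrete values of Λ and t = Λγ and the file names below are illustrative assumptions.

```python
# MAN-style uncoded placement sketch: split each file into C(Lambda, t)
# equal subfiles, indexed by t-subsets of caches; cache l stores every
# subfile whose index set contains l.
from itertools import combinations

def man_placement(Lam: int, t: int, files: list) -> dict:
    caches = {l: [] for l in range(Lam)}
    for f in files:
        for subset in combinations(range(Lam), t):
            for l in subset:
                caches[l].append((f, subset))
    return caches

Lam, t = 4, 2                    # t = Lam * gamma, i.e. gamma = M/N = 1/2
caches = man_placement(Lam, t, files=["W1", "W2"])
# Each cache stores C(Lam-1, t-1) = 3 of the C(Lam, t) = 6 subfiles of
# every file, i.e. a fraction t/Lam = gamma of the library.
print(len(caches[0]))   # 2 files * 3 subfiles = 6
```

Crucially, this placement uses no information about how many users will later attach to each cache, matching the topology-agnostic assumption above.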

User-to-cache association phase After the caches are filled, each user is assigned to exactly one helper node/cache, from which it can download content at zero cost. Specifically, each cache λ = 1, 2, . . . , Λ is assigned a set of users U_λ, and all these disjoint sets

U ≜ {U_1, U_2, . . . , U_Λ}

form a partition of the set of users {1, 2, . . . , K}, describing the overall association of the users to the caches. This cache assignment is independent of the cached content and independent of the file requests to follow. We here consider any arbitrary user-to-cache association U, thus allowing the results to reflect both an ability to choose/design the association and possible association restrictions due to randomness or topology. Similarly, having the user-to-cache association be independent of the requested files is meant to reflect the fact that such associations may not be able to vary as quickly as a user changes the requested content.

Delivery phase The delivery phase starts when each user k = 1, . . . , K requests from the transmitter any one file W^(d_k), d_k ∈ {1, . . . , N}, out of the N files of the library. Upon notification of the entire demand vector d = (d_1, d_2, . . . , d_K) ∈ [N]^K, the transmitter aims to deliver the requested files, and the objective is to design a caching scheme and a delivery scheme that do so within a limited (delivery phase) duration T, where the delivery algorithm has full knowledge of the user-to-cache association U. For each transmission, the received signal at user k takes the form

y_k = h_k^T x + w_k,   k = 1, . . . , K,   (3.1)

where x ∈ C^{N_0×1} denotes the transmitted vector satisfying a power constraint E(||x||^2) ≤ P_T, h_k ∈ C^{N_0×1} denotes the channel vector of receiver k, and w_k represents the unit-power additive white Gaussian noise (AWGN) at user k. We will assume that the maximum allowed average power P_T is high (i.e., we will assume a high signal-to-noise ratio (SNR)), that perfect channel state information is available at the active users, that fading is statistically symmetric, and that each link (one antenna to one receiver) has ergodic capacity log(SNR) + o(log(SNR)).

Observation 1. We notice that in the high-SNR regime of interest, when N_0 = 1 and Λ = K (where each cache is associated to one user), the setting matches identically the original single-stream setting in [2]. In particular, the file size and log(SNR) are here scaled so that, as in [2], each point-to-point link has a capacity of 1 file per unit of time. When N_0 > 1 and Λ = K (again, each cache is associated to one user), the setting matches the multi-server setting of [7], which we now explore in the presence of fewer caches serving potentially different numbers of users.


Figure 3.1 – Schematic of the shared-link BC with shared caches (left) and a pictorial representation of the multi-antenna BC with shared caches (right).

Performance measure As one can imagine, some user-to-cache associations U may allow for higher performance than others; for instance, given that U is not known during the placement phase, one may suspect that a uniform distribution of the users among the caches is preferable. Part of the objective of this work is to explore the effect of such associations on the overall performance. Toward this, for any given U, we define the cache occupancy vector as the sorted tuple

L = (L_1, . . . , L_Λ),

where L_λ is the number of users assigned to the λ-th most populated helper node/cache1. Naturally, Σ_{λ=1}^{Λ} L_λ = K. Each vector L defines a class U_L comprising all the user-to-cache associations U that correspond to the same cache occupancy vector L.
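The association bookkeeping above can be sketched in a few lines of Python: build a partition U of the users, check that it is indeed a partition of {1, . . . , K}, and read off its sorted cache occupancy vector L. The specific association below is an illustrative assumption.

```python
# User-to-cache association sketch (the partition U is an assumption).
K, Lam = 7, 3
U = {1: {1, 2, 3, 4}, 2: {5, 6}, 3: {7}}   # U_lambda for lambda = 1..Lam

assert len(U) == Lam
# U must partition {1, ..., K}: covers all users, with no overlaps.
assert set().union(*U.values()) == set(range(1, K + 1))
assert sum(len(u) for u in U.values()) == K

# Cache occupancy vector: user counts sorted from most to least populated.
L = tuple(sorted((len(u) for u in U.values()), reverse=True))
print(L)   # (4, 2, 1)
```

Every association whose per-cache user counts sort to the same tuple, e.g. U' = {U_1 = {7}, U_2 = {1, 2, 3, 4}, U_3 = {5, 6}}, belongs to the same class U_L.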

As in [2], the measure of interest T is the number of time slots needed to complete the delivery of any request vector2 d. For a given fixed Z, we use T(U, d, Z) to denote the optimal normalized delivery time required to satisfy demand d in the presence of a user-to-cache association described by U. The optimal worst-case delivery time for a given cache occupancy vector L can be formally stated as

T(L) ≜ min_Z max_{(U,d) ∈ (U_L, {1,...,N}^K)} T(U, d, Z).   (3.2)

Our interest is in the regime of N ≥ K, where there are more files than users.
