
3.4.2 Schemes we compare with

We refer to the proposed solution of OP 6 as Optimal.

• Greedy Aware: As stated at the end of 3.3.1, an important baseline algorithm for network-friendly recommendations is (OP 7 [35]), a position-aware scheme that takes into account known statistics about the user click rate with respect to the position at which the recommendations appear, but does not take into account that requests are sequential.

• CARS: The algorithm of [44], a position-unaware scheme proposed for sequential content requests, will serve as our second baseline. The CARS algorithm optimizes (with no guarantees) the recommendations for a user performing multiple sequential requests, but assumes that the user selects uniformly one of the recommendations regardless of the position in which they appear. Note that the objective of Eq.(4.11) assumes knowledge of v, while the algorithm of [44] is oblivious to it.

Note on CARS. In our framework, CARS translates to solving OP 4 for uniform v. The algorithm then returns N identical stochastic recommendation matrices, each weighted by the uniform click probability 1/N. Importantly, whichever v we choose, the parenthesis of Eq.(4.11) becomes (I − α·(v1·R + … + vN·R)) = (I − α·R), since v1 + … + vN = 1. This explains why the hit rate of CARS in the plots remains constant regardless of the click distribution v.
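A quick numeric sanity check of this identity (with an arbitrary row-stochastic R; the values of K, N and α below are illustrative, not from the actual experiments):

```python
import numpy as np

# When CARS returns the same stochastic matrix R for all N positions, the
# parenthesis of Eq.(4.11) collapses to (I - a*R) for ANY click
# distribution v, because the entries of v sum to 1.
K, a = 4, 0.7
rng = np.random.default_rng(0)
R = rng.random((K, K))
R /= R.sum(axis=1, keepdims=True)      # row-stochastic recommendation matrix

I = np.eye(K)
for v in ([1/3, 1/3, 1/3], [0.8, 0.15, 0.05]):   # two click distributions
    M = I - a * sum(vi * R for vi in v)          # parenthesis of Eq.(4.11)
    assert np.allclose(M, I - a * R)             # identical regardless of v
```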

3.4.3 Datasets

Below is a list of the datasets used and a brief explanation of how we collected them.

YouTube FR. (K = 1054) We used the crawling methodology of [70] and collected a dataset from YouTube in France. We considered 11 of the most popular videos on a given day, and did a breadth-first search (up to depth 2) on the lists of related videos (max 50 per video) offered by the YouTube API. We built the binary matrix U ∈ {0,1} from the collected video relations.
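The crawling step above can be sketched as a standard breadth-first search; `related` here is a hypothetical stand-in for the YouTube API call that returns a related-video list:

```python
from collections import deque

def crawl(seeds, related, max_depth=2, max_related=50):
    """BFS over related-video lists, as in the YouTube FR dataset.
    `related(video_id)` is a hypothetical callable standing in for the
    YouTube API; each returned pair (i, j) corresponds to u_ij = 1 in U."""
    edges, seen = set(), set(seeds)
    queue = deque((s, 0) for s in seeds)
    while queue:
        vid, depth = queue.popleft()
        if depth == max_depth:            # crawl only up to depth 2
            continue
        for r in related(vid)[:max_related]:
            edges.add((vid, r))
            if r not in seen:
                seen.add(r)
                queue.append((r, depth + 1))
    return edges, seen
```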

last.fm. (K = 757) We considered a dataset from the last.fm database [71]. We applied the “getSimilar” method to the content IDs to fill the entries of the matrix U with similarity scores in [0,1]. We then set scores above 0.1 to uij = 1 to obtain a dense U matrix.

MovieLens. (K = 1066) We consider the MovieLens movie-rating dataset [72], containing 69162 ratings (0 to 5 stars) of 671 users for 9066 movies. We apply item-to-item collaborative filtering (using the 10 most similar items) to fill in the missing user ratings, and then use the cosine similarity (∈ [−1,1]) of each pair of contents based on their common ratings. We set uij = 1 for contents with cosine similarity larger than 0.6.
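The final thresholding step can be sketched as follows (a minimal version, assuming a complete rating matrix after the collaborative-filtering fill-in; the threshold 0.6 is the one used in the text, everything else is illustrative):

```python
import numpy as np

def build_U(ratings, thresh=0.6):
    """Binary relation matrix U from item-to-item cosine similarity.
    ratings: (num_users, num_items) array with items as columns."""
    norms = np.linalg.norm(ratings, axis=0, keepdims=True)   # (1, K)
    cos = (ratings.T @ ratings) / (norms.T @ norms)          # item-item cosine
    U = (cos > thresh).astype(int)
    np.fill_diagonal(U, 0)                                   # no self-relations
    return U
```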

Table 3.2 – Parameters of the simulation

Dataset      q %   zipf(s)   α     N   MPH %
MovieLens    80    0.8       0.7   2   23.26
YouTube FR   95    0.6       0.8   2   12.17
last.fm      80    0.6       0.7   3   11.74


Figure 3.4 – Absolute Cache Hit Rate Performance vs Hv (C/K≈1.00%) - MovieLens.

3.4.4 Results

Optimal vs CARS. We initially focus on answering a basic question: Is the non-uniformity of users’ preferences over positions helpful or harmful for a network-friendly recommender? In Figs. 3.4, 3.5, 3.6 (see Table 3.2 for simulation parameters), we assume behaviors of increasing entropy, starting from users who show a preference for the higher positions of the list (low entropy), up to users who select recommendations uniformly (maximum entropy). In our simulations, we used a zipf distribution [15] over the N positions; by decreasing its exponent, the entropy on the x-axis increases. As an example, in Fig. 3.4, the lowest Hv corresponds to a vector of probabilities v = [0.8, 0.2] (recall that N = 2), while the highest one on the same plot to v = [0.58, 0.42].
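A minimal sketch of how such a position distribution and its normalized entropy can be generated; the exponent s = 2 below is inferred from the reported v = [0.8, 0.2], not stated in the text:

```python
import numpy as np

def zipf_v(N, s):
    """Click distribution over N positions: v_i proportional to 1/i^s.
    Decreasing s pushes v toward uniform and increases the entropy."""
    w = 1.0 / np.arange(1, N + 1) ** s
    return w / w.sum()

def entropy(v):
    """Entropy of v, normalized by log2(N) so it lies in [0, 1]."""
    return -np.sum(v * np.log2(v)) / np.log2(len(v))

v_low = zipf_v(2, 2.0)    # [0.8, 0.2]: the low-entropy case of Fig. 3.4
```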

Observation 1. Our first observation is that the lower the entropy, the higher the optimal result. In the extreme case where Hv → 0 (virtually, this would mean N = 1: the user clicks deterministically), the optimal hit rate becomes maximum. This can be validated in Fig. 3.11, where the hit rate decreases for increasing entropy and its maximum is attained for N = 1.


Figure 3.5 – Absolute Cache Hit Rate Performance vs Hv (C/K ≈ 1.00%) - YouTube France.


Figure 3.6 – Relative Cache Hit Rate Performance vs Hv (C/K≈1.00%) - All Datasets.


Figure 3.7 – Absolute Cache Hit Rate Performance vs Hv (C/K≈1.00%) - MovieLens.


Figure 3.8 – Absolute Cache Hit Rate Performance vs Hv (C/K ≈ 1.00%) - YouTube France.


Figure 3.9 – Relative Cache Hit Rate Performance vs Hv (C/K≈1.00%) - All Datasets.


Figure 3.11 – Cache Hit Rate vs N (C/K ≈1.00%, α= 0.7)

Optimal vs Greedy. The second question we study is: How would a simpler greedy/myopic, yet position-aware, algorithm fare against our proposed method? Fundamentally, the Greedy algorithm solves a less constrained problem than OP 4, and is therefore a more lightweight option in terms of execution time. However, the merits of the proposed optimal method are noticeable in Figs. 3.7, 3.8, 3.9 (parameters in Table 3.2). In all three datasets, we see an impressive improvement, between 20% and 60%.
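A plausible sketch of such a myopic, position-aware rule (an assumption about Greedy's general structure, not the exact OP 7 formulation): rank each content's related items by a one-step preference for cached contents, and fill the positions in order of decreasing click probability:

```python
import numpy as np

def greedy_recommend(U, cached, v):
    """Myopic position-aware sketch (hypothetical, not the exact OP 7 rule).
    U: (K, K) binary relation matrix; cached: length-K booleans;
    v: click probabilities over the N positions. Returns a (K, N) array of
    recommended content IDs per content (-1 = no recommendation)."""
    K, N = U.shape[0], len(v)
    rec = np.full((K, N), -1)
    order = np.argsort(-np.asarray(v))        # best (most-clicked) slot first
    for i in range(K):
        related = np.flatnonzero(U[i])
        # one-step score: cached related items first (stable sort)
        ranked = sorted(related, key=lambda j: not cached[j])
        for slot, j in zip(order, ranked[:N]):
            rec[i, slot] = j
    return rec
```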

Observation 2. The constant relative gain between the two position-aware algorithms hints that, as the entropy increases, both seem to do the right placement in the positions. However, as Greedy decides with a short horizon, it cannot build the correct long paths that lead to higher gains in the subsequent requests (clicks) of the user.

Lastly, we investigate the sensitivity of the three methods to the number of recommendations (N). In Fig. 3.11, we present the CHR curves of all three schemes for increasing N, keeping the distribution v ∼ zipf(0.9) constant. As expected, for N = 1 (e.g., the YouTube autoplay scenario) CARS and the proposed scheme coincide, as there is no flexibility with only one recommendation. However, as N increases, CARS and Greedy decay at a much faster pace than the proposed scheme, which is more resilient to the increase of N. This leads to the following observation.

Observation 3. For large N, CARS may offer the “correct” recommendations (cached, related, or both), but it cannot place them in the right positions, as there are now too many available spots. In contrast, our algorithm Optimal recommends the “correct” contents and places them in the “correct” positions. Fig. 3.10 further strengthens Observation 3; its key conclusion is that with high enough β (i.e., low Hv) and more than 2 or 3 recommendations, while CARS aims to solve the multiple-access problem, its unawareness of position preferences leads to suboptimal recommendation placement, and thus to a severe drop in its CHR performance compared to Optimal.

The Random Session Case

4.1 Introduction

In this chapter we dive even deeper into the NFR problem. In the previous two chapters, we formulated the problem borrowing tools from convex optimization and were able to

• Present a heuristic ADMM solution which performed quite well in practice.

• Transform the long session NFR problem to an LP with hard constraints on the user satisfaction.

• Incorporate the position preferences of the users using a basic stochastic model and solve this optimally as well through the LP.

As we have stressed in the beginning of this manuscript, the solution under investigation in the current thesis is essentially a software solution, and as such it must be able to run in reasonable computational time with good performance guarantees. Although LP solvers such as ILOG CPLEX are extremely efficient and perform very well, they can still suffer when the number of variables and constraints becomes very large. We remind the reader that in the previous two chapters, the long session was approximated by an asymptotically infinite-length session. In the MDP formulation we resolve this issue, as we now optimize the cost for a session of average length L̄ (measured in contents). In terms of computational efficiency, the MDP has two clear advantages: (a) by assuming some average session length, it avoids unnecessary computations, and (b) through the DP approach, it breaks the problem down into easier subproblems; thus we managed to

• Achieve improved runtimes while maintaining optimality guarantees.
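The computational intuition behind this DP decomposition can be illustrated with generic value iteration: if a session continues with probability λ after each request, its length is geometric with mean L̄ = 1/(1 − λ), and each Bellman backup solves a one-step subproblem given the next-step values. All symbols below are illustrative, not the exact formulation of this chapter:

```python
import numpy as np

def value_iteration(P, c, lam, tol=1e-8):
    """Generic discounted value iteration. P: (A, S, S) transition matrices
    per action, c: (A, S) per-step costs, lam: continuation probability,
    so the session length is geometric with mean 1/(1 - lam)."""
    S = P.shape[1]
    V = np.zeros(S)
    while True:
        Q = c + lam * (P @ V)               # one Bellman backup per (a, s)
        V_new = Q.min(axis=0)               # greedy choice over actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmin(axis=0)  # values and greedy policy
        V = V_new
```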

More importantly, a major weakness of our previous works is that we considered users whose click-through probability is fixed and independent of the quality of the recommendation policy. However, using the MDP as our main workhorse allows for some very interesting modeling extensions along this dimension. Thus, a major result of this chapter is that we could finally

• Explore the tradeoffs of long-session NFR under users who can be reactive to the quality they receive from the recommender, while maintaining the optimality guarantees of our policies.

4.2 Problem Setup