Clicks Pattern Analysis for Online News Recommendation Systems


Jing Yuan¹, Andreas Lommatzsch¹, Benjamin Kille¹

¹ DAI-Labor, Technische Universität Berlin, Germany {jing.yuan, andreas.lommatzsch, benjamin.kille}@dai-labor.de

Abstract. The NewsREEL challenge gives researchers an opportunity to evaluate their news recommendation algorithms live, based on real users’ feedback. Since 2014, participants have evaluated a variety of approaches on the Open Recommendation Platform (ORP), yet popularity-based algorithms remain the most successful ones. In this working note, we chronologically describe our participation in the NewsREEL online task in 2016. With the approaches “most impressed”, “newest”, “most impressed by category”, “content similar”, and “most clicked”, we reconfirm that content relevance is not a very good indicator for recommending news. Meanwhile, for the dominating portal Sport1, extrapolating the time series of impressions and clicks enables us to predict the items most likely to be clicked in the next hours. A sample analysis of one week of data shows that an item remains popular for much longer than we expected. We therefore propose that recommenders designed for this contest pay more attention to the time series patterns of clicks and impressions.

1 Introduction

News, as important media content, still plays its role in guiding public opinion, even in a modern world full of virtual social networks and personal ideas.

Many news providers employ recommender systems and similar personalization techniques to assist users in finding relevant news quickly and conveniently.

Recommendations have been successfully incorporated into news publishing in several ways in the current digital news market. We exemplify three ways in which recommendations are pushed to news consumers. First, as an e-Magazine Provider, Flipboard aggregates news content from different third-party providers and selects news relevant to a user’s pre-defined topics to form a personalized news board. Second, some Content Providers, such as ByteDance, generate content themselves. They recommend within a closed system based on internal users, news, and the interactions between them. Third, Recommendation Providers, e.g. plista and outbrain, offer recommendation services for different kinds of websites, including news websites. Table 1 compares the three mainstream kinds of news recommenders with respect to whether they generate content themselves, the stability of their users, and the stability of their range of news items. As a representative of Recommendation Providers, plista faces non-trivial conditions in terms of the variety of news portals and differences in users’ expectations. Since the NewsREEL competition receives its data stream from plista, participants have to cope with all these knotty conditions to win the contest [8].

Table 1: Characteristics of Main-stream News Recommenders

News Recommender                        | Generating Content | Stability of Users | Stable Range of News
e-Magazine Provider (e.g. Flipboard)    | ✗                  | ✓                  | ✗
News Content Provider (e.g. ByteDance)  | ✓                  | ✓                  | ✓
Recommendation Provider (e.g. plista)   | ✗                  | ✗                  | ✗

The NewsREEL challenge 2016 gives participants the chance to evaluate recommender algorithms with live online user feedback [4, 3]. In the challenge, teams registered on the Open Recommendation Platform (ORP) receive streamed messages describing published news articles, users’ impressions and clicks on items, as well as recommendation requests from plista. The challenging aspects of participating in NewsREEL include: (1) recommendations must be provided within 100 ms of a request; (2) participants need to deal with news portals from different domains; (3) the user group of a specific portal changes over time; (4) the number of messages varies widely among portals [8, 5].

In contrast to recommending movies or music, news items continuously emerge and become outdated constituting a dynamic environment. This makes the NewsREEL competition particularly challenging. Algorithms have to consider these dynamics in news articles and users’ preferences. We focused on popularity and freshness to cope with the dynamics following the notion that users prefer important and recent news over insignificant and outdated articles. The success of the “most clicked” strategy in terms of CTR further supports this notion.

Even though the method is rather simple, it captures crucial aspects. Visualizing clicks on items over time, we observe continued click activity stretching over several hours for popular items. We compared the contents of popular items and discovered that they overlap. Still, content-based algorithms have failed to benefit from these overlaps in previous editions of NewsREEL.

The remainder of this paper is structured as follows. In Section 2, we briefly introduce the approaches we used in 2016 and discuss other algorithms developed in previous years. Subsequently, in Section 3, we analyze characteristic user-item interaction patterns for different news portals and find that the set of “most clicked” items has predictive power for its own future composition. Finally, Section 4 gives a conclusion and an outlook on future work.


2 Approach Used

In this section, we chronologically describe the approaches we deployed on ORP, i.e. in the online task of NewsREEL 2016, and how our thinking changed along the way. When the simplest approach, “most clicked”, finally outperformed the other algorithms, it motivated us to dig deeper into click patterns from the perspective of time series analysis in the next section.

Most Impressed. Inspired by the good performance of the “baseline” in past years (see [9]), which directly uses the most recently impressed items as recommendation candidates, we implemented a similar method that sorts the items in the 2000 most recent impressions by frequency. This approach is typically called “most popular”, but to distinguish it from “most clicked”, introduced later, we refer to it as “most impressed” in this paper. The approach ran on ORP for two weeks (January 31 to February 13, 2016) and achieved CTRs of 1.21% (ranked 3rd; team “artificial intelligence” took first place with a CTR of 1.48%) and 1.35% (ranked 2nd; team “abc” took first place with a CTR of 1.4%) in these two weeks, respectively.
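The windowed frequency count behind “most impressed” can be sketched as follows. This is a minimal illustration, not our deployed code: the class and method names are hypothetical, and excluding the currently read item is an assumption.

```python
from collections import Counter, deque


class MostImpressed:
    """Rank items by how often they occur in the most recent impressions."""

    def __init__(self, window_size=2000):
        # deque(maxlen=...) silently drops the oldest impression once full.
        self.window = deque(maxlen=window_size)

    def record_impression(self, item_id):
        self.window.append(item_id)

    def recommend(self, k=6, exclude=None):
        counts = Counter(self.window)
        # Assumption: the item currently being read is not recommended again.
        ranked = [item for item, _ in counts.most_common() if item != exclude]
        return ranked[:k]
```

Because the window is bounded by a number of impressions rather than by time, it covers a shorter period on busy portals than on quiet ones.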

Newest. Considering that freshness represents a vital aspect of news, we also implemented an approach, “newest”, which recommends the most recently created items from the same category as the currently visited item.

Given the good performance of “most impressed” mentioned above, we used it as a fallback when the request lacked an item id, i.e. when the category cannot be determined. In addition, for a recommendation request with 6 candidate slots, 3 positions are still filled by the “most impressed” approach. Therefore, this approach can be seen as a simple ensemble of “most impressed” and “newest”. With this solution, from 21–27 February, our team “news ctr” achieved a CTR of 1.19% (ranked 5th; team “artificial intelligence” took first place with a CTR of 1.45%) on the contest leader board.
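The slot-filling rule of this ensemble might look like the sketch below. The function name, the de-duplication of items, and the back-filling when fewer than three fresh items exist are assumptions made for illustration only.

```python
def blend_newest_and_impressed(newest_in_category, most_impressed, slots=6):
    """Fill half of the slots with fresh items, the rest with popular ones."""
    if not newest_in_category:
        # Request lacked an item id, so no category: fall back entirely.
        return list(most_impressed[:slots])
    picked = list(newest_in_category[: slots // 2])
    for item in most_impressed:
        if len(picked) >= slots:
            break
        if item not in picked:  # assumption: never recommend an item twice
            picked.append(item)
    return picked
```

Both inputs are expected to be ranked lists of item ids, best candidate first.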

Most Impressed by Category. After witnessing how “newest” weakened the effect of “most impressed”, we conducted another experiment that considered only the number of impressions but kept separate impression counts per category. For a recommendation request with an item id, the candidates are thus only the “most impressed” items in the relevant category. The approach ran on ORP for three weeks: from 6–12 March 2016 it achieved a CTR of 0.82% (ranked 7th; team “is@uniol” took first place with a CTR of 1.03%), from 13–19 March 2016 a CTR of 0.97% (ranked 11th, i.e. last; team “xyz” took first place with a CTR of 1.85%), and from 20–26 March 2016 a CTR of 1.24% (ranked 6th; team “xyz” took first place with a CTR of 2.16%).

Content Similar. Having confirmed that considering categories in combination with popularity leads to worse performance, we implemented a pure content-based recommender using Apache Lucene to see how content relevance influences recommendation quality after all. We deployed this content-based recommender on ORP; from 27 March to 2 April it achieved a CTR of 0.77% (ranked 11th; team “xyz” took first place with a CTR of 1.51%). This confirmed that in a real-time news recommendation scenario such as this contest, pure content similarity is not sufficient for a successful recommendation strategy. Said et al. [10] came to the same conclusion, hypothesizing that content similarity fails to pick up on new stories and instead redirects users to similar contents.

Most Clicked. While trying different algorithms, we discovered an interesting phenomenon in the click messages we received from ORP. Even though different contest teams used different algorithms, the clicked items across all of these recommendations tended to be similar. This regularity led us to ask whether characteristic patterns exist within clicks along the time axis.

Hence we implemented the simplest approach, “most clicked”, which serves only the items most frequently clicked in the last hour in response to recommendation requests. From 3–9 April 2016, this simple approach achieved a CTR of 1.14%, winning the leader board ahead of “xyz” (0.96%). Figure 1 shows the result for this week.
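A time-bounded counter of this kind could be sketched as below. The names are illustrative, timestamps are assumed to be seconds, and clicks are assumed to arrive in chronological order; none of this is prescribed by ORP.

```python
from collections import Counter, deque


class MostClicked:
    """Serve the items clicked most often within a sliding time horizon."""

    def __init__(self, horizon_seconds=3600):
        self.horizon = horizon_seconds
        self.clicks = deque()  # (timestamp, item_id), oldest first

    def record_click(self, timestamp, item_id):
        # Assumption: clicks are recorded in non-decreasing timestamp order.
        self.clicks.append((timestamp, item_id))

    def recommend(self, now, k=6):
        # Drop clicks that have fallen out of the last-hour horizon.
        while self.clicks and self.clicks[0][0] <= now - self.horizon:
            self.clicks.popleft()
        counts = Counter(item for _, item in self.clicks)
        return [item for item, _ in counts.most_common(k)]
```

Expiring old clicks lazily at request time keeps recording cheap, which matters under the 100 ms response limit.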

Fig. 1: news ctr got first place in the period 3–9 April, 2016

Having observed this interesting phenomenon, we looked into previous work related to this contest. Kliegr and Kuchař [6, 7] implemented an approach based on association rules. They used contextual features (e.g. ISP, OS, GEO, WEEKDAY, LANG, ZIP, and CLASS) to train the rule engine. The results obtained in the online evaluation indicate that association rules do not outperform other algorithms. In their investigation [2], Gebremeskel and de Vries found that including geographic information yields no striking improvement in news recommendation, and that the randomness of the system should be taken into account when evaluating recommenders. Doychev et al. introduced 6 popularity-based and 6 similarity-based approaches in [1], but their algorithms seemed to perform poorly compared with the “baseline” due to being influenced by content aggregating. Said et al. concluded in [10] that news readers might be reluctant to be confronted with similar topics all the time and are more pleased to be distracted by something breaking or interesting. In the following section, we dig into how this breaking-news phenomenon is reflected in click behavior.

3 Clicks Pattern Analysis

As a further exploration of click patterns, we focus on clicks following recommendation requests in Sport1 and Tagesspiegel on April 5, 2016. Clicks in Sport1 follow more obvious and stable trends. We analyze this consistency for the “most clicked” recommender’s suggestions in terms of the Jaccard similarity of temporally adjacent item groups.

3.1 Clicks Pattern for Sport1

First, we draw histograms of the clicks that followed recommendation requests on items in the portal Sport1. Since plista delivers only part of all recommendation requests to ORP participants, we suppose that the click patterns might differ slightly between the contest teams scope and the whole plista scope. ORP encodes this scope information in the “context.simple” object of its click notification JSON, where key ’41’ stands for “contest team” and the value is the specific team number. For instance, “news ctr” is the contest team with team number 2465, while team number -1 signals that the click happened outside the contest team range. Thus, we draw the figures separately for these two scopes: the contest teams scope, excluding clicks outside the contest, and the whole plista range, without any restriction on the contest team.
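Separating the two scopes could be done roughly as follows. The nesting `{"context": {"simple": {"41": ...}}}` is inferred from the description above and is an assumption, as are the function names and the choice to treat a missing key as “outside the contest”.

```python
import json

CONTEST_TEAM_KEY = "41"   # key named in the text for "contest team"
OUTSIDE_CONTEST = -1      # team number signalling a click outside the contest


def team_of_click(message):
    """Return the team number stored in a click notification (JSON string)."""
    simple = json.loads(message).get("context", {}).get("simple", {})
    # Assumption: a missing key is treated like a click outside the contest.
    return int(simple.get(CONTEST_TEAM_KEY, OUTSIDE_CONTEST))


def split_by_scope(messages):
    """Return (contest_scope, whole_scope) lists of click notifications."""
    contest, whole = [], []
    for raw in messages:
        whole.append(raw)  # the whole scope has no restriction on the team
        if team_of_click(raw) != OUTSIDE_CONTEST:
            contest.append(raw)
    return contest, whole
```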

Figure 2 shows the click conditions among contest teams. In order to track the top clicked items in a time sequence, we draw the figure for each hour for April 5, 2016, i.e. 24 subfigures covering the whole day. In each subfigure, news items are located on the x-axis as points sorted by the click frequency in descending order. A red vertical line separates the six most frequently clicked items—most recommendation requests ask for six suggestions. For a majority of intervals, we observe a power law distribution. Few highly popular items occupy a majority of clicks. The percentage of clicks occupied by the top six items is shown in the red boxes.
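The percentage shown in each subfigure can be computed along these lines. This is a sketch with a hypothetical function name; ties between equally clicked items are broken by first occurrence, which is an arbitrary choice.

```python
from collections import Counter


def top_k_share(clicked_item_ids, k=6):
    """Return the k most clicked items of one hour and their share of clicks."""
    counts = Counter(clicked_item_ids)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0  # an hour without clicks has no top items
    top = counts.most_common(k)
    share = sum(count for _, count in top) / total
    return [item for item, _ in top], share
```

For example, an hour with clicks `["a"] * 5 + ["b"] * 3 + ["c"] * 2` and `k=2` yields the items `["a", "b"]` with a share of 0.8.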

We analyze how popularity transitions into the future. Therefore, we highlight the item ids of the six most popular items in the top right corner of each subfigure. We color item ids to facilitate tracking individual items throughout the plot. Items tinted in gray only appear in a single one hour interval. From Figure 2, we see that on April 5, 2016, items ranked in the top three manifest more continuity, i.e. they are more likely to re-appear in the next hour’s top clicked 6 items group.

[Figure 2: 24 hourly histograms (0:00–24:00), titled “Items Clicked Condition in Sport1 for Contest Teams on April 5, 2016”. Each panel lists the item ids of the six most frequently clicked items and their share of the hour’s clicks, which ranges from 58.82% to 91.67%.]

Fig. 2: Top clicks condition of contest teams on Sport1 on April 5, 2016

Aside from the scope of contest teams in ORP, we are also interested in the power law distribution of clicked items in the whole plista range. In Figure 3, we find that, with the larger number of distinct clicked items in the “whole” range, the power law distribution of clicked items is even more pronounced. In every one hour time window, the top 6 items contribute at least 86.5% of the clicks. The more complete data may explain the increased steepness of the histograms. The distribution can be described by Zipf’s law. The significant advantage of the six most frequently clicked items reminded us that we should pay more attention to the short head with higher business value. Thereby, we can keep a relatively high CTR. Still, more sophisticated methods are required to leverage the potential of the long tail.

Focusing on the most frequently clicked items within this one day, we find some clues for future work. First, Table 2 lists the four items occurring most frequently in the top 6 group, with their item id, the period during which they stayed in the top 6, their ranking trends, and their creation date. All four items remained in the top 6 for at least eleven hours. We had expected considerably less time, as news continuously emerges. We notice fluctuating rankings of these four items. Recognizing patterns in shifting rankings will be subject to future work. The dates of creation subvert our previous expectations. We assumed that news would remain relevant for a very limited time. In contrast, news articles created on April 2, 2016, dominated the top 6 three days later. This indicates a noticeably longer life-cycle of news than we anticipated.

[Figure 3: 24 hourly histograms (0:00–24:00), titled “Items Clicked Condition in Sport1 for Whole Range on April 5, 2016”. Each panel lists the item ids of the six most frequently clicked items and their share of the hour’s clicks, which ranges from 86.51% to 94.76%.]

Fig. 3: Top clicks condition of whole range on Sport1 on April 5, 2016

Table 2: Stable top clicked items condition on April 5, 2016

Item Id   | Being in Top 6 | Ranking              | Created At
273681975 | 0:00–11:00     | Most of the time 1st | 2nd, Apr. 2016
273707540 | 0:00–13:00     | 2nd→1st              | 2nd, Apr. 2016
273877703 | 0:00–24:00     | 3rd/4th→2nd→1st      | 3rd, Apr. 2016
273770166 | 0:00–21:00     | 2nd→3rd/4th→1st      | 2nd, Apr. 2016

Next, we analyze the categories, titles, and descriptions of these four items in order to get a better understanding of the contents. Table 3 highlights co-occurring terms in these four items. Among those, we see that the breaking news that coach Pep Guardiola will be leaving Bayern attracted many users’ interest in related articles. We hypothesize that content similarity affects recommenders, but only for popular items. Still, a majority of users pays attention only to the most popular articles, which is why pure content-based recommenders frequently suggest articles with minor click chances.


Table 3: Stable top clicked items description on April 5, 2016

Item Id 273681975
category: fussball
title: Guardiola macht Götze froh
text: Der Coach des FC Bayern lässt Youngster Felix Götze erstmals mit den Profis trainieren. Javi Martinez und Manuel Neuer stehen derweil vor dem Comeback.

Item Id 273707540
category: intenational-fussball
title: Van Gaal watscht di Maria ab
text: Als Rekordtransfer geholt, nach nur einer Saison wieder vom Hof gejagt. Jetzt geht die Geschichte zwischen United-Trainer Louis van Gaal und Angel di Maria in die Verlängerung.

Item Id 273877703
category: fussball
title: Robben: “Van Gaal ist wie Guardiola”
text: Arjen Robben vergleicht den ehemaligen Bayern-Coach Louis van Gaal mit dem aktuellen Trainer Pep Guardiola. Mit dem FC Bayern will Robben noch viel erreichen.

Item Id 273770166
category: fussball
title: Mittelfeldbestie Götze: Wechsel nur im Notfall.
text: Mario Götzes beherzter Auftritt gegen Frankfurt zeigt: Er will sich unbedingt beim FC Bayern durchsetzen. Der Verein mauert noch beim Thema Transfer.

3.2 Jaccard Similarity Between Clicks in Neighbor Hours

In this subsection, we quantify the continuity of the most frequently clicked items and analyze this continuity with respect to contextual factors such as time of day and day of week. Jaccard similarity, as defined in Equation 1, is a metric for the similarity of two sets A and B. Its value equals the cardinality of the intersection divided by the cardinality of the union of the two sets. In our scenario, A and B refer to the sets of the six most frequently clicked items of two neighboring one hour time slots. The higher the Jaccard similarity, the more items continue to attract users across neighboring hours.

\[
\mathrm{Jaccard}(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|} \tag{1}
\]
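Equation 1 translates directly into code. The sketch below uses the second form of the equation; returning 0.0 for two empty sets is a convention chosen here, not something the definition prescribes. The item ids are those read from the first two panels of Figure 2 (contest teams, 0:00–1:00 and 1:00–2:00).

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / (|A| + |B| - |A ∩ B|) of two collections."""
    a, b = set(a), set(b)
    intersection = len(a & b)
    union = len(a) + len(b) - intersection
    # Convention (assumption): two empty sets have similarity 0.0.
    return intersection / union if union else 0.0


hour_0 = {273681975, 273707540, 274110878, 274109826, 273877703, 274142904}
hour_1 = {274109826, 273681975, 273877703, 274110878, 273707540, 274086605}
similarity = jaccard(hour_0, hour_1)  # five shared items out of seven distinct
```

The two hours share five of their six items, so the similarity is 5/7.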

We expand our view from a single day to the week 3–9 April, 2016. Thereby, we obtain 24×7 = 168 one hour time windows. From these, we derive 167 pairs of subsequent time windows to compute the average Jaccard similarity. Figure 4 illustrates our findings overall, for specific times of day, and for each weekday. We distinguish the contest scope and the whole plista scope by cornflower blue and violet colors. Throughout the three subfigures, the plista scope’s Jaccard metric exceeds that of the contest scope. The gap is most obvious at night (0:00–8:00). Still, we have to consider that the night has relatively few interactions compared with the daytime. Independent of context, we observe Jaccard scores in the range of 40–60%. For two sets of six items sharing k items, the Jaccard similarity is J = k/(12 − k), so k = 12J/(1 + J), which gives roughly 3.4 to 4.5 shared items for this range. This signals that more than half of the most popular items re-occur in the next hour’s top 6 group. Thus, recommending popular items offers a good chance of performing well. This explains the good performance of the “baseline” in previous editions of NewsREEL.
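Averaging over subsequent windows can be sketched as follows, assuming the hourly top 6 groups are already available as a chronologically ordered list of sets; the function names are illustrative.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if both are empty (a convention)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_adjacent_jaccard(hourly_top_items):
    """Average Jaccard similarity over all pairs of subsequent time windows."""
    # n windows yield n - 1 adjacent pairs, e.g. 168 windows give 167 pairs.
    pairs = list(zip(hourly_top_items, hourly_top_items[1:]))
    if not pairs:
        return 0.0
    return sum(jaccard(prev, nxt) for prev, nxt in pairs) / len(pairs)
```

Restricting the input list to the windows of one weekday or one time of day gives the per-context averages in the same way.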

[Figure 4: three bar charts of the average Jaccard similarity (y-axis 0.0–1.0) for the scopes “Contest Teams” and “Whole”: Total; Time of Day (0:00–8:00, 8:00–16:00, 16:00–24:00); Day of Week (Sun. to Sat.).]

Fig. 4: Jaccard similarity of top clicked items between consecutive hours for Sport1 in the week 3–9 April, 2016

3.3 Predicting Ability of Impressions and Clicks

Empirically, the “Most Impressed” approach has performed well on CTR, so we compare how well impressions and clicks predict the clicks in the next hour. As the six most frequently clicked items receive more than 80% of all hourly clicks, we define predicting ability here as the Jaccard similarity between the set of recommended items and the set of the six most frequently clicked items in a specific hour. The six items most frequently viewed in the last hour form the recommending set “Most Impressed”. The six items most frequently clicked after having been suggested in the last hour form the recommending set “Most Clicked”. Figure 5 shows both methods’ performance over time. The cyan curve refers to “Most Clicked”, while the magenta line refers to “Most Impressed”. The upper subfigure shows the comparison of “Most Impressed” and “Most Clicked” in the range “contest teams”, and the bottom subfigure presents the same comparison in the range “whole plista”.

“Most Clicked” outperforms “Most Impressed” in both scenarios. This indicates that at least on 5 April 2016, users’ reactions to recommendations let the system better predict future clicks than what they read.
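The predicting ability defined above could be computed as in this sketch. Function names are hypothetical, and each event list is assumed to hold one item id per impression or click.

```python
from collections import Counter


def top_k(events, k=6):
    """Set of the k most frequent item ids in a list of events."""
    return {item for item, _ in Counter(events).most_common(k)}


def predicting_ability(previous_hour_events, current_hour_clicks, k=6):
    """Jaccard similarity between last hour's top k and this hour's top k clicks."""
    recommended = top_k(previous_hour_events, k)
    target = top_k(current_hour_clicks, k)
    union = recommended | target
    return len(recommended & target) / len(union) if union else 0.0
```

Passing last hour’s impressions gives the score for “Most Impressed”; passing last hour’s clicks gives the score for “Most Clicked”.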

[Figure 5: two line charts over the hours 0:00–24:00 of April 5, 2016; y-axis: Jaccard similarity (0.0–1.0) between the recommended set and the six most clicked items in this hour. Upper panel: “Predicting Ability of 'Most Impressed' vs. 'Most Clicked' within Contest Teams on April 5, 2016”; lower panel: “Predicting Ability of 'Most Impressed' vs. 'Most Clicked' on Whole Range on April 5, 2016”. Legend: recommended set of the six most clicked items in the last hour; recommended set of the six most impressed items in the last hour.]

Fig. 5: “Most Impressed” vs. “Most Clicked” on predicting next hour’s “Most Clicked” in Sport1 on April 5, 2016

3.4 Clicks Pattern for Tagesspiegel

So far, we have focused on Sport1. We repeated our experiments for the second largest publisher, Tagesspiegel. Figure 6 shows considerably fewer clicks than for Sport1. Some one hour intervals have fewer than six clicks in total. Even considering the whole plista range of clicks, Figure 7 shows a flatter power law distribution. The top items only account for 20–35% of clicks.

In addition, we observe more variation in the most frequently clicked items, such that many items appear only for a single hour in the top 6 group. We hypothesize that the increased variation is caused by the higher diversity in topics. Sport1 exclusively provides sport-related news. In contrast, Tagesspiegel covers a wide range of topics including politics, economy, sports, and local news.

[Figure 6: 24 hourly histograms (0:00–24:00), titled “Items Clicked Condition in Tagesspiegel for Contest Teams on April 5, 2016”. Each panel lists the item ids of the most frequently clicked items and their share of the hour’s clicks; per-item click counts per hour stay in the single digits.]

Fig. 6: Top clicks condition of contest teams range on Tagesspiegel on April 5, 2016

4 Conclusion and Future Work

In this working note, we describe our experience with the online task of the real-time news recommendation contest NewsREEL in 2016. Through evaluating approaches such as “Most Impressed”, “Newest”, “Most Impressed by Category”, “Content Similar”, and “Most Clicked”, we found that a small subset of news items attracted most clicks. This holds true beyond the scope of individual algorithms. Hence we started analyzing the patterns of clicked items on the dominating portals Sport1 and Tagesspiegel. In particular for Sport1, item popularity followed a power law distribution and items continued to be popular for hours. This phenomenon was less pronounced on Tagesspiegel. Monitoring which articles users clicked provided better information to predict future clicks than tracking which articles users read. These observations inspire us to change our perspective on implementing recommenders from analyzing features and contextual factors to investigating the time series patterns of clicked items. Thus, as long as Sport1 continues to be the dominant news source in the contest, we can focus on the following points as future work: (1) analyzing the regularity in how long an item stays in the most clicked items group; (2) predicting the ranking of an item while it is popular; (3) making use of the long tail to find relevant features and contexts.

5 Acknowledgement

The work of the first author has been continuously funded by China Scholarship Council (CSC). The research leading to these results is partially supported by the CrowdRec project, which has received funding from the European Union Seventh Framework Program FP7/2007–2013 under grant agreement No. 610594.

References

1. D. Doychev, R. Rafter, A. Lawlor, and B. Smyth. News recommenders: Real-time, real-life experiences. In Proceedings of UMAP 2015, pages 337–342, 2015.

2. G. Gebremeskel and A. P. de Vries. The degree of randomness in a live recommender systems evaluation. In Working Notes of CLEF 2015 - Conference and Labs of the Evaluation forum, Toulouse, France, September 8-11, 2015. CEUR, 2015.

3. F. Hopfgartner, T. Brodt, J. Seiler, B. Kille, A. Lommatzsch, M. Larson, R. Turrin, and A. Serény. Benchmarking news recommendations: The clef newsreel use case. SIGIR Forum, 49(2):129–136, Jan. 2016.

4. B. Kille, A. Lommatzsch, G. Gebremeskel, F. Hopfgartner, M. Larson, J. Seiler, D. Malagoli, A. Serény, T. Brodt, and A. de Vries. Overview of newsreel’16: Multi-dimensional evaluation of real-time stream-recommendation algorithms. In