Clicks Pattern Analysis for Online News Recommendation Systems
Jing Yuan1, Andreas Lommatzsch1, Benjamin Kille1
1 DAI-Labor, Technische Universt¨at Berlin, Germany {jing.yuan, andreas.lommatzsch, benjamin.kille}@dai-labor.de
Abstract. The NewsREEL challenge provides researchers with an op- portunity to evaluate their news recommending algorithms live based on real users’ feedback. Since 2014, participants evaluated a variety of ap- proaches on the Open Recommendation Platform (ORP), yet popularity- based algorithms constitute the most successful ones. In this working note, we chronologically describe our participation in NewsREEL on- line task in the year 2016. With approaches including “most impressed”,
“newest”, “most impressed by category”, “content similar” and “most clicked”, we reconfirm that content relevance is not a very good indicator for recommending news. Meanwhile, for the dominating portalSport1, the extrapolation of the time series of impressions and clicks enables us to predict the items most likely to be clicked in the next hours. A sample analysis on one week data shows us that the duration of an item being popular is much longer than we expected. Thus, we propose that when designing recommenders in this contest, more attention should be paid on the time series patterns of clicks and impressions.
1 Introduction
News, as important media content, still keeps its role of guiding social opinion, even in modern world which is full of virtual social network and personal ideas.
Many news providers employ recommender systems and similar personalization techniques to assist users in finding relevant news quickly and conveniently.
Different ways to incorporate recommendations in news publishers have been successfully launched in the current digital news content market. We exemplify three ways in which recommendations are pushed to news consumers. First, as ane-Magazine Provider, Flipboard aggregates news contents from different third party providers and then selects news which is relevant to a user’s pre-defined topics forming their personalized news board. Second, someContent Providers, such as ByteDance, generate contents themselves. They recommend in a closed system based on internal users, news, and interaction in between both. Third, Recommendation Providers, e.g.plista and outbrain, offer recommendation services for different kinds of websites, including news websites. Table 1 com- pares characteristics of the three main-stream news recommenders concerning aspects such as whether they generate content by themselves, the stability of users, and the stable range of news items, respectively. As a representative of
Recommendation Providers,plista manifests its non-trivial condition in terms of variety of news portals and differences in users’ expectations. Considering that NewsREEL competition receives data stream fromplista, participants have to cope with all these knotty conditions to win the contest[8].
Table 1: Characteristics of Main-stream News Recommender
News Recommender Generating Content
Stability of Users
Stable Range of News
e-Magazine Provider (e.g. Flipboard) 7 3 7
News Content Provider (e.g. Bytedance) 3 3 3
Recommendation Provider (e.g.plista) 7 7 7
The NewsREEL challenge 2016 provides participants with the chance of evaluating recommender algorithms with online live user feedback [4, 3]. In the challenge, teams registered on Open Recommendation Platform (ORP) receive streamed messages describing published news articles, users’ impressions and clicks on items, as well as recommendation requests fromplista. The challeng- ing aspects of participating NewsREEL include: (1) recommendations must be provided in 100ms upon request; (2) participants need to deal with news portals from different domains; (3) user groups on specific portal alter; (4) number of messages varies largely among portals [8, 5].
In contrast to recommending movies or music, news items continuously emerge and become outdated constituting a dynamic environment. This makes the NewsREEL competition particularly challenging. Algorithms have to consider these dynamics in news articles and users’ preferences. We focused on popular- ity and freshness to cope with the dynamics following the notion that users prefer important and recent news over insignificant and outdated articles. The success of the “most clicked” strategy in terms of CTR further supports this notion.
Even though the method is rather simple, it captures crucial aspects. Visualizing clicks on items over time, we observe continued click activity stretching several hours for popular items. We compared contents of popular items and discovered that they overlap. Still, content-based algorithms have failed to benefit of these overlaps in previous editions of NewsREEL.
The remainder of this paper is structured as follows. In Section 2, we briefly introduce the approaches we used in year 2016 and discuss other algorithms developed in previous years. Subsequently, we analyze characteristic user-item interaction patterns for different news portals in Section 3, and found that “most clicked items” has its own power of self-predicting. Finally, conclusion and an outlook to future work are given in Section 4.
2 Approach Used
In this section, we chronologically describe the approaches we have deployed in ORP, i.e. the online task of NewsREEL2016, and changes in our thoughts meanwhile. When the most simple approach “most clicked” finally shows its power to outperform other algorithms, it attracts our interest to dig deeper into clicks pattern from the perspective of time series analysis in the next section.
Most Impressed Inspired by the good performance of “baseline” in the past years (see [9]), which directly uses the most recently impressed items as recom- mendation candidates, we implemented a similar method by sorting the 2000 most recent impressions by their frequencies. Typically, this approach is called
“most popular”, but to distinguish it from “most clicked” which will be intro- duced later on, we refer to it as “most impressed” in this paper. The approach ran on ORP for two weeks (January 31 to February 13, 2016), and got the CTR 1.21% (ranked 3rd, team “artificial intelligence” got the first place with CTR 1.48%) and 1.35% (ranked 2nd, team “abc” got the first place with CTR 1.4%) in these two weeks separately.
Newest Considering that freshness represents a vital aspect of news, we also implemented an approach “newest” which provides the most recently created items from the same category as the currently visited item as recommendation.
Given the good performance of “most impressed” mentioned above, we used it as an alternative solution when the request lacked an item id, i.e. the category can- not be determined. In addition, for a recommendation request with 6 candidate slots, 3 positions are still filled by “most impressed” approach. Therefore, this approach can be seen as a simple ensemble of “most impressed” and “newest”.
With this solution, from 21–27 February, our team “news ctr” got CTR 1.19%
(ranked 5th, team “artificial intelligence” got the first place with CTR 1.45%) in the contest leader board.
Most Impressed by Category After witnessing how “newest” weakened the effect of “most impressed”, we conducted another experiment which only consid- ered the number of impressions, but separates the impression counts according to categories, thus for the recommendation request with item id the recommending targets will only be the “most impressed” items in the relevant category. The approach ran on ORP for three weeks, from 6–12 March 2016 it got CTR of 0.82% (ranked 7th, team “is@uniol” got the first place with CTR 1.03%), from 13–19 March 2016 it got CTR of 0.97% (ranked 11th, i.e. the last one, team
“xyz” got the first place with CTR 1.85%) and from 20–26 March 2016 it got CTR 1.24%(ranked 6th, team “xyz” got first place with CTR 2.16%).
Content Similar Having confirmed that considering categories in combination with popularity lead to worse performances, we implemented a pure content- based recommender using Apache Lucene to see how content relevance influence
recommending effect after all. We deployed this content-based recommender on ORP and noticed that from 27 March to 2 April it got a CTR of 0.77% (ranked 11th, team “xyz” got the first place with CTR 1.51%). This confirmed that in real-time news recommendation scenario as in this contest, pure content similar- ity is not sufficient for a successful recommending strategy. Said et al. [10] came to the same conclusion hypothesizing that content similarity fails to pick up on new stories but redirects users to similar contents.
Most Clicked While varying on different algorithms, we discovered an interest- ing phenomenon through the clicks message we received from ORP. Even though different contest teams used different algorithms, the clicked items for all of these recommendations tended to be similar. This consistent regularity reminded us to think whether characteristic patterns exist within clicks along the time axis.
Hence we implemented the simplest approach “most clicked” which only serves the most frequently clicked items in the last hour to the recommendation re- quests. From 3–9 April 2016, this simple approach got a CTR of 1.14% winning the leader board ahead of “xyz” (0.96%). Figure 1 shows the result during this week.
Fig. 1:news ctr got first place in the period 3–9 April, 2016
Having observed this interesting phenomenon, we looked into previous work related to this contest. Kliegr and Kuchaˇr [6, 7] implemented an approach based on association rules. They used contextual features (e.g.ISP,OS,GEO,WEEKDAY, LANG,ZIP, andCLASS) to train the rule engine. The results obtained in the online
evaluation indicate that association rules do not outperform other algorithms.
Through the investigation in [2], Gebremeskel and de Vries found that there is no striking improvement through including geographic information on news rec- ommendation, yet more randomness of the system should be taken into account when considering evaluation for recommenders. Doychev et al. introduced their 6 popularity-based and 6 similarity-based approaches in [1], but their algorithms seemed to perform poorly compared with “baseline” due to being influenced by content aggregating. As Said et al. concluded in [10] that news article readers might be reluctant to be confronted with similar topic all the way, but more pleased to be distracted by something breaking or interesting. In the following section, we are digging into how this breaking phenomenon is reflected in clicks behavior.
3 Clicks Pattern Analysis
As a further exploration of click patterns, we focus on clicks following recom- mendation requests inSport1 andTagesspiegel on April 5, 2016. Clicks inSport1 follow more obvious and stable trends. We analyze this consistency for the “most clicked” recommender’s suggestions in terms of the Jaccard similarity of tempo- rally adjacent item groups.
3.1 Clicks Pattern for Sport1
First, we draw the histogram of clicks regarding recommendation requests on items in portalSport1. Considering thatplistahas only delivered part of all rec- ommendation requests to ORP participants, we suppose that the click patterns might slightly differ amid contest teams scope and the wholeplistascope. ORP hides such scope information in its click notification JSON “context.simple” ob- ject where key number ’41’ stands for “contest team” and value number repre- sents specific team number. For instance, “news ctr” is the contest team with team number 2465, while team number -1 signals that the click happened out- side contest team range. Thus, the figures are drawn separately by these two scales: contest teams scale excluding clicks outside the contest scope and whole plistarange without any restrictions on contest team.
Figure 2 shows the click conditions among contest teams. In order to track the top clicked items in a time sequence, we draw the figure for each hour for April 5, 2016, i.e. 24 subfigures covering the whole day. In each subfigure, news items are located on the x-axis as points sorted by the click frequency in descending order. A red vertical line separates the six most frequently clicked items—most recommendation requests ask for six suggestions. For a majority of intervals, we observe a power law distribution. Few highly popular items occupy a majority of clicks. The percentage of clicks occupied by the top six items is shown in the red boxes.
We analyze how popularity transitions into the future. Therefore, we high- light the item ids of the six most popular items in the top right corner of each
0 2 4 6 8
10 77.14%273681975 273707540 274110878 274109826 273877703 274142904 273681975 273707540 274110878 274109826 273877703
274142904 0:00 - 1:00
0 1 2 3 4 5
6 81.25%274109826273681975 273877703 274110878 273707540 274086605 274109826 273681975 273877703 274110878 273707540
274086605 1:00 - 2:00
012 3 4 5
6 85.71%273681975274110878 274109826 274062086 273926210 274128003 273681975 274110878 274109826
274062086
273926210
274128003 2:00 - 3:00
01 23 45 67
88.89%274110878273681975 274001468 274109826 274115260 273926210 274110878 273681975
274001468
274109826
274115260
273926210
3:00 - 4:00
0.00.5 1.01.5 2.02.5 3.03.5
4.0 58.82%273681975273795547 274008188 273707540 274135490 274142904 273681975
273795547 274008188
273707540
274135490
274142904
4:00 - 5:00
0123 45 67 89
91.67%274110878273681975 274142904 274146517 273951845 273707540 274110878 273681975 274142904
274146517 273951845
273707540
5:00 - 6:00
02 46 108 1214
16 74.47%273681975274110878 274142904 274135490 274133196 273707540 273681975 274110878 274142904 274135490
274133196
273707540
6:00 - 7:00
0 5 10 15 20
25 80.0% 273681975274110878 273707540 274135490 273877703 274142904 273681975 274110878 273707540 274135490 273877703 274142904
7:00 - 8:00
0 5 10
15 81.16%273681975274110878273707540
274133196 273877703 274135490 273681975 274110878 273707540
274133196
273877703 274135490
8:00 - 9:00
02 46 108 1214
88.24%273707540274110878 273681975 274135490 274172365 274109826 273707540 274110878 273681975 274135490 274172365
274109826 9:00 - 10:00
02 46 108 1214
16 72.92%273707540274110878 274135490 274190378 274190652 274172365 273707540 274110878 274135490 274190378 274190652 274172365
10:00 - 11:00
02 4 6 8 10
12 69.57%273707540274190378 274110878 274172365 273795547 274190652 273707540 274190378 274110878 274172365
273795547
274190652
11:00 - 12:00
05 1015 20
2530 66.12%273707540 274224023 273770166 274190378 274207242 274219226 273707540 274224023 273770166 274190378 274207242 274219226
12:00 - 13:00
0 10 20 30
40 76.0% 274224023273770166 274207242 273877703 274110878 274219226 274224023 273770166 274207242 273877703 274110878 274219226
13:00 - 14:00
0 5 10 15
20 78.3% 274207242274224023
273770166 273877703 274219226 274110878 274207242 274224023 273770166 273877703 274219226 274110878
14:00 - 15:00
0 5 10 15 20
25 88.41%274224023273877703 274227548 273770166 274219226 274197901 274224023 273877703
274227548
273770166 274219226
274197901 15:00 - 16:00
05 1015 2025
30 87.38%274224023273770166
274219226 273877703 274062086 274242801 274224023 273770166 274219226 273877703 274062086
274242801 16:00 - 17:00
0 5 10
15 77.27%274275555273770166274224023
273877703 274062086 274110878 274275555 273770166 274224023 273877703 274062086 274110878
17:00 - 18:00
0 5 10 15
20 83.58%274224023273877703
274275555 273770166 274110878 274280808 274224023 273877703 274275555 273770166 274110878 274280808
18:00 - 19:00
02 46 108 1214
16 75.41%274224023 274275555 273877703 273770166 274280808 274110878 274224023 274275555 273877703 273770166 274280808 274110878
19:00 - 20:00
0 5 10
15 74.67%274224023273877703274280808
274275555 273952709 274110878 274224023 273877703 274280808 274275555 273952709 274110878
20:00 - 21:00
0 5 10 15
20 79.22%274224023273877703 274110878 274275555 274280808 273952709 274224023 273877703 274110878 274275555 274280808 273952709
21:00 - 22:00
05 1015 2025 3035
88.0% 274224023273877703 274331454 274110878 274280808 274275555 274224023 273877703 274331454 274110878 274280808 274275555
22:00 - 23:00
0 5 10 15 20
25 78.03%273877703 274224023 274110878 274331454
274340273
274280808
23:00 - 24:00
Items Clicked Condition in Sport1 for Contest Teams on April 5, 2016
Fig. 2: Top clicks condition ofcontest teams onSport1 on April 5, 2016
subfigure. We color item ids to facilitate tracking individual items throughout the plot. Items tinted in gray only appear in a single one hour interval. From Figure 2, we see that on April 5, 2016, items ranked in the top three manifest more continuity, i.e. they are more likely to re-appear in the next hour’s top clicked 6 items group.
Aside from the scope of contest teams in ORP, we are also interested in the power law distribution of clicked items in the wholeplista range. In Figure 3, we find that along with the increasing number of distinct clicked items in the
“whole” range, the power law distribution of clicked items is even more signif- icant. In all one hour time windows, more than 87% clicks are contributed by the top 6 items. The more complete data may cause the increased steepness of the histograms. The distribution can be described by Zipf’s law. The significant advantage of the six most frequently clicked items reminded us that we should pay more attention to the short head with higher business value. Thereby, we can keep a relatively high CTR. Still, more sophisticated methods are required to leverage the potential of the long tail.
When focusing on the most frequently clicked items within this one day, we find some clues for the future work. First, Table 2 illustrates the four items oc- curring most frequently in the top 6 group. It lists their item id, the duration contained in thetop 6, their ranking trends, and the date they had been created.
All four items remained in thetop 6 for at least eleven hours. We had expected considerably less time as news continuously emerge. We notice fluctuating rank- ings of these four items. Recognizing patterns in shifting rankings will be subject to future work. The dates of creation subvert our previous expectations. We as-
0 100 200 300
400 89.76%273681975273707540 273877703 273770166 274110878 273880301 273681975 273707540 273877703 273770166 274110878
273880301 0:00 - 1:00
0 50 100 150
200 87.81%273681975273707540 273770166 273877703 273795547 274110878 273681975 273707540 273770166 273877703
273795547
274110878
1:00 - 2:00
200 4060 10080
120 89.66%273681975273707540 273877703 273770166 274110878 273926210 273681975 273707540 273877703 273770166 274110878
273926210 2:00 - 3:00
0 20 40 60 80
100 86.51%273681975273707540 273877703 273770166 273880301 274110878 273681975 273707540 273877703 273770166
273880301
274110878
3:00 - 4:00
200 4060 10080
120 88.22%273681975273707540 273770166 273877703 273926210 274008188 273681975 273707540 273770166 273877703
273926210 274008188 4:00 - 5:00
0 50 100 150 200
250 94.76%273681975273707540 273770166 273877703 274110878 273795547 273681975 273707540 273770166 273877703 274110878 273795547
5:00 - 6:00
1000 200300 400500
600 92.69%273681975273707540 273770166 273877703 274110878 273795547 273681975 273707540 273770166 273877703 274110878 273795547
6:00 - 7:00
1000 200300 400500 600700
800 93.4% 273681975273707540 273877703 273770166 274110878 273795547 273681975 273707540 273877703 273770166 274110878 273795547
7:00 - 8:00
1000 200300 400500
600700 93.08%273681975 273707540 273770166 273877703 274110878 273795547 273681975 273707540 273770166 273877703 274110878 273795547
8:00 - 9:00
1000 200300 400500
600 91.91%273681975273707540 273770166 273877703 274110878 274062086 273681975 273707540 273770166 273877703 274110878 274062086
9:00 - 10:00
0 100 200300 400
500 87.57%273707540273770166 273877703 273681975 274110878 274062086 273707540 273770166 273877703 273681975 274110878 274062086
10:00 - 11:00
1000 200 300400
500600 89.23%273707540 273770166 273877703 274110878 273795547 274062086 273707540 273770166 273877703 274110878
273795547
274062086
11:00 - 12:00
1000 200300 400500
600700 88.71%273707540273770166 273877703 274110878 274062086 273926210 273707540 273770166 273877703 274110878 274062086
273926210 12:00 - 13:00
1000 200300 400500
600700 88.89%273770166273877703
274110878 274062086 274224023 274008188 273770166 273877703 274110878 274062086 274224023
274008188 13:00 - 14:00
1000 200300 400500
600 89.9% 273770166273877703
274110878 274062086 274224023 273795547 273770166 273877703 274110878 274062086 274224023
273795547 14:00 - 15:00
0 100 200 300 400
500 93.19%273770166273877703 274110878 274062086 274224023 274008188 273770166 273877703 274110878 274062086 274224023 274008188
15:00 - 16:00
1000 200300 400500
600700 94.19%273770166 273877703 274110878 274062086 274224023 274008188 273770166 273877703 274110878 274062086 274224023 274008188
16:00 - 17:00
0 100200 300 400 500
600 92.39%273770166273877703 274110878 274062086 274224023 273795547 273770166 273877703 274110878 274062086 274224023 273795547
17:00 - 18:00
0 100 200 300
400 93.5% 273770166273877703274110878
274062086 274224023 273795547 273770166 273877703 274110878 274062086 274224023 273795547
18:00 - 19:00
500 100150 200250 300350
400 93.38%273770166273877703 274110878 274062086 274224023 273795547 273770166 273877703 274110878 274062086 274224023 273795547
19:00 - 20:00
500 100150 200250 300350
400 89.5% 273877703274110878 274062086 273770166 274224023 274219226 273877703 274110878 274062086 273770166 274224023 274219226
20:00 - 21:00
1000 200300 400500
600 93.42%273877703274110878 274062086 274224023 274008188 274219226 273877703 274110878 274062086 274224023 274008188 274219226
21:00 - 22:00
0 200 400 600
800 91.98%273877703274110878 274224023 274062086 274219226 274008188 273877703 274110878 274224023 274062086 274219226 274008188
22:00 - 23:00
0 200 400 600
800 92.02%273877703274110878 274224023 274062086 274219226
274275555 23:00 - 24:00
Items Clicked Condition in Sport1 for Whole Range on April 5, 2016
Fig. 3: Top clicks condition ofwholerange onSport1 on April 5, 2016
sumed that news would remain relevant for a very limited time. In contrast, news articles created on April 2, 2016, dominated the top 6 news three days later. This indicates a noticeably longer life-cycle of news than we anticipated.
Table 2: Stable top clicked items condition on April 5, 2016
Item Id Being in Top6 Ranking Created At 273681975 0:00–11:00 Most of the time 1st 2nd, Apr. 2016 273707540 0:00–13:00 2nd→1st 2nd, Apr. 2016 273877703 0:00–24:00 3rd/4th→2nd→1st 3rd, Apr. 2016 273770166 0:00–21:00 2nd→3rd/4th→1st 2nd, Apr. 2016
Next, we analyze the categories, titles, and descriptions of these four items in order to get a better understanding of the contents. Table 3 highlights co- occurring terms in these four items. Among those, we see that the breaking news that coachPep Guadiolawill be leavingBayernattracted many users’ interest on relevant articles. We hypothesize that content similarity affects recommenders, but only for popular items. Still, a majority of users only pays attention to the most popular articles which is why pure content-based recommenders frequently suggest articles with minor click chances.
Table 3: Stable top clicked items description on April 5, 2016
Item Id 273681975 273707540 273877703 273770166 category fussball intenational-
fussball fussball fussball title
Guardiola machtG¨otze
froh
Van Gaal watscht di Maria ab
Robben:“Van Gaalist wie Guardiola”
Mittelfeldbestie G¨otze: Wechsel
nur im Notfall.
text
Der Coach des FC Bayern l¨asst Youngster
FelixG¨otze erstmals mit den Profis trainieren. Javi
Martinez und Manuel Neuer stehen derweil
vor dem Comeback.
Als Rekordtransfer
geholt, nach nur einer Saison wieder vom Hof gejagt. Jetzt
geht die Geschichte
zwischen United-Trainer Louisvan Gaal
und Angel di Maria in die Verl¨angerung.
Arjen Robben vergleicht den ehemaligen Bayern-Coach
Louisvan Gaalmit dem
aktuellen Trainer Pep Guardiola.
Mit demFC Bayernwill Robben noch viel erreichen.
MarioG¨otzes beherzter Auftritt gegen Frankfurt zeigt:
Er will sich unbedingt beim
FC Bayern durchsetzen.
Der Verein mauert noch beim Thema Transfer.
3.2 Jaccard Similarity Between Clicks in Neighbor Hours
In this subsection, we quantify the continuity of most frequently clicked items and analyze this continuity behavior concerning contextual factors such as time of day and day of week. Jaccard Similarity, as defined in Equation 1, is a metric to measure the similarity of two setsAandB. The value of this metric equates to the cardinality of the intersection divided by the size of union of these two sets. In our scenario,AandBrefer to the sets of the six most frequently clicked items of two neighboring one hour time slots. The higher the Jaccard similarity, the more items users constantly are concerned with across neighbor hours.
Jaccard(A, B) =|A∩B|
|A∪B| = |A∩B|
|A|+|B| − |A∩B| (1)
We expand our view from a single day to the week 3–9 April, 2016. Thereby, we obtain 24×7 = 168 one hour time windows. Thereof, we derive 167 pairs of subsequent time windows to compute the average Jaccard similarity. Figure 4 illustrates our findings overall, for specific times of day, and for each weekday. We distinguish the contest scope and the wholeplistascope by cornflowerblue and violet colors. Throughout the three subfigures, we noted that theplistascope’s Jaccard metric exceeds the contest scope. The gap is most obvious in the night (0:00–8:00). Still, we have to consider the fact that the night has relatively few
interactions compared with the day time. Independent of context, we observe Jaccard scores in the range of 40–60%. These signal that more than half of the most popular items re-occur in the next hour’stop 6group. Thus, recommending popular items guarantees a good chance to perform well. This explains the good performance of the “baseline” in previous editions of NewsREEL.
Total
0.0 0.2 0.4 0.6 0.8 1.0
Average Jaccard Similarity
Total Contest Teams Whole
0:00-8:00 8:00-16:0016:00-24:00
0.0 0.2 0.4 0.6 0.8 1.0
Average Jaccard Similarity
Time of Day Contest Teams Whole
Sun. Mon. Tue. Wed.Thur. Fri. Sat.
0.0 0.2 0.4 0.6 0.8 1.0
Average Jaccard Similarity
Day of Week Contest Teams Whole
Fig. 4: Jaccard similarity of top clicked items between continuous hours forSport1 in the week 3–9 April, 2106
3.3 Predicting Ability of Impressions and Clicks
Empirically speaking, “Most Impressed” approach always performs well on CTR, thus we compare the predicting ability of impressions and clicks regarding clicks in the next hour. As the six most frequently clicked items receive more than 80%
of all hourly clicks, we define predicting ability here by the Jaccard similarity between the set of recommended items and the set of the six most frequently clicked items in a specific hour. The six items most frequently viewed in the last hour form the recommending set “Most Impressed”. On the other hand, the six items most frequently clicked after having been suggested in the last hour characterize the recommending set “Most Clicked”. Figure 5 shows both meth- ods’ performances over time. The cyan curve refers to “Most Clicked” while the magenta line refers to “Most Impressed”. The upper subfigure shows the compar- ison of “Most Impressed” and “Most Clicked” in the range “contest teams”, and the bottom subfigure presents the same comparison in range “whole plista”.
“Most Clicked” outperforms “Most Impressed” in both scenarios. This indicates that at least on 5 April 2016, users’ reactions to recommendations let the system better predict future clicks than what they read.
3.4 Clicks Pattern for Tagesspiegel
Hitherto, we focused on Sport1. We repeated our experiments for the second largest publisher—Tagesspiegel. Figure 6 shows a considerably lesser number of clicks compared withSport1. Some one hour intervals have less than six clicks in total. Even considering the whole plista range of clicks, Figure 7 shows a
0:00 1:00 2:00 3:00 4:00 5:00 6:00 7:00 8:00 9:0010:0011:0012:0013:0014:0015:0016:0017:0018:0019:0020:0021:0022:0023:0024:00
0.0 0.2 0.4 0.6 0.8 1.0
Jaccard Similarity between Recommended Set and Most Clicked 6 Items in this Hour
Predicting Ability of 'Most Impressed' vs. 'Most Clicked' within Contest Teams on April 5, 2016
recommended set: most clicked 6 items in last hour recommending set: most impressed 6 items in last hour
0:00 1:00 2:00 3:00 4:00 5:00 6:00 7:00 8:00 9:0010:0011:0012:0013:0014:0015:0016:0017:0018:0019:0020:0021:0022:0023:0024:00
0.0 0.2 0.4 0.6 0.8 1.0
Jaccard Similarity between Recommended Set and Most Clicked 6 Items in this Hour
Predicting Ability of 'Most Impressed' vs. 'Most Clicked' on Whole Range on April 5, 2016
recommended set: most clicked 6 items in last hour recommending set: most impressed 6 items in last hour
Fig. 5: “Most Impressed” vs. “Most Clicked” on predicting next Hour “Most Clicked”
inSport1 on April 5, 2016
flatter power law distribution. The top items only account for 20–35% of clicks.
In addition, we observe more variation in the most frequently clicked items such that many items appear only for a single hour in thetop 6group. We hypothesize that the increased variation is caused by the higher diversity in topics. Sport1 exclusively provides sport-related news. Contrarily, Tagesspiegel covers a wide range of topics including politics, economy, sports, and local news.
4 Conclusion and Future Work
In this working note, we describe our experience with the real-time news rec- ommendation contest NewsREEL online task in 2016. Through evaluating ap- proaches such as “Most Impressed”, “Newest”, “Most Impressed by Category”,
“Content Similar”, and “Most Clicked”, we found out that a small subset of news items attracted most clicks. This holds true beyond the scope of individ- ual algorithms. Hence we started analyzing the patterns of clicked items on the dominating portalsSport1 andTagesspiegel. In particular forSport1, item pop- ularity followed a power law distribution and items continued to be popular for hours. This phenomenon was less pronounced onTagesspiegel. Monitoring which articles users clicked provided better information to predict future clicks than tracking which articles users read. These observations inspire us to change the perspective of implementing recommender from analyzing features and contex- tual factors to investigating clicked items’ time series patterns. Thus, as long as Sport1 continues to be the dominant news source in the contest, we can focus on the following points as future work: (1) analyzing the duration regularity of an item staying in the most clicked items group; (2) the ranking prediction of
0.0 0.5 1.01.5 2.0 2.5
3.0 100.0%273697047273110073 273697322
273697047 273110073
273697322
0:00 - 1:00
0.0 0.5 1.0 1.5 2.0 2.5
3.0 100.0%273836087273697322 274035840
273836087
273697322
274035840 1:00 - 2:00
0.0 0.5 1.0 1.5 2.0 2.5
3.0 100.0%2:00 - 3:00273520214273520214
0.0 0.5 1.0 1.5 2.0 0%
3:00 - 4:00
0.0 0.5 1.0 1.5 2.0 2.5
3.0 100.0%272928513273499850272928513273499850 4:00 - 5:00
0.0 0.5 1.0 1.5 2.0 0%
5:00 - 6:00
0.0 0.5 1.0 1.5 2.0 0%
6:00 - 7:00
0.0 0.5 1.0 1.5 2.0 2.5
3.0 100.0%273415977274060060 273593252
273415977 274060060 273593252 7:00 - 8:00
0.0 0.5 1.0 1.5 2.0 2.5
3.0 100.0%272900321273570247 273990503
272900321 273570247 273990503 8:00 - 9:00
0.0 0.51.01.5 2.0 2.5
3.0 100.0%273415978273697047273415978273697047 9:00 - 10:00
0 1 2 3 4
5 81.82%273673523273189241 273316112 273316114 274144707 273520214 273673523
273189241 273316112 273316114 274144707 273520214 10:00 - 11:00
0.00.5 1.01.5 2.02.5 3.03.5
4.0 100.0%273673523274035845 273698751 273697964 273673523
274035845 273698751 273697964 11:00 - 12:00
0.0 0.5 1.01.5 2.0 2.5
3.0 100.0%273110073274126037 273458037 274144707 273138989 273593251
273110073 274126037 273458037 274144707 273138989 273593251 12:00 - 13:00
0.00.5 1.01.5 2.02.5 3.03.5
4.0 100.0%274234181273836087274234181273836087 13:00 - 14:00
0.0 0.5 1.0 1.5 2.0 2.5
3.0 100.0%272953019273593248 272900324 274234181 274083234
272953019 273593248 272900324
274234181
274083234 14:00 - 15:00
0.0 0.51.01.5 2.0 2.5
3.0 85.71%273836087272928517 274035849 273189241 273368057 273547435 273836087
272928517 274035849 273189241 273368057 273547435 15:00 - 16:00
01 23 45 67
8 100.0%273697558273836087 273189239 273673523 274083237 273189248
273697558
273836087
273189239 273673523 274083237 273189248 16:00 - 17:00
0.0 0.5 1.0 1.5 2.0 2.5
3.0 100.0%273138987273593252 274234181 273110071 274035847 273316119
273138987 273593252 274234181 273110071 274035847 273316119 17:00 - 18:00
0 1 2 3 4
5 90.0% 273232180273836087 273110073 273189241 273593251 273368056
273232180 273836087 273110073 273189241 273593251 273368056 18:00 - 19:00
0.0 0.5 1.0 1.5 2.0 2.5
3.0 100.0%273836088273086578 273697322 273189246 274234175
273836088 273086578 273697322 273189246 274234175 19:00 - 20:00
0.0 0.5 1.0 1.5 2.0 2.5
3.0 100.0%273138991273393614 274283027 273743269 273209444
273138991 273393614 274283027 273743269 273209444 20:00 - 21:00
0.0 0.51.01.5 2.0 2.5
3.0 100.0%274126035273631823 273343967
274126035 273631823 273343967 21:00 - 22:00
0.0 0.5 1.0 1.5 2.0 2.5
3.0 75.0% 273520214273836087 273138991 273673523 273743269 273189241
273520214 273836087 273138991 273673523 273743269 273189241 22:00 - 23:00
0.00.5 1.01.5 2.02.5 3.03.5
4.0 100.0%273698756274035850273138989273697964273547435273316112 23:00 - 24:00
Items Clicked Condition in Tagesspiegel for Contest Teams on April 5, 2016
Fig. 6: Top clicks condition ofcontest teamsrange onTagesspiegel on April 5, 2016
an item of being popular; (3) making use of long tail to find relevant features and contexts.
5 Acknowledgement
The work of the first author has been continuously funded by China Scholarship Council (CSC). The research leading to these results is partially supported by the CrowdRec project, which has received funding from the European Union Seventh Framework Program FP7/2007–2013 under grant agreement No. 610594.
References
1. D. Doychev, R. Rafter, A. Lawlor, and B. Smyth. News recommenders: Real-time, real-life experiences. InProceedings of UMAP 2015, pages 337–342, 2015.
2. G. Gebremeskel and A. P. de Vries. The degree of randomness in a live recom- mender systems evaluation. In Working Notes of CLEF 2015 - Conference and Labs of the Evaluation forum, Toulouse, France, September 8-11, 2015. CEUR, 2015.
3. F. Hopfgartner, T. Brodt, J. Seiler, B. Kille, A. Lommatzsch, M. Larson, R. Turrin, and A. Ser´eny. Benchmarking news recommendations: The clef newsreel use case.
SIGIR Forum, 49(2):129–136, Jan. 2016.
4. B. Kille, A. Lommatzsch, G. Gebremeskel, F. Hopfgartner, M. Larson, J. Seiler, D. Malagoli, A. Ser´eny, T. Brodt, and A. de Vries. Overview of newsreel’16:
Multi-dimensional evaluation of real-time stream-recommendation algorithms. In