Assessing the impact of annotation on understanding and retaining online news articles


Assessing the Impact of Annotation on Understanding and Retaining

Online News Articles

by Prateek Kukreja B.S., Electrical Engineering University of Maryland, 2012

Submitted to the MIT Integrated Design & Management Program and the Department of Electrical Engineering & Computer Science in Partial Fulfillment of the Requirements for the Degrees

of

Master of Science in Engineering & Management and Master of Science in Electrical Engineering and Computer Science in conjunction with the Integrated Design & Management Program

at the

Massachusetts Institute of Technology June 2019

The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created.

Signature of Author: Signature redacted

MIT Integrated Design & Management Program, Department of Electrical Engineering and Computer Science, May 9, 2019

Certified by: Signature redacted

David R. Karger, Professor of Electrical Engineering and Computer Science, Thesis Supervisor

Accepted by: Signature redacted

Leslie A. Kolodziejski, Professor of Electrical Engineering and Computer Science, Chair, Department Committee on Graduate Students

Accepted by: Signature redacted

Matthew S. Kressy, Integrated Design & Management Program Executive Director


Assessing the Impact of Annotation on Understanding and Retaining

Online News Articles

by Prateek Kukreja

Submitted to the Integrated Design & Management Program and the Department of Electrical Engineering & Computer Science on May 9, 2019 in Partial Fulfillment of the Requirements for the Degrees of

Master of Science in Engineering & Management and Master of Science in Electrical Engineering and Computer Science

Abstract

Today's online news readers are distracted and inundated with content. Readers tend to skim and spend little time reading an article, leaving them less informed and sometimes, unknowingly, incorrectly informed since they may miss important details of an article. This behavior can also aid in the spreading of clickbait or sensational news articles because the reader may not spend the time needed to adequately evaluate the article. Past research in education and annotations has indicated that annotating while reading initiates critical thinking and helps readers stay focused. Little research has been done to test these findings outside the education domain. This thesis applies this past research to the context of online news reading to evaluate if these benefits can translate to online news readers. Findings from a lab study conducted as part of this thesis show that on average those who annotate while they read tend to spend more time on the article and on average have a better understanding of the article. These outcomes are promising because if the findings from the study can be replicated in a larger study, they could be used in designing a reading experience that is less susceptible to sensational news or fake news.

Thesis Supervisor:

David R. Karger


Acknowledgements

This work would not have been possible without the guidance, feedback, and support of peers, friends, and mentors.

I would like to thank Matthew Kressy for admitting me into what I believe to be one of the finest graduate programs. I couldn't have imagined being anywhere else. The Integrated Design and Management (IDM) program has been one of the best experiences of my life and it's one I will never forget. Thank you as well to the other IDM faculty: Steve, Andy, Melissa, and Lesley for all your support, advice, and guidance.

I would like to also thank my advisor David Karger for teaching me how to really dig deep into a research idea and figure out the core question hidden in the research. In the end, while it took a long time, this process helped me uncover a question and a project that taught me a lot. Thank you for teaching me this valuable skill and always pushing me to dig deeper.

Next, I would like to thank Amy Zhang for all her support during this project. Thank you for taking the time to review my code, helping me debug, and for providing valuable feedback on my many research pitches.

This work would not have been possible without the support of my family and friends. Rahat, you've been there for me in countless situations and always supported me. Thanks for being a great older brother. Thank you to all my friends from MIT and those outside of MIT, who've been supportive since well before my MIT journey began.

Finally, I would like to thank my mother, Suman Kukreja, who inspires me to work hard every day and to never back down from a challenge. Mom, thank you for your support and always believing in me.


List of Figures

Figure 1-1: Creating an annotation with Margins (bottom left), hovering over a highlight (top), clicking a highlight to expand details (bottom right)

Figure 3-1: Image depicting female student. Copyright free, taken from flaticon.com

Figure 3-2: Image depicting male consultant. Copyright free, taken from flaticon.com

Figure 3-3: Selecting text to highlight using Margins, Article 1 Excerpt from The New York Times

Figure 3-4: Selecting an entire paragraph, Article 1 Excerpt from The New York Times

Figure 3-5: Right margin menu when creating an annotation, Article 1 Excerpt from The New York Times

Figure 3-6: Example of a multiple paragraph highlight, Article Excerpt from The New York Times

Figure 3-7: What is seen when hovering over a highlight, Article Excerpt from The New York Times

Figure 4-1: Article 1 Custom Page for User15

Figure 4-2: Article 2 Custom Page for User15

Figure 4-3: Sample question for Article 1

Figure 4-4: Sample question for Article 2

Figure 5-1: Results for "How Helpful did you find the annotation tool?"

Figure 5-2: Results for "Do you think making annotations improved or added value to your reading experience?"

Figure 5-4: Word Cloud Generated from Responses from User Comments

Figure 5-5: Taxonomy of User Responses

Figure 5-6: Taxonomy Created By Second Coder for Inter-Rater Reliability

Figure 5-8: T-test setup in Microsoft Excel

Figure 5-9: Correlation Matrix of Variables from User Study Generated Using R Corrplot Package

Figure 5-11: Annotated Article Model in R and its Summary

Figure 5-12: Performance of Readers with a Focus of 5 on Annotated Article

Figure 5-13: Performance of Readers with a Focus of 4 or lower on Annotated Article

Figure 5-14: Performance of Readers with a Focus of 5 on Non-Annotated Article


List of Tables

Table 5-3: Results for "How did annotations add or not add value to your reading experience?"

Table 5-7: Quiz Performance Scores for all Participants

Table 5-10: Table Explaining the Variables Tracked for the Study


1. Introduction

The online news reader today is distracted and inundated with content. This often leads to readers rushing through an article and not spending an adequate amount of time to understand its subtleties. Research done by the Nielsen Norman Group on how people read on the web showed that "79% of [their] test users always scanned any new page they came across", while only "16% read word-by-word" [19]. This behavior has major implications in the context of reading a news article, because details embedded deep in an article can be critical for comprehending it, and readers who scan are likely to miss them. With the proliferation of fake and divisive news, there is an even stronger need to read articles carefully in order to identify potential fallacies and be less susceptible to their influence.

Past research on annotations in education demonstrates that the act of annotating while reading triggers critical thinking and helps the reader stay focused [1,6,12]. It is unclear if these benefits can translate to other domains because the purpose behind reading a textbook or another educational text is very different from reading a news article. This work tests to see if the benefits of annotation in the education domain can transfer to reading online news without the reader needing the same strong purpose. We hypothesize that if online news readers annotate while they read, they will spend more time on the article page, have a better understanding of the article, and retain more.

We evaluate this hypothesis through a lab study in which participants were asked to read two articles of similar length, one with an annotation tool and one without. All participants were asked to make at least one annotation. The annotations were made using a custom-built tool called Margins (Figure 1-1), a modified version of the Eyebrowse Chrome extension [27], that allowed participants to make two types of annotations: a highlight and a highlight with a comment.

Figure 1-1: Creating an annotation with Margins (bottom left), hovering over a highlight (top), clicking a highlight to expand details (bottom right)

The day after reading both articles, each participant completed a comprehension quiz, which they were not warned about in advance. The quiz tested how well the participant understood the articles and how much they remembered. Results from the study showed that with the annotation tool participants spent about 22% more time reading the article and performed about 8% better on the questions.
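As a rough illustration of how percentages like these are derived, the sketch below computes the relative difference between two condition means. The reading times and quiz scores in it are hypothetical placeholders chosen only to make the arithmetic visible; they are not the study's data, and the function name `relative_increase` is likewise an assumption made for this example.

```python
def relative_increase(baseline: float, treatment: float) -> float:
    """Return the percentage change of treatment relative to baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (treatment - baseline) / baseline * 100.0


# Hypothetical condition means, NOT taken from the study.
mean_seconds_without_tool = 300.0
mean_seconds_with_tool = 366.0
mean_score_without_tool = 0.625
mean_score_with_tool = 0.675

time_gain = relative_increase(mean_seconds_without_tool, mean_seconds_with_tool)
score_gain = relative_increase(mean_score_without_tool, mean_score_with_tool)
print(f"time: {time_gain:.0f}% more, score: {score_gain:.0f}% better")
```

Note that both figures are relative to the no-annotation condition, so an 8% improvement on the quiz is not the same as an 8 percentage point improvement.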

The remainder of this thesis is organized as follows. Section 2 covers past research and related work, reviewing research on annotations, their impact, and examples of past annotation systems. Following related work, Section 3 explains the Margins system, covering its design and implementation. Section 4 outlines the lab study that was conducted to evaluate the hypothesis. This is followed by the results section, which details the implications of the findings, potential next steps, and final remarks.


2. Literature Review

This work lies at the intersection of annotation and online news reading. In order to get a good understanding of this space, this section explores research in both areas separately and research at the intersection of both. Please note that the term annotation, unless otherwise stated, can imply any combination of highlights, underlines, in-margin notes, and other markings. The section begins with research on the function of annotations and their use, then covers the differences between paper-based and digital annotation. This is followed by a study of the impact of annotations on future readers and then a look at various annotation systems.

2.1 Annotations: What are they and how do they help?

The logical place to start is to define why we annotate and how annotations help us. Catherine Marshall identifies six different functions of annotations from her study of annotations (in the form of highlights, underlines, markings, and in-margin notes) in used student textbooks.

* The first is that annotations serve as "procedural signals", a way to indicate where attention should be given [6]. From her findings, it seems students used annotations as a way to indicate what material in the textbook was relevant for the class.

* Second, "annotations serve as place-markings and aids to memory" [6]. Students used annotations as a way to indicate parts of the text that were important to remember or refer to again.

* Third, annotations represent a way for "problem-working", or in other words a way to add notes next to figures and equations that could further explain them [6].

* The fourth function Marshall states is to record interpretation: in-margin notes that represent how the text was interpreted by the reader [6]. This type of annotation is interesting because it helps the annotator summarize what they have just learned, and it may also help future readers understand the text.

* The fifth function of annotation she shares is that annotations help the reader "maintain attention". She finds that difficult passages with long narratives tend to have more highlights and underlines. So, the thinking here is that the act of annotating helps readers stay focused. Interestingly, she points out that seeing annotations made by others provides a "visible trace of the reader's attention", so a heavily underlined passage may imply that the reader was very focused [6].

These five forms of annotation are applicable to the news context because these are the use cases we expect if readers were to annotate news articles.

* The last function touches upon how annotations can often be personal. Marshall shows that annotations can be circumstantial to the annotator, in other words the annotation may only make sense to the annotator [6].

While Marshall's categorization of annotations is comprehensive and very useful, Kawase et al. provide insight on how we annotate based on the goal for reading. In their research, they categorize the goals for reading into: "reading for writing", "reading for learning", "reading for reviewing", and "other" [12]. In "reading for writing" the goal is to read to extract ideas and references for the purpose of writing, whereas "reading for learning" involves reading to stay current with advancements or learn a new skill or technique. This latter category is similar to the purpose of reading the news, as reading news articles is a way of staying informed or learning about something new. Their study shows that there were more annotations per page when "reading for learning", but "reading for writing" had a higher number of articles read and annotated [12]. They indicate that the higher number of annotations per page for "reading for learning" shows that those annotations "have a clear purpose for memorizing certain parts of the text (by actively doing something with it)" [12]. The higher number of annotations per page could also be connected back to Marshall's point that there tend to be more annotations in areas where the readers paid very close attention, and close attention is required for learning.

The effectiveness of annotating while reading is further reinforced in Carol Porter-O'Donnell's 2004 study in which high school students were asked to annotate texts [24]. O'Donnell explains that annotating helped her students understand that "reading is a process and that applying the ways of responding through annotation changes comprehension" [24]. A student in O'Donnell's study indicated that annotating helped her "comprehend and focus easier" and prior to annotating this student "used to get distracted easily which would cause [her] to read something over and over" [24]. This symptom of getting distracted easily is common for online news readers and is another reason why testing annotation in this context is valuable. Furthermore, O'Donnell explains that "because annotating slows the reading down, students discover and uncover ideas that would not have emerged otherwise" [24]. She explains this outcome occurs because when readers slow down, "they become more active". They "give themselves the opportunity to become more aware of their thinking process" as well as "consider and work to make sense of ideas that they may not have been aware even existed when they read quickly" [24]. These conclusions also connect to Lavagnino's idea of how to interpret what we read, as he describes in 'Reading, Scholarship, and Hypertext Editions' [14]. Lavagnino states that reading "can lead to interpretation, but only by way of generating reactions that we subsequently seek to describe or explain", meaning comprehension or interpretation of what we read comes from sharing or explaining our understanding of the text. Thus, annotations tend to help with understanding what we read because they serve as a channel to share or explain our reactions to the text. These positive outcomes are exactly what's needed for online news readers. Annotations offer a way to slow down reading, which helps readers discover and uncover details of an article that may otherwise go undetected when reading quickly, and they aid understanding of the article by offering a channel to share reactions contextually.

2.2 Print vs Digital

When studying the benefits of annotation, a natural question that arises is how making an annotation on paper compares to making an annotation digitally and whether the benefits of the former translate to the latter. To answer this question, it is important to understand the differences between paper-based reading and digital reading. Kawase et al. summarize past research covering these differences into four main usability categories: tangibility, orientation, multiple displays, and cooperative interaction [12]. The four categories center around the idea that with paper, readers get a physical object that can be moved and manipulated to help with handwriting and comfort, and to obtain a "sense of location within the text" [12]. Furthermore, key differences between the two mediums as pointed out by O'Hara and Sellen are "the major advantages that paper offers in supporting annotation while reading, quick navigation, and flexibility of spatial layout" [21]. However, Kawase et al., in reference to O'Hara and Sellen's work, remark that the benefits of paper annotations could be obtained if the digital annotation tool provides the ability to make an "in-context" annotation that is "visible within the original resource" [12, 21]. In other words, the tool must allow the annotator to digitally mark up a particular part of the text and the annotation must be visible at that point, similar to how with paper a reader can mark up any text and connect it to a note in the margin.

Another aspect to be mindful of is how comprehension may be impacted when moving from paper to digital mediums. Margolin et al. ran a study to do just that by measuring changes in comprehension when reading on paper, on a computer, and on a Kindle [16]. Each participant read two texts and was randomly placed in one of three groups, with each group representing a particular medium [16]. Their investigation into prior research in the field revealed that outcomes from similar studies were mixed, with some favoring paper and some favoring computers. They referenced studies done by Gould and Grischkowsky in 1984 that suggested "in terms of reading speed and reading ability, traditional paper was superior to computerized text", but their reference to research done in 1987 by Mills and Weldon suggested "that although reading speeds differed, comprehension did not change because people tended to read at a speed in which they can maintain meaning and understanding" [16]. Their reference to more recent studies showed similar disagreement. However, computer screen technology continues to improve every year, so Margolin et al.'s study, conducted in 2013, is a welcome one that presents results leveraging more modern technology. In their study, comprehension was measured through performance on a quiz given to participants after they finished reading. The results of their study "indicated no significant differences", and this lack of significant difference, as stated by them, indicates that "if comprehension differences exist, the present research did not find them and therefore are likely to be very small differences" [16].

2.3 Annotations and their Impact on Other Readers

While annotations have been shown to be beneficial to the annotator because the act of making an annotation can "stimulate critical thinking" [12], it is also important to consider how they may impact future readers.

The Popular Highlights feature in Amazon's Kindle is an implementation through which readers are impacted by existing annotations. The reception of Popular Highlights as a feature of the Kindle has been very mixed. Barnett points out that major criticisms of the feature revolve around privacy concerns, "its disruptive nature", and the "distraction created by the presence of a public readership in the book" [2]. On the other hand, Barnett points out how for some the highlighting feature is "a social networking tool." It offers "a way to talk back to the book", "a way to talk back to the author", and "a way to locate or build a community of like-minded readers" [2]. It is clear the reception of the feature is mixed, but what's interesting is how one user may see another user as similar because of his or her annotation.


Barnett also shows an example of how a particular user's highlights and questions influenced readers. This user placed highlights and questions in key parts of the book such that the questions provoked "contemplation of the question" and as a result "may [have prompted] deeper learning about the content and a more nuanced method of reading for learning" [2]. This type of learning is an ideal outcome of an annotation. An annotation should augment both the annotator and the reader's experience by helping both develop a deeper understanding of the content.

Contrary to the supporters of Popular Highlights, Dodson et al. show how existing highlights, or "passive highlights", could negatively impact readers [7]. Their study investigates how passive highlights can impact the comprehension of readers. Participants were asked to read articles in three conditions: articles with irrelevant highlights, articles with no highlights, and articles with relevant highlights. It is important to note that participants of this study, like the reviewers of Amazon's Popular Highlights feature, had mixed feelings about the highlights. The responses of participants who liked the highlights "[emphasized] that highlights helped with focus and overall understanding" [7]. This description of "helped with focus" is interesting because in this case the annotation helped the reader stay focused, whereas prior research only showed how annotations helped those who made the annotation stay focused. Those who disliked the highlights described them as "annoying" and "distracting" [7]. This description is also revealing because it aligns with the feedback on Popular Highlights, indicating that passive highlights could be disruptive regardless of whether they appear in articles or books. Furthermore, results of the study provided "no evidence that relevant highlights improve comprehension for any of the groups". The study even showed "some evidence that irrelevant highlights negatively affect comprehension" [7].

2.4 Example Annotation Systems

Now that annotation benefits and use cases have been understood, we switch gears to studying annotation systems that have been created for research purposes. As expected, many of these studies occur in the areas of education and learning, with the goal of measuring how annotations can impact learning. For example, David Lebow and Dale Lick's study shows the potential benefits of using annotations in distance education [13]. Their annotation system, HyLighter, works by first having teachers create a course in their program, upload HTML documents for the course, and then add participants or students to the course. The students use HyLighter to read and annotate the documents. The annotations are made by selecting text and then adding a comment about that selected text. Lebow and Lick point out that results from initial field tests suggested that HyLighter "[helps] students develop active reading skills" and "learn how to gain knowledge from the text" [13]. These early findings are not surprising as they directly relate to Marshall's reasons for annotating and conclusions supported by O'Donnell's work. However, what's unique about Lebow and Lick's study is that it shows that the benefits Marshall outlined may also translate over to annotations made digitally. A potential reason for why these benefits may translate ties back to the critical feature of "in-context" annotation that was highlighted by Kawase et al. [12].

Similarly, Brush et al. investigated how the use of two systems, a discussion board-based system (EPost) and an annotation-based system (WebAnn), would impact in-class discussions for a graduate course [4]. The study required students in the course to read papers and create a post about each paper in one of the two systems. The WebAnn system was integrated within Internet Explorer and allowed users to select any text on the web page and attach a note to the selected text. EPost was based on traditional threaded discussion board websites and did not offer the ability to anchor comments to particular sentences. Their research revealed that not only was there greater participation on WebAnn, but students also wrote more per post. Brush et al. attribute the greater participation to the ability to anchor comments to parts of the text because in EPost one comment could be a concatenation of multiple WebAnn posts. Furthermore, the authors point out how WebAnn's design could impact the focus of the user because the focus is split between the annotations and the actual reading. This is interesting because it ties back to the previous topic about how annotations could impact readers. Ultimately, the study revealed that these systems, especially WebAnn, do have the potential to provide value, especially for those students who do not participate as often in class.

Another system that was deployed in a similar capacity by Zyto et al. was the NB system, a collaborative web-based tool that allows users to annotate PDF documents. At the time of the study, the NB system had been deployed in 10 universities and over 55 courses, and the researchers state that its "best case" yielded over 14,000 annotations in one semester from a class of 91 students [29]. Zyto et al. state that part of the success of NB comes from their implementation of a concept called "situated discussions", which essentially allowed users to anchor their comments to particular parts of the text and start discussions in the margins without disrupting the reading process [29]. The success of this feature is not surprising as its user experience closely resembles the experience of annotating on paper, which, as mentioned earlier, Kawase et al. stated was key to porting the benefits of paper annotation to digital annotation.

Based on the outcomes and findings from all this past research, a natural question that comes to mind is whether these benefits can translate to online news readers. As with students, can annotations help news readers stay focused, slow down, and improve comprehension of the article? The answers to these questions and more are provided in the next three sections. The next section details the annotation system used for the study, followed by the details of the study, and then the results.


3. Margins

This section covers the design process behind building Margins, developing it, and how the tool works. The motivation behind building Margins was to create a simple tool that would allow users to highlight any text on a web page and if desired, comment on their selection. The highlights anchor to the selected text on the page and comments are made in the margins. Therefore, an annotation could represent just a highlight or a highlight with a comment.

3.1 Defining Margins

Developing Margins began with first understanding the essential tasks the user must be able to perform when using the tool. This was helpful in prioritizing feature development and preventing scope creep.

Margins functionality was broken down into essential and secondary tasks, like so:

Essential User Tasks:

* The user can create a highlight.

* The user can add a comment to their highlight.

* The user can see their highlights and the comments for each highlight.

* The user can delete annotations.

Secondary Tasks:

* The user can share their annotations.

* The user can see and comment on annotations made by others.

Once the application's core functionality was understood, the next steps centered around understanding the target user and ideal use cases. To best illustrate the target user, user personas were created, along with potential use cases for each. Furthermore, since it was pre-determined that this would be a Chrome extension, the primary method for accessing the news was preset to the Chrome browser.


Ideal Margins User Personas

Figure 3-1: Image depicting female student. Copyright free, taken from flaticon.com

Name: Jane Doe
Age: 19
Occupation: Student
Major: Computer Science
Interests: Health, Tech
News Habits: Reads the news sporadically throughout the day when there's free time.
Favorite Publisher: The New York Times
Device for reading the news: Chrome browser on computer

Figure 3-2: Image depicting male consultant. Copyright free, taken from flaticon.com

Name: John Doe
Age: 28
Occupation: Consultant
Industry: Healthcare
Interests: Health, Tech, Politics
News Habits: Reads the news in the morning before the work day and at end of the day after dinner.
Favorite Publisher: The Wall Street Journal
Device for reading the news: Chrome browser on computer

Margins Use Cases for Annotations in Online News Reading

As mentioned in the Literature Review, Catherine Marshall identified six main functions for annotations; however, in the context of online news reading, the ideal functions for annotations would likely be:

Use Case 1: Using Annotations to Keep Track of Important Details

For this use case, users would highlight parts of the article they believe to be important in order to better remember the details of the article. In addition, the user could add a comment to each highlight to either justify the highlight or reflect on what was highlighted.

Use Case 2: Using Annotations to Mark Areas of Contention

In this case, the user would highlight parts of the article they disagree with or don't readily believe. The comment, if added, could contain a counterpoint to what was highlighted or a question to the author.

Use Case 3: Using Annotations to Identify Areas Not Well Understood

For this case, the user would highlight phrases or words that they don't understand. This could mean not knowing the definition of a word, not knowing the meaning of a phrase, or not being able to make sense of that part of the article.

Use Case 4: Using Annotations to Start a Discussion

For this case, the user would create highlights with comments in order to start a discussion with other readers. The annotations would offer readers a way to have inline discussions as opposed to relying on a general commenting section, usually found below the article.

3.2 Building Margins

Margins was built by modifying the existing Eyebrowse Chrome Extension, which was built and is currently maintained by members of the Haystack group at MIT CSAIL [27].

While the Eyebrowse extension was built as a tool for users to publicly share their web surfing data, it offered ways to collect the data needed for Margins as well as helpful out-of-the-box features. The data includes how much time the user has spent on a website and on what date the website was accessed. This data is accessible to other Eyebrowse users and to the general public through the Eyebrowse API [27]. The extension has developed over time to include features that allow users to chat with others on the same website, see the moral framing of articles, and even annotate and tag articles. The recent additions (moral framing, tagging, and annotation) were integrated into Eyebrowse from the work done for Pano, also a system built by the Haystack group but for discussing moral framing in articles [22]. Along with these features, Eyebrowse also provided a working backend, with a Django server and an integrated MySQL database. With these benefits, Eyebrowse provided an excellent foundation on which to build a custom annotation tool.

In Margins, an annotation can be a highlight or a highlight with a comment. While the existing Eyebrowse extension did offer the ability to select text on a page and highlight that text, each highlight was required to have several tags and was saved under the added tag. For Margins, the tags were unnecessary and each highlight needed to be saved as a highlight first. These changes were implemented, as well as the ability for users to highlight any text on a page, with the exception of text in an input field, and the ability to highlight an entire paragraph. This flexibility was important because the length of an annotation can vary with each user. However, one aspect that was kept unchanged was the basic mechanism and design of creating a highlight. In evaluating existing leading annotation systems, such as Hypothesis [11], it made sense to have an interface that emulated the industry standard for highlighting text on a page. This mechanism involves simply clicking on the page and dragging the mouse until the desired text is covered. Since the existing Eyebrowse highlighting mechanism followed this standard, no changes were made to alter its behavior.
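The annotation model described above (a highlight, optionally carrying a comment, with no tags) can be sketched as a small data structure. This is only an illustration: the class name `Annotation` and its fields are assumptions made for this sketch and are not taken from the Eyebrowse or Margins code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record for one Margins annotation (not the real schema)."""

    page_url: str                  # unique URL of the page being annotated
    username: str                  # anonymized participant name, e.g. "user1"
    highlighted_text: str          # the text the reader selected
    comment: Optional[str] = None  # None or empty means a plain highlight

    def __post_init__(self) -> None:
        # Every annotation is saved as a highlight first, so it needs text.
        if not self.highlighted_text.strip():
            raise ValueError("an annotation must highlight some text")

    @property
    def kind(self) -> str:
        """Return which of the two annotation forms this record is."""
        return "highlight with comment" if self.comment else "highlight"
```

Tags are deliberately left out of the record, mirroring the decision that highlights in Margins do not need them.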

The figures below show various examples of highlights and the overall process of creating an annotation.

Figure 3-3: Selecting text to highlight using Margins, Article 1 Excerpt from The New York Times


After clicking the "+" icon, a menu slides out into the right margin of the page to ask the user whether they would like to add a comment. Figure 3-5 below shows this menu.

Figure 3-5: Right margin menu when creating an annotation, Article 1 Excerpt from The New York Times

When designing this form, the goal was to keep it simple, so only two major elements are shown: the text that was included in the highlight and an input box for the comment. The most challenging aspect of designing this form, however, was what to name the submit button: "Add Annotation" or "Add Highlight". Informal user testing ultimately revealed that "Add Annotation" is the better choice because it better reflects what can be done.

Once "Add Annotation" is clicked, a light green background color is added to the selected text to clearly identify the highlight on the page. Figure 3-6 shows an example of a highlight.

Figure 3-6: Example of a multiple paragraph highlight, Article Excerpt from The New York Times.


Hovering over highlighted text will show the number of comments on that highlight (Figure 3-7). This feature was implemented to prevent the reader from having to click on each highlight to check for a comment.


4. Lab Study Design

The study was designed as a within-subjects study. Participants were asked to read the same two news articles, one of which had to be annotated using Margins. The next day, participants were sent a survey that included a 10-question reading comprehension quiz about the two articles. Participants were told about the survey in advance, but not that it contained a quiz. Making the study within-subjects meant that each participant's performance with and without the tool could easily be measured and compared. A key result being tracked was therefore the performance (number correct) on the questions for the article the participant annotated.

4.1 Part One: Participant Selection and Onboarding

In order to participate in the study, each participant had to be a native English speaker over the age of 18 who had attended college or was currently enrolled in college. This allowed us to minimize the impact of differences in English comprehension skills. Overall, the study consisted of 20 participants. Most participants were recruited from the Massachusetts Institute of Technology (MIT) community, with a significant portion being graduate students. The participants were between 20 and 35 years of age; seven of the 20 were female and the remaining 13 were male.

To conduct the actual experiment, a 30-minute time slot was scheduled with each participant at a location that was quiet and uncrowded. Prior to taking part in the study, the participant was presented with a consent form to sign that explained the study as well as how data was going to be collected. The participant was also told that the exact measurements being taken (i.e. time) could not be shared until they had completed the study, to prevent bias, but that they were welcome to ask afterwards. Furthermore, as explained to each participant, all data collected in the study was anonymized by giving each participant a unique username (e.g. user1, user2). Once the form was signed, participants were told that the in-person session involved reading two news articles and that the next day they would have to complete a survey, ideally within a four-hour window of receiving it. They were not told that the survey contained a 10-question reading comprehension quiz on the two articles they were going to read.


Note that the participants were divided into two groups based on when their in-person meeting was scheduled. The first eight were put in group one, which meant they annotated article two. The next eight were put in group two, which was the opposite, so they annotated article one instead of two. Doing this allows us to reduce the impact of any differences in interest in the content and any differences in the difficulty of the questions for each article. The remaining participants needed to reach 20 were assigned a group by alternating between group numbers. In retrospect, it would have been cleaner to run the whole study by alternating between group numbers or randomly assigning a number to each participant.
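The assignment scheme that was used, and the two cleaner alternatives mentioned in retrospect, can be sketched as follows. The function names are hypothetical, and the choice to start the alternation with group one is an assumption, since the text does not say which group came first.

```python
import random
from typing import List

# Which article each group annotated, as described above.
ANNOTATED_ARTICLE = {1: 2, 2: 1}


def assign_groups_as_run(n_participants: int, block_size: int = 8) -> List[int]:
    """Groups in scheduling order: a block of 1s, a block of 2s, then alternate."""
    groups = []
    for i in range(n_participants):
        if i < block_size:
            groups.append(1)
        elif i < 2 * block_size:
            groups.append(2)
        else:
            # Assumption: the alternation for the remainder starts with group 1.
            groups.append(1 if (i - 2 * block_size) % 2 == 0 else 2)
    return groups


def assign_groups_alternating(n_participants: int) -> List[int]:
    """Alternative 1: alternate group numbers for the whole study."""
    return [1 if i % 2 == 0 else 2 for i in range(n_participants)]


def assign_groups_randomized(n_participants: int, seed: int = 0) -> List[int]:
    """Alternative 2: balanced groups in a random order (seeded for repeatability)."""
    groups = assign_groups_alternating(n_participants)
    random.Random(seed).shuffle(groups)
    return groups
```

With 20 participants, all three schemes put ten participants in each group; they differ only in how group membership relates to the scheduling order.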

4.2 Part Two: Reading and Annotation

After explaining the study and walking participants through the consent form, we begin the main part of the study: reading and annotating the articles.

The articles for the study were selected with the criteria that the content had to be neutral in subject matter and not invoke any polarizing reactions. This led to the selection of two articles from The New York Times: the first, How a Low-Carb Diet Might Help You Maintain a Healthy Weight by Anahad O'Connor, and the second, In Cave in Borneo Jungle, Scientists Find Oldest Figurative Painting by Carl Zimmer [20, 28]. Links to both are provided in the References section. These articles worked well for the study because the content was mostly educational and presented in an objective manner. The articles were also evergreen, meaning they did not risk becoming irrelevant within days, which was important because conducting the study spanned months.

Once articles were selected, the next step was to create a custom HTML page for each article. The HTML page (Figures 4-1 and 4-2) was very simple: it consisted of just the text from the article. All images on the original article web page, even those relevant to the article, were scrubbed and only the article text was copied over to the new HTML page. This was done to provide a consistent experience and prevent any distractions from ads or other modules that may appear on a New York Times article page. Since Margins, as a Chrome Extension, relies on a unique URL to save annotations, each participant was given their own set of custom article pages. In the HTML of each article page, the username of the participant was added, making it easy to track, as well as a label of which article was to be annotated. Overall, 40 HTML pages were generated for 20 participants (2 per participant). To organize all the pages, a folder was created for each user and their two custom article pages were placed in their respective folder. Another folder was created, named "Test", to house custom pages for the purpose of testing the tool and teaching participants how to use the tool. The content for these test pages came from random Wikipedia articles and it was ensured that the text on test pages had no relation to the content of the article pages.


These folders were then uploaded to an MIT CSAIL web server, so they could be accessed via the Chrome web browser.
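The per-participant page generation described above could be scripted along the following lines. This is a sketch under stated assumptions, not the procedure actually used: the template, the `data-username` and `data-annotate` attributes, the file naming and the placeholder article text are all hypothetical.

```python
from html import escape
from pathlib import Path
from typing import Dict, List, Tuple

# Hypothetical text-only template: no images, ads or other page modules.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>{title}</title></head>
<body data-username="{username}" data-annotate="{annotate}">
<h1>{title}</h1>
{paragraphs}
</body>
</html>
"""

# Placeholder content keyed by a short slug: (title, list of paragraphs).
ARTICLES: Dict[str, Tuple[str, List[str]]] = {
    "article1": ("Article 1 title", ["First paragraph.", "Second paragraph."]),
    "article2": ("Article 2 title", ["First paragraph.", "Second paragraph."]),
}


def build_pages(root: Path, usernames: List[str],
                articles: Dict[str, Tuple[str, List[str]]],
                annotated: Dict[str, str]) -> List[Path]:
    """Write one folder per user containing one page per article."""
    written = []
    for username in usernames:
        folder = root / username
        folder.mkdir(parents=True, exist_ok=True)
        for slug, (title, paragraphs) in articles.items():
            body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
            page = PAGE_TEMPLATE.format(
                title=escape(title),
                username=escape(username),
                # Label whether this is the article the user must annotate.
                annotate=str(annotated[username] == slug).lower(),
                paragraphs=body,
            )
            path = folder / f"{slug}_{username}.html"
            path.write_text(page, encoding="utf-8")
            written.append(path)
    return written
```

Because every page lives at its own path, each participant's annotations end up tied to a distinct URL, which is the property Margins depends on.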

Figure 4-1: Article 1 Custom Page for User15: http://people.csail.mit.edu/pkukreja/User15/Article1-Diet User15.html

Figure 4-2: Article 2 Custom Page for User15: http://people.csail.mit.edu/pkukreja/User15/Article2-Cave User15.html


As explained in Section 4.1, the participants are separated into two groups. For Group 1 participants, Margins is not active for the first article.

Once the participant is ready to read, we open the article page on the laptop and tell the participant that they must read the whole article, that they can spend as long as they like, and that they should tell us when they have finished reading. Once the participant begins reading, we start a stopwatch (for the study the default stopwatch app on the iPhone was used) to measure how much time the participant spent reading. Once the participant indicates they've finished reading, the stopwatch is stopped and a screenshot of the time displayed is taken to save the total time (this is recorded later in a spreadsheet). Note that the starting and stopping of the stopwatch is done discreetly, so the participant is not aware he or she is being timed (again this is done to prevent bias, but what was collected for that participant can be shared after the survey is completed). It is possible, however, that participants already assume time is being recorded.

Once Group 1 participants have finished reading, they are told that for the next article they will not only read but also annotate while reading using a tool called Margins. Participants are therefore not told in advance that they will be annotating an article; they are told right before they begin reading. During this time, a test page for practicing the use of Margins is opened on the laptop and the participant is taught how to use Margins. We also explain that an annotation can be in the form of a highlight or a highlight with a comment. The participant is told that it is up to them to decide what they want to annotate and what kind of annotation they want to make, but they must make at least one annotation. As with article 1, the participant is told again that they must read the whole article and that there is no time constraint. Once the instructions are clear, the article is opened and, when the participant begins reading, the stopwatch is started. Again, as with article 1, the participant notifies us when they have finished reading, the stopwatch is stopped, and a screenshot of the time is taken.

4.3 Part Three: Post-Reading Survey and Quiz

By this step, the participant has completed the in-person part of the study. At the conclusion of the in-person session, the participant is reminded that they will receive a survey link in 24 hours and that it must be completed the day they receive it.

The survey has two parts: part one captured details about the participant and part two consisted of the comprehension quiz. Part one involved answering questions about focus level (1-5, 5 being most focused) while reading each article, stating whether they liked both articles equally or which one they liked more, rating (on a similar 1-5 scale) whether they found the annotation tool helpful, and, if yes, how the annotation tool added value. These questions were an important addition because they allow us to see if other factors, such as interest, could correlate with quiz scores and also to collect valuable feedback on the annotation experience. As for the quiz, the questions were created to measure understanding of the article. They were based on the reading comprehension questions used in standardized tests, such as the GRE and GMAT. The type of question they were modeled after was the "inference question", which measures whether the reader understood key ideas from the text. As the Educational Testing Service (ETS) describes, inference questions on the GRE involve being able to do such things as understand "the meaning of paragraphs and larger bodies of text" and "[draw] conclusions from the information provided" [25]. Similarly, the reading section of the SAT, as described by CollegeBoard, includes questions of the category "information and ideas" that involve making "reasonable inferences" [5]. Thus, questions of this type are commonly used across standardized tests as one of the ways to measure reading comprehension skills.

Based on sample GMAT and GRE questions, two types of questions were created for each article: "the author would likely agree/disagree with which of the following" and "the article supports/presents/does not support which of the following". Each question had four answer choices, only one of which was correct. Figures 4-3 and 4-4 are sample questions from articles one and two, respectively (the whole survey, including the quiz, can be found in the Appendix section). As described by ETS, such questions measure the reader's understanding of the main idea, which is the primary goal because the intention is to see if readers will understand and retain more of an article after having annotated it.

The author would most likely disagree with what statement:

1. Many experts say that the underlying cause of the obesity epidemic is that Americans eat too many calories of all kinds.

2. All calories are metabolically alike to the body.

3. The idea that counting calories is the key to weight loss has long been embedded in the government's dietary guidelines.

4. On its website, the National Institutes of Health encourages people to count calories and warns that dietary fat has more calories per gram than protein or carbs.

Figure 4-3: Sample question for Article 1

The author is likely to agree with which of the following:

1. Researchers have not found older man-made images than the discovered cave image of a spindly-legged animal in Borneo.

2. Radiocarbon dating has no limitations.

3. Traveling to the Borneo cave posed no challenges.

4. Figurative art came after abstract art.

Figure 4-4: Sample question for Article 2

The last aspect to explain about the survey is the decision to have participants complete it the next day. To be more specific, participants were sent the survey link 24 hours after they finished reading the articles and were required to complete the survey on the day they received it and, if possible, within a four-hour window of receiving the link. This configuration is based on the research on retention of new material by Hermann Ebbinghaus, which shows the rate at which people forget new information. The rate is depicted in a graph commonly known as "The Curve of Forgetting" or "The Ebbinghaus Curve" [8,26]. It shows that people will forget 40% of new information a day after learning it and then forget another 20% after the second day [8,26]. This means that a day after learning something new, people are likely to retain only 60% of that content and, after two days, only 40% of it. With this in mind, the survey for this study is sent after 24 hours, so we can measure whether annotations help participants retain more and potentially decrease the loss shown in the Ebbinghaus Curve. (Note that each participant completes the survey using their assigned username, so again all collected data is tied to a username and anonymous.)
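The retention figures quoted above follow from simple subtraction, which the short sketch below makes explicit. It assumes, as the 60% and 40% figures imply, that each day's loss is expressed in percentage points of the original material; the function name is hypothetical.

```python
from typing import List


def retention_after(daily_losses_percent: List[int]) -> List[int]:
    """Percent of the original material retained after each successive day."""
    remaining = 100
    retained = []
    for loss in daily_losses_percent:
        remaining -= loss  # loss is in percentage points of the original
        retained.append(remaining)
    return retained


# Losses of 40 points on day one and 20 points on day two.
print(retention_after([40, 20]))  # [60, 40]
```

Sending the survey after 24 hours therefore targets the point at which roughly 60% of the content would be expected to remain.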


5. Results & Discussion

In this section we cover the results of the user study and the analysis that was done on the collected data. First, we discuss the feedback that was collected about the annotation experience and in general how participants felt about annotation. Then we dive into quantitative data, such as the participant's performance on the comprehension quiz. Note, since the article that is annotated is switched for some participants, the labels "annotated article" and "non-annotated article" are used throughout this section.
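Because group one annotated article two and group two annotated article one, per-article scores have to be relabeled before they can be compared within subjects. A minimal sketch of that relabeling is shown below; the function names and the example scores are invented for illustration and are not data from the study.

```python
from statistics import mean
from typing import Dict, List

# Group number -> article that group annotated (from Section 4.1).
ANNOTATED_ARTICLE = {1: 2, 2: 1}


def relabel(group: int, scores_by_article: Dict[int, int]) -> Dict[str, int]:
    """Map per-article quiz scores to annotated / non-annotated labels."""
    annotated = ANNOTATED_ARTICLE[group]
    other = 1 if annotated == 2 else 2
    return {
        "annotated": scores_by_article[annotated],
        "non_annotated": scores_by_article[other],
    }


def mean_paired_difference(rows: List[Dict[str, int]]) -> float:
    """Average of (annotated - non-annotated) score across participants."""
    return mean(r["annotated"] - r["non_annotated"] for r in rows)


# Invented example values, for illustration only.
rows = [
    relabel(1, {1: 3, 2: 5}),  # group one annotated article two
    relabel(2, {1: 4, 2: 2}),  # group two annotated article one
]
print(mean_paired_difference(rows))
```

Working with paired differences keeps each participant as their own baseline, which is the main benefit of the within-subjects design.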

5.1 Annotation Experience

In the survey given to the participants, the first set of questions captured three important pieces of feedback: how helpful participants found the annotation tool, whether they thought annotation added value to their reading experience, and comments on the annotation experience.

As a reminder, the question "How helpful did you find the annotation tool?" asked participants to answer based on a 1-5 scale, with the following meanings:

* 1 (not helpful)

* 2 (somewhat helpful)

* 3 (neutral)

* 4 (helpful)

* 5 (very helpful)

The goal behind this question was to really get the participant to start thinking about their experience with the tool, Margins, and whether they thought the tool itself was useful.


Figure 5-1: Results for "How Helpful did you find the annotation tool?"

About 70% of the participants answered 4 (helpful), as shown in Figure 5-1 above, and the remainder was split between neutral and somewhat helpful.

The next question, however, "Do you think making annotations improved or added value to your reading experience?", had a clearer direction. With this question, the goal was to learn more about what the participant thinks about the act of annotating and less about the tool used in the study. As shown in Figure 5-2 below, about 80% of the participants felt that annotation did help.


Figure 5-2: Results for "Do you think making annotations improved or added value to your reading experience?" (The added responses are from participants who wrote in the "Other" section)

The difference between the responses to the above two questions could indicate that participants find annotation helpful but didn't find the tool Margins particularly effective. The last question in this set, which asks participants to elaborate on how annotations added or did not add value, reveals valuable insights and reinforces many of the arguments made about focus. These comments are provided below in Table 5-3.

Username    Survey Question 7. How did annotations add or not add value to your reading experience?

User1 "Since I knew I was trying to annotate, I had better focus on what I was reading, and better retained the info, in case future info referenced it. I also found myself wondering/questioning the article more for some reason. Most of my notes were pointing out things that seemed inconsistent or questionable to me"

User2 "Provided an easy way for me to note items of interest. Being able to highlight sections both with or without comments was helpful as it allowed me to go back to clarify items as needed. Having the tool available seemed to also help me focus as I was reading."

User3 "They helped highlight important facts to look back upon."

User4 "I can reflect back on important findings"

User5 "Helped me make connections to other articles or things that I've read."

User6 "Adding annotations probably helped me focus more but only because I was told I needed to use the tool at least once."

User7 "Helped keep track of important points in the article."

User8 "It did not improve or add value to me 'reading' experience. It felt like a task while reading. I felt under pressure to make annotations as opposed to focusing on reading the content in the article. As I read almost every sentence, I kept thinking, "is this sentence worth annotating?" That being said, once I made the annotations, I felt like I was highlighting important pieces of the article to help me retain the content further. Although today, I feel like I remember the general content of both articles almost equally - maybe slightly leaning towards the article with annotations. However, I don't specifically remember what I annotated."

User9 "They helped a bit because I felt like I didn't reread sections as much as I often do when I sort of skim/drift off during my initial reading of the section."

User10 "Reading the article felt more active than passive. Highlighting the text helped me take a mental note of the information being conveyed."

User11 "They allowed me to highlight certain thoughts, and helped me remember them better. Just like annotating a book, but online - it helps make the content feel like your own."

User12 "I tend to highlight physical readings so the annotations provide the same experience. It helps me remember key details."

User13 "Nice to have a note taking/highlighting tool built in, keeps key points fresh."

User14 "It helped me to keep myself engaged with the article."

User15 "Helped me remember key points"

User16 "Adding Value:

1) Unobtrusive. The annotation tool wasn't in my way when I'm not using it.

2) Notation. I appreciated being able to easily flag a sentence that stood out to me.

Not adding value:

1) I generally don't annotate much when reading hardcopy or softcopy. (However, I do write in hard copy margins a lot, but it's usually to note a vignette or summary of the paragraph/page, rather than a specific sentence or three.)

Potentially adding more value [feature requests]:

1) Tagging. I routinely collect quotes, citations, funny statements, etc. Can Margins help me tag my highlighted sentences so I can bucket into various groups?

2) Exporting/Viewing. How do I view my annotations, and can I export them to use outside of Margins?"

User17 "The annotations (highlights and notes) helped me summarize in my own way what I read which makes it easier to remember the article's main takeaways."

User18 "Annotations are generally helpful when revisiting something (e.g., helps with recall). It's difficult to assess how helpful it is if I'm just using it once while reading."

User19 "It helped to highlight points to reference back to later, especially when I'm checking what I read earlier in the same article against new information in the article."

User20 "I like highlighting while I'm reading so this gives me the opportunity to do the same on the web"

Table 5-3: Results for "How did annotations add or not add value to your reading experience?"

To further reveal commonalities between remarks and overall usage patterns, Figures 5-4 and 5-5 show a word cloud of the responses and a taxonomy of the remarks, respectively.

Figure 5-4: Word Cloud Generated from Responses from User Comments. Generated using: https://www.wordclouds.com/
