
Chapter 3: Presentation of Research Method and Tool

3.2 Heuristic Evaluation Method

Heuristic evaluation is outlined at great length by Nielsen (1993, 1995, 1995a, 1995b, 2009) and, as explained in the previous chapter, it is one of the four techniques which constitute the discount usability engineering method.

More specifically, heuristic evaluation is a usability engineering method in which evaluators inspect a user interface design in order to find usability problems and form an opinion about the interface’s strengths and weaknesses. It can be performed by a single evaluator or by a team of evaluators, who conduct the evaluation according to certain rules (Nielsen, 1993, p. 155; 1995b).

Heuristic evaluation involves having a small set of evaluators examine the interface and judge its compliance with recognised usability principles (the ‘heuristics’) (Nielsen, 1995a).

In Nielsen (1993), heuristic evaluation is described as a “systematic inspection” of a user interface design for usability (p. 155). Usability inspection consists of a set of methods based on an inspection of the user interface performed by evaluators. The goal of usability inspection is to find usability problems in the design. However, some methods also address issues such as the severity of usability problems or the inspection of user interface specifications that have not yet been implemented (Nielsen, 1995).

Besides heuristic evaluation, usability inspection includes the following types of evaluations:

Heuristic estimation, where “inspectors are asked to estimate the relative usability of two (or more) designs in quantitative terms” (Ibid.).

Cognitive walkthrough, “a more explicitly detailed procedure to simulate a user’s problem-solving process at each step through the dialogue, checking if the simulated user’s goals and memory content can be assumed to lead to the next correct action” (Ibid.).

Pluralistic walkthrough, which consists of “group meetings where users, developers, and human factors people step through a scenario, discussing each dialogue element” (Ibid.).

Feature inspection, which “lists sequences of features used to accomplish typical tasks, checks for long sequences, cumbersome steps, steps that would not be natural for users to try and steps that require extensive knowledge/experience in order to assess a proposed feature set” (Ibid.).

Consistency inspection, having “designers who represent multiple other projects inspect an interface to see whether it does things in the same way as their own designs” (Ibid.).

Standards inspection, having “an expert on an interface standard inspect the interface for compliance” (Ibid.).

Formal usability inspection, which “combines individual and group inspections in a six-step procedure with strictly defined roles, with elements of both heuristic evaluation and a simplified form of cognitive walkthroughs” (Ibid.).

While heuristic evaluation, heuristic estimation, cognitive walkthrough, feature inspection, and standards inspection are normally performed by a single evaluator at a time, pluralistic walkthrough and consistency inspection are “group inspection methods” (Ibid.).

In this context, heuristic evaluation remains “the most informal method and involves having usability specialists judge whether each dialogue element follows established usability principles (the ‘heuristics’)” (Ibid.). As seen so far, heuristic evaluation is one of the four techniques on which the discount usability engineering method is based, as well as a systematic inspection method. The relationship between heuristic evaluation and the other usability methods mentioned in this Master’s thesis is shown in the following diagram:

Figure 3. Heuristic evaluation among other usability assessment methods.

If we focus on the meaning of the term “heuristic evaluation”, we can agree that evaluation means to consider something in order to make a judgement about it. The meaning of the word “heuristic”, in turn, becomes less obscure once we analyse its etymology: it comes from the ancient Greek verb “εὑρίσκω”, which means “to find” or “to discover”. In simple terms, heuristic evaluation is an evaluation whose aim is to discover problems. In a web usability context, the goal of heuristic evaluation, as aforementioned, is “to find the usability problems in a user interface design […]” (Nielsen, 1993).

In principle, heuristic evaluation is performed by having each individual evaluator inspect the interface alone. Only when all evaluations have been completed are the evaluators allowed to communicate their findings and have them aggregated with those of the other evaluators. This procedure ensures unbiased evaluations (Nielsen, 1993; 1995b).
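To make this procedure concrete, the following minimal sketch models evaluators who work in isolation and whose findings are merged only once every session is over. All names (Finding, Evaluator, aggregate) are illustrative assumptions, not part of Nielsen’s method or of any published tool:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Finding:
    description: str   # what is wrong in the interface
    heuristic: str     # the usability principle it violates

@dataclass
class Evaluator:
    name: str
    # Filled during the evaluator's solo session; no communication
    # with other evaluators is allowed until every session is done.
    findings: list = field(default_factory=list)

def aggregate(evaluators):
    """Merge findings only after all individual sessions have finished,
    de-duplicating problems reported by more than one evaluator."""
    merged = set()
    for evaluator in evaluators:
        merged.update(evaluator.findings)  # frozen dataclass -> hashable
    return sorted(merged, key=lambda f: f.heuristic)
```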

Results are recorded as written reports; alternatively, observers take notes on the evaluators’ comments as they inspect the interface. Written reports have the advantage of being formal records, even though they require a manager to read and aggregate them. An observer’s intervention, in turn, can be of help when the evaluators have little expertise, but it “adds to the overhead of each evaluation session” (Nielsen, 1993, p. 157; 1995b).

Three main differences between heuristic evaluation and user testing can be outlined (Nielsen, 1993, pp. 157–158, 224; 1995b):

In a user test, the observer (called the experimenter) has to interpret users’ actions, which makes it possible to conduct a test even if users do not know anything about web design. The observer in a heuristic evaluation, on the other hand, normally only has to record the evaluators’ comments, without interpreting them, although they can be of help if an evaluator is in trouble.

In a user test, observers are reluctant to give users more hints than necessary, as the goal of the test is to uncover problems, and users are expected to find the answers to their questions about the interface on their own. In contrast, in a heuristic evaluation, observers can answer the evaluators’ questions, as doing so helps them “to better assess the usability of the user interface” (Nielsen, 1993, p. 158).

Heuristic evaluation uncovers individual usability problems and can address expert user issues. On the other hand, user testing involves real users, making it possible to discover their real needs (Nielsen, 1993, p. 224).

As explained above, a single evaluator can perform a heuristic evaluation, since each evaluator works separately. However, as aforementioned, it is difficult for a single individual to find all the usability problems in a user interface: different people find different usability problems, so several evaluators together uncover considerably more of them (Nielsen, 1993, p. 158; 1995b).
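As a rough illustration (not part of Nielsen’s own method description), if each evaluator is assumed to independently find a fixed proportion λ of the problems, then k evaluators find about 1 − (1 − λ)^k of them. The minimal sketch below uses λ = 0.35, following Nielsen’s reported average for a single evaluator (1995a); real detection rates vary by interface and evaluator:

```python
# Proportion of problems found by k independent evaluators under the
# simple detection model found(k) = 1 - (1 - lam)**k.
lam = 0.35  # assumed per-evaluator detection rate
for k in range(1, 6):
    print(f"{k} evaluator(s): ~{1 - (1 - lam) ** k:.0%} of problems found")
# -> 1: ~35%, 3: ~73%, 5: ~88%
```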

In a heuristic evaluation session, “the evaluator goes through the interface several times and inspects the various dialogue elements and compares them with a list of recognised usability principles”, the aforementioned “heuristics” (Nielsen, 1995a). Heuristics can be defined “as general rules that seem to describe common properties of usable interfaces” (Ibid.). Heuristics can be presented in the form of a checklist. In addition to the checklist, “the evaluator is also allowed to consider any additional usability principles or results that come to mind that may be relevant for any specific dialogue element” (Ibid.).

In principle, evaluators are free to decide how they want to conduct the inspection. A general recommendation (Ibid.) is to first go through the interface to get acquainted with its look and feel, the flow of the interaction and the general scope of the system, and then to focus on specific elements, keeping in mind how they all fit into the larger whole.

Further, “since the evaluators are not using the system as such (to perform a real task), it is possible to perform heuristic evaluation of user interfaces that exist on paper only and have not yet been implemented” (Ibid.).

Lastly, “the output from using the heuristic evaluation method is a list of usability problems8 in the interface with references to those usability principles that were violated by the design in each case in the opinion of the evaluator” (Ibid.). It is not enough for evaluators to simply say that they do not like something: they should list each problem separately, refer to the usability principles that were violated, and explain why each problematic aspect of the interface is a usability problem (Ibid.). In fact, heuristic evaluation “is based on combining inspection reports from a set of independent evaluators to form the list of usability problems” (Nielsen, 1995).
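As an illustration of what a single entry in this output list could look like, the sketch below uses hypothetical field names that mirror Nielsen’s requirements (each problem listed separately, tied to the violated principle, and explained); neither the structure nor the example is prescribed by Nielsen:

```python
from dataclasses import dataclass

# Hypothetical record for one entry in an evaluator's problem list.
@dataclass
class UsabilityProblem:
    element: str       # the dialogue element being criticised
    heuristic: str     # the usability principle that was violated
    explanation: str   # why this aspect is a usability problem

example = UsabilityProblem(
    element="search results page",
    heuristic="Visibility of system status",
    explanation="No feedback is shown while results are loading.",
)
```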

The main characteristics of heuristic evaluation, as explained in this section, are summarised in the following table:

Goal: Heuristic evaluation aims to find usability problems in a user interface.

Method: The interface is inspected according to a set of principles, called “heuristics”.

Participants: Heuristic evaluation is performed by having each individual evaluator inspect the interface on his own.

Results: Findings from heuristic evaluation are aggregated and recorded as written reports.

Ultimate goal: Heuristic evaluation ends when a list of usability problems is produced.

Advantages: Heuristic evaluation finds individual usability problems and can address expert user issues.

Limitations: Heuristic evaluation does not involve real users and does not directly test their needs.

Table 3. Heuristic Evaluation in summary.

8 It is worth mentioning that heuristic evaluation does not provide a systematic way to fix the usability problems, nor does it provide a way to assess the quality of any possible redesigns (Nielsen, 1993, p. 159).

3.2.1 From Broad Heuristics to Guidelines and Standards

As aforementioned, heuristic evaluation relies on principles called “heuristics”. Jakob Nielsen identifies ten general principles for interaction design. They are called “heuristics” because “they are broad rules of thumb and not specific usability guidelines” (Nielsen, 1995a). The ten principles, which the sketch after this list encodes as a simple checklist, are:

1. Visibility of system status: “The system should always keep users informed about what is going on” (Ibid.);

2. Match between system and the real world: “The system should speak the users’ language, with words, phrases and concepts familiar to the user” (Ibid.);

3. User control and freedom: When users choose system functions by mistake, they need a clearly marked “emergency exit” to leave the web site or the web page “without having to go through an extended dialogue” (Ibid.);

4. Consistency and standards: “Users should not have to wonder whether different words, situations, or actions mean the same thing” (Ibid.);

5. Error prevention: Design should prevent a problem from occurring in the first place. Either eliminate error-prone conditions or check for them and present users with a confirmation option before they commit to the action (Ibid.);

6. Recognition rather than recall: “Minimise the user's memory load by making objects, actions, and options visible” (Ibid.);

7. Flexibility and efficiency of use: Accelerators can speed up the interaction for the expert user and “allow users to tailor frequent actions” (Ibid.);

8. Aesthetic and minimalist design: Dialogues should not contain information which is irrelevant or rarely needed. Every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility (Ibid.);

9. Help users recognise, diagnose, and recover from errors: “Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution” (Ibid.);

10. Help and documentation: “Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be too large” (Ibid.).
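As announced above, the ten principles can be encoded as a simple checklist against which each finding is tagged with the heuristic it violates; the representation below is merely one convenient sketch, and the dictionary layout is an assumption rather than anything prescribed by Nielsen:

```python
# Nielsen's ten heuristics (1995a) as a numbered checklist.
NIELSEN_HEURISTICS = {
    1: "Visibility of system status",
    2: "Match between system and the real world",
    3: "User control and freedom",
    4: "Consistency and standards",
    5: "Error prevention",
    6: "Recognition rather than recall",
    7: "Flexibility and efficiency of use",
    8: "Aesthetic and minimalist design",
    9: "Help users recognise, diagnose, and recover from errors",
    10: "Help and documentation",
}
```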

In practice, these so-called heuristics refer to broad principles, while usability standards and guidelines refer to rules that are more specific.

According to Nielsen (1993, p. 91), “guidelines list well-known principles for user interface design which should be followed in the development project”. He distinguishes among general guidelines (applicable to all user interfaces), category-specific guidelines (for the kind of system being developed), and product-specific guidelines (for the individual product). An example of a general guideline could be to “provide feedback” to the user about a system’s state. This guideline becomes category-specific if intended for graphical user interfaces (“ensure that the main objects of interest to the user are visible on screen and that the most important attributes are shown”). It could be developed further into a product-specific guideline, for example for the design of a graphical file system (“use different icons to represent different classes of objects”) (Nielsen, 1993, p. 92).
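Nielsen’s three levels can be pictured as a small hierarchy; the sketch below restates his “provide feedback” example, with keys and nesting that are illustrative assumptions rather than Nielsen’s own notation:

```python
# The "provide feedback" guideline at three levels of specificity
# (Nielsen, 1993, pp. 91-92); keys and wording are illustrative only.
feedback_guideline = {
    "general": "Provide feedback to the user about the system's state.",
    "category_specific": {
        "applies_to": "graphical user interfaces",
        "rule": "Keep the main objects of interest visible on screen, "
                "with their most important attributes shown.",
    },
    "product_specific": {
        "applies_to": "a graphical file system",
        "rule": "Use different icons to represent different "
                "classes of objects.",
    },
}
```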

According to the same author, “a standard ensures that your users can understand the individual interface elements in your design and that they know where to look for what features” (Nielsen, 1999). A standard does not, however, ensure that users will know how to manage the interface features, nor does it ensure that the system will have the features users expect to find (Ibid.).

Standards can be international or national standards, industry standards, or in-house standards. International standards have already been mentioned in the previous chapter, when discussing internationalisation. They are “instrumental in facilitating international trade” (ISO, 2018); the aforementioned ISO standards are an example of international standards. Industry standards are “promoted by various operating system and window system vendors” and specify “the look and feel of user interfaces in great detail” (Nielsen, 1993, p. 232). Industry standards can differ to varying degrees; for this reason, developers should take care when changing platforms, as details from other industry standards can show up (Ibid.). In-house standards are “developed locally within an organisation”; it is therefore possible “to aim for a high degree of usability of the standard itself” (1993, p. 233).

Further, the difference between standards and guidelines is explained by Nielsen (1999, p. 92):

Standards specify how the interface should appear to the users, while guidelines provide advice about the usability characteristics of the interface. […] Hopefully, a given standard follows most of the traditional usability guidelines so that the interface designed according to the standard will also be as usable as possible.