DIRECT AND INDIRECT MEASURES OF LEARNING IN VISUAL SEARCH


Université Libre de Bruxelles

Faculty of Psychological Sciences and Education


Robert A.P. REUTER

Doctoral thesis

Prepared under the supervision of Mr. Axel CLEEREMANS

Jury members: Ms. Natacha DEROOST, Mr. Arnaud DESTREBECQZ, Mr. Wim GEVERS and Mr. Christophe LEYS

June 2013


Preface

I guess the preface is the only place where it is acceptable to give a personal account of the very long journey that writing a Ph.D. thesis can be. So allow me to briefly retrace the important influences of the last decades that finally led me to the scientific research described in the following pages.

As far as I remember, it all began during my adolescence, when my late father, René F. Reuter, and I had numerous, sometimes late-night, discussions about René Descartes’ philosophy of the mind. I was fascinated by the possibility that all our sensory impressions could in fact be mere illusions and that the outside world, which we perceive as so real, might not exist at all. I quickly understood that, according to Descartes, the only thing we could be sure of was the existence of our own private consciousness. I guess this was the real starting point for me: ever since, I have been passionately interested in knowing what we can know. Another milestone on this journey was my discovery of Immanuel Kant’s writings on epistemology, where I first came into contact with what I would nowadays describe as my interest in cognitive science.

Unbeknownst to myself at that time, I wanted to know how knowledge is generated in the human mind and, more generally, how the mind really works. To me, the most important question at that time was thus why we think the way we do. Before reading Kant, or David Hume for that matter, I was personally convinced that all we know necessarily stems from something we have learned. Knowledge could not be in our heads without having entered through our senses by some kind of personal experience. It was not until some years later, however, when one of my high-school teachers introduced me to the works of Sigmund Freud, that I knew I had to study psychology in order to gain a better understanding of the human mind.

I assume that this partially explains why, during my undergraduate studies in psychology at the Université libre de Bruxelles, I kept devouring scientific books and papers about consciousness and cognition. Among the authors I discovered during those years, Daniel C. Dennett was probably the most influential, as his writings introduced me to a Darwinian view of psychology and led me to read other authors from this field. “How the Mind Works” by Steven Pinker seemed to give me all the answers I had been looking for. Indeed, I immediately saw that an evolutionary perspective on cognition was the most promising way to approach the old philosophical questions I had asked myself: Is it possible that some knowledge is already in our minds when we are born? How did it get there? What makes us humans think the way we do? I therefore quickly embraced the views of the nascent field of evolutionary psychology to understand the human mind. My licentiate thesis on sex differences in spatial cognition, written under the direction of Professor José Junça de Morais, was greatly influenced by this new perspective, and his open-mindedness allowed me to freely explore a domain that was not even taught as such at my university, at least not back in 1998. At about the same time, I had the chance to become familiar with the domain of “subliminal perception” during an internship at the Laboratoire de Psychologie Expérimentale with Professor Daniel Holender. In a nutshell, I learned that “subliminal perception” is very difficult to observe, since most scientific studies did not really control for the actual awareness that participants had of the supposedly “unconscious” stimuli. I nevertheless remained very interested in all the phenomena commonly grouped under the label “cognitive unconscious”.

It was hence not surprising that I gratefully accepted Professor Axel Cleeremans’ proposal to work on the topic of IMPLICIT LEARNING as a scientific research assistant at the Cognitive Science Research Unit, now called the Consciousness, Cognition & Computation Group. This was evidently the decisive milestone that finally led to the document you are holding in your hands right now. Moreover, the years I spent in Brussels at Axel Cleeremans’ research unit have so far surely been among the most enriching experiences of my life, not only at a professional but also at a personal level. I will certainly not forget the friendly atmosphere that reigned among colleagues and extended well beyond daily working hours! My skills at playing first-person shooter games have, however, hardly improved, despite some practice.

As I write this final passage of my preface, I have moved back to my home country, Luxembourg, where, after a few months in the domain of school psychology, I worked as a scientific collaborator at the newly founded University of Luxembourg and currently work as a senior lecturer in Educational Technology. The research group I now belong to is called EMACS (Educational Measurement and Applied Cognitive Science), and our main scientific interest is the systematic as well as individual evaluation and analysis of the cognitive processes underlying learning. I would have loved to highlight some of the practical implications that scientific knowledge of implicit learning processes might have for our understanding of learning in general, as well as for teaching methods. In my view, the ultimate purpose of cognitive science is not only to understand the various mental processes it studies, but also to inform “professionals of cognition”, for instance teachers, about how the human mind works and about the conditions under which learning and teaching are most efficient. However, the field of applied cognitive science is still relatively young, and only modest conclusions can (and should) currently be drawn until more scientific evidence is available. Thus more research in this domain is sorely needed…

Dudelange, June 2013


Acknowledgements

First of all, I would like to thank my thesis supervisor, Axel Cleeremans, for his friendship and his ever-renewed trust, for the freedom he granted me in designing and carrying out my work, for his constructive criticism, for his availability, and for his many incentives to surpass myself (the “practice talks” and the enriching conferences).

I express my full gratitude to Ms. Natacha Deroost, Mr. Wim Gevers and Mr. Christophe Leys for agreeing to serve on the jury of this thesis.

I warmly thank Arnaud Destrebecqz, Daniel Holender and José Junça de Morais for their advice as members of my thesis advisory committee.

I cordially thank my young colleagues “from the 10th”, Maud Boyer, Michaël Dubois, Katia Duscherer, Vinciane Gaillard and Nicolas Schmidt, who gave me so much on both a professional and a personal level.

I thank Olivier Lejeune for assisting me in setting up and carrying out a number of the studies that are part of this thesis.

My research was supported by the Fonds National de la Recherche Scientifique (Belgium).

Thanks also to all my friends, old and new, who accompanied me in many ways throughout the adventure that a thesis represents. Without them, my life as a doctoral student would have been rather sad. Thank you Anne S., Anne T., Cathy, Christian, Dan, Isabelle, Jang, Jean-Paul, Lucien, Myriam H., Myriam K., Saskia, Vincent, Wouter, as well as all those I may have forgotten to mention.

I would also like to thank the members of my new research unit (EMACS) at the University of Luxembourg, who all welcomed me with great collegiality, friendship and patience while I was finishing the writing of my thesis (for several years already!). Special thanks go to Caroline, Christian, Christine and Gilbert. A very big thank-you to Romain Martin, the director of EMACS, who never stopped reminding me of the importance of this thesis for my academic career, even though I have in the meantime somewhat reoriented my research interests…

Thank you Yves! Thank you Anette! Thank you Christel! Thank you Frutz! What takes a long time finally turns out well!

Thank you Benoît! At last you will no longer be able to tease me about the “taboo” subject.

Finally, the presence and support of my dearest wife Nathalie, my son Pierre, my brother Felix, my parents Jacqueline and René, and my whole family have been indispensable to me throughout this work. Thank you!


In memory of my father, René F. REUTER


Introduction

We learn all the time. We learn even when we have no intention to learn, no awareness of the fact that we are learning or no awareness of what we are learning.

Learning is not confined to situations where we (often painfully) try to acquire some new knowledge, by memorizing facts for instance. To illustrate this basic assumption, let us briefly consider some examples of implicit learning in daily life. When you look at a person’s face, you can, most of the time, very rapidly and easily decide whether that person is male or female. However, unless you have some explicit expertise in the domain of face perception, you will most probably be unable to say how you did it, and you are surely not aware of any rules that you applied. When you hear a sentence, such as “Caroline should not have done this”, you are able to say whether it is grammatical or not. When speaking, we adhere to grammatical rules that we (often) cannot verbalize. As children, we already uttered grammatically correct sentences before ever having been informed about the underlying system of rules. In both cases, it seems that you have acquired some knowledge (about faces or language structure) without having had any intention to learn, or any awareness of what you learned. These two examples also make clear that the cognitive processes underlying our behaviours can be described by rules or sets of rules. Most importantly, however, it seems that explicit knowledge of these rules is not necessary to do the task correctly, nor is it necessarily true that our mind/brain uses such rules to do what it does.

The idea that we could learn without being aware of this learning has fascinated (and sometimes frightened) laymen and scientists alike for at least the last 40 years. The term implicit learning was first used by Arthur S. Reber in 1967, in a paper called “Implicit learning of artificial grammars”, in which he defined implicit learning as “the process by which [subjects] respond to the statistical nature of the stimulus array” (Reber, 1967). This learning process was itself defined as an “increasing sensitivity to the grammatical structure of the stimuli” (Reber, 1967). Reber also showed that the information learned about the underlying artificial grammar, which had been used to generate the structure of the stimulus array, could be transferred from a memorization task to a recognition task with new stimuli. In other words, Reber’s definition captured many essential features of the implicit learning phenomenon, which are still investigated and hotly debated by many cognitive scientists around the globe.

There has indeed been an overwhelming amount of empirical data collected during the last four decades using various experimental paradigms to study the phenomenon commonly called implicit learning, amongst which the Serial Reaction Time task (Nissen & Bullemer, 1987) has surely yielded the most elaborate and rich evidence.

Here are only a few of the gripping questions currently debated in the implicit learning literature, and they show how intrinsically complex and controversial this topic remains some 40 years after the inaugural paper by Arthur S. Reber (1967) on implicit learning:

How can we ensure that participants actually do not know what they have learned? Might they have some explicit knowledge that could explain their performance, which we merely fail to detect because we do not ask them the right questions? What are the methodological conditions necessary to distinguish explicit from implicit learning? Are implicit and explicit learning two different forms of learning or can they be explained by one single learning mechanism? What is the nature of implicitly acquired knowledge? Is attention to implicitly learned information mandatory? Is implicit learning about rule-abstraction or about learning associations? Where does implicit learning take place in the brain? Are there different parts of the brain involved in explicit and implicit learning? What does implicit learning tell us about the nature and function of consciousness? Is implicit learning an evolutionarily older form of learning than explicit learning? How automatic is implicit learning? Can implicit learning be observed in a wide range of learning situations?

The theoretical position we will try to underscore in this thesis is that of Cleeremans and colleagues (Cleeremans, Destrebecqz, & Boyer, 1998), claiming that “implicit learning is best construed as a complex form of priming taking place in continuously learning neural systems, and that the distributional knowledge so acquired can be causally efficacious in the absence of awareness that this knowledge was acquired or that it is currently influencing processing, that is, in the absence of metaknowledge”.

In order to study some of the questions related to the phenomenon of implicit learning, we will present original empirical data based on an experimental paradigm named “contextual cueing of visual attention” by Marvin M. Chun and Yuhong Jiang in their seminal paper (Chun & Jiang, 1998)¹. Based on empirical data and theoretical considerations, Chun and Jiang (1998) suggested that visual attention can be guided by implicitly acquired knowledge about visuo-spatial invariants.

A fundamental assumption underlying the various studies by Chun and colleagues is based on a general principle of implicit learning, pioneered by Arthur S. Reber (Reber, 1993), namely that whenever complex information about the stimulus environment is structured in a way that can be used to enhance task performance, human subjects become sensitive to such regularities, even if they are not aware of the knowledge that influences their behaviour.

Let us very briefly sketch the visual search task used by Chun and Jiang (1998). Participants have to search for a certain target item amongst distractor items and indicate its orientation (see the illustration below). Unbeknownst to them, for half of the trials, the global configurations of distractors are repeated for certain target locations, while for the other half, distractors occupy random locations. Reaction times for target items surrounded by repeated contexts decrease faster than for those surrounded by random contexts. Thus, the spatially invariant information seems to be used to find the target faster and more effectively, as compared to a target shown within a new visual context. However, participants are unable to tell that there were repeating visual contexts and unable to correctly recognize repeated context configurations. Furthermore, while they are, quite unsurprisingly, still unable to differentiate them after one week’s time, their behavioural responses in the visual search task still indicate that they somehow use their incidentally acquired knowledge about the relationship between visual context and target location.

1 Originally, the contextual cueing paradigm was designed to study the deployment of attention when interacting with visual scenes, rather than to study mechanisms of implicit learning as such. We will see below, however, that this paradigm is particularly interesting as a tool to study implicit learning processes, for various principled reasons.


Illustration of a visual display used in standard contextual cueing experiments. Observers have to search for a given target object (here the letter T, marked with an arrow) and respond by indicating its orientation (“left” in this case), using one of two keys on a computer keyboard. For half of the trials the configuration of all objects shown is repeated, and for the other half the objects are moved to other locations. Targets are oriented as often to the left as to the right, for both repeated and rearranged configurations. The keyboard keys represented below the computer screen stand for the response keys associated with the corresponding orientation of the target letter (L if the target is oriented to the left and R if it is oriented to the right).

According to Chun & Jiang (1998), the contextual cueing effect has several main characteristics, in terms of underlying learning and memory processes.

(1) It reflects an acquired sensitivity to meaningful regularities and covariances between objects and events in a scene.

(2) Implicit learning processes allow this complex information about the stimulus environment to be acquired without intention or awareness.

(3) Memory representations involved are unconscious, highly robust, instance-based, episodic, and distinctive (i.e., specific to training contexts).

(4) They interact with general-purpose spatial attention mechanisms to guide search within complex visual arrays.

(5) Contextual cueing is a form of memory-based automaticity.

In general, the contextual cueing paradigm is especially interesting for the study of implicit learning and consciousness because existing results indicate that implicitly acquired knowledge can guide visual attention and thereby shape the way we consciously perceive visual scenes (since we are mostly aware of those things we attend to). The process of extracting structural regularities about the visual world is likely to be a constant and continuous “companion” of visual awareness and attention. Indeed, according to Chun & Nakayama (2000), “visual processing [should] benefit from the accumulation of information provided by the spatial and temporal context of past [visual experiences]. This provides fine-tuned, internal indexing of the visual memory embodied in the external environment”. In other words, “implicit memory mechanisms allow attentional allocation to benefit from experience, allowing for efficient guidance and deictic indexing, and compensating for the ephemeral nature of visual information across views”.

Furthermore, the contextual cueing paradigm differs from other implicit learning paradigms in certain interesting ways and may therefore allow for new insights into the nature of implicit learning in general.

(1) Learning is very unlikely to involve any kind of rule-abstraction, since there are no abstract rules to be discovered.

(2) It does not engage any kind of motor learning, since motor responses do not correlate (by design) with the relevant knowledge of the association between context configurations and target locations.

(3) It seems to involve brain structures different from those used to perform other implicit learning tasks (Chun & Phelps, 1999). Indeed patients with medial temporal lesions may show preserved (if somewhat reduced) performance in standard implicit learning tasks, while showing no contextual cueing effect at all.

Finally, exploring implicit learning of visuo-spatial invariants is likely to yield new insights into the cognitive mechanisms involved in everyday situations of navigation in space, which have so far been largely described and conceptualized in terms of explicit and symbolic knowledge. However, such activities are likely to lead to the acquisition of implicit knowledge, which is largely ignored by current models of spatial cognition. According to Chun (2000), the study of contextual cueing has ecological validity, and it shows us the adaptive value of spatial context learning, given the importance of visual and spatial regularities for navigation and orientation in real-world situations.

One striking anecdotal example that hints at the existence of such implicit learning in spatial navigation is the following: you have surely already found yourself in an unknown supermarket and still somehow found your way to certain things relatively easily, without being able to fully verbalize your knowledge about the structure of the supermarkets you have been to. And you will surely be even more aware of the existence of implicit expectations about your “prototypical” supermarket layout when you do not find things where they “ought” to be. This however would have been the topic of another thesis…


Outline

Before presenting our own studies, we will first present a review of the scientific literature on contextual cueing, in order to give the readers of this thesis a better general idea of existing evidence and open questions within this relatively new research field.

The aims of our own experimental studies, presented in the subsequent chapters, are the following:

(1) to replicate and extend the findings described in the various papers by Marvin Chun and various colleagues on contextual cueing of visual attention;

(2) to explore the nature of memory representations underlying the observed learning effects, especially whether learning is actually implicit and whether memory representations are distinctive, episodic and instance-based or rather distributed, continuous and graded;

(3) to extend the study of contextual cueing to more realistic visual stimuli, in order to test its robustness across various situations and validate its adaptive value in ecologically sound conditions;

and (4) to investigate whether such knowledge about the association between visual contexts and “meaningful” locations can be (automatically) transferred to other tasks, namely a change detection task.

In a first series of four experiments, we tried to replicate the documented contextual cueing effect using a wide range of direct measures of learning (tasks that are supposed to be related to explicit knowledge), and we systematically varied the distinctiveness of context configurations to study its effect on both direct and indirect measures of learning.

We also ran a series of neural network simulations (briefly described in the general discussion of this thesis), based on a very simple association-learning mechanism, that not only account for the observed contextual cueing effect, but also yield rather specific predictions about future experimental data: contextual cueing effects should also be observed when repetitions of context configurations are not perfect. Indeed, the networks reacted to slightly distorted versions of repeating contexts in much the same way as they did to completely identical contexts. We therefore conjectured that, if the simple connectionist model captures some relevant aspects of the contextual cueing effect, human participants should become faster at detecting targets surrounded by context configurations that are only partially identical from trial to trial, compared to trials where the context configurations are randomly generated.

These predictions were tested in a second series of experiments using pseudo-repeated context configurations, in which some distractor items were either displaced from trial to trial or changed in orientation, while the global layout was preserved.
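The logic of this prediction can be illustrated with a toy sketch. It is not the network used in the simulations mentioned above, whose details are not given here: the class name `HebbianAssociator`, the cell indices, the learning rate and the number of repetitions are all hypothetical. A purely Hebbian rule strengthens the link between every occupied context cell and the target cell, so the support for the target location grows with the overlap between the current display and the learned one. Mapping this activation onto faster search times is an additional assumption.

```python
from collections import defaultdict


class HebbianAssociator:
    """Toy context -> target-location associator (illustrative assumption only)."""

    def __init__(self, learning_rate=1.0):
        self.learning_rate = learning_rate
        # Weight from a context cell to a target cell, keyed by (target, cell).
        self.weights = defaultdict(float)

    def train(self, context_cells, target_cell):
        # Hebbian update: strengthen every co-active (target, context cell) pair.
        for cell in context_cells:
            self.weights[(target_cell, cell)] += self.learning_rate

    def activation(self, context_cells, target_cell):
        # Summed support that the current display gives to the target location.
        return sum(self.weights.get((target_cell, cell), 0.0)
                   for cell in context_cells)


net = HebbianAssociator()
trained_context = frozenset(range(11))   # 11 distractor cells (hypothetical indices)
target_cell = 20                         # hypothetical target cell
for _ in range(5):                       # 5 repetitions of the same display
    net.train(trained_context, target_cell)

# Pseudo-repeated display: two distractors moved to previously empty cells.
distorted_context = (trained_context - {9, 10}) | {30, 31}
# Entirely new display: no cell shared with the trained context.
novel_context = frozenset(range(30, 41))

for name, ctx in [("identical", trained_context),
                  ("distorted", distorted_context),
                  ("novel", novel_context)]:
    print(name, net.activation(ctx, target_cell))
```

In this sketch the distorted display still activates the target location strongly, though less than the identical one, while the novel display does not activate it at all. This graded response is the qualitative pattern the prediction relies on.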

In a third series of experiments, we used more realistic images of natural landscapes as background contexts to establish the robustness of the contextual cueing effect as well as its ecological relevance, as claimed by Chun and colleagues. We furthermore added a second task to these experiments to study whether the acquired knowledge about the background-target location associations would (automatically) transfer to another visual search task, namely a change detection task. If participants have learned that certain locations of the repeated images are “important”, since they contain the target item to look for, then changes occurring at those specific locations should lead to less “change blindness” than changes occurring at other, irrelevant locations. We used two different types of instructions to introduce this second task after the visual search task: we either stressed the link between the two tasks, telling participants that remembering the “important” locations for each image could help them find the changes faster, or we simply told them to perform the second task without any reference to the first one.

We will close this thesis with a general discussion, combining findings based on our review of the existing research literature and findings based on our own experimental explorations of the contextual cueing effect. We will thereby discuss the implications of our empirical studies for the scientific investigation of contextual cueing and implicit learning, in terms of theoretical, empirical and methodological issues.


Contextual Cueing of Visual Attention: Review of the Literature

About a decade ago, Marvin Chun and Yuhong Jiang, then at Yale University, started to open up a new research field at the intersection of visual attention and implicit learning, which they named contextual cueing of visual attention.

In the present chapter, we will start with a relatively thorough description and discussion of their seminal paper (Chun & Jiang, 1998), which actually inaugurated the scientific research of the effects of implicit learning and memory of visual contexts on guidance of visual attention.

We will then review a number of relevant papers on contextual cueing² of visual attention published since 1998, and thereby try to explore its theoretical characterization, its adaptive meaning, its empirical conditions, its relationships to other implicit learning paradigms, as well as its neurophysiological and neuropsychological correlates.

The aim of this review is to give the readers of this thesis an overview, as broad and comprehensive as possible, of the studies done in the field of contextual cueing, and thereby to allow them to understand the scope of our own experimental data presented in the following chapter.

What is the “contextual cueing” effect?

Contextual cueing can be briefly defined as the processes by which implicit learning and memory of visual contexts guide spatial attention when a person is searching for a given target object. This definition is very broad and very specific at the same time: very broad in the sense that the visual contexts that can guide spatial attention are underspecified (they could, for instance, be the identity of co-occurring objects or their spatial layout); very specific in the sense that the learning and memory processes responsible for the guidance or facilitation effect are supposed to occur without awareness.

2 In the following, we will use “contextual cueing” even though the original authors have also used “contextual cuing” (see, for instance, Chun & Phelps, 1999).

This characterization of contextual cueing is based on the results of six experiments conducted by Chun & Jiang (1998). Their first experiment introduces the contextual cueing paradigm, while experiments 2 and 3 study the nature of memory representations for visual context, testing whether memory is explicit or implicit, whether learning is intentional or incidental, and whether the representations are specific or abstract. Experiment 4 examines how contextual information influences the efficiency of search using target slope measures as a function of set size. Experiment 5 uses flashed displays to see whether contextual cueing is dependent on motor skill learning expressed through eye movements. Finally, experiment 6 establishes the robustness and generality of contextual cueing.

Here we will first give a somewhat detailed description of their experiment 1, since virtually all subsequently published experiments have used very similar methods to study contextual cueing of visual attention. Participants were asked to perform a simple visual search task, in which they had to search for a target letter embedded within spatial configurations of highly similar distractor items. These visual items were placed within an invisible grid of 8 x 6 cells and slightly jittered within this rectangular array to prevent collinearities with other items. They consisted of 1 target letter T, rotated 90 degrees to the left or to the right from its normal vertical orientation, and of 11 distractor letters very similar to the target letter, namely L’s rotated by 0, 90, 180 or 270 degrees. The distractor items were heterogeneously coloured, with an equal number of them being red, green, blue and yellow (on average over trials). Each target item was presented in each colour for each of the 2 configuration conditions. Target items could appear in 24 different locations, 12 of them each associated with a specific unchanging configuration of distractor items (old configuration) and 12 of them surrounded by randomly placed distractor items (new configuration). Participants were presented with 30 blocks of 24 trials, 12 with old and 12 with new configurations. In old configurations, the colours of target and distractor items were preserved across all repetitions. In new configurations, the colours of target items appearing at specific locations were unchanged. The participants’ task was to search for the target letter and press one of two keys to indicate its orientation (left or right). Auditory feedback was given to indicate correct and incorrect answers. Since the orientation of the target was randomly chosen for each target location, there was no association between configuration and motor response. Finally, there were no explicit instructions revealing the repetition of stimuli, so it was, at the very least, an incidental learning task.

Results showed that people detected the target item faster when it was associated with repeated visual contexts than with random contexts. This “search facilitation” effect, measured as the difference in visual search reaction times (RTs) between trials with repeated and those with random visual contexts, was termed the “contextual cueing” effect. Indeed, while there was no difference at the beginning of the experiment, the facilitation effect emerged after just 1 epoch, or 5 repetitions. By convention, the magnitude of the contextual cueing effect is defined as the difference between repeated and random trials collapsed across the second half of the experiment. Its magnitude was about 71 ms in experiment 1.
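The design of experiment 1 and the conventional measure of the effect can be summarised in a short sketch. It is only an illustration under simplifying assumptions: colours and spatial jitter are omitted, and the function names (`make_design`, `cueing_magnitude`), the trial dictionary layout and the sign convention (new minus old, so that facilitation is positive) are ours, not those of Chun & Jiang (1998).

```python
import random

GRID_COLS, GRID_ROWS = 8, 6   # invisible 8 x 6 grid
N_DISTRACTORS = 11            # 11 rotated L's accompany the single target T
N_OLD = N_NEW = 12            # 12 "old" and 12 "new" target locations
N_BLOCKS = 30                 # 30 blocks of 24 trials


def make_design(seed=0):
    """Return a list of blocks; each block is a shuffled list of 24 trial dicts."""
    rng = random.Random(seed)
    cells = [(col, row) for col in range(GRID_COLS) for row in range(GRID_ROWS)]
    targets = rng.sample(cells, N_OLD + N_NEW)      # 24 distinct target cells
    old_targets, new_targets = targets[:N_OLD], targets[N_OLD:]

    def draw_distractors(target):
        free_cells = [cell for cell in cells if cell != target]
        return tuple(rng.sample(free_cells, N_DISTRACTORS))

    # Old configurations are drawn once and then repeated in every block.
    old_layouts = {t: draw_distractors(t) for t in old_targets}

    blocks = []
    for _ in range(N_BLOCKS):
        trials = []
        for t in old_targets:
            trials.append({"condition": "old", "target": t,
                           "distractors": old_layouts[t],
                           "orientation": rng.choice("LR")})
        for t in new_targets:
            # New configurations are regenerated on every presentation.
            trials.append({"condition": "new", "target": t,
                           "distractors": draw_distractors(t),
                           "orientation": rng.choice("LR")})
        rng.shuffle(trials)
        blocks.append(trials)
    return blocks


def cueing_magnitude(rts, n_blocks=N_BLOCKS):
    """rts: iterable of (block_index, condition, rt_ms), block_index 0-based.

    Returns mean RT (new) minus mean RT (old) over the second half of blocks.
    """
    second_half = [(cond, rt) for block, cond, rt in rts if block >= n_blocks // 2]

    def mean_rt(condition):
        values = [rt for cond, rt in second_half if cond == condition]
        return sum(values) / len(values)

    return mean_rt("new") - mean_rt("old")
```

Because the target orientation is drawn independently on every trial, a repeated configuration predicts where the target is but not which key to press, which is the property the paragraph above relies on to rule out motor learning.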

In experiment 2, Chun & Jiang wanted to study 3 issues: (1) the perceptual specificity of contextual cueing effects, (2) the explicitness/implicitness of the underlying spatial representations and (3) the influence of less discriminative contexts on cueing effects. They therefore used monochromatic items (which produce less distinctive contexts than coloured items) and different distractor sets for the first and the second half of the visual search task, so that the identity of distractors changed while their general configurational layout remained constant. Furthermore, in addition to the visual search task, participants were asked a series of questions about their memorization strategies and whether they had noticed anything special during the search task, and they then had to perform a configuration recognition task, in which they had to tell whether or not they had seen certain configurations.

Results suggest that the facilitation effect is related to an effect of global spatial layout rather than low-level perceptual learning, i.e. that the perceptual identities of distractors were not encoded with the information about global configurations, since coarse visual information was sufficient in this task. Furthermore, participants were unable to effectively distinguish between old and new configurations, suggesting that learning of the relevant contextual information was indeed incidental and led to implicit unconscious memory representations.

In experiment 3, Chun & Jiang showed that the critical information that is used to facilitate visual search is the predictive association or co-variation between distractor configurations and specific target locations. Indeed, merely repeating certain visual arrays without coupling them to certain target locations did not lead to a facilitation effect, suggesting that it is the predictive nature of contextual configurations that drives the enhanced visual search performance.

In experiment 4, Chun & Jiang investigated more directly the assumption that contextual information guides attention towards target locations, by studying the effect of set size (the number of distractor items varying between 8, 12 and 16) on target detection RT slopes (a standard measure of search efficiency). Results suggest, in sum, that contextual cueing affects the efficiency of search by directing visual attention towards relevant target locations rather than by enhancing early perceptual processing or late motor response selection and execution. Interestingly, as we will see later, Chun & Jiang noted that the “heterogeneous array of set sizes used in this experiment may have made the configurations more distinct [from each other] than in previous ones” and may thus have produced the contextual cueing effect very rapidly. They did indeed observe faster visual search for old than for new configurations as early as the second block.
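Since RT slopes play a central role in this argument, a short sketch may help to show how such a slope is obtained: it is simply the least-squares slope of mean RT regressed on set size, expressed in ms per item. The function name `search_slope` and the RT values below are hypothetical and chosen for illustration only; they are not the values reported by Chun & Jiang.

```python
def search_slope(set_sizes, mean_rts):
    """Least-squares slope of mean RT (ms) on set size, in ms per item."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, mean_rts))
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    return sxy / sxx


set_sizes = [8, 12, 16]        # set sizes used in experiment 4
new_rts = [900, 1040, 1180]    # made-up mean RTs, new configurations
old_rts = [850, 950, 1050]     # made-up mean RTs, old configurations

print(search_slope(set_sizes, new_rts))  # 35.0 ms per item
print(search_slope(set_sizes, old_rts))  # 25.0 ms per item
```

A shallower slope for old than for new configurations, as in this made-up example, is the pattern expected if context makes search more efficient; a benefit confined to the intercept would rather point to perceptual or response-related stages.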

In experiment 5, Chun & Jiang used flashed displays (i.e. displays presented so briefly that eye movements cannot take place during stimulus presentation) in order to study (1) the time course of contextual cueing effects (is the search benefit acting on early or late visual processes?) and (2) the influence of procedural learning of eye movement patterns (is the benefit due to motor learning?). Results, apart from replicating a facilitation effect similar to that found with non-flashed displays, suggested that contextual cueing did “play a role in the first few hundred ms of visual processing” and, more importantly, that it did not depend on eye movements or on any kind of procedural learning associated with eye movement programming and execution. Moreover, since participants were also asked to perform an explicit recognition task after the visual search task, results strongly suggested that the effect was driven by implicit representations, especially because those participants who became aware of the repetitions also showed much smaller contextual cueing effects than those participants who did not become aware of them.

In experiment 6, they examined the “robustness of contextual cueing across [slight] perturbations in the configurations and changes in target locations”. Whereas in the previous experiments items remained at the same location for any given configuration across repetitions, here they were “jittered randomly within their cells across repetitions”. In addition, they wanted to see whether contextual cueing could be found for more than one target location for a specific configuration. Finally, they introduced displays with targets at untrained locations, i.e. for already trained configurations, targets were moved to random locations, thus breaking up the predictive nature of the context-target location associations.

Results showed that (1) small distortions of configurations did not disrupt the contextual cueing effect, (2) it generalized to 2 target locations and (3) it did not occur with targets appearing in untrained locations.

Based on these experiments and theoretical considerations, Chun & Jiang argued that the newly discovered attentional guidance effect could be thought of as a “top-down, knowledge-based” factor influencing the way attention is deployed within and across visual scenes. To them, it is, furthermore, an “ecologically critical factor” compared to other top-down influences on visual attention, like search templates, automaticity effects, novelty effects, familiarity effects and expectancy effects. Based on work by Biederman (1972, cited in Chun et al., 1998) and Gibson (1966, cited in Chun et al., 1998) on visual scenes and the existence of a “rich and complex structure of covariation between visual objects and events”, Chun et al. (1998) had indeed assumed that “visual search tasks and especially objects in the real world are almost always accompanied by other objects forming a global context or scene” (emphasis added). Therefore, since “sensitivity to regularities in the [visual] environment would be informative” – a point also made by Arthur Reber (1989) in the domain of implicit learning –, “people [should] learn to exploit the structure” of their environment to adapt “their behavior in a coherent manner” (Chun & Jiang, 1998). Indeed, “an important goal of attentional systems is to rapidly prioritize aspects of a complex scene that are of significant behavioral relevance. Such efficient ‘smart’ deployment of attention is crucial for adaptive functioning” (Chun & Jiang, 1998). However, since “natural scenes […] tap into the rich background knowledge and extensive visual experience of observers, [it is difficult to study] how visual context can be defined, how it influences visual processing, and how contextual knowledge is acquired and represented”, Chun et al. (1998) used highly artificial visual stimuli, where global context is defined as the overall spatial layout of distractor items.³ These global contexts can be repeatedly associated with specific target locations or generated in a random way from trial to trial for other target locations.

In sum, Chun et al. (1998, emphases added) made the following proposals to characterize the newly discovered phenomenon of contextual cueing:

1. “Visual context guides visual attention […] reflect[ing] sensitivity to meaningful regularities and covariances between objects and events in a scene.”

2. “Contextual knowledge is acquired through implicit learning processes which allow complex information about the stimulus environment to be acquired without intention or awareness [and it] forms a highly robust, instance-based, implicit memory for context.”

3. “Memory for contextual information is instance-based, and […] these episodic memory traces [or context maps] interact with attentional mechanisms to guide search. Hence, we consider contextual cueing as a form of memory-based automaticity. […] Most important, these memory traces […] allow for a distinction between stimuli that were presented in the history of perceptual interactions from novel stimuli that were not. [Thus,] facilitation in performance would be specific to the contexts observers were trained on.”

4. “[The] rich memory for context interfaces with general-purpose spatial attention mechanisms to guide deployment to complex visual arrays.”

Beyond Chun & Jiang (1998)

In the following sections we will try to further define the contextual cueing phenomenon, based on a number of papers published as follow-ups to the seminal paper by Chun & Jiang (1998), which we have just presented in some detail. Our review of the literature will address certain important questions related to the theoretical characterizations of contextual cueing, its functions within the general architecture of the mind/brain, its adaptive meaning, its empirical conditions, its robustness across age, its relationship to other implicit learning paradigms, its links to other attentional processes, as well as its neurophysiological and neuropsychological correlates.

3 We will see, in one of the following sections, that studying learning of new contextual information can and should be done with natural scenes, since this can lead to very different results and conclusions.

One specific question that has been studied over and over by various researchers is what is actually learned during the visual search task that leads to the facilitation effect and what is the exact nature of the memory representations underlying it.

Other important questions have been to know what types of information of the visual world can lead to a contextual cueing effect, what can be part of the so-called “context maps” and what can eventually be a part of such a facilitation effect in conjunction with other contextual information.

Yet other studies have tried to find out under what conditions contextual cueing can be found and under which circumstances it cannot be observed.

Another strand of research has investigated the anatomical and neural correlates of contextual cueing, asking which part(s) of the brain participate in its emergence, how long it takes for the effect to arise, and at what stage of visual processing it occurs.

Some papers have also examined the effects of age on contextual cueing, whether it can be observed in healthy elderly people or not and whether more training is needed for it to emerge than in younger people.

There have also been a number of studies trying to contrast and/or combine the contextual cueing paradigm with other paradigms, mainly from the implicit learning literature (like the Serial Reaction Time task), for instance by studying whether people can become sensitive to the sequence of spatial locations occupied by the target from trial to trial in an otherwise standard “contextual cueing” visual search task.

In the same line of reasoning, there have been some attempts to see whether acquired sensitivity to visuo-spatial contextual information could transfer to other subsequent tasks, like the change detection task.

Another very important question, which has been addressed in many papers, relates to a specific aspect of the learning processes and the underlying memory representations, namely whether the contextual cueing effect essentially reflects an implicit (or unconscious) sensitivity to the hidden regularities of the visual world. This issue of “implicitness of learning” is, of course, very strongly reminiscent of the methodological, empirical and theoretical problems encountered in the domain of implicit learning in general. Some efforts have been made to design or to adapt “good” measures of implicit and explicit knowledge specifically for the contextual cueing paradigm.

We will also review a growing body of evidence from studies using real-world scenes rather than arbitrary artificial stimuli to investigate the nature, the robustness and the generality of contextual cueing (notably because it is a process supposed to be involved in visuo-spatial attention, perception and navigation in everyday life settings).

What is learned in contextual cueing?

While the seminal paper by Chun & Jiang (1998) showed that repeating the spatial configurations of distractors around a given target’s location leads to faster visual search performance, many subsequent studies were designed to explore more systematically what was actually learned, i.e. what kind of information was acquired and was thus responsible for the contextual cueing effect. It was not totally clear from the existing empirical evidence what was driving this facilitation effect.

Moreover, from the scientific literature on implicit learning in general, one knows that the abstract structural information hidden by the researchers is not always identical to the information used by participants (Perruchet, Gallego, & Savy, 1990). It is, for instance, possible that only parts of the repeated contexts are stored in memory. It is conceivable that spatially closer distractors are more easily associated and linked with the target’s location. In other words, is it really the global layout of all distractors that participants learn to use as a cue for the target’s location or is it rather a subset of distractors? Chun & Jiang (1998) have conceptualized the memory representations and processes underlying the contextual cueing effect in a very general way as “context maps”, by analogy with the concept of cognitive maps, where “the notion of context maps represents a generic mechanism which can be convolved with any computational process that enforces top-down modulation of attentional guidance”. Thus one of the important questions in the field of “contextual cueing” has been to know what these “context maps” are, i.e., which type of information they contain.

On the other hand, while trying to study what the context maps contain, many researchers have extended the contextual cueing paradigm and tried to investigate what kind of information can be part of such “context maps”. It may actually be that the original study has investigated one very specific kind of context map that could lead to more efficient visual search performances, while other contextual information could also “enter” these maps.

Chun & Jiang (1999) have indeed shown that, besides the static global spatial context (defined as the overall layout of distractor items), various other types of visuo-spatial features can serve as contextual information to guide attention. First they studied the covariation of visual shapes without configurational information, i.e., items that appear often together, regardless of their spatial arrangement, are used to predict target locations; secondly they used dynamic visual contexts, i.e., items that move around in predictable ways can also be used to predict the location of a target to be searched for. These experiments thus expand the scope of the contextual cueing effect and show that many kinds of visual covariations, beyond spatial configurations, can be used to guide visual attention, even when subjects are not informed about the existence of such covariations.

Similarly, Endo & Takeda (2002) combined spatial configuration learning and object identity covariation learning. Their results show that visual search is most efficient when both the configuration and the identities are repeated, the effects of these two types of cueing being additive. This suggests, according to Endo and Takeda, the existence of two functionally independent learning mechanisms. In other words, two somewhat different and independent context maps can be “built” during a visual search task, which then interact to facilitate later visual search performance.

In a subsequent study, Endo & Takeda (2004) further investigated the effect of learning spatial configurations, object identities, and a combination of both configurations and object identities on visual search performance. They found that contexts consisting of spatial configurations (the global layout of distractor items) produced facilitation effects, but not the object identities – even when both configurations and identities were completely correlated. Moreover, they found that repeating only object identities did have an effect on visual search RTs, leading to identity learning. In addition, both configuration learning and identity learning were found when, in some trials, each context was the relevant cue for locating the target. Observers could learn to use only the relevant context that was correlated with target locations. Thus, in sum, when multiple contexts are redundant, learning occurs specifically and selectively, as a function of the predictability of target locations.

Olson and Chun (2001) investigated whether temporal information could lead to contextual cueing, namely by implementing various types of sequential structures of visual events (which remain independent of responses): event durations, event identities and spatiotemporal event sequences. They showed that all these kinds of temporal information could guide attention: “temporal structure in perception can guide attention to a serial position, a point in space, or can cue the identity of an upcoming event” (Olson & Chun, 2001). They have also found that “the element immediately preceding the target is particularly salient and may provide more cueing power than other elements in the sequence”, suggesting that certain contextual information has greater importance in determining context maps.

Hodsoll and Humphreys (2005) combined contextual cueing and preview search in order to study (1) the effect of crowdedness of item displays on contextual cueing; (2) the effect of repeating preview vs. new items on contextual cueing; and (3) the effect of similarity between preview and new items (inducing more or less competition for selection). Their results indicate that old items, processed during the preview search, do provide a context for the subsequent search of new items, and that this context can be used to enhance search performance, if and only if there is a consistent spatial relation with the target. Contextual learning can thus operate across space (as shown by classical studies by Chun et al., 1998) as well as across time. Hodsoll and Humphreys were also able to show that contextual cueing does not arise “automatically, simply on the basis of contingent relations between stimuli and the environment”. Contextual information needs to be encoded in some way to become an effective cue for visual search processes.

Hyun & Kim (2002) showed, however, that task-irrelevant but predictive background information helped target search. The background was learned as well as its association with the spatial configuration of search items. Moreover, colour changes in background and moving the background shapes preserved contextual cueing effects. Interestingly, background information that was per se task-irrelevant was nevertheless used to help visual search. However, it remains difficult to assert that learning was implicit since no task to test explicit knowledge was used after the visual search task. Background shapes are very artificial, but this is consistent with the artificial visual search items, and former contextual cueing experiments.

Figure 0.1. This figure illustrates the kind of backgrounds that Hyun & Kim (2002) used in their experiments.

Yokosawa & Takeda (2004) extended Hyun and Kim’s (2002) study by combining background shapes and spatial layout of distractor items. Four visual context conditions were used: all-old, layout-old, target-old and irrelevant. In the all-old condition, backgrounds and distractor configurations were jointly repeated across blocks; in the layout-old condition only the distractor configurations were repeated, together with random backgrounds; in the target-old condition only the backgrounds were repeated and thus predictive of target locations; and finally, in the irrelevant condition, both visual contexts were randomly changed for each block. Results showed that visual search RTs became faster with training, except for the irrelevant trials. Moreover, the condition with combined predictive contexts rapidly triggered the fastest responses, followed by layout-old and target-old conditions. Both kinds of contexts can thus lead to contextual cueing effects and, when distractor configurations and background shapes are jointly predictive of the target locations, a stronger facilitation effect is observed compared to the two individual effects.

Jiang and Wagner (2004) directly addressed one specific central question about the nature of the memory representations underlying the contextual cueing effect, namely what precisely is learned: global configurations or individual locations. In order to study this question, the authors designed two experiments, where participants were trained on 36 visual search displays that contained 36 sets of distractor locations and 18 target locations. Contrary to the classical contextual cueing experiments, each target location was associated with two sets of distractor locations on separate trials. In experiment 1, after the visual search task, participants were presented with recombined displays, created by combining half of one trained distractor set with half of the other trained distractor set. Participants showed perfect transfer to those recombined displays. In experiment 2, they were presented with rescaled, displaced and perceptually regrouped displays, and they still showed good transfer performance, suggesting that relative locations between items were also learned. In sum, both individual target-distractor associations and configural associations are part of the context maps underlying contextual cueing effects. Based on computational accounts given for implicit sequence learning (Cleeremans, 1993), we think that these results can be reinterpreted in terms of distributed and gradual representations, rather than in terms of two separate types of context maps.
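The logic of the recombined displays of experiment 1 can be illustrated with a small sketch. We assume here, purely for illustration, that each distractor set is a list of grid cells, that the two sets paired with a target have the same size and share no location, and that the halves are drawn at random; the function name `recombine` and the coordinates are hypothetical, and the source does not specify how Jiang and Wagner selected the halves.

```python
import random


def recombine(set_a, set_b, rng):
    """Build a recombined display from two trained distractor sets.

    Takes half of the locations from set_a and the remaining locations
    from set_b (assumed: equal-sized sets with no shared location).
    """
    half = len(set_a) // 2
    from_a = rng.sample(set_a, half)
    from_b = rng.sample(set_b, len(set_b) - half)
    return from_a + from_b


# Two made-up distractor sets trained with the same target location.
set_a = [(0, 0), (1, 3), (2, 5), (4, 1)]
set_b = [(5, 5), (3, 2), (6, 0), (7, 4)]

display = recombine(set_a, set_b, random.Random(0))
print(display)  # four locations: two from set_a, two from set_b
```

Such a display never occurred as a whole during training, although every distractor location in it was previously paired with the target. Transfer to these displays is therefore informative about the learning of individual target-distractor associations.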

Jiang & Song (2005a) explored how perceptually specific contextual cueing based on spatial layouts is. In other words, they studied whether the facilitation effect can be transferred to spatial layouts that are occupied by items of new shapes or colours. They have found that learning can be specific to the trained colour, for instance, if the training phase includes trials with items of different colours (here black vs. white), but that it transfers to items of the other colour if the training set included only items of one colour. The same is true with changes in item shapes. The authors conclude that “implicit visual learning is sensitive to trial context, and that spatial context learning can be identity-contingent”.

We would like to suggest that these results are consistent with the account given by Ono, Jiang and Kawahara (2005) on the nature of statistical learning of visual information,
