Thesis
Reference
Prediction in Interpreting
AMOS, Rhona
Abstract
People make comprehension easier by predicting upcoming utterances. But what happens when people comprehend and produce utterances concurrently, in two different languages?
That is the question addressed in this thesis, which considers the role of prediction in simultaneous and consecutive interpreting. After developing a model of prediction in simultaneous interpreting, the dissertation reports three eye-tracking studies which use a visual-world paradigm. These studies explore whether prediction takes place during simultaneous interpreting; how specific this prediction is; whether interpreters predict differently from other bilinguals; whether training affects prediction; and whether a consecutive interpreting task affects predictive processing. The dissertation concludes that prediction often takes place in both simultaneous and consecutive interpreting, even in noisy conditions, and that interpreters tend to predict earlier and to a greater extent than other bilinguals.
Exploratory findings suggest that greater synchronicity of comprehension and production supports prediction – something that could be investigated in [...]
AMOS, Rhona. Prediction in Interpreting. Thèse de doctorat : Univ. Genève, 2020, no. FTI 37
DOI : 10.13097/archive-ouverte/unige:148890 URN : urn:nbn:ch:unige-1488906
Available at:
http://archive-ouverte.unige.ch/unige:148890
PREDICTION IN INTERPRETING
Thesis
Presented to the Faculté de traduction et d’interprétation of the Université de Genève
to obtain the degree of Doctor in Interpreting by
Rhona M. Amos
Jury:
- Prof. Kilian G. Seeber, Faculté de traduction et d’interprétation, Université de Genève (Thesis co-supervisor)
- Prof. Martin J. Pickering, Department of Psychology, University of Edinburgh (Thesis co-supervisor)
- Prof. Lucía Ruiz-Rosendo, Faculté de traduction et d’interprétation, Université de Genève (Jury president)
- Prof. Robert J. Hartsuiker, Department of Experimental Psychology, Ghent University (External jury member)
- Prof. Ena Hódzik, Department of Translation and Interpreting Studies, Boğaziçi University (External jury member)
- Prof. Franz Pöchhacker, Department for Translation Studies, University of Vienna (External jury member)
Defended on 16 December 2020 at the Université de Genève. Thesis No. 37
Acknowledgements
I had always imagined a PhD as meaning hours and days spent alone poring over books in the library, thinking very hard, and writing. But this has not been the solitary experience I had expected, as I was supported throughout by my supervisors as well as by friends and colleagues at Geneva, Ghent and Edinburgh universities.
I would like to start by thanking my two co-supervisors, Professor Kilian Seeber and Professor Martin Pickering. Professor Seeber guided me through my PhD, while also making sure I had the time, space and resources to develop my project and my ideas. He asked the right questions at the right times and made sure that I always had forward momentum. Professor Pickering has been a consistent and reliable academic mentor. He helped me develop my experiments, and has always been available, even on short notice, to answer questions, re-read my work and make comments that push me in my thinking.
I would also like to thank the following people:
- Dr Aine Ito for sharing her stimuli, her experimental set-up and her time as I set up and analysed my first experiment. She was always quick to respond to any questions that I had and made me feel like I could ask the same thing twice if something still wasn’t clear to me.
- Dr Laura Keller for showing me around the “LaborInt”, our eye-tracking lab, for the first time. She spent time showing me how to set up the eye-tracker and use the Experiment Builder software, and shared many tips from her own experience.
- Irene Carbona for the evenings spent working patiently together on RStudio and Excel. It made learning RStudio fun!
- Jesús González, Jean-Pierre Sossauer and Philippe Baudrion for all their technical support. A special thank you to Jesús for helping me with audio stimuli recording and editing and for always being on hand to sort out any issues in the experimental laboratory.
- Professor Robert Hartsuiker and his team for welcoming me to Ghent University. I’m particularly grateful to Mieke Slim, who helped find participants to norm my stimuli in Ghent, and Dr Matthias Franken, who showed me how to create “noisy” speech.
- All the members of Professor Holly Branigan’s lab group at the University of Edinburgh for welcoming me to and including me in their lab group discussions. I have learnt a lot from them about analysis, structuring presentations and collaborative working.
- Christine Mooser for taking the time to explain the Doc.Mobility procedure to me, and Professor François Grin for promoting my application, as my stays in Edinburgh and Ghent were made possible via this Swiss National Research Foundation funded scheme.
- The team, past and present, at the Interpreting Department of the University of Geneva for discussing my ideas with me, sharing their own experience, piloting my experiments and much more. I'm particularly obliged to Eléonore Arbona, Dr Carmen Delgado, Conor Martin, Dr Manuela Motta and Dr Magdalena Olivera Tovar-Espada.
- Professor Robert Hartsuiker, Professor Ena Hódzik and Professor Franz Pöchhacker for agreeing to be members of my jury, and for taking the time to read this dissertation. A special thank you to Professor Lucía Ruiz-Rosendo for agreeing to chair the jury.
Finally, I would like to thank all of the participants in my studies – it is fair to say that this thesis would
TABLE OF CONTENTS
Introduction
1 Literature Review
1.1 Interpreting – bilingual communication in a multilingual setting
1.1.1 Simultaneous and consecutive interpreting
1.2 Language comprehension in bilinguals
1.3 Language production in bilinguals
1.4 Prediction in comprehension
1.4.1 Prediction in monolingual comprehension
1.4.2 Prediction during comprehension of L2
1.4.3 Cognitive resources and prediction
1.5 Production-based prediction
1.6 Simultaneous interpreting – a unique bilingual setting?
1.7 Models of interpreting
1.8 Empirical research in Interpreting Studies
1.8.1 Empirical studies of the process of interpreting
1.8.2 Empirical studies on cognitive resources in interpreting
1.9 Prediction in interpreting
1.10 The visual world eye-tracking paradigm
1.11 The current thesis
2 A theory of prediction in simultaneous interpreting
2.1 Introduction
2.1.1 What do we mean by prediction?
2.1.2 The advantage of prediction in simultaneous interpreting
2.1.3 Traditional accounts of prediction in simultaneous interpreting
2.1.4 Prediction during comprehension of a native language
2.1.5 Prediction during bilinguals’ comprehension of L2
2.1.6 Prediction-by-production in a native language
2.1.7 Prediction-by-production in bilinguals listening to their L2
2.1.8 Cognitive load
2.2 A model of prediction-by-production in simultaneous interpreting
2.3 Conclusion
3 Study 1. Do interpreters predict differently from translators?
3.1 Introduction
3.1.1 Prediction in a native language
3.1.2 Prediction in adverse conditions
3.1.3 Prediction in simultaneous interpreting
3.2 The current study
3.3 Methods
3.3.1 Participants
3.3.2 Stimuli
3.3.3 Procedure
3.4 Results
3.4.1 Comprehension question accuracy
3.4.2 Eye-tracking data analyses
3.4.3 Linear-mixed model for experimental items by group
3.4.4 Linear mixed model for filler items by group
3.4.5 Between-group analysis
3.4.6 Relation between eye movements and lag in simultaneous interpretation
3.5 Discussion
3.5.1 Evidence for prediction during simultaneous interpreting
3.5.2 Simultaneous interpreters predict more than translators
3.5.3 Prediction of word-form during simultaneous interpreting
3.5.4 Exploratory findings suggest that lag and prediction are linked
3.6 Conclusion
4 Study 2. Do interpreters learn to predict differently during training?
4.1 Introduction
4.1.1 Prediction as a strategy to reduce the burden on working memory
4.1.2 Working memory and prediction
4.1.3 Evidence that predictive processing can change depending on cognitive abilities
4.1.4 Prediction as a pre-training aptitude
4.2 The current study
4.3 Methods
4.3.1 Participants
4.3.2 Stimuli
4.3.3 Procedure
4.4 Results
4.4.1 Comprehension question accuracy
4.4.2 Eye-tracking analysis
4.4.3 Eye-tracking data from the first ten participants
4.4.4 Eye-tracking data from all 23 participants
4.5 Discussion
4.5.1 No evidence that training affects predictive processing during simultaneous interpreting
4.5.2 Our findings suggest that students who predict more may obtain higher grades
4.6 Conclusion
5 Study 3. Prediction during consecutive interpreting in noise
5.1 Introduction
5.1.1 Consecutive Interpreting
5.1.2 Adverse listening conditions in consecutive interpreting
5.1.3 Listening to speech in noise in a native language
5.1.4 Listening to speech in noise in a non-native language
5.2 The current study
5.3 Methods
5.3.1 Participants
5.3.2 Stimuli
5.3.3 Procedure
5.4 Results
5.4.1 Comprehension question accuracy
5.4.2 Eye-tracking data analysis
5.4.3 Eye-tracking data
5.5 Discussion
5.5.1 L2 speakers make semantic predictions in speech-shaped sound
5.5.2 No evidence that a consecutive task affects prediction
5.5.3 No evidence of word-form prediction
5.5.4 Exploratory findings suggest an effect of task-order for the consecutive task
5.6 Conclusion
6 General Discussion
6.1 Summary of empirical findings
6.1.1 Prediction during interpreting
6.1.2 Interpreters predict earlier and more during simultaneous interpreting
6.1.3 No evidence of word-form prediction
6.1.4 Exploratory findings suggest a link between prediction and synchronicity of comprehension and production
6.2 Implications for theories and models of prediction
6.2.1 Role of prediction in interpreting
6.2.2 Our model of prediction
6.2.3 The role of production-based prediction
6.3 Implications for models of simultaneous interpreting
6.4 Conclusion
7 References
8 Appendices
8.1 Plagiarism declaration
8.2 Study 1. Critical sentences and visual objects
8.3 Study 2. Critical sentences and visual objects
8.4 Study 3. Critical sentences and visual objects
8.5 Sample LeapQ questionnaire
INTRODUCTION
Bilingual comprehension and production are intrinsically and explicitly linked in interpreting, and especially in simultaneous interpreting. How is it possible to listen to and produce utterances at the same time, in two different languages? One aspect of comprehension that has received attention both in the Interpreting Studies literature, relating to how interpreters carry out the task, and in the psycholinguistics literature, relating to how people comprehend rapidly and turn-take in conversation, is prediction. Prediction in interpreting will be the focus of this thesis.
This thesis is structured as a series of self-contained chapters. In the literature review (Chapter 1), we consider first how comprehension and production take place in bilinguals, as interpreting is an example of both bilingual comprehension and bilingual production. We consider the role for prediction in comprehension processes, and its possible link to production. We then turn to interpreting, and in particular simultaneous interpreting, and offer a review of the current state of research in Interpreting Studies in the light of current psycholinguistic evidence of comprehension, production and prediction processes. The second chapter proposes a theory of prediction in simultaneous interpreting. We then report on empirical work carried out to investigate (1) whether prediction takes place during a simultaneous interpreting task, (2) how specific this prediction is, (3) whether interpreters predict differently from another matched bilingual group (translators), (4) whether a between-group difference is due to training, and (5) whether a consecutive interpreting task affects predictive processing. This empirical work is presented in Chapters 3, 4 and 5.
1 LITERATURE REVIEW
1.1 Interpreting – bilingual communication in a multilingual setting
This thesis looks at interpreting as a process, and, more specifically, at prediction within this process. Interpreting Studies is a vast field, covering interpreting in different modes in different settings. Interpreting is variously defined by the context in which it takes place, for example: conference interpreting, courtroom interpreting, public service interpreting (Malmkjær & Windle, 2011), community interpreting (Pöchhacker, 2001) and humanitarian interpreting (Delgado Luchner & Kherbiche, 2019); or else by the modality used within the interpreted context, for example: simultaneous interpreting with text (Setton & Motta, 2007), consecutive interpreting (Pöchhacker, 2011), simultaneous interpreting (Christoffels & De Groot, 2004) and chuchotage (Baxter, 2016). Common to all of these forms of interpreting is the basic process of interpreting: comprehension of a spoken message in one language¹, and translating and subsequently producing (orally) the same message in another language. This thesis focuses mainly on simultaneous interpreting, but also considers consecutive interpreting.
¹ Simultaneous interpreting with text additionally includes comprehension of a written message, as may simultaneous and consecutive interpreting when the original speaker uses a written support (e.g., a
Interpreters are by definition bilinguals, and usually multilinguals. The interpreters in our studies (both students and professionals) were either being trained, or worked, in international organisations and institutions in Europe. Most simultaneous interpreters working for international organisations and institutions in Europe comprehend in their L2, and produce in their L1 (Pöchhacker, 2004). The language of comprehension is known as the “source language”, and the language of production is the “target language” (Christoffels & De Groot, 2005).
Figure 1. Interpreting – a very basic model [figure labels: Source language comprehension; Target language production]
In conference interpreting, a unique language classification system is used. As mentioned, in general in Europe, interpreters work into their L1. The L1 is always described as the interpreter’s “A” language, and is the language in which the interpreter is most proficient. The L2 may be described as either a “B” or a “C” language. The B-language is a language in which the interpreter is less proficient than in the A-language, but into which she sometimes interprets. A practical distinction between the A and B language exists in some workplaces but not in others. For example, according to AIIC (2019a), the interpreter may work into the B-language from one or several other languages just as for the A-language, whereas in the European institutions, the interpreter only works into the B-language from the A-language (Executive Committee on Interpretation, 2020). The C-language is a source language from which the interpreter works; in other words, it is a language which the interpreter comprehends, but into which the interpreter does not interpret. This classification system makes an implicit distinction between language comprehension and language production abilities in L2: interpreters are said to perfectly “understand” their C-language, while being perfectly “fluent” in their B-language (AIIC, 2019a).
In this dissertation, we do not distinguish between whether a language is classed as a B or a C language: instead we use the terms L1 and L2, with L1 meaning the first acquired and most dominant language, and L2 meaning a subsequently acquired, less dominant language. We consider that comprehension during simultaneous interpreting is generally L2 comprehension, and production is generally L1 production in a bilingual setting (although of course some interpreters are early bilinguals who acquired their L1 and L2 at approximately the same time). Our focus is on bilingual comprehension, production and prediction processes during interpreting. Specifically, we consider predictive processing during comprehension in simultaneous interpreting and consecutive interpreting.
1.1.1 Simultaneous and consecutive interpreting
Both simultaneous and consecutive interpreting involve comprehension of an utterance or a series of utterances in the source language, and the translation and subsequent production of the same utterance(s) in the target language. Source and target language are thus activated in parallel, and in direct relation to one another (the target language output should transmit the same meaning as the source language input). The difference between the two modes of interpreting is the timing of the target language production. In consecutive interpreting, interpreters wait until the speaker of the source language has finished talking (either momentarily or definitively), and then begin their interpretation in the target language. In simultaneous interpreting, the interpreter begins production in the target language while, at the same time, comprehending in the source language.
While the use of consecutive interpreting dates back thousands of years (Pöchhacker, 2011), the use of simultaneous interpreting using technical equipment dates back to the League of Nations and, in particular, the International Labour Organization (ILO), in the 1920s. However, it was the use of simultaneous interpreting at the Nuremberg trials in the 1940s that brought it to the world’s attention, and subsequently led to its use for most interpreted meetings at the UN (Pöchhacker, 2012) and later at the EU.
Consecutive and simultaneous interpreting share similarities. Both involve multi-tasking and frequent and regular language switching under time pressure and following an externally determined direction of translation (Dong & Li, 2019). Consecutive interpreting is often used in admission exams to interpreting programmes to gauge whether students have the potential to be interpreters (European Masters in Conference Interpreting, n.d.).
However, it is primarily simultaneous interpreting, and its attendant concurrent comprehension and production, that has interested psycholinguists since the 1960s (Pöchhacker & Shlesinger, 2002). Of course, sequentially, the production of component parts of the message in the target language tends to lag somewhat behind the comprehension of component parts of the message in the source language. However, source language comprehension and target language production overlap around 70% of the time that professional simultaneous interpreters work (Chernov, 1994).
Frauenfelder and Schriefers (1997) consider that the simultaneous interpreting task is rendered highly complex by both this concurrent timing of comprehension and production and, linked to this, the time-constrained translation of the utterance from one language to another. Much of the empirical work carried out in Interpreting Studies to date has related to this in some way, and has included: modelling the task of simultaneous interpreting; investigating strategies employed by interpreters to carry out this complex task; and exploring the extent to which simultaneous interpreting requires and/or trains specific skills (see Sections 1.7 and 1.8).
Given our basic definition of interpreting as bilingual comprehension of an utterance in the source language, its time-constrained translation, and the production of the same utterance in the target language, we will first review the current theories and empirical evidence of how bilingual comprehension and production take place, and the role of prediction in bilingual comprehension. We will then consider theories, models and empirical work on comprehension, production and prediction in Interpreting Studies in the light of the current state of knowledge in the field of psycholinguistics.
1.2 Language comprehension in bilinguals
In many ways, comprehension seems to happen completely effortlessly (Harley, 2014). Listeners convert acoustic input into meaning by decoding phonemes and parsing them into recognizable words, while processing the syntax (and thematic roles) of these words, and extracting the meaning of the utterance by integrating pragmatic, discourse and knowledge-based factors (Cutler & Clifton, 2000). This process takes place very quickly (Marslen-Wilson, 1973; Rayner & Clifton, 2009; Swinney, 1979).
Word recognition is an interactive lexical retrieval process in which knowledge about a whole word affects perception of its individual sounds (Frauenfelder & Tyler, 1987). Two central models developed in the 1980s have influenced speech perception theory: the cohort model (Marslen-Wilson & Tyler, 1980), and the TRACE model (McClelland & Elman, 1986). According to the cohort model, as a word is spoken, the brain activates a “cohort” of possible word-level hypotheses. The word is recognized at the point at which it becomes unique from its near neighbours, when the initial sequence of segments is common to that word and no other. This model gives bottom-up processing precedence over top-down processing, as context only becomes relevant once a cohort of words has been activated. On the other hand, the TRACE model of word recognition (McClelland & Elman, 1986) accommodates greater top-down processing through an interactive-activation account of word recognition, in which recognition at the word level feeds back to phoneme level recognition. This model thus accounts for the Ganong effect, where categorization at the word level is used to inform categorization of ambiguous phonemes (Ganong, 1980). More recent models add further nuance, placing more emphasis on the effect of semantics (Zhuang, Randall, Stamatakis, Marslen-Wilson, & Tyler, 2011) and orthographic and phonological neighbourhoods (Grainger, Muneaux, Farioli, & Ziegler, 2005) in word recognition. However, both the cohort and TRACE models focus on word recognition, and leave aside how sentences are parsed or interpreted by comprehenders.
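The recognition point described by the cohort model can be illustrated with a toy sketch. This is only an illustration under simplifying assumptions, not an implementation of the model itself: letters stand in for speech segments, the lexicon is a small hypothetical word list, and the function name `uniqueness_point` is ours.

```python
def uniqueness_point(word, lexicon):
    """Return the number of initial segments after which `word` is the only
    candidate left in the cohort, or None if it never becomes unique.

    Assumption: letters stand in for speech segments (phonemes)."""
    cohort = set(lexicon)
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        # Keep only the candidates still compatible with the input so far.
        cohort = {w for w in cohort if w.startswith(prefix)}
        if cohort == {word}:
            return i
    return None


# Hypothetical toy lexicon, chosen only for illustration.
TOY_LEXICON = ["trespass", "tress", "trend", "trestle", "treat"]

print(uniqueness_point("trend", TOY_LEXICON))  # unique once "tren" is heard
print(uniqueness_point("tress", TOY_LEXICON))  # unique only at the final segment
```

In this sketch the cohort shrinks purely on the basis of the incoming signal, which mirrors the bottom-up precedence attributed to the cohort model above. A word that is itself the beginning of a longer word (for example "treat" in a lexicon that also contains "treaty") never reaches a uniqueness point and the function returns None.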
But comprehension does not stop at word recognition. The comprehension process is incremental (see Rayner & Clifton, 2009 for a review), and studies of both reading (Altarriba, Kroll, Sholl, & Rayner, 1996; Ehrlich & Rayner, 1981; Federmeier & Kutas, 1999; Frisson, Rayner, & Pickering, 2005) and listening (Huette, Winter, Matlock, Ardell, & Spivey, 2014; Spivey, Tanenhaus, Eberhard, & Sedivy, 2002; Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995) provide evidence that context, both linguistic and extra-linguistic, influences comprehension. In fact, comprehenders do more than simply integrate information into a preceding context and their own world knowledge: they also predict what they are about to hear (see Section 1.4).
Crucially, these theories of comprehension are largely based on processes thought to take place when native speakers listen to their native language. As indicated above, however, interpreters generally work from a non-native source language (their L2).
Language comprehension in L2 also involves word identification, parsing, semantic-syntactic representation and text representation and integration (Kroll & De Groot, 2005), just as it does in L1. As in L1 comprehension, the different stages of this process may interact with one another (Dijkstra & van Heuven, 2002).
Importantly, however, models of comprehension in bilinguals and multilinguals must consider the potential co-activation of two or more lexicons during comprehension. The BIMOLA model for bi- and multilingual word recognition (Grosjean, 1988, 1997) is modelled on the TRACE model. It assumes that different languages are stored in separate lexicons, but that both languages share the lowest feature level and begin to separate into different networks at the phoneme and word level (Grosjean, 1988). In contrast, in their BIA+ model, Dijkstra and van Heuven (2002) propose that language activation is non-selective at the orthographic, phonological and semantic levels (in the case of interlingual homographs, both representations are activated, but these representations may nonetheless be stored separately for each language). The model accounts for top-down effects (lexical, syntactic or semantic) on word identification, as well as on the extent of activation of each language. However, both languages are assumed to be active, to some extent, all the time, as it does not seem possible to suppress one reading of an interlingual homograph (Dijkstra & van Heuven, 2002). The SOMBIP model (Li & Farkas, 2002) also assumes an integrated lexicon, but takes into account differences in proficiency between bilinguals, and changes due to learning, thus proposing a more nuanced picture of both the separate and interactive nature of a bilingual’s two languages. The BLINCS model (Shook & Marian, 2013) accounts for further variability by considering how both long-term features of bilingualism, such as age of acquisition or proficiency, and short-term features, such as recent exposure, might affect activation of languages. The BLINCS model also accommodates the integration of visual context during language comprehension.
All of these models account in some way for the parallel activation of a bilingual’s two languages. The extent to which both of a bilingual’s languages are activated has also been the subject of empirical research. However, there is still debate about the extent to which, and when, both languages are activated. For instance, Thierry and Wu (2007) found that Chinese-English bilinguals associated words in English when the Chinese translations of these words had related forms (e.g., huo che/train, and huo tui/ham), even though the words were not related in English. This suggests that even in a monolingual English context, Chinese was activated, and thus that lexical access is non-selective. However, a non-selective account is not the only way to explain these findings: they could be due to the way in which word associations are transferred from L1 to L2 during learning, with traces of L1 remaining within the L2 lexicon (Costa, Pannunzi, Deco, & Pickering, 2019).
Another question relates to the extent to which semantic and syntactic processing is shared between languages. Kroll and Stewart (1994) suggest that links are stronger between L1 lexical items and concepts than between L2 lexical items and concepts, and that, at least some of the time, conceptual links from L2 are made by first accessing the L1 translation equivalent and then the concept. Their model supposes that with increasing L2 proficiency, L2 conceptual access becomes increasingly independent from L1. The model can be used to explain translation asymmetry: translation from L2 to L1 may take place more quickly than the reverse, because the L2 lexical item links directly to the L1 lexical item, whereas translation in the opposite direction is conceptually mediated. However, the picture is more complex, because switching from L2 to L1 has been shown to take longer than the reverse (Meuter & Allport, 1999). These findings have been explained by an inhibition account according to which a bilingual’s stronger language must also be more strongly inhibited, and is thus more difficult to re-activate (Green, 1998; Meuter & Allport, 1999).
Further, there is now evidence that L1 and L2 are also routinely linked on the semantic level. Priming studies have consistently found that lexical decisions are made faster when a word is presented after a semantically-related prime word, regardless of whether the prime is presented in the same language as the target or in a different language (see Francis, 2005). Even across languages with different scripts, and in masked priming studies², robust priming effects have been shown with an L1 prime and an L2 target (Gollan, Forster, & Frost, 1997; Jiang & Forster, 2001).
² In masked priming studies, the prime is displayed on the screen for such a short time period (e.g., 60
Hoversten and Traxler (2016) investigated whether semantic cues could influence lexical activation in bilinguals by examining semantic sharing in the context of a sentence. Participants read sentences in English that were semantically constraining for either the English or the Spanish meaning of an interlingual homograph, for instance “pie” (meaning foot in Spanish). One sentence was semantically constraining for the English meaning (the congruent condition), e.g., “While eating dessert, the diner crushed his pie accidentally with his elbow.” The other sentence context was constraining for the Spanish meaning of the noun (the incongruent condition), e.g., “While carrying bricks, the mason crushed his pie accidentally with the load.” The authors found that bilinguals did not initially appear to activate the Spanish meaning of the interlingual homograph when reading in English (as measured by reading time on the interlingual homograph), even when sentence constraints encouraged this. However, the authors suggested that shorter overall reading times in bilinguals for the incongruent sentences could be due to the bilinguals integrating the Spanish meaning of the homograph later in processing, suggesting that the Spanish meaning did become available. This study lends support to a nuanced view of language activation during bilingual comprehension, in which both language environment and context influence the strength of activation at different points in the time course of lexical access.
There is also evidence that syntax and grammar may be shared across languages during comprehension. De los Santos, Boland and Lewis (2020) investigated whether grammar is shared between a bilingual’s two languages. Participants read word pairs which were either grammatical or ungrammatical, and either in the same language (English or Spanish) or mixed languages (English and Spanish). Participants read the second word (a noun) in the grammatical pairs more quickly than in the ungrammatical pairs. Crucially, the grammaticality effect was found across the same and mixed language conditions, indicating that syntactic representations are language independent, supporting a shared view of syntax. Studies of dialogue also show that syntactic structures may be primed across languages (e.g., Hartsuiker, Pickering, & Veltkamp, 2004; see Section 1.3 for further discussion).
Studies of parsing preferences also support a shared view of syntax. For instance, Dussias (2001, 2003, 2004) has shown that parsing preferences in an L2 can affect parsing of the L1. Dussias (2003) exploited the fact that, broadly speaking, native English speakers have low attachment preferences, whereas native Spanish speakers have high attachment preferences (the picture is slightly more complex; for a review, see Pickering & Van Gompel, 2006). Dussias (2003) had participants read sentences such as “Peter fell in love with the daughter of the psychologist who studied in California” in English or Spanish. English monolingual speakers tended to attach “who studied in California” to the psychologist, whereas Spanish monolingual speakers attached the relative clause to the daughter. However, Spanish-English bilinguals exhibited low-attachment preferences even when reading in Spanish, suggesting that the L2 can affect the L1 at the grammatical level.
More recent evidence also suggests that sentence-level constraints may influence bottom-up cross-language activation. Chambers and Cooke (2009) had native English speakers listen to constraining and non-constraining sentences in their L2, French, as they looked at four objects. The preferred name of one object was “chicken” (poule in French), and the preferred English name of another (pool) was an interlingual homophone of poule. The sentences took the form “Marie va nourrir/décrire la poule” [Marie will feed/describe the chicken]. When participants heard the constraining verb (feed), they looked at the target object (chicken) and rarely considered the interlingual homophone (pool). This suggests that where there is greater prediction (top-down processing), language activation is more selective (see also: FitzPatrick & Indefrey, 2010).
In sum, top-down and bottom-up processes interact in language comprehension in both bilinguals and monolinguals. In bilingual comprehension, there is evidence of cross-language activation in word recognition, even in a monolingual context, and certainly there is cross-language activation in a bilingual context such as may be found in interpreting. There is also evidence of shared representations at the syntactic and semantic levels. Further support for language sharing comes from the language production literature.
1.3 Language production in bilinguals
Just as language comprehension has often been considered from the bottom up, from word recognition, to syntax, to semantics, so has language production often been considered from the top down. Accounts of production assume that speakers first conceptualise their utterance, then convert that concept into syntactic representations, and then construct sound-based representations before they articulate (e.g., Bock & Levelt, 1994). The conceptual level has traditionally been viewed as non-language specific, and non-automatic (Levelt, 1989). Once speakers decide on the message they wish to convey, they formulate the message by selecting words (lemmas) in their syntactic context, and activate their phonology (sound processing) before articulating. At these later stages the process is traditionally considered as taking place automatically (Levelt, 1989). Ferreira and Pashler (2002) demonstrated that lemma selection and word-form selection require cognitive resources, and are thus affected by performance of a concurrent task, while phoneme selection is not. They conclude that lemma selection and word-form selection are subject to a processing bottleneck, meaning that they cannot occur simultaneously (see Boiteau, Malone, Peters, & Almor, 2014; Sjerps & Meyer, 2015 for further empirical evidence that speech planning is more cognitively demanding than articulating).
De Bot (1992) proposes an adapted version of Levelt’s (1989) model of speech production for bilingual production. The model assumes that production mechanisms are essentially the same for native and non-native languages, and that bilinguals make use of two essentially equivalent production systems, choosing the language of production after conceptualising their utterance. By this account, production becomes language specific directly after the message is conceptualised. The idea of two separate lexicons is also central to Kroll and Stewart’s (1994) model, which assumes that L1 words are more strongly linked to the conceptual level than L2 words, suggesting that switching into L2 is more demanding in terms of lexical selection than switching into L1. However, as reviewed above (Section 1.2), there is now extensive evidence of cross-linguistic activation at almost every level of representation, and this is also the case for language production (for a review, see Brysbaert & Duyck, 2010).
Recent empirical studies have shown that languages share the syntactic and lexical levels of processing. Hartsuiker et al. (2004) showed that, in dialogue, L1-Spanish-L2-English bilinguals tended to repeat syntactic structures used by a confederate when describing pictures, even though the confederate spoke Spanish and the participant spoke English. In a similar experiment, Loebell and Bock (2003) had L1-German-L2-English participants describe a picture using a designated sentence in German, before freely describing another picture in English (or the reverse, starting with the designated sentence in English). They found that dative constructions were primed across the two languages. More recently, Hatzidaki, Branigan, and Pickering (2011) showed cross-linguistic syntactic number agreement in English-Greek and Greek-English bilinguals. Participants read aloud a noun phrase in either English or Greek, in which the noun had a different syntactic number across the two languages (e.g., The money; singular or Ta λεφτά; plural). They were asked to complete the phrase in either the same language, or in the other language. Participants sometimes produced a verb which agreed in number with the translation of the subject noun, and did so much more often when they produced the verb in the language of the noun’s translation (i.e., when they switched languages). This suggests that they concurrently activated both languages’ syntax (on many occasions at least). In a corpus-based analysis of English-Spanish bilinguals, Fricke and Kootstra (2016) showed that these findings extend to priming in spontaneous code-switching. These findings support accounts according to which syntax and grammatical encoding are shared across languages.
When it comes to word production, the top-down nature of speech production means it cannot simply be considered as speech comprehension in reverse. Unlike in speech recognition, where a bilingual’s other language may be activated in a bottom-up manner (Dijkstra & van Heuven, 2002), speakers do exert some control over which language is activated for word production (Costa & Santesteban, 2004). This control is the focus of Green and Abutalebi’s (2013) adaptive control model for bilingual production. The model places an emphasis on communicative goals, such as speaking in one language rather than another, and supposes that bilingual speakers can maintain their goal even in the presence of activation of the other language, by suppressing lexical competition. This contrasts with monolingual models of speech production, which do not assume any inhibitory processes (Costa, 2005). Importantly, Green and Abutalebi (2013) thus assume that bilingual production involves more cognitive control than monolingual production; that regularly exercising cognitive control for language selection leads to enhanced cognitive control in bilinguals; and that this enhanced cognitive control also manifests in nonverbal tasks.
This idea that lexical access is more difficult for bilinguals is supported by empirical evidence from Ivanova and Costa (2008), who compared picture naming speeds between a group of Spanish monolinguals and a group of L1-Spanish-L2-Catalan bilinguals and found that the monolingual group was faster at picture naming. Similarly, Gollan, Montoya, Fennema-Notestine, and Morris (2005) found that English-dominant bilinguals named pictures more slowly than English monolinguals (although they named the same pictures just as quickly as monolinguals once they had been shown multiple times). These findings could be the result of cross-language interference as bilinguals retrieve a lexical representation. However, other factors may be at play: in Gollan et al.’s (2005) study, participants’ dominant language and their first acquired language were not always the same, and they were tested in the dominant language; frequency effects may play a role, as bilingual speakers will use both lexicons correspondingly less than monolingual speakers, making each lexical entry less frequent; and the typical age-of-acquisition of a particular word may also affect lexical retrieval, and this is often, but not always, confounded with the frequency effect. These studies thus illustrate, on the one hand, that lexical access may be more difficult for bilinguals, and, on the other hand, that it is difficult to ascribe this additional difficulty to any one feature of bilingual production.
1.4 Prediction in comprehension
1.4.1 Prediction in monolingual comprehension
We have reviewed the rapid recognition of words by listeners and their integration into a syntactic and semantic context during comprehension. When a word is predictable within a sentence, e.g., bath, in the sentence “The tired mother gave her dirty child a bath”, then the predictable word is typically processed more quickly by a listener than a less predictable word (e.g., shower) (Schwanenflugel & Shoben, 1985). One reason for this is that the word bath is easily integrated with the preceding context (i.e., when listeners hear the word bath it quickly makes sense to them). However, comprehension takes place even more quickly than such a bottom-up route would suggest – and this is because listeners regularly predict what they are about to hear (for a review, see Pickering & Gambi, 2018).
We consider prediction to mean the pre-activation of any aspect of a linguistic representation (i.e., meaning, syntax or form) before the listener hears (or reads) that representation. In other words, when listening to the sentence “The dentist asked the man to open his mouth a little wider”, it is possible to predict the word mouth at different levels – e.g., at the semantic (conceptual) level, at the syntactic level (noun, singular) and at the phonological level (e.g., /maʊθ/). Semantic, syntactic and phonological prediction may thus be viewed as discrete parts in the prediction process, and empirical studies have demonstrated prediction at each (and sometimes more than one) of these levels. It is important to make a clear distinction between the pre-activation of features of a predictable word, and the rapid integration of a predictable word in a context (as in Schwanenflugel & Shoben, 1985). Demonstrating that an utterance is predictable is not the same as demonstrating that an utterance has been predicted. To be fully confident that prediction has taken place, a measure needs to be taken before and not after the predictable word is uttered. The psycholinguistic studies included in this review employ methods that make it possible to measure predictive processing before the occurrence of a predictable word in discourse, or else use a design that rules out an integration account as an explanation for their findings (e.g., Federmeier & Kutas, 1999). In Interpreting Studies, prediction is often referred to as anticipation, and theoretical and empirical work has been based on slightly different concepts of prediction (see Section 1.9).
Evidence of prediction during comprehension comes from empirical studies of prediction in monolinguals. In a seminal study, Altmann and Kamide (1999) presented participants with scenes containing an agent (e.g., a boy) and four objects (e.g., a cake, a train set, a toy car, and a balloon). Participants heard a sentence with a verb that was semantically linked to either only one or all four of the objects in the display, such as “The boy will eat the cake” or “The boy will move the cake”. When participants heard the constraining verb (eat), anticipatory eye movements began towards the cake before noun onset, whereas when they heard the non-constraining verb, they first looked at the cake only after noun onset. This shows that semantic information in the verb (edible) was used to predict the meaning of the noun (an edible object). Further evidence of verb-mediated semantic prediction in monolinguals comes from the visual-world paradigm (Boland, 2005; Kamide, Altmann, & Haywood, 2003), as well as from studies without a visual scene (Grisoni, McCormick Miller, & Pulvermüller, 2017). There is also visual-world evidence of the reverse effect, i.e., that linguistic information in a sentence can be used to predict the verb (Knoeferle, Crocker, Scheepers, & Pickering, 2005).
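The logic of the visual-world measure described above (a look to the target that begins after verb onset but before noun onset counts as anticipatory) can be illustrated with a minimal sketch. This is not the analysis used in any of the cited studies: the `Fixation` record, the function names, the trial layout and all timing values are hypothetical and invented for illustration, and published analyses typically also allow time for saccade planning and use statistical models over time bins.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    start_ms: int      # fixation onset, relative to sentence onset (hypothetical field)
    object_name: str   # label of the fixated object in the display (hypothetical field)


def has_anticipatory_look(fixations, target, verb_onset_ms, noun_onset_ms):
    """True if a fixation on the target starts at or after verb onset and before noun onset."""
    return any(
        f.object_name == target and verb_onset_ms <= f.start_ms < noun_onset_ms
        for f in fixations
    )


def anticipatory_proportion(trials, target):
    """Proportion of trials containing at least one anticipatory look to the target."""
    if not trials:
        raise ValueError("no trials supplied")
    hits = sum(
        has_anticipatory_look(
            t["fixations"], target, t["verb_onset_ms"], t["noun_onset_ms"]
        )
        for t in trials
    )
    return hits / len(trials)


# Invented toy data: verb onset at 600 ms and noun onset at 1200 ms in every trial.
constraining = [  # e.g., "The boy will eat the cake"
    {"verb_onset_ms": 600, "noun_onset_ms": 1200,
     "fixations": [Fixation(300, "boy"), Fixation(900, "cake")]},
    {"verb_onset_ms": 600, "noun_onset_ms": 1200,
     "fixations": [Fixation(700, "train set"), Fixation(1000, "cake")]},
    {"verb_onset_ms": 600, "noun_onset_ms": 1200,
     "fixations": [Fixation(800, "balloon"), Fixation(1350, "cake")]},
]
non_constraining = [  # e.g., "The boy will move the cake"
    {"verb_onset_ms": 600, "noun_onset_ms": 1200,
     "fixations": [Fixation(800, "toy car"), Fixation(1400, "cake")]},
    {"verb_onset_ms": 600, "noun_onset_ms": 1200,
     "fixations": [Fixation(500, "cake"), Fixation(1300, "cake")]},
]

p_constraining = anticipatory_proportion(constraining, "cake")
p_non_constraining = anticipatory_proportion(non_constraining, "cake")
print(p_constraining, p_non_constraining)
```

In this toy data set, looks to the cake that begin before verb onset or after noun onset are not counted, which mirrors the requirement that the measure be taken before the predictable word is heard.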
Federmeier and Kutas (1999) showed prediction of semantic meaning in an ERP study. They had participants read sentences (e.g., “They wanted to make the hotel look like a tropical resort. So, along the driveway, they planted rows of…”). Sentences culminated in a predictable noun (e.g., palms), an unexpected noun from the same semantic category (e.g., pines), or an unexpected noun from a different semantic category (e.g., tulips). Sentences were either very constraining for the predictable word (high cloze), or else slightly constraining (medium cloze). They found a greater N400 reduction at the expected nouns in the high-cloze than in the medium-cloze sentences. Critically, they also found a smaller N400 effect for the within-category unexpected nouns in the high-cloze than in the medium-cloze sentences, even though the within-category unexpected nouns were rated as less plausible continuations in the high-cloze context. This implies that prediction, rather than easier contextual integration, facilitated processing.
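The high-cloze versus medium-cloze contrast rests on cloze probability, which is conventionally estimated as the proportion of norming respondents who complete a sentence frame with a given word. The sketch below shows that calculation only; the function name and the response list are hypothetical and invented for illustration, and they are not norming data from Federmeier and Kutas (1999).

```python
from collections import Counter


def cloze_probabilities(completions):
    """Map each completion to the proportion of respondents who produced it."""
    # Normalise case and surrounding whitespace so "Palms" and "palms" count together.
    normalised = [c.strip().lower() for c in completions]
    if not normalised:
        return {}
    counts = Counter(normalised)
    total = len(normalised)
    return {word: n / total for word, n in counts.items()}


# Invented responses to "...along the driveway, they planted rows of ___".
responses = [
    "palms", "palms", "Palms", "palms", "trees",
    "palms", "flowers", "palms", "palms", "palms",
]
cloze = cloze_probabilities(responses)
print(cloze)
```

With these invented responses, eight of ten respondents produce “palms”, so the frame would count as highly constraining for that word under this simple measure.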
There is also evidence of syntactic prediction. Wicha, Moreno and Kutas (2004) showed that not only do people predict the meaning of upcoming words, they also predict their grammatical gender. In a matched gender condition, participants read sentences in Spanish that were constraining for a particular word, such as “crown” in, “The prince dreamt about having his father’s throne. He knew that when his father died, he would finally be able to wear the crown for the rest of his life.”3 The sentences contained either the semantically congruous noun, e.g., la corona (crown, feminine), a semantically incongruous noun of the same gender e.g., la maleta (suitcase, feminine), the semantically congruous noun preceded by the incorrect gender e.g., el corona (crown, masculine) or the semantically incongruous noun preceded by the incorrect gender e.g., el maleta (suitcase, masculine). They found that semantically incongruent nouns elicited an N400 effect (again suggestive of semantic prediction). Importantly, they also found that unexpected articles generated an enhanced positivity effect compared to expected articles, indicating that readers predicted gender (for prediction of gender, see also: Otten, Nieuwland, & Van Berkum, 2007; Otten & Van Berkum, 2008; Van Berkum, Brown, Zwitserlood, Kooijman, & Hagoort, 2005).
3 Original Spanish sentence: El príncipe soñaba con tener el trono de su padre. El sabía que cuando su padre
Several studies provide evidence of prediction of word form. Laszlo and Federmeier (2009) had participants read highly constraining sentences that ended with either a predictable noun, its orthographic neighbour, an orthographically related pseudoword, or an orthographically related illegal string (e.g., “It was a beautiful summer day, with not a cloud in the sky/spy/smy/skq”) or else with the predictable noun, an unexpected word, an unrelated pseudoword or an unrelated illegal string (e.g., “The genie was ready to grant his third and final wish/clam/horm/rqck.”). They found that orthographic neighbours, be they words, pseudowords, or illegal strings, elicited smaller N400s than orthographically unrelated items, suggesting that orthographic information can be predicted, and that predictions interact with bottom-up processing even before words are recognised (e.g., before pseudowords and illegal strings have been processed for meaning) (see also Kim & Lai, 2012).
Ito, Pickering and Corley (2018) demonstrated prediction of word form during listening. Participants heard highly constraining sentences, such as, “The tourists expected rain when the sun went behind the cloud but the weather got better later”, and looked at a visual display containing either the predictable word (e.g., cloud), a word phonologically related to the predictable word (e.g., clown) or an unrelated word (e.g., globe). Not only did participants fixate the predictable object more than the unrelated object before word onset, they also fixated the phonologically related object more than the unrelated object before word onset. This shows that when listening, participants may pre-activate not only the meaning, but also the form of a predictable word.
1.4.2 Prediction during comprehension of L2
Prediction during comprehension of an L1 has been established at the levels of semantics, syntax and word form. However, interpreting involves prediction during comprehension of an L2. In some ways, prediction during comprehension of an L2 and prediction during comprehension of an L1 are similar. However, prediction during L2 comprehension may be slower and less detailed than prediction during L1 comprehension.
Semantic prediction in L2 has been demonstrated in several studies (Dijkgraaf, Hartsuiker, & Duyck, 2017; Ito, Corley, & Pickering, 2017; Ito et al., 2018; Martin et al., 2013). Syntactic prediction has also been shown in L2 speakers (Foucart, Martin, Moreno, & Costa, 2014). This demonstrates that people predict upcoming meaning and syntax when listening in their non-native language.
However, there is also evidence that prediction during L2 comprehension may be slower and less detailed than prediction during L1 comprehension. For example, Martin et al. (2013) compared L1 and L2 readers, and found that while L1 readers predicted both meaning and form, L2 readers only predicted meaning. Similarly, Ito et al. (2018) found evidence in a listening task that L1 listeners predicted form and meaning, whereas L2 listeners predicted only meaning. Lew-Williams and Fernald (2010) found that L2 Spanish speakers were not able to use gender cues as L1 Spanish speakers were. Participants listened to short sentences containing a gender-marked object, e.g., “Encuentra la pelota”4 while they looked at scenes containing two objects, one of which was the named object, and the other of which either matched or mismatched it in grammatical gender. While native Spanish speakers were able to use the gender cue to look to the named object more quickly when it was presented with an object of mismatched gender, non-native speakers were not. Mitsugi and MacWhinney (2015) report similar results in a study of L2 speakers of Japanese.
However, prediction in L2 is not uniform among L2 speakers. For instance, Hopp (2013) showed that highly proficient L2 speakers of German, whose L1 was English, were sensitive to gender-marking on determiners, and used this information to predict upcoming nouns in a similar way to native German speakers. Lower proficiency L2 speakers, on the other hand, did not use gender markings as efficiently. This suggests that highly proficient L2 users can make syntax-based predictions. Dussias, Valdés Kroff, Guzzardo Tamargo and Gerfen (2013) also found that higher proficiency bilinguals, whose L1 did not include gender markings, were able to use gender markings to predict upcoming nouns. In addition, they found that bilinguals whose L1 contained gender markings used gender markings predictively even at lower proficiency. Similarly, Foucart et al. (2014) found that where their L1 and L2 (in this case French and Spanish) shared use of gender markings, bilinguals listening in their L2 predicted the article of an upcoming noun.
Thus, prediction during comprehension of L2 is not the exact equivalent of prediction during comprehension of L1. However, as Grosjean (1998) notes, there is significant diversity among the bilingual population with regard to factors including language history, linguistic environment and level of proficiency of each language. Not only are L1 and L2 prediction not exact equivalents, but L2 prediction may differ among L2 listeners depending on each and any one of these factors.
1.4.3 Cognitive resources and prediction
We have reviewed evidence showing that prediction routinely takes place in comprehension of both an L1 and an L2. However, prediction may be affected by several factors including speed, accent, cognitive load and individual variability. This suggests that although prediction routinely takes place, it may require cognitive resources.
For speed, Wlotko and Federmeier (2015) demonstrate that presentation rate modulates prediction effects. Following Federmeier and Kutas (1999), they presented participants with constraining contexts culminating in a predictable noun, a less predictable noun from the same semantic category, or a less predictable noun from a different semantic category, in two blocks at two different speeds, one at a stimulus onset asynchrony (SOA) of 250 ms and the other at an SOA of 500 ms. They found that at the slower rate (SOA 500 ms), the N400 magnitude was reduced for semantically related, but less predictable nouns, as compared to semantically unrelated, less predictable nouns, just as in Federmeier and Kutas (1999). However, at the faster presentation rate (SOA 250 ms) this was no longer the case, suggesting that a higher (and more difficult) presentation rate leads to less prediction (for similar findings, see Dambacher et al., 2012). An alternative explanation is that at a faster presentation rate, participants might begin to form predictions, and activate semantic representations, but that this process is too slow to be completed before word onset. Ito, Corley, Pickering, Martin and Nieuwland (2016) also found that prediction was less specific at a faster presentation rate: while participants predicted both semantic and phonological aspects of words at lower presentation rates, at higher rates only semantic predictions were formed. Interestingly, Wlotko and Federmeier (2015) also found that participants who first received the block of stimuli at SOA 500 ms demonstrated a reduced N400 magnitude in the subsequent SOA 250 ms block, unlike participants who received the faster block first, showing that prediction may be flexible. Flexibility in predictive processing has also been demonstrated by Brothers, Swaab and Traxler (2017), who showed that when predictive cues were consistently reliable, participants predicted more than when predictive cues were less reliable. These findings suggest that once participants have engaged in prediction successfully, they may continue to predict (even in more challenging circumstances).
For accent, in a between-group study based on Federmeier and Kutas (1999), Romero-Rivas, Martin, and Costa (2016) had two groups of participants listen to highly constrained sentences recorded by either a Spanish native speaker or a foreign-accented speaker of Spanish. Sentences ended with either an expected noun, an unexpected noun within the same category, or an unexpected between-category noun. The predictable noun elicited a reduced N400 effect in comparison to the unexpected between-category word in both groups. However, in the foreign-accented group, there was no difference between the N400 effect at unexpected within-category and between-category completions, while in the native-accented group, within-category completions elicited a lower N400 effect than between-category completions (replicating previous findings). More recently, Schiller, Boutonnet, Kloots, Meelen, Ruijgrok and Cheng (2020) found an interaction between foreign-accented speech and expectancy in a within-group ERP study. Participants listened to native or non-native accented Dutch sentences containing an expected or unexpected noun preceded by an article. They found more negative ERP amplitudes in a window from 120-300 ms after the onset of the article belonging to an unexpected noun (compared to an expected noun), but only when participants listened to native-accented speech. However, they did not find an interaction between accent and expectedness in the N400 window. Overall, the evidence shows that listening to foreign-accented speech leads to less robust prediction. However, prediction may still take place.
Cognitive load also influences prediction, suggesting that prediction may be costly in terms of cognitive resources. Ito et al. (2017) showed in an eye-tracking study that cognitive load affects predictive processing in both L1 and L2 speakers. As in Altmann and Kamide (1999), participants heard sentences containing a verb that was constraining, or not, for one of four objects displayed on the screen (e.g., “The lady will fold/find the scarf”). There were two cognitive load conditions (no-load, and load). Just as in Altmann and Kamide (1999), both groups looked predictively towards critical objects. However, predictive looks by both L1 and L2 speakers were delayed when they listened to the sentences in the load condition. This suggests that prediction requires cognitive resources.
Predictive processing may also vary depending on individual cognitive resources. Huettig and Janse (2016) found that differences in working memory correlated with the extent of predictive eye movements: individuals with higher working memory predicted more. Age is also associated with prediction, with older adults making fewer predictive eye movements (Federmeier, Kutas, & Schul, 2010; Wlotko, Lee, & Federmeier, 2010), although predictive decline due to age may be mitigated by high verbal fluency (Federmeier et al., 2010; Federmeier, McLennan, De Ochoa, & Kutas, 2002).
Together, these findings show that prediction is affected by adverse conditions affecting the speech stream itself (e.g., speed, accent), as well as by comprehender limitations (e.g., additional cognitive load, working memory capacity, age).
1.5 Production-based prediction
We have considered evidence that prediction is a routine part of comprehension in both L1 and L2, but we have not yet considered how this prediction may take place.
There are different theories of prediction. Based on a study comprising two experiments, Kukona, Fang, Aicher, Chen and Magnuson (2011) proposed that prediction is based simultaneously on associative priming and sentence-level constraints, where associative priming may play a more dominant role. In a first experiment, they had participants listen to sentences containing a verb that was constraining, or not, for one of four objects, for instance: “Toby arrests/notices the crook”. The visual array contained an image of a crook and three distractors (competitor-absent condition), or else a crook, a policeman (a semantic competitor) and two distractors. The images of both the crook and the policeman were semantically associated with the verb, but had different thematic roles (patient or agent). They found that participants were equally likely to look at the image of the policeman as at that of the crook, suggesting that associative priming, rather than sentence context, was responsible for predictive fixations. However, in a second experiment, which used passive rather than active sentences (e.g., “Toby was arrested by the policeman”), they found that sentence context played a greater role. Participants made proportionately more predictive fixations on the policeman than on the crook, suggesting that they took sentence-level constraints into account when they (1) had more time (passive sentences were longer) and (2) were given more syntactic cues (passive sentences included the function words “was” and “by”). Metusalem, Kutas, Urbach, Hare, McRae and Elman (2012) also observed that thematic knowledge may sometimes play a greater role than sentence-level constraints, and proposed that listeners predict by using event knowledge (not explicitly mentioned in the discourse) to form mental representations. Huettig (2015) proposes that several mechanisms underlie predictive processing. These mechanisms rely on the production mechanism, associative priming, a combination of thematic priming and sentence-level priming, and event simulation. Finally, Dell and Chang (2014) suggest that production, comprehension and language acquisition are all linked by prediction in what they call the P-chain. They suggest that predictive processing is carried out via the production system, and that incorrect predictions lead to learning and development in both the comprehension and production systems.
Pickering and Garrod’s (2007; 2013) and Pickering and Gambi’s (2018) theories of prediction also propose that the language production system is used in prediction. This supposes a much greater interlinking between comprehension and production processes than has traditionally been considered: although self-monitoring using the comprehension system is assumed to take place during production (Levelt, 1989), until quite recently, the production system was not considered in models of comprehension (Cutler & Clifton, 2000; Dijkstra & van Heuven, 2002). In contrast, the production-based prediction account assumes that predictions during comprehension are formed via the production system.
Pickering and Garrod (2013) and Pickering and Gambi (2018) propose that in prediction, as in production, comprehenders pass the message through the production system (they conceptualise, select a lemma and activate its phonological form), but they stop short of articulation. Pickering and Garrod (2013) describe this process as covert imitation. Because there is no articulation, the process is more rapid than production, thus allowing the comprehender not only to imitate but also to predict what the speaker is about to say using the production mechanism (Pickering & Garrod, 2013). The production system is thus activated during comprehension.
Pickering and Garrod’s (2007; 2013) and Pickering and Gambi’s (2018) accounts are compatible with the evidence for prediction that we have reviewed. We have seen that prediction takes place at the semantic, syntactic and phonological levels in both bilinguals and monolinguals. Further, we have seen that prediction at the semantic level appears to take place before prediction at the syntactic and phonological levels, just as in production. For example, when speech was presented at higher speeds, listeners predicted only at the semantic level (Ito et al., 2016), and prediction when listening to a non-native language is more reliably found at the semantic than at the syntactic and phonological levels. This lends support to an account of spreading activation, from the semantic, to the syntactic, to the phonological level (Ito et al., 2018; Martin et al., 2013). There is also correlational evidence of a link between production skills and prediction (Federmeier et al., 2010; Mani & Huettig, 2012), and evidence of the activation of the production mechanism during comprehension (Adank, 2012b; Adank, Hagoort, & Bekkering, 2010). More importantly, there is evidence that prediction influences the articulation system (Drake & Corley, 2015) and that engaging the production mechanism may impede prediction (Martin, Branzi, & Bar, 2018). This evidence is reviewed in more detail in the introduction to the model of prediction in simultaneous interpreting presented in Chapter 2.
1.6 Simultaneous interpreting – a unique bilingual setting?
We have considered current evidence of how bilinguals comprehend and produce, and the role that prediction plays in comprehension. As we noted, simultaneous interpreting has attracted particular interest because of the concurrent comprehension and production processes it implies, as well as the concurrent activation of two languages. We have seen that during monolingual comprehension, both the comprehension and production mechanisms may be activated for predictive processing, even when the listener is not overtly producing utterances (Dell & Chang, 2014; Ito et al., 2018; Mani & Huettig, 2012; Martin et al., 2018). In addition, we have reviewed evidence suggesting that even in a monolingual setting, both of a bilingual’s languages are active (Dijkstra, Grainger, & van Heuven, 1999; Dijkstra & van Heuven, 2002; Marian & Spivey, 2003; Mercier, Pivneva, &