
In the document Syntactic difficulties in translation (Page 62-67)

2.2 Related Research

2.2.3 Translation Studies

In addition to language features that are typically analysed in readability research, predicting translation difficulty also has to take the source text and source language into account, as well as the process of translating from the source language to the target language. Research has focused on the issues that machine translation systems encounter, as well as the difficulties that human translators are faced with.

2.2.3.1 Machine translation

In the field of machine translation, researchers have focused on ways to improve an MT system’s translation as well as analysing properties of the source text where MT systems are having difficulties coming up with a good and correct translation (Bernth & Gdaniec, 2001; Naskar et al., 2011; O’Brien, n.d.; Underwood & Jongejan, 2001).

According to Underwood and Jongejan (2001), machine translation systems are prone to negative translatability indicators (NTIs, also referred to as “negative sentence properties” or “translatability indicators”) that make translating a given text difficult. Some of these pointers are lexical ambiguity (e.g. polysemous words, homographs), structural ambiguity (e.g. caused by prepositional phrases or multiple coordination), complex noun phrases, and very long or very short sentences. Human translators are different from machines, and not all the aforementioned negative translatability indicators will cause problems for human translators. Furthermore, translators may have to deal with difficulties that are non-existent in MT. Nonetheless, some overlap between the translation difficulties of MT systems on the one hand and human translators on the other is to be expected. O’Brien (n.d.) applies negative translatability indicators to post-editing effort. She found that when fewer NTIs are present in a segment, the post-editing effort is reduced. However, the author also found that not all NTIs have the same effect on editing effort.

Bernth and Gdaniec (2001) discuss a number of ways to improve what they call MTranslatability, i.e. how well-suited a text is to be translated by an MT system. They provide suggestions for improving an input text in preparation for machine translation, which can greatly improve the output quality of a text. By doing so, they also highlight the problematic constructions that MT systems are faced with. Such source-text difficulties are, among others, ungrammatical constructions, ambiguity, coordination, and ing-words.

Rather than looking at how the source text should be tailored to the MT system, Naskar et al. (2011) improved an MT system by “relying on the advice of end-users on the basis of what they deem[ed] should be prioritized” (p. 529). The researchers used linguistic checkpoints to evaluate a system’s performance. A linguistic checkpoint is a point of importance that is required for an adequate translation. These points of interest are subjective and depend on the MT system as well as the end-users’ priorities. Examples of these checkpoints could be an ing-form, noun-noun compounds, or any part-of-speech tag. The checkpoints are sorted into a taxonomy that as a whole represents the important source-text features for a given task. By analysing an MT system’s performance at these linguistic checkpoints, a subjective performance test can be created where the importance of a correct translation of specific linguistic features is determined by the research set-up.

2.2.3.2 Human translation

From the perspective of human translation, Campbell (1999) took an empirical approach and found source-text specific constructions that were difficult to translate to different target languages, such as multi-word units, complex noun phrases, and abstract content words. Additionally taking into account the type of translation task and the competence of the translator, he presents a general framework that sheds light on translation difficulty. In particular, the author concludes that “since common difficulties were encountered across subjects texts could be said to be inherently difficult to translate” (Campbell, 1999, p. 57). This statement implies that source-text features play a considerable role in a text’s translation difficulty.

Building on those findings, and alluded to earlier by the same author (Campbell, 1998), Campbell and Hale (1999) aim to chart the choices that translators make during a task in a choice network. The idea of a choice network was later generalised to a framework, namely Choice Network Analysis (CNA), in Campbell (2000). CNA tries to model the mental processes of translation. It assumes that a translator’s target text is the evidence and the product of his or her mental processes during translation. Mental processes, in this context, are in fact the choices that the translator had to make during the translation process. When multiple translators try their hand at the same source text, the product of these translation processes can be combined in a network that represents all the choices that can be made given an input text.

As an example (taken from Campbell, 2000, pp. 36-37), one can imagine a choice network analysis of complex noun phrases in English translated to Spanish. Such an English construction of two nouns (N N) can be structurally refactored into a Spanish translation that, for instance, consists of a single noun (N), a noun followed by an adjective (N Adj), a new complex noun phrase (N N), or a noun and a prepositional phrase (N PP), which entails many choices in itself (e.g. the choice of the preposition).

According to the author, “CNA [is] useful for estimating the relative difficulty of parts of source texts” (Campbell, 2000, p. 38). It follows that the plurality of choices in itself can be quantified as a difficulty indicator, as “the more nodes and branches in the network, the more choices are faced”.
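To make the counting idea concrete, the noun-phrase example above can be sketched as a small nested structure. Note that this is an illustrative toy encoding, not Campbell’s own formalisation: the network layout, the Spanish prepositions shown, and the helper function are our assumptions.

```python
# A toy choice network for the English noun-noun compound example:
# each branch is one structural option a translator may choose when
# rendering "N N" into Spanish. Illustrative only.
choice_network = {
    "N N (English source)": {
        "N": {},                              # single noun
        "N Adj": {},                          # noun + adjective
        "N N": {},                            # new complex noun phrase
        "N PP": {                             # noun + prepositional phrase,
            "de": {}, "para": {}, "con": {},  # which itself branches further
        },
    }
}

def count_choices(node: dict) -> int:
    """Count every branch in the network; under the difficulty reading
    above, more branches suggest a harder source segment."""
    return sum(1 + count_choices(child) for child in node.values())

print(count_choices(choice_network))  # 8 branches in this toy network
```

Adding a further translator whose rendering introduces a new structural option would grow the network, and with it the estimated difficulty of the segment.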

The Choice Network model can be used to discuss another way of quantifying the choices that translators can choose from, namely word translation entropy (Carl et al., 2016; Schaeffer, Carl, et al., 2016). Word translation entropy indicates the uncertainty for a translator to choose (a) target word(s) for a source token. It revolves around the idea that a translator has multiple ways to translate a given source token. The more options that are available, the harder it is to make a decision. Word translation entropy is situated on the lexico-semantic level. The core idea of word translation entropy (i.e. the number of different translation options) can be modelled and visualised in a CNA. Figure 2.1 shows how precipice in the sentence Residents have to catch a cable car to the top of a nearby precipice to get a dose of midday vitamin D has been translated by different translators. Note that afgrond (abyss) is written in italics because this translation is semantically incorrect; an abyss does not have a top.
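The computation behind word translation entropy can be sketched as a Shannon entropy over the target realisations observed for one source token. The calculation below is a minimal illustration in that spirit; the Dutch renderings and their counts are invented for the example and are not the data behind Figure 2.1.

```python
import math
from collections import Counter

def word_translation_entropy(translations: list[str]) -> float:
    """Entropy (in bits) over the target words that a group of
    translators produced for a single source token. Probabilities
    are estimated from the observed counts."""
    counts = Counter(translations)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical renderings of "precipice" by six translators:
observed = ["klif", "klif", "rots", "afgrond", "bergwand", "klif"]
print(round(word_translation_entropy(observed), 3))  # 1.792
```

If all six translators had agreed on klif, the entropy would be 0; the more evenly the choices spread over distinct options, the higher the entropy and, by the reasoning above, the harder the lexical decision.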

On the syntactic plane, syntactic equivalence can serve to indicate problems with translation. Sun (2015) proposes that difficulties that arise during translating from one language into another can generally be attributed to difficulties with equivalence, a concept from translation theory that has been around since the second half of the twentieth century (Pym, 2014, p. 7).

Equivalence (or the lack thereof) can manifest itself on different levels in language, from a microscopic layer in morphology, lexicon, and syntax up to a more global, macroscopic layer: a semantic, pragmatic, and ultimately a cultural level (Baker, 2011). In this paper, we are mainly interested in syntactic equivalence. Problems with equivalence can occur when there is non-equivalence, one-to-several equivalence, and one-to-part equivalence, according to Sun (2015). This categorisation is in line with an earlier grouping by Kade (1968, pp. 79-89), who uses the German terms Eins-zu-Null, Viele-zu-Eins, and Eins-zu-Teil respectively. Equivalence issues arise “especially for novice translators” (Sun, 2015, p. 36).

Carl and Schaeffer (2017) used both word translation entropy and syntactic equivalence to model translation literality or the lack thereof. They found “strong correlations of cross-lingual semantic [word translation entropy] and syntactic similarities [syntactic equivalence] and that non-literal translations were more difficult and time consuming [...] to produce than literal ones” (p. 55). From this we can assume that word translation entropy as well as syntactic equivalence give rise to higher cognitive effort. In other words, the more choices (or the more elaborate the choice network) or the more syntactic re-ordering has to take place, the more difficult a translation is to create.

2.2.3.3 Translation process research

In addition to the above similarity coefficients, cognitive effort is also often measured by analysing user-activity data (UAD) gathered during the translation process. Detailed information concerning duration (e.g. time to translate, pause information), revision (e.g. number of character insertions or deletions, number of self-corrections), and gaze information gathered with an eye tracker (e.g. number of fixations, fixation duration, regressions) are often used metrics in this type of research (see for instance Carl et al., 2008, 2010; Daems, 2016; Daems et al., 2017; Jakobsen, 2011; Lacruz et al., 2012; Schaeffer, Carl, et al., 2016).

Revision information (such as the number of inserted or deleted characters, or the number of revisions) can shed a light on the cognitive effort a translator had to muster during translation. According to Leijten and Van Waes (2013, p. 360), “[t]he main rationale behind keystroke logging is that writing fluency and flow reveal traces of the underlying cognitive processes. This explains the analytical focus on pause [...] and revision [...] characteristics”.

Figure 2.1. A CNA of the translation options to Dutch for English precipice

Initially touched upon by O’Brien (2005) and further developed in O’Brien (2006), pauses, and in particular pause ratio, can be used as an indicator of cognitive effort for post-editing (PE) tasks, where translators receive a machine-translated text and correct mistakes or improve the text in other meaningful ways. Pause ratio is the total time that a translator has paused (i.e. has not provided keyboard input) relative to the total production time of a translated segment.
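The definition above reduces to a single division. The sketch below follows that definition; the pause durations and production time are invented figures, not data from O’Brien’s studies.

```python
def pause_ratio(pause_durations: list[float],
                total_production_time: float) -> float:
    """Total pause time divided by total production time for one
    segment, per the definition of pause ratio given above."""
    return sum(pause_durations) / total_production_time

# Hypothetical segment: three pauses totalling 8 s within 40 s of work.
pauses = [2.0, 1.5, 4.5]
print(pause_ratio(pauses, 40.0))  # 0.2
```

A higher ratio means the translator spent a larger share of the segment's production time not typing, which is read as a proxy for greater cognitive effort.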

The underlying idea is that the longer a translator pauses (i.e. does not provide keyboard input), the more cognitive effort is required to generate a suitable translation. It should be noted, though, that it is currently not possible to ensure that the cognitive effort related to a pause is in fact present and being directed towards the task at hand. In other words, it is nearly impossible to find out the motivation for a pause with certainty (see for instance Kumpulainen, 2015). With regard to our study, however, we assume that pauses that are not related to increased cognitive effort (e.g. day-dreaming) are scarce and of little to no consequence for our results.

Lacruz et al. (2012, p. 24) reacted to O’Brien’s duration metric by altering the concept of pause ratio and instead using average pause ratio, which is calculated as the average duration per pause divided by the average production time per word. The authors claim that “[pause ratio] does not take different patterns of pause behavior into account. In particular, it is not sensitive to the existence of clusters of short pauses”.
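The sensitivity to pause clustering can be made visible with a small sketch of the formula as described above; all timing figures are invented. One long pause and a cluster of short pauses have the same total pause time, so pause ratio cannot tell them apart, but average pause ratio can.

```python
def average_pause_ratio(pause_durations: list[float],
                        total_production_time: float,
                        n_words: int) -> float:
    """Average duration per pause divided by average production time
    per word, per Lacruz et al.'s definition given above."""
    avg_pause = sum(pause_durations) / len(pause_durations)
    avg_time_per_word = total_production_time / n_words
    return avg_pause / avg_time_per_word

# Same 6 s of total pausing in a 30 s, 10-word segment:
print(average_pause_ratio([6.0], 30.0, 10))                  # 2.0
print(average_pause_ratio([1.0, 1.0, 2.0, 2.0], 30.0, 10))   # 0.5
```

Many short pauses shrink the average pause duration and thus the score, which is exactly the pattern the plain pause ratio glosses over.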

Even though the researchers note that their research was limited to a single translator, they do believe that average pause ratio can be used as a valid measure of cognitive effort, and that it is at least a better indicator than O’Brien’s aforementioned pause ratio metric. In her research on the effect of MT quality on PE effort indicators, Daems (2016, p. 131) confirms that, indeed, “average pause ratio is a better measure of cognitive effort than [...] pause ratio”.

The methodological work by O’Brien (2005, 2006) and Lacruz et al. (2012), as well as the applied study by Daems (2016) above, are restricted to post-editing tasks. A source text is fed into a machine translation system, and the generated output is then post-edited by a translator. Concerning cognitive effort, post-editing and translating are related but not identical tasks. The effort required to create a translation from start to finish is more intensive than post-editing an MT-translated text. Therefore, conclusions drawn in a PE setting are not necessarily applicable to translation. However, due to a lack of comparative research on pause ratio and average pause ratio in human translation tasks, and the tested preference for average pause ratio by Daems (2016), we will use this metric in the remainder of this study.

In Section 2.4, we will examine whether the number of translation errors and selected aforementioned product features reflecting similarity between the source and target text (word translation entropy and syntactic equivalence) correlate with process features that are generally accepted to reflect cognitive effort.
