

Criterion                                                    Novice   In-house   Freelance
Acquaintance with translation of software guides
and marketing collaterals                                      +         +          +
Proficient use of SDL 2007 Trados CAT tool                     +         +          +
At least 3 years of professional experience                               +          +
50% of income from professional translation services                      +          +

Table 5.2: Definition of subject profiles in the TRACE study

The novice profile consisted of 18 translators.[13] Their distinctive feature was having less than 3 years of experience: they were already working as professional translators (none of them were translation students), but did not yet derive a minimum of 50% of their income from translation jobs.

The in-house translator profile also consisted of 18 translators, all of them making a living out of translation as a full-time job, though not always with more than 2 years of experience as in-house translators. Finally, the largest sample in the study is that made up of freelance translators, with a total of 54 participants.[14] These professional translators had more than 3 years of experience, and translation was the main source of their income.

5.3.5 Source texts

The corpus compiled in this study (see chapter 6) originates from translating three different STs from English into Spanish under three different translation editing environments. These three texts belonged to the following two genres: software user guides and software marketing collaterals. None of the three texts selected for the study had a known Spanish translation as of January 2009.

[13] 18 participants was the minimum number of subjects required in the study in order to fully randomize tool order presentation in the experimental task (3 STs presented under 3 translation editing environments, with each tool order combination repeated at least 3 times).

[14] 54 participants is a sample three times the minimum number of participants needed to fully test the complete range of text/environment order combinations, i.e. 18 x 3 participants.
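The counterbalancing arithmetic behind footnotes 13 and 14 can be sketched in a few lines of Python (a hypothetical illustration; the environment labels E1-E3 are assumed):

from itertools import permutations

environments = ["E1", "E2", "E3"]          # assumed labels for the three editing environments
orders = list(permutations(environments))  # all 3! = 6 tool-presentation orders
repetitions = 3                            # each order covered at least 3 times (footnote 13)
print(len(orders))                         # -> 6
print(len(orders) * repetitions)           # -> 18, the minimum sample; 54 = 3 x 18 (footnote 14)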

Text 1 (T1) presents the instructions issued by the computer service at the University of Pittsburgh about how to install spyware protector software.

Text 2 (T2) is an extract of the instructions accompanying a palm software application used in medical settings to control and issue prescriptions to patients. Text 3 (T3) is the marketing text accompanying the user guide for a handheld device used to keep stock data in businesses (full texts are presented in Appendix A).

Despite being considered specialized texts, both genres (i.e. software user guides and marketing collaterals) do not make use of highly specialized language and are persuasive in nature, their overarching communicative goal being to build up trust among potential customers and users. As to their communicative purpose, texts of this type typically serve to:

• Get across to potential customers and users.

• Transmit a positive and trustworthy image of the product.

• Provide an account of the main features of the software in question.

Both genres are also comparable in terms of authorship: no individual author is known, and elaborate non-literal language is not to be expected.

Denotation is thus the main linguistic function present in these kinds of texts.

The reason why these two genres were chosen as STs in the present study is a practical one: texts belonging to these genres are generally translated with CAT tools, but one can also imagine a scenario in which they are translated without any kind of translation automation (e.g., in a word processor, E1).

The classical translation approach taken to translate these texts was also taken into account when selecting the STs for the experiment. House (1997) famously distinguishes between two modes of translation, overt and covert.[15] The distinction between overt and covert should not be seen as a strict dichotomy: the two are rather the endpoints of a continuum ranging from rather 'literal' translations on one end (as is to be expected for the genres used in this study) to rather 'free' (non-literal) translations on the other (House 1997). The two genres in the STs selected for the study were also chosen because they are located somewhere in the middle between overt and covert. This characteristic makes our STs good candidates for an investigation of explicitation shifts depending on the translation editing environment used to translate. If a TT deviates considerably from its ST, as is the case in 'purely covert' translations, it will not be a good candidate for translation with any kind of translation automation tool, since there will be many passages where sentences or parts of sentences have been omitted, added or rearranged, making the identification of shifts difficult or even impossible.

[15] In short, overt translations are characterized by an effort by the translator to stay as close to the ST as possible. When translating overtly, the translator does not have to look for a match between the communicative function of the ST in the SL community and the communicative function of the TT, since both tend to be exactly the same. This contrasts with a covert translation approach, where the translator tries to produce a TT that fulfils the same function in the target culture as the ST fulfils in the source culture. In order to achieve this, the translator may apply a "cultural filter" that brings the TT in line with the communicative conventions of the target culture. In this way, the process of covert translation often results in a TT that deviates considerably from the ST. In overt translation, on the other hand, no cultural filter is applied.

The three STs in the study were manipulated with two aims in mind.

Firstly, the experimental texts had to be comparable with respect to the total number of words. Secondly, the levels of text difficulty also had to be comparable, so that text difficulty would not become a confounding variable that could compromise the results of the study. Thus, the three texts used in the experiment had to be comparable in length and complexity in order to provide a uniform framework for comparison.

The experimental texts were kept rather short, but long enough to include a sufficient number of focus points for the dependent variables under study (see Section 5.3.1).

The word counts for the three texts in the experiment are as follows: T1: 528 words, T2: 472 words, and T3: 418 words. Texts of roughly 500 words were considered not too demanding a task for professional translators used to translating technical texts with CAT tools.

5.3.5.1 Source text complexity

The comparable levels of complexity of the experimental texts were established using two quantitative indicators: 1) measurements of readability, and 2) calculations of word frequency.

When it comes to texts that are not highly specialized, it is assumed that these objective measurements, to some extent, indicate the level of difficulty experienced by the participants when comprehending for translation. While readability indices have traditionally focused on difficulties related to certain aspects of text comprehension only (Nation 2001: 161), Jensen (2009) suggests that readability indices can also be used to assess the relative amount of both production and comprehension effort needed during translation.

5.3.5.1.1 Readability measurement:

Readability indexes measure how easy a document is to read and comprehend, assuming that readers have an identical understanding of the subject matter.

The ease with which a text is likely to be understood has attracted the attention of scholars since as early as the 1930s (Gray and Leary 1935). More recently, Nation (2001: 161-162) pointed out that readability formulas mostly focus on what is easily measurable, i.e. word length and sentence length.

Factors such as prior knowledge, motivation, rhetorical structures, etc., would certainly be valuable for assessing text comprehensibility, but such factors are not easy to include in an algorithm, as they are difficult to quantify. For this reason, most readability indexes are based only on counts of syllables, words and sentences.

Four readability index formulas were used in this study to measure the level of readability of the three experimental texts: the Automated Readability Index[16] (ARI), the Coleman-Liau index,[17] the Gunning Fog index,[18] and the SMOG index.[19] These four indexes return the U.S. grade level that the reader must have completed in order to fully understand the text.

All four readability indexes showed relatively similar, yet not identical, levels of complexity for Text 1, Text 2 and Text 3. The U.S. grade level indexes revealed that an average of 11 years of schooling was needed to successfully comprehend Text 1, 8.8 years for Text 2, and 11.9 years for Text 3. According to these measures, Text 2 would be the easiest to understand and Text 3 the most difficult.[20]

[16] Automated Readability Index: ARI = 4.71*(characters/words) + 0.5*(words/sentences) - 21.43. Unlike most of the other indices, the ARI, along with the Coleman-Liau, relies on a factor of characters per word instead of the usual syllables per word. Although opinion varies on its accuracy as compared to the syllables/word and complex-word indices, characters/word is often faster to calculate, as the number of characters is more readily and accurately counted by computer programs than syllables.

[17] Like the ARI but unlike most of the other indices, the Coleman-Liau index relies on characters instead of syllables per word. Coleman-Liau formula: CLI = 5.89*(characters/words) - 0.3*(sentences per 100 words) - 15.8.

[18] Gunning Fog Readability Formula: FOG index = 0.4*(ASL + TSW), where ASL is the average number of words per sentence and TSW is the number of words of three or more syllables per 100 words. The trick in counting TSW is that proper nouns, compound nouns, and common verb conjugations do not count. For example, "determination" counts as a complex word but "opening" does not.

[19] The SMOG grading for English texts was developed by McLaughlin in 1969. Its result is a school grade. SMOG formula: SMOG grading = sqrt((words of 3 or more syllables / sentences) * 30) + 3.

[20] Averages from Online-Utility.org (http://www.online-utility.org/text/analyzer.jsp) and Editcentral (http://www.editcentral.com) were used to calculate these index scores. Both websites return the complexity scores of a text entered into an online query box by the user, but since the results for some of the indexes were not the same between the two tools, we decided to work with the mean of each index across the two utilities.

Figure 5.1: Source text complexity automatic scores

It is important to note here that readability indexes are only sensitive to the surface structure of a text (Jensen 2009: 68) and do not fully grasp all the complexities surrounding technical texts such as those used here. These indexes were developed with text comprehension in mind, and they cannot give us conclusive evidence of how difficult a translator perceives a text to be or how much translation effort is needed to translate it. However, one could hypothesise that these indexes can provide some idea of the amount of effort that will be invested in the production of a TT, assuming that text comprehension and text production (and memory) are inseparable parts of the translation process (Gile 1995: 162-169).

As noted before, the indexes reported above base their scores on counts of characters, sentences and syllables, and they fall short of making predictions about the perceived difficulty of individual words or compounds.
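Since the four formulas in footnotes 16-19 reduce to such simple counts, they are easy to compute directly. The following Python sketch illustrates this; the vowel-group syllable counter is a crude assumption of the sketch (established tools use dictionaries and apply the Fog exclusion rules, so scores will differ slightly):

import re

def syllables(word):
    # crude heuristic: count groups of consecutive vowels
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text):
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    n_words = max(1, len(words))
    n_chars = sum(len(w) for w in words)
    # words of three or more syllables; the Fog/SMOG exclusions
    # (proper nouns, common suffixes) are not applied here
    n_complex = sum(1 for w in words if syllables(w) >= 3)
    ari = 4.71 * n_chars / n_words + 0.5 * n_words / sentences - 21.43
    cli = 5.89 * n_chars / n_words - 0.3 * (100 * sentences / n_words) - 15.8
    fog = 0.4 * (n_words / sentences + 100 * n_complex / n_words)
    smog = (30 * n_complex / sentences) ** 0.5 + 3
    return {"ARI": ari, "Coleman-Liau": cli, "Fog": fog, "SMOG": smog}

print(readability("The quick brown fox jumps over the lazy dog. It was not amused."))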

5.3.5.1.2 Word frequency: Based on the common assumption in cognitive psychology that there is a relationship between word frequency and word familiarity, Jensen (2009: 69) suggests that word frequency can also be used to estimate the relative amount of effort needed to process a given word. The frequency with which a word appears in the real world, as reflected in corpora such as the British National Corpus, is assumed to mirror the amount of effort that a translator will have to put into processing it (Read 2000: 160).

Less frequent words are expected to demand more cognitive resources than high-frequency words, which are more familiar to the reader. Evidence from psycholinguistic experiments has shown that lexical retrieval time and word frequency correlate: less frequent words are retrieved more slowly than high-frequency words.

In the present study, the words in the experimental texts were grouped according to frequency: one group consisted of high-frequency words and one group of less frequent words.[21] Less frequent words were defined as words among the 1,001-10,000 most frequent words (K2-K10 words). High-frequency words were defined as words among the 1-1,000 most frequent words (K1 words). In Figure 5.2 below, the number of less frequent words is compared with the number of high-frequency words in the three experimental texts.

Figure 5.2: Source text word frequency scores

As shown in Figure 5.2, the word frequency measurements indicate that the three texts are comparable in terms of word frequency and thus in terms of processing effort. It should be mentioned, however, that using these two frequency bands (K1 and K2-K10) as a reflection of ST difficulty is not entirely unproblematic, since words belonging to the K2-K10 bands may in fact not be perceived as more difficult than words belonging to the K1 band. Still, this provides an external validation that there is at least a formal similarity among the three texts in the experiment.
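The K1 vs. K2-K10 grouping described above amounts to a lookup against a frequency-ranked wordlist. A minimal Python sketch, assuming a hypothetical input file bnc_ranked.txt containing one word per line in descending BNC frequency order:

import re

def load_bands(path):
    # assumed input: one word per line, ordered by corpus frequency rank
    with open(path, encoding="utf-8") as f:
        ranked = [line.strip().lower() for line in f if line.strip()]
    return set(ranked[:1000]), set(ranked[1000:10000])  # K1, K2-K10

def band_profile(text, k1, k2_k10):
    words = re.findall(r"[a-z]+", text.lower())
    profile = {"K1": 0, "K2-K10": 0, "off-list": 0}
    for w in words:
        if w in k1:
            profile["K1"] += 1
        elif w in k2_k10:
            profile["K2-K10"] += 1
        else:
            profile["off-list"] += 1
    return profile

k1, k2_k10 = load_bands("bnc_ranked.txt")
with open("text1.txt", encoding="utf-8") as f:   # hypothetical file holding T1
    print(band_profile(f.read(), k1, k2_k10))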

5.3.5.1.3 Summary and discussion of source text complexity: As pointed out by Hvelplund (2011: 93), the level of difficulty of a text is inherently problematic to gauge, as the experience of a text's level of difficulty varies between individuals. A complex text is not necessarily difficult for everyone to translate: much depends on the previous projects, skills and specialisation of the translator (e.g. Jensen 2009: 62-63). However, since a complex text is often experienced as a difficult text, these relatively crude measurements can be proposed as indicators of anticipated ST difficulty when designing an experiment.

[21] Word frequencies are based on the British National Corpus as a reference.

Indicator                 Text 01   Text 02   Text 03
Readability                  9.6       7.2      11.2
Word frequency K1          70.18     68.58     69.9
Word frequency K2-K10      78.78     85.99     82.9

Table 5.3: Summary of source text complexity indicators

We are aware that the set of indicators discussed above in relation to the measurement of ST complexity is not exhaustive. Other indicators could have been employed to further gauge the anticipated comparability of the STs used in this study. For instance, syntactic structures could have been compared across texts instead of limiting the comparison to the word surface. Nonetheless, the two indicators used in the present study are expected in principle to provide a good general indication of ST similarity and complexity.
