How Many Ways Can Google Translate Say It?: Synonym Use in Neural Machine Translation Output

GULLAPALLI, Aparna

Abstract

This study attempts to determine whether a neural machine translation system that encounters repeated occurrences of a certain concept, expressed through a variety of synonyms, will consistently translate each synonym a given way or use multiple target-language synonyms as translations for each source-language synonym. The frequency with which the system recasts the meaning expressed by the source-language synonym using constructions involving different parts of speech is also considered. The study analyzes English translations generated by Google Translate for selected original French earnings releases. The translations generated by Google Translate are also compared to published English translations, where available, and the characteristics of both types of translations with respect to synonym use are compared to those of selected original English earnings releases.

GULLAPALLI, Aparna. How Many Ways Can Google Translate Say It?: Synonym Use in Neural Machine Translation Output. Master : Univ. Genève, 2018

Available at:

http://archive-ouverte.unige.ch/unige:114723

APARNA GULLAPALLI

HOW MANY WAYS CAN GOOGLE TRANSLATE SAY IT?:

SYNONYM USE IN NEURAL MACHINE TRANSLATION OUTPUT

Supervisor: Prof. Pierrette Bouillon

Second reader: Johanna Gerlach

Thesis submitted to the Faculty of Translation and Interpreting for the degree of Master of Arts in Translation, with a concentration in Translation Technologies

August 2018

Declaration attesting to the originality of the work

I affirm that I have read the plagiarism information and prevention documents issued by the University of Geneva and the Faculty of Translation and Interpreting (in particular the Directive en matière de plagiat des étudiant-e-s, the Règlement d’études des Maîtrises universitaires en traduction et du Certificat complémentaire en traduction de la Faculté de traduction et d’interprétation and the Aide-mémoire à l’intention des étudiants préparant un mémoire de Ma en traduction).

I attest that this work is the result of my own personal effort and was written independently.

I declare that all sources of information used are cited completely and accurately, including Internet sources.

I am aware that failing to cite a source, or failing to cite it correctly, constitutes plagiarism and that plagiarism is considered a serious offense within the University, subject to sanctions.

In view of the above, I declare on my honor that this work is original.

Last name and first name: GULLAPALLI Aparna

Place / date / signature:

Acknowledgments and Dedication

I would like to thank Pierrette Bouillon for the time she devoted to reading drafts of this thesis and the valuable feedback and guidance she provided. I am especially grateful for her willingness to read my drafts quickly even when they showed up unexpectedly; thanks to her efforts, I was able to finish my thesis on time. I would also like to thank Johanna Gerlach for her readiness to jump in as second reader and her many thought-provoking questions; Chelsea, Isabel and Tatiana for being an amazing mock jury; and Val for her seemingly endless supply of encouraging words. Most of all, I would like to thank my mother for her unwavering support throughout my translation adventures.

This thesis is dedicated to the memory of Madhu and Tope, two everlasting sources of sunshine.

Table of abbreviations

Banks:

BPCE Groupe BPCE (Banque Populaire et Caisse d’Epargne)

BCV BCV Group (Banque Cantonale Vaudoise)

BFCM BFCM Group (Banque Fédérative du Crédit Mutuel)

BCN Banque Cantonale Neuchâteloise

BCGE Banque Cantonale de Genève

GCN Crédit du Nord Group

BCJ Banque Cantonale du Jura

Corpora:

EN-All Corpus comprising the 17 original English earnings releases shown in Table 4.3 (page 53)

EN-UK Subcorpus comprising the 5 original English earnings releases issued by UK banks shown in Table 4.3 (page 53)

EN-US Subcorpus comprising the 12 original English earnings releases issued by US banks shown in Table 4.3 (page 53)

FR-All Corpus comprising the 50 original French earnings releases shown in Table 4.1 (page 48)

FR1 Subcorpus comprising the 35 original French earnings releases for which published English translations are available, as shown in Table 4.1 (page 48)

FR2 Subcorpus comprising 15 original French earnings releases for which no published English translation is available, as shown in Table 4.1 (page 48)

GNMT-All Corpus comprising the Google Translate-generated translations of the French earnings releases in FR-All

GNMT1 Subcorpus comprising the Google Translate-generated translations of the French earnings releases in FR1 (i.e. those for which a published English translation is available)

GNMT2 Subcorpus comprising the Google Translate-generated translations of the French earnings releases in FR2 (i.e. those for which no published English translation is available)

1. INTRODUCTION

1.1. Introduction

Variety is the spice of life. It is also a key ingredient in our writing. To make our sentences more interesting, we vary their structures. We try not to begin consecutive sentences with the same word. And if we have already mentioned something in one sentence, we might be tempted to find a different way to refer to it in the next one. To do this, we can paraphrase or use pronouns, hypernyms, synonyms or other “semantically related” words (Egbert 1999: 42). The latter are all forms of lexical variation, and they help keep our writing from being dull and repetitive.

“Dull” and “repetitive” are likely among the adjectives that many translators would apply to machine translation (MT) output. This thesis aims to assess whether the second part of that description—that it is repetitive—is still warranted. MT has made great strides and moved beyond the basic systems that are programmed to respond to a given input word by producing a given output word. Today’s neural machine translation (NMT) systems, for example, undertake much more complex analyses and are able to consider various elements in a source word’s context and the translations of all previous words in a sentence when making their translation choices. This thesis will look at whether these systems, with their new capabilities, can bring variety to the writing they produce through the use of synonyms.

1.2. Studies of discourse and style in MT

As MT quality has improved, some researchers have begun shifting their focus from accuracy to issues of discourse and style (Sim Smith 2017: 110). The discourse properties of a text are those that

“go beyond [the properties of] individual sentences and that reveal themselves in the frequency and distribution of words, word senses, referential forms and syntactic structures, including: document-wide properties, such as style, register, reading level and genre; patterns of topical or functional sub-structure; [and] patterns of discourse coherence, as realized through explicit and/or implicit relations between sentences, clauses or referring forms” (Organizers 2015: iii).

Issues related to discourse must be addressed in order for MT systems to “move beyond stringing together” the discrete items they are programmed to translate and produce coherent texts (Sim Smith 2017: 110). Efforts to resolve them focus on providing systems access to information located outside the specific phrase or sentence being translated; as these are translated as separate units, the systems often lack sufficient context (Sim Smith 2017: 111; Bawden 2018: 1304).

Sim Smith provides an overview of the work being done to give systems greater context in making their translation choices. She cites, for example, the study done by Mascarell et al, which sought to highlight the presence in the source text of trigger words that would enable a system to choose between alternative translations (Sim Smith 2017: 113). Luong and Popescu-Belis tried to improve the accuracy of pronoun translation, which often requires knowledge of an element outside the current context being translated, by developing methods that would help systems deduce the correct pronoun form based on various word features (Sim Smith 2017: 111). Both groups of researchers were working on issues (disambiguation and pronoun use) where there is one correct answer; the goal of the researchers was to give MT systems additional context so that they could reach that answer. In other situations, there are multiple correct translations, but one is preferred because it preserves a certain characteristic of the source text. In these situations, the goal behind providing an MT system additional information about the context is not to avoid an entirely wrong answer but to arrive at a better one.

While the subject of this thesis is lexical variation, it should be noted that preserving source-text characteristics in translation sometimes requires repetition. In their study on the evaluation of discourse phenomena, Bawden et al looked at cases where the word or phrase needing to be repeated had appeared in an earlier sentence. They included the following sentence pair, in which the repetition of the word “crazy” provides coherence in the source text, in their test set: “What’s crazy about me?” and “Is this crazy?” Given that the system had already produced a translation for the first sentence using the word fou, the authors argued that dingue would not be an appropriate translation for the second occurrence of “crazy,” as it would result in a loss of alignment: “despite two translations of an English source word being synonyms […], they are not interchangeable in a discourse context” (Bawden et al 2018: 1306-7).

Niu et al (2017) also looked at situations where different synonyms are not equally appropriate translations. In their study, synonyms are distinguished based on the formality they express. The authors note that “[s]uch differences in formality have been identified as an important dimension of style.” (They illustrate this difference with the sentences “anybody hurt?” and “is someone wounded?”) To address this issue, the authors designed an MT system that “explicitly takes an expected level of formality as input” (Niu et al 2017: 2814). Like Bawden et al, Niu et al were addressing an issue of discourse where the decision as to the appropriate translation choice depended on information that is often outside the current sentence being translated.

Like other discourse phenomena (that is, phenomena that occur “beyond sentence level” (Bawden et al 2018: 1307)), variation often depends on word choices made in the rest of the document being translated. Variation can be considered a discourse property of a text because, in the words of the organizers of the Second Workshop on Discourse in Machine Translation, it reveals itself “in the frequency and distribution of words, word senses, […] and syntactic structures” (Organizers 2015: iii) throughout an entire text. Variation can be seen as the opposite of the type of repetition studied by Bawden et al; while such intentional repetition requires the use of the same word or phrase that appeared previously, variation requires the use of a different one.

Lapshinova-Koltunski (2013) was interested in the variation present in human translations and in MT output. While Bawden et al’s study was aimed at developing metrics for MT evaluation and Niu et al’s goal was to help MT systems make more appropriate word choices, Lapshinova-Koltunski’s focus was the description and comparison of the variation present in translations produced using different methods, each called a “translation variant.”

She looked at translations produced by (i) professional translators (designated as “PT” in her study), (ii) students using CAT tools (designated as “CAT”), (iii) a rule-based machine translation (RBMT) system and (iv) two statistical machine translation (SMT) systems. One of the SMT systems used (designated as “SMT1”) was Google Translate, which was still running a statistical model at the time of the study¹; the other was Moses.

She compiled the translations produced using these methods into separate corpora. The five subcorpora were “linguistically annotated”; they were “tokenised, lemmatised, tagged with part-of-speech information, segmented into syntactic chunks and sentences.” The subcorpus of professional translations was aligned with the English originals but the others were not (Lapshinova-Koltunski 2013: 80). The translations were all into German and all based on the same English source texts. Lapshinova-Koltunski selected a variety of text types for her source texts because she had decided to focus on “variation across registers and genres.” This variation was due to the fact that “language may vary according to the activity of the involved participants, production varieties (written vs. spoken) of a language or the relationships between speaker and addressee(s)” as well as other factors (Lapshinova-Koltunski 2013: 78).

¹ Google announced the launch of its “Google Neural Machine Translation system (GNMT)” on September 27, 2016 (Le and Schuster 2016).

The structure of the subcorpora and their annotations would allow for more complex studies to be carried out in the future, but the study at hand was limited to calculations of the type-token ratio (TTR), the lexical density (LD) and the part-of-speech (POS) distribution for each subcorpus, as “[t]hese features are among the most frequently used variables which characterise linguistic variation in corpora” (Lapshinova-Koltunski 2013: 81). The translation variants were then compared to each other in terms of these three features. They were not compared to the English originals. Of the five translation variants, the professional translations had the highest TTR and LD, the student translations had the lowest and the MT systems fell in between. Lapshinova-Koltunski conveys the results with the following shorthand: “PT > RBMT > SMT2 > SMT1 > CAT” (for the TTR results) and “PT > SMT2 > SMT1 > RBMT > CAT” (for the LD results) (Lapshinova-Koltunski 2013: 82-83). This is a type of shorthand that I will also use in my study.
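To make the two measures concrete, the following is a minimal sketch of how a type-token ratio and a lexical density could be computed for a short English passage. It is not the procedure used by Lapshinova-Koltunski: the tokenizer, the tiny `FUNCTION_WORDS` list and the sample sentence are assumptions made purely for illustration, and lexical density is taken here in one common sense, as the share of tokens that are content words.

```python
import re

# Hypothetical, deliberately tiny list of function words. A real study would
# rely on part-of-speech tags to separate content words from function words.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "to", "and", "was", "is", "by"}


def tokenize(text):
    """Lowercase the text and split it into ASCII alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def type_token_ratio(tokens):
    """Number of distinct word forms (types) divided by the number of tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def lexical_density(tokens):
    """Share of tokens that are content words (here: not in FUNCTION_WORDS)."""
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)


# Invented sample sentence: 7 tokens, 5 types, 4 content words.
sample = "Profit was up and revenue was up"
tokens = tokenize(sample)
ttr = type_token_ratio(tokens)   # 5 / 7
ld = lexical_density(tokens)     # 4 / 7
```

Because the TTR is sensitive to text length, values computed this way are only comparable across texts or subcorpora of similar size.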

With respect to POS distributions, Lapshinova-Koltunski looked specifically at the ratio of nouns to verbs. There, Moses had the highest ratio of nouns to verbs, followed by Google Translate and the professional translators (the ratio being the same for both), the RBMT system and lastly the student translators (Lapshinova-Koltunski 2013: 82-83). This ratio was an interesting one to observe in the context of English-to-German translation because it had been argued that “German is less ‘verbal’ than English.” Consequently, “higher ‘verbality’” in the translations could indicate that features of the English originals were “shining through.” However, thoroughly studying this phenomenon would have required comparing the translations with their source texts and with comparable original German texts, which was beyond the scope of the study Lapshinova-Koltunski was then conducting (Lapshinova-Koltunski 2013: 83). I will return to the features of Lapshinova-Koltunski’s study in Section 1.4 when I compare and contrast them to the methods I used to carry out my study on variation.

Many of the differences in our studies stemmed from the fact that we were looking at different aspects of translation. The aspect that interested me and the research questions I formulated regarding it will be addressed in Section 1.3.

1.3. Research questions

Like Lapshinova-Koltunski, I was interested in observing the patterns of variation in MT output. I was specifically interested in the type of variation described in Section 1.1; that is, variation that serves to make a text more interesting and readable by reducing its monotony.

As indicated in Section 1.1, I wanted to look particularly at lexical variation achieved through the use of synonyms; the term “lexical variation” as used in this thesis will refer exclusively to this type of variation. In order to illustrate the type of variation that I was interested in observing in a translation context, I will begin by discussing variation in a monolingual context.

A study of variation in a monolingual context might look at the number of ways a certain concept is expressed in one language. For example, if researchers looking at the ways of expressing the concept of an increase (specifically, the fact that an amount in the present is greater than it was in the past) in French use a corpus that includes the press release issued by Groupe BPCE to announce its 2017 financial results, they would find the verb augmenter, the noun croissance and the adverb en hausse used to express this concept in the following sentences:

• Le résultat avant impôt du groupe augmente de 4,1 %, et s’établit à 6 054 millions d’euros pour l’année 2017

• Les encours de crédit s’établissent à 196 milliards d’euros au 31 décembre 2017, enregistrant une croissance de 7,7 % par rapport au 31 décembre 2016

• Le produit net bancaire (hors éléments exceptionnels) du pôle Assurance au quatrième trimestre s’élève à 190 millions d’euros, en hausse de 10,7 % par rapport au quatrième trimestre 2016

Similarly, researchers studying English synonyms for the same concept would find the following occurrences of the verb “increase,” the noun “growth” and the adverb “up” in the press release issued by Standard Chartered to announce its full-year results for 2017:

• Basic earnings per share increased from 3.4 cents in 2016 to 47.2 cents in 2017

• Client activity was positive with 13 per cent growth in loans and advances to customers

• Profit before tax of $3.0bn was up 175%

When we look at lexical variation in translations, we can look at the synonyms that are present in the source text and those that are present in the target text independently, as in the examples just given. We can also, however, study the relationships between the synonyms used in the source text and those used in the target text by looking at the specific target-language synonyms used to translate any given source-language synonym. In this way, we can study the level of variation in the translations generated for any given source-language word or phrase.

For example, given the groups of French- and English-language synonyms identified in the previous paragraph, we can look at whether a single English synonym is used as a translation for a given French one (for example, the verb “increase” for the verb augmenter, the noun “growth” for the noun croissance, or the adverb “up” for the adverb en hausse) or whether all three English synonyms are used interchangeably as translations for all three French synonyms.

(As these synonyms are of different parts of speech, their inter-substitution would, of course, require syntactic changes to be made to the sentence. These types of changes will be addressed in Section 1.4.3 and in Chapter 2. The existence of a relationship of synonymy among words of different parts of speech will be addressed in Chapter 2.)

The relationships between source-language synonyms and the target-language synonyms used to translate them, as well as the variation that could result from them, are what I hoped to study. Unlike Lapshinova-Koltunski’s study, which looked at the characteristics of different types of translations independently of the source text, I hoped to map the relationships between given source-text elements and given target-text ones. The research question I wanted to address was whether an NMT system that encounters repeated occurrences of a certain concept, expressed by means of a variety of synonyms, would (i) consistently produce a given translation for each synonym, using the same part of speech as the source text, or (ii) translate each source-language synonym using a variety of target-language synonyms and syntactic constructions. My hypothesis was that the system would not consistently produce the same translation for a given source-language synonym and would, instead, use a variety of target-language synonyms and syntactic constructions in its translations.

The main focus of this study is, as indicated, the relationship between source-language synonyms and the target-language synonyms used to translate them. However, as readers of translations generally treat them as independent texts (that is, without referring back to the source text), it would also be interesting to follow Lapshinova-Koltunski’s approach and look at the lexical variation in the target texts themselves (that is, to not only look at how source-text items correspond to target-text ones, but to also describe the characteristics of the target texts as independent texts). As in Lapshinova-Koltunski’s study, the characteristics of different groups of texts could be compared to help us interpret and assess the level of variation produced in the NMT output. Therefore, in order to provide greater context for interpreting the results related to the primary research question, this study will also address a subsidiary research question: how does the lexical variation present in the NMT output compare to the lexical variation present in other types of texts (in both cases, with respect to the concept chosen for the primary research question)?

1.4. Methodology

This section will provide an overview of the approach that I took to address my primary and subsidiary research questions. First, Section 1.4.1 will introduce the concept and text type chosen for this study and discuss the factors that influenced this choice. I chose to look at synonyms expressing the concept of an increase (in the specific use for which examples were given in Section 1.3) in earnings releases issued by banks to announce their financial results.

For my primary research question, I looked at French-language press releases issued by Swiss and French banks to announce their financial results and the translations into English produced by Google Translate. To help evaluate the results of that part of the study and to address the subsidiary research question, I also compiled other corpora of related texts. These will be described in Section 1.4.2. Finally, Section 1.4.3 will discuss how I analyzed and evaluated the information extracted from the corpora.

1.4.1. Text type and concept chosen for study

To address my primary research question, I needed a corpus of texts that met three criteria. First, the repeated concept needed to be expressed in ordinary words and not terms, because when terms are involved, repetition is generally the most common and the most appropriate strategy in both monolingual and translation contexts. Second, it had to be possible to express the concept through multiple, interchangeable synonyms of the same register, so that choosing among a group of target-language synonyms would not change the register of the source text and result in an inappropriate translation. (This relates to the concerns addressed in the study by Niu et al that was discussed in Section 1.2.) Third, the synonyms could not appear in contexts using rhetorical devices such as alignment or repetition because of the types of concerns raised by Bawden et al. I therefore excluded literary texts. In literature, the intentional use of lexical repetition can serve many purposes; Egbert (1999) provides an extensive review of these purposes.

I decided to look for texts with what DiMarco and Hirst call a “utilitarian style.” This is a type of “group style,” which means that all texts of a given text type share the characteristics of the style. Utilitarian styles are “associat[ed] with a genre of text that has a particular function or purpose, such as medical textbooks or newspaper articles. In such styles, the writer accommodates her language to what readers expect in the specific, restricted situation” (DiMarco and Hirst 1990: 66). While they were interested in studying utilitarian styles because they thought they would be easier to codify than literary ones, I was interested in them because I thought texts with such styles would have more limited, repetitive and predictable content, which would make the process of observing variation easier.

I chose to work with the press releases used by companies to announce their financial results, called “earnings releases” (DiStaso 2012: 127), because they provide this type of repetition, predictability and limited context. Earnings releases satisfy the three criteria set out in the first paragraph of this section: the repeated concept is expressed in ordinary words, the texts use multiple synonyms with the same register, and given the text type’s utilitarian style, the chances of finding many rhetorical devices were low. The purpose of earnings releases is to allow “managers to communicate their firm’s performance to shareholders and other stakeholder groups” (Bowen 2005: 1012). In order to give their readers a clearer sense of the nature of that performance in a particular period, earnings releases “generally include a comparative statement of prior period earnings” that serves as “a benchmark that investors can use to evaluate current earnings” (Krische 2005: 244).

In any given earnings release, such a comparative statement is provided for a number of individual results to indicate whether the current-period result is higher or lower than the prior-period result or has remained unchanged. Consequently, the ideas of an increase, a decrease and a lack of change are expressed numerous times in a release. For example, in one paragraph of its earnings release for the 2017 financial year, BCV Group informs the reader that, among other things, “le résultat brut des opérations d’intérêts augmente de 3% à CHF 498 millions,” “[l]es autres résultats ordinaires du Groupe reculent de 8% à CHF 39 millions” and “[l]es revenus du Groupe BCV sont stables à CHF 967 millions.” For this study, I decided to look specifically at the words used to express the concept of an increase.

1.4.2. Corpus study

Having decided on a text type, I proceeded to gather relevant texts and organize them into corpora. I assembled (1) a corpus of French-language earnings releases issued by Swiss and French banks that would serve as the source texts, (2) a corpus of the translations into English generated by Google Translate for each of the original French-language earnings releases (which I will call the “GNMT translations”), (3) a corpus of English translations that had been published on the bank websites for 35 out of the 50 original French releases (which I will call the “human translations”), and (4) a corpus of original English-language earnings releases issued by American and British banks. These corpora could also be analyzed as smaller subcorpora or aligned in parallel corpora in accordance with the needs of the study: the US and the UK components of the English corpus could be analyzed separately; the French releases could be divided in terms of whether a human translation was available or not, as could the GNMT translations; and the French subcorpora, the Google Translate subcorpora and the corpus of the human translations could be combined into parallel corpora.

The French corpus was the starting point for the study. My primary research question asks what an NMT system would do when it encounters repeated occurrences of a certain concept, expressed by means of a variety of synonyms. I therefore needed to find the synonyms used to express the concept of an increase in the source texts and locate the occurrences of those synonyms. The criteria that I established for identifying synonyms and for selecting occurrences are described in Sections 2.3 and 4.3.1.2. Because words can have multiple word uses, the criteria served to ensure that the word use of a word in a given occurrence did indeed express the concept I was interested in and that, as used in that occurrence, that word was indeed interchangeable with the other synonyms identified. As will be shown in Chapter 2, a relationship of synonymy does not exist between words, but between word uses.

For example, the English noun “growth” can be used in connection with the development of the human body (for example, a “growth spurt”) or to talk about economic expansion (“economic growth”). The English noun “increase,” on the other hand, cannot be used in these two contexts. The words “growth” and “increase” are therefore not synonyms in all of their word uses, but only in some. In Section 1.3, one of the example sentences from the Standard Chartered press release included the phrase “13 per cent growth in loans and advances to customers.” In Royal Bank of Scotland Group’s 2017 full-year earnings release, we find a very similar context, but with the noun “increase” used instead of “growth”: “a 5.9% increase in net loans and advances.” The words “growth” and “increase” are synonymous in this word use. By looking at the context, we can better identify synonymous word uses.

My search for French synonyms in the corpus of original French releases and my search for English synonyms in the corpus of original English releases therefore focused heavily on the contexts in which words appeared. The search for English synonyms in the corpora of GNMT translations and human translations, on the other hand, consisted of merely identifying the words used to translate the occurrences of French synonyms in the source texts.

1.4.3. Framework for analysis

In order to be able to easily analyze and compare the occurrences of French and English synonyms extracted from the monolingual and parallel corpora, I decided to record only two pieces of information for each occurrence: its morphological root and its part of speech. In her study on lexical repetition in literary translation, Egbert defined the root as “the element that remains constant in” the “inflectional and derivational variants of any lexical term” (Egbert 1999: 57). To indicate the roots, I used a system of notation (based on the one used by Egbert in her study) composed of my personal shorthand forms of the roots and an asterisk (for example, in my system of notation, progress* is the root for the word family that includes the verb progresser and the noun progression).
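As an illustration of this recording scheme, the sketch below stores each occurrence as a (root, part of speech) pair and tallies which target-language roots translate each source-language root. It is a hypothetical layout rather than the tooling actually used for the study: only the root progress* comes from the text above, while the other roots, the aligned pairs and all names in the code are invented for the example.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Occurrence(NamedTuple):
    root: str  # shorthand root followed by an asterisk, e.g. "progress*"
    pos: str   # part of speech, e.g. "verb", "noun", "adverb"


# Hypothetical aligned pairs: (source occurrence, target occurrence).
# Apart from "progress*", the roots are invented for illustration.
pairs = [
    (Occurrence("progress*", "verb"), Occurrence("increas*", "verb")),
    (Occurrence("progress*", "noun"), Occurrence("increas*", "noun")),
    (Occurrence("progress*", "verb"), Occurrence("grow*", "verb")),
    (Occurrence("augment*", "verb"), Occurrence("increas*", "verb")),
]


def translations_by_root(aligned_pairs):
    """Count, for each source root, how often each target root translates it."""
    table = defaultdict(Counter)
    for source, target in aligned_pairs:
        table[source.root][target.root] += 1
    return table


table = translations_by_root(pairs)
for source_root, target_counts in table.items():
    print(source_root, dict(target_counts))
```

In such a table, a source root that maps to a single target root would point to consistent translation, while a source root spread over several target roots would point to lexical variation.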

The advantage of analyzing synonyms based on their roots rather than treating different parts of speech separately (for example, treating the verb “increase” and the noun “increase” separately) is that it can give us a better idea of how a reader might perceive lexical variation in a text. Egbert notes that lexical repetition can give readers the feeling of cohesion even when the items involved are not entirely identical. She therefore expands her definition of lexical repetition to include “certain morphological alterations by way of inflection (‘like’ – ‘likes’) and derivation (‘entertain’ – ‘entertainment’) or conversion (‘human’ [adj.] – ‘human’ [n.])” (Egbert 1999: 26-27). If readers perceive repetition when morphologically related words are used, it is reasonable to presume that they would either not perceive variation or perceive a lack of variation in those cases, as variation is absent when repetition is present.

Grouping synonyms together by their roots presupposes that words of different parts of speech can be treated as synonyms. (In other words, it presupposes accepting that word uses of the verb progresser and the noun progression can be treated as synonymous in French and that word uses of the verb “increase” and the noun “increase” can be treated as synonymous in English.) In this study, I accept that a relationship of synonymy can exist between words of different parts of speech; the reasons will be explained in Chapter 2. One advantage of this broader view of synonymy is that it allows us to more accurately describe the choices we make when we translate. As will be described in Chapter 2, when we translate, we do not engage in the direct replacement of nouns with nouns and verbs with verbs, but may use different parts of speech in order to express the meaning of the source text while producing translations that read well in the target language. Looking at parts of speech in MT output enables us to study the extent to which MT systems engage in that sort of rephrasing. (As indicated in the first paragraph of this section, the part of speech is the second piece of information that I recorded for each occurrence of a synonym.) Engaging in this type of rephrasing expands the range of translation options available for a given source-text word or phrase and, consequently, increases the potential for variation in the target text.

The following examples are taken from the corpora used for this study and show how translations can use parts of speech that differ from the source text. The first example involves one of the sentences from a Groupe BPCE earnings release provided in Section 1.3:

• Les encours de crédit s’établissent à 196 milliards d’euros au 31 décembre 2017, enregistrant une croissance de 7,7 % par rapport au 31 décembre 2016

In translating the phrase beginning “enregistrant une croissance de 7,7 %,” the human translation follows the structure of the original:

• “representing 7.7% growth compared with December 31, 2016”

The GNMT translation, on the other hand, deviates from that structure and uses the adverb “up”:

• “up 7.7% compared to December 31, 2016”

Here, the translation produced by Google Translate might be considered more idiomatic and authentic than the human translation.

The following example is taken from the same earnings release. It also involves the noun croissance but relates to a different financial result:

• Ils ont enregistré une croissance de 7,0 % par rapport au 31 décembre 2016

The human translation combines this sentence with the previous one, and the information above is conveyed through the following phrase:

• “equal to growth of 7.0% compared with December 31, 2016”

In its translation, Google Translate replaces the verb phrase enregistrer une croissance with the verb “increase”:

• “They increased by 7.0% compared to December 31, 2016”

These examples serve to illustrate the various types of rephrasing that are possible in translation. Further examples will be provided in Section 2.4.
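As a rough illustration of the two pieces of information recorded for each occurrence (root and part of speech), the examples above could be stored as aligned source-target pairs. This is only a sketch: the class names and the root spellings `croiss*`, `grow*` and `up*` are hypothetical shorthand invented here, not the notation actually used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """One synonym occurrence, reduced to its root and part of speech."""
    root: str  # shorthand root followed by an asterisk (hypothetical spellings)
    pos: str   # part of speech, e.g. "noun", "verb", "adverb"


@dataclass(frozen=True)
class AlignedPair:
    """A source-text occurrence and the target-text occurrence translating it."""
    source: Occurrence
    target: Occurrence

    @property
    def pos_changed(self) -> bool:
        # True when the translation uses a different part of speech.
        return self.source.pos != self.target.pos


# The three English renderings of the noun "croissance" discussed above.
pairs = [
    AlignedPair(Occurrence("croiss*", "noun"), Occurrence("grow*", "noun")),     # human: "growth"
    AlignedPair(Occurrence("croiss*", "noun"), Occurrence("up*", "adverb")),     # GNMT: "up"
    AlignedPair(Occurrence("croiss*", "noun"), Occurrence("increas*", "verb")),  # GNMT: "increased"
]

for pair in pairs:
    print(pair.source.root, "->", pair.target.root, "| POS changed:", pair.pos_changed)
```

Recording only these two fields keeps morphologically related forms (such as the noun and verb “increase”) under a single root while still making changes of part of speech visible.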

1.4.4. Evaluation

Section 1.4.3 introduced the two features of each synonym occurrence that would be considered in my study: its root and its part of speech. This section will describe how those two features will be used to address my research questions.

1.4.4.1. Primary research question

As explained in Section 1.3, my primary research question is concerned with the relationships between specific source-language synonyms and the translations generated for them in NMT output. The first part of the question asks whether an NMT system would consistently produce the same translation for a given source-language synonym. If such a consistent pattern of translation were present, this would be seen in a one-to-one correspondence between a source-text item and a target-text one. Because I had decided to analyze occurrences of synonyms in terms of their roots and parts of speech, I decided to look for one-to-one correspondences between source-language roots and target-language ones (these roots will be identified in Section 4.3.1.1) and between the parts of speech used in the source-text occurrences and those used in the corresponding target-text occurrences. The absence of a one-to-one correspondence would indicate that a minimal level of variation was present.

With respect to the roots, in order for a one-to-one correspondence to be absent, more than one English root would need to be used in the translations of occurrences of a given French root. If, for example, all English translations of occurrences of the French root progress* (that is, the noun progression, the verb progresser and so on) used either the verb “increase” or the noun “increase,” there would be a one-to-one correspondence between the French root progress* and the English root increas*. The noun and verb forms of “increase” are not distinguished here because, as indicated in Section 1.4.3, the use of the same root, even in different morphological forms, creates the impression of repetition. It should be noted that, for the purposes of this study, the one-to-one correspondence is seen from the perspective of the French root. For example, if increas* is the only English root used to translate occurrences of the French root progress* and it is also the only root used to translate occurrences of the French root augment*, one-to-one correspondences would still be considered to exist between the French root progress* and the English root increas* and between the French root augment* and the English root increas* (even though there would be no one-to-one correspondence seen from the perspective of the root increas*).
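The source-side perspective described above can be sketched in a few lines of Python. The function names and the sample data are assumptions made for illustration (in particular, the roots `hauss*` and `ris*` are hypothetical spellings); the point is only that the check groups target roots per French root and never looks in the opposite direction.

```python
from collections import defaultdict


def target_roots_by_source(pairs):
    """Collect the set of target roots observed for each source root."""
    grouped = defaultdict(set)
    for source_root, target_root in pairs:
        grouped[source_root].add(target_root)
    return dict(grouped)


def has_one_to_one(pairs, source_root):
    """True if exactly one target root translates every occurrence of source_root."""
    return len(target_roots_by_source(pairs).get(source_root, set())) == 1


# Hypothetical aligned occurrences: (French root, English root).
pairs = [
    ("progress*", "increas*"),
    ("progress*", "increas*"),
    ("augment*", "increas*"),
    ("hauss*", "increas*"),
    ("hauss*", "ris*"),
]

for root in ("progress*", "augment*", "hauss*"):
    print(root, "one-to-one:", has_one_to_one(pairs, root))
```

In this sample, progress* and augment* each count as one-to-one even though both map to increas*, which mirrors the situation described in the paragraph above.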

The first part of my primary research question would therefore be addressed by looking for the presence or absence of a one-to-one correspondence between roots or parts of speech.

In the absence of a one-to-one correspondence, the second part of my primary research question goes on to ask whether the NMT system would use a variety of target-language synonyms and syntactic constructions in its translations of given source-language synonyms. This introduces a subjective element as there is no precise definition for what constitutes a variety. Levels of lexical variation are often assessed by comparing features of various groups of texts (as Lapshinova-Koltunski had done in the study described in Section 1.2 when she compared the type-token ratios and lexical density of the corpora of translation variants). In my study, the features being assessed in the GNMT translations (the roots and parts of speech of target-text synonyms as they relate to specific source-text synonyms) can be compared to those same features in the human translations. Although the variation in the NMT output would not be expected to exceed or even match the level present in the human translations, a comparison of the two would provide a frame of reference. This comparison will be described in more detail later in this section.

Because human translations are not available for all of the earnings releases included in the French corpus, I decided to set baseline criteria to establish the presence or absence of a “variety” of synonyms. The test for one-to-one correspondence that was previously mentioned seeks to establish whether a minimal level of variation is present. When applied to roots, it would therefore not, for example, distinguish between cases in which (i) two English roots are used as translations of a given French root, one 99% of the time and the other 1%, (ii) two English roots are used as translations of a given French root, each 50% of the time, and (iii) five English roots are used as translations of a given French root, each 20% of the time. A minimal level of variation would be found to be present in all three cases.

To have a more nuanced idea of the level of variation present, I decided to define the baseline criteria for the presence of a “variety” in terms of both (a) the number of target roots or parts of speech used as translations for a given source root or part of speech and (b) the distribution of translations among those target roots or parts of speech. Satisfaction of the baseline criteria would indicate a level of variation beyond the minimum; I will call this higher level an “interesting” level of variation. As the use of two target roots or parts of speech would be sufficient to indicate a minimal level of variation, I decided to simply increase that number by one (to three) to set the threshold for an interesting level. In terms of distribution, I wanted to distinguish between cases where translations were concentrated in a single root or part of speech (as in the scenario described in (i) in the previous paragraph) and those where the NMT system regularly used multiple synonyms. To make this distinction, I added a criterion to determine whether a single root or part of speech was used in the majority of cases; according to this criterion, no target root or part of speech, as applicable, could be used to translate a single source root or part of speech more than 50% of the time (i.e. in a majority of cases).

The tests for minimal and interesting levels of variation would be conducted separately for each French root (progress*, augment* and so on) and part of speech (adverbs, verbs and so on). The tests are summarized in Tables 1.1 and 1.2.

Table 1.1: Test for establishing a minimal level of variation

Roots: In a given corpus, for each source root,

(i)(a) if the number of target roots used to translate occurrences of the given source root = 1, no variation is present

(i)(b) if the number of target roots used to translate occurrences of the given source root > 1, a minimal level of variation is present

Parts of speech: In a given corpus, for each source part of speech,

(ii)(a) if the number of target parts of speech used to translate occurrences of the given source part of speech = 1, no variation is present

(ii)(b) if the number of target parts of speech used to translate occurrences of the given source part of speech > 1, a minimal level of variation is present

Table 1.2: Test for establishing an “interesting” level of variation

Roots: In a given corpus, for each source root, if

(a) the number of target roots used to translate occurrences of the given source root ≥ 3

and

(b) no target root accounts for more than 50% of those occurrences,

then an interesting level of variation is present

Parts of speech: In a given corpus, for each source part of speech, if

(a) the number of target parts of speech used to translate occurrences of the given source part of speech ≥ 3

and

(b) no target part of speech accounts for more than 50% of those occurrences,

then an interesting level of variation is present
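The tests in Tables 1.1 and 1.2 can be expressed as a single classification function. The following is a minimal sketch, assuming that the translations of one source root (or one source part of speech) are available as a flat list of target labels; the function name and the labels `a` to `e` are hypothetical.

```python
from collections import Counter


def variation_level(target_items):
    """Classify the variation in the translations of one source root or part of speech.

    Returns "none", "minimal" or "interesting", following Tables 1.1 and 1.2.
    """
    counts = Counter(target_items)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one occurrence is required")
    if len(counts) == 1:
        return "none"  # Table 1.1, case (a): a single target item
    # Table 1.2: at least three target items, none above 50% of occurrences.
    # Integer comparison avoids floating-point issues at exactly 50%.
    if len(counts) >= 3 and 2 * max(counts.values()) <= total:
        return "interesting"
    return "minimal"  # Table 1.1, case (b) only


# The three scenarios discussed before the tables:
case_i = ["a"] * 99 + ["b"] * 1                      # 99% / 1%
case_ii = ["a"] * 50 + ["b"] * 50                    # 50% / 50%
case_iii = [r for r in "abcde" for _ in range(20)]   # five roots, 20% each

for name, case in (("i", case_i), ("ii", case_ii), ("iii", case_iii)):
    print(name, variation_level(case))
```

Under these criteria, scenarios (i) and (ii) remain at the minimal level because only two target roots are used, while scenario (iii) reaches the interesting level.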

The comparison with the human translations will be based on two measures. First, for each French root, the difference between the number of English roots used to translate it in the human translations and the number used in the GNMT translations will be calculated. Likewise, for each part of speech used in the source texts, the difference between the number of parts of speech used in the human translations and the number used in the GNMT translations will be calculated. The results will be analyzed on a case-by-case basis in Section 4.5 to see how the GNMT translations compare to the human translations in terms of absolute numbers of synonyms used. The second measure used to compare the GNMT translations to the human translations is intended to reflect how evenly translations of a given French root or part of speech are distributed over the English roots or parts of speech used. For example, if three English roots are used in the GNMT translations to translate the French root progress*, this measure is intended to reveal whether all three roots are used fairly regularly or one is used in most cases.

The second measure is based on the idea of a range. It is based not on the absolute numbers of roots or parts of speech used, but on the percentages they represent. The difference between the highest percentage and each of the other percentages is taken, and the average of those differences is then calculated. This yields the second measure, which I call the “average range.” The following is an example of how the average range is calculated:

• Scenario: 3 English roots are used to translate a given French root. 1 English root is used for 97% of occurrences, 1 is used for 2%, 1 is used for 1%

• Percentages involved: 97%, 2%, 1%

• Difference between the highest percentage and each of the other percentages: .97 - .02 = .95; .97 - .01 = .96

• Average of the differences: .95 + .96 = 1.91; 1.91 / 2 = .96 (rounded up from .955)

• Average range is .96

If each of the three English roots accounted for 33% of occurrences, the average range would be 0.
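The calculation of the average range can be sketched as a short function. This is an illustrative reading of the description above, not code used in the study; the function name is hypothetical, and because the text does not define the measure for a single target root, the sketch assumes that case should be rejected.

```python
def average_range(shares):
    """Average difference between the highest share and each of the other shares.

    `shares` are proportions (e.g. 0.97 for 97%) of the occurrences of one
    source root or part of speech accounted for by each target item.
    """
    if len(shares) < 2:
        # Assumption: the measure is only meaningful for two or more target items.
        raise ValueError("average range needs at least two shares")
    ordered = sorted(shares, reverse=True)
    highest = ordered[0]
    differences = [highest - share for share in ordered[1:]]
    return sum(differences) / len(differences)


concentrated = average_range([0.97, 0.02, 0.01])  # the worked example above
even = average_range([1 / 3, 1 / 3, 1 / 3])       # three equally used roots

print(concentrated, even)
```

For the worked example the function returns approximately .955, which the text rounds to .96; for three equally used roots it returns 0.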

The average range is based on the difference between the highest percentage and each of the other percentages, and not on the intervals between each percentage, because using intervals would have a strong diluting effect in cases where the occurrences are concentrated in one English root. If we were to apply a formula based on intervals to the scenario given above, the result would be as follows:

• Intervals between each of the percentages: .97 - .02 = .95; .02 - .01 = .01

• Average of intervals: .95 + .01 = .96; .96 / 2 = .48

The average ranges will also be analyzed on a case-by-case basis.
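The diluting effect of the interval-based alternative can be checked with a small sketch. The function name is hypothetical, and the code simply reproduces the interval calculation shown for the 97% / 2% / 1% scenario.

```python
def average_interval(shares):
    """Average of the gaps between consecutive shares, sorted from highest to lowest."""
    if len(shares) < 2:
        raise ValueError("at least two shares are required")
    ordered = sorted(shares, reverse=True)
    intervals = [a - b for a, b in zip(ordered, ordered[1:])]
    return sum(intervals) / len(intervals)


diluted = average_interval([0.97, 0.02, 0.01])
print(diluted)  # approximately .48, against an average range of about .96
```

Because consecutive gaps sum to the distance between the highest and lowest shares, this alternative depends only on those two values and the number of target items. It therefore cannot register how the remaining shares are distributed, which is one way of seeing why it understates concentration in a single root.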

1.4.4.2. Subsidiary research question

All of the corpora and subcorpora mentioned in Section 1.4.2 will be involved in addressing the subsidiary research question, which asks how the lexical variation present in NMT output compares with the lexical variation present in other types of texts. The corpora and subcorpora will be compared in terms of the numbers of roots and parts of speech that are used in them, the average number of roots used per release, the overall average range for roots and the overall average range for parts of speech. The comparisons will be shown by means of the same type of notation used by Lapshinova-Koltunski. For example, if in the comparison of numbers of roots used, 7 roots are used in the entire French corpus, 12 are used in the human translations, 7 in the GNMT translations and 5 in the entire original English corpus, the results will be shown as “Human > FR-All, GNMT-All > EN-All.” The comma between FR-All and GNMT-All indicates that they occupy the same place in the comparison. The names of the corpora will be explained in Section 4.5.
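The comparison notation described above can be generated mechanically from a table of values. The sketch below is an assumption about how such a string might be produced (the function name is hypothetical); it uses the figures from the example in the preceding paragraph.

```python
from itertools import groupby


def format_comparison(values):
    """Render corpus values as a ranking such as 'A > B, C > D'.

    Corpora with equal values share a place and are joined by a comma.
    """
    # sorted() is stable, so tied corpora keep their original order.
    ordered = sorted(values.items(), key=lambda item: -item[1])
    places = []
    for _, tied in groupby(ordered, key=lambda item: item[1]):
        places.append(", ".join(name for name, _ in tied))
    return " > ".join(places)


# Numbers of roots used, as in the example in the text.
roots_used = {"FR-All": 7, "Human": 12, "GNMT-All": 7, "EN-All": 5}
print(format_comparison(roots_used))
```

With these figures the function returns “Human > FR-All, GNMT-All > EN-All”, matching the notation given in the text.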

The English-language corpora (of original releases and translations) will also be compared in terms of the percentages of occurrences of synonyms in each corpus that each English root accounts for and the percentages that each part of speech accounts for. For example, if the root increas* accounts for 26% of the occurrences of synonyms in the human translations, 50% in the GNMT translations and 63% in the entire original English corpus, the results would be shown as “EN-All > GNMT-All > Human.” This type of comparison might reveal patterns that could help explain certain tendencies in the GNMT translations or human translations and help us interpret the results of the study with respect to the primary research question.

1.5. Conclusion

This chapter has introduced the research questions that I will address and has touched on some important concepts that will be explored further in Chapter 2. These include the possibility of synonymy among different parts of speech and the principle that a relationship of synonymy exists between word uses and not words. Chapter 3 will provide an overview of the main approaches to MT and will argue that NMT systems have greater potential for generating the type of variation being studied than systems based on other approaches. Chapter 4 will turn to the study itself. It will describe the corpora used, discuss the synonyms and occurrences of synonyms extracted from those corpora and apply the measures described in Section 1.4.4 to those occurrences.

2. SYNONYM USE IN EARNINGS RELEASES AND THEIR TRANSLATIONS

2.1. Introduction

This chapter will briefly address different conceptions of synonymy in order to provide a theoretical basis for the approach taken in this thesis. The approach can be considered both narrow, in that it focuses on a context-dependent use of synonyms, and broad, as it allows for synonymy among different parts of speech. A broad view of synonymy is taken in this thesis because it reflects the actual practice of translation (as will be shown in Section 2.4). After a general discussion of synonymy in Section 2.2, this chapter will, in Section 2.3, look specifically at the use of synonymy in earnings releases. It will describe how the narrow and broad aspects of synonymy mentioned above will be taken into account in identifying synonyms in the earnings releases used in the study, and will provide examples of synonyms taken from the French and English corpora mentioned in Section 1.4.2. Section 2.4 will then bridge the gap between intralingual synonymy and interlingual equivalence and examine how French synonyms are translated in the “human translation” of one French earnings release.

2.2. Intralingual synonymy and interlingual equivalence

We often engage in variation in our writing through the use of synonyms. We may choose to use different words from among a group of synonyms for stylistic purposes (“aiming for textual cohesion, adding emphasis, avoiding repetition”) or for prosodic ones (“rhyme, rhythm, alliteration”) (Adamska-Sałaciak 2013: 330). Egbert calls the avoidance of lexical repetition a stylistic norm; “by displaying variation, a language user can earn the appreciation of the speech community as someone with a high stylistic proficiency” (Egbert 1999: 43). As noted in Section 1.1, this variation can also be achieved through the use of various other devices, including hypernyms and pronouns (Egbert 1999: 42). However, this thesis is only concerned with the variation produced through the use of synonyms and the syntactic variation that arises when synonyms of different parts of speech are used.

Words are generally considered to be synonyms if they pass the test of “exchangeability” (Hüllen 2003: 36), but the ability to replace one word with another is limited by the degrees of difference in their denotations, by contextual factors (Hüllen 2003: 36), and by the words they generally collocate with (Edmonds and Hirst 2002: 111). According to Edmonds and Hirst,

“[a]bsolute synonymy, if it exists at all, is quite rare. Absolute synonyms would be able to be substituted one for the other in any context in which their common sense is denoted with no change to truth value, communicative effect, or ‘meaning’ (however ‘meaning’ is defined).” (Edmonds and Hirst 2002: 107)

Near synonyms, on the other hand, are “almost synonyms, but not quite; very similar, but not identical, in meaning; not fully intersubstitutable, but instead varying in their shades of denotation, connotation, implicature, emphasis, or register” (Edmonds and Hirst 2002: 107).

Near synonyms can also be distinguished from each other by their collocations, “the words or concepts” with which they can be idiomatically combined (Edmonds and Hirst 2002: 111). For the purposes of this thesis, I accept that most of the words that we consider synonyms are actually near synonyms and that their interchangeability is subject to constraints. I will refer to them as synonyms, but with the understanding that special attention needs to be paid to the conditions under which they can be substituted for each other.

Because of the test of exchangeability, examples of synonymy often involve words with the same part of speech. However, for authors such as Hurford et al, a relationship of synonymy can exist even “between words of different parts of speech, for example between the verb sleeping and the adjective asleep” (Hurford et al 2007: 108). Even though they restrict synonymy to word uses and use the term “paraphrase” to refer to “the notion of ‘sameness’ of meaning” when it is “extended to entire sentences in a language” (Hurford et al 2007: 108), they acknowledge that “some semanticists talk loosely of synonymy in the case of sentences as well” (Hurford et al 2007: 109).

Hüllen goes beyond word uses and asserts that words and phrases can function as synonyms of each other:

“[I]n the statement ‘A bachelor is a male person who has never married’ the word of equation (be) means ‘p is semantically identical with q’ and the expressions before and after be (p, q) have exactly the same meaning. This phenomenon is called synonymy. Words can be synonyms of words, and words can be synonyms of phrases, just as phrases can be synonyms of words or of phrases. What is essential in all these cases is that one and the same meaning can be expressed by two (or more) linguistic expressions.” (Hüllen 2003: 36)

Hüllen then acknowledges that, depending on the context, one of these synonyms (the word or the phrase) might be more appropriate.

In Hüllen’s example, a relationship of synonymy existed between a noun and a noun phrase; in Hurford’s example, it was between a verb and an adjective. Combining their observations, we can also see a relationship of synonymy between noun phrases and verb phrases (for example, between “There was a 3% rise in net income” and “Net income rose 3%” or “Net income increased 3%”), between verbs and adjectives (for example, between “Net income rose 3% in the third quarter” and “Net income was 3% higher in the third quarter”) or between other syntactically distinct structures. As Hüllen notes, this use of various “linguistic expressions” to express “one and the same meaning” is a strategy that is “regularly exploited […] in translations” (Hüllen 2003: 36).

In considering the relationship between intralingual synonyms and interlingual equivalents, Adamska-Sałaciak writes, “An equivalent is often equated with a synonym in another language, and vice versa: a synonym may be defined as a same-language equivalent” (Adamska-Sałaciak 2013: 329). When multiple words or phrases can be used in a source language to express a certain concept and multiple words or phrases can also be used to express that concept in the target language, then a relationship of equivalence seems to exist between all members of the source language synonym group and all members of the target language synonym group for the specific use in question.

2.3. Synonym use in earnings releases

Earnings releases are a useful text type for studying the use of synonyms because, as indicated in Section 1.4.1, each one provides multiple statements comparing a firm’s performance in various areas to its performance in those areas in a prior period. Each earnings release therefore expresses the concept of increase (the current-period figure being higher than the prior-period one), decrease (the current-period figure being lower than the prior-period one) or lack of change (the current-period figure being the same as the prior-period one) multiple times. As indicated in Section 1.4.1, of the three concepts just mentioned, I chose to focus on the concept of an increase for this study.

Another advantage of using earnings releases is that, not only are there repeated references to the same concept, but the references appear in a very precise, delimited context. This limits the risk of differences in denotation and provides reassurance that the word uses are indeed synonymous. In order to ensure that, in my search for synonyms, I was identifying occurrences of words where the word uses were indeed synonymous, I decided to limit the types of contexts that I would look at. In identifying synonyms and selecting occurrences of them for use in the study, I would only consider words used as a part of a comparison (i) that was being made (either explicitly or implicitly) to a result in a specific prior period and (ii) in which a figure was used to indicate the current-period result or to quantify the change from the prior-period result. The figure could express a monetary amount, a percentage, or a number of basis points.

To observe the use of synonyms to express the idea of an increase, I compiled a monolingual corpus of original French-language earnings releases, corpora of machine and human translations of those French releases (which were also combined with the French releases in parallel corpora) and a monolingual corpus of original English-language earnings releases, as indicated in Section 1.4.2. The characteristics of these corpora and the process used to assemble them will be described in detail in Section 4.2 and the patterns of synonym use in Section 4.5. In this section, I will describe some trends observed in general terms in order to illustrate how the concept of an increase is expressed in the earnings releases through synonyms. The French monolingual corpus shows that certain morphologically related word forms are frequently used in many of the texts to refer to an increase. These include progresser and its related forms progression and en progression, augmenter and its related forms augmentation and en augmentation, croître and its related forms croissance and en croissance, and hausse and its related form en hausse. The English monolingual corpus also shows the frequent appearance of certain words, such as the adverb “up,” the adjective “higher,” and the noun and verb forms of “increase.” We also find the noun and verb forms of “rise” and “grow.”

Both monolingual corpora show that there are several ways to express a change from a prior period.

The amount of variation used in expressing increases differed among earnings releases, but no earnings release in the French corpus relied entirely on one word or one group of related words or phrases to express the idea of an increase or decrease. For example, to express that a current period result is higher than a prior period result, Banque Cantonale de Genève’s full-year 2013 earnings release uses words or phrases related to progresser (progresse, forte progression, la progression modérée, a progressé de manière marquante, ont progressé modérément, a progressé, progression (twice)), accroître (se sont accrus (three times)), hausse (une hausse, en hausse (twice), étaient en hausse, furent en forte hausse, forte hausse), augmenter (ont augmenté) and avancer (avance).
