91 Campbell et al., columns AY-BC.

92 Pennebaker (2007) 5-6.

[Graph: Affective process usage as % (series: Baseline, English, Translated, BCV)]


categories for which usage rates are higher in our letters than for the baseline–cause, certainty, inhibition, and inclusive–show generally a large discrepancy between the incidence in the letters and the baseline, and, in half of those sub–categories, a very significant variance between the texts written in English and those translated into that language.

The exclusive and discrepancy categories, including words like “but”, “without”, “exclude”, and “should”, “would”, and “could” respectively, show an incidence in the English–original texts of 1.12% and 1.16%, slightly below the baseline of 1.41% for discrepancy, and well below the baseline of 2.40% for exclusive. But the translated texts show only about half the usage rate observed in the English–original texts: 0.63% for discrepancy, and 0.64% for exclusive. This, then, is a clear instance of the translations diverging from the linguistic patterns of the untranslated texts. Given the categories involved, the potential result is reduced clarity, particularly in the delimiting of categories and phenomena, which is one of the principal functions of many of the words in the exclusive category. Many of the words in the discrepancy category allow for nuancing of otherwise stark factual elements or predictions, and texts in which such communicative tools are under–utilized run the risk of appearing to make more definitive claims than might be typical, or warranted, in the context of shareholder letters in annual reports.

Both of these subcategories also represent possible targets for tweaking by the translation team at BCV, as in each case the LIWC results for the “Letter from the Chairman and the CEO” show usage rates fractionally lower than the already comparatively–low rates for translated shareholder letters, at 0.53% incidence for each sub–category. It should be noted, however, that once again Pennebaker’s categories prove to be imperfectly aligned with realities of financial–sector language use. While his exclusive category includes “versus”, it does not include “compared with” or “compared to”, and certainly not “year–on–year”, meaning that some quite common turns of phrase used in the banking industry for the specific contextualizing and chronological situating of details being reported would not be included in this category despite serving linguistic and communicative functions that would appear to correspond to those of words that were in fact selected by Pennebaker in the constitution of his categories and sub–categories. In any event, it is certainly of note that throughout these eight sub–categories, regardless of how perfectly they align with the subtleties of banking–sector reporting language, there is no case in which the translated texts show a higher incidence of words related to cognitive processes than that in the texts from English–language institutions.

In some cases the difference between original–English and translated English is quite small (0.79% versus 0.73% for the inhibition sub–category, for example) and in others it is quite sizeable (as in the sub–categories already mentioned, as well as “tentative”, for which the English–original texts show an incidence of 1.48%, while the translated texts come in at just 0.73%). However, the single most compelling LIWC result in this category family is for “inclusive”, a relatively small sub–category of 18 words including “and”, “with”, “include”, and forms of the verb “to come”.93

In baseline English–language usage this is already a quite prevalent group of words, representing 4.91% of all production. All of our shareholder letters showed a higher–than–baseline incidence of the words in this sub–category. LIWC results for the translated texts showed a usage rate of 5.91% for this sub–category, more than 15% above the baseline rate. Our English–language shareholder letters showed a usage rate of 7.86%, more than 50% above the baseline rate, marking this as a group of words that are clearly deemed by financial–sector communication experts to be useful in the propagating of their chosen corporate communication strategy and voice.
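The comparisons with the baseline in the preceding paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch in Python and is not part of the original study: the function name `percent_above_baseline` is hypothetical, and the inputs are simply the rounded usage rates quoted above.

```python
def percent_above_baseline(rate: float, baseline: float) -> float:
    """Return how far `rate` lies above `baseline`, as a percentage of the baseline."""
    return (rate - baseline) / baseline * 100.0


# Rounded usage rates for the "inclusive" sub-category, as quoted in the text (in %).
BASELINE = 4.91
TRANSLATED = 5.91
ENGLISH = 7.86

for label, rate in (("Translated", TRANSLATED), ("English", ENGLISH)):
    print(f"{label}: {percent_above_baseline(rate, BASELINE):.1f}% above baseline")
```

On these rounded figures the translated letters sit roughly 20% above the baseline and the English–original letters roughly 60% above it, which is consistent with the thresholds of “more than 15%” and “more than 50%” cited above.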

Graph 18: Usage rates for inclusive terms

93 Campbell et al., column BK.

[Graph 18: Inclusive usage as % (series: Baseline, English, Translated, BCV)]


Far from the possible accusation of coldly cerebral communication hinted at above, this clear emphasis on the language of inclusion–even though it is of course more related to inclusion of elements in categories rather than a more touchy–feely version of inclusiveness–indicates that the voice of financial institutions overtly skews towards open arms rather than terms that limit the contents of a category. This is also an area in which the translation team at BCV would appear to be ahead of the curve compared to their colleagues working on behalf of other French–language banks. With LIWC results showing an incidence of 7.41% in BCV’s shareholder letters for words from the “inclusive” category, the translators on this team are much more closely aligned with the linguistic conventions of English–language banks than they are with those of other French–language banks once translated into English, showing an LSM of .97, or near–perfect synchrony between the BCV translation and the English letters.
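The LSM score cited above can be reproduced from the two usage rates. The sketch below assumes that the study uses the standard formulation of language style matching for a single category, 1 − |a − b| / (a + b + 0.0001), which is not restated in this section. The function name `lsm` and the variable names are hypothetical.

```python
def lsm(rate_a: float, rate_b: float) -> float:
    """Language style matching score for one category (assumed standard formula)."""
    # The small constant guards against division by zero when both rates are 0.
    return 1.0 - abs(rate_a - rate_b) / (rate_a + rate_b + 0.0001)


# Rounded "inclusive" usage rates quoted in the text (in %).
bcv = 7.41
english = 7.86
translated = 5.91

print(f"BCV vs English corpus:    {lsm(bcv, english):.2f}")
print(f"BCV vs translated corpus: {lsm(bcv, translated):.2f}")
```

Under that assumption the first line reproduces the .97 reported above. The second comparison is not a figure given in the study. It is computed here only for contrast, and comes out lower, in line with the observation that BCV sits closer to the English–language letters than to the other translated ones.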

This sub–category, then, would appear to constitute an example of where financial–sector translators working from French to English could potentially produce texts more in tune with those produced in Anglosphere banks by emulating the usage rates of “inclusive” words at such banks, much as seems to have been implemented by the team at BCV.

Cognitive processes, then, far from proving to represent strategies for distancing the so–called “one percent” from the other ninety–nine, in fact show the potential for bringing greater harmony, linguistically speaking, between the shareholder letters written by CEOs and other top managers at banks in the US, UK, and English–speaking Canada and those produced by colleagues in the French–speaking world and then subjected to the filter of the process of translation into English.

Perceptual processes

Pennebaker’s dictionary category of “perceptual processes” includes words associated more or less obviously with the five senses: vision, hearing, smell, taste, and touch. While clearly less prevalent in financial–sector reporting than, for example, in an annual report from a food products company, the terms in this category nevertheless feature in many metaphors in English and are thus not entirely absent from the specific sort of texts analyzed in this research study. Pennebaker has broken this category not into five sub–categories as one might have expected, but only three: “see”, “hear”, and “feel”.94 Our LIWC results show, as we would have anticipated, that this subset of words is far more common in baseline English–language usage than in our shareholder letters. Perceptual processes account for 2.16% of all production in general in English, but in shareholder letters written in English the usage rate is only 0.60%. The rate for shareholder letters translated into English is quite similar, coming in at 0.50%. However, as indicated above, the relatively small size of these use frequencies means that this small difference is proportionally more significant: the translated texts employ perceptual process words at a rate roughly 17% below that of the untranslated letters (equivalently, the untranslated rate is 20% higher than the translated one).

94 Pennebaker (2007) 5-6.
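To make the distinction between an absolute and a proportional difference concrete, the following minimal sketch separates the two for the perceptual–process rates quoted above. It is an illustration only, not the study’s own procedure. The function name `gaps` is hypothetical and the inputs are the rounded rates from the text.

```python
def gaps(original: float, translated: float) -> tuple[float, float]:
    """Return (absolute gap in percentage points, relative gap in % of `original`)."""
    absolute = original - translated
    relative = absolute / original * 100.0
    return absolute, relative


# Rounded perceptual-process usage rates quoted in the text (in %).
english_original = 0.60
translated = 0.50

absolute, relative = gaps(english_original, translated)
print(f"Absolute gap: {absolute:.2f} percentage points")
print(f"Relative gap: {relative:.1f}% of the English-original rate")
```

With these rounded inputs the gap is 0.10 percentage points, or roughly 17% of the English–original rate. Measured against the translated rate instead, the same gap amounts to 20%, so the choice of reference rate should be stated whenever such proportions are reported.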

For the three sub–categories, Pennebaker’s baselines suggest that sight is the most widely–referenced sense, with “see” words making up 0.86% of all language production in English. “Feel” and “hear” have baseline usage rates of 0.61% and 0.56%, respectively, according to Pennebaker. The shareholder letters in our corpora show the same ranking in order of frequency of use as the baseline, with sight the most common sub–category, followed by feel and hear. The untranslated texts generally show a slightly higher usage rate for words in these categories than do the translated texts, with the exception of the “feel” sub–category, in which the usage rate of 0.17% in the translated letters is fractionally higher than the 0.15% scored by the English–original letters. Interestingly, the numbers from BCV are inconsistent across the three sub–categories when compared to both the English corpus and the translated one.

Overall, the “perceptual process” rate for the BCV shareholder letters was almost identical to that of the English–language corpus, at 0.62%, very slightly higher than the 0.60% for that corpus. However, the BCV letters show an incidence of just 0.12% for “see” words, suggesting that terms such as “look”, “picture”, “view”, and “watch”, fairly common in macroeconomic language, are slightly less prevalent in the translated letters from BCV than we might have anticipated. The “feel” sub–category, on the other hand, sees BCV posting a usage rate that is twice that of the English–drafted letters, albeit still only half the baseline usage rate. In this case, words such as the verb “to edge” (up or down), and the various permutations of “weight” and “weighted” may begin to explain the higher usage at BCV than in the industry in general, but of course the relatively small size of the BCV sample argues for hesitancy before any excessively sweeping conclusions. Given the deictic role of many of the terms in these sub–categories, and the very close attention to deixis in translation at the BCV due to the efforts of head of translation David Jemielity, it is also not inconceivable that some measure of this observed upward tick in word–use frequency could be ascribed to that particular communicative stance, aimed at producing English translations that are carefully rooted in the spatial and temporal clarification that proper use of deictic forms and associations can provide.

Biological processes

This category, constructed by Pennebaker from 567 words and stems,95 is one we would expect to be drastically under–represented in financial–sector reporting compared to day–to–day English usage. Indeed, although the baseline usage rate calculated by Pennebaker is 1.94%, both of our corpora return LIWC results of 0.38% for usage of words from this category. The rate from the BCV is almost perfectly aligned with our corpora results, at 0.39%. As none of the sub–categories under biological processes shows any more compelling results, we will not engage in any deep analysis of the results for this particular subset of English–language words, contenting ourselves with confirming that the LIWC results for our corpora are almost surprisingly consistent across this category. We consider the strongly parallel numbers returned by LIWC to serve as confirmation of the validity of the corpora themselves, and will move on without further ado to the final psychological process category defined by Pennebaker: relativity.

Relativity

In this broad category composed by Pennebaker of 638 words and stems we find terms that anchor narrative in time and space, marking this as a category of words that frequently function deictically.96 Pennebaker has subdivided this category into three sub–categories: “Motion”, “Space”, and “Time”.97 Our LIWC analysis results mark relativity as an area in which there is striking consistency in usage rates across our corpora and with Pennebaker’s baseline. Compared to that baseline usage rate of 14.09%, the English corpus comes in at 14.26%, while the translated corpus is an almost identical 14.31%. The results are similarly consistent for the sub–categories, with the English–original and the translated texts never diverging by more than 0.19 percentage points.

These results suggest that the shareholder letters forming our corpora are similarly explicit in temporal and spatial terms to general usage of the English language as normed by Pennebaker. While the sub–categories of “Motion” and “Time” show a very slightly lower incidence than the baseline, LIWC analysis of our corpora shows their constituent texts to use spatial terminology with a higher than baseline frequency; however, the differences remain quite minor.

95 Campbell et al., columns BR-BU.

96 Campbell et al., columns CB-CF.

97 Pennebaker (2007) 5-6.

Perhaps the single most interesting result from the analysis of this category is that the texts from BCV show higher usage of “Relativity” words overall than the baseline and either corpus, and indeed come out higher than the corpora in all three sub–categories as well. In fact, the BCV texts show a higher–than–baseline incidence for two of the sub–categories, and come in at exactly the same 2.30% rate for motion terms as the baseline.

This shows that the BCV texts are measurably richer in spatio–temporal specificity than financial–sector shareholder letters in general, and even than the baseline English language. This is almost certainly directly related to efforts over the past decade by the head of the BCV translation team to ensure that BCV texts remain rooted to the physical world through astute application of deictic forms, that is, expressions containing words whose meaning is defined situationally in the context of the sentence and/or paragraph.

While there is of course no perfect correlation between Pennebaker’s “Relativity” category and the many different aspects of language structure that constitute deixis, this is certainly a category that highlights many terms that can be used to enhance the clarity of the connections between narrative elements in time and space, such as “ahead”, “apart”, “around”, “closely”, “down”, and “near”. One could even argue that these LIWC results suggest that the BCV team may have begun to over–compensate for the perceived paucity of deictic clarity in French source texts, given that the usage rates for words in this category are at or above those observed at baseline–or in English–language shareholder letters.

Personal Concerns

Pennebaker’s category of terms related to personal concerns comprises seven sub–categories: “Work”, “Achievement”, “Leisure”, “Home”, “Money”, “Religion”, and “Death”.98 In the context of financial–sector shareholder letters, we would probably expect to find a relatively high incidence of terms concerning “Work”, “Achievement”, and “Money”, while the remaining four sub–categories do not seem likely to be highly represented in the texts forming the subject of this research study. Indeed, the LIWC results confirm this supposition, showing that for the three likely categories our corpus texts show from four to twelve times the baseline usage rate, with the other four categories showing rates from a quarter to a half of the baseline rates. Even the texts from BCV conform almost perfectly to this schema, aligning with the corpus texts nearly to the hundredth of a point.

98 Pennebaker (2007) 5-6.

The conclusions to be drawn here are neither surprising nor profound: banking CEOs are addressing matters directly of concern to their institutions and shareholders, which are of course aligned with performance and monetary specificities, while the other sub–categories are only likely to be mentioned on rare occasions, such as a mention of “home” mortgages or prices, or the death of a prominent member of the Board of Directors. It seems quite unlikely that the terms in these sub–categories would represent fruitful avenues for further research regarding the translation of financial–sector shareholder letters, and we shall therefore move on to Pennebaker’s final categories: elements of spoken language, and punctuation.

Spoken Categories

The toolkit which Pennebaker’s LIWC program constitutes is not only able to analyze written texts from all manner of backgrounds, but also oral language–although it requires a transcription in order to perform its analysis. For that reason, Pennebaker has included categories of interest purely to the analysis of spoken language, and which are therefore nearly completely inapplicable to this study. The handful of shareholder “letters” which were in an interview format are the only example of putatively “spoken” language in our corpora, but given the highly formalized and edited nature of even those texts, oral fillers and pauses are of course essentially absent. Our LIWC results, as expected, show practically no incidence of any of the structures included in these categories, and so we will not bother to attempt an analysis in depth on a non–feature of the language in our corpora, preferring instead to simply acknowledge this as an aspect of the LIWC procedures that is of less use for this type of study.


Through the systematic examination of our LIWC results for the various categories and sub–categories identified and normed by Pennebaker, we have been able to show areas of language–both grammatical and thematic–in which the linguistic conventions in financial–sector shareholder letters are distinct from the baseline production in English across a wide variety of text types. We have also highlighted instances of divergence in the rates of use of types of words between shareholder letters originally written in English and those translated into English from the French language. Finally, we have compared these results whenever relevant to a LIWC analysis of shareholder letters from the same period from BCV, ground zero for this study as well as the one in whose footsteps it follows, in order to maximize the potential real–world payoff for the professional financial–sector translator. It is the sincere hope of this study’s author that this research will serve as the basis, not only for further, more qualitative analysis of certain language aspects foregrounded by this quantitative overview, but also for actual reflection, by working professionals, on how to apply the results of this survey to their day–to–day work on behalf of French–language financial institutions that have elected to include English as part of their international communication strategy. Working with the translation team at BCV since early 2012 has shown that team to be remarkably proactive in the application of concrete observations of best practices, norms, and compelling English prosody, as well as conventions of financial–sector language, leaving us with a strong measure of confidence that this research study will, like the one by Wells in whose footsteps it follows, have a real impact on professional practices in the real world.

As a concrete example of the potential applications of this research study in following up on work by Wells, we have selected two elements that the wide–ranging LIWC results flagged up for further scrutiny: “we” words and the shareholder letters by Jamie Dimon at JPMC. Using LIWC, we processed the 2014 shareholder letter from JPMC and the English translation of the 2014 letter from BCV. We immediately see that for the category of first person plural pronouns, the usage rates of 4.74% from JPMC and 3.95% from BCV indicate an LSM score between the two of .91. This means that the English translation from BCV is in quite high synchrony with the linguistic conventions of “we” word usage as exemplified by Jamie Dimon’s widely–read prose. Compared to the overall usage rate for English–language shareholder letters in this category, 5.05%, the JPMC letter generates an LSM of .97, indicating that Dimon is in almost perfect synchrony with the

