

both the lowest and highest percentages of longer–word use coming from that subset of our corpus.

Interestingly, the company already identified as the most voluminous communicator, JP Morgan Chase, also proves to be the entity that employs the lowest percentage of words greater than six letters in length. US Bancorp edges out the Royal Bank of Canada as the company using the highest percentage of longer words, with both of them above 34% in this category. All of the French–using institutions fall in the middle of the pack. Whether this is due to translators ‘smoothing out’ language use is not clear. Merely identifying the outliers in this category nevertheless begins the process of highlighting stylistic features of individual components of our cross–section of the financial industry, and thereby points to possible veins that might be mined for further, more qualitative, study at a later date.
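The longer–word measure discussed above can be illustrated with a minimal sketch. It assumes that a ‘word’ is any run of letters (optionally joined by an apostrophe) and that ‘longer’ means seven letters or more; the function name and the sample sentence are hypothetical, and LIWC's own tokenization may differ.

```python
import re

# Assumption: a word is a run of letters, optionally joined by apostrophes.
WORD_PATTERN = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")


def long_word_share(text: str, min_len: int = 7) -> float:
    """Return the percentage of words that have at least `min_len` letters."""
    words = WORD_PATTERN.findall(text)
    if not words:
        return 0.0
    # Apostrophes are not letters, so strip them before measuring length.
    lengths = [len(w.replace("'", "").replace("’", "")) for w in words]
    long_count = sum(1 for n in lengths if n >= min_len)
    return 100.0 * long_count / len(words)


# Hypothetical sample sentence, not taken from the corpus.
sample = "Our shareholders deserve transparent reporting on risk"
print(round(long_word_share(sample), 2))
```

Running the same function over each company's letters would give one percentage per institution, which is the kind of figure compared in this section.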

Pennebaker’s ‘Dictionary’

Over the decades he has spent refining LIWC and building his massive corpora, Pennebaker has built what he calls a ‘dictionary’. In fact, this is a compilation of words and word fragments that he has classified as representing his various categories. A word may be assigned to more than one category: in some cases it reflects several categories simultaneously, and in others, depending on context, it represents only one or some of them. One of the measures that LIWC tracks is the percentage of words in the analyzed texts that were matched to one or more of Pennebaker’s categories. Our results from our original–English texts show a marked correlation with Pennebaker’s observations: 83.29% of the words in these texts were present in his dictionary, compared with an average result of 82.42%.
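How a dictionary of words and word fragments yields a match percentage can be sketched as follows. The entries and category names below are hypothetical toy examples, not Pennebaker's actual dictionary, and the convention that a trailing asterisk marks a word fragment is an assumption made for illustration.

```python
import re

# Hypothetical toy dictionary; the real LIWC dictionary is far larger.
# Assumption: a trailing "*" marks a fragment matching any continuation.
TOY_DICTIONARY = {
    "we": {"function", "pronoun"},
    "the": {"function", "article"},
    "of": {"function", "preposition"},
    "grow*": {"achievement"},
    "risk*": {"risk"},
}


def categories_for(word: str, dictionary: dict) -> set:
    """Collect every category whose entry matches the word."""
    word = word.lower()
    found = set()
    for entry, cats in dictionary.items():
        if entry.endswith("*"):
            # Fragment entry: match on the prefix before the asterisk.
            if word.startswith(entry[:-1]):
                found |= cats
        elif word == entry:
            found |= cats
    return found


def dictionary_match_rate(text: str, dictionary: dict) -> float:
    """Percentage of words that fall into at least one category."""
    words = re.findall(r"[^\W\d_]+", text)
    if not words:
        return 0.0
    matched = sum(1 for w in words if categories_for(w, dictionary))
    return 100.0 * matched / len(words)


# Hypothetical sample sentence, not taken from the corpus.
sample = "We managed the risks of growing markets"
print(round(dictionary_match_rate(sample, TOY_DICTIONARY), 2))
```

In this toy case ‘managed’ and ‘markets’ find no entry, so five of the seven words count as matched. A single word can also carry several categories at once, as ‘we’ does here.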

Our originally French texts showed a somewhat lower correlation, only 78.41%. This may be attributed to a higher proportion of non–Anglophone proper nouns in texts describing people and places from outside the political and cultural boundaries of the English–speaking world. It may also suggest a slight degree of the ‘exotic’ from the original French that shows up even through the filter of translation.

At the high end, Wells Fargo is the outlier, with 85.02% of the words from those texts finding a match in Pennebaker’s dictionary. The lowest correlation, as we might expect, is from a French–language institution, the Compagnie Financière Tradition, with only 74.55% of its words finding dictionary matches. This spread of more than ten percentage points indicates that while, on average, there may not be a huge disparity between the


French–original and English–original texts, when we zoom in we still find quite a bit of relief in the numbers. Surprisingly, the letters from BCV score 76.13%, suggesting that the in–house language at BCV is less well aligned with that of the English language as used in day–to–day life than it is with other financial reporting texts.

Going forward in this research study, we will examine each of the categories in Pennebaker’s dictionary in order to render our analysis more systematic and less prone to cherry–picking. Some of the categories will of course provide more interesting or compelling results than others, but we hope that the wide–angle approach provided by adhering to Pennebaker’s tried and tested methodology will ultimately point to some surprises that a more targeted analysis of likely phenomena would have missed entirely. And, of course, such a targeted, quantitative analysis can always be performed for any language–use details that this broad–based approach might highlight as promising avenues for further research.

Function words

As detailed earlier, Pennebaker’s research has tended to suggest that the relative frequency of use of what he calls ‘function’, or ‘stealth’, words, that is to say words that provide linguistic framework rather than specific content details, is an excellent tool for the analysis of the verbal or written production of a person (or of an entity). The suite of analytical approaches that this opens has resulted in much of the attention to Pennebaker in the popular press, especially regarding his analyses of pronoun usage by US presidents and the intimations regarding their personalities that those studies conveyed – for one such article, see Ben Zimmer’s piece in the New York Times from August 2011.⁶⁷ Pennebaker’s dictionary includes in the “Function words” category all manner of pronouns, prepositions, articles, and relatively short and commonly–used adjectives and adverbs, as well as a variety of other words and terms.⁶⁸ It is immediately apparent from analysis of our corpus that the rate of function–word use in financial–sector shareholder letters is lower than the average baseline usage determined by Pennebaker. He has established the norm in this category as 54.85%, which is a surprisingly high proportion of words that do not relate specific details, but instead form

67 Zimmer, Ben, "The Power of Pronouns", New York: The New York Times (26 August 2011).

68 Campbell, Sherlock, et al., LIWC2007 Dictionary Poster, Austin, Texas: LIWC.net, 2007.


the framework inside which specific content words are arranged. Neither the English–original texts nor the translated texts in our corpus attain even 50% function–word usage, let alone the nearly 55% that Pennebaker has observed as an average result. Our English texts come in at 49.67% function words, roughly five percentage points (just under ten percent in relative terms) below Pennebaker’s average, but with an LSM score of .95 that indicates high synchrony. The originally French texts show 46.98% function–word usage, nearly eight percentage points (about 14% in relative terms) below the average, with an LSM of .92, lower but nevertheless still in the high range for synchrony.
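The arithmetic behind these comparisons can be checked with a short sketch. It assumes the form of the language style matching (LSM) formula commonly cited in the literature, one minus the absolute difference of two usage rates divided by their sum plus a small constant. The formula itself is not given in this section, so it is an assumption here, as are the function names.

```python
def lsm(rate_a: float, rate_b: float) -> float:
    """Language style matching score for two usage rates (in percent).

    Assumed formula: 1 - |a - b| / (a + b + 0.0001); the small constant
    only guards against division by zero.
    """
    return 1.0 - abs(rate_a - rate_b) / (rate_a + rate_b + 0.0001)


def relative_gap(rate: float, baseline: float) -> float:
    """How far `rate` falls below `baseline`, as a percentage of baseline."""
    return 100.0 * (baseline - rate) / baseline


BASELINE = 54.85  # Pennebaker's function-word norm, as quoted in the text

for label, rate in [("English originals", 49.67),
                    ("Translated from French", 46.98)]:
    print(label,
          "| LSM:", round(lsm(rate, BASELINE), 2),
          "| points below baseline:", round(BASELINE - rate, 2),
          "| relative gap (%):", round(relative_gap(rate, BASELINE), 1))
```

Under this assumed formula the two corpus averages reproduce the LSM scores of .95 and .92 quoted above. The sketch also separates a gap in percentage points from a relative gap, two readings that are easily conflated.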

Graph 2: Usage rates for function words

This represents the first indication from our corpus comparison study that a measurable divergence exists, at a linguistically fundamental level, between texts that were originally drafted in English and others that represent the product of a process of translation from French. If we accept Pennebaker’s supposition that function words really are key to the way different people and actors communicate, then this represents potentially fertile ground for further study. The fact that the texts from BCV come in even lower than the average for the translated texts, at 44.52% for an LSM of .77, further underscores this category as representing a tantalizing subset of words for further analysis, with the caveat that the selection of words in this category may also require some further scrutiny in order to determine if there are in fact words commonly serving as “function” words in shareholder letters that for some reason have not been included in Pennebaker’s dictionary entry for this category. It is to be hoped that the analyses for specific sub–

[Graph 2: chart of function–word usage rates in percent (axis 0 to 60) for Baseline, English, Translated, and BCV]