1. Goals

Most studies in phonological typology are based on segment inventories, with the hypothesis that observed regularities result from general constraints (articulatory ease, perceptual salience, etc.). This relies on the implicit assumption that all segments in an inventory are equally meaningful. However, segments form a system because they are engaged in phonological contrasts with each other, and, as noted in [1], 'some contrasts between the phonemes in a system apparently do more of [the] job than others', which calls the supposed equivalence into question. Recent work has highlighted the importance of frequency of use in characterizing the phonology of languages [2]. Building on this idea, we suggest that the notion of functional load (FL; see [1] and [3], among others) would enrich the usual representation of phonological inventories. In this study, we propose a corpus-based, cross-linguistic approach to evaluate the relevance of FL for characterizing vowel systems. Our main objective is to assess whether reanalyzing vowel systems in terms of FL validates the explanations based on inventories or reveals new constraints.

2. Methodology

Data consisted of text corpora (including hundreds of thousands of word tokens) in twelve languages – Amharic, Bulgarian, Spanish, English, Estonian, Finnish, French, German, Swahili, Tagalog, Turkish, and Zulu – mainly from the Leipzig Corpora Collection [4]. FL was computed for each language following the algorithm proposed in [5]: a) Using the frequencies of the wordforms, we evaluated the entropy of their distribution. b) We characterized the FL of a contrast between two vowels as the relative loss of entropy induced by their artificial merging into a single phoneme. c) This procedure was repeated for each vowel pair, and FL was normalized with respect to the sum of the FL borne by vocalic contrasts.
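Steps a) to c) can be illustrated with a minimal sketch. It assumes a word-level (unigram) reading of the procedure, in which FL(x, y) = (H - H_xy) / H, where H is the entropy of the wordform distribution and H_xy is the entropy after merging vowels x and y. The function names, the representation of wordforms as tuples of segments, and the toy lexicon are illustrative assumptions, not the authors' implementation or data.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a wordform frequency distribution."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


def merge(counts, x, y):
    """Replace vowel y by vowel x in every wordform and pool the frequencies
    of wordforms that become identical (the artificial merger of step b)."""
    merged = Counter()
    for word, c in counts.items():
        merged[tuple(x if seg == y else seg for seg in word)] += c
    return merged


def functional_load(counts, x, y):
    """Relative loss of entropy induced by merging x and y."""
    h = entropy(counts)
    return (h - entropy(merge(counts, x, y))) / h


def normalized_vowel_fl(counts, vowels):
    """FL of every vowel pair, normalized by the sum over all vowel pairs (step c)."""
    raw = {
        (x, y): functional_load(counts, x, y)
        for x, y in combinations(sorted(vowels), 2)
    }
    total = sum(raw.values())
    return {pair: (fl / total if total else 0.0) for pair, fl in raw.items()}


# Hypothetical toy lexicon: wordforms as tuples of segments, with made-up counts.
toy = Counter({("p", "a", "t"): 4, ("p", "i", "t"): 4, ("m", "u", "n"): 2})
fl = normalized_vowel_fl(toy, {"a", "i", "u"})
# Only /a/~/i/ distinguishes a minimal pair here, so it bears all of the vocalic FL.
```

In this toy lexicon, merging /a/ and /u/ or /i/ and /u/ collapses no wordforms, so those contrasts receive a load of zero. On a real corpus, the counts would come from wordform frequencies in phonemic transcription.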

3. Results (overview)

Figure 1a gives the example of Amharic. Each edge indicates the FL of the contrast between the two vowels it connects; for instance, the /e/~/u/ contrast accounted for 36% of the vocalic FL. Figure 1b displays the distribution of contrasts ranked by decreasing FL. The twelve languages exhibited a similarly uneven use of the vowel contrasts available in their inventories. From this paradigmatic viewpoint, they seemed far from an optimal use of their phonological systems, even if an optimization in terms of syntagmatic trajectories (involving coarticulation and phonotactics) cannot be ruled out. Furthermore, the cross-linguistic comparison revealed no uniform tendency to favor maximal contrasts (such as /a/~/i/ and /a/~/u/). Instead, it showed that several languages with similar inventories may exhibit different contrast patterns (see Figure 2). These results and their interpretation in terms of the organization of vowel systems will be discussed during the conference, along with methodological issues.

Figure 1. a) Representation of the Amharic vowel system. For each vowel pair, the edge thickness is proportional to the FL. Numerical values are given when greater than 1%. b) Distribution of the vowel contrasts ranked by decreasing FL.

Figure 2. Comparison of four languages with vowel systems akin to /a e i o u/. For the sake of legibility, only edges corresponding to FL greater than 10% are displayed.


[Figure panels: a) Vowel system; b) Functional loads of the vowel contrasts. Axis label: Vowel contrasts; scale ticks from 0 to 40.]


Accepted for poster presentation at the 86th Meeting of the LSA

François Pellegrino, Egidio Marsico, and Christophe Coupé
Laboratoire Dynamique Du Langage, CNRS - Université de Lyon


4. References

[1] Hockett, C.F. 1966. The quantification of functional load: A linguistic problem, Report Number RM-5168-PR, Rand Corp. Santa Monica.

[2] Bybee, J. 2001. Phonology and Language Use, Cambridge University Press: Cambridge.

[3] Martinet, A. 1955. Économie des changements phonétiques. Traité de phonologie diachronique, Francke: Berne.

[4] Quasthoff, U., M. Richter, and C. Biemann. 2006. “Corpus Portal for Search in Monolingual Corpora”, Proc. of the 5th International Conference on Language Resources and Evaluation (LREC), Genoa, pp. 1799-1802.

[5] Surendran, D., and P. Niyogi. 2003. Measuring the Usefulness (Functional Load) of Phonological Contrasts, Technical Report TR-2003-12, Department of Computer Science, University of Chicago, USA.
