Dynamics, causation, duration in the predicate-argument structure of verbs : a computational approach based on parallel corpora


Thesis

Reference

Dynamics, causation, duration in the predicate-argument structure of verbs : a computational approach based on parallel corpora

SAMARDZIC, Tanja

Abstract

This dissertation addresses systematic variation in the use of verbs where two syntactically different sentences are used to express the same event. We show that the frequency distribution of the syntactic alternants depends on the meaning of the verbs: force dynamics in light verb constructions, external causation in lexical causatives, and duration in verb aspect classes. Both intra-linguistic and cross-linguistic variation in morphological and syntactic realisations of semantically equivalent items are taken into account by analysing data extracted from parallel corpora. The three semantic properties are empirically induced on the basis of the observations automatically extracted from large parallel corpora, which are automatically parsed and word-aligned. The generalisations are learned from the extracted data automatically using statistical inferences and machine learning techniques. The accuracy of the predictions made on the basis of the generalisations is assessed experimentally on an independent set of test instances.

SAMARDZIC, Tanja. Dynamics, causation, duration in the predicate-argument structure of verbs : a computational approach based on parallel corpora. Thèse de doctorat : Univ. Genève, 2013, no. L. 796

URN : urn:nbn:ch:unige-384224

DOI : 10.13097/archive-ouverte/unige:38422

Available at:

http://archive-ouverte.unige.ch/unige:38422

Disclaimer: layout of this document may differ from the published version.


Dynamics, causation, duration in the predicate-argument structure of verbs:

A computational approach based on parallel corpora

Tanja Samardžić June 30, 2014

Supervisor: Prof. Paola Merlo


The Faculty of Letters (Faculté des lettres), on the recommendation of a committee composed of Professors Jacques MOESCHLER, president of the jury; Paola MERLO, thesis supervisor; Balthasar BICKEL (University of Zurich); Jonas KUHN (University of Stuttgart); Martha PALMER (University of Colorado, Boulder), authorises the printing of the present thesis, without expressing any opinion on the propositions stated in it.

Geneva, 23 December 2013

The Dean: Nicolas ZUFFEREY

Thesis No. 796


This dissertation addresses systematic variation in the use of verbs where two syntactically different sentences are used to express the same event, such as the alternations in the use of decide, break, and push shown in (0.1-0.3). We study the frequency distribution of the syntactic alternants and show that the distributional patterns originate in the meaning of the verbs.

(0.1) a. Mary took/made a decision. b. Mary decided (something).

(0.2) a. Adam broke the laptop. b. The laptop broke.

(0.3) a. John pushed the cart. b. John pushed the cart for some time.

Both intra-linguistic and cross-linguistic variation in morphological and syntactic realisations of semantically equivalent items are taken into account by analysing data extracted from parallel corpora. The dissertation includes three case studies: light verb constructions (0.1) in English and German, lexical causatives (0.2) also in English and German, and verb aspect classes (0.3) in English and Serbian.

The core question regarding light verb constructions is whether verbs such as take and make, when used in expressions such as (0.1a), turn into function words, losing their lexical meaning. Arguments for both a positive and a negative answer have been put forward in the literature. The results of our study suggest that light verbs retain at least the force-dynamic semantics of their lexical counterparts: the inward dynamics in verbs such as take and the outward dynamics in verbs such as make. The inward dynamics results in a cross-linguistic preference for compact grammatical forms (single verbs) and the outward dynamics results in a preference for analytical forms (constructions).


of the study suggest that the property which underlies the variation is the likelihood of external causation. Events described by the alternating verbs are distributed on a scale of increasing likelihood for an external causer to occur. The verbs which alternate in some but not in other languages are those verbs which describe events on the two extremes of the scale. The preference for one alternant is so strong in these verbs that the other alternant rarely occurs, which is why it is not attested in some languages. There are two ways in which the likelihood of external causation can be empirically assessed: a) by observing the typological distribution of causative vs. anticausative morphological marking across a wide range of languages and b) by observing the frequency distribution of transitive vs. intransitive uses of the alternating verbs in a corpus of a single language.

Our study shows that these two measures are correlated. By applying the corpus-based measure, the position on the scale of likelihood of external causation can be determined automatically for a wide range of verbs.
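As an illustration only, the following sketch shows one way in which the agreement between a corpus-based and a typology-based measure could be checked. All verb names, counts, and scores are hypothetical placeholders. The smoothed log-ratio score and the use of Spearman's rank correlation are assumptions made for this example and are not necessarily the measures used in the dissertation.

```python
from math import log


def spontaneity_score(intransitive: int, transitive: int) -> float:
    """Log-ratio of intransitive to transitive uses, with add-one smoothing.

    Illustrative assumption: higher values mean that the event is more often
    expressed without an external causer.
    """
    return log((intransitive + 1) / (transitive + 1))


def ranks(values):
    """Rank values from 1 (smallest); tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical corpus counts per verb: (intransitive uses, transitive uses).
corpus_counts = {
    "verb_a": (80, 20),
    "verb_b": (40, 60),
    "verb_c": (30, 70),
    "verb_d": (2, 98),
}
# Hypothetical typology-based scores for the same verbs.
typology_scores = {"verb_a": 3.0, "verb_b": 1.0, "verb_c": 1.2, "verb_d": 0.1}

verbs = sorted(corpus_counts)
corpus_measure = [spontaneity_score(*corpus_counts[v]) for v in verbs]
typology_measure = [typology_scores[v] for v in verbs]
rho = spearman(corpus_measure, typology_measure)
```

A value of rho close to 1 would indicate that the two measures order the verbs in nearly the same way; with the toy data above, the two rankings differ only in the order of verb_b and verb_c.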

The subject of the third case study is the relationship between two temporal properties encoded by the grammatical category of verb aspect: event duration and temporal boundedness. The study shows that these two properties interact in a complex but predictable way, giving rise to the observed variation in morphosyntactic realisations of verbs. English native speakers’ intuitions about the possible duration of events described by verbs (short vs. long) are predicted from the patterns of formal aspect marking in the equivalent Serbian verbs. The accuracy of the prediction based on the bilingual model is superior to that of the best performing monolingual model.

One of the main contributions of the dissertation is a novel experimental methodology, which relies on automatic processing of parallel corpora and statistical inference. The three properties of the events described by verbs (dynamics orientation, the likelihood of external causation, duration) are empirically induced on the basis of the observations automatically extracted from large parallel corpora (containing up to over a million sentences per language), which are automatically parsed and word-aligned. The generalisations are learned from the extracted data automatically using statistical inferences and machine learning techniques. The accuracy of the predictions made on the basis of the generalisations is assessed experimentally on an independent set of test instances.


This thesis concerns systematic variation in the use of verbs where two sentences which differ in their syntactic structure can be used to express the same event. The variation in question is shown in examples (0.4-0.6). We study the frequency distribution of the syntactic alternants, showing that the source of the distributional patterns lies in the semantic content of the verbs.

(0.4) a. Mary took/made a decision.
         Marie pris/fait une décision
         Marie a pris une décision.
      b. Mary decided (something).
         Marie décidé (quelque chose)
         Marie a décidé (quelque chose).

(0.5) a. Adam broke the laptop.
         Adam cassé le ordinateur
         Adam a cassé l’ordinateur.
      b. The laptop broke.
         le ordinateur cassé
         L’ordinateur s’est cassé.

(0.6) a. John pushed the cart.
         Jean poussé le chariot
         Jean a poussé le chariot.
      b. John pushed the cart for some time.
         Jean poussait le chariot pendant quelque temps.

Both intra-linguistic variation and variation across languages in the morphological and syntactic realisations of semantically equivalent items are taken into account. This is done by analysing data extracted from parallel corpora. The thesis contains three case studies: light verb constructions (0.4) in English and German, lexical causative verbs (0.5), also in English and German, and verb aspect classes (0.6) in English and Serbian.

The central question regarding light verb constructions is whether verbs such as take and make, when used in expressions such as (0.4a), become function words and thereby lose their lexical content entirely. Arguments for both a positive and a negative answer have been put forward in the literature.

The results of our study suggest that light verbs retain at least the force-dynamic semantics which belongs to the content of the equivalent lexical verbs: the dynamics oriented towards the agent of the event (inward) in verbs such as take and the dynamics oriented towards other participants in the event (outward) in verbs such as make. The inward dynamics results in a cross-linguistic preference for compact realisations (single verbs), whereas the outward dynamics results in a preference for analytical forms (constructions).

The study of lexical causative verbs (0.5) concerns cross-linguistic variation in the participation of these verbs in the causative alternation: why do certain verbs in certain languages not enter the causative alternation, while their counterparts in other languages do? The results of the study suggest that the semantic property which underlies the variation is the likelihood of external causation of the event described by a verb. The events described by lexical causative verbs are placed along a scale of increasing likelihood of external causation. The verbs which enter the alternation in one language, but not in other languages, are the verbs describing events which lie at the two extremes of the scale. These verbs have a preference for one


there are two empirical ways of estimating the likelihood of external causation: a) by observing the typological distribution of causative vs. anticausative morphemes in the structure of lexical causative verbs across a large number of languages and b) by observing the frequency distribution of transitive vs. intransitive realisations of the verbs in a corpus of a single language. Our study shows that these two measures are correlated. By applying the corpus-based measure, the position on the scale of external causation can be determined automatically for a large number of verbs.

The subject of the third case study is the relationship between the two temporal properties of events encoded by the grammatical category of verb aspect: duration and temporal boundedness. The study shows that these two properties interact in a complex but predictable way, which gives rise to the observed variation in the morphosyntactic realisations of verbs. English native speakers’ intuitions about the possible duration of an event described by a verb (short vs. long) can be predicted on the basis of the formal marking of verb aspect in the corresponding Serbian verbs. The accuracy of the predictions based on the bilingual model is superior to the performance of the best monolingual model.

One of the main contributions of this thesis is the new experimental methodology, which is based on automatic processing of parallel corpora and on statistical inference. The three semantic properties of the events described by verbs (dynamics, the likelihood of external causation, duration) are inferred empirically from observations automatically extracted from large parallel corpora (containing up to over a million sentences per language), which are automatically parsed and aligned. The generalisations are acquired from corpus data automatically, using statistical inference and machine learning techniques. The accuracy of the predictions made on the basis of the generalisations is estimated experimentally using a separate sample of test data.


This dissertation has greatly benefited from the help and support of numerous friends and colleagues and I wish to express my gratitude to all of them here.

First and foremost, I would like to thank my supervisor, Paola Merlo, for the commitment with which she has supervised this dissertation, for generously sharing her knowledge and experience in countless hours spent discussing my work and reading my pages, for treating my ideas with care and attention, and for showing me that I can do better than I thought I could.

I am most thankful to Vesna Polovina and Jacques Mœschler, who made it possible for me to move from Belgrade to Geneva and who have discreetly looked after me throughout my studies.

I thank Balthasar Bickel, Jonas Kuhn, and Martha Palmer, who kindly agreed to be members of the defence committee, and Jacques Mœschler, who agreed to be the president of the jury.

I have gathered much of the knowledge and skills necessary for carrying out this research in the discussions and joint work with Boban Arsenijević, Effi Georgala, Andrea Gesmundo, Kristina Gulordava, Maja Miličević, Lonneke van der Plas, Marko Simonović, and Balša Stipčević. I am thankful for the time they spent working and thinking with me.

I appreciate very much the assistance of James Henderson, Jonas Kuhn, and Gerlof Bouma, who shared their data with me, allowing me to spend less time processing corpora, so I could spend more time thinking about the experiments.


kindness and support. On various occasions, I felt lucky to be able to talk to Tijana Ašić, Lena Baunaz, Anamaria Bentea, Frédérique Berthelot, Giuliano Bocci, Eva Capitao, Maja Djukanović, Nikhil Garg, Jean-Philippe Goldman, Asheesh Gulati, Tabea Ihsane, Borko Kovačević, Joel Lang, Antonio Leoni de León, Gabriele Musillo, Goljihan Kashaeva, Alexis Kauffmann, Christopher Laenzlinger, Jasmina Moskovljević Popović, Luka Nerima, Natalija Panić Cerovski, Genoveva Puskas, Lorenza Russo, Yves Scherrer, Violeta Seretan, Gabi Soare, Živka Stojiljković, Eric Wehrli, and Richard Zimmermann.

I would also like to thank Pernilla Danielsson, who helped me start doing computational linguistics while I was a visiting student at the Centre for Corpus Research at the University of Birmingham.

In the end, I would like to express my gratitude to Fabio, who has stayed by my side despite all the evenings, weekends, and holidays dedicated to this dissertation.


1. Introduction
1.1. Grammatically relevant components of the meaning of verbs
1.2. Natural language processing in linguistic research
1.3. Using parallel corpora to study language variation
1.4. The overview of the dissertation

2. Overview of the literature
2.1. Theoretical approaches to the argument structure
2.1.1. The relational meaning of verbs
2.1.2. Atomic approach to the predicate-argument structure
2.1.3. Decomposing semantic roles into clusters of features
  Proto-roles
  The Theta System
  Summary
2.1.4. Decomposing the meaning of verbs into multiple predicates
  Aspectual event analysis
  Causal event analysis
2.1.5. Summary
2.2. Verb classes and specialised lexicons
2.2.1. Syntactic approach to verb classification
2.2.2. Manually annotated lexical resources
  FrameNet
  The Proposition Bank (PropBank)
  VerbNet
  Comparing the resources
2.3. Automatic approaches to the predicate-argument structure
2.3.1. Early analyses
2.3.2. Semantic role labelling
  Standard semantic role labelling
  Joint and unsupervised learning
2.3.3. Automatic verb classification
2.4. Summary

3. Using parallel corpora for linguistic research: rationale and methodology
3.1. Cross-linguistic variation and parallel corpora
3.1.1. Instance-level microvariation
3.1.2. Translators’ choice vs. structural variation
3.2. Parallel corpora in natural language processing
3.2.1. Automatic word alignment
3.2.2. Using automatic word alignment in natural language processing
3.3. Statistical analysis
3.3.1. Summary tables
3.3.2. Statistical inference and modelling
3.3.3. Bayesian modelling
3.4. Machine learning techniques
3.4.1. Supervised learning
3.4.2. Unsupervised learning
3.4.3. Learning with Bayesian Networks
3.4.4. Evaluation of predictions
3.5. Summary

4. Force dynamics schemata and cross-linguistic alignment of light verb constructions
4.1. Introduction
4.2. Theoretical background
4.2.1. Light verb constructions as complex predicates
4.2.2. The diversity of light verb constructions
4.3. Experiments
4.3.1. Experiment 1: Manual alignment of light verb constructions in a parallel corpus
  Materials and methods
  Results and discussion
4.3.2. Experiment 2: Automatic alignment of light verb constructions in a parallel corpus
4.4. General discussion
4.4.1. Two force dynamics schemata in light verbs
4.4.2. Relevance of the findings to natural language processing
4.5. Related work
4.6. Summary of contributions

5. Likelihood of external causation and the cross-linguistic variation in lexical causatives
5.1. Introduction
5.2. Theoretical accounts of lexical causatives
5.2.1. Externally and internally caused events
5.2.2. Two or three classes of verb roots?
5.2.3. The scale of spontaneous occurrence
5.3. Experiments
5.3.1. Experiment 1: Corpus-based validation of the scale of spontaneous occurrence
5.3.2. Experiment 2: Scaling up
5.3.3. Experiment 3: Spontaneity and cross-linguistic variation
5.3.4. Experiment 4: Learning spontaneity with a probabilistic model
  The model
  Experimental evaluation
5.4.1. The scale of external causation and the classes of verbs
5.4.2. Cross-linguistic variation in English and German

6. Unlexicalised learning of event duration using parallel corpora
6.1. Introduction
6.2. Theoretical background
6.2.1. Aspectual classes of verbs
6.2.2. Observable traits of verb aspect
6.2.3. Aspect encoding in the morphology of Serbian verbs
6.3. A quantitative representation of aspect based on cross-linguistic data
6.3.1. Corpus and processing
6.3.2. Manual aspect classification in Serbian
6.3.3. Morphological attributes
6.3.4. Numerical values of aspect attributes
6.4. Experiment: Learning event duration with a statistical model
6.4.1. The model
  The Bayesian net classifier
6.4.2. Experimental evaluation
6.5.1. Aspectual classes

7. Conclusion
7.1. Theoretical contribution
7.2. Methodological contribution
7.3. Directions for future work

Bibliography

A. Light verb constructions data
A.1. Word alignment of the constructions with ’take’
A.2. Word alignment of the constructions with ’make’
A.3. Word alignments of regular constructions
B. Corpus counts and measures for lexical causatives
C. Verb aspect and event duration data


1.1. Cross-linguistic mapping between morphosyntactic categories
1.2. Cross-linguistic mapping between morphosyntactic categories
3.1. Word alignment in a parallel corpus
3.2. Probability distributions of the morphological forms and syntactic realisations of the example instances
3.3. Probability distributions of the example verbs and their frequency
3.4. A general graphical representation of the normal distribution
3.5. An example of a decision tree
3.6. An example of a Bayesian network
4.1. A schematic representation of the structure of a light verb construction compared with a typical verb phrase
4.2. Constructions with vague action verbs
4.3. True light verb constructions
4.4. Extracting verb-noun combinations
4.5. The difference in automatic alignment depending on the direction
4.6. The distribution of nominal complements in constructions with take
4.7. The distribution of nominal complements in constructions with make
4.8. The distribution of nominal complements in regular constructions
4.9. The difference in automatic alignment depending on the complement frequency
5.1. The correlation between the rankings of verbs on the scale of spontaneous occurrence
5.2. Density distribution of the Sp value in the two samples of verbs
5.3. Collecting data on lexical causatives
5.4. Density distribution of the Sp value over instances of 354 verbs
5.5. Joint distribution of verb instances in the parallel corpus
5.6. Bayesian net model for learning spontaneity
5.7. The interaction of the factors involved in the causative alternation
6.1. Traditional lexical verb aspect classes, known as Vendler’s classes
6.2. Serbian verb structure summary
6.3. Bayesian net model for learning event duration


2.1. Frame elements for the verb achieve
2.2. Some combinations of frame elements for the verb achieve
2.3. The PropBank lexicon entry for the verb pay
2.4. The VerbNet entry for the class Approve-77
3.1. Examples of instance variables
3.2. Examples of type variables
3.3. A simple contingency table summarising the instance variables
3.4. An example of data summary in Bayesian modelling
3.5. An example of a data record suitable for supervised machine learning
3.6. Grouping values for training a decision tree
3.7. An example of a data record suitable for supervised machine learning
3.8. An example of probability estimation using the expectation-maximisation algorithm
3.9. Precision and recall matrix
4.1. Types of mapping between English constructions and their translation equivalents in German
4.2. Well-aligned instances of light verb constructions
4.3. The three types of constructions partitioned by the frequency of the complements in the sample
4.4. Counts and percentages of well-aligned instances in relation with the frequency of the complements in the sample
5.1. Cross-linguistic variation in lexical causatives
5.2. Morphological marking of cause-unspecified verbs
5.3. Morphological marking across languages
5.4. An example of an extracted instance of an English alternating verb and its translation to German
5.5. Examples of parallel instances of lexical causatives
5.6. Contingency tables for the English and German forms in different samples of parallel instances
5.7. Examples of the cross-linguistic input data
5.8. Agreement between corpus-based and typology-based classification of verbs. The classes are denoted in the following way: a=anticausative (internally caused), c=causative (externally caused), m=cause-unspecified
5.9. Confusion matrix for monolingual and cross-linguistic classification on 2 classes
5.10. Confusion matrix for monolingual and cross-linguistic classification on 3 classes
6.1. A relationship between English verb tenses and aspectual classes
6.2. Serbian lexical derivations
6.3. Serbian lexical derivations with a bare perfective
6.4. An illustration of the MULTEXT-East corpus
6.5. A sample of the verb aspect data set
6.6. A sample of the two versions of data
6.7. Results of machine learning experiments


Languages use different means to express the same content. Variation in the choice of lexical items or syntactic constructions is possible without changing the meaning of a sentence. For example, any of the sentences in (1.1a-c) can be used to express the same event. Similarly, the meaning of the sentences in (1.2a-b), (1.3a-b), and (1.4a-b) can be considered as equivalent. The sentences in (1.1) illustrate the variation in the choice of lexical items, while the sentences in (1.2-1.4) show that the syntactic structure of a sentence can be changed without changing the meaning. In both cases, the variation is limited to the options which are provided by the rules of grammar. In order to be exchangeable, linguistic units have to share certain properties. Identifying the properties shared by different formal expressions of semantically equivalent units is, thus, a way of identifying abstract elements of the structure of language.

(1.1) a. Mary drank a cup of tea.

b. Mary took a cup of tea.

c. Mary had a cup of tea.

d. Mary had a cup of coffee.

As illustrated in (1.1d), verbs allow alternative expressions more easily than other categories. Replacing the noun tea, for example, by coffee changes the meaning of the sentence so that (1.1d) can no longer be considered equivalent to (1.1a-c). The property which allows verbs to alternate more easily than other categories is their relational meaning. In the given examples, the verbs drink, take, and have relate the nouns Mary and tea. The relational meaning of a verb is commonly represented as the predicate-argument structure, where a verb is considered as a predicate which takes other constituents of a sentence as its arguments. The number and the type of the arguments that a verb takes in a particular instance are partially determined by the verb’s meaning and partially by the contextual and pragmatic factors involved in the instance.

(1.2) a. Mary laughed.

b. Mary had a laugh.

(1.3) a. Adam broke the laptop.

b. The laptop broke.

(1.4) a. John pushed the cart.

b. John pushed the cart for some time.

In this dissertation, we study systematic variation in the use of verbs involving alternation in the syntactic structure, as in (1.2-1.4). We study frequency distributions of the syntactic alternants as an observable indicator of the underlying meaning of verbs with the aim of discovering the components of verbs’ meaning which are relevant for their predicate-argument structure and for the grammar of language.

1.1. Grammatically relevant components of the meaning of verbs

As argued by Pesetsky (1995) and later by Levin and Rappaport Hovav (2005), only some of the potential components of the meaning of verbs are grammatically relevant. For example, the distinction between verbs describing loud speaking (e.g. shout) and verbs describing quiet speaking (e.g. whisper) is grammatically irrelevant in the sense that it does not influence any particular syntactic behaviour of these verbs (Pesetsky 1995).

Contrary to this, the distinction between verbs which describe primarily the manner of speaking (whisper) and verbs which describe primarily the content of speaking (e.g. say) is grammatically relevant in the sense that the latter group of verbs can be used without the complementizer that, while the former cannot. Along the same lines, Levin and Rappaport Hovav (2005) argue that the quality of sound described by verbs of sound emission (volume, pitch, resonance, duration) does not influence their syntactic behaviour. Syntactic behaviour of these verbs is, in fact, influenced by the source of the sound: verbs which describe sound emission with the source of the sound external to the emitting object (e.g. rattle) can alternate between transitive and intransitive uses (in a similar fashion as break in (1.3)), while verbs which describe sound emission with the source of the sound internal to the emitting object (e.g. rumble) do not alternate.

Our research continues in the same direction, investigating other semantic properties of verbs which are potentially relevant for the grammar. We take into consideration a wide range of verbs and their syntactic realisations. If a particular observed distribution of syntactic alternants can be predicted from a semantic property of a verb, then we can say that this property underlies the distribution. If a semantic property underlies a frequency distribution of syntactic alternants, then this property can be considered grammatically relevant.
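To make the notion of a frequency distribution of syntactic alternants concrete, the following minimal sketch computes, for each verb, the relative frequency of each alternant from a list of extracted instances. The instances and the alternant labels are hypothetical and serve only as an illustration.

```python
from collections import Counter, defaultdict


def alternant_distribution(instances):
    """Map each verb to the relative frequency of each syntactic alternant."""
    counts = defaultdict(Counter)
    for verb, alternant in instances:
        counts[verb][alternant] += 1
    return {
        verb: {alt: n / sum(c.values()) for alt, n in c.items()}
        for verb, c in counts.items()
    }


# Hypothetical extracted instances: (verb lemma, observed alternant).
instances = [
    ("break", "transitive"),
    ("break", "intransitive"),
    ("break", "transitive"),
    ("break", "transitive"),
    ("laugh", "single_verb"),
    ("laugh", "light_verb_construction"),
]
dist = alternant_distribution(instances)
```

A semantic property would then count as underlying the distribution if the per-verb proportions in dist can be predicted from it.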

We focus on three kinds of alternations in realisation of verbs’ arguments. First, by studying the alternation between light verb constructions (1.2b) and the corresponding single verbs (1.2a), we address the issue of whether certain lexical content, in the form of the predicate-argument structure, is present in the verbs which are used as light verbs, such as have in (1.2b). Determining whether some components of meaning are present in light verbs is important for understanding whether the choice of the light verb in a construction is arbitrary or constrained by the meaning of light verbs. Second, we study the alternation in the use of lexical causatives such as break in (1.3). Lexical causatives are the verbs which can be used in two ways: as causative (1.3a), where the agent or the causer of the event described by the verb is realised as a constituent of a sentence, and as anticausative (1.3b), where the agent or the causer is not syntactically realised. Many verbs across many different languages can alternate in this way. However, the fact that some verbs in some languages do not alternate raises the question which is addressed in this dissertation: What property of verbs is responsible for allowing or blocking the alternation? Finally, we study the factors involved in the interpretation of temporal properties of events described by verbs. As illustrated in (1.4), the temporal properties of events described by verbs play a role in syntactic structuring of a sentence.

For example, the event of pushing is interpreted as short by default (1.4a). With the appropriate temporal modifier, as in (1.4b), it can also be interpreted as lasting for a longer time. In contrast to this, other verbs, such as tick, stay, walk, describe events which are understood as lasting for some time by default. We look for observable indicators in the use of a wide range of verbs pointing to the event duration which is implicit to their meaning.

1.2. Natural language processing in linguistic research

The approach that we take in addressing the defined questions is empirical and computational. We take advantage of automatic language processing to collect and analyse large data sets, applying established statistical approaches to infer elements of linguistic structure from the patterns in the observed variation. The tools, methods, and resources which we use were originally developed for practical natural language processing tasks which fall within the domain of computational linguistics. The developments in automatic language processing are directly related to the increasing demand for automatic analysis of large amounts of linguistic content which is now freely available (mostly through the Internet). Natural language processing tasks include automatic information extraction, question answering, translation etc. Despite the fact that it provides extremely rich resources for empirical linguistic investigations, natural language processing technology has rarely been used for theoretical linguistic research. On the other hand, linguistic representations that are used in developing language technology rarely reflect the current state of the art in linguistic theory. Our research should contribute to bridging the gap between theoretical and computational linguistics by addressing current theoretical discussion with a computational methodology.

The work in this dissertation draws on the work in natural language processing in two ways. First, we use automatic processing tools to extract the information from large language corpora. For example, to identify syntactic forms of the realisations of verbs, we use automatically parsed corpora. The information provided by the parses is then used to extract automatically the instances which are relevant for a particular question. Second, we use natural language processing methodology to analyse the extracted instances. This methodology involves three main components: a) the generalisations in the observations are captured by designing statistical models; b) the parameters of the models are learnt automatically from the extracted data applying machine learning techniques; c) the predictions of the models are tested on an independent set of data, quantifying and measuring the performance. Adopting this methodology for our research allows us not only to study language use in a valid experimental framework, but also to discover generalisations which can be integrated into further development of natural language processing more easily than the generalisations based on linguistic introspection.
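The three components listed above (a model, parameters learnt from data, evaluation on independent data) can be illustrated with a minimal sketch. It is not one of the models used in this dissertation: it is a simple majority-label predictor, and the function names, the class labels "A" and "B", and the toy instances are all invented for illustration.

```python
from collections import Counter, defaultdict

def train(instances):
    """(b) Learn the parameters: count labels observed with each feature value."""
    counts = defaultdict(Counter)
    for feature, label in instances:
        counts[feature][label] += 1
    return counts

def predict(counts, feature, default):
    """(a) The model: predict the label seen most often with this feature."""
    if feature not in counts:
        return default  # unseen feature value: fall back to a default label
    return counts[feature].most_common(1)[0][0]

def accuracy(counts, test_instances, default):
    """(c) Evaluate the predictions on instances not used for training."""
    correct = sum(predict(counts, f, default) == y for f, y in test_instances)
    return correct / len(test_instances)

# Invented toy data: (observed syntactic form, hypothetical verb class).
train_set = [
    ("transitive", "A"), ("transitive", "A"), ("transitive", "B"),
    ("intransitive", "B"), ("intransitive", "B"),
]
test_set = [("transitive", "A"), ("intransitive", "B"), ("intransitive", "A")]

model = train(train_set)
acc = accuracy(model, test_set, default="A")
print(f"accuracy on held-out instances: {acc:.2f}")
```

Keeping the test instances separate from the training instances is what makes the measured accuracy an estimate of how well the learnt generalisation holds beyond the observed data.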

1.3. Using parallel corpora to study language variation

Our approach to the relationship between the variation in language use and the structure of language takes into account both language-internal and cross-linguistic variation. This is achieved by extracting verb instances from parallel corpora. By studying the variation in the use of verbs in parallel corpora, we combine and extend two main approaches to language variation: the corpus-based approach to language-internal variation and the theoretical approach to cross-linguistic variation.

Corpus-based studies of linguistic variation have been mostly monolingual, following the use of linguistic units either over a period of time or across different language registers.

Extending the corpus-based approach to parallel corpora allows a better insight into structural linguistic elements, setting them apart from other potential factors of variation. Consider, for example, the alternations in (1.2-1.4). An occurrence of one or the other syntactic alternant in a monolingual corpus depends partially on the predicate-argument structure of the verbs and partially on the contextual and pragmatic factors.

However, if we can observe actual translations of the sentences, then we can observe at least two uses of semantically equivalent units in the same contextual and pragmatic conditions, since these conditions are constant in translation. In this way, we control for contextual and pragmatic factors while potentially observing the variation due to structural factors.

Unlike language-internal variation, which has become the subject of research relatively recently, with the development of corpus-based approaches, cross-linguistic variation is traditionally one of the core issues in theoretical linguistics. Differences in the expressions of the same contents across languages have always been analysed with the aim of discovering universally invariable elements of the structure of language which constrain the variation. Consider, for example, the English sentence in (1.5a) and its corresponding German, Serbian, and French sentences in (1.5b-d).

(1.5) a. Mary has just sent the letter. (English)
b. Maria hat gerade eben den Brief geschickt. (German)
c. Marija je upravo poslala pismo. (Serbian)
d. Marie vient d’envoyer la lettre. (French)

English          German            Serbian   French
present perfect  adverb + perfect  prefix    venir + infinitive

Figure 1.1.: Cross-linguistic mapping between morphosyntactic categories.

All the four sentences describe a short completed action that happened immediately before the time in which the sentence is uttered, but the meaning of shortness, completeness, and time (immediate precedence) is expressed in different ways in the four languages. In English, this meaning is encoded with a verb tense, present perfect. German uses a more general perfect tense, and the immediate precedence component is encoded in the adverbs (gerade eben). French, on the other hand, does not use any particular verb conjugation to express this meaning, but rather a construction which consists of a semantically impoverished verb (venir ’come’) and the main verb (envoyer ’send’) in the neutral, infinitive form. The corresponding Serbian expression is formed in yet another way: through lexical derivation. The verb poslati used in (1.5c) is derived from the verb slati, which does not encode any specific temporal properties, by adding the prefix po-. Figure 1.1 summarises the identified grammatical mappings across languages. Note also that, unlike the sentences in other languages, the French sentence does not contain a temporal adverb. The meaning of immediate precedence is already encoded as part of the meaning of the constructions formed with the verb venir.

These examples illustrate systematic variation across languages, and not just incidental differences between these particular sentences. If we replace the constituents of the sentences with some other members of their paradigms, we will observe the same patterns of variation. For instance, we can replace the phrase send the letter in the English sentence and its lexical counterparts in German, Serbian, and French by some other phrases, such as open the window, read the message, arrive at the meeting and so on. The choice of the corresponding morphosyntactic categories can be expected to stay the same. The regular patterns in cross-linguistic variation are due to the fact that sentences are composed of the same abstract units. As mentioned before, all the four sentences in (1.5) express the same event, with the same temporal properties (shortness, completeness, immediate precedence). The fact that they influence (morpho)syntactic realisations of verbs makes these properties grammatically relevant. The fact that they are interpreted in the same way across languages, despite the differences in the morphosyntactic realisations, makes them candidates for universal elements of the structure of language.

Theoretical approaches to cross-linguistic variation are concerned with identifying not only the elements of linguistic structure which are invariable across languages, but also the parameters of variation and their possible settings. With these two elements one could then construct a general representation of language capacity shared by all speakers of all languages. In this system, the grammar of any particular language instantiates the general grammar by setting the parameters to a certain value. For example, temporal properties of events in our example, which are invariable across languages, can be encoded in a syntactic construction (French), in the morphology (English), or in the lexical derivation (Serbian). Ideally, the number of possible values for a parameter should be small.

However, identifying the parameters of cross-linguistic variation and their possible settings is far from being a trivial task. Even though there are some regular patterns of cross-linguistic mapping, as we saw earlier, it is hard to define general rules which apply to all instances of a given category, independently of a given context. In fact, when we take a closer look, finding regularities in cross-linguistic variation turns out to be a very difficult task for which no common methodology has been proposed. To illustrate the difficulties, we will look again at the example of English present perfect tense, for which we have defined cross-linguistic mappings shown in Figure 1.1. As we can see in (1.6), the mappings in Figure 1.1 do not hold for all the instances of English present perfect tense. A different use of this tense in English leads to rather different mappings.

(1.6) a. Mary still has not seen the film. (English)
b. Maria hat noch immer nicht den Film gesehen. (German)
c. Marija još nije gledala film. (Serbian)
d. Marie n’a pas encore vu le film. (French)

English          German   Serbian    French
present perfect  perfect  bare form  passé composé

Figure 1.2.: Cross-linguistic mapping between morphosyntactic categories.

Figure 1.2 summarises the mappings between the sentences in (1.6). We can see that, instead of the construction with the verb venir, the corresponding French form in this case is a verb tense (passé composé). The corresponding Serbian verb in this context is neither prefixed nor perfective. This means that the English present perfect tense has multiple cross-linguistic mappings even in this small sample of only two other languages (the German form can be considered invariable in this case). Other uses might be mapped in yet different ways. For instance, there can be a use which maps to French as in Figure 1.2, and to Serbian as in Figure 1.1. If we take into account all the other languages and all possible uses of present perfect tense in English, the number of possible cross-linguistic mappings of this single morphological category is likely to become very large. We can expect to encounter the same situation with all the other categories and their combinations. This creates a very large space of possible cross-linguistic mappings, which is hard to explore and to account for in an exhaustive fashion.


Extracting verb instances from parallel corpora allows us to observe directly a wide range of cross-linguistic mappings of the target morphosyntactic categories at the instance level, taking into account contextual factors. With a large number of instances analysed using computational and statistical methods, we can take a new perspective on the cross-linguistic variation. Zooming out to analyse general tendencies in the data, rather than individual cases, we can identify patterns signalling potential constraints on the variation. Even though this approach is not exhaustive, it is systematic in the sense that it allows us to observe patterns in cross-linguistic variation in large samples and to use statistical inference to formulate generalisations which hold beyond the observed samples.
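The instance-level view of cross-linguistic mappings can be sketched as a simple frequency count. The sketch below is only an illustration under stated assumptions: the instances restate the two uses of the English present perfect in (1.5) and (1.6), the German adverb is ignored so that both German instances count as the same form, and the names ALIGNED and mapping_distribution are hypothetical. It does not reproduce the extraction procedure used in the case studies.

```python
from collections import Counter, defaultdict

# Toy instances: (English category, target language, target realisation).
# They restate examples (1.5) and (1.6); a real sample would be extracted
# from a parsed and word-aligned parallel corpus.
ALIGNED = [
    ("present perfect", "French", "venir + infinitive"),  # (1.5d)
    ("present perfect", "French", "passé composé"),       # (1.6d)
    ("present perfect", "Serbian", "prefixed verb"),      # (1.5c)
    ("present perfect", "Serbian", "bare form"),          # (1.6c)
    ("present perfect", "German", "perfect"),             # (1.5b), adverb ignored
    ("present perfect", "German", "perfect"),             # (1.6b)
]

def mapping_distribution(instances):
    """Relative frequency of each target form per (category, language) pair."""
    counts = defaultdict(Counter)
    for category, language, realisation in instances:
        counts[(category, language)][realisation] += 1
    return {
        key: {form: n / sum(forms.values()) for form, n in forms.items()}
        for key, forms in counts.items()
    }

dist = mapping_distribution(ALIGNED)
for (category, language), forms in sorted(dist.items()):
    print(f"{category} -> {language}: {forms}")
```

With many instances, distributions of this kind show which mappings dominate for a given category, which is the sort of general tendency that the statistical analysis is meant to capture.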

1.4. The overview of the dissertation

The dissertation consists of seven chapters. In addition to Introduction and Conclusion, there are five central chapters which are divided between two main parts. The first part (Chapters 2 and 3) presents the conceptual and technical background of our work, the rationale for our methodological choices, as well as a detailed description of general methods used in our experiments. The second part (Chapters 4, 5, and 6) contains three case studies in which our experimental methodology is used to address three specific theoretical questions.

In Chapter 2, we discuss the issues in the predicate-argument structure of verbs from two points of view: theoretical and computational. The theoretical track follows the development in the view of the predicate-argument structure from the first proposals which divide the grammatical and the idiosyncratic components of the lexical structure of verbs to the current view of verbs as composed of multiple predicates, which is adopted in our research. We review theoretical arguments for abandoning the initial “atomic” view of the predicate-argument structure, as well as some proposals for its systematic decomposition into smaller components. We then proceed by reviewing the work on extensive verb classification, which relates the grammatical and the idiosyncratic layer of the lexical structure of verbs. We discuss the principles of semantic classification of verbs on the basis of their syntactic behaviour, as well as practical implementations of verb classification principles in developing extensive language resources. Finally, we review approaches to automatic acquisition of verb classification and the predicate-argument structure, discussing the representations and methods used for these tasks.

Chapter 3 deals with the methodology of using parallel corpora for linguistic research. Since parallel corpora are not commonly used as a source of data for linguistic research, we first present our rationale for this choice, discussing its advantages, but also its limitations. We then give an overview of natural language processing approaches based on parallel corpora and the contributions of this line of research. The second part of the chapter deals with the technical and practical issues in using natural language processing methodology for linguistic research. We first describe steps in processing parallel corpora for extracting linguistic data, in particular, automatic word alignment, which is crucial for our approach. We then turn to the methods used for analysing the extracted data, providing the technical background necessary to follow the discussion in the three case studies. The background includes an introduction to statistical inference and modelling in general, as well as to Bayesian modelling in particular, which is followed by an overview of four standard machine learning classification techniques which are used or referred to in our case studies: naïve Bayes, decision tree, Bayesian net, and the expectation-maximisation algorithm.

The first case study, on light verb constructions, is presented in Chapter 4. We first give an overview of the theoretical background and the questions raised by light verb constructions. We introduce two classes of light verb constructions discussed in the literature, true light verb constructions and constructions with vague action verbs. We then introduce our proposed classification which is based on verb types. We argue that light verb constructions headed by light take behave like true light verb constructions, while the constructions headed by light make behave like the constructions with vague action verbs. We relate this behaviour to the force dynamics representation of the predicate-argument structure of these verbs. We then present two experiments in which we test two hypotheses about the relationship between the force dynamics in the meaning of the verbs and the cross-linguistic frequency distribution of the alternating morphosyntactic forms.

The case study on the causative alternation is presented in Chapter 5. We start by reviewing the proposed generalisations addressing the meaning of the verbs which participate in the causative alternation. In particular, we address the notions of change of state, external vs. internal causation, and cross-linguistic variation in the availability of the alternation. We then introduce the discussion on the number of classes into which verbs should be classified with respect to these notions. Two proposals have been put forward in the literature: a) a two-way distinction between alternating and non-alternating verbs, where alternating verbs are characterised as describing externally caused events, while the verbs which do not alternate describe internally caused events; b) a three-way classification involving a third class of verbs situated between the two previously proposed classes. We then discuss the distribution of the morphological marking on alternating verbs across languages as a potential indicator of the grammatically relevant meaning of the alternating verbs. This leads us to introduce the notion of the likelihood of external causation. The experimental part of this study consists of four steps. In the first step, we validate a corpus-based measure of the likelihood of external causation, showing that it correlates with the typological distribution of the morphological marking. In the second step, we show that the corpus-based measure can be extended to a large sample of verbs. In the third step, we extract the instances of the large sample of verbs from a parallel corpus and test the influence of the likelihood of external causation on the cross-linguistic distribution of their morphosyntactic realisations. In the fourth step, we address the issue of classifying the alternating verbs by designing a statistical model which takes as input cross-linguistic realisations of verbs and outputs their semantic classification. We test the model in two modes, on the two-way and on the three-way classification.

The last case study, presented in Chapter 6, deals with the representation of grammatically relevant temporal properties of events described by a wide range of verbs. We start by introducing verb aspect as a grammatical category usually thought to encode temporal meaning. More specifically, we discuss two notions related to verb aspect: temporal boundedness and event duration. We then discuss Serbian verb derivations associated with verb aspect as a potential observable indicator of these two temporal properties of events described by verbs. We proceed by proposing a quantitative representation of Serbian verb aspect based on cross-linguistic realisations of verbs extracted from parallel corpora. We then design a Bayesian model which predicts the duration (short vs. long) of events taking this representation as input. We test the performance of the model against English native speakers’ judgments of the duration of events described by English verbs. We compare our results to the results of models based on monolingual English input.

In Chapter 7, we draw some general conclusions, pointing to the limitations of the current approach as well as to some directions for future research.


The conceptual and methodological framework of the experiments presented in this dissertation encompasses three partially interrelated lines of research: theoretical accounts of the grammatically relevant meaning of verbs, its extensive descriptions in specialised lexicons, and its automatic acquisition from language corpora.

Theoretical accounts of the meaning of verbs are crucial for defining the hypotheses which are tested in our experiments. Our hypotheses are formulated in the context and framework of recent developments in theoretical accounts of lexical representation of verbs. While using the tools and the methodology developed in computational linguistics, our main goal is not to develop a new tool or resource, but to extend the general knowledge about what kinds of meaning are actually part of the lexical representation of verbs and how they are related to the grammar. Our work is related to the work on constructing comprehensive specialised lexicons of verbs because we work with large sets of verbs assigning specific lexical and grammatical properties to each verb in each sample. Finally, we follow the work on automatic acquisition of the meaning of verbs in that we learn the elements of their lexical representation automatically from the observed distributions of their realisations in a corpus. This aspect distinguishes our work from theoretical approaches, as well as from the work on developing specialised lexicons, which are based on linguistic introspection rather than on empirical observations.

This chapter contains an overview of the existing research in all three domains. In Section 2.1, we follow the developments in theoretical approaches to the meaning of verbs. We start by introducing the notion of predicate-argument structure of verbs, discussing its role in the grammar of language, as well as in linguistic theory (2.1.1). We proceed by reviewing proposed theoretical accounts which represent general views of the predicate-argument structure in the literature, discussing at length crucial turning points in the theoretical development leading to the temporal and causal decomposition of the meaning of verbs which is adopted in our experiments. In Section 2.2, we discuss the principles of large-scale implementations of some views of the predicate-argument structure. We summarise the main ideas behind the syntactic behavioural approach to the meaning of verbs (2.2.1), which is followed by descriptions of three lexical resources which contain thousands of verbs with explicit analyses of their predicate-argument structure. In Section 2.3, we discuss approaches to automatic acquisition of the predicate-argument structure from language corpora which rely on the described lexical resources conceptually (they adopt the principles of syntactic approach to verb meaning) and practically (they use the resources for training and testing systems for automatic acquisition).

2.1. Theoretical approaches to the argument structure

It is generally assumed in linguistic theory that the structure of a sentence depends, to a certain degree, on the meaning of its main verb. Some verbs, such as see in (2.1), require a subject and an object; others, such as laugh in (2.2), form grammatical sentences expressing only the subject; others, such as tell in (2.3), require expressing three constituents. (Clauses with more than three principal constituents are rare.) The assumption concerning these observations is that the association of certain verbs with a certain number and kind of constituents is not due to chance, but that it is part of the grammar of language.

(2.1) [Mary]_{subject} saw [a friend]_{object}.

(2.2) [Mary]_{subject} laughed.

(2.3) [Mary]_{subject} told [her friend]_{indirect-object} [a story]_{object}.


Although the relation between the meaning of verbs and the available syntactic patterns seems obvious, defining precise rules to derive a phrase structure from the lexical structure of a verb proves to be a difficult task. The task, known in the linguistic literature as the linking problem, is one of the central concerns of the theory of language (Baker 1997). The main difficulty in linking the meaning of verbs and the form of the phrases that they head is in analysing verbs’ meaning so that the components responsible for the syntactic forms of the phrases are identified.

There are many different ways in which the meaning of verbs can be analysed and it is hard to see what kind of analysis is relevant for the grammar. Consider, for example, basic dictionary definitions of the verbs used in (2.1-2.3) given in (2.4).

(2.4)
see    to notice people and things with your eyes
laugh  to smile while making sounds with your voice that show you are happy or think something is funny
tell   to say something to someone, usually giving them information

Source: Cambridge Dictionaries Online, http://dictionary.cambridge.org/

In the definitions above, the meaning of the verbs is analysed into smaller components.

They state, for example, that seeing involves eyes, things, and people, that laughing involves sounds, showing that you are happy, and something funny, and that telling involves something, someone, and giving information. The units which are identified as components of the verbs’ meaning are very different in nature: some are nouns with specific meaning, some are pronouns with very general meaning, some are complex phrases.

In theoretical approaches to the meaning of verbs, like in lexicography, the analysis results in identifying smaller, more primitive notions of which the meaning is composed. Unlike lexicographic analysis, however, theoretical analysis aims at defining and organising these notions having in mind the language system as a whole, and not only the meaning of each verb separately. This implies establishing general components which apply across lexical items and which play a role in the rules of grammar.


2.1.1. The relational meaning of verbs

The most important general distinction made in the theory of lexical structure of verbs is the one between the relational meaning and the idiosyncratic lexical content. In the definition of the verb see given in (2.4), for example, things and people belong to the relational structure, while eyes belong to the idiosyncratic content. In the case of the verb laugh, all the components listed in the definition are idiosyncratic. The verb tell has two relational components (something, someone).

The relational meaning expresses the fact that the verb relates its subject with another entity or with a property. In this sense, verbs are analysed as logical predicates which can take one, two, three, or more arguments. This part of their lexical structure is usually called the predicate-argument structure. It is seen as an abstract component of meaning present in all verbs. There are only a few possible predicate-argument structures, so that they are typically shared by many verbs, while the idiosyncratic content characterises each individual verb.

The predicate-argument structure is the part of the lexical representation of verbs which determines the basic shape of clauses. In a simplified scenario, a verbal predicate which takes two arguments forms a clause with two principal constituents, as in (2.1) and in (2.5a). One argument in the lexical structure of a verb results in intransitive clauses (2.2 and 2.5b) and so on. Formally, the transfer of the information from the lexicon to syntax is handled by more general mechanisms, by projection in earlier accounts (Chomsky 1970; Jackendoff 1977; Chomsky 1986) and feature checking in newer proposals (Chomsky 1995; Radford 2004).

In the accounts that are based on the notion of projection, lexical items project their relational properties into syntax by forming a specific formal structure which can then be combined only with the structures with which it is compatible. So, for instance, a two-argument verb will form a structure with empty positions intended for its subject and object. In principle, these positions can only be filled by nominal structures, while other verbal, adjectival, or adverbial structures will not be compatible with these positions.

In the feature checking account, lexical items do not form any specific structures, but they carry their properties as features, which, by a general rule, need to match between the items which are to be combined in a phrase structure. For instance, the list of features of a two-argument verb will contain one feature requiring a subject and one requiring an object. A verb with these features can be combined only with items which have the matching features, that is with nominal items which bear the same features.

Characterisation of possible semantic arguments of verbs depends on the theoretical framework adopted for an analysis, but all approaches make distinctions between at least several kinds of arguments. The kind of meaning expressed by a verb’s argument is usually called a semantic role. Two traditional semantic roles, agent and theme, are illustrated in (2.5).

(2.5) a. [Mary]_{subject/agent} stopped [the car]_{object/theme}.
b. [The car]_{subject/theme} stopped.

There is a certain alignment between semantic roles and syntactic functions. Agents, for instance, tend to be realised as subjects across languages, while themes are usually objects as in (2.5a). However, the same semantic role can be realised with different syntactic functions, as is the case with the theme role assigned to the car in (2.5a-b). The phenomenon of multiple syntactic realisations of the same predicate-argument structure is known as argument alternation. The alternation illustrated in (2.5) is called the causative alternation, because the argument which causes the car to stop (Mary) is present in one expression (2.5a), but not in the other (2.5b). Other well-known examples of argument alternations include the dative alternation (2.6) and the locative alternation (2.7).

(2.6) a. [Mary]_{subject/agent} told [her friend]_{indirect-object/recipient} [a story]_{object/theme}.
b. [Mary]_{subject/agent} told [a story]_{object/theme} [to her friend]_{prep-complement/recipient}.


(2.7) a. [People]_{subject/agent} were swarming [in the exhibition hall]_{prep-complement/location}.
b. [The exhibition hall]_{subject/location} was swarming [with people]_{prep-complement/agent}.

In the dative alternation, the recipient role (her friend in (2.6)) can be expressed as the indirect object which usually takes dative case (2.6a),¹ or as a prepositional complement (2.6b). In the locative alternation, the arguments which express the location and the agent of the situation described by the verb swap syntactic functions: the location (exhibition hall) is the prepositional complement in (2.7a) and the subject in (2.7b). The agent (people) is in the subject position in (2.7a) and it is the prepositional complement in (2.7b).
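The alternations in (2.5-2.7) can be restated as a small data structure in which each use of a verb maps semantic roles to syntactic functions. This is only an illustrative encoding, not a representation proposed in the literature reviewed here: the names FRAMES and alternating_roles are hypothetical, and the role and function labels are copied from the examples.

```python
from collections import defaultdict

# Each frame maps semantic roles to syntactic functions, as labelled in
# examples (2.5)-(2.7). The dictionary layout itself is an assumption.
FRAMES = {
    "stop": [
        {"agent": "subject", "theme": "object"},                 # (2.5a)
        {"theme": "subject"},                                    # (2.5b)
    ],
    "tell": [
        {"agent": "subject", "recipient": "indirect-object",
         "theme": "object"},                                     # (2.6a)
        {"agent": "subject", "theme": "object",
         "recipient": "prep-complement"},                        # (2.6b)
    ],
    "swarm": [
        {"agent": "subject", "location": "prep-complement"},     # (2.7a)
        {"location": "subject", "agent": "prep-complement"},     # (2.7b)
    ],
}

def alternating_roles(frames):
    """Return the roles realised with more than one syntactic function."""
    realisations = defaultdict(set)
    for frame in frames:
        for role, function in frame.items():
            realisations[role].add(function)
    return {
        role: sorted(functions)
        for role, functions in realisations.items()
        if len(functions) > 1
    }

for verb, frames in FRAMES.items():
    print(verb, alternating_roles(frames))
```

In this encoding, the theme of stop, the recipient of tell, and both arguments of swarm are the roles whose syntactic function changes between the two alternants, while the roles themselves stay constant.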

The view of the predicate-argument structure has evolved with the developments in linguistic theory, from the quite intuitive notions illustrated in the examples so far to more formal and general analyses. The main changes in the theory are reviewed in the following sections.

2.1.2. Atomic approach to the predicate-argument structure

In the earliest approaches, the roles of the semantic arguments of verbs are regarded as simple, atomic labels. Apart from the roles illustrated in (2.5-2.7), the set of labels commonly includes: experiencer, instrument, source, and goal, illustrated in (2.8-2.11).² The atomic semantic labels of the constituents originate in the notions of “deep cases” in Case grammar (Fillmore 1968).

These labels capture common intuitions about the relational meaning of verbs which cannot be addressed using only the notions of syntactic functions. For example, the meanings of the subjects in (2.5a-b), as well as the role that they play in the event described by the verb stop, are rather different. Mary refers to a human being who is actively (and possibly intentionally) taking part in the event, while the car refers to an object which cannot have any control of what is happening. This difference cannot be formulated without referring to the semantic argument label of the constituents. A similar distinction is made between people and the exhibition hall in (2.7a-b).

¹ Although the dative case is not visible in most English phrases, including (2.6a), it can be shown that it exists in the syntactic representation of the phrases.

² The labels patient and theme are often used as synonyms (as, for example, in Levin and Rappaport Hovav (2005)). If a difference is made, patient is the participant undergoing a change of state, and theme is the one that undergoes a change of location.

Another important intuition which is made evident by the predicate-argument representation is that sentences such as (2.5a) and (2.5b) are related in the sense that they are paraphrases of each other. The same applies to (2.6a) and (2.6b) and to (2.7a) and (2.7b). The fact that the predicate-argument structure is shared by the two paraphrases, while their syntactic structure is different, represents the intuition that the two sentences have approximately the same meaning, despite the different arrangements of the constituents.

(2.8) [Mary]_{experiencer} enjoyed the film.

(2.9) Mary opened the door [with a card]_{instrument}.

(2.10) Mary borrowed a DVD [from the library]_{source}.

(2.11) Mary arrived [at the party]_{goal}.

Finally, the predicate-argument representation is useful in establishing the relationship between the sentences which express the same content across languages. As the examples in (2.12) show, the relational structure of the verbs like in English and plaire in French is the same, despite the fact that their semantic arguments have inverse syntactic functions.

(2.12) a. [Mary]_{subject/experiencer} liked [the idea]_{object/theme}. (English)
b. [L’idée]_{subject/theme} a plu [à Marie]_{prep-complement/experiencer}. (French)

Although the predicate-argument structure proves to be a theoretically necessary level of representation of the phrase structure, it was soon shown that the concept of semantic roles as atomic labels for the verbs’ arguments is too naïve with respect to the reality of the observations that it is intended to capture.

First of all, the set of roles is not definitive. There are no common criteria which define all possible members of the set. New roles often need to be added to account for different language facts. For example, the sentence in (2.9) can be transformed so that the instrument is the subject, as in (2.13), but if we replace the card with the wind, as in (2.14), the meaning of this subject cannot be described with any of the labels listed so far. It calls for a new role: cause or immediate cause (Levin and Rappaport Hovav 2005). Similarly, many other sentences cannot be described with the given set of roles. This is why different analyses keep adding new roles (such as beneficiary, destination, path, time, measure, extent, etc.) to the set.

(2.13) [The card]_instrument opened the door.

(2.14) [The wind]_cause opened the door.

Another problem posed by the atomic view of semantic roles is that there are no transparent criteria or tests for identifying a particular role. Definitions of semantic roles do not provide sets of necessary and sufficient conditions that can be used in identifying the semantic role of a particular argument of a verb. For example, agent is usually defined as the participant in an activity that deliberately performs the action, goal is the participant toward which an action is directed,³ and source is the participant denoting the origin of an action. These definitions, however, do not apply in many cases, as noted by Dowty (1991). For example, both Mary and John in (2.15) seem to act voluntarily in both sentences, which means that they both bear the role of agent. Furthermore, John is not just agent, but also source, while Mary is both agent and goal.

³ Dowty analyses to Mary in (2.15a) as goal, while other authors would analyse this constituent as recipient, which further illustrates the problem.

(2.15) a. [John]_? sold the piano [to Mary]_? for $1000.

b. [Mary]_? bought the piano [from John]_? for $1000.

(Dowty 1991: 556)

The example in (2.15) shows that the relational structure of such sentences cannot be described by assigning a single and distinct semantic label to each principal constituent of the clause. The meaning of the verbs’ arguments seems to express multiple relations with the verbal predicate.
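The point can be illustrated with a minimal sketch in Python. It is an illustration only, under the assumption that an atomic labelling means exactly one role per argument and no role shared between arguments; the function name is_atomic_labelling and the dictionary encoding are hypothetical. Only the two participants discussed above are encoded for (2.15a).

```python
from itertools import combinations


def is_atomic_labelling(roles_by_argument):
    """True if every argument has exactly one role and no role is shared."""
    role_sets = list(roles_by_argument.values())
    one_role_each = all(len(roles) == 1 for roles in role_sets)
    no_shared_role = all(not (a & b) for a, b in combinations(role_sets, 2))
    return one_role_each and no_shared_role


# (2.12a): a single, distinct role for each argument.
liked = {"Mary": {"experiencer"}, "the idea": {"theme"}}

# (2.15a), following the discussion of Dowty (1991): both participants act
# voluntarily, so both count as agent in addition to source or goal.
sold = {"John": {"agent", "source"}, "to Mary": {"agent", "goal"}}

atomic_liked = is_atomic_labelling(liked)
atomic_sold = is_atomic_labelling(sold)
shared = sold["John"] & sold["to Mary"]  # roles borne by both arguments
```

Representing roles as sets rather than single labels is one simple way to record that an argument may stand in several relations to the predicate at once; it is not meant as a proposal for how these relations should ultimately be defined.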

There is one more observation which cannot be addressed with the simple view of semantic labels. This is the fact that the meaning of the roles is not equally distinct in all cases. Some roles obviously express similar meanings, while others are very different. Furthermore, semantic clustering of the roles seems to be related to the kinds of syntactic functions that the arguments have in a phrase. For example, the arguments which are realised as subjects in (2.9), (2.13), and (2.14), namely agent, instrument, and cause respectively, constitute a paradigm: they can be replaced by each other in the same context. It has been noticed that two of these roles, agent and cause, can never occur together in the same phrase. On the other hand, roles such as source and goal are in a syntagmatic relation: they tend to occur together in the same phrase. The traditional view of semantic roles as a set of atomic notions does not provide a means to account for these facts.

Different theoretical frameworks have been developed in the linguistic literature to deal with these problems and to provide more adequate definitions of the predicate-argument relations. Studying in more detail how semantic arguments of verbs are realised in the phrase structure, some authors (Larson 1988; Grimshaw 1990) propose a universal hierarchy of the arguments. The order in the hierarchy is imposed by the syntactic prominence of the arguments. For example, agents are at the top of the hierarchy,