
Three Implementations

In the document Language Acquisition (pages 48-51)

To test the ideas discussed in the previous chapter, I have constructed three systems that incorporate these ideas into working implementations. Each of these systems applies cross-situational learning techniques to a combination of both linguistic and non-linguistic input. In accord with current hypotheses about child language acquisition, these systems use only positive examples to drive their acquisition of a language model. These systems differ from one another in the syntactic and semantic theory which they use. Maimra,¹ the first system constructed, incorporates a fixed context-free grammar as its syntactic theory, and represents word and utterance meanings using Jackendovian conceptual structures. Maimra learns both the syntactic categories and meanings of words, given a corpus of utterances paired with sets of possible meanings. Davra,² the second system constructed, extends the results obtained with Maimra by replacing the fixed context-free grammar with a parameterized version of X-bar theory. This grammar contains two binary-valued parameters which determine whether the language is head-initial or head-final, and SPEC-initial or SPEC-final. Given a corpus much like that given to Maimra, Davra learns not only a lexicon similar to that learned by Maimra, but the syntactic parameter settings as well. Davra has been successfully applied to very small corpora in both English and Japanese, learning that English is head-initial while Japanese is head-final. Kenunia,³ the third system constructed, incorporates the most substantial linguistic theory of the three systems. This theory closely follows current linguistic theory and is based on the DP hypothesis, base generation of VP-internal subjects, and V-to-I movement. Kenunia incorporates a version of X-bar theory with sixteen binary-valued parameters that supports both adjunction and head-complement structures. More importantly, Kenunia supports movement and empty categories. Two types of empty categories are supported: traces of movement, and non-overt words and morphemes. Kenunia incorporates several other linguistic subsystems in addition to X-bar theory. These include θ-theory, the empty category principle (ECP), and the Case Filter. The current version of Kenunia has learned both the parameter settings of this theory, as well as the syntactic categories of words, given an initial lexicon pairing words to their θ-grids. Future work will extend Kenunia to learn these θ-grids from the corpus, along with the syntactic categories and parameters, instead of giving them to Kenunia as prior input. In the longer term, I also plan to integrate the language learning strategies from Maimra, Davra, and Kenunia with the visual perception mechanisms incorporated in Abigail⁴ and discussed in part II of this thesis. The remainder of this chapter will discuss Maimra, Davra, and Kenunia in greater detail.
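Davra's two word-order parameters can be made concrete with a small sketch. The function below is a hypothetical illustration, not Davra's actual code: it shows how the head-initial and SPEC-initial settings jointly fix the surface order of a phrase's specifier, head, and complement.

```python
# Illustrative sketch (not Davra's implementation): two binary X-bar
# parameters determine constituent order within a phrase.

def linearize(spec, head, comp, head_initial, spec_initial):
    """Order a phrase's SPEC, head, and complement per parameter settings."""
    x_bar = [head, comp] if head_initial else [comp, head]   # X' level
    xp = [spec] + x_bar if spec_initial else x_bar + [spec]  # XP level
    return xp

# English-like setting: head-initial, SPEC-initial.
assert linearize("SPEC", "V", "NP", True, True) == ["SPEC", "V", "NP"]
# Japanese-like setting: head-final, so the complement precedes the head.
assert linearize("SPEC", "V", "NP", False, True) == ["SPEC", "NP", "V"]
```

Flipping either parameter independently yields the four word-order types the two-parameter theory can distinguish.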

¹Maimra is an Aramaic word which means word.

²Davra is an Aramaic word which does not mean word.

³Kenunia is an Aramaic word which means conspiracy. In Kenunia, the linguistic principles conspire to enable the learner to acquire language.

⁴Abigail is not an Aramaic word.


S  → NP VP
S̄  → {COMP} S
NP → {DET} N {S̄ | NP | VP | PP}*
VP → {AUX} V {S̄ | NP | VP | PP}*
PP → P {S̄ | NP | VP | PP}*
AUX → {DO | BE | {MODAL | TO | {{MODAL | TO}} HAVE} {BE}}

Figure 4.1: The context-free grammar used by Maimra. The head of each phrase type (N for NP, V for VP, P for PP, VP for S, and S for S̄) is distinguished from its complements. The distinction between head and complement children is used by the linking rule to form the meaning of a phrase out of the meanings of its constituents.

4.1 Maimra

Maimra (Siskind 1990) was constructed as an initial test of the feasibility of applying cross-situational learning techniques to a combination of linguistic and non-linguistic input in an attempt to simultaneously learn both syntactic and semantic information about language. Maimra is given a fixed context-free grammar as input; grammar acquisition is not part of the task faced by Maimra. Though the grammar is not hardwired into Maimra, and could be changed to attempt acquisition experiments with different input grammars, all of the experiments discussed in this chapter utilize the grammar given in figure 4.1. This grammar was derived from a variant of X-bar theory by fixing the head-initial and SPEC-initial parameters, and adding rules for S, S̄, and AUX. Note that this grammar severely overgenerates due to the lack of subcategorization restrictions. The grammar allows nouns, verbs, and prepositions to take an arbitrary number of complements of any type. Maimra is nonetheless able to learn despite the ensuing ambiguity.
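The overgeneration just described can be made concrete with a small sketch. The code below is an illustrative reconstruction, not Maimra's parser: it covers only the NP, VP, and PP rules, ignores S, S̄, DET, and AUX, and simply enumerates every analysis of a category string, showing that even a short sequence like V P N (as in ran to Mary) is structurally ambiguous.

```python
# Illustrative sketch of the overgeneration: with no subcategorization,
# a head may take any number of complements of any phrasal type, so one
# category string admits multiple parses.

HEADS = {"NP": "N", "VP": "V", "PP": "P"}   # head terminal of each phrase
PHRASES = ("NP", "VP", "PP")                # allowed complement types

def parses(cats, phrase):
    """All parse trees deriving the category sequence `cats` as `phrase`."""
    if not cats or cats[0] != HEADS[phrase]:
        return []
    results = []
    def extend(i, comps):
        if i == len(cats):
            results.append((phrase, HEADS[phrase], tuple(comps)))
        for j in range(i + 1, len(cats) + 1):
            for p in PHRASES:
                for sub in parses(cats[i:j], p):
                    extend(j, comps + [sub])
    extend(1, [])
    return results

# V P N has two VP analyses: [V [PP P [NP N]]] with a full PP
# complement, and the flat [V [PP P] [NP N]] with two complements.
assert len(parses(("V", "P", "N"), "VP")) == 2
```

A real grammar with subcategorization would license only the first analysis; Maimra must instead tolerate and gradually prune this ambiguity.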

Maimra incorporates a semantic theory based on Jackendovian conceptual structures. Words, phrases, and complete utterances are assigned fragments of conceptual structure as their meaning. The meaning of a phrase is derived from the meanings of its constituents by the linking rule discussed in section 3.1. To reiterate briefly, the linking rule operates as follows. The linking rule is mediated by a parse tree. Lexical entries provide the meanings of terminal nodes. Each non-terminal node has a distinguished child called its head. The remaining children are called the complements of the head. Unlike the puzzle given in section 3.3, the grammar given to Maimra indicates the head child for every phrase type.

Figure 4.1 identifies the head of each phrase type. The meaning of a non-terminal is derived from the meaning of its head by substituting the meanings of the complements for the variables in the meaning of the head. Complements whose meaning is the distinguished symbol ⊥ are ignored and not linked to a variable in the head. Maimra restricts all complement meanings to be variable-free so that no variable renaming is required.
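A minimal sketch of the linking rule as just described, assuming meanings are written as strings with numbered variables x1, x2, and so on. The function name and string representation are invented for illustration and are not Maimra's actual encoding.

```python
# Illustrative sketch of the linking rule: the meaning of a phrase is the
# head's meaning with its variables replaced, in order, by the meanings
# of the complements; complements meaning "?" (for ⊥) link to nothing.

BOTTOM = "?"  # stands in for the distinguished symbol ⊥

def link(head_meaning, complement_meanings):
    """Substitute complement meanings for variables x1, x2, ... in the head."""
    args = [m for m in complement_meanings if m != BOTTOM]
    result = head_meaning
    for i, arg in enumerate(args, start=1):
        result = result.replace(f"x{i}", arg)
    return result

# "ran" as GO(x1, x2), linked with subject BILL and path TO(MARY):
assert link("GO(x1, x2)", ["BILL", "TO(MARY)"]) == "GO(BILL, TO(MARY))"
```

Because complement meanings are restricted to be variable-free, this naive textual substitution never needs to rename variables.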

In addition to the grammar, Maimra is given a corpus of linguistic and non-linguistic input. Figure 4.2 depicts one such corpus given to Maimra. This corpus consists of a sequence of nine multi-word utterances, ranging in length from two to seven words. Each utterance is paired with a set of between three and six possible meanings.⁵ Maimra is not told which of the meanings is the correct one for each

⁵As described in Siskind (1990), Maimra is not given this set of meanings directly but instead derives this set from more primitive information using perceptual rules. These rules state, for instance, that seeing an object at one location followed by seeing it later at a different location implies that the object moved from the first location to the second. The corpus actually given to Maimra pairs utterances with sequences of states rather than potential utterance meanings. Thus Maimra would derive GO(x, [Path FROM(y), TO(z)]) as a potential meaning for an utterance if the state sequence paired

utterance, only that the set contains the correct meaning as one of its members. Thus the corpus given to Maimra can exhibit referential uncertainty in mapping the linguistic to the non-linguistic input.
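The cross-situational idea behind this design can be sketched in a few lines. This is a deliberate oversimplification, not Maimra's algorithm: it treats each utterance's meaning set as a flat pool of candidate symbols and intersects the pools across the situations in which a word occurs, showing how referential uncertainty shrinks with more data.

```python
# Toy illustration of cross-situational narrowing under referential
# uncertainty: a word's candidates are reduced to those consistent with
# every situation in which the word occurs.

def narrow(observations):
    """observations: list of (words, candidate_meanings) pairs.
    Returns each word's surviving candidate meanings."""
    candidates = {}
    for words, meanings in observations:
        for w in words:
            if w in candidates:
                candidates[w] &= set(meanings)   # intersect across situations
            else:
                candidates[w] = set(meanings)
    return candidates

corpus = [
    ({"bill", "ran"},    {"BILL", "GO", "MARY"}),
    ({"bill", "walked"}, {"BILL", "WALK", "JOHN"}),
]
# "bill" occurs in both situations, so only BILL survives.
assert narrow(corpus)["bill"] == {"BILL"}
```

Maimra's actual mechanism operates over disjunctive lexicon formulae and compositional meanings rather than flat symbol sets, but the intersective effect is the same.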

Maimra processes the corpus, utterance by utterance, producing a disjunctive lexicon formula for each utterance/meaning-set pair. No information other than this lexicon formula is retained after processing an utterance. This processing occurs in two phases, corresponding to the parser and linker from the architecture given in figure 2.1. In the first phase, Maimra constructs a disjunctive parse tree representing the set of all possible ways of parsing the input utterance according to the given context-free grammar. Appendix A illustrates sample disjunctive parse trees which are produced by Maimra when processing the corpus from figure 4.2. Structural ambiguity can result both from the fact that the grammar is ambiguous, as well as the fact that Maimra does not yet have unique mappings from words to their syntactic categories. Initially, Maimra assumes that each word can assume any terminal category. This introduces substantial lexical ambiguity and results in corresponding structural ambiguity. As Maimra further constrains the lexicon, she can rule out some word-to-category mappings and thus reduce the lexical ambiguity when processing subsequent utterances. Thus parse trees tend to have less ambiguity as Maimra processes more utterances. This is evident in the parse trees depicted on pages 210 and 213, which are also illustrated below. When Maimra first parses the utterance Bill ran to Mary, the syntactic category of ran is not yet fully determined. Thus Maimra produces the following disjunctive parse tree for this utterance.

(OR (S (OR (NP (N BILL) (NP (N RAN)))
           (NP (N BILL) (VP (V RAN)))
           (NP (N BILL) (PP (P RAN))))
       (VP (V TO) (NP (N MARY))))
    (S (NP (N BILL))
       (OR (VP (V RAN) (PP (P TO)) (NP (N MARY)))
           (VP (V RAN) (VP (V TO)) (NP (N MARY)))
           (VP (V RAN) (NP (N TO)) (NP (N MARY)))
           (VP (OR (AUX (DO RAN))
                   (AUX (BE RAN))
                   (AUX (MODAL RAN))
                   (AUX (TO RAN))
                   (AUX (HAVE RAN)))
               (V TO)
               (NP (N MARY)))
           (VP (V RAN)
               (OR (NP (DET TO) (N MARY))
                   (NP (N TO) (NP (N MARY)))))
           (VP (V RAN) (VP (V TO) (NP (N MARY))))
           (VP (V RAN) (PP (P TO) (NP (N MARY)))))))
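The way previously acquired category knowledge prunes such disjunctive analyses can be sketched as a filter over candidate trees. The nested-tuple representation and the lexicon below are invented for illustration and are not Maimra's actual data structures.

```python
# Toy illustration: once "ran" is known to be a verb, analyses that tag
# it otherwise are filtered out of the disjunctive parse tree.
# Trees are nested tuples, e.g. ("NP", ("N", "bill")).

def consistent(tree, lexicon):
    """True if every (category, word) leaf agrees with the lexicon."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        word = children[0]
        return word not in lexicon or label in lexicon[word]
    return all(consistent(c, lexicon) for c in children)

analyses = [
    # spurious analysis tagging "ran" as a noun
    ("S", ("NP", ("N", "bill"), ("NP", ("N", "ran"))),
          ("VP", ("V", "to"), ("NP", ("N", "mary")))),
    # intended analysis: "ran" is the verb
    ("S", ("NP", ("N", "bill")),
          ("VP", ("V", "ran"), ("PP", ("P", "to"), ("NP", ("N", "mary"))))),
]
known = {"ran": {"V"}}   # constraint acquired from earlier utterances
surviving = [t for t in analyses if consistent(t, known)]
assert len(surviving) == 1
```

Applied across a whole disjunctive tree, this kind of filtering is what makes later parses smaller than earlier ones.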

As a result of processing that utterance, in conjunction with the constraint provided by prior utterances, Maimra can determine that ran must be a verb. Thus when parsing the subsequent utterance Bill ran from Mary, which nominally has the same structure, Maimra can nonetheless produce the following smaller disjunctive parse tree by taking into account partial information acquired so far.

with that utterance contained a state in which BE(x, AT(y)) was true, followed later by a state where BE(x, AT(z)) was true. This primitive theory of event perception is grossly inadequate and largely irrelevant to the remainder of the learning strategy. For the purposes of this chapter, Maimra's perceptual rules can be ignored and the input to Maimra viewed as comprising a set of potential meanings associated with each utterance. The ultimate goal is to base future language acquisition models on the theory of event perception put forth in part II of this thesis, instead of the simplistic rules used by Maimra.
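Footnote 5's perceptual rule can be sketched directly. This is a toy reconstruction under the footnote's own simplification, not Maimra's implementation; the function name and term syntax are invented for illustration.

```python
# Toy version of the perceptual rule: an object seen AT one place and
# later AT another is inferred to have moved from the first to the second.

def infer_motion(states):
    """states: chronological list of (object, location) observations.
    Yields a GO term for each object whose location changed."""
    last = {}
    for obj, loc in states:
        if obj in last and last[obj] != loc:
            yield f"GO({obj}, [Path FROM({last[obj]}), TO({loc})])"
        last[obj] = loc

events = list(infer_motion([("BILL", "HOME"), ("BILL", "SCHOOL")]))
assert events == ["GO(BILL, [Path FROM(HOME), TO(SCHOOL)])"]
```

As the footnote notes, such rules are too crude to be a serious theory of event perception; they merely supply candidate meanings for the learner.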

BE(person1, AT(person3)) ∨ BE(person1, AT(person2)) ∨
GO(person1, [Path]) ∨ GO(person1, FROM(person3)) ∨
GO(person1, TO(person2)) ∨ GO(person1, [Path FROM(person3), TO(person2)])
