HAL Id: hal-00134205
https://hal.archives-ouvertes.fr/hal-00134205
Submitted on 1 Mar 2007
Towards a Quantitative Theory of Variability
Philippe Blache
To cite this version:
Philippe Blache. Towards a Quantitative Theory of Variability: Language, brain and computation.
Ana-Maria Di Sciullo. UG and External Systems, John Benjamins, pp.375-388, 2005. ⟨hal-00134205⟩
Philippe Blache
LPL-CNRS, Université de Provence
29, Avenue Robert Schuman
13621 Aix-en-Provence, France
pb@lpl.univ-aix.fr
1 Introduction
Relations between different components of linguistic analysis, such as prosody, morphology,
syntax, semantics, discourse, etc., remain a problem for a systematic description (see
[Blache03]). However, this is a main challenge not only from a theoretical point of view, but
also for natural language processing, especially in human/machine communication systems or
speech processing (e.g. synthesis). Several phenomena highlighting such relations have been
described. This is typically the case for relations existing between prosody and syntax (see
[Selkirk84], [DiCristo85] or [Bear90]). However, explanations are often empirical and only
exceptionally given in the perspective of an actual theory of language. It is for example
possible to specify some relations existing between topicalization and syllable duration
(cf. [Doetjes02]) or between prosodic architecture and discourse organization after focus
(cf. [DiCristo99]). However, the modularity perspective, which relies on the independence of
linguistic components, remains the rule in this kind of description and doesn't support a
global vision of the problem.
One of the difficulties in the elaboration of a general account of this problem comes from
the fact that there are only few cases of superposition between structures of the different
components. It is for example difficult to specify some congruence possibilities between
syntax and prosody (see [Hirst98], [Mertens01]). In the same way, and these aspects are
related, the problem of variability is not taken into account in a systematic way, for
example in the framework of a theory. Indeed, we observe situations in which prosody can be
realized in a very variable way whereas in some other cases, strong constraints have to be
considered.
We think that this question of the interaction between the different linguistic components
is usually addressed in the wrong (or incomplete) way. It is impossible to explain relations
by means of a bijection superposing structures (for example stipulating direct relations
between a syntactic tree and a prosodic hierarchy). One of the problems comes from the fact
that the linguistic objects are not the same for syntax and prosody: a word can be formed
with several syllables, but a syllable can in turn be formed with different words
(cf. [Hirst98]).
More generally, the problem comes from the conception of linguistic information
organization. It is difficult (and probably not useful) to try to represent each analysis
component (1) in a homogeneous and hierarchized way by means of a total relation and (2)
independently from other components. In other words, we think that each component of
linguistic analysis is not necessarily fully structured: it is often difficult, or even
impossible, to specify a relation [...] subordination relation.
il pleut tu es mouillé (it rains you are wet) (1)
The same observation can be made for other components of linguistic analysis. There are
for example some prosodic phenomena that are typical and recurrent in this domain (we use
in this paper a simplified prosodic description limited to the notion of contour presented
in [Rossi99]), but it is not necessary to represent them in a hierarchized structure
covering the entire input. Generally speaking, each component participates in a partial
manner in the elaboration of the informational content of an utterance. We are then far
away from the classical modular conception of analysis, which describes this process as
sequential and relying on a complete analysis of each domain (organized in levels of
analysis, from phonetics to pragmatics). We think that the interpretation of an utterance is
done thanks to pieces of information coming from any component, possibly in a redundant
way. There is redundancy when congruence between components exists. But this is not the
general case, in which part of the information can come from prosody, another part from
syntax, and another from pragmatics, for example.
We propose in this paper an approach taking advantage of this conception of linguistic
analysis and making it possible to describe relations between different components, not at
the structure level, but directly between objects belonging to the different components
involved in the relation. It then becomes possible to describe relations with a variable
granularity linking objects that can be at a different level from one component to the
other. We can for example describe a relation between an interrogative morpheme and an
intonative contour or between a phrase and some prosodic stress.
Such relations constitute a basis for describing and explaining variability. This
phenomenon cannot be interpreted by means of descriptions coming from a unique domain. We
propose an account of variability bringing together information coming from different
components and stipulating an equilibrium relation between these components. The idea
consists in indicating that as soon as a certain quantity of information (an information
threshold) is reached thanks to some linguistic components, then variability can appear in
other components. We will see for example that in the case where syntax contains enough
information by itself, prosody becomes variable.
We propose to start from some examples illustrating some variability phenomena in the
prosodic realization. We can then first provide some constraints specifying this
variability. We finally define a principle providing a general framework for describing
variability.
2 Some Examples
We present in this section some examples together with a stylized intonative contour. This
kind of representation doesn't allow us to represent the set of prosodic phenomena that
should be taken into account for a precise study. It allows however a first approximation
in the perspective of the question addressed in this paper. We use for this some of the
notions proposed in [Rossi99], in particular the notions of conclusive, parenthetic and
continuative contours.
il pleut tu es mouillé (2)
it rains you are wet
This utterance is formed by two distinct parts, not linked by any explicit syntactic
relation. Intonation gives a correlative interpretation indicating "it rains, because you
are wet".
il pleut tu es mouillé (3)
it rains you are wet
The same utterance, with a different intonation, receives a causative interpretation
indicating "if it rains, then you are wet".
Marie la robe elle lui va bien (4)
Marie the dress it to her fits well
Example of a dislocation of two NPs with an anaphoric relation with two clitics. The
intonation follows the same rising schema for each dislocated NP.
Marie la robe elle lui va bien (5)
Marie the dress it to her fits well
The same example can be realized with a different intonative contour. In both cases, the
interpretation of a double NP dislocation is favored, without much ambiguity ("the dress
fits Mary well").
Marie la garce elle lui donne rien (6)
Marie the bitch she to her gives nothing
The syntactic structure is identical to that of the previous example. However, the
preferred interpretation is that of an apposition more than a multiple dislocation. This
interpretation is reinforced by a different intonative contour between the NPs, the second
one being parenthetic.
Marie la garce elle lui donne rien (7)
Marie the bitch she to her gives nothing
The same example seems difficult to realize with an intonative contour typical of a double
dislocation.
Marie elle devrait faire attention (8)
Marie she should pay attention
The syntactic structure corresponds to a simple dislocation. The rising contour of the NP
constitutes a strong prosodic mark.
Marie elle devrait faire attention (9)
Marie she should pay attention
The form of this example is the same as in the previous example. However, the
interpretation is rather a vocative one, more than a dislocated one. This interpretation is
then driven by the intonation, not by the syntactic structure.
c'est la personne qui m'intéresse (10)
it is the person that me interests
The syntactic structure is that of a cleft. In this realization, the intonation of the
cleft NP is marked with a fall. The interpretation is something like "it is the person that
interests me (not the clothes)".
c'est la personne qui m'intéresse (11)
it is the person that me interests
The interpretation is that of a relative more than a cleft (of the kind "this is her,
effectively"). It is driven by the intonation that presents a continuative contour rather
than a conclusive one.
c'est un truc qu'on préfère (12)
it is a thing that we prefer
The interpretation of this example is that of a relative. Such an interpretation is
natural, without taking prosody into account (it is difficult to associate a cleft
interpretation with this element, which has a poor semantic content). A typical cleft
intonation (with a conclusive contour) cannot be easily realized in this case.
les vieux c'est la nuit qu'on est malades (13)
the old it is the night that we are sick
In this example, the cleft interpretation is given by the syntactic structure (with a
dislocated NP). This effect is reinforced by a conclusive intonative contour on the cleft
NP.
la techno c'est la musique qu'on préfère (14)
the techno this is the music that we prefer
Contrary to the previous example, the preferred interpretation is that of a relative. Such
an interpretation is given by the semantic level, the general form being identical to that
of the previous example. The intonation reinforces this interpretation.
la techno c'est la musique qu'on préfère (15)
the techno this is the music that we prefer
A conclusive intonation, rather typical of a cleft, cannot be easily realized in this
example.
3 Basic Constraints
The classical description of the prosody/syntax relation is generally done by means of
constraints representing either the necessity of a specific realization or its
impossibility. In the perspective of a constraint-based approach, this kind of information
is represented directly by means of properties of the objects. This is the case for example
of Property Grammars, described in [Blache01], that rely on different kinds of constraints
(e.g. requirement, exclusion, linear precedence, etc.).
At this stage, it is possible to stipulate a first set of constraints that will constitute
a preliminary step in the description of the relations.
3.1 Describing an object with several components
The same linguistic object is described by means of information coming from different
components. This characteristic is illustrated by several examples of the previous section.
Let's focus more precisely on examples 8-9. This case is apparently simple and regular.
Indeed, if the data are verified, each interpretation (being vocative or not) is associated
with a specific intonative contour without possibility of variation. We then obtain several
constraints that make it possible to specify some principles.
[p1] $\mathrm{NP}[\mathit{detached}] \wedge \mathrm{Contour}[\mathit{conclusive}] \Rightarrow [-\mathit{vocative}]$
[...] (typically a conclusive contour), takes a vocative interpretation. We are then in the
case of a classical dislocation coming together with an anaphoric relation between the
dislocated NP and the clitic. The vocative interpretation described in [p2] implies a
detached NP plus an intonational fall. In these constraints, the objects belong to three
different components: syntax, semantics and prosody. It is necessary to specify these
domains. Moreover, it is also necessary to specify their respective positions. The solution
making it possible to build a representation independently from any theoretical
presupposition consists in indicating the position of the object in the acoustic signal
(cf. [Bird99]). This kind of indication is direct for prosodic information, but difficult
to specify for other domains such as syntax, semantics or pragmatics. We propose (see
[Blache03]) a general indexation mechanism specifying a different kind of localization for
any object. We then propose to use an anchor containing different kinds of indexation:
localization in the signal, in the string or in the context. A complex feature represents
such an anchor as follows:
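\[
\mathit{anchor} \rightarrow
\begin{bmatrix}
\text{temporal} & \langle i, j \rangle \\
\text{position} & \langle k, l \rangle \\
\text{context} & c
\end{bmatrix}
\]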
The temporal index is represented by two values (beginning and end). The position is also
a couple of indexes (corresponding to nodes in a chart interpretation) localizing an object
in the input. The context feature implements the notion of universe (i.e. a set of
discourse referents) as in DRT. An object can then be specified by means of three kinds of
information: its domain, its anchor and its characterization (the set of corresponding
properties). The following example describes an object from the syntactic domain, with a
precise localization both on the temporal and the linear axis:
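\[
\mathit{obj} \rightarrow
\begin{bmatrix}
\text{domain} & \mathit{synt} \\
\text{anchor} &
  \begin{bmatrix}
  \text{temp} & \langle 880, 1000 \rangle \\
  \text{position} & \langle 2, 3 \rangle
  \end{bmatrix} \\
\text{charac} &
  \begin{bmatrix}
  \text{cat} & \mathit{Det}
  \end{bmatrix}
\end{bmatrix}
\]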
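The object description above can be pictured as a small record type. The following is only a minimal sketch, assuming Python dataclasses; the class names `Anchor` and `LinguisticObject` are hypothetical, and the unit of the temporal values is not stated in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Anchor:
    """Localization of an object in the signal, the string and the context."""
    temporal: Optional[Tuple[int, int]] = None   # (beginning, end) in the signal
    position: Optional[Tuple[int, int]] = None   # pair of chart nodes in the input
    context: Optional[FrozenSet[str]] = None     # universe: set of discourse referents


@dataclass
class LinguisticObject:
    """An object specified by its domain, its anchor and its characterization."""
    domain: str                                   # e.g. "synt", "sem", "pros"
    anchor: Anchor
    charac: Dict[str, str] = field(default_factory=dict)


# The determiner of the example: syntactic domain, localized on both axes.
det = LinguisticObject(
    domain="synt",
    anchor=Anchor(temporal=(880, 1000), position=(2, 3)),
    charac={"cat": "Det"},
)
```

Leaving every anchor field optional reflects the remark that a temporal index is direct for prosodic information but harder to obtain for other domains.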
As detailed above, constraints [p1] and [p2] are expressed in terms of implication.
However, the kind of relation represented there consists more precisely in a co-variation
of the different values. It is moreover necessary, in particular for the representation of
information at a finer level than that of the atomic object, to express an element in the
form of a set of features, each one being an attribute/value pair. This is the case for
example of a phoneme that can be characterized by a set of segments or a syntactic category
that corresponds to a set of morphological, syntactic and semantic features. The relation
[p2] in fact concerns different features of the same object characterizing a subpart of the
utterance. This object is represented as follows:
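\[
\mathit{p2} \rightarrow
\begin{bmatrix}
\text{synt} &
  \begin{bmatrix}
  \text{cat} & \mathit{NP}_{\mathit{detached}} \\
  \text{pos} & \langle i, j \rangle
  \end{bmatrix} \\
\text{sem} &
  \begin{bmatrix}
  \text{type} & +\mathit{vocative} \\
  \text{pos} & \langle i, j \rangle
  \end{bmatrix} \\
\text{pros} &
  \begin{bmatrix}
  \text{contour} & \mathit{continuative}
  \end{bmatrix}
\end{bmatrix}
\]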
[...] coming from different components and participating in the description of the same
object or, more generally, the same linguistic phenomenon. Each characteristic is
associated with a position in the signal represented by the complex feature anchor. The
different pieces of information are still represented separately, the feature structure
being a way of describing an object containing features connected by some relations.
The covariation relation specified above is expressed by the specification of a
simultaneous variation of the value of some features in a structure. There are several ways
to represent this kind of relation, one of them being the use of "named disjunctions"
(cf. [Kasper95] or [Blache98]). The mechanism consists in enumerating the set of possible
values for each feature and indicating the values that are in a mutual dependency. All
values belonging to the same part of the disjunction covary: when a value is instantiated,
then all other values of the same rank in the named disjunction are also instantiated.
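\[
\mathit{detached/dislocated} \rightarrow
\begin{bmatrix}
\text{synt} &
  \begin{bmatrix}
  \text{cat} & \{ \mathit{NP}_{\mathit{detached}} \vee_1 \mathit{NP}_{\mathit{dislocated}} \} \\
  \text{anch} & \langle i, j \rangle
  \end{bmatrix} \\
\text{sem} &
  \begin{bmatrix}
  \text{type} & +\mathit{vocative} \vee_1 -\mathit{vocative} \\
  \text{anch} & \langle i, j \rangle
  \end{bmatrix} \\
\text{pros} &
  \begin{bmatrix}
  \text{contour} & \mathit{continuative} \vee_1 \mathit{conclusive} \\
  \text{anch} & \langle i, j \rangle
  \end{bmatrix}
\end{bmatrix}
\]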
In this example, the named disjunction is represented by $\vee_1$. The values
$\mathit{NP}_{\mathit{detached}}$, $+\mathit{vocative}$ and $\mathit{continuative}$ are then
dependent (first part of the disjunction), as well as the values
$\mathit{NP}_{\mathit{dislocated}}$, $-\mathit{vocative}$ and $\mathit{conclusive}$. The
previous structure then works as a constraint on the concerned objects. As soon as an
utterance description needs a set of features specified in this structure, their values
have to satisfy the constraint.
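The covariation expressed by the named disjunction can be illustrated with a short sketch. It is not the mechanism of [Kasper95] or [Blache98], only one plausible reading: each feature lists its alternatives in the same order, and all instantiated features must select the same rank. The names `NAMED_DISJUNCTION`, `satisfies` and `propagate` are hypothetical, and treating a value outside the listed alternatives as a violation is an assumption.

```python
# Alternatives are listed in the same order for every feature: index 0 is the
# first part of the named disjunction, index 1 the second part.
NAMED_DISJUNCTION = {
    ("synt", "cat"): ("NP_detached", "NP_dislocated"),
    ("sem", "type"): ("+vocative", "-vocative"),
    ("pros", "contour"): ("continuative", "conclusive"),
}


def satisfies(description, disjunction=NAMED_DISJUNCTION):
    """True if every constrained feature present selects the same rank."""
    ranks = set()
    for feature, alternatives in disjunction.items():
        if feature not in description:
            continue  # the description does not use this feature
        value = description[feature]
        if value not in alternatives:
            return False  # assumption: unlisted values violate the constraint
        ranks.add(alternatives.index(value))
    return len(ranks) <= 1


def propagate(description, disjunction=NAMED_DISJUNCTION):
    """Instantiate the covarying values once one value fixes the rank."""
    if not satisfies(description, disjunction):
        raise ValueError("description violates the named disjunction")
    ranks = {
        alternatives.index(description[feature])
        for feature, alternatives in disjunction.items()
        if feature in description
    }
    completed = dict(description)
    if ranks:  # exactly one rank, since the description is consistent
        rank = ranks.pop()
        for feature, alternatives in disjunction.items():
            completed[feature] = alternatives[rank]
    return completed
```

For instance, `propagate({("sem", "type"): "-vocative"})` fills in `NP_dislocated` and `conclusive`, while a description combining `+vocative` with `conclusive` is rejected by `satisfies`.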
3.2 Information on different parts of a same object
A quick study of examples 10-15, describing cases of clefts and relatives, exhibits a first
property constraining the relative. The latter is incompatible with a conclusive contour
such as that of a cleft. This restriction is represented by the following constraint
stipulating that a set of categories constituting an NP with a relative cannot be realized
with an intonative stress on the noun, which corresponds to a parenthetic contour.
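\[
\mathit{relative} \rightarrow
\begin{bmatrix}
\text{synt} &
  \begin{bmatrix}
  \left\langle
    \begin{bmatrix} \text{cat} & \mathit{Det} \\ \text{anch} & \langle i, j \rangle \end{bmatrix},
    \begin{bmatrix} \text{cat} & \mathit{N} \\ \text{anch} & \langle k, l \rangle \end{bmatrix},
    \begin{bmatrix} \text{cat} & \mathit{Rel} \\ \text{anch} & \langle m, n \rangle \end{bmatrix}
  \right\rangle \\
  \text{anch}\ \langle i, n \rangle
  \end{bmatrix} \\
\text{sem} &
  \begin{bmatrix}
  \text{type} & -\mathit{focus} \\
  \text{anch} & \langle i, l \rangle
  \end{bmatrix} \\
\text{pros} &
  \begin{bmatrix}
  \text{contour} & \mathit{parenthetic} \\
  \text{pos} & \langle i, l \rangle
  \end{bmatrix}
\end{bmatrix}
\]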
In this example, we can notice, on top of the constraint on the relative, the possibility
for the same object to represent information on different parts. Syntactic information then
concerns the entire structure whereas semantic and prosodic information only concerns a
subset.
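The way the anchors of the relative constraint cover different parts of the same object can be checked mechanically. The sketch below assumes that anchors are plain (start, end) pairs of string positions; the helper names and the concrete indices are hypothetical and only stand for $\langle i, j \rangle$, $\langle k, l \rangle$ and $\langle m, n \rangle$.

```python
def covering_span(spans):
    """Smallest (start, end) span containing all the given spans."""
    return (min(start for start, _ in spans), max(end for _, end in spans))


def contains(outer, inner):
    """True if the span `inner` lies within the span `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# Hypothetical positions for Det, N and Rel in the input string.
det, noun, rel = (0, 1), (1, 2), (2, 5)

synt_anchor = covering_span([det, noun, rel])   # <i, n>: the entire structure
sem_pros_anchor = covering_span([det, noun])    # <i, l>: Det and N only
```

Here `contains(synt_anchor, sem_pros_anchor)` holds while the converse does not, which is the asymmetry the constraint relies on: the syntactic description spans the whole NP with its relative, the semantic and prosodic ones only its first part.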
The kind of constraints presented above can represent many different relations between
components of linguistic analysis. However, they cannot provide general descriptions of
phenomena that are not captured by covariation. In particular, it is difficult, or even
impossible, using such an approach to explain why prosodic realization seems less
constrained under certain circumstances. In the case of the distinction between clefts and
relatives, a constraint can characterize the general realization of the relative, but
nothing can be said about the cleft: we can observe a great variability for this
construction. Different corpus studies show that it seems possible to realize clefts
without any specific prosodic mark or with many different marks. The same phenomenon
appears when a semantic feature reinforces a syntactic construction. This is the case of
examples 4-7 that present some cases of simple or multiple dislocations. In the case of
multiple dislocation (example 4), two clitics are in an anaphoric relation with the
detached NPs. In this case, we have a morpho-syntactic criterion (two clitics agreeing with
the NPs) plus a semantic index (the anaphoric relation). We can then consider that,
whatever the prosody, the interpretation is constrained enough by information coming from
syntax and semantics. On the contrary, when the anaphoric relation doesn't exist, as in
examples 6 and 7, prosody is strongly constrained and plays an important role in the
interpretation. For example, the second realization that would consider the two NPs at the
same level (favoring a double dislocation interpretation) is impossible.
Generally speaking, we can consider that, when an utterance cannot be disambiguated with a
morpho-syntactic mark, then prosody plays this role. In examples 2-3, intonation in itself
makes it possible to distinguish between causative and correlative interpretations. More
clearly, in the case of examples 10-12, intonation drives the interpretation as relative or
cleft. A salient turn cannot be assigned to a relative; a cleft interpretation is then
favored in this case. Cleft variability would then come from the fact that this
construction is strongly marked from a morpho-syntactic point of view (more than the
relative). In the same way as for double dislocation, morpho-syntactic and semantic
constraints are strong enough and the interpretation doesn't need more information, for
example prosodic information. This characteristic is also clear in examples 13-15. One can
observe in the same way a syntactic variability allowed by prosody. For example, a rising
intonative contour is classically associated with interrogative constructions. We consider
in this case that, the intonative schema being not very ambiguous, it can be associated
with a heavy weight for prosody (in the same way as clefts have a heavy weight for syntax).
Such a characteristic allows variability of the syntactic form. In a general way, and for
any component of linguistic analysis, the weight value is tied to the ambiguity degree of
the form: the less ambiguous the form, the heavier the weight. For example, a conclusive
contour, specific to certain constructions, or a major break are associated with heavy
weights for prosody. In the same way, marked syntactic constructions with strong
morpho-syntactic elements (such as clefts) correspond to heavy syntactic weights.
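The interplay between weights and variability described above can be made concrete with a toy computation. This is only a sketch under strong assumptions: the text gives neither numeric weights nor a threshold value, so the integers below, the name `may_vary` and the rule "a component may vary when the other components together reach the threshold" are all hypothetical illustrations of the idea.

```python
THRESHOLD = 3  # hypothetical information threshold


def may_vary(component, weights, threshold=THRESHOLD):
    """True if the other components already carry enough information."""
    others = sum(w for name, w in weights.items() if name != component)
    return others >= threshold


# Hypothetical weights: a cleft is strongly marked morpho-syntactically,
# a rising contour on a question is a heavy prosodic mark.
cleft = {"syntax": 3, "semantics": 1, "prosody": 1}
rising_question = {"syntax": 1, "semantics": 0, "prosody": 3}
```

With these values, `may_vary("prosody", cleft)` is true (syntax and semantics suffice, so the prosodic realization of the cleft is free), whereas `may_vary("prosody", rising_question)` is false and `may_vary("syntax", rising_question)` is true, mirroring the syntactic variability allowed by a heavy prosodic weight.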
There exists then a relation between syntax, semantics and prosody that cannot be
represented classically in terms of correspondences between the respective structures. More
precisely, the only constraints that can be proposed in this perspective are those of
cooccurrence restrictions, but without providing an account of variability.