SeineDial: 16th Workshop on the Semantics and Pragmatics of Dialogue (SemDial)

HAL Id: hal-01138035

https://hal.archives-ouvertes.fr/hal-01138035

Submitted on 3 Apr 2015

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


SeineDial: 16th Workshop on the Semantics and Pragmatics of Dialogue (SemDial)

Sarah Brown-Schmidt, Jonathan Ginzburg, Staffan Larsson

To cite this version:

Sarah Brown-Schmidt, Jonathan Ginzburg, Staffan Larsson. SeineDial: 16th Workshop on the Semantics and Pragmatics of Dialogue (SemDial). France. 2012. hal-01138035


Proceedings of SemDial 2012 (SeineDial):

The 16th Workshop on the Semantics and Pragmatics of Dialogue

Sarah Brown-Schmidt, Jonathan Ginzburg, Staffan Larsson (eds.)


Sponsors


MARGUERITE: Il m’a assuré que tu n’avais jamais été amoureux.

JACQUES: Oh! pour cela il a dit vrai.

MARGUERITE: Quoi! Jamais de ta vie?

JACQUES: De ma vie.

MARGUERITE: Comment! à ton âge, tu ne saurais pas ce que c’est qu’une femme?

JACQUES: Pardonnez-moi, dame Marguerite.

MARGUERITE: Et qu’est-ce que c’est qu’une femme?

JACQUES: Une femme?

MARGUERITE: Oui, une femme.

JACQUES: Attendez . . .

Denis Diderot, Jacques le fataliste et son maître

We are happy to present SemDial 2012 (SeineDial), the 16th annual workshop on the Semantics and Pragmatics of Dialogue. This year’s workshop is hosted at Université Paris-Diderot, named for the great encyclopédiste, himself a great writer of dialogues. SeineDial continues the tradition of presenting high-quality talks and posters on dialogue from a variety of perspectives such as formal semantics and pragmatics, artificial intelligence, computational linguistics, and psycholinguistics.

We received 38 submissions for the main session, and each was reviewed by three experts. Sixteen talks were selected for oral presentation; the poster session hosts many of the remaining submissions, together with additional submissions that came in response to a call for late-breaking posters and demos.

We are lucky to have three world-famous researchers as invited speakers: Eve Clark, Geert-Jan Kruijff, and François Recanati. Together they represent a broad range of perspectives and disciplines. We are sure that their talks will stimulate much interest and at least some controversy. Together with the accepted talks and posters, we look forward to a productive and interactive conference.

We are grateful to the reviewers, who invested a lot of time giving very useful feedback both to the program chairs and to the authors, and to the members of the local organizing committee, Anne Abeillé, Margot Colinet, and Gregoire Winterstein, for their hard work in helping to bring the conference to fruition.

We are also very grateful to a number of organizations that provided generous financial support to SeineDial:

• CLILLAC-ARP, Université Paris-Diderot

• Laboratoire de Linguistique Formelle, Université Paris-Diderot

• The Laboratoire d’excellence LabEx-EFL (Empirical Foundations of Linguistics), Paris Sorbonne-Cité

• La région Île-de-France, through their competitive scheme Manifestations scientifiques en Île-de-France hors DIM.

Sarah Brown-Schmidt, Jonathan Ginzburg, Staffan Larsson September, 2012


Programme Committee

Sarah Brown-Schmidt

(University of Illinois at Urbana-Champaign, Co-chair)

Staffan Larsson

(Gothenburg University, Co-chair)

• Jennifer Arnold (University of North Carolina)

• Ron Artstein (Institute for Creative Technologies, LA)

• Ellen Gurman Bard (Edinburgh University)

• Luciana Benotti (Universidad Nacional de Córdoba)

• Claire Beyssade (Institut Jean Nicod, Paris)

• Nate Blaylock (IHMC)

• Johan Bos (Groningen University)

• Susan Brennan (SUNY Stony Brook)

• Mark Core (Institute for Creative Technologies, LA)

• Mariapaola D’Imperio (LPL, Aix-en-Provence)

• David Devault (Institute for Creative Technologies, LA)

• Myroslava Dzikovska (Edinburgh University)

• Jens Edlund (KTH, Stockholm)

• Heather Ferguson (University of Kent)

• Raquel Fernández (University of Amsterdam)

• Victor Ferreira (UC San Diego)

• Claire Gardent (LORIA)

• Kallirroi Georgila (Institute for Creative Technologies, LA)

• Eleni Gregoromichelaki (King’s College, London)

• Anna Hjalmarsson (KTH, Stockholm)

• Amy Isard (Edinburgh University)

• Elsi Kaiser (University of Southern California)

• Andrew Kehler (UC San Diego)

• Ruth Kempson (King’s College, London)

• Ivana Kruijff-Korbayová (DFKI, Saarbrücken)

• Alex Lascarides (Edinburgh University)

• Oliver Lemon (Heriot-Watt University)


• Danielle Matthews (Sheffield)

• Gregory Mills (Edinburgh University)


• Aliyah Morgenstern (Université Paris 3)

• Chris Potts (Stanford University)

• Laurent Prévot (LPL, Aix-en-Provence)

• Matthew Purver (Queen Mary, University of London)

• Antoine Raux (CMU)

• Hannes Rieser (Bielefeld University)

• Verena Rieser (Heriot-Watt University)

• David Schlangen (Bielefeld University)

• Gabriel Skantze (KTH, Stockholm)

• Benjamin Spector (Institut Jean Nicod, Paris)

• Matthew Stone (Rutgers)

• David Traum (Institute for Creative Technologies, LA)

• Nigel Ward (UTEP)

Organizing Committee

• Jonathan Ginzburg (chair)

• Anne Abeillé

• Margot Colinet

• Gregoire Winterstein


CONTENTS

Referential Coordination through Mental Files (p. 1)
François Recanati

Optimal Reasoning About Referential Expressions (p. 2)
Judith Degen and Michael Franke

Using a Bayesian Model of the Listener to Unveil the Dialogue Information State (p. 12)
Hendrik Buschmeier and Stefan Kopp

The Pragmatics of Aesthetic Assessment in Conversation (p. 21)
Saul Albert and Patrick G.T. Healey

A Cognitive Model for Conversation (p. 31)
Nicholas Asher and Alex Lascarides

Meanings as Proposals: a New Semantic Foundation for Gricean Pragmatics (p. 40)
Matthijs Westera

We Did What We Could: An Experimental Study of Actuality Inferences in Dialogues with Modal Verbs (p. 50)
Lori A. Moon

Children Learn Language in Conversation (p. 60)
Eve V. Clark

Cues to Turn Boundary Prediction in Adults and Preschoolers (p. 61)
Marisa Casillas and Michael C. Frank

French Questioning Declaratives: a Corpus Study (p. 70)
Anne Abeillé, Benoît Crabbé, Danièle Godard, and Jean-Marie Marandin

The Use of Gesture to Communicate about Felt Experiences (p. 80)
Nicola Plant and Patrick G.T. Healey

Dialogue Acts Annotation Scheme within Arabic Discussions (p. 88)
Samira Ben Dbabis, Fatma Mallek, Hatem Ghorbel, and Lamia Belguith

Declarative Design of Spoken Dialogue Systems with Probabilistic Rules (p. 97)
Pierre Lison

Communicating with Cost-based Implicature: a Game-Theoretic Approach to Ambiguity (p. 107)
Hannah Rohde, Scott Seyfarth, Brady Clark, Gerhard Jaeger, and Stefan Kaufmann

There Is No Common Ground in Human-Robot Interaction (p. 117)
Geert-Jan M. Kruijff

The Semantics of Feedback (p. 118)
Harry Bunt

Recovering from Non-Understanding Errors in a Conversational Dialogue System (p. 128)
Matthew Henderson, Colin Matheson, and Jon Oberlander

Processing Self-Repairs in an Incremental Type-Theoretic Dialogue System (p. 136)
Julian Hough and Matthew Purver

Modelling Strategic Conversation: the STAC Project (p. 145)
N. Asher, A. Lascarides, O. Lemon, M. Guhe, V. Rieser, P. Muller, S. Afantenos, F. Benamara, L. Vieu, P. Denis, S. Paul, S. Keizer, and C. Dégremont

Toward a Mandarin-French Corpus of Interactional Data (p. 147)
Helen K.Y. Chen, Laurent Prévot, Roxane Bertrand, Béatrice Priego-Valverde, and Philippe Blache

A Model of Intentional Communication: AIRBUS (Asymmetric Intention Recognition with Bayesian Updating of Signals) (p. 149)
J. P. de Ruiter and Chris Cummins

Spatial Descriptions in Discourse: Choosing a Perspective (p. 151)
Simon Dobnik

Modeling Referring Expressions with Bayesian Networks (p. 153)
Kotaro Funakoshi, Mikio Nakano, Takenobu Tokunaga, and Ryu Iida

Helping the Medicine Go Down: Repair and Adherence in Patient-Clinician Dialogues (p. 155)
Christine Howes, Matt Purver, Rose McCabe, Patrick G.T. Healey, and Mary Lavelle

A Spoken Dialogue Interface for Pedestrian City Exploration: Integrating Navigation, Visibility, and Question-Answering (p. 157)
Srinivasan Janarthanam, Oliver Lemon, Xingkun Liu, Phil Bartie, William Mackaness, Tiphaine Dalmas, and Jana Goetze

Influencing Reasoning in Interaction: a Model (p. 159)
Haldur Õim and Mare Koit

Rhetorical Structure for Natural Language Generation in Dialogue (p. 161)
Amy Isard and Colin Matheson

Two Semantical Conditions for Superlative Quantifiers (p. 163)
Maria Spychalska

Modelling Strategic Conversation: Model, Annotation Design and Corpus (p. 167)
Stergos Afantenos, Nicholas Asher, Farah Benamara, Anais Cadilhac, Cedric Dégremont, Pascal Denis, Markus Guhe, Simon Keizer, Alex Lascarides, Oliver Lemon, Philippe Muller, Soumya Paul,

Surprise, Deception and Fiction in Children’s Skype Conferences (p. 169)
Thomas Bliesener

A Multi-threading Extension to State-based Dialogue Management (p. 171)
Tina Klüwer and Hans Uszkoreit

Negotiation for Concern Alignment in Health Counseling Dialogues (p. 173)
Yasuhiro Katagiri, Katsuya Takanashi, Masato Ishizaki, Mika Enomoto, Yasuharu Den, and Yosuke Matsusaka

Exhuming the Procedural Common Ground: Partner-Specific Effects (p. 175)
Gregory Mills

Opponent Modelling for Optimising Strategic Dialogue (p. 177)
Verena Rieser, Oliver Lemon, and Simon Keizer

What Should I Do Now? Supporting Progress in a Serious Game (p. 179)
Lina M. Rojas-Barahona and Claire Gardent

“The Hand Is Not a Banana”: On Developing a Robot’s Grounding Facilities (p. 181)
Julia Peltason, Hannes Rieser, Sven Wachsmuth, and Britta Wrede

Quantitative Experiments on Prosodic and Discourse Units in the Corpus of Interactional Data (p. 183)
Klim Peshkov, Laurent Prévot, Roxane Bertrand, Stéphane Rauzy, and Philippe Blache

Towards Semantic Parsing in Dynamic Domains (p. 185)
Kyle Richardson and Jonas Kuhn

Why Do We Overspecify in Dialogue? An Experiment on L2 Lexical Acquisition (p. 187)
Alexandra Vorobyova, Luciana Benotti, and Frédéric Landragin


Referential Coordination through Mental Files

François Recanati
Institut Jean-Nicod, École Normale Supérieure
29 rue d’Ulm, 75005 Paris, France

recanati@ens.fr

http://www.institutnicod.org

On the standard model, linguistic communication makes it possible for the hearer to entertain the thoughts expressed by the speaker, and what makes that possible is the fact that the thoughts in question are encoded in the speaker’s words.

However, there are challenges both to the idea that communication results in the sharing of thoughts, and to the idea that it works by encoding the thoughts. After briefly reviewing the contextualist challenge, which targets the latter idea, I will turn to another challenge to the standard model, raised by singular thought.

What characterizes singular thoughts, and especially indexical thoughts (the paradigm case), is the fact that the modes of presentation through which one thinks of objects are context-bound and perspectival. Such modes of presentation are best construed as mental files exploiting (and presupposing) certain contextual relations to the reference. This poses the communication problem, first raised by Frege: if indexical thoughts are context-bound and relation-based, how is it possible to communicate them to those who are not in the same context and do not stand in the right relations to the object? Arguably, one has to give up the claim that communication involves thought sharing in such cases.

Following Frege, I will appeal to an important distinction between linguistic and psychological modes of presentation. Psychological modes of presentation are thought ingredients, while linguistic modes of presentation are encoded. Psychological modes of presentation are perspectival and context-bound: they are mental files whose role is to store information one can gain in virtue of standing in certain contextual relations to the object, and which are available only to subjects who are appropriately situated vis-à-vis the object. It follows that thoughts involving such modes of presentation are not shareable with subjects who are not in the right type of context. But linguistic modes of presentation are fixed by the conventions of the language and they are shared by all the language users. They are public and serve to coordinate mental files in communication by constraining them to contain the piece of information they encode. In this way communication takes place even though the indexical thoughts entertained by the speaker are, in some sense, private and cannot be shared by the audience. Communication no longer involves the replication of thoughts, only their coordination.

In the last part of the talk I will apply the coordination model of communication to the referential use of definite descriptions, and I will discuss a key objection based on the distinction between semantic reference and speaker’s reference.


Optimal Reasoning About Referential Expressions

Judith Degen

Dept. of Brain and Cognitive Sciences
University of Rochester

jdegen@bcs.rochester.edu

Michael Franke
ILLC, Universiteit van Amsterdam
m.franke@uva.nl

Abstract

The iterated best response (IBR) model is a game-theoretic approach to formal pragmatics that spells out pragmatic reasoning as back-and-forth reasoning about interlocutors’ rational choices and beliefs (Franke, 2011; Jäger, 2011). We investigate the comprehension and production of referential expressions within this framework. Two studies manipulating the complexity of inferences involved in comprehension (Exp. 1) and production (Exp. 2) of referential expressions show an intriguing asymmetry: comprehension performance is better than production in corresponding complex inference tasks, but worse on simpler ones. This is not predicted by standard formulations of IBR, which make categorical predictions about rational choices. We suggest that taking into account quantitative information about beliefs of reasoners results in a better fit to the data, thus calling for a revision of the game-theoretic model.

1 Introduction

Reference to objects is pivotal in communication and a central concern of linguistic pragmatics. If interlocutors were ideal reasoners, speakers would choose the most convenient referential expression that is sufficiently discriminating given the hearer’s perspective, while hearers would choose the referent for which an observed referential expression is optimal given the speaker’s perspective. But it would be folly to assume that humans are ideal reasoners, so the question is: how much do interlocutors take each other’s perspective into account when producing and interpreting referential expressions?

A lot of work has been dedicated to this issue. For example, computational linguists have investigated efficient and natural rules for generating and comprehending referential expressions (see Dale and Reiter (1995) and Golland et al. (2010) for work directly related to ours). Many empirical studies have addressed the more specific questions of whether, when and/or how hearers take speakers’ privileged information into account (Keysar et al., 2000; Keysar et al., 2003; Hanna et al., 2003; Heller et al., 2008; Brown-Schmidt et al., 2008). Also, eye-tracking studies in the visual-world paradigm have been used to investigate how quantity reasoning influences the interpretation of referential expressions (Sedivy, 2003; Grodner and Sedivy, 2011; Huang and Snedeker, 2009; Grodner et al., 2010). In recent work closely related to ours, Stiller et al. (2011) and Frank and Goodman (2012) proposed a Bayesian model of producing and comprehending referential expressions in a game setting similar to the kind we consider here. We compare these related approaches more closely in Section 6. Despite these various efforts, it is still a matter of debate whether or to what extent interlocutors routinely consider each other’s perspective.

In order to contribute to this question, we follow a recent line of experimental approaches to formal epistemology and game theory (Hedden and Zhang, 2002; Crawford and Iriberri, 2007) to investigate how much strategic back-and-forth reasoning speakers and hearers employ in abstract language games.

The tasks we investigate translate directly to the kind of signaling games that have variously been used to account for a number of pragmatic phenomena, most notably conversational implicatures (see, e.g., Parikh (2001), Benz and van Rooij (2007) or Jäger (2008)). A benchmark model of idealized step-by-step reasoning, called the iterated best response (IBR) model, exists for these games (Franke, 2011; Jäger, 2011). IBR makes concrete predictions about the depth of strategic reasoning required to “solve” different kinds of referential language games, so that by varying the difficulty of our referential tasks, it is possible to both: (i) test the predictions of IBR models of pragmatic reasoning and (ii) determine the extent to which speakers and hearers reason strategically about the use of referential expressions.

Our data show that participants perform better at reasoning tasks that IBR predicts to involve fewer inference steps. This holds for both comprehension and production. However, our data also show an interesting asymmetry: comprehension performance is better than production in corresponding complex inference tasks, but worse on simpler ones. This is not predicted by standard formulations of IBR, which make categorical predictions about rational choices. However, it is predicted by a more nuanced variation of IBR that pays attention to the quantitative information in the belief hierarchies postulated by the model.

Section 2 introduces signaling games as abstract models of referential language use. Section 3 outlines the relevant notions of IBR reasoning. Sections 4 and 5 describe our comprehension and production studies, respectively. Section 6 discusses the results.

2 Referential Language Games

If speaker and hearer share a commonly observable set T of possible referents in their immediate environment, referential communication has essentially the structure of a signaling game: the sender S knows which t ∈ T she wants to talk about, but the receiver R does not; the speaker chooses some description m; if R can identify the intended referent, communication is successful, otherwise a failure. Such a game consists of a set T (of possible referents), a set M of messages that S could use, a prior probability distribution Pr over T that captures R’s prior expectation about the most likely intended referent, and a utility function that captures the players’ preferences in the game. We assume that S and R are both interested in establishing reference, so that if t is the intended referent and t′ is R’s guess, then for some constants s > f: U(t, t′) = s if t = t′ and f otherwise. Additionally, if messages are meaningful, this is expressed by a denotation function [[m]] ⊆ T that gives the set of referents to which m is applicable (e.g., of which it is true).
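The game tuple just described lends itself to a direct encoding. Below is a minimal sketch in Python; the referent and message names, the payoff values, and the lexicon are illustrative assumptions rather than the paper's actual materials (Fig. 1 is not reproduced here):

```python
# A referential signaling game: referents T, messages M, a flat prior
# Pr over T, a success/failure utility U, and a denotation function [[m]].
# All concrete names and the lexicon are illustrative assumptions.

T = ["tt", "tc", "td"]                     # possible referents
M = ["mt", "mc", "md1", "md2"]             # messages S may use
Pr = {t: 1 / len(T) for t in T}            # flat prior over referents
s, f = 1.0, 0.0                            # payoffs: success > failure

def U(t, t_guess):
    """Both players earn s if R's guess matches S's referent, else f."""
    return s if t == t_guess else f

# [[m]] ⊆ T: the set of referents each message is true of.
denotation = {"mt": {"tc"}, "mc": {"tt", "tc"},
              "md1": {"td"}, "md2": {"td"}}

def literal_posterior(m):
    """Condition the flat prior on the denotation of m (a literal hearer)."""
    mass = {t: Pr[t] if t in denotation[m] else 0.0 for t in T}
    total = sum(mass.values())
    return {t: p / total for t, p in mass.items()}

post = literal_posterior("mc")   # uniform over {tt, tc}, zero on td
```

A purely literal hearer thus gets no further than the semantic meaning; the pragmatic refinements discussed next sharpen this posterior.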

Consider, e.g., the situations depicted in Fig. 1. There are three possible referents T = {tt, tc, td} in the form of monsters and robots wearing one accessory each that both S and R observe. Since there is no reason to prefer any referent over another, we assume that Pr is a flat distribution over T. There are also four possible messages M = {mt, mc, md1, md2} with some intuitively obvious “semantic meaning”. For example, the message mc for red hat would intuitively be applicable to either the robot tt or the green monster tc, so that [[mc]] = {tt, tc}.

Signaling games like those in Fig. 1 are the basis for the critical conditions of our experiments (see also Sections 4 and 5), where we test which referent subjects choose for a given trigger message and which message they choose for a trigger referent. Trigger items for comprehension and production experiments are marked with an asterisk in Fig. 1. Indices t, c, d stand for target, competitor and distractor respectively.

We refer to a game as in Fig. 1(a) as the simple implicature condition, because it involves a simple scalar implicature. Hearing trigger message mc, R should reason that S must have meant target state tt, and not competitor state tc, because if S had wanted to refer to the latter she could have used an unambiguous message. Conversely, when S wants to refer to trigger state tc, she should not use the true but semantically ambiguous message mc, because she has a stronger message mt. Similarly, we refer to a game as in Fig. 1(b) as the complex implicature condition, because it requires performing scalar reasoning twice in sequence (see Fig. 2 later on).


[Figure 1 (not reproduced): Target implicature conditions, panels (a) simple and (b) complex. Hearers choose one of the possible referents T = {tt, tc, td}. Speakers have message options M = {mt, mc, md1, md2}. Trigger items are indicated with asterisks: e.g., tt is the referent to be communicated on complex production trials.]

[Figure 2 (not reproduced): Qualitative predictions of the IBR model for the simple (a) and complex (b) conditions. The graphs give the set of best responses at each level of strategic reasoning as a mapping from the left to the right.]


3 IBR Reasoning

The IBR model defines two independent strands of strategic reasoning about language use: one that starts with a naïve (level-0) receiver R0 and one that starts with a naïve sender S0 (Franke, 2011; Jäger, 2011). If utilities are as indicated and priors are flat, the behavior of level-0 players is predicted to be a uniform choice over options that conform to the semantic meaning of messages: R0(m) = [[m]] and S0(t) = {m | t ∈ [[m]]}. Sophisticated player types of level k+1 play any rational choice with equal probability given a belief that the opponent player is of level k. For our experimental examples, the “light” system of Franke (2011) applies, where sophisticated types are defined as:¹

$$S_{k+1}(t) = \begin{cases} \arg\min_{m \in R_k^{-1}(t)} |R_k(m)| & \text{if } R_k^{-1}(t) \neq \emptyset \\ S_0(t) & \text{otherwise} \end{cases}$$

$$R_{k+1}(m) = \begin{cases} \arg\min_{t \in S_k^{-1}(m)} |S_k(t)| & \text{if } S_k^{-1}(m) \neq \emptyset \\ R_0(m) & \text{otherwise} \end{cases}$$

The sequences of best responses for the simple and complex games from Fig. 1 are given in Fig. 2. On this purely qualitative picture, the IBR model makes the same predictions for comprehension and production. In the simple condition, the trigger item is mapped to either target or competitor with equal chance by naïve players; all higher-level types map the trigger item to the target item with probability one. In the complex condition, the trigger items are mapped to target and competitor in levels 0 and 1 with equal probability, but uniquely to the target item for k ≥ 2.

The sequences in Fig. 2 only consider the actual best responses of S and R, but not the more nuanced quantitative information that gives rise to these. Best responses are defined as those that maximize expected utility given what the players believe about how likely each choice option would lead to communicative success. The relevant expected success probabilities are given in Table 1 for sophisticated types. (Naïve types have no or only trivial beliefs about the game.)

¹Here R_k^{-1}(t) = {m | t ∈ R_k(m)}; likewise for S_k^{-1}.

For reasons of space, we confine ourselves to the intuition behind these numbers. E.g., in the simple condition R1 believes that the trigger message is used by naïve senders who want to refer to tt or tc. But naïve senders who want to refer to tc would also use mt with probability 1/2. So, by Bayesian conditionalization, after hearing mc, R1 believes the intended referent is tt with probability 2/3.
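This one-step Bayesian conditionalization can be checked mechanically. A minimal sketch, assuming a reconstruction of the Fig. 1(a) lexicon in which mc is true of both tt and tc while tc also has the unambiguous option mt:

```python
# R1's posterior after hearing the ambiguous trigger message mc in the
# simple condition. S0 picks uniformly among the messages true of its
# referent; the behavior table below is an assumed reconstruction.

prior = {"tt": 1/3, "tc": 1/3, "td": 1/3}   # flat prior over referents
S0 = {"tt": ["mc"],                          # only mc is true of tt
      "tc": ["mt", "mc"],                    # tc has a stronger option mt
      "td": ["md1", "md2"]}

def likelihood(m, t):
    """Probability that a naive sender uses m when referring to t."""
    return 1 / len(S0[t]) if m in S0[t] else 0.0

def posterior(m):
    """Bayesian conditionalization: Pr(t | m) is proportional to Pr(t) * Pr(m | t)."""
    joint = {t: prior[t] * likelihood(m, t) for t in prior}
    total = sum(joint.values())
    return {t: p / total for t, p in joint.items()}

post = posterior("mc")
# tt: (1/3 * 1) / (1/3 * 1 + 1/3 * 1/2) = 2/3, matching the text.
```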

Notice that while R’s success expectations always sum to one (there is always exactly one intended referent), S’s success expectations need not (several messages could be believed to lead to successful communication). A further difference concerns when S and R are sure of communicative success. In the simple condition, S1 is already sure of success, but only R≥2 is. In the complex condition, R2 is already sure of success, but only S≥3 is. So, if we assume that human reasoners aim for certainty of communicative success in pragmatic reasoning, the simple condition is less demanding in production than in comprehension, while for the complex condition the reverse is the case.
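The qualitative IBR dynamics themselves are straightforward to simulate. The sketch below implements the "light" best-response definitions for the simple condition, again under an assumed reconstruction of the Fig. 1(a) lexicon rather than the paper's exact materials:

```python
# "Light" IBR (after Franke 2011) for the simple condition. R_k maps each
# message to the set of referents a level-k receiver may pick; S_k maps
# each referent to the set of messages a level-k sender may use.
# The lexicon is an assumed reconstruction of Fig. 1(a).

SEM = {"mt": {"tc"},             # unambiguous message for the competitor
       "mc": {"tt", "tc"},       # ambiguous trigger message
       "md1": {"td"}, "md2": {"td"}}
REFERENTS = {"tt", "tc", "td"}

R0 = {m: set(ts) for m, ts in SEM.items()}          # R0(m) = [[m]]
S0 = {t: {m for m, ts in SEM.items() if t in ts}    # S0(t) = {m | t in [[m]]}
      for t in REFERENTS}

def argmin(options, key):
    """All options attaining the minimal key value."""
    best = min(key(x) for x in options)
    return {x for x in options if key(x) == best}

def step(R_k, S_k):
    """One best-response round: (R_k, S_k) -> (R_{k+1}, S_{k+1})."""
    S_next, R_next = {}, {}
    for t in REFERENTS:
        msgs = {m for m in SEM if t in R_k[m]}       # R_k^{-1}(t)
        S_next[t] = argmin(msgs, lambda m: len(R_k[m])) if msgs else S0[t]
    for m in SEM:
        refs = {t for t in REFERENTS if m in S_k[t]}  # S_k^{-1}(m)
        R_next[m] = argmin(refs, lambda t: len(S_k[t])) if refs else R0[m]
    return R_next, S_next

R1, S1 = step(R0, S0)
R2, S2 = step(R1, S1)
# R1 already resolves the trigger message mc to the target tt, and S1
# avoids the ambiguous mc when referring to the competitor tc.
```

Iterating `step` further leaves the level-1 behavior fixed, consistent with the Fig. 2 prediction that all higher-level types map the trigger item to the target.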

4 Experiment 1

Exp. 1 tested participants’ behavior in a comprehension task that used instantiations of the signaling games described in Section 2.

4.1 Methods

Participants. Using Amazon’s Mechanical Turk, 30 workers were paid $0.60 to participate. All were naïve as to the purpose of the experiment, and participation was restricted to US IP addresses.

Two participants did the experiment twice. Their second run was excluded.

Procedure and Materials. Participants engaged in a referential comprehension task. On each trial they saw three objects on a display. Each object differed systematically along two dimensions: its ontological kind (robot or one of two monster species) and accessory (scarf or either blue or red hat). In addition to these three objects, participants saw a pictorial message that they were told was sent to them by a previous participant whose job it was to get them to pick out one of these three objects. They


level   simple R        simple S         complex R       complex S
1       ⟨2/3, 1/3, 0⟩   ⟨1, 1/2, 0, 0⟩   ⟨1/2, 1/2, 0⟩   ⟨1/2, 1/2, 0, 1/3⟩
2       ⟨1, 0, 0⟩       ⟨1, 0, 0, 0⟩     ⟨1, 0, 0⟩       ⟨1/2, 0, 0, 1/3⟩
3       ⟨1, 0, 0⟩       ⟨1, 0, 0, 0⟩     ⟨1, 0, 0⟩       ⟨1, 0, 0, 1/3⟩

Table 1: Success expectations for the trigger items in the simple and complex conditions. Success expectations for R are given in order ⟨tt, tc, td⟩; those for S in order ⟨mt, mc, md1, md2⟩.

were told that the previous participant was allowed to send a message expressing only one feature of a given object, and that the messages the participant could send were furthermore restricted to monsters and hats. The four expressible features were visible to participants at the bottom of the display on every trial.

Participants initially played four sender trials.

They saw three objects, one of which was highlighted with a yellow rectangle, and were asked to click on one of four pictorial messages to send to another Mechanical Turk worker to get them to pick out the highlighted object. They were told that the other worker did not know which object was highlighted but knew which messages could be sent. The four sender trials contained three unambiguous and one ambiguous trial, which functioned as fillers in the main experiment.

Participants saw 36 experimental trials, with a 2:1 ratio of fillers to critical trials. Of the 12 critical trials, 6 constituted a simple implicature situation and 6 a complex one as defined in Section 2 (see also Fig. 1).

Target position was counterbalanced (each critical trial occurred equally often in each of the 6 possible orders of target, competitor, and distractor), as were the target’s features and the number of times each message was sent. Of the 24 filler trials, half used the displays from the implicature conditions, but the target was either tc or td (as identified unambiguously by the trigger message). This was also intended to prevent learning associations of display type with the target. On the other 12 filler trials, the target was either entirely unambiguous or entirely ambiguous given the message. That is, there was either only one object with the feature denoted by the trigger message, or there were two identical objects that were equally viable target candidates.

Trial order was pseudo-randomized such that there were two lists (reverse order) of three blocks, where critical trials and fillers were distributed evenly over blocks. Each list began with three filler trials.

4.2 Results and Discussion

Proportions of choice types are displayed in Fig. 3(a). As expected, participants were close to ceiling in choosing the target on unambiguous filler trials but at chance on ambiguous ones. This confirms that participants understood the task. On critical implicature trials, participants’ performance was intermediate between ambiguous and unambiguous filler trials. On simple implicature trials, participants chose the target 79% of the time and the competitor 21% of the time. On complex implicature trials, the target was chosen less often (54% of the time).

To test whether the observed differences in target choices were significant, we fitted a logistic mixed-effects regression to the data. Trials on which the distractor was selected were excluded to allow for a binary outcome variable (target vs. no target choice); this led to the exclusion of 5% of the data. The model predicted the log odds of choosing a target over a competitor from a Helmert-coded CONDITION predictor, a predictor coding the TRIAL number to account for learning effects, and their interaction. Three Helmert contrasts over the four relevant critical and filler conditions were included in the model, comparing each condition with a relatively less skewed distribution against the more skewed distributions (in order: ambiguous fillers, complex implicatures, simple implicatures, unambiguous fillers). This allowed us to capture whether the differences in distributions for neighboring conditions suggested by Fig. 3(a) were significant. We included the maximal random effect structure that allowed the model to converge:² by-participant random slopes for CONDITION and TRIAL and by-item random intercepts. Results are given in Table 2.

²For the procedure that was used to generate the random effect structure, see http://hlplab.wordpress.com/2009/05/14/random-effect-structure/

                           Coef β   SE(β)     z        p
(INTERCEPT)                 1.81    0.22     8.3    <.0001
AMBIG.VS.REST              −2.56    0.45    −5.6    <.0001
COMPLEX.VS.EASIER          −3.20    0.53    −6.0    <.0001
SIMPLE.VS.UNAMBIG          −2.68    0.81    −3.3    <.001
TRIAL                       0.00    0.01     0.3     0.8
TRIAL:AMBIG.VS.REST        −0.07    0.03    −2.6    <.05
TRIAL:COMPLEX.VS.EASIER    −0.01    0.03    −0.4     0.7
TRIAL:SIMPLE.VS.UNAMBIG     0.08    0.05     1.7     0.08

Table 2: Model output of Exp. 1. AMBIG.VS.REST, COMPLEX.VS.EASIER, and SIMPLE.VS.UNAMBIG are the Helmert-coded condition contrast predictors, in order.
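For readers unfamiliar with this coding scheme, forward Helmert contrasts for k ordered levels compare each level against the mean of the later ones. The following is a generic sketch of such a contrast matrix; the scaling conventions of the authors' statistics software may differ:

```python
# Forward Helmert coding for k ordered levels: contrast j compares level j
# against the mean of all later levels, yielding k-1 contrast columns.
# A generic illustration, not the exact matrix used by the authors.

def helmert_contrasts(k):
    """Return a k x (k-1) contrast matrix as nested lists."""
    C = [[0.0] * (k - 1) for _ in range(k)]
    for j in range(k - 1):
        C[j][j] = (k - 1 - j) / (k - j)       # weight for level j itself
        for i in range(j + 1, k):
            C[i][j] = -1.0 / (k - j)          # equal negative weight after
    return C

# Condition order used in the text: ambiguous fillers, complex
# implicatures, simple implicatures, unambiguous fillers.
C = helmert_contrasts(4)
# Column 0 plays the role of AMBIG.VS.REST, column 1 of
# COMPLEX.VS.EASIER, column 2 of SIMPLE.VS.UNAMBIG. Each column sums
# to zero, and the contrasts are mutually orthogonal.
```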

All Helmert contrasts reached significance at p < .001. That is, all target/competitor distributions shown in Fig. 3(a) differ from each other.

There was no main effect of TRIAL, indicating that no learning took place overall during the course of the experiment. However, there were significant interactions, suggesting selective learning in a subset of conditions. In particular, there was a significant interaction between TRIAL and the Helmert contrast coding the difference between ambiguous fillers and the rest of the conditions (AMBIG.VS.REST, β = −.05, SE = .02, p < .05) and a marginally significant interaction between TRIAL and the Helmert contrast coding the difference between the simple implicature and unambiguous filler condition (SIMPLE.VS.UNAMBIG, β = .08, SE = .05, p = .08). Further probing the simple effects revealed that participants chose the target more frequently later in the experiment in the simple and complex conditions.

This was evidenced by a main effect of TRIAL on that subset of the data (β=.03,SE=.01,p < .05) but no interactions with condition. There were no learning effects in the ambiguous and unambiguous filler conditions; participants were at chance for am- biguous items and at ceiling for unambiguous items throughout. This suggests that at least some partici- pants became aware that there was an optimal strat- egy and began to employ it as the experiment pro- gressed.

We next address the question of whether the data supports the within-participant distributions predicted by standard IBR. Recall from Section 2 that for the simple condition, IBR predicts R0 players to have a uniform distribution over target and competitor choices and R≥1 players to choose only the target. For the complex condition, the uniform distribution is predicted for both R0 and R1 players, while only target choices are expected for R≥2 players.

This is not borne out (see Fig. 4(a)). On the one hand, there were 3 participants in the simple condition and 5 in the complex condition who chose the target on half of the trials and could thus be classified as R0 (or R1 in the complex condition). Similarly, there were 11 participants in the simple condition and one in the complex condition who chose only targets and thus behaved as sophisticated receivers according to IBR. On the other hand, the majority of participants' distributions over target and competitor choices deviated from both the uniform and the target-only distribution.

One possibility is that some participants' type shifted from Rk to Rk+1 as the experiment progressed. That is, they may have shifted from initially choosing targets and competitors at random to choosing only targets. However, while it is the case that overall more targets were chosen later in the experiment in both implicature conditions, there was nevertheless within-participant variation in choices late in the experiment inconsistent with a categorical shift. Another possibility is that the experiment was too short to observe this categorical shift.
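The receiver hierarchy discussed above can be sketched computationally. The following Python snippet is our own illustration (the labels m_A, m_B, t, c are not from the paper; IBR indexing conventions also vary across formulations): a literal receiver is uniform over the referents compatible with a message, and one sender–receiver round of best responses already yields a receiver that resolves the ambiguous message to the target, i.e. the simple implicature.

```python
import numpy as np

# Minimal simple-implicature game (our own toy labels):
# state t has feature A only, state c has features A and B.
M = np.array([[1, 1],    # m_A: literally true of t and c
              [0, 1]])   # m_B: literally true of c only

def normalize(X):
    X = X.astype(float)
    s = X.sum(axis=1, keepdims=True)
    s[s == 0] = 1.0
    return X / s

def literal_receiver(M):
    # R0: condition uniformly on the literal meaning of each message
    return normalize(M)

def sender_best_response(R):
    # Next sender level: per state, put all mass on the message(s)
    # that maximise the receiver's probability of recovering it.
    U = R.T  # states x messages: recovery probability = expected utility
    B = (U == U.max(axis=1, keepdims=True)).astype(float)
    return normalize(B)

def receiver_best_response(S):
    # Next receiver level: Bayesian inversion of the sender
    # under a uniform prior over states.
    return normalize(S.T)

R0 = literal_receiver(M)          # hearing m_A: 50/50 over t and c
S1 = sender_best_response(R0)     # state t -> m_A, state c -> m_B
R1 = receiver_best_response(S1)   # hearing m_A: target t for sure
```

The final receiver maps the ambiguous message m_A to the target with probability 1, matching the prediction that sophisticated receivers choose only targets in the simple condition.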

5 Experiment 2

Exp. 2 tested participants' behavior in a production task that used instantiations of the signaling games described in Section 2.

Figure 3: Proportions of target, competitor, and distractor choices in implicature and filler conditions (Exps. 1 & 2). Panel (a): Experiment 1; panel (b): Experiment 2. Conditions (x-axis): ambiguous filler, complex implicature, simple implicature, unambiguous filler; y-axis: proportion of choices (0.0–1.0); response types: target, competitor, distractors.

Figure 4: Distribution of participants over number of target choices in implicature conditions (Exps. 1 & 2). Panel (a): Experiment 1; panel (b): Experiment 2. X-axis: number of target responses (out of 6); y-axis: number of subjects; separate distributions for simple and complex implicatures.

5.1 Methods

Participants. Using Amazon's Mechanical Turk, 30 workers were paid $0.60 to participate under the same conditions as in Exp. 1. Data from two participants whose comments indicated that not all images displayed properly were excluded.

Procedure and Materials. The procedure was the same as on the sender trials in Exp. 1. Participants saw 36 trials with a 2:1 ratio of fillers to critical trials. There were 12 critical trials (6 simple and 6 complex implicature situations as in Fig. 1). Half of the fillers used the same displays as the implicature trials, but one of the other two objects was highlighted. This meant that the target message was either unambiguous (e.g. when the highlighted object was tt in Fig. 1(a) the target message was mc) or entirely ambiguous. The remaining 12 filler trials employed other displays with either entirely unambiguous or ambiguous target messages. Two experimental lists were created and counterbalancing was ensured as in Exp. 1.

5.2 Results and Discussion

Proportions of choice types are displayed in Fig. 3(b). As in Exp. 1, participants were close to ceiling for target message choices on unambiguous filler trials but at chance on ambiguous ones. On critical implicature trials, participants' performance was slightly different than in Exp. 1. Most notably, the distribution over target and competitor choices in the simple implicature condition was more skewed than in Exp. 1 (95% targets, 5% competitors), while it was more uniform than in Exp. 1 on complex implicature trials (50% targets, 47% competitors).

We again fitted a logistic mixed-effects regression model to the data. Trials on which the distractor messages were selected were excluded to allow for a binary outcome variable (target vs. competitor choice). This led to an exclusion of 2% of trials. In addition, the unambiguous filler condition is not included in the analysis reported here since there was only 1 non-target choice after exclusion of distractor choices, leading to unreliable model convergence. Thus, as in Exp. 1, CONDITION was entered into the model as a Helmert-coded variable but with only two contrasts, one comparing the simple implicature condition to the mean of ambiguous fillers and the complex implicature condition (SIMPLE.VS.HARDER), and another one comparing the ambiguous fillers with the complex implicatures (AMBIG.VS.COMPLEX). The model reported here further does not contain a TRIAL predictor to control for learning effects because model comparison revealed that it was not justified (χ²(1) = 0.06, p = .8). That is, there were no measurable learning effects in this experiment. We included the maximal random effects structure that allowed the model to converge: by-participant random slopes for CONDITION and by-item random intercepts.

The SIMPLE.VS.HARDER Helmert contrast reached significance (β = 3.04, SE = 0.5, p < .0001) while AMBIG.VS.COMPLEX did not (β = 0.08, SE = 0.41, p = .9). That is, there was no difference between choosing a target in the ambiguous filler condition and in the complex implicature condition, suggesting that participants were at chance in deriving complex implicatures in production. However, they were close to ceiling in choosing targets in the simple implicature condition.
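The model comparison above is a likelihood-ratio test: twice the log-likelihood difference between the nested models with and without TRIAL is referred to a χ² distribution with 1 degree of freedom. A minimal Python sketch of the p-value computation, using the closed form of the 1-df χ² survival function (not the authors' actual analysis code):

```python
import math

def chi2_sf_1df(x):
    """Survival function (p-value) of a chi-squared variate with 1 df.

    For 1 df, X = Z^2 with Z standard normal, so
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))

# The reported statistic chi^2(1) = 0.06 corresponds to p ~ .8,
# i.e. adding TRIAL does not significantly improve the model fit.
p = chi2_sf_1df(0.06)
```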

The observed within-participant distributions are better predicted by the qualitative version of IBR than in Exp. 1 (see Fig. 4(b)). For the simple condition, IBR predicts S0 players to have a uniform distribution over target and competitor choices and S≥1 players to choose only the target. For the complex condition, the uniform distribution is predicted for both S0 and S1 players, while only target choices are expected for S≥2 players.

In the simple implicature condition, 75% of participants were perfect S1 reasoners. The remaining 25% chose almost only targets. That is, participants very consistently computed the implicature. In contrast, the bulk of participants chose targets versus competitors at random in the complex implicature condition. Only 2 participants chose the target 5 out of 6 times.

Comparing these results to the results from Exp. 1, we see the following pattern: in production the simple one-level implicatures are more readily computed than in comprehension, while the more complex two-level implicatures are more readily computed in comprehension than in production.

That is, rather than comprehension mirroring production, in this paradigm there is an asymmetry between the two. This is consistent with the quantitative interpretation of IBR (as described in Section 3) that takes into account players' uncertainty about communicative success.

6 General Discussion

In two studies using an abstract language game we investigated speakers' and hearers' strategic reasoning about referential descriptions. Most generally, our results clearly favor step-wise solution concepts like IBR over equilibrium-based solution concepts (e.g. Parikh (2001)) as predictors of participants' pragmatic reasoning: our results suggest that interlocutors do take perspective and simulate each others' beliefs, although (a) message and interpretation choice behavior is not always optimal and (b) perspective-taking decreases as the number of reasoning steps required to arrive at the optimal response, as predicted by IBR, increases.

We also found evidence for an intriguing asymmetry between production and comprehension. While not predicted by the standard formulation of the IBR model, this asymmetry is consistent with an interpretation of IBR that takes into account the uncertainty that interlocutors have about the probability of communicative success given a restricted set of message and interpretation options. This calls for a revision of the IBR model to incorporate more nuanced quantitative information. Since, moreover, there is a substantial amount of individual variation, further investigating the role of individual differences in perspective-taking (e.g. Brown-Schmidt (2009)) promises to be a fruitful avenue of further research that could inform model revisions.

It could be objected that the comparison of implicatures across experiments may be problematic due to the different nature of the tasks involved in the production vs. comprehension experiments and differences underlying the involved inference processes. However, note that the version of the IBR model that takes into account interlocutor uncertainty predicts the asymmetry between production and comprehension that we found precisely by integrating some of the differences involved in the two processes: most importantly, since conversation is modelled as a dynamic game, the sender reasons about the future behavior of the receiver, while the receiver reasons "backward", so to speak, using Bayesian conditionalization, about the most likely initial state the sender could have been in; this gives rise, as we have seen, to different predictions about when a speaker or a hearer can be absolutely certain of communicative success. How this difference is implemented mechanistically is an interesting question that merits further investigation.

Frank and Goodman (2012) report the results of an experiment using a referential game almost identical to ours and show that a particular Bayesian choice model very reliably predicts the observed data for both comprehension and production. In fact, the proposed Bayesian model is a variant of IBR reasoning that considers only a level-1 sender and a level-2 receiver, but assumes smoothed best response functions at each optimization step. In a smoothed IBR model, players' choices are stochastic, with choice probabilities proportional to expected utilities (see Rogers et al. (2009) for a general formulation of such a model in game-theoretic terms).
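A smoothed sender step of the kind just described replaces the hard argmax over messages with a soft-max choice rule. The Python fragment below is our own illustration (the rationality parameter lam and the toy receiver matrix are assumptions, not taken from either paper): as lam grows, the soft response converges to the deterministic best response of standard IBR.

```python
import numpy as np

def soft_sender(R, lam=4.0):
    # R: receiver matrix (messages x states). The sender's expected
    # utility of a message in a state is the receiver's probability
    # of recovering that state from the message.
    U = R.T                        # states x messages
    W = np.exp(lam * U)            # soft-max / quantal response
    return W / W.sum(axis=1, keepdims=True)

# Toy literal receiver for a two-state, two-message game (assumed values):
R0 = np.array([[0.5, 0.5],   # ambiguous message
               [0.0, 1.0]])  # specific message

S_soft = soft_sender(R0, lam=4.0)    # stochastic, biased towards targets
S_hard = soft_sender(R0, lam=50.0)   # approaches the hard best response
```

With moderate lam, the sender still occasionally picks suboptimal messages, which is what allows such a model to fit the non-categorical response distributions observed in the experiments.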

This suggests a straightforward agenda for future work: combining our approach and that of Frank and Goodman (2012), smoothed IBR models that allow various strategic types for speakers and listeners should be further tested on empirical data.

In related work investigating comprehenders' capacity for deriving ad hoc scalar implicatures, Stiller et al. (2011) found that subjects could draw simple implicatures of the type we report above in a setup very similar to ours, but failed to draw complex ones. In contrast, our comprehenders performed above chance in the complex condition (albeit only slightly so). One possible explanation for this difference is that unlike Stiller et al. (2011), we restricted the set of message alternatives and also made it explicit to participants that a message could only denote one feature. This highlights the importance of (mutual knowledge of) the set of alternatives assumed by interlocutors in a particular communicative setting. While we restricted this set explicitly, in natural dialogue there is likely a variety of factors that determine what constitutes an alternative.

This suggests that future extensions of this work should move towards an artificial language paradigm. For example, whether a given message constitutes an alternative is likely to be affected by message complexity, which was held constant in our setup by using pictorial messages. Artificial language paradigms allow for investigating the effect of message complexity on inferences of the type reported here. Similarly, it will be important to further test the quantitative predictions made by IBR, e.g. by parametrically varying the payoffs of communicative success and failure, s and f, and the interaction thereof with message complexity.

One question that arises in connection with the restrictions we imposed on the set of available pictorial messages is the extent to which our results are transferable to natural language use. This is a legitimate concern that we would have to address empirically in future work. But notice also that, firstly, there is no a priori reason to believe that reasoning about natural language use and reasoning about our abstract referential games should necessarily differ – indeed it has been noted as early as Grice (1975) that conversational exchanges constitute but one case of rational communicative behavior. More importantly, even if reasoning about natural language were different in kind from strategic reasoning in general, the kind of strategic IBR reasoning we address here is a specific variety of reasoning that has been explicitly proposed in the literature as a model of pragmatic reasoning. The reported experiments are thus relevant in at least as far as they are the first empirical test of whether human reasoners are, in general, able to perform this kind of strategic reasoning in a task that translates the proposed pragmatic context models as directly as possible into an experimental setting.

We conclude that the studies reported are an encouraging first step towards validating game-theoretic approaches to formal pragmatics, which are well-suited to modeling pragmatic phenomena and generating quantitative, testable predictions about language use. The future challenge, as we see it, lies in fine-tuning the formal models alongside further careful empirical investigation.


Acknowledgements

We thank Gerhard Jäger, T. Florian Jaeger, and Michael K. Tanenhaus for fruitful discussion. This work was partially supported by a EURO-XPRAG grant awarded to the authors and NIH grant HD-27206 to Michael K. Tanenhaus.

References

Anton Benz and Robert van Rooij. 2007. Optimal assertions and what they implicate. Topoi, 26:63–78.

Sarah Brown-Schmidt, Christine Gunlogson, and Michael K. Tanenhaus. 2008. Addressees distinguish shared from private information when interpreting questions during interactive conversation. Cognition, 107:1122–1134.

Sarah Brown-Schmidt. 2009. The role of executive function in perspective taking during online language comprehension. Psychonomic Bulletin and Review, 16(5):893–900.

Vincent P. Crawford and Nagore Iriberri. 2007. Fatal attraction: Salience, naïveté, and sophistication in experimental "hide-and-seek" games. The American Economic Review, 97(5):1731–1750.

Robert Dale and Ehud Reiter. 1995. Computational interpretations of the Gricean maxims in the generation of referring expressions. Cognitive Science, 19(2):233–263.

Michael C. Frank and Noah D. Goodman. 2012. Predicting pragmatic reasoning in language games. Science, 336(6084):998.

Michael Franke. 2011. Quantity implicatures, exhaustive interpretation, and rational conversation. Semantics & Pragmatics, 4(1):1–82.

Dave Golland, Percy Liang, and Dan Klein. 2010. A game-theoretic approach to generating spatial descriptions. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 410–419, Cambridge, MA. Association for Computational Linguistics.

H. P. Grice. 1975. Logic and conversation. Syntax and Semantics, 3:41–58.

Daniel Grodner and Julie C. Sedivy. 2011. The effect of speaker-specific information on pragmatic inferences. In N. Pearlmutter and E. Gibson, editors, The Processing and Acquisition of Reference. MIT Press, Cambridge, MA.

Daniel Grodner, Natalie M. Klein, Kathleen M. Carbary, and Michael K. Tanenhaus. 2010. "Some", and possibly all, scalar inferences are not delayed: Evidence for immediate pragmatic enrichment. Cognition, 116:42–55.

Joy Hanna, Michael K. Tanenhaus, and John C. Trueswell. 2003. The effects of common ground and perspective on domains of referential interpretation. Journal of Memory and Language, 49:43–61.

Trey Hedden and Jun Zhang. 2002. What do you think I think you think?: Strategic reasoning in matrix games. Cognition, 85(1):1–36.

Daphna Heller, Daniel Grodner, and Michael K. Tanenhaus. 2008. The role of perspective in identifying domains of reference. Cognition, 108:831–836.

Y. Huang and Jesse Snedeker. 2009. On-line interpretation of scalar quantifiers: Insight into the semantics-pragmatics interface. Cognitive Psychology, 58:376–415.

Gerhard Jäger. 2008. Applications of game theory in linguistics. Language and Linguistics Compass, 2/3:406–421.

Gerhard Jäger. 2011. Game-theoretical pragmatics. In Johan van Benthem and Alice ter Meulen, editors, Handbook of Logic and Language, pages 467–491. Elsevier, Amsterdam.

Boaz Keysar, Dale J. Barr, and J. S. Brauner. 2000. Taking perspective in conversation: The role of mutual knowledge in comprehension. Psychological Science, 11:32–37.

Boaz Keysar, S. Lin, and Dale J. Barr. 2003. Limits on theory of mind use in adults. Cognition, 89:25–41.

Prashant Parikh. 2001. The Use of Language. CSLI Publications, Stanford University.

Brian W. Rogers, Thomas R. Palfrey, and Colin Camerer. 2009. Heterogeneous quantal response equilibrium and cognitive hierarchies. Journal of Economic Theory, 144(4):1440–1467.

Julie C. Sedivy. 2003. Pragmatic versus form-based accounts of referential contrast: Evidence for effects of informativity expectations. Journal of Psycholinguistic Research, 32:3–23.

Alex Stiller, Noah D. Goodman, and Michael C. Frank. 2011. Ad-hoc scalar implicature in adults and children. In Proceedings of the 33rd Annual Conference of the Cognitive Science Society.


Using a Bayesian Model of the Listener to Unveil the Dialogue Information State

Hendrik Buschmeier and Stefan Kopp

Sociable Agents Group – CITEC and Faculty of Technology, Bielefeld University PO-Box 10 01 31, 33501 Bielefeld, Germany

{hbuschme, skopp}@uni-bielefeld.de

Abstract

Communicative listener feedback is a prevalent coordination mechanism in dialogue. Listeners use feedback to provide evidence of understanding to speakers, who, in turn, use it to reason about the listeners' mental state of listening, determine the groundedness of communicated information, and adapt their subsequent utterances to the listeners' needs. We describe a speaker-centric Bayesian model of listeners and their feedback behaviour, which can interpret the listener's feedback signal in its dialogue context and reason about the listener's mental state as well as the grounding status of objects in the information state.

1 Introduction

In dialogue, the interlocutor not currently holding the turn is usually not truly passive when listening to what the turn-holding interlocutor is saying. Quite the contrary: 'listeners' actively participate in the dialogue. They do so by providing communicative feedback, which, among other signals, is evidence of their perception, understanding and acceptance of and agreement to the speakers' utterances. 'Speakers' use this evidence to reason about common ground and to design their utterances to accommodate the listener's needs. This interplay makes communicative listener feedback an important mechanism for dialogue coordination and critical to dialogue success.

From a theoretical perspective, however, the interpretation of communicative feedback is a difficult problem. Feedback signals are only conventionalised to a certain degree (meaning and use might vary with the individual listener) and, as Allwood et al. (1992) argue, they are highly sensitive to the speaker's preceding utterances – and the communicative situation in general.

We present a Bayesian network model for interpreting a listener's feedback signals in their dialogue context. Taking a speaker-centric perspective, the model keeps representations of the mental 'state of listening' attributed to the listener in the form of belief states over random variables, as well as an estimation of the groundedness of the information in the speaker's utterance. To reason about these representations, the model relates the listener's feedback signal to the speaker's utterance and his expectations of the listener's reaction to it.

2 Background and related work

Feedback signals, verbal-vocal or non-verbal, are communicative acts¹ that bear meaning and serve communicative functions. Allwood et al. (1992, p. 3) identified four basic communicative functions of feedback, namely contact (being "willing and able to continue the interaction"), perception (being "willing and able to perceive the message"), understanding (being "willing and able to understand the message"), and attitudinal reactions (being "willing and able to react and (adequately) respond to the message"). It is also argued that these functions form a hierarchy such that higher functions encompass lower ones (e.g., communicating understanding implies perception, which implies being in contact). Kopp et al. (2008) extended this set of basic functions by adding acceptance/agreement (previously considered an attitudinal reaction) and by regarding expressions of emotion as attitudinal reactions.

¹Note, however, that listeners might not be (fully) aware of some of the feedback they are producing. Not all should be considered as necessarily having communicative intent (Allwood et al., 1992). Nevertheless, even such 'indicated' feedback is communicative and is often interpreted by interlocutors.

Feedback signals can likely take an infinite num- ber of forms. Although verbal-vocal feedback sig- nals, as one example, are taken from a rather small repertoire of lexical items such as ‘yes’, ‘no’, as well as non-lexical vocalisations such as ‘uh-huh’,

‘huh’, ‘oh’, ‘mm’, many variations can be produced spontaneously through generative processes such as by combination of different vocalisations or re- peating syllables (Ward, 2006). In addition, these verbalisations can be subject to significant pros- odic variation. Naturally, this continuous space of possible feedback signals can express much more than the basic functions described above.

And listeners make use of these possibilities to ex- press subtle differences in meaning (Ehlich, 1986) – which speakers are able to recognise, interpret (Stocksmeier et al., 2007; Pammi, 2011) and react to (Clark and Krych, 2004).

For a computational model of feedback production, Kopp et al. (2008) proposed a simple concept termed 'listener state.' It represents a listener's current mental state of contact, perception, understanding, acceptance and agreement as simple numerical values. The fundamental idea of this model is that the communicative function of a feedback signal encodes the listener's current mental state. An appropriate expression of this function can be retrieved by mapping the listener state onto the continuous space of feedback signals.

In previous work (Buschmeier and Kopp, 2011), we adopted the concept of listener state as a representation of a mental state that speakers in dialogue attribute to listeners through Theory of Mind. That is, we made it the result of a feedback interpretation process. We argued that such an 'attributed listener state' (ALS) is an important prerequisite to designing utterances to the immediate needs a listener communicates through feedback. The ALS captures such needs in an abstract form (e.g., is there a difficulty in perception or understanding) by describing them with a small number of variables, and is in this way similar to the "one-bit, most minimal partner model" which Galati and Brennan (2010, p. 47) propose as a representation suitable for guiding general audience design processes in dialogue.

For more specific adaptations, a speaker needs to consider more detailed information, such as the grounding status of what has been communicated (1996). Knowing whether previously conveyed information can be assumed to be part of the common ground (or even its degree of groundedness [Roque and Traum, 2008]) is important in order to estimate the success of a contribution (and initiate a repair if necessary) and to produce subsequent utterances that meet a listener's informational needs.

Analysing an inherently vague phenomenon such as feedback signals in their dialogue context is hardly possible outside a probabilistic framework. It is difficult to draw clear-cut conclusions from listener feedback, and even human annotators, who are not directly involved in the interaction, have difficulties consistently annotating feedback signals in terms of conversational functions (Geertzen et al., 2008).

A probabilistic framework well suited for reasoning about knowledge in an uncertain world is that offered by Bayesian networks. They represent knowledge in terms of 'degrees of belief', meaning that they do not hold one definite belief about the current state of the world, but represent different possible world states along with their probabilities of being true. Furthermore, Bayesian networks make it possible to model the relevant influences between random variables representing different aspects of the world in a compact model. This is why they are potentially well suited for reasoning about feedback use in dialogue. Using a Bayesian network, the conditioning influences between dialogue context, listener feedback, ALS, as well as the estimated grounding status of the speaker's utterances can be captured in a unified and well-defined probabilistic framework.

Representing grounding status not only in degrees of groundedness but also in terms of degrees of belief adds a new dimension to the approach put forth by Roque and Traum (2008). Dealing with uncertainty in the representation of common ground simplifies the interface to vague information gained from listener feedback, and removes the need to prematurely commit to a specific grounding level. This keeps the information status of an utterance open to change.

Bayesian networks have already been used to model problems similar to the one in question. Paek and Horvitz (2000), for example, use Bayesian networks to manage the uncertainties, among other things, in the model of grounding behaviour in the 'Quartet' architecture for spoken dialogue systems. Other work, on the other hand, created a Bayesian network model of dialogue system users' grounding behaviour; there the Bayesian network simulates consistent user behaviour which can be used for experimentation with, and training of, dialogue management policies. Finally, Stone and Lascarides (2010) propose to combine Bayesian networks with the logic-based Segmented Discourse Representation Theory (SDRT; Asher and Lascarides, 2010) for a theory of grounding in dialogue that is both rational (in the utility-theoretic sense) and coherent (by assigning discourse relations a prominent role in making sense of utterances).

3 A Bayesian model of the listener

A speaker's Bayesian model of a listener should relate dialogue context, listener feedback, the attributed listener state as well as the grounding status of the speaker's utterances to each other. Constructing such a model either needs corpora with fine-grained annotations of all these aspects of dialogue (to 'learn' it from data) or detailed knowledge about the relations (to design it). Apart from the fact that adequate corpora are practically non-existent, structure-learning of a Bayesian network can only infer conditional independence between variables and not their underlying causal relations. The top-ranking results of a structure learning algorithm might therefore differ substantially, resulting in networks that disagree about influences and causal relationships (Barber, 2012). For this reason, we take the approach of constructing a Bayesian network by 'hand', making – as is not uncommon in cognitive modelling – informed decisions based on research findings and intuition.

3.1 Assumed causal structure

When analysing or modelling a phenomenon with Bayesian networks, it is helpful to think of them as representing the phenomenon's underlying causal structure (Pearl, 2009). Network nodes represent causes, effects or both, and directed edges between nodes represent causality. A directed edge from a node A to a node B, for example, models that A is a cause for B, and that B is an effect of A. Another directed edge from B to a third node C makes B the cause of C. Being intermediate, it is possible that B is both an effect (of A) and a cause (of C).
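The A → B → C chain just described can be made concrete with a toy computation. All numbers below are invented for illustration only; the point is that the joint distribution factorises along the causal edges as P(A) P(B|A) P(C|B), and that inference can also run against the causal direction, from an observed effect C back to the cause A:

```python
import itertools

# Made-up conditional probability tables for binary A -> B -> C
P_A = {0: 0.7, 1: 0.3}
P_B_given_A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # [a][b]
P_C_given_B = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}  # [b][c]

def joint(a, b, c):
    # Joint factorises along the causal edges
    return P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c]

# Diagnostic inference by enumeration: P(A=1 | C=1)
num = sum(joint(1, b, 1) for b in (0, 1))
den = sum(joint(a, b, 1) for a, b in itertools.product((0, 1), repeat=2))
posterior = num / den  # greater than the prior P_A[1]: C=1 is evidence for A=1
```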

Figure 1 illustrates the causal structure of listener feedback in verbal interaction that we assume.

Figure 1: Speaker S reasoning about the mental state of listener L. S's utterances cause L to move into a certain state of understanding. This influences L's feedback signals, which are evidence for S's attributed listener state of L.

A speaker S produces an utterance in the presence of a listener L and wants to know what L's mental state of listening is towards her utterance, i.e., whether L is in contact, has perceived, understood and accepts or agrees with S's utterance. As it is impossible for S to directly observe L's mental state, she can only try to reconstruct it based on L's communicative actions (i.e., L's feedback) and by relating it to the dialogue context: her utterance, her expectations and the communicative situation.
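To illustrate this reconstruction step (S inferring L's state of listening from an observed feedback signal), here is a minimal Bayes-rule sketch. The state inventory loosely follows Allwood et al.'s hierarchy, but all numerical values and the example signal "huh?" are our own assumptions, not part of the model proposed here:

```python
# Hypothetical states of listening attributed by the speaker
states = ["contact", "perception", "understanding"]

# Prior belief before feedback, e.g. based on utterance difficulty (assumed)
prior = {"contact": 0.1, "perception": 0.3, "understanding": 0.6}

# Assumed likelihood of L producing "huh?" in each state
likelihood_huh = {"contact": 0.5, "perception": 0.4, "understanding": 0.05}

# Bayesian update: posterior proportional to prior times likelihood
unnorm = {s: prior[s] * likelihood_huh[s] for s in states}
z = sum(unnorm.values())
posterior = {s: p / z for s, p in unnorm.items()}
# After "huh?", belief mass shifts away from full understanding.
```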

To make a causally coherent argument, we assume, for the moment, that L's unobservable mental state is part of the Bayesian listener model (parts unobservable to S are drawn with grey dashed lines in Figure 1). L's mental state results from the effect of S's utterance, the communicative situation as well as L's information state. L's mental state, on the other hand, causes him to provide evidence of his understanding by producing a feedback signal. In this way closure is achieved for the causal chain from utterance, via mental state and feedback signal, to S's reconstruction 'ALS' of L's mental state.

This causally coherent model can easily be reduced to an agent-centric model for S, which consists of only those influences that S can observe directly (drawn with black solid lines in Figure 1). Although this leads to a 'gap' in the causal chain, nodes retain their roles as causes and/or effects.

It should be noted, however, that the causal model only provides the scaffolding of a more
