
CHAPTER 5: USER MODEL

5.2. A First Simple Model

5.2.1. A Goal-Directed User Model with Memory

To overcome the problems described above, a similar UM is proposed, but each probability is now also conditioned on the user's goal and on a small amount of information about the history of the interaction. The UM therefore builds a new goal at the beginning of each dialogue session and acts consistently with this goal throughout the interaction. Moreover, the information kept by the UM about the history mainly consists of counters of occurrences of the same system utterance, and a new probability is added: the probability of hanging up.

Following the notation of Fig. 27, the information kept by the UM about the history is the user's knowledge at time t, denoted kt, and the goal is denoted g.

The goal contains a set of AV pairs that condition the answers of the UM, and the knowledge contains a small amount of information about the history of the current dialogue (see Table 2).

The goal not only contains AV pairs but also a priority for each AV pair, which is helpful when the user is prompted to relax some attribute: an AV pair with high priority is less likely to be relaxed. The knowledge of the UM is essentially composed of counters, initialised at 0 and incremented each time a value is provided for a given attribute. This enables the UM to supply new AV pairs when mixed-initiative behaviour is exhibited (rather than providing already supplied values) and to react to unsatisfactory behaviour of the system.
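For concreteness, the structure of Table 2 could be coded as follows. This is only a minimal Python sketch: the class and method names (GoalEntry, UserModelState, provide) are assumptions for illustration, not part of the original model.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class GoalEntry:
    value: str        # the value the user wants for this attribute
    priority: float   # in [0, 1]; high priority -> less likely to be relaxed

@dataclass
class UserModelState:
    goal: Dict[str, GoalEntry]                             # one AV pair (+ priority) per attribute
    counts: Dict[str, int] = field(default_factory=dict)   # k_alpha counters, initialised at 0

    def provide(self, attribute: str) -> str:
        """Return the goal value for `attribute` and increment its counter."""
        self.counts[attribute] = self.counts.get(attribute, 0) + 1
        return self.goal[attribute].value
```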

Goal                          Know.
Att.     Val.     Prior.      Count
g1       vi       p1          k1
g2       vj       p2          k2
...      ...      ...         ...
gn       vm       pn          kn

Table 2: UM's goal and knowledge representation

The probabilities then become (a sampling sketch is given after this list):

• Probabilities associated with responses to greeting:

- P(n|greeting, g) is the probability of providing n AV pairs in a single utterance when the greeting message is prompted, according to the goal g.

- P(A|kt, g) is the prior probability distribution over attributes given the UM's knowledge of the history at time t and its goal. It is modified by the knowledge, since attributes already provided (kα > 0) see their probability decrease.

- P(V|A, g) is the probability distribution over the values of each attribute in A given the goal. The probability of providing values appearing in the goal is maximal, while assigning non-null probabilities to other values can simulate a lack of cooperativeness.

• Probabilities associated with responses to constraining questions:

- P(uα|sβ, kt, g) is the probability that the user provides a value for attribute uα when prompted for attribute sβ. Here again, the knowledge and the goal determine which value is provided.

- P(n|sβ) is the probability that the user provides n unsolicited AV pairs when prompted for attribute sβ. These attributes are chosen according to P(A|kt, g), which means that attributes not yet provided can receive higher probabilities, since the user's knowledge conditions the distribution.

• Probabilities associated with responses to relaxation prompts:

- P(yes|sβ, kt, g) = 1 − P(no|sβ, kt, g) is the probability that the user accepts the relaxation of attribute sβ. The priorities associated with AV pairs in the user's goal modify the probability of certain attributes being relaxed when asked.

• Probabilities associated with the user's satisfaction:

- P(close|sβ, kt, g) is the probability of closing the dialogue session before its natural end when prompted for the argument sβ, because of unsatisfactory behaviour of the system, for example when the counter kβ associated with attribute sβ exceeds a given threshold.
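To make the sampling concrete, the sketch below builds on the UserModelState structure above. The down-weighting rule priority/(1 + k) and the helper names are illustrative assumptions, not the specification of the model.

```python
import random

def sample_attribute(um):
    """Draw an attribute from P(A|kt, g): attributes already provided
    (k_alpha > 0) are down-weighted, high-priority ones up-weighted.
    The weighting rule is an assumption for illustration."""
    attrs = list(um.goal)
    weights = [um.goal[a].priority / (1 + um.counts.get(a, 0)) for a in attrs]
    return random.choices(attrs, weights=weights)[0]

def sample_greeting_answer(um, p_n):
    """Answer the greeting: draw n ~ P(n|greeting, g) from the table
    `p_n` (a dict {n: probability}), then n attributes ~ P(A|kt, g);
    for P(V|A, g) the goal value is always given (fully cooperative user)."""
    n = random.choices(list(p_n), weights=list(p_n.values()))[0]
    answer = {}
    for _ in range(n):
        a = sample_attribute(um)
        answer[a] = um.provide(a)   # increments k_a as a side effect
    return answer
```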

Unlike the previously described model, the mean length of a dialogue is now a function of the user's goal (and thus of the task structure), knowledge and degree of cooperativeness, but not of the number of possible values that each argument can take. The model can be learned or handcrafted. Indeed, even if the number of probabilities is larger than in the previous model, they can be assessed a priori by using a set of rules if no data is available, as sketched after the following list:

• The probability P(uα|sβ, kt, g) is set to a high value for uα = sβ, and non-null probabilities can be assigned to other attributes to simulate a lack of cooperativeness.

• P(A|kt, g) is conditioned on the knowledge, and attributes already provided (kα > 0) have lower probabilities. Moreover, a higher priority increases an attribute's probability of being provided.

• P(V|A, g) is maximal for the value appearing in the goal for the attribute, while assigning non-null probabilities to other values can simulate a lack of cooperativeness (care must be taken to ensure that the probabilities sum to 1).

• P(yes|sβ, kt, g) is assessed according to the priority of the attribute: the higher the priority, the lower the probability of relaxing it.

• The probability P(close|sβ, kt, g) of closing the dialogue when prompted for attribute sβ is based on a threshold for kβ.
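As an illustration of such rules, the last two probabilities could be assessed as follows. This is a sketch under the assumption that priorities lie in [0, 1]; all numeric values are placeholders, not figures from the original.

```python
def p_relax_yes(um, attribute):
    """Rule-based P(yes|s_beta, kt, g) = 1 - P(no|s_beta, kt, g): the
    higher the priority of the AV pair, the lower the probability of
    accepting its relaxation. The linear rule is illustrative only."""
    return 1.0 - um.goal[attribute].priority

def p_close(um, attribute, threshold=3, base=0.05):
    """Rule-based P(close|s_beta, kt, g): hang up almost surely once the
    counter k_beta of the prompted attribute reaches the threshold; the
    figures 0.95/0.05 are assumptions."""
    return 0.95 if um.counts.get(attribute, 0) >= threshold else base
```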

The generated dialogues are consistent with the goal, and infinite dialogues are impossible provided the probability P(close|sβ, kt, g) is non-null for all values of sβ.

Moreover, the specification of the goal as a set of AV pairs enables the measurement of task completion. For example, in the case of a form-filling dialogue or a voice-enabled interface for database querying, the system can access the AV pairs transmitted to it by the UM and compare them with the goal.

The kappa coefficient described in section 2.4.5 can, for instance, be used as a task completion measure.
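A minimal version of that measure, reusing the goal structure above, could look like the following; the uniform chance-agreement term P(E) = 1/n_values is a simplifying assumption of this sketch.

```python
def task_completion_kappa(goal, transmitted, n_values):
    """Kappa as a task completion measure:
    kappa = (P(A) - P(E)) / (1 - P(E)), where P(A) is the observed
    agreement between goal AV pairs and the AV pairs transmitted to the
    system, and P(E) the agreement expected by chance (assumed uniform
    over `n_values` possible values per attribute)."""
    attrs = list(goal)
    p_a = sum(transmitted.get(a) == goal[a].value for a in attrs) / len(attrs)
    p_e = 1.0 / n_values
    return (p_a - p_e) / (1.0 - p_e)
```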

The task completion measure can condition the user's satisfaction, but unsatisfactory behaviour of the system, leading to a dialogue prematurely closed by the user, can also decrease satisfaction. Thus, this model also allows a simple modelling of user satisfaction.

5.2.1.1. Application

Although this model is still very simple, it has been tested on a database-querying system simulating an automatic computer-dealing system as described in [Pietquin & Renals, 2002], and it provided good results in the framework of strategy learning. In this experiment, the database contained 350 different computer configurations split into 2 tables (for notebooks and desktops), each containing 5 fields: pc_mac (pc or mac), processor_type, processor_speed, ram_size, hdd_size. To demonstrate the UM's behaviour, it can be connected to a simple SDS that can perform 6 generic actions: greeting, constraining questions (const(arg)), confirmation (conf(arg)), relaxation queries (rel(arg)), database queries (dbquery), and closing the dialogue (close). Each argument (arg) may be the table's type (notebook or desktop) or one of the 5 table fields.

In order to demonstrate the user's behaviour, a random SDS strategy is chosen: the system is only constrained to start with the greeting prompt, and it subsequently chooses any other action with equal probability.

Notice that the system does not enter a negotiation phase to conclude the sale; it simply retrieves a computer corresponding to the user's request from the database and provides information about the brand and the price of the selected product.
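Such a random strategy can be simulated in a few lines. The sketch below assumes the action and argument names listed above; the (action, argument) tuple convention is an assumption of the sketch.

```python
import random

ACTIONS = ['const', 'conf', 'rel', 'dbquery', 'close']   # greeting is forced first
ARGS = ['notebook_desktop', 'pc_mac', 'processor_type',
        'processor_speed', 'ram_size', 'hdd_size']

def random_strategy(turn):
    """Random SDS strategy used to exercise the UM: the first act is
    always the greeting; afterwards any other action is drawn uniformly,
    with a uniformly drawn argument where one is needed."""
    if turn == 0:
        return ('greeting', None)
    action = random.choice(ACTIONS)
    arg = random.choice(ARGS) if action in ('const', 'conf', 'rel') else None
    return (action, arg)
```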

In this framework, the UM’s goal can be described as in Table 3.

Goal                                       Know.
Att.               Val.        Prior.      Count
notebook_desktop   'desktop'   high        k1 = 0
pc_mac             'PC'        high        k2 = 0
processor_type     'Pentium'   high        k3 = 0
processor_speed    800         low         k4 = 0
ram_size           128         low         k5 = 0
hdd_size           20          low         k6 = 0

Table 3: UM's goal in the computer dealing task

The threshold for the counts is set at 3 (meaning that after having provided the value for an attribute and confirmed it, the UM considers that this piece of information should not be asked for again). Once again, the communication channel is supposed to be perfect, and the understanding process as well. A typical problematic dialogue is shown in Table 4.

       Intentions                          k          Expanded Intentions
sys0   AS = greeting                                  Hello! How may I help you?
u0     u1 = note_desk, v1 = 'desktop'      k1 = 1     I'd like to buy a desktop computer.
sys1   AS = const(pc_mac)                             Would you prefer a PC or a Mac?
u1     u1 = pc_mac, v1 = 'PC'              k2 = 1     I'd prefer a PC.
sys2   AS = rel(note_desk)                            Don't you prefer a laptop?
u2     u1 = rel, v1 = false (p = high)                No.
sys3   AS = conf(pc_mac)                              Can you confirm you want a PC?
u3     u1 = conf, v1 = true                k2 = 2     Yes, I want a PC.
…      …                                   …          …
sysi   AS = const(pc_mac)                             Would you prefer a PC or a Mac?
ui     u1 = close, v1 = true               k2 = 3     Ok, bye! (hangs up)

Table 4: A problematic dialogue with the goal-directed UM

This example illustrates two main points of the goal-directed model. First, a relaxation query is denied according to the goal priority of the attribute that was asked for relaxation. Second, the user closes the dialogue because the system behaved very badly, repeatedly prompting for information already provided and confirmed. The update of the user's counts is also shown.
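The hang-up at the end of Table 4 corresponds to a rule of this kind. This sketch is consistent with the threshold of 3 above; the function name and the return convention are assumptions.

```python
def react_to_prompt(um, attribute, threshold=3):
    """UM reaction when prompted about `attribute` under a perfect
    channel: the counter is incremented with each response about the
    attribute; once it reaches the threshold (here 3: value given,
    confirmed, then asked again), the UM hangs up."""
    um.counts[attribute] = um.counts.get(attribute, 0) + 1
    if um.counts[attribute] >= threshold:
        return ('close', True)                    # "Ok, bye!" (hangs up)
    return (attribute, um.goal[attribute].value)  # consistent goal value
```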