Regret and the rationality of choices

HAL Id: ijn_00432308

https://jeannicod.ccsd.cnrs.fr/ijn_00432308

Submitted on 16 Nov 2009

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or

Sacha Bourgeois-Gironde

To cite this version:

Sacha Bourgeois-Gironde. Regret and the rationality of choices. Philosophical Transactions of the Royal Society B: Biological Sciences, Royal Society, The, 2009, pp.1-9. ⟨ijn_00432308⟩

Sacha Bourgeois-Gironde*

Institut Jean-Nicod (ENS-EHESS), Pavillon Jardin, Ecole Normale Supérieure, 29, rue d’Ulm, 75005 Paris, France

Regret helps to optimize decision behaviour. It can be defined as a rational emotion. Several recent neurobiological studies have confirmed the interface between emotion and cognition at which regret is located and documented its role in decision behaviour. These data give credibility to the incorporation of regret in decision theory that had been proposed by economists in the 1980s. However, finer distinctions are required in order to get a better grasp of how regret and behaviour influence each other. Regret can be defined as a predictive error signal but this signal does not necessarily transpose into a decision-weight influencing behaviour. Clinical studies on several types of patients show that the processing of an error signal and its influence on subsequent behaviour can be dissociated. We propose a general understanding of how regret and decision-making are connected in terms of regret being modulated by rational antecedents of choice. Regret and the modification of behaviour on its basis will depend on the criteria of rationality involved in decision-making. We indicate current and prospective lines of research in order to refine our views on how regret contributes to optimal decision-making.

Keywords: regret; predictive error signal; decision weight; addiction; paradoxes of rationality

1. INTRODUCTION

Regret can be defined as a rational emotion in the sense that its presence seems to be correlated with improved decision-making. It involves both cognitive and emotional components. On the basis of a comparison between what I got and what I could have got, I may experience the emotion of regret to a variable extent. On the basis of this emotion, I will attune my future decisions. Anticipated regret can then be defined as a decision criterion.

Recent neurobiological evidence has tended to confirm this simple view, which gives some credibility to the incorporation of regret in decision theory that had been proposed by decision theorists in the 1980s. However, finer distinctions are required in order to get a better grasp of how regret and behaviour influence each other. Anticipated regret can be defined as a predictive error signal: on the basis of past experience, the human brain forms comparative expectations about the results of available alternative courses of action. But the information on the most favourable course of action does not necessarily transpose into a corresponding optimal decision. Clinical studies on several types of patients show that the processing of an error signal and its influence on subsequent behaviour can be dissociated. We will discuss some of these data in order to refine our views on how regret contributes to optimal decision-making. We also propose a general understanding of how regret and decision-making are connected in terms of regret being modulated by rational antecedents of choice. Namely, regret and the modification of behaviour on its basis will depend on the criteria of rationality involved in decision-making. Intuitively, the more rational I think my decision was, the less I tend to regret its outcomes. But we will be interested in less clear-cut cases, in particular those in which apparently conflicting rational decision criteria prevail in choice. The aim of this article is to suggest conceptual refinements, by evaluating the evidence of existing or ongoing experiments, on how the rationality of choices, the experience of regret and the optimization of behaviour are in principle connected and potentially disconnected in some clinical conditions.

2. TESTING THE REGRET EXPLANATION OF ALLAISIAN BEHAVIOUR

Regret has been incorporated into theories of rational decision-making (Bell 1982; Loomes & Sugden 1982; Hart & Mas-Colell 2000) because of the explanation it provides of apparent deviations from rationality, such as violations of transitivity and of the independence of choice from irrelevant alternatives. Regret theory, notably, explains the Allais (1953) paradox.

Let us represent the classical Allais paradox by the following matrix.

Matrix 1: standard Allaisian behaviour.

     P (p = 0.01)    Q (p = 0.10)    R (p = 0.89)
A    500 000         500 000         500 000
B    0               2 500 000       500 000
C    500 000         500 000         0
D    0               2 500 000       0

Here P, Q and R are states of affairs whose probabilities of occurrence are given in the column headings. In between-groups experiments, one group of participants is invited to choose between options A and B and another group between options C and D. We then compare which options were favoured in each group. As highlighted in bold in the original matrix, A is the option most often chosen in the first group and D the one favoured by the participants in the second group. In within-subjects designs, when participants are presented with the whole matrix, the choice of the pair ⟨A, D⟩ also prevails. Kahneman & Tversky (1979) report the following results for Allaisian options presented to participants in extensive lottery forms:

(i) between groups: A: 82%/D: 83%

(ii) within subjects: B–C: 7/A–D: 60/B–D: 13/A–C: 5.

*sbgironde@gmail.com

One contribution of 11 to a Theme Issue ‘Rationality and emotions’.

Phil. Trans. R. Soc. B (2009) 0, 1–9 doi:10.1098/rstb.2009.0163

This journal is © 2009 The Royal Society

These results exemplify a violation of the independence axiom of von Neumann and Morgenstern’s decision theory. The violation can be made intuitive by expressing it in terms of informational dispersion on the part of the subject, in the sense that she seemingly does not focus on the relevant decision-theoretical core of the matrix or lotteries she is presented with. Such normative informational focus has been labelled in decision theory as the elimination of the common consequences of pairs of options. The following matrix makes clear that state of affairs R should be discarded, since within each pair (A and B, C and D) the options yield the same outcome in that state. Once the options are stripped of their common consequences, it is also clear that A and C, and B and D, are respectively equivalent, and that it is irrational to modify one’s choices across the pairs ⟨A, B⟩ and ⟨C, D⟩.

Matrix 2: deleting common consequences.

     P (p = 0.01)    Q (p = 0.10)    R (p = 0.89)
A    500 000         500 000         500 000
B    0               2 500 000       500 000
C    500 000         500 000         0
D    0               2 500 000       0
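The common-consequence argument can be checked numerically. The Python sketch below computes expected utilities for the four options under a few utility functions chosen purely for illustration (they are assumptions, not part of the original argument). It shows that the expected-utility gap between A and B equals the gap between C and D, so that no expected-utility maximizer can strictly prefer both A and D.

```python
import math

# Probabilities of the three states of affairs, as given in the matrix.
PROBS = {"P": 0.01, "Q": 0.10, "R": 0.89}

# Payoffs of each option in each state, as given in the matrix.
PAYOFFS = {
    "A": {"P": 500_000, "Q": 500_000, "R": 500_000},
    "B": {"P": 0, "Q": 2_500_000, "R": 500_000},
    "C": {"P": 500_000, "Q": 500_000, "R": 0},
    "D": {"P": 0, "Q": 2_500_000, "R": 0},
}

def expected_utility(option, utility):
    """Probability-weighted utility of one option."""
    return sum(PROBS[s] * utility(PAYOFFS[option][s]) for s in PROBS)

def eu_gap(first, second, utility):
    """Expected utility of `first` minus expected utility of `second`."""
    return expected_utility(first, utility) - expected_utility(second, utility)

# Illustrative utility functions (assumptions made for this sketch only).
UTILITIES = {
    "linear": float,
    "square root": math.sqrt,
    "logarithmic": math.log1p,
}

for name, utility in UTILITIES.items():
    gap_ab = eu_gap("A", "B", utility)
    gap_cd = eu_gap("C", "D", utility)
    # State R contributes the same amount to both options of a pair,
    # so it cancels out and the two gaps coincide (up to rounding).
    print(f"{name}: EU(A)-EU(B) = {gap_ab:.6f}, EU(C)-EU(D) = {gap_cd:.6f}")
```

Whatever the sign of the common gap, it is the same for both pairs, which is why the modal pattern A with D cannot be rationalized by expected utility alone.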

Now, an obvious feature of Matrices 1 and 2 is the intuitiveness with which the respective choices A–D and B–D or A–C impose themselves on the subject’s mind. Intuitiveness is by no means a criterion of rationality, but the principle of elimination of common consequences practically embodies the axiom of independence which is at the core of rational decision theory, and makes it visually salient in Matrix 2. However, Allaisian behaviour as demonstrated through Matrix 1 is also intuitive and compelling. Individuals can easily justify their choices, even though they deviate from rational standards of decision theory. One can even experience conflicts of intuitions when asked to perform a choice in this task and knowingly deviate from rationality standards, hence, perhaps, its classical denomination as a paradox. Slovic & Tversky (1974) have shown that experts in decision theory consistently exemplify Allaisian behaviour even though they are of course perfectly cognizant of the independence axiom. The problem is then to understand what makes A–D attractive in Matrix 1 and why Matrix 2 may not be a sufficiently powerful debiasing device.

An answer is given in Matrix 3 which incorporates anticipated regrets as weights of utility determining the A – D choice.

Matrix 3: introducing regret.

     P (p = 0.01)    Q (p = 0.10)    R (p = 0.89)
A    500 000         500 000         500 000
B    0 + R1          2 500 000       500 000
C    500 000         500 000         0
D    0 + R2          2 500 000       0

R1 and R2 are qualitative designations of levels of regret. The usual explanation goes as follows: R1 < R2, in the sense that, if P occurs, you would regret having chosen B instead of A more than, if P or R occurs, you would regret having chosen C instead of D. So if B–D is the coherent pattern, R2 (conceived as an amount of anticipated regret) does not have enough weight to make you choose C, while R1 has enough of such ‘decision weight’ to make you choose A. Anticipated regret is then considered an explanatory factor of Allaisian behaviour. It vindicates the intuitive aspect of Matrix 1 but it also preserves rationality as presented in its crude form in Matrix 2, to the extent that it incorporates regret as an ingredient which is rationally processed in decision-making, on a par with payoffs and their associated probabilities. When one includes regret, it is clear that the elimination of common consequences no longer yields equivalent choices and that apparently inconsistent behaviour can be explained away. But the argument now relies on the plausibility of a view of anticipated regret as inflecting decision behaviour in the intended sense.
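To make the role of the regret terms concrete, the following Python sketch adds R1 and R2 to the relevant cells of Matrix 3 and recomputes expected values. It assumes, for illustration only, a linear utility of money and treats regret as a negative quantity added to the payoff; the numerical values of R1 and R2 are hypothetical and chosen so that R1 < R2.

```python
PROBS = {"P": 0.01, "Q": 0.10, "R": 0.89}

def matrix_with_regret(r1, r2):
    """Matrix 3: payoffs with regret terms added to cells (B, P) and (D, P)."""
    return {
        "A": {"P": 500_000, "Q": 500_000, "R": 500_000},
        "B": {"P": 0 + r1, "Q": 2_500_000, "R": 500_000},
        "C": {"P": 500_000, "Q": 500_000, "R": 0},
        "D": {"P": 0 + r2, "Q": 2_500_000, "R": 0},
    }

def expected_value(row):
    """Probability-weighted value of one option (linear utility assumed)."""
    return sum(PROBS[state] * row[state] for state in PROBS)

def preferred(matrix, first, second):
    """Return whichever of the two options has the higher expected value."""
    if expected_value(matrix[first]) > expected_value(matrix[second]):
        return first
    return second

# Hypothetical regret levels, expressed as negative amounts with R1 < R2.
R1 = -30_000_000
R2 = -1_000_000

without_regret = matrix_with_regret(0, 0)
with_regret = matrix_with_regret(R1, R2)

for label, matrix in (("no regret", without_regret), ("with regret", with_regret)):
    pattern = preferred(matrix, "A", "B") + "-" + preferred(matrix, "C", "D")
    print(f"{label}: preferred pattern {pattern}")
```

Under these assumptions the regret terms are enough to move the first choice from B to A while leaving the second choice at D, which is the Allaisian pattern.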

The integration of regret in decision theory has been supported by recent neurobiological investigation. Present studies on the neural correlates of regret take advantage of previous observations on the role of the orbitofrontal cortex in the processing of reward and its role in subsequent behaviour. Rolls (2000) has documented the inability of orbitofrontal patients to modify their behaviour in response to negative consequences. Ursu & Carter (2005) have demonstrated how the anticipated affective impact of a choice is modulated by the comparison between the different available alternatives. These reasoning patterns, which consist in anticipating contrasts between actual outcomes and counterfactual ones (counterfactual in the sense that those outcomes are the ones that I would have got had I taken an alternative course of action), are reflected in orbitofrontal cortex activity. More precisely, the impact of potentially negative consequences of choices is essentially represented in the lateral areas of the orbitofrontal cortex, whereas the medial and dorsal areas of the prefrontal cortex are more specialized in the impact of positive consequences. Camille et al. (2004) have shown that patients presenting orbitofrontal lesions do not seem to take regret into account in experimental sessions repeating stimuli such as the following:

Partial feedback: in the partial feedback condition of Camille’s experiment, subjects consider two wheels presenting possible gains and losses; they pick one of them (squared) and get feedback only for the chosen wheel (figure 1).

Complete feedback: in the complete feedback condition, subjects also get feedback for the foregone wheel, making possible a comparison between what they get (the squared circle) and what they could have got (figure 2).
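The difference between the two conditions can be stated as a small Python sketch. The function and parameter names are illustrative assumptions, and the outcome values are taken from those displayed on the wheels in figures 1 and 2. The point is only that the counterfactual comparison is defined under complete feedback and undefined under partial feedback.

```python
from typing import Optional

def counterfactual_gap(obtained: float, foregone: float,
                       complete_feedback: bool) -> Optional[float]:
    """Obtained outcome minus the outcome of the unchosen wheel.

    A negative value means the foregone wheel would have paid more, which is
    the situation in which regret is expected. Under partial feedback the
    foregone outcome is never shown, so no comparison is available.
    """
    if not complete_feedback:
        return None
    return obtained - foregone

# The chosen wheel lost 50 while the other wheel would have won 200.
print(counterfactual_gap(-50, 200, complete_feedback=True))
print(counterfactual_gap(-50, 200, complete_feedback=False))
```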

Camille et al. (2004) and Coricelli et al. (2005), using the same experimental paradigm in an fMRI study, show that the orbitofrontal cortex has a fundamental role in experiencing regret and in integrating the cognitive and emotional components of the entire process of decision-making. Across repetitions of this task, participants tend to become regret-averse. The authors speculate that the orbitofrontal cortex uses a top-down process in which cognitive components, such as counterfactual thinking, modulate emotional and behavioural responses, leading to increased regret aversion.

Regret is understood as an emotion guiding decision-making, fitting well with Damasio’s (1994) understanding of the contribution of emotions to rationality. The understanding of brain activities reflecting anticipated affective impacts makes possible the neurobiological validation of the regret hypothesis in orienting decision-making towards apparently non-normative behaviour. Laland & Grafman (2005) test lotteries on medial orbitofrontal patients and observe higher coherence among them than among healthy participants, although patients are not more risk-seeking. This is quite interesting because it shows that these patients (the same population with respect to which Damasio has elaborated his somatic marker hypothesis) do not show incoherence owing to inconsiderate risk-taking in decision-making.

Given plausible data on the connection between orbitofrontal lesions and the absence of regret, it would be interesting to directly tackle the original motive for which regret had been introduced in decision theory, namely to provide a plausible explanation of seemingly irrational behaviour, such as that provoked by the Allais problem. We speculate that if the finding that orbitofrontal patients present an impaired treatment of regret is robust, and if anticipated regret is a correct explanation for the type of behaviour usually induced by the Allais problem, then those patients should behave normatively when facing Allais paradox stimuli. Unlike healthy subjects, they should not violate the independence axiom; rather, they would show consistency across their choices and, ironically, behave normatively in a task that has been considered a staple of irrationality among decision theorists. Bourgeois-Gironde and Cova (in progress) directly test Allais problems on patients presenting focal orbitofrontal lesions, and first results tend to document coherence, rationality and limited risk-seeking behaviour among these patients. These data would tend to confirm the overall plausibility of the regret hypothesis in explaining Allaisian behaviour. In cases in which anticipated regrets are a source of apparently biased decision-making, their presumed absence seems to make behaviour tend towards rationality as normatively encapsulated by the axiom of independence. But a better view remains to be acquired of the mechanisms through which an emotional and cognitive state such as regret manages to inflect behaviour in one way or the other.

3. REGRETS AS ERROR SIGNALS AND/OR DECISION WEIGHTS

Anticipated regret can be understood in neuroscience and learning models as a predictive error signal which may or may not be accompanied by an emotional state. This signal can be simply defined as the difference between an actual outcome and a fictive or counterfactual outcome. On the basis of this signal, learning can take place in sequential rewarding tasks, as is the case in Camille’s and Coricelli’s studies. In those studies the underlying hypothesis is that orbitofrontal patients do not generate such signals and consequently cannot modify their behaviour by processing anticipated regrets. But an alternative hypothesis is that even though some patients may be unable to generate predictive error signals, others may generate them without these signals helping to modify their behaviour. In the absence of regret-averse behaviour, indeed, we need to discriminate between the non-generation and the inefficiency of error signals in patients’ brains. The role of the orbitofrontal cortex may be associated with the integration of properly generated error signals into behavioural strategies. In case of lesions of the orbitofrontal cortex, this integration does not take place, but an alternative cause of non-integration, in the presence of impaired orbitofrontal cortices, is a dysfunction in the production of error signals.

Figure 1. Partial feedback condition. (Wheel outcomes shown: 200, 50, −50, −200.)

Figure 2. Complete feedback condition. (Wheel outcomes shown: 200, 50, −50, −200.)

The question was raised by Chiu et al. (2008). They observed that chronic smokers showed a reduced influence of predictive error signals on subsequent behaviour. However, given the neural response in the caudate typically associated with the generation of predictive errors (e.g. Lau & Glimcher 2007), the authors were also in a position to infer that there was no loss in the production of these signals. There was an observable dissociation, then, between the generation of error signals and the modification of behaviour. It was as if the correct treatment of comparative information between actual experience and what might have been the case had no weight in improving subsequent repeated decision-making. Cognitive processing of information on potential outcomes and behavioural control were not integrated.

To get a precise specification of how caudate-based error signals fail to play a role in optimizing the behaviour of addicted smokers, Chiu and his colleagues used the sequential investment game, which can be abstractly represented as follows (figure 3).

A subject starts in the state St, in the centre square, and moves to state St+1, in the upper square. This is what the subject actually does. She has access to her actual gains. But she can also retrieve information about fictive experience, i.e. what she would have experienced had she followed another path, represented by lateral arrows in the schema, and experienced alternative gains. In Chiu’s experiment, the decision to move to St+1 or to alternative states corresponds to investments of a portion of an individual endowment on a realistically reproduced fluctuating market. After each move the subject could compare the results of her investment decisions with the history of market returns. Predictive error over gains is then computed as the difference between the maximum gain made possible by the market history and the actual gain realized by the individual. Two distinct groups of participants performed this sequential market task: smokers and non-smokers. In one experimental condition, smokers were sated, while in the other they were deprived of nicotine.
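The computation of the predictive error over gains can be written out explicitly. The Python sketch below assumes, as a simplification not stated in the text, that the subject invests a fraction of her endowment between 0 and 1 and that her gain is that fraction times the market return. The function name and the example figures are hypothetical.

```python
def fictive_error_over_gains(invested_fraction: float, market_return: float) -> float:
    """Maximum gain the market allowed minus the gain actually realized.

    Assumes 0 <= invested_fraction <= 1 and a positive market return, so that
    the maximum gain is the one obtained by investing the whole endowment.
    """
    if not 0.0 <= invested_fraction <= 1.0:
        raise ValueError("invested_fraction must lie between 0 and 1")
    if market_return <= 0:
        raise ValueError("this sketch only covers gains (positive market returns)")
    actual_gain = invested_fraction * market_return
    maximum_gain = 1.0 * market_return
    return maximum_gain - actual_gain

# Hypothetical trial: 40% of the endowment invested, market rises by 10%.
error = fictive_error_over_gains(invested_fraction=0.4, market_return=0.10)
print(f"fictive error over gains: {error:.3f}")
```

The error is zero only when the whole endowment was invested, and it grows with the share that was held back.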

In order to determine the role played by predictive error signals in decision-making, Chiu et al. concentrated their analysis on predictive errors in the case of gains, i.e. only in situations in which participants earned something below the possible maximum market return. The question is whether behaviour at t+1 depends on less than optimal positive returns at t. Individuals in the control group (non-smokers) illustrate this dependence, as we observe among them a positive influence of the foregone maximal possible return on the subsequent investment decision. This is not the case for either sated or non-sated smokers. Behavioural patterns on this sequential investment task show that predictive error signals have no weight in smokers’ decision-making. However, brain-imaging data show that fictive error signals are equally generated among smokers and non-smokers. Activity in the bilateral ventral caudate nucleus has been correlated with the treatment of predictive errors in the investment game (Lohrenz et al. 2007). Chiu et al. conclude that the intact neuronal response to predictive errors in smokers’ brains does not translate into corrective behavioural strategies.
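The dissociation described here can be illustrated with a toy updating rule in Python. This is not the model fitted by Chiu et al.; it is a hypothetical sketch in which the next investment equals the current one plus a decision weight times the fictive error. The weight values, the starting investment and the market returns are invented for illustration.

```python
def fictive_error(invested_fraction, market_return):
    """Foregone gain: what full investment would have earned minus what was earned."""
    return (1.0 - invested_fraction) * market_return

def simulate(decision_weight, returns, start=0.5):
    """Return the successive investments and the error signals that were generated."""
    invested = start
    investments, errors = [invested], []
    for market_return in returns:
        # Monitoring: the error signal is computed on every trial.
        error = fictive_error(invested, market_return)
        errors.append(error)
        # Control: the signal changes behaviour only through its decision weight.
        invested = min(1.0, max(0.0, invested + decision_weight * error))
        investments.append(invested)
    return investments, errors

RETURNS = [0.10, 0.20, 0.15]  # hypothetical positive market returns

control_bets, control_errors = simulate(decision_weight=2.0, returns=RETURNS)
smoker_bets, smoker_errors = simulate(decision_weight=0.0, returns=RETURNS)

print("non-smoker profile:", [round(b, 3) for b in control_bets])
print("smoker profile:    ", [round(b, 3) for b in smoker_bets])
print("smoker error signals:", [round(e, 3) for e in smoker_errors])
```

With a zero weight the error signals are still produced on every trial, yet the sequence of investments never moves, which is the pattern of intact monitoring without control that the text describes.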

This dissociation between error signals and behaviour can be further interpreted as a failure of integration between emotion and rationality. Significant activity in the anterior cingulate cortex in nicotine-deprived smokers, which can in fact be interpreted as a response to negative, salient, emotionally laden stimuli, shows that a ‘feeling of error’ is experienced by this group of participants, even though it is not enough to modify their subsequent decisions.

As Ahmed (2004) clearly puts it, ‘drug addicts are often portrayed as irrational persons who fail to maximize future rewards. [. . .] (But) to prove that addiction is an irrational behaviour, one needs to show that addicts would be better off if they had been prevented from taking drugs in the first place’. The tacit postulate in the application of learning models, and of conceptual constructs such as ‘predictive error signal’, to suboptimal behaviour is that across distinct populations (addicts versus non-addicts) there is a homogeneous and exogenous appraisal of actual and counterfactual rewards. One can alternatively speculate that this very ability to deal with such comparisons with equanimity is precisely what is impaired in addictive brains (Redish 2004). Chiu himself interprets his results as confirming the idea that addicts may have a diminished response to biological rewards: actual gains are not treated as rewards in the smokers group and are not positive reinforcers on which learning is normally based. But Chiu stops short of positing an endogenous dependence between the ‘internal’ supervisor which compares actual and foregone outcomes and addictive behaviour, because he observes that comparisons are intact while behaviour does not take as inputs those cognitive anticipated signals of regret, possibly associated with strong emotions.

Many studies have documented the role of midbrain dopamine neurons in generating predictive error signals (Schultz & Dayan 1997) and shown that dopamine is more sensitive to the prediction of reward than to the reception of reward (Heikkila et al. 1975). In Redish’s model of addiction, changes in the output of dopamine cells are supposed to signal to the forebrain discrepancies between prediction of reward and actual reward. The role of dopamine in learning models can also be phrased in terms of a distinction between monitoring and control functions more familiar to students of metacognition. Addicted individuals seem able to generate proper signals of error in the light of their past and present decisions, but they are not able to maximize future rewards by conferring more weight on decisions that will issue in optimal outcomes. Monitoring is intact but disconnected from cognitive control. This squares well with complementary data on discounting behaviour in non-smokers and smokers, the latter choosing comparatively smaller immediate gains over larger, more delayed ones (McClure et al. 2004). What is usually described in this context in terms of lack of control, impatience or myopia may be interpreted as the behavioural manifestation of a more general deficiency in the efficiency of dopamine-based error signals to guide decision-making in an optimal sense.

Figure 3. Sequential investment game. (The schema shows the state St at the centre, the state St+1 above it, and alternative states S on either side.)

The main lesson we can draw is the dissociation in certain individuals between the presence of signals of regret, both at cognitive and emotional and at implicit and explicit levels, and the correlative absence of strategic decision-making owing to the inefficiency of these signals for behavioural control. We can envision the reverse dissociation, which would consist in overly regret-averse behaviour uncorrelated with the presence of reliable error signals. We saw in addicted patients that error signals were generated, that a course of action could be cognitively estimated to be optimal and that, yet, this estimation was not transposed into actual behaviour. Observing manifestations of Tourette’s syndrome, one is tempted to describe a reverse sequence: an action is selected, which escapes cognitive and motor control (it is felt as an urge or a tic), and post hoc regret, if experienced, cannot be translated into a reliable error signal for the next occurrence of an action of this type. Blum et al. (1996) argue that the dopaminergic system, and in particular the dopamine D2 receptor, has been profoundly implicated in deficiencies of reward mechanisms in Tourette’s syndrome. Overproduction of dopamine by the brain may induce a patient to produce involuntary and uncontrolled actions. These involuntary actions should not in principle be associated with efficient predictive error signals, as they are uncontrolled.

An attempt at capturing this general prediction through a precise experimental paradigm is still tentative, and we simply suggest a possible way of making use at this juncture of the well-known clicking paradigm from behavioural economics (Erev & Barron 2005) (figure 4).

Simple decision tasks such as the clicking paradigm present the opportunity to manipulate the information on expected outcomes and feedback in a very flexible way. It is first possible to leave gains and their probabilities unknown at the moment of choice. Participants then decide in a state of full ambiguity, in the sense that no information is made available. One can then vary the expected gains as the task unfolds, making it an implicit learning task, on the model of the classical Iowa Gambling Task (Damasio 1994). It is also possible to provide feedback, either partial or complete, once a choice is made between the two boxes. This reproduces the two major conditions in Camille’s and Coricelli’s experiments. But in the absence of explicit information, the difference, again, is that no calculus is explicitly made at the moment of choice. The regret task is then embedded in an implicit learning task. In other terms, regret, as the task unfolds, will not tap directly into a cognitively elaborated anticipated counterfactual reasoning process, but directly into the experienced value of each box.
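A minimal simulation makes these manipulations explicit. In the Python sketch below the payoff distributions of the two keys, the function name and the feedback labels are all assumptions introduced for illustration. The participant is never told the distributions, and the feedback mode decides whether the payoff of the unselected key is revealed.

```python
import random

# Hypothetical payoff distributions for the two keys, hidden from the participant.
KEYS = {
    "left": lambda rng: rng.choice([1, 1, 1, -2]),
    "right": lambda rng: rng.gauss(0.5, 1.0),
}

def run_trial(choice, feedback, rng):
    """Play one trial and return what the participant is shown."""
    if choice not in KEYS:
        raise ValueError("choice must be 'left' or 'right'")
    if feedback not in ("partial", "complete"):
        raise ValueError("feedback must be 'partial' or 'complete'")
    # Both keys are drawn on every trial; only the display differs.
    payoffs = {key: draw(rng) for key, draw in KEYS.items()}
    other = next(key for key in KEYS if key != choice)
    return {
        "obtained": payoffs[choice],
        # The foregone payoff is revealed under complete feedback only.
        "foregone": payoffs[other] if feedback == "complete" else None,
    }

rng = random.Random(0)  # fixed seed so that the run can be repeated
for feedback in ("partial", "complete"):
    print(feedback, run_trial("left", feedback, rng))
```

Changing the entries of `KEYS` between blocks of trials would correspond to varying the expected gains as the task unfolds.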

Another layer in the clicking paradigm can be manipulated, which relates more closely to the normative dimension of regret we are interested in. In previous studies on the neurobiology of regret, the question whether regret was rational or not has been left aside. However, one can presume that regrets are finely modulated by their normative antecedents. Schematically, if an individual is not responsible for any bad consequence she faces, that individual is less liable to experience regret than if she can attribute to herself the authorship of the act leading to that consequence (Zeelenberg 1999). Responsibility and self-attributed authorship figure among what we label the rational antecedents of regret. Availability of information about the consequences of one’s choices is another obvious component of the rationality of regrets. In the clicking paradigm, one relevant design for studying the adaptive impact of regret among Tourette patients would combine implicit learning, explicit feedback and an experimental manipulation of the connection between choices and consequences. More precisely, patients will sometimes get feedback for choices they have not made, whereas the box they have actually clicked will yield no feedback. If one observes no difference, in terms of regret-averse behaviour, between outcomes that correspond and outcomes that do not correspond to patients’ actual choices, this would constitute initial evidence in favour of a disconnection between regret and a typical rational antecedent of choice such as authorship or responsibility.

Figure 4. The clicking paradigm. Instructions displayed to participants: ‘The current experiment includes many trials. Your task, in each trial, is to click on one of the two keys presented on the screen. Each click will result in a payoff that will be presented on the selected key, and will be added to your total payoff. Your goal is to maximize your total payoff. Click on one of the two keys.’

It has been more generally noted that patients with Tourette’s syndrome have paradoxical (or, at least, difficult to understand) attitudes towards the self-attribution of responsibility (Schroeder 2007). Those patients are presumably over-attributers of self-responsibility, which would be confirmed by a salient behavioural pattern in the crucial condition of our box-clicking experimental design. This invites further questions about the alleged constitutive connection between regret and its rational antecedents. The introduction of regret in decision theory in terms of decision weights must be refined in order to take into account the cases in which anticipated regret is under-weighted (e.g. in addictive patients) or over-weighted (e.g. possibly in Tourette’s patients).
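The idea of regret as a decision weight can be illustrated with a minimal sketch. It is not the model used in the studies discussed here: the convex (squared) regret term, the `weight` parameter and the two example options are assumptions introduced only for illustration, loosely in the spirit of regret theory (Loomes & Sugden 1982). A weight of zero stands for under-weighted anticipated regret, a larger weight for over-weighted anticipated regret.

```python
def expected_value(outcomes, probs):
    """Probability-weighted mean payoff of one option."""
    return sum(p * x for p, x in zip(probs, outcomes))


def anticipated_regret(own, other, probs):
    """Expected squared shortfall of `own` relative to `other`, state by state.

    The squared (convex) form is an illustrative assumption; a purely linear
    term could never reverse an expected-value ranking.
    """
    return sum(p * max(0.0, o - x) ** 2 for p, x, o in zip(probs, own, other))


def regret_adjusted_value(own, other, probs, weight):
    """Expected value minus weighted anticipated regret (hypothetical rule)."""
    return expected_value(own, probs) - weight * anticipated_regret(own, other, probs)


def choose(option_a, option_b, probs, weight):
    """Return 'A' or 'B' according to the regret-adjusted values."""
    value_a = regret_adjusted_value(option_a, option_b, probs, weight)
    value_b = regret_adjusted_value(option_b, option_a, probs, weight)
    return "A" if value_a >= value_b else "B"


# Hypothetical options over two states of the world (probabilities 0.9 and 0.1).
PROBS = (0.9, 0.1)
SAFE_A = (50.0, 50.0)   # pays 50 in both states
RISKY_B = (60.0, 0.0)   # pays 60 in the likely state, 0 otherwise

if __name__ == "__main__":
    for weight in (0.0, 0.05):
        print(weight, choose(SAFE_A, RISKY_B, PROBS, weight))
```

With these hypothetical numbers, a chooser who ignores anticipated regret takes the risky option B, whereas a modest positive weight is enough to shift the choice to the safe option A.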

4. DECISION TYPES AND REGRET

One type of normative antecedent that can modulate the triggering of post hoc or anticipated regret in decision-making is the type of procedure one follows and the awareness with which one follows that procedure. Imagine someone who is deliberately negligent in deciding in the Allais matrix: it is possible that, having experienced no anticipated regret, she will experience no post hoc regret either. She has left the outcome to chance and at best she will be more or less disappointed by her lack of luck or, inversely, may experience non-normative rejoicing if lucky enough. But it may be improper to speak of regret in the case of negligence and luck, except perhaps of a post hoc second-order regret at not having devoted more time and energy to pondering one’s decision.

In a way that evokes the conceptual difficulties surrounding moral luck when defining an agent as morally responsible (Williams 1981), we expect our emotions to be attuned to our normative status: scruples are the mark of moral deliberation in the same way as anticipated regret could be the mark of rational decision-making.

One case in point, then, is to be able to discriminate experimentally between regret linked to outcome and regret linked to procedure. Pieters & Zeelenberg (2003) underline two sources of regret: outcome and procedure. The use of poor decision procedures, when recognized by the subject, may arouse regret of its own. We will distinguish the case in which subjects have given more or less dedication to their decision procedures, on a scale that goes from complete negligence to extreme conscientiousness, from the case in which subjects may hesitate between competing procedures possibly embodying alternative criteria of rationality. As we have already noted with respect to the Allais problem, alternative solutions may impose themselves on an individual’s mind. This is what makes this decision problem a paradox. But how may regret, in such paradoxical situations, become a mark of rationality?

Regret is usually provoked by the emotional impact of the foregone alternative. When the choice of the latter has weak normative appeal, the standard prediction is that, in spite of a negative outcome, having chosen the more normatively appealing alternative is itself sufficient to block post hoc regrets. Let us give a very simple example of this situation. Choose between lotteries A and B (figure 5).

Imagine you choose B but get 0, and A yields 50. You would certainly be disappointed, but do you have anything to regret? It would have been a clearly irrational choice to prefer A over B. For some individuals, the rationality of choosing B may be enough to block regrets, should the imagined situation occur.

Note that this situation is not symmetrical with respect to lucky outcomes. Imagine you choose A and get 50, and that B would have yielded 0 had you chosen it. Now it is hard to restrain one’s rejoicing on the basis of post hoc rationalization.
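The distinction between disappointment, regret and rejoicing in this example can be sketched as follows. This is only an illustration under stated assumptions: the payoffs are those used in the example (0 or 50 for A, 0 or 200 for B), probabilities are left out, and the `chosen_is_normatively_better` flag is a hypothetical device encoding the suggestion that a clearly rational choice may block regret but that a poor choice does not dampen rejoicing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lottery:
    name: str
    outcomes: tuple  # possible payoffs; probabilities deliberately unspecified


LOTTERY_A = Lottery("A", (0, 50))
LOTTERY_B = Lottery("B", (0, 200))


def disappointment(chosen, obtained):
    """Shortfall relative to the best payoff the chosen lottery could have given."""
    return max(chosen.outcomes) - obtained


def felt_emotion(obtained, foregone, chosen_is_normatively_better):
    """Compare the obtained payoff with the payoff of the foregone lottery.

    Hypothetical rule: regret is blocked when the choice was clearly the
    rational one, while rejoicing is never blocked (the asymmetry noted
    in the text).
    """
    gap = foregone - obtained
    if gap > 0:
        return ("regret", 0 if chosen_is_normatively_better else gap)
    if gap < 0:
        return ("rejoicing", -gap)
    return ("neutral", 0)


if __name__ == "__main__":
    # Chose B, obtained 0, while A would have paid 50.
    print(disappointment(LOTTERY_B, 0), felt_emotion(0, 50, True))
    # Chose A, obtained 50, while B would have paid 0.
    print(disappointment(LOTTERY_A, 50), felt_emotion(50, 0, False))
```

In the first case the chooser is disappointed (B could have paid 200) but, on this rule, has nothing to regret; in the second case the rejoicing is felt in full even though the choice was the less rational one.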

Our issue is with decision problems for which there is no such normative gap between the alternatives. There is a special problem in situations in which it is particularly hard to make up one’s mind about the respective normative appeal of the choices presented. In paradoxes such as Allais’s problem, a lucid participant may mentally balance the intuitiveness of one type of choice against the other with no clear decision criterion to use other than, precisely, the attempt to minimize anticipated regrets. Procedural indeterminacy, in that very case, may turn potential regret linked to outcome into the sole rational decision criterion at hand.

The investigation of how regret is a mark of the rationality/irrationality of choice procedures must include, in those special contexts in which subjects may hesitate between alternative norms and procedures, an independent measure of the decisiveness or confidence with which the decision has been made. We can conceive of two ways of adding this crucial measure, direct and indirect. One can consistently, throughout the performance of a task, elicit the degree of confidence that accompanies the decision. Confidence scales provide a common means, along with post-wagering methods (Persaud 2007) and other confidence elicitation methods favoured by experimental economists (Holt & Laury 2002). We will not dwell upon the further methodological difficulties affecting the addition of those measures to the repetitive unfolding of an experimental session, as we propose to proceed in a completely different, built-in manner. We will take advantage of a classical decision problem, the Newcomb problem (Nozick 1969), presented as involving a paradox of rationality in which the choice of alternatives coincides in principle with types (rather than levels) of confidence vis-à-vis one’s choice (Baratgin et al. in preparation).

Figure 5. Lottery (a): 0 or 50; lottery (b): 0 or 200.

Newcomb problems have the following structure (figure 6).

Let us label people one-boxers and two-boxers according to their decisions in Newcomb problems. What is the presumed mental typology associated with those decision types, and how does it connect to the issue of normative antecedents of regret? Two-boxers go against the prediction. The decision criteria that two-boxers and one-boxers presumably follow have been characterized, in the philosophical branch of decision theory, as causalist and evidentialist, respectively (Joyce 1999). Two-boxers show, so to speak, a higher autonomy, that is, a higher level of decisiveness, in their choices than do one-boxers, whose possible faith in their choice amounts to a form of alienated confidence or credulity. But integrating predictions, signs and symbolic value into one’s decision criteria may not be altogether irrational (Nozick 1993). It is at least pervasive enough, as when one convinces oneself of one’s good health by accomplishing acts that could be signs of good health, or of the influence of one’s vote in national elections by going to vote (Quattrone & Tversky 1984).
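The contrast between evidentialist and causalist reasoning can be made concrete with a short sketch. The payoffs (1 in the transparent box B2, 10 or nothing in the opaque box B1) follow figure 6; the predictor accuracy of 0.85 and the function names are illustrative assumptions, and the calculation is the textbook expected-value comparison rather than anything computed in the study.

```python
def evidential_values(accuracy, small=1.0, large=10.0):
    """Expected payoffs when one's own choice is treated as evidence
    about what the predictor foresaw.

    accuracy: assumed probability that the prediction matches the choice.
    """
    one_box = accuracy * large                       # B1 is full if predicted correctly
    two_box = accuracy * small + (1 - accuracy) * (small + large)
    return one_box, two_box


def causal_values(prob_b1_full, small=1.0, large=10.0):
    """Expected payoffs when the content of B1 is treated as already fixed
    and independent of the present choice."""
    one_box = prob_b1_full * large
    two_box = prob_b1_full * large + small           # always adds the content of B2
    return one_box, two_box


if __name__ == "__main__":
    print("evidential:", evidential_values(0.85))
    for q in (0.0, 0.5, 1.0):
        print("causal, P(B1 full) =", q, causal_values(q))
```

Under the evidential reading, one-boxing has the higher expected value (8.5 against 2.5 with these numbers), while under the causal reading taking both boxes is better by exactly the content of B2, whatever B1 holds.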

Shafir & Tversky (1992) ran the first empirical investigation of Newcomb problems. They submitted a Newcomb problem to their subjects as a bonus problem at the end of a series of Prisoner’s Dilemmas played via computer terminals. Their cover story was that ‘a program developed at MIT was applied during the entire session (of Prisoner’s Dilemma choices) to analyze the pattern of your preference, and predict your choice (one or two boxes) with an 85 per cent accuracy’. Although it was evident that the money amounts were already set at the moment of choice, most experimental subjects opted for the single box. It is ‘as if’ they believed that by declining to take the money in Box B2, they could change the amount of money already deposited in Box B1. They did not test whether regret differed between one-boxers and two-boxers when outcomes were revealed.

We formed the prediction that one-boxers, when facing negative outcomes, would experience a greater amount of regret than would two-boxers in the same situation. This is due, we speculate, to the lesser decisiveness or autonomy with which those choices are made, in spite of their greater faithfulness to the prediction. If a difference emerges between types of decision and amount of regret in the Newcomb problem, this can be considered a step towards a better understanding of how regret taps into rational antecedents of choices and can be modulated by competing criteria of rationality. We proceeded in a way comparable to Shafir and Tversky’s: our participants were told that if the program had predicted that they would now choose the two boxes, Box B1 would be empty, and if it had predicted that they would choose Box B1 only, it would contain €10.

The game was framed so that Box B1 would always be empty when participants chose it. So when participants chose Box B1 + Box B2, they would earn €1, and nothing when they chose Box B1. We added a retrospective measure of regret on a 5-point scale.

Our results show a significant difference between types of choices and levels of regret as captured on this scale. The following table presents descriptive statistics for the variable Regret for each type of decision (one-boxers or two-boxers) in the Newcomb problem.

analysis      number   mean regret   CI
two-boxers    20       2.25          0.6
one-boxers    10       4.23          1.21
total         30       2.93          0.66

Two-boxers experience significantly less regret than one-boxers in spite, of course, of the disappointment of discovering that the second box is empty. The reason is that two-boxers acted with a higher level of confidence and made a choice that was less dependent on external guidance than that of one-boxers. It is true that one-boxers, having put their faith in the Newcomb prediction, feel fooled by the experiment. The disappointment is in principle the same among the two types of deciders, in the sense that they both miss the €10 that they expected, but the way in which they have lost it differs radically. Disappointed one-boxers think that they should not have trusted the prediction; disappointed two-boxers have less reason to think that things would have been otherwise had they chosen Box B1 only. This result tends to show that regret is sensitive to the way disappointment occurs as well as to whether I can retrospectively assess my decision criterion as being the most rational, when conflicting decision principles were available at the moment of choice.

Imagine a being with great predictive powers.

You are confronted with two boxes: B1 and B2. B1 is opaque and B2 is transparent; you can see that it contains €1.

B2 contains €1; B1 contains either €10 or nothing.

You may choose B1 alone or B1 and B2 together.

If the being predicts that you choose both boxes, he does not put anything in B1; if he predicts that you choose B1 only, he puts €10 in B1.

What should you choose?

Figure 6. A Newcomb problem.


5. CONCLUSION

We addressed the question of whether regret is modulated by the rationality of decision procedures on the basis of existing or prospective experiments on patients and healthy subjects. We think that a variety of rational antecedents of choice explains the impact of regret on subsequent decision behaviour. Extant neurobiological studies by Camille et al. (2004) and Coricelli et al. (2005) on the adaptive role of regret in decision-making rightly emphasize the necessary integration of emotional and cognitive components for optimal decision behaviour. We think that further conceptual distinctions are useful, in particular between regrets considered as error signals and regrets as decision weights, in order to uncover the cognitive and neural mechanisms through which regret positively influences behaviour. Dissociations between the ability to anticipate regret on the basis of information on alternative rewards and the ability to implement a behavioural strategy in accordance with this information may occur in certain types of patients. We described this dissociation in terms of regrets as error signals and regrets as decision weights. Regrets can be under-weighted or over-weighted in decision-making, loosening the connection between a proper processing of error signals and behaviour. In healthy individuals, we postulate a calibration between the rational processing of information in the decision task and the level of regret experienced. In chronic smokers and Tourette syndrome patients, we observe, on the contrary, that the generation of error signals may be inefficient in reinforcing optimal behaviour, either because information has no weight on decision-making or because it is improperly processed.

Regret is not only dependent upon the quality of information processing relative to past and future outcomes. It is also dependent upon an array of what we termed rational antecedents of choices, i.e. factors that make it more or less rational to experience regret. Being sure that I have properly processed the information that was available to me is one of these factors. When I realize that I neglected some relevant aspects of the situation in making a decision that resulted in a poor outcome, I am liable to experience more acute pangs of regret than if I had been meticulous. Conversely, I may feel regret only for outcomes vis-à-vis which I bear some degree of responsibility. When nature or chance has yielded the outcome, I have no reason to blame myself for what happens. This conflict between responsibility and nature (or God) is what is paradigmatically encapsulated in the famous Newcomb paradox.

We addressed the question of whether regret associated with the experience of disappointing outcomes in an experimental Newcomb test was dependent on the types of decision subjects were invited to make. We observed that when subjects were not deferring their decision criteria to external guidance, they tended to experience less regret than in the contrary case. It is only seemingly paradoxical to say that regret is both triggered by my involvement in a course of action and attenuated by the feeling that I acted as an autonomous agent. Future clinical and neurobiological studies on regret will probably tackle this deep philosophical issue of the connection between self-blame and free will.

REFERENCES

Ahmed, S. 2004 Addiction as a compulsive reward prediction. Science 300, 1901–1902.

Allais, M. 1953 Le Comportement de l’Homme Rationnel devant le Risque. Critique des Postulats de l’Ecole Américaine. Econometrica 21, 503–546. (doi:10.2307/1907921)

Camille, N., Coricelli, G., Sallet, J., Pradat, P., Duhamel, J.-R. & Sirigu, A. 2004 The involvement of the orbitofrontal cortex in the experience of regret. Science 304, 1167–1170. (doi:10.1126/science.1094550)

Chiu, P., Lohrenz, T. & Montague, R. 2008 Smokers’ brains compute, but ignore a fictive error signal in a sequential investment game. Nat. Neurosci. 11, 514–520. (doi:10.1038/nn2067)

Coricelli, G., Critchley, H., Joffily, M., O’Doherty, L., Sirigu, A. & Dolan, R. 2005 Regret and its avoidance: a neuroimaging study of choice behavior. Nat. Neurosci. 8, 1255–1262. (doi:10.1038/nn1514)

Damasio, A. 1994 Descartes’ error: emotion, reason, and the human brain. New York, NY: Avon Books.

Erev, I. & Barron, G. 2005 On adaptation, maximization, and reinforcement learning among cognitive strategies. Psychol. Rev. 112, 912–931. (doi:10.1037/0033-295X.112.4.912)

Hart, S. & Mas-Colell, A. 2000 A simple adaptive procedure leading to correlated equilibrium. Econometrica 68, 1127–1150. (doi:10.1111/1468-0262.00153)

Heikkila, R. E., Orlansky, H. & Cohen, G. 1975 Studies on the distinction between uptake inhibition and release of (3H) dopamine in rat brain tissue slices. Biochem. Pharmacol. 24, 847–852. (doi:10.1016/0006-2952(75)90152-5)

Holt, C. & Laury, S. 2002 Risk aversion and incentive effects. Am. Econ. Rev. 92, 1644–1655. (doi:10.1257/000282802762024700)

Joyce, J. 1999 The foundations of causal decision theory. Cambridge, UK: Cambridge University Press.

Kahneman, D. & Tversky, A. 1979 Prospect theory: an analysis of decisions under risk. Econometrica 47, 313–327.

Laland, J. W. & Grafman, J. 2005 Experimental tests of the somatic marker hypothesis. Games Econ. Behav. 52, 386–409.

Lau, B. & Glimcher, P. 2007 Action and outcome encoding in the primate caudate nucleus. J. Neurosci. 52, 14 502–14 514.

Lohrenz, T., McCabe, K., Camerer, C. & Montague, R. 2007 Neural signature of fictive learning signals in a sequential investment task. Proc. Natl Acad. Sci. 104, 9493–9498. (doi:10.1073/pnas.0608842104)

Loomes, G. & Sugden, R. 1982 Regret theory: an alternative theory of rational choice under uncertainty. Econ. J. 92, 805–824. (doi:10.2307/2232669)

McClure, S., Laibson, D., Loewenstein, G. & Cohen, J. 2004 Separate neural systems value immediate and delayed monetary rewards. Science 306, 503–507. (doi:10.1126/science.1100907)

Nozick, R. 1969 Newcomb’s problem and two principles of choice. In Essays in honor of Carl G. Hempel (ed. N. Rescher), pp. 107–133. Dordrecht, The Netherlands: Reidel.

Nozick, R. 1993 The nature of rationality. Princeton, NJ: Princeton University Press.
