Solving sequential collective decision problems under qualitative uncertainty


HAL Id: hal-02378370

https://hal.archives-ouvertes.fr/hal-02378370

Submitted on 25 Nov 2019


Nahla Ben Amor, Fatma Essghaier, Hélène Fargier

To cite this version:

Nahla Ben Amor, Fatma Essghaier, Hélène Fargier. Solving sequential collective decision problems under qualitative uncertainty. International Journal of Approximate Reasoning, Elsevier, 2019, 109, pp. 1-18. doi:10.1016/j.ijar.2019.03.003. hal-02378370.


This is an author’s version published in:

http://oatao.univ-toulouse.fr/25042


ISSN 0888-613X


Nahla Ben Amor (a), Fatma Essghaier (a,b), Hélène Fargier (b)

a LARODEC, University of Tunis, Tunisia

b IRIT, UPS-CNRS, 118 route de Narbonne, 31062 Toulouse, France

Abstract

Keywords: Possibility theory; Qualitative uncertainty; Collective decision making; Sequential decision making; Decision tree; Dynamic Programming

This paper addresses the question of sequential collective decision making under qualitative uncertainty. It builds on the criteria introduced in previous works [4–6] by Ben Amor et al. and extends them to a more general context where every decision maker is free to have an optimistic or a pessimistic attitude w.r.t. uncertainty. These criteria are then considered for the optimization of possibilistic decision trees, and an algorithmic study is performed for each of them. When the global utility satisfies the monotonicity property, classical possibilistic Dynamic Programming can be applied. Otherwise, two cases are possible: either the criterion is max oriented (the greater the satisfaction of any agent, the greater the global satisfaction), and a dedicated algorithm can be proposed that relies on as many calls to Dynamic Programming as there are decision makers; or the criterion is min oriented (all the agents must like the common decision), and the optimal strategy can be provided by a Branch and Bound algorithm. The paper concludes with an experimental study that shows the feasibility of the approaches and details to what extent simple Dynamic Programming algorithms can be used as approximation procedures for the non-monotonic criteria.

1. Introduction

The handling of a collective decision problem under uncertainty rests on (i) the identification of a theory of decision making under uncertainty (DMU) that captures the decision makers' behavior with respect to uncertainty and (ii) the specification of a collective utility function (CUF) as it may be used when the problem is not pervaded with uncertainty. One also needs to specify when the utility of the agents is to be evaluated: before (ex-ante) or after (ex-post) the realization of the uncertain events. In the first case, the global utility function is a function of the DMU utilities of the different agents; in the second case, it is an aggregation, w.r.t. the likelihood of the final states, of the collective utilities.

Following Fleming [23], Harsanyi [27] has shown that, when the uncertainty about consequences of decisions can be quantified in a probabilistic way, the collective utility should be a weighted sum of the individual expected utilities. Many contributions have been inspired by this seminal work. Some authors (such as Diamond [11]) criticized this approach because it is not applicable when the collective utility is more egalitarian than utilitarian. Others have developed Harsanyi's approach, in particular Myerson [37], who proved that only the choice of a utilitarian social welfare function can reconcile the ex-ante and ex-post approaches. In the probabilistic case, however, all other welfare functions suffer from the "timing effect" [37], i.e., lead to a discrepancy between the ex-ante and the ex-post approach.

✩ This paper is an extended version of preliminary results presented in [7]; it includes the full proofs of the propositions, and new models and algorithms that allow the modeling of collective sequential problems where the agents have different attitudes w.r.t. uncertainty.

E-mail addresses: nahla.benamor@gmx.fr (N. Ben Amor), essghaier.fatma@gmail.com (F. Essghaier), fargier@irit.fr (H. Fargier).

Harsanyi's and Myerson's results rely on the assumption that the knowledge of the agents about the consequences of their decisions is rich enough to be modeled by probabilistic lotteries. When the information about uncertainty cannot be quantified in a probabilistic way, possibilistic decision theory is often a natural framework to consider [12–14,18,21,25]. Qualitative decision theory is relevant, among other fields, for applications to planning under uncertainty, where a suitable strategy (i.e., a set of conditional or unconditional decisions) is to be found, starting from a qualitative description of the initial world, of the available alternatives, of their (perhaps uncertain) effects and of the goal to reach (see [8,41,43]). Up to this point, however, the evaluation of strategies has been considered only in a simple, mono-agent context, while it is often the case that several agents are involved in the decision.

The present paper raises the question of sequential collective decision making under possibilistic uncertainty. It follows recent works [4–6] which propose a theoretical framework for multi-agent (non-sequential) decision making under possibilistic uncertainty. It extends these results to decision problems where agents may have different attitudes w.r.t. uncertainty (some may be optimistic while others are pessimistic) and provides new decision rules for this specific case. Then, we tackle the problem of (possibilistic) sequential decision making and provide an algorithmic study of strategy optimization in collective possibilistic decision trees.

The remainder of this paper is organized as follows. The next section recalls the basic notions on which our work relies (decision under possibilistic uncertainty, collective utility functions, etc.). In Section 3, we recall the decision criteria introduced in [4] and define new ones for agents with different attitudes. Section 4 is devoted to strategy optimization in collective possibilistic decision trees, using the decision rules previously defined. The picture is completed by an experimental study, presented in Section 5.

2. Background and basic notions

2.1. Collective utility functions

Let us consider a set $A = \{1, \dots, p\}$ of agents that have to make a decision. Each agent $i \in A$ is supposed to express his/her preferences on a set of alternatives (say, a set $X$) by a ranking function or a utility function $u_i$ that associates to each element of $X$ a value in a subset of $\mathbb{R}^+$ (typically the unit interval $[0,1]$). In the absence of uncertainty, each decision leads to a unique consequence, and a utility vector $\vec{u} = \langle u_1, \dots, u_p \rangle$ is associated to each of them. Besides, when agents are not equally important, we define a vector $\vec{w} = \langle w_1, \dots, w_p \rangle$ where each agent $i$ is equipped with a weight $w_i \in [0,1]$ reflecting its importance. Solving the problem thus comes down to computing a global utility degree that reflects the collective preference by aggregating the different $u_i$'s.

In a qualitative framework, such an aggregation shall be either conjunctive (i.e., based on a weighted min) or disjunctive (i.e., based on a weighted max); see [15] for more details about weighted min and weighted max aggregations. Formally, these aggregations are defined as follows:

Disjunctive aggregation:
$$Agg^{\max}(x) = \max_{i \in A} \min(w_i, u_i(x)). \tag{1}$$

Conjunctive aggregation:
$$Agg^{\min}(x) = \min_{i \in A} \max(1 - w_i, u_i(x)). \tag{2}$$
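As a minimal sketch (not part of the original paper), Eqs. (1) and (2) can be written in a few lines of Python. The function names `agg_max` and `agg_min`, as well as the sample weights and utilities, are illustrative assumptions.

```python
from typing import Sequence


def agg_max(weights: Sequence[float], utilities: Sequence[float]) -> float:
    """Disjunctive (weighted max) aggregation, Eq. (1)."""
    return max(min(w, u) for w, u in zip(weights, utilities))


def agg_min(weights: Sequence[float], utilities: Sequence[float]) -> float:
    """Conjunctive (weighted min) aggregation, Eq. (2)."""
    return min(max(1 - w, u) for w, u in zip(weights, utilities))


# Hypothetical data: agent 1 is fully important, agent 2 only half as much.
weights = [1.0, 0.5]
utilities = [0.2, 0.9]  # u_1(x), u_2(x) for a single alternative x

disjunctive = agg_max(weights, utilities)
conjunctive = agg_min(weights, utilities)
```

With these values, the disjunctive score is driven by the partially important but satisfied agent 2, while the conjunctive score is driven by the fully important but dissatisfied agent 1.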

2.2. Multi-agent decision making under risk

In a framework of decision making under risk, when the information about the consequences of decisions is probabilistic, a popular criterion to compare alternatives is the expected utility model axiomatized by Von Neumann and Morgenstern [35]: an elementary decision is modeled by a probability distribution over the set $X$ of possible outcomes. It is called a simple probabilistic lottery and is denoted by $L = \langle \lambda_1/x_1, \dots, \lambda_n/x_n \rangle$, where $\lambda_j = p(x_j)$ is the probability that the decision leads to outcome $x_j$. It is also supposed that the preferences of a single decision maker are captured by a utility function $u$ assigning a numerical value to each $x_j$. Solving such problems amounts to evaluating risky alternatives and choosing among them. In other words, we compute the expected utility of each lottery and select the one with the highest value (the greater, the better).

When several agents are involved, the aggregation of individual preferences under risk raises a particular problem, depending on when the utility of the agents is to be evaluated: before or after the consideration of uncertainty. This yields two different approaches, namely the ex-ante and the ex-post aggregation:

• The ex-ante approach consists in computing the utility of each agent first, and then performing the aggregation using the agents' weights.

• The ex-post approach consists in first determining the aggregated utility (conjunctive or disjunctive) relative to each possible outcome of $X$, and then considering the uncertainty and the likelihood of states.

For the same decision problem and using the same criterion, the two approaches do not always coincide: aggregating the agents' attitudes before or after the consideration of uncertainty may lead to different conclusions. This phenomenon has been identified by Myerson [37] as the "timing effect".

2.3. Single agent decision making under possibilistic uncertainty

The expected utility model owes its popularity essentially to its strong axiomatic justifications [35,44]. However, it involves only a quantitative representation of uncertainty, and it has been proved that this formalism cannot represent all decision makers' behaviors [1,22]. Besides, when the decision maker is unable to express his/her uncertainty and preferences numerically and can only give an order among different alternatives, the probabilistic framework is inappropriate, and non-probabilistic models (such as imprecise probabilities [46], evidence theory [45], rough set theory [39] and possibility theory [16,19,47,48]) become relevant alternatives. In particular, one may consider the qualitative possibilistic decision rules that have emerged with the growth of possibility theory.

Possibility theory is a framework for handling uncertainty that stems from fuzzy set theory. It was introduced by Zadeh [47,48] and further developed by Dubois and Prade [16,17]. The basic building block in this framework is the notion of possibility distribution. It represents the knowledge of a decision maker about the state of the world. A possibility distribution is denoted by $\pi$ and maps each state $s$ in the universe of discourse $S$ to a bounded linearly ordered scale $V$, typically the unit interval $[0,1]$. This scale can be interpreted in a quantitative or a qualitative way; the latter is the context of our work. Independently of the scale used, given $s \in S$, $\pi(s) = 1$ means that the realization of $s$ is totally possible and $\pi(s) = 0$ means that $s$ is impossible. $\pi$ is said to be normalized if there exists at least one $s \in S$ that is totally possible. Extreme cases of knowledge in possibility theory are complete knowledge and total ignorance. In the first case, we assign 1 to a totally possible state $s_0$ and 0 otherwise (i.e., $\exists s_0, \pi(s_0) = 1$ and $\forall s \neq s_0, \pi(s) = 0$). In the second one, we assign 1 to all situations (i.e., $\pi(s) = 1, \forall s \in S$).

Possibility theory is characterized by the use of two dual measures, namely the possibility measure $\Pi$ and the necessity measure $N$, defined by:

• Possibility measure: $\Pi(E) = \max_{s \in E} \pi(s)$. It denotes the possibility degree evaluating at which level an event $E \subseteq S$ is consistent with the knowledge represented by $\pi$.

• Necessity measure: $N(E) = 1 - \Pi(\neg E) = \min_{s \notin E} (1 - \pi(s))$. It denotes the necessity degree evaluating at which level an event $E \subseteq S$ is certainly implied by the knowledge.
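The two measures can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the dictionary encoding of $\pi$, the state names and the degrees are hypothetical, and the possibility of the empty event is taken to be 0 by convention.

```python
def possibility(pi: dict, event: set) -> float:
    """Pi(E): highest possibility degree among the states of E (0 if E is empty)."""
    return max((pi[s] for s in event), default=0.0)


def necessity(pi: dict, event: set) -> float:
    """N(E) = 1 - Pi(not E), where the complement is taken in the universe S."""
    complement = set(pi) - set(event)
    return 1 - possibility(pi, complement)


# Hypothetical normalized distribution over three states.
pi = {"s1": 1.0, "s2": 0.5, "s3": 0.0}

poss_s1_s2 = possibility(pi, {"s1", "s2"})
nec_s1_s2 = necessity(pi, {"s1", "s2"})   # only the impossible state s3 is left out
nec_s1 = necessity(pi, {"s1"})            # s2 remains somewhat possible

# Total ignorance: every state is fully possible, so no proper event is certain.
ignorance = {"s1": 1.0, "s2": 1.0, "s3": 1.0}
nec_under_ignorance = necessity(ignorance, {"s1"})
```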

Following Dubois and Prade's possibilistic approach to decision making under qualitative uncertainty [18], a decision can be seen as a possibility distribution over a finite set of outcomes $X$, called a (simple) possibilistic lottery. Such a lottery is denoted by $L = \langle \lambda_1/x_1, \dots, \lambda_n/x_n \rangle$, where $\lambda_j = \pi_L(x_j)$ is the possibility that decision $L$ leads to outcome $x_j$; this possibility degree can also be denoted by $L[x_j]$. In this framework, a decision problem is thus fully specified by a set of possibilistic lotteries on $X$ and a utility function $u : X \mapsto [0,1]$ expressing the decision maker's preferences. Under the assumption that the utility scale and the possibility scale are commensurate and purely ordinal, Dubois and Prade propose to evaluate each lottery by a qualitative, optimistic or pessimistic utility [18]:

Optimistic utility:
$$U^{+}(L) = \max_{x_j \in X} \min(L[x_j], u(x_j)). \tag{3}$$

Pessimistic utility:
$$U^{-}(L) = \min_{x_j \in X} \max(1 - L[x_j], u(x_j)). \tag{4}$$

$U^{+}(L)$ is a mild version of the maximax criterion: $L$ is good as soon as it is totally plausible that it gives a good consequence. On the contrary, the pessimistic utility $U^{-}(L)$ estimates the utility of an act by its worst possible consequence: its value is high whenever $L$ gives good consequences in every "rather plausible" state. These two utilities can be seen as the ordinal counterparts of the expected utility criterion and have been axiomatized in the style of the Von Neumann and Morgenstern [18] and Savage [20] frameworks.
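A small Python sketch may help to see how Eqs. (3) and (4) behave. The encoding of a lottery as a dictionary from outcomes to possibility degrees, the function names and the numerical values are assumptions made for this illustration only.

```python
def optimistic_utility(lottery: dict, u: dict) -> float:
    """U+(L), Eq. (3): an outcome counts if it is both possible and good."""
    return max(min(lottery[x], u[x]) for x in lottery)


def pessimistic_utility(lottery: dict, u: dict) -> float:
    """U-(L), Eq. (4): every rather plausible outcome must be good."""
    return min(max(1 - lottery[x], u[x]) for x in lottery)


# Hypothetical lottery: x1 is fully possible, x2 only half possible.
lottery = {"x1": 1.0, "x2": 0.5}
u = {"x1": 0.25, "x2": 0.75}

u_plus = optimistic_utility(lottery, u)
u_minus = pessimistic_utility(lottery, u)
```

Here the optimistic evaluation is lifted by the good but only partly possible outcome `x2`, whereas the pessimistic one is capped by the fully possible but poor outcome `x1`.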

3. A possibilistic approach to collective decision making

3.1. Collective qualitative decision rules

Let us now consider collective decision making under qualitative uncertainty. In this framework, the decision makers' attitude with respect to uncertainty can be either optimistic ($U^{+}$) or pessimistic ($U^{-}$), and the aggregation of the agents' preferences can be either conjunctive, egalitarian ($Agg^{\min}$) or disjunctive, non-egalitarian ($Agg^{\max}$).

Consider the decision problem defined by a set of agents $A$, a vector of utility functions $\vec{u} = \langle u_1, \dots, u_p \rangle$, a weighting vector $\vec{w} = \langle w_1, \dots, w_p \rangle$ and a possibilistic lottery $L = \langle L[x_1]/\vec{u}(x_1), \dots, L[x_n]/\vec{u}(x_n) \rangle$ on a set of consequences $X$. Ben Amor et al. [4–6] have proposed and axiomatized four ex-ante and four ex-post decision criteria to solve such problems.


Fig. 1. Possibilistic (constant) lottery of Example 1.

$$U^{-\min}_{ante}(L) = \min_{i \in A} \max\big((1 - w_i), \min_{x_j \in X} \max(u_i(x_j), (1 - L[x_j]))\big). \tag{5}$$

$$U^{-\max}_{ante}(L) = \max_{i \in A} \min\big(w_i, \min_{x_j \in X} \max(u_i(x_j), (1 - L[x_j]))\big). \tag{6}$$

$$U^{+\min}_{ante}(L) = \min_{i \in A} \max\big((1 - w_i), \max_{x_j \in X} \min(u_i(x_j), L[x_j])\big). \tag{7}$$

$$U^{+\max}_{ante}(L) = \max_{i \in A} \min\big(w_i, \max_{x_j \in X} \min(u_i(x_j), L[x_j])\big). \tag{8}$$

$$U^{-\min}_{post}(L) = \min_{x_j \in X} \max\big((1 - L[x_j]), \min_{i \in A} \max(u_i(x_j), (1 - w_i))\big). \tag{9}$$

$$U^{-\max}_{post}(L) = \min_{x_j \in X} \max\big((1 - L[x_j]), \max_{i \in A} \min(u_i(x_j), w_i)\big). \tag{10}$$

$$U^{+\min}_{post}(L) = \max_{x_j \in X} \min\big(L[x_j], \min_{i \in A} \max(u_i(x_j), (1 - w_i))\big). \tag{11}$$

$$U^{+\max}_{post}(L) = \max_{x_j \in X} \min\big(L[x_j], \max_{i \in A} \min(u_i(x_j), w_i)\big). \tag{12}$$

Regarding notation: the subscript indicates the approach used (ex-ante or ex-post), and the superscript denotes the decision makers' attitude w.r.t. uncertainty (pessimistic "−" or optimistic "+") and the aggregation of the agents' preferences (conjunctive "min" or disjunctive "max").

The $U^{-\min}_{ante}$ utility, for instance, considers that the decision makers are pessimistic and computes the pessimistic utility of each of them. Then, the $U^{-}_i$'s are aggregated on a cautious basis: the higher the satisfaction of the least satisfied of the important agents, the better the lottery. Using the same notation, $U^{-\max}_{post}$ considers that a consequence $x_j$ is good as soon as one of the important agents is satisfied: a max-based aggregation of the utilities is performed, yielding a unique utility function $Agg()$ on the basis of which the pessimistic utility is computed.

In Ref. [6], the authors proposed a qualitative counterpart of Harsanyi's theorem [27] and showed that the fully min-oriented and fully max-oriented ex-ante utilities are equivalent to their ex-post counterparts, i.e., $U^{-\min}_{ante} = U^{-\min}_{post}$ and $U^{+\max}_{ante} = U^{+\max}_{post}$. But $U^{-\max}_{ante}$ (resp. $U^{+\min}_{ante}$) may differ from $U^{-\max}_{post}$ (resp. from $U^{+\min}_{post}$): these criteria suffer from the timing effect.

Example 1. Consider two equally important agents 1 and 2 ($w_1 = w_2 = 1$), and a lottery $L = \langle 1/x_a, 1/x_b \rangle$ defining a state of total ignorance about consequences $x_a$ and $x_b$ ($\pi(x_a) = \pi(x_b) = 1$) (see Fig. 1). The first consequence is good for 1 and bad for 2, and the second one is bad for 1 and good for 2: $u_1(x_a) = u_2(x_b) = 1$ and $u_2(x_a) = u_1(x_b) = 0$.

It is easy to check that $U^{+\min}_{ante}(L) = 1 \neq U^{+\min}_{post}(L) = 0$, where:

$$U^{+\min}_{post}(L) = \max\big(\min(1, \min(\max(1-1, 1), \max(1-1, 0))),\ \min(1, \min(\max(1-1, 0), \max(1-1, 1)))\big) = 0.$$

$$U^{+\min}_{ante}(L) = \min\big(\max(1-1, \max(\min(1, 1), \min(1, 0))),\ \max(1-1, \max(\min(1, 0), \min(1, 1)))\big) = 1.$$
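The timing effect of Example 1 can be reproduced with a short Python sketch of Eqs. (7) and (11). The function names and the dictionary encoding of the lottery and of the utility functions are assumptions made for this illustration; the data are those of Example 1.

```python
def u_plus_min_ante(lottery, utils, weights):
    """Eq. (7): optimistic utility per agent first, then weighted min."""
    return min(
        max(1 - w, max(min(u[x], lottery[x]) for x in lottery))
        for u, w in zip(utils, weights)
    )


def u_plus_min_post(lottery, utils, weights):
    """Eq. (11): weighted min per consequence first, then optimistic utility."""
    return max(
        min(lottery[x], min(max(u[x], 1 - w) for u, w in zip(utils, weights)))
        for x in lottery
    )


# Data of Example 1: total ignorance, opposite preferences, equal weights.
lottery = {"xa": 1, "xb": 1}
utils = [{"xa": 1, "xb": 0},   # agent 1
         {"xa": 0, "xb": 1}]   # agent 2
weights = [1, 1]

ante = u_plus_min_ante(lottery, utils, weights)
post = u_plus_min_post(lottery, utils, weights)
```

The ex-ante value is 1 because each optimistic agent sees a fully possible good outcome, while the ex-post value is 0 because no single consequence satisfies both agents.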

3.2. Possibilistic decision rules for agents with different attitudes w.r.t. uncertainty

The decision rules presented in the previous section assume that all the agents are either purely optimistic or purely pessimistic. However, the attitude of each decision maker has a major impact on the chosen alternative: for the same decision problem, merely varying the attitude of one decision maker may radically change the results. Besides, in real-world problems, it is seldom the case that all the decision makers have the same attitude: the group may gather pessimistic as well as optimistic persons. So, the decision has to be made by considering the individual differences in tolerance and intolerance for uncertainty. In the following, we get rid of the assumption of a common attitude for the collectivity and propose more general decision rules that are appropriate for handling situations where each agent is free to express his/her attitude w.r.t. uncertainty.

Nevertheless, dealing with such problems imposes the use of ex-ante aggregation and forces us to give up the ex-post […] agent representing the collectivity), imposes the use of one and only one attitude. Hence, it is not possible to respect each agent's attitude towards uncertainty with such a method. The ex-ante approach, on the contrary, allows the handling of the different attitudes of heterogeneous agents. This leads us to extend the ex-ante decision rules as follows:

Definition 1. Given a possibilistic lottery $L$ on $X$, a set of agents $A$ (where each agent $i$ is either optimistic or pessimistic), a vector of utility functions $\vec{u}$ and a weighting vector $\vec{w}$, let:

$$U^{\max}_{ante}(L) = \max_{i \in A} \min\big(w_i, \otimes_{x_j \in X} \oplus(u_i(x_j), \Lambda[x_j])\big). \tag{13}$$

$$U^{\min}_{ante}(L) = \min_{i \in A} \max\big(1 - w_i, \otimes_{x_j \in X} \oplus(u_i(x_j), \Lambda[x_j])\big), \tag{14}$$

where $\otimes = \min$ (resp. $\max$), $\oplus = \max$ (resp. $\min$) and $\Lambda[x_j] = 1 - L[x_j]$ (resp. $L[x_j]$) if agent $i$ is pessimistic (resp. optimistic).

Example 2. Consider two agents 1 and 2 having the same importance ($w_1 = w_2 = 1$) such that 1 is optimistic and 2 is pessimistic, and consider the lottery $L$ defined in Example 1: $L = \langle 1/\langle 1,0 \rangle, 1/\langle 0,1 \rangle \rangle$. We get the following results:

$$U^{\max}_{ante}(L) = \max\big(\min(w_1, U^{+}_1(L)),\ \min(w_2, U^{-}_2(L))\big)$$
$$= \max\big(\min(1, \max(\min(1, 1), \min(1, 0))),\ \min(1, \min(\max(1-1, 0), \max(1-1, 1)))\big) = 1.$$

$$U^{\min}_{ante}(L) = \min\big(\max(1 - w_1, U^{+}_1(L)),\ \max(1 - w_2, U^{-}_2(L))\big)$$
$$= \min\big(\max(1-1, \max(\min(1, 1), \min(1, 0))),\ \max(1-1, \min(\max(1-1, 0), \max(1-1, 1)))\big) = 0.$$
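The mixed-attitude criteria of Definition 1 can be sketched in Python as follows. The boolean encoding of attitudes (`True` for optimistic, `False` for pessimistic) and the function names are assumptions made for this illustration; the data are those of Example 2.

```python
def agent_utility(lottery, u, optimistic):
    """Qualitative utility of one agent, according to his/her own attitude."""
    if optimistic:
        return max(min(u[x], lottery[x]) for x in lottery)
    return min(max(u[x], 1 - lottery[x]) for x in lottery)


def u_max_ante(lottery, utils, weights, attitudes):
    """Eq. (13): disjunctive ex-ante aggregation of heterogeneous agents."""
    return max(min(w, agent_utility(lottery, u, opt))
               for u, w, opt in zip(utils, weights, attitudes))


def u_min_ante(lottery, utils, weights, attitudes):
    """Eq. (14): conjunctive ex-ante aggregation of heterogeneous agents."""
    return min(max(1 - w, agent_utility(lottery, u, opt))
               for u, w, opt in zip(utils, weights, attitudes))


# Data of Example 2: agent 1 is optimistic, agent 2 is pessimistic.
lottery = {"xa": 1, "xb": 1}
utils = [{"xa": 1, "xb": 0}, {"xa": 0, "xb": 1}]
weights = [1, 1]
attitudes = [True, False]

value_max = u_max_ante(lottery, utils, weights, attitudes)
value_min = u_min_ante(lottery, utils, weights, attitudes)
```

Passing `[True, True]` or `[False, False]` as attitudes recovers the homogeneous ex-ante criteria of Eqs. (5) to (8).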

These criteria can be considered as a generalization of the four ex-ante utilities: using the min (resp. max) oriented aggregation, $U^{\min}_{ante}$ (resp. $U^{\max}_{ante}$) is equal to $U^{-\min}_{ante}$ (resp. $U^{-\max}_{ante}$) if all agents are pessimistic, and it is equal to $U^{+\min}_{ante}$ (resp. $U^{+\max}_{ante}$) if they are all optimistic. The egalitarian utility $U^{\min}_{ante}$ is related to its non-egalitarian counterpart by duality. Formally, it holds that:

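Proposition 1. Let $P = \langle \mathcal{L}, \vec{w}, \vec{u} \rangle$ be a qualitative collective decision problem and let $P^{\tau} = \langle \mathcal{L}, \vec{w}, \vec{u}^{\tau} \rangle$ be its dual problem, i.e., the problem such that for each agent we consider his/her dual attitude and, for any $x_j \in X$, $i \in A$, we define $u^{\tau}_i(x_j) = 1 - u_i(x_j)$. Then, for any $L \in \mathcal{L}$:

$$U^{\tau\max}_{ante}(L) = 1 - U^{\min}_{ante}(L) \quad \text{and} \quad U^{\tau\min}_{ante}(L) = 1 - U^{\max}_{ante}(L).$$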
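The duality stated in Proposition 1 can be checked numerically on a small instance. The sketch below is not a proof; the lottery, utilities, weights and attitudes are hypothetical values (chosen as multiples of 0.25 so that the floating-point comparisons are exact), and all function names are assumptions.

```python
def agent_utility(lottery, u, optimistic):
    """Optimistic or pessimistic qualitative utility of a single agent."""
    if optimistic:
        return max(min(u[x], lottery[x]) for x in lottery)
    return min(max(u[x], 1 - lottery[x]) for x in lottery)


def u_max_ante(lottery, utils, weights, attitudes):
    return max(min(w, agent_utility(lottery, u, a))
               for u, w, a in zip(utils, weights, attitudes))


def u_min_ante(lottery, utils, weights, attitudes):
    return min(max(1 - w, agent_utility(lottery, u, a))
               for u, w, a in zip(utils, weights, attitudes))


def dual(utils, attitudes):
    """Dual problem: utilities 1 - u_i and flipped attitudes, same weights."""
    dual_utils = [{x: 1 - v for x, v in u.items()} for u in utils]
    dual_attitudes = [not a for a in attitudes]
    return dual_utils, dual_attitudes


# Hypothetical instance: agent 1 optimistic, agent 2 pessimistic.
lottery = {"x1": 1.0, "x2": 0.5}
utils = [{"x1": 0.25, "x2": 0.75}, {"x1": 0.25, "x2": 0.0}]
weights = [0.75, 1.0]
attitudes = [True, False]

d_utils, d_attitudes = dual(utils, attitudes)
lhs_max = u_max_ante(lottery, d_utils, weights, d_attitudes)
rhs_max = 1 - u_min_ante(lottery, utils, weights, attitudes)
lhs_min = u_min_ante(lottery, d_utils, weights, d_attitudes)
rhs_min = 1 - u_max_ante(lottery, utils, weights, attitudes)
```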

Likewise, if agents are equally important, weights can be dropped and the proposed criteria can be simplified as follows:

Definition 2. Given a possibilistic lottery $L$ on $X$, a set of equally important agents $A$ where each agent is either optimistic or pessimistic, and a vector of utility functions $\vec{u}$, let:

$$U^{\max}_{ante}(L) = \max_{i \in A} \big(\otimes_{x_j \in X} \oplus(u_i(x_j), \Lambda[x_j])\big). \tag{15}$$

$$U^{\min}_{ante}(L) = \min_{i \in A} \big(\otimes_{x_j \in X} \oplus(u_i(x_j), \Lambda[x_j])\big), \tag{16}$$

where $\otimes = \min$ (resp. $\max$), $\oplus = \max$ (resp. $\min$) and $\Lambda[x_j] = 1 - L[x_j]$ (resp. $L[x_j]$) if agent $i$ is pessimistic (resp. optimistic).

4. Collective sequential decision making under qualitative uncertainty

4.1. Definition of collective possibilistic decision trees

Representation formalisms such as decision trees [40], influence diagrams [28] and Markov decision processes [3] offer a clear description of sequential decision problems and allow the definition of optimal strategies. In recent years, there has been a growing interest in more complex problems (namely multi-criteria or multi-objective decision making), and several extensions of classical graphical models [9,26,32,33] have emerged to represent such cases. These research proposals rely on probability theory and the well-known expected utility criterion to solve the problem. However, this criterion fails to represent all decision makers' behaviors.

With the growth of qualitative frameworks, especially possibility theory, many authors have advocated this ordinal view of decision making and have given rise to qualitative decision models, i.e., possibilistic decision trees [24], possibilistic influence diagrams [24] and possibilistic Markov decision processes [42]. Other non-possibilistic qualitative paradigms have also been developed; see, among others, the works presented in [10,31,34]. In this paper, however, we focus only on possibilistic formalisms because they use the pessimistic and optimistic utilities, which are the ordinal counterparts of the expected utility criterion and have received an axiomatic justification by Dubois and Prade [18,20].

In the remainder of this section, we propose to solve sequential collective decision problems under qualitative uncertainty. To the best of our knowledge, this work is the first attempt to solve such problems, and we limit ourselves to decision trees because, even in this simple, explicit formalism, the set of potential strategies is combinatorial (i.e., its size increases exponentially with the size of the tree), and the determination of an optimal strategy for a given problem using the different proposed decision rules is an algorithmic issue in itself.

Decision trees, proposed by Raiffa [40] in 1968, are the most popular graphical models. They encode the structure of sequential problems by representing all possible scenarios. The graphical component of a decision tree is a directed labeled tree $DT = (\mathcal{N}, \mathcal{E})$, where $\mathcal{E}$ is the set of edges and $\mathcal{N} = \mathcal{D} \cup \mathcal{C} \cup \mathcal{LN}$ is the set of nodes, which contains three kinds of nodes: $\mathcal{D}$, the set of decision nodes (represented by squares); $\mathcal{C}$, the set of chance nodes (represented by circles); and $\mathcal{LN}$, the set of leaves. In this formalism, the root of the tree is generally a decision node, denoted by $D_0$. $Succ(N)$ denotes the set of children of node $N$. For any $D_i \in \mathcal{D}$, $Succ(D_i) \subseteq \mathcal{C}$, i.e., a chance node (an action) must be chosen at each decision node. For any $C_i \in \mathcal{C}$, $Succ(C_i) \subseteq \mathcal{LN} \cup \mathcal{D}$: the outcome of an action is either a leaf node or a decision node (and then a new action should be executed).

The numerical component of decision trees consists in assigning utility values to leaf nodes and labeling the edges outgoing from chance nodes. The quantification of a decision tree depends essentially on the nature of the uncertainty pertaining to the problem and on the theory used to represent it. In their classical version, decision trees are probabilistic. However, when the available information is ordinal, Garcia et al. [24] propose to label the leaves by utility degrees in the unit scale $[0,1]$ and to represent the uncertainty pertaining to the possible outcomes of each $C_i$ by a conditional possibility distribution $\pi_i$ on $Succ(C_i)$, such that $\forall N \in Succ(C_i)$, $\pi_i(N) = \Pi(N \mid path(C_i))$, where $path(C_i)$ denotes all the value assignments of chance and decision nodes on the path from the root $D_0$ to $C_i$.

Solving a decision tree amounts to building a complete strategy that selects an action (a chance node) for each decision node: a strategy is a mapping $\delta : \mathcal{D} \mapsto \mathcal{C} \cup \{\bot\}$. $\delta(D_i) = \bot$ means that no action has been selected for $D_i$ ($\delta$ is partial). To select the optimal strategy, the authors of [24] propose to evaluate and compare strategies w.r.t. the pessimistic and optimistic utilities axiomatized in [18]. Leaf nodes being labeled with utility degrees, the rightmost chance nodes (i.e., chance nodes on the far right-hand side) can be seen as simple possibilistic lotteries. Then, each strategy $\delta$ can be viewed as a connected sub-tree of the decision tree and is identified with a possibilistic compound lottery $L_\delta$, i.e., with a possibility distribution over a set of (simple or compound) lotteries. Any compound lottery is denoted by $\langle \lambda_1/L_1, \dots, \lambda_m/L_m \rangle$, and it can be reduced into an equivalent simple lottery as follows [18]:

$$Reduction(\langle \lambda_1/L_1, \dots, \lambda_m/L_m \rangle) = \Big\langle \max_{k=1,m} \min(\lambda^k_1, \lambda_k)/x_1, \dots, \max_{k=1,m} \min(\lambda^k_n, \lambda_k)/x_n \Big\rangle,$$

where $\lambda_k$ is the possibility of getting lottery $L_k$ according to $L$ and $\lambda^k_j = \pi_{L_k}(x_j)$ is the conditional possibility of getting $x_j$ from $L_k$. Hence, the pessimistic and optimistic utility of a strategy $\delta$ can be computed on the basis of the reduction of $L_\delta$: the utility of a strategy $\delta$ is then that of $Reduction(L_\delta)$.

To define collective qualitative decision trees, we keep the same graphical and numerical components as possibilistic decision trees [24], except for leaf nodes, which are evaluated according to several agents instead of a single one.

Each leaf node $LN$ is now labeled by a vector $\vec{u}(LN) = \langle u_1(LN), \dots, u_p(LN) \rangle$ rather than by a single utility degree (see Fig. 2). A strategy still leads to a compound lottery, which can be reduced, leading in turn to a simple (but multi-agent) lottery. We can now compare strategies according to any of the collective decision rules $O$ previously presented ($U^{-\min}_{post}$, $U^{+\min}_{post}$, $U^{-\max}_{post}$, $U^{+\max}_{post}$, $U^{-\min}_{ante}$, $U^{+\min}_{ante}$, $U^{-\max}_{ante}$, $U^{+\max}_{ante}$, $U^{\min}_{ante}$ and $U^{\max}_{ante}$) by comparing their reductions. Formally, given two strategies $\delta_1$ and $\delta_2$, and a collective decision rule $O$:

$$\delta_1 \succeq_O \delta_2 \ \text{ iff } \ U_O(\delta_1) \geq U_O(\delta_2), \quad \text{where } \forall \delta,\ U_O(\delta) = U_O(Reduction(L_\delta)). \tag{17}$$

Example 3. Consider the tree of Fig. 2, involving two equally important agents, and the strategy $\delta(D_0) = C_1$, $\delta(D_1) = C_3$, $\delta(D_2) = C_5$. It holds that:

$$L_\delta = \langle 1/L_{C_3}, 0.9/L_{C_5} \rangle \ \text{ with } \ L_{C_3} = \langle 0.5/x_a, 1/x_b \rangle,\ L_{C_5} = \langle 0.2/x_a, 1/x_b \rangle.$$

The reduction of $L_\delta$ can be computed:


Fig. 2. Collective possibilistic decision tree of Example 3.

$$Reduction(L_\delta) = \langle \max(0.5, 0.2)/x_a, \max(1, 0.9)/x_b \rangle = \langle 0.5/x_a, 1/x_b \rangle.$$

So, if we consider for instance the $U^{+\min}_{ante}$ criterion, we get:

$$U^{+\min}_{ante}(\delta) = \min\big(\max(\min(0.5, 0.3), \min(1, 0.6)),\ \max(\min(0.5, 0.8), \min(1, 0.4))\big) = 0.5.$$
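The computation of Example 3 can be replayed with the following Python sketch, which first reduces the compound lottery and then applies Eq. (7). The list-of-pairs encoding of a compound lottery and the function names are assumptions; the leaf utilities ($\langle 0.3, 0.8 \rangle$ for $x_a$ and $\langle 0.6, 0.4 \rangle$ for $x_b$) are read off the displayed computation rather than from Fig. 2 itself.

```python
def reduce_lottery(compound):
    """Reduce a compound lottery, given as (lambda_k, simple lottery) pairs."""
    outcomes = set()
    for _, sub in compound:
        outcomes.update(sub)
    # Outcomes absent from a sub-lottery are treated as impossible (degree 0).
    return {x: max(min(sub.get(x, 0.0), lam) for lam, sub in compound)
            for x in outcomes}


def u_plus_min_ante(lottery, utils, weights):
    """Eq. (7): optimistic utility per agent, then weighted min."""
    return min(
        max(1 - w, max(min(u[x], lottery[x]) for x in lottery))
        for u, w in zip(utils, weights)
    )


# Compound lottery of the strategy delta in Example 3.
compound = [(1.0, {"xa": 0.5, "xb": 1.0}),   # L_C3
            (0.9, {"xa": 0.2, "xb": 1.0})]   # L_C5
utils = [{"xa": 0.3, "xb": 0.6},   # agent 1
         {"xa": 0.8, "xb": 0.4}]   # agent 2
weights = [1.0, 1.0]

reduced = reduce_lottery(compound)
value = u_plus_min_ante(reduced, utils, weights)
```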

The definition proposed by Eq. (17) is intuitive but raises an algorithmic challenge: the set of strategies to compare is exponential w.r.t. the size of the tree, which makes the explicit evaluation of strategies unrealistic. The sequel of the paper provides an algorithmic study of the problem, applying variants of Dynamic Programming when possible.

4.2. Optimization in collective possibilistic trees

4.2.1. Dynamic Programming as a tool for ex-post utilities

Dynamic Programming [2] is an efficient procedure for strategy optimization. It proceeds by backward induction: the problem is handled from the end (in our case, from the leaves); the last decision nodes are considered first, and so on recursively until the root is reached. More specifically, the algorithm can be described as follows. When a chance node $C_i$ is reached, an optimal sub-strategy is built for each of its children. These sub-strategies are combined w.r.t. their uncertainty degrees. Then, the resulting compound strategy is reduced to an equivalent simple lottery representing the current optimal sub-strategy. When a decision node $D_i$ is reached, we select a decision $D^*$ among all the possible alternatives $N \in Succ(D_i)$ leading to an optimal sub-strategy w.r.t. $\succeq_O$. The choice is performed by comparing the simple lotteries equivalent to each sub-strategy.

This algorithm is sound and complete as soon as $\succeq_O$ is complete, transitive and satisfies the principle of weak monotonicity,² which ensures that each sub-strategy of an optimal strategy is optimal in its sub-tree. Formally, a decision rule $O$ is weakly monotonic iff, whatever $L$, $L'$ and $L''$, and whatever $(\alpha, \beta)$ such that $\max(\alpha, \beta) = 1$:

$$L \succeq_O L' \ \Rightarrow \ \langle \alpha/L, \beta/L'' \rangle \succeq_O \langle \alpha/L', \beta/L'' \rangle.$$

Each of the ex-post criteria satisfies transitivity, completeness and weak monotonicity, because each collapses to either the classical pessimistic ($U^{-}$) or optimistic ($U^{+}$) utility, which satisfy these properties [8,24].

The adaptation of Dynamic Programming to the ex-post decision rules is detailed in Algorithm 1. This algorithm computes the collective utility relative to each possible consequence by aggregating the utility values of each leaf, and then builds an optimal strategy from the last decision nodes to the root of the tree using the principle defined in [24,43] for classical (mono-agent) possibilistic decision trees.

4.2.2. Dynamic Programming for ex-ante utilities

When the decision rule follows the ex-ante approach, the application of Dynamic Programming is a little trickier. The ex-ante Dynamic Programming we propose (see Algorithm 2) keeps at each node a vector of $p$ pessimistic (resp. optimistic) utilities, one for each agent. The computation of the ex-ante utility can then be performed each time a decision is to be made. Recall that $U^{-\min}_{ante} = U^{-\min}_{post}$ and $U^{+\max}_{ante} = U^{+\max}_{post}$. Hence, for these two criteria, the optimization could also be performed using the ex-post algorithm.

² It is common knowledge in sequential decision making that monotonicity is a necessary and sufficient condition for the optimality of Dynamic […]

Algorithm 1: DynProgPost: Ex-post Dynamic Programming.

    Data: T: a decision tree, N: a node of T
    Result: u*: the value of the optimal strategy δ* (δ* is stored as a global variable)
    begin
        u* = 0; // Initialization
        if N ∈ LN then // Leaf: CDM aggregation
            for i ∈ {1, ..., p} do u_N ← (u_N ⊕ (u_i ⊗ ω_i));
            // ⊗ = min, ω_i = w_i, ⊕ = max for disjunctive aggregation
            // ⊗ = max, ω_i = 1 − w_i, ⊕ = min for conjunctive aggregation
        if N ∈ C then // Chance node: computes the qualitative utility
            foreach Y ∈ Succ(N) do u_N ← (u_N ⊕ (λ_Y ⊗ DynProgPost(T, Y)));
            // ⊗ = min, λ_Y = π(Y), ⊕ = max for optimistic utility
            // ⊗ = max, λ_Y = 1 − π(Y), ⊕ = min for pessimistic utility
        if N ∈ D then // Decision node: determines the best decision
            foreach Y ∈ Succ(N) do
                u_Y ← DynProgPost(T, Y);
                if u_Y ≥ u* then δ(N) ← Y and u* ← u_Y;
        return u*;
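The backward induction of Algorithm 1 can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the tuple encoding of nodes, the function name `dyn_prog_post`, the explicit `strategy` dictionary (used instead of a global variable) and the sample tree are all assumptions. Ties between actions keep the first action found.

```python
def dyn_prog_post(node, weights, optimistic, disjunctive, strategy):
    """Return the ex-post value of the best sub-strategy rooted at `node`."""
    kind = node[0]
    if kind == "leaf":  # aggregate the agents' utilities at the leaf
        pairs = list(zip(weights, node[1]))
        if disjunctive:
            return max(min(w, u) for w, u in pairs)
        return min(max(1 - w, u) for w, u in pairs)
    if kind == "chance":  # qualitative utility over the children
        values = [(pi, dyn_prog_post(child, weights, optimistic,
                                     disjunctive, strategy))
                  for pi, child in node[1]]
        if optimistic:
            return max(min(pi, v) for pi, v in values)
        return min(max(1 - pi, v) for pi, v in values)
    # Decision node: keep the action with the highest value.
    _, name, actions = node
    best_action, best_value = None, None
    for action, child in actions.items():
        value = dyn_prog_post(child, weights, optimistic, disjunctive, strategy)
        if best_value is None or value > best_value:
            best_action, best_value = action, value
    strategy[name] = best_action
    return best_value


# Hypothetical tree with two agents and a nested decision node D1.
d1 = ("decision", "D1", {
    "C3": ("chance", [(1.0, ("leaf", [1.0, 0.5]))]),
    "C4": ("chance", [(1.0, ("leaf", [0.0, 1.0]))]),
})
c1 = ("chance", [(1.0, ("leaf", [0.25, 0.75])), (0.5, d1)])
c2 = ("chance", [(1.0, ("leaf", [0.25, 1.0]))])
root = ("decision", "D0", {"C1": c1, "C2": c2})

strategy = {}
best = dyn_prog_post(root, [1.0, 1.0], optimistic=True,
                     disjunctive=False, strategy=strategy)
```

Under the optimistic, conjunctive rule ($U^{+\min}_{post}$), the sketch selects `C3` at `D1` and `C1` at the root.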

Algorithm 2: DynProgAnte: Ex-ante Dynamic Programming.

Data: T: a decision tree, N: a node of T

begin
    if N ∈ LN then // Leaf
        for i ∈ {1, ..., p} do u⃗N[i] ← ui;
    if N ∈ C then // Chance node: computes the utility vectors
        for i ∈ {1, ..., p} do u⃗N[i] ← ǫ;
        foreach Y ∈ Succ(N) do
            u⃗Y ← DynProgAnte(T, Y);
            for i ∈ {1, ..., p} do u⃗N[i] ← (u⃗N[i] ⊕ (λY ⊗ u⃗Y[i]));
        // Optimistic utility: ⊗ = min, λY = π(Y), ⊕ = max, ǫ ← 0
        // Pessimistic utility: ⊗ = max, λY = 1 − π(Y), ⊕ = min, ǫ ← 1
    if N ∈ D then // Decision node
        foreach Y ∈ Succ(N) do
            vY ← ǫ; u⃗Y ← DynProgAnte(T, Y);
            for i ∈ {1, ..., p} do vY ← vY ⊕ (u⃗Y[i] ⊗ ωi);
            if vY > u* then δ(N) ← Y, u⃗N ← u⃗Y and u* ← vY;
        // Disjunctive CDM: let ⊗ = min, ωi = wi, ⊕ = max, ǫ ← 0
        // Conjunctive CDM: let ⊗ = max, ωi = 1 − wi, ⊕ = min, ǫ ← 1
    return u*;
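To make the vector-based recursion concrete, here is a small sketch of the ex-ante scheme. The tree encoding, the helper `aggregate` and the returned utility vector (rather than the scalar u* of the listing) are assumptions of this illustration. The toy tree embeds a choice between x1 and x2 inside a chance node whose other branch leads to x3, with the utilities of Counter-Example 1; on this instance the greedy choice made at the decision node is not the best one for the $U_{ante}^{-\max}$ rule.

```python
# Tree encoding (assumed for this sketch):
#   ("leaf", [u_1, ..., u_p]), ("chance", [(pi, child), ...]),
#   ("decision", name, [child, ...])

def aggregate(vec, weights, disjunctive):
    """Collective (CDM) aggregation of a vector of per-agent utilities."""
    if disjunctive:
        return max(min(u, w) for u, w in zip(vec, weights))
    return min(max(u, 1 - w) for u, w in zip(vec, weights))

def dyn_prog_ante(node, weights, optimistic, disjunctive, strategy):
    """Return the vector of per-agent utilities kept at `node`."""
    kind = node[0]
    if kind == "leaf":
        return list(node[1])
    if kind == "chance":
        vecs = [(pi, dyn_prog_ante(child, weights, optimistic, disjunctive, strategy))
                for pi, child in node[1]]
        agents = range(len(weights))
        if optimistic:
            return [max(min(pi, v[i]) for pi, v in vecs) for i in agents]
        return [min(max(1 - pi, v[i]) for pi, v in vecs) for i in agents]
    # Decision node: aggregate each child's vector, keep the best child's vector.
    _, name, children = node
    vecs = [dyn_prog_ante(c, weights, optimistic, disjunctive, strategy)
            for c in children]
    best = max(range(len(children)),
               key=lambda k: aggregate(vecs[k], weights, disjunctive))
    strategy[name] = best
    return vecs[best]


x1, x2, x3 = ("leaf", [1, 0.6]), ("leaf", [0.8, 0.8]), ("leaf", [0.5, 0.8])
tree = ("chance", [(1, ("decision", "D1", [x1, x2])), (1, x3)])
weights = [1, 1]
strategy = {}
vec = dyn_prog_ante(tree, weights, optimistic=False, disjunctive=True,
                    strategy=strategy)
dp_value = aggregate(vec, weights, True)   # value reached by the DP strategy
# Value of the alternative strategy (x2 at D1), computed by hand for comparison:
other = aggregate([min(0.8, 0.5), min(0.8, 0.8)], weights, True)
print(dp_value, other, strategy)
```

The DP strategy selects x1 at D1 because it looks better locally, while the alternative reaches a strictly higher collective value once the x3 branch is taken into account.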

As shown by Counter-Example 1, the $U_{ante}^{-\max}$ and $U_{ante}^{+\min}$ decision rules do not satisfy the monotonicity principle [4]. Hence, Algorithm 2 may provide a good strategy but without any guarantee of optimality. It can nevertheless be considered as an approximation algorithm when used for optimizing any of these problematic criteria.

Counter-Example 1. Consider the set of consequences $X = \{x_1, x_2, x_3\}$ and consider two equally important agents 1 and 2 ($w_1 = w_2 = 1$) with: $u_1(x_1) = 1$, $u_1(x_2) = 0.8$, $u_1(x_3) = 0.5$; $u_2(x_1) = 0.6$, $u_2(x_2) = 0.8$, $u_2(x_3) = 0.8$.

Consider the lotteries $L_1 = \langle 1/x_1, 0/x_2, 0/x_3 \rangle$, $L_2 = \langle 0/x_1, 1/x_2, 0/x_3 \rangle$ and $L_3 = \langle 0/x_1, 0/x_2, 1/x_3 \rangle$: $L_1$ gives consequence $x_1$ for sure, $L_2$ gives consequence $x_2$ for sure and $L_3$ gives consequence $x_3$ for sure. It holds that:

$$U_{ante}^{-\max}(L_1) = \max_{i=1,2} U_i^-(L_1) = \max(1, 0.6) = 1.$$

$$U_{ante}^{-\max}(L_2) = \max_{i=1,2} U_i^-(L_2) = \max(0.8, 0.8) = 0.8.$$

Hence $L_1 \succ L_2$ with respect to the $U_{ante}^{-\max}$ rule.

Consider now the compound lotteries $L = \langle 1/L_1, 1/L_3 \rangle$ and $L' = \langle 1/L_2, 1/L_3 \rangle$. If the weak monotonicity principle were satisfied, we would get: $U_{ante}^{-\max}(L) > U_{ante}^{-\max}(L')$.


Fig. 3. Lotteries of Counter-Example 2.

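$Reduction(\langle 1/L_1, 1/L_3 \rangle) = \langle 1/x_1, 0/x_2, 1/x_3 \rangle$ and $Reduction(\langle 1/L_2, 1/L_3 \rangle) = \langle 0/x_1, 1/x_2, 1/x_3 \rangle$. It holds that:

$$U_{ante}^{-\max}(L) = U_{ante}^{-\max}(Reduction(\langle 1/L_1, 1/L_3 \rangle)) = 0.6.$$

$$U_{ante}^{-\max}(L') = U_{ante}^{-\max}(Reduction(\langle 1/L_2, 1/L_3 \rangle)) = 0.8.$$

$U_{ante}^{-\max}(L) < U_{ante}^{-\max}(L')$ while $U_{ante}^{-\max}(L_1) > U_{ante}^{-\max}(L_2)$. So, $U_{ante}^{-\max}$ is not monotonic.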
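The four values of Counter-Example 1 can be re-derived with a few lines of code. The sketch below assumes the usual possibilistic definitions, namely $U_i^-(L) = \min_x \max(1 - \pi(x), u_i(x))$ and a reduction of compound lotteries given by $\pi(x) = \max_k \min(\lambda_k, \pi_k(x))$; the function names are illustrative.

```python
def pessimistic(lottery, u):
    """Pessimistic utility of a simple lottery {consequence: possibility}."""
    return min(max(1 - pi, u[x]) for x, pi in lottery.items())

def u_ante_minus_max(lottery, utils, weights):
    """Ex-ante -max rule: max over agents of min(w_i, U_i^-(L))."""
    return max(min(w, pessimistic(lottery, u)) for u, w in zip(utils, weights))

def reduction(branches):
    """Reduce a compound lottery given as [(lambda_k, simple lottery), ...]."""
    reduced = {}
    for lam, lottery in branches:
        for x, pi in lottery.items():
            reduced[x] = max(reduced.get(x, 0), min(lam, pi))
    return reduced


utils = [{"x1": 1, "x2": 0.8, "x3": 0.5},    # agent 1
         {"x1": 0.6, "x2": 0.8, "x3": 0.8}]  # agent 2
weights = [1, 1]
L1 = {"x1": 1, "x2": 0, "x3": 0}
L2 = {"x1": 0, "x2": 1, "x3": 0}
L3 = {"x1": 0, "x2": 0, "x3": 1}
L = reduction([(1, L1), (1, L3)])
L_prime = reduction([(1, L2), (1, L3)])

values = {name: u_ante_minus_max(lot, utils, weights)
          for name, lot in [("L1", L1), ("L2", L2), ("L", L), ("L'", L_prime)]}
print(values)
```

The preference between L1 and L2 is reversed once both are combined with L3, which is the failure of monotonicity stated above.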

Using the fact that $U_{ante}^{+\min} = 1 - U_{ante}^{\tau\,-\max}$ [4], this counter-example is modified to show that $U_{ante}^{+\min}$ does not satisfy the monotonicity principle either.

Consider two equally important agents, 1 and 2, with $w_1 = w_2 = 1$ and utilities $u_1^{\tau}(x_1) = 0$, $u_1^{\tau}(x_2) = 0.2$, $u_1^{\tau}(x_3) = 0.5$; $u_2^{\tau}(x_1) = 0.4$, $u_2^{\tau}(x_2) = 0.2$, $u_2^{\tau}(x_3) = 0.2$.

Consider now the same lotteries $L_1$, $L_2$ and $L_3$ presented above. It holds that:

$$U_{ante}^{+\min}(L_1) = \min_{i=1,2} U_i^+(L_1) = 0 < U_{ante}^{+\min}(L_2) = \min_{i=1,2} U_i^+(L_2) = 0.2,$$

while

$$U_{ante}^{+\min}(Reduction(\langle 1/L_1, 1/L_3 \rangle)) = 0.4 > U_{ante}^{+\min}(Reduction(\langle 1/L_2, 1/L_3 \rangle)) = 0.2,$$

which contradicts the weak monotonicity.

Likewise, the ex-post Dynamic Programming algorithm (Algorithm 1) can also be considered as an approximation algorithm for $U_{ante}^{-\max}$ and $U_{ante}^{+\min}$ since they are correlated to their ex-post counterparts, as shown in [6]. Indeed, it holds that:

Proposition 2.

$$U_{ante}^{+\min}(L) \geq U_{post}^{+\min}(L).$$

$$U_{ante}^{-\max}(L) \leq U_{post}^{-\max}(L).$$

So, even if it is not always the case, it often happens that $U_{post}^{-\max} = U_{ante}^{-\max}$ (resp. $U_{post}^{+\min} = U_{ante}^{+\min}$); in these cases the solution provided by the ex-post algorithm is optimal.
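As a sanity check of the two inequalities of Proposition 2, the following sketch enumerates all instances with two consequences, two agents and degrees taken in {0, 0.5, 1}. The ex-post utilities are written here as suggested by Algorithm 1 (aggregation of the agents' utilities at each consequence, then qualitative utility of the lottery); this reading, the grid and the function names are assumptions of the sketch, and the enumeration is of course not a proof.

```python
from itertools import product

GRID = (0, 0.5, 1)  # these degrees are exact in binary floating point

def opt(lottery, values):
    return max(min(pi, v) for pi, v in zip(lottery, values))

def pes(lottery, values):
    return min(max(1 - pi, v) for pi, v in zip(lottery, values))

def check_proposition_2():
    """Count the instances and the violations of the two inequalities."""
    cases = violations = 0
    pairs = list(product(GRID, repeat=2))
    for lottery in pairs:
        if max(lottery) != 1:          # keep normalised lotteries only
            continue
        for u1, u2, w in product(pairs, pairs, pairs):
            utils = (u1, u2)
            ante_plus_min = min(max(1 - wi, opt(lottery, ui))
                                for ui, wi in zip(utils, w))
            post_plus_min = opt(lottery, [
                min(max(1 - wi, ui[x]) for ui, wi in zip(utils, w))
                for x in range(2)])
            ante_minus_max = max(min(wi, pes(lottery, ui))
                                 for ui, wi in zip(utils, w))
            post_minus_max = pes(lottery, [
                max(min(wi, ui[x]) for ui, wi in zip(utils, w))
                for x in range(2)])
            cases += 1
            if ante_plus_min < post_plus_min or ante_minus_max > post_minus_max:
                violations += 1
    return cases, violations


print(check_proposition_2())
```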

Algorithm 2 also applies to the optimization of the generalized ex-ante decision rules, namely the $U_{ante}^{\max}$ and $U_{ante}^{\min}$ utilities for heterogeneous agents. However, obtaining the optimal strategy using Dynamic Programming is guaranteed only for monotonic criteria. Counter-Example 2 shows that $U_{ante}^{\max}$ as well as $U_{ante}^{\min}$ do not satisfy this property.

Counter-Example 2. Consider two agents 1 and 2 having the same importance ($w_1 = w_2 = 1$), 1 being optimistic and 2 being pessimistic, and consider the three lotteries on $X = \{x_1, x_2\}$ depicted in Fig. 3. Let $L$ and $L'$ be two compound lotteries defined by: $L = \langle 0.6/L_1, 1/L_3 \rangle$ and $L' = \langle 0.6/L_2, 1/L_3 \rangle$.

We can verify that $L_1$ is globally preferred to $L_2$, since:

$$U_{ante}^{\max}(L_1) = 0.9 > U_{ante}^{\max}(L_2) = 0.8 \quad \text{whereas} \quad U_{ante}^{\max}(L) = 0.6 < U_{ante}^{\max}(L') = 0.8,$$

which proves that $U_{ante}^{\max}$ is not monotonic.

Since $U_{ante}^{\min} = 1 - U_{ante}^{\tau\,\max}$ (Proposition 1), this counter-example can be modified to show that $U_{ante}^{\min}$ does not satisfy the monotonicity principle either. We consider the agents 1 and 2 with the same importance ($w_1 = w_2 = 1$) where 1 is pessimistic and 2 is optimistic, and we replace the utility functions relative to $x_1$ and $x_2$ for lotteries $L_1$, $L_2$, $L_3$, $L$ and $L'$ as follows: $u_1^{\tau}(x_1) = 0.1$, $u_2^{\tau}(x_1) = 0.4$, $u_1^{\tau}(x_2) = 0.2$, $u_2^{\tau}(x_2) = 0.2$, $u_1^{\tau}(x_3) = 0.5$ and $u_2^{\tau}(x_3) = 0.2$. We can check that $L_2$ is better than $L_1$ since $U_{ante}^{\min}(L_1) = 0.1 < U_{ante}^{\min}(L_2) = 0.2$ while $U_{ante}^{\min}(L) = 0.4 > U_{ante}^{\min}(L') = 0.2$, which proves that $U_{ante}^{\min}$ does not satisfy the monotonicity property.

4.2.3. Right optimization of $U_{ante}^{-\max}$ by Multi-Dynamic Programming

The lack of monotonicity of $U_{ante}^{-\max}$ is not dramatic, even when optimality must be guaranteed. Indeed, with $U_{ante}^{-\max}$ we look for a strategy that has a good pessimistic utility $U_i^-$ for at least one agent $i$. This means that if it is possible to get


Algorithm 3: MultiDynProg: right optimization of $U_{ante}^{-\max}$.

Data: T: a decision tree
Result: u*: the value of the optimal strategy δ* (δ* is stored as a global value)

1 begin
2     u* = 0; // Initialization
4     for i ∈ {1, ..., p} do
5         δi = ∅; // Initialization
6         δi ← PesDynProg(T, i); // Call to classical poss. Dyn. Prog. [24] - returns an optimal strategy and its value U−i(δi)
7         u ← min(U−i(δi), ωi);
8         if u > u* then δ* ← δi; u* ← u;
9     return u*;

Fig. 4. Lotteries of Counter-Example 3.

for each $i$ a strategy that optimizes $U_i^-$ (and this can be done by the classical Dynamic Programming, since the pessimistic utility is monotonic), the one with the highest value for $U_{ante}^{-\max}$ is globally optimal. Formally, $U_{ante}^{-\max}$ can be expressed as follows:

$$U_{ante}^{-\max}(L) = \max_{i=1,\dots,p} \min(w_i, U_i^-(L)) \qquad (18)$$

where $U_i^-(L)$ is the pessimistic utility of $L$ according to agent $i$.

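Corollary 1. Let $\mathcal{L}$ be the set of possibilistic lotteries that can be built on $X$, $L$ be any possibilistic lottery and let:
- $\mathcal{L}^* \subset \mathcal{L}$ s.t. $\mathcal{L}^* = \{L_1^*, \dots, L_p^*\}$ and $\forall L \in \mathcal{L}$, $U_i^-(L_i^*) \geq U_i^-(L)$;
- $L^* \in \mathcal{L}^*$, s.t. $\forall L_i^* \in \mathcal{L}^*$: $\max_{i=1,\dots,p} \min(w_i, U_i^-(L^*)) \geq \max_{i=1,\dots,p} \min(w_i, U_i^-(L_i^*))$.

It holds that $U_{ante}^{-\max}(L^*) \geq U_{ante}^{-\max}(L)$, $\forall L \in \mathcal{L}$.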

It follows that the optimization problem can be solved by a series of p calls to a classical (mono-agent) pessimistic optimization. This is the principle of the Multi-Dynamic Programming approach detailed by Algorithm 3. To reduce the execution time, this algorithm could be improved by considering only agents having an importance weight $w_i$ greater than the $U_{ante}^{-\max}$ value of the current strategy.
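The principle of Multi-Dynamic Programming can be sketched as follows: one mono-agent pessimistic optimization per agent, then selection of the best weighted value. The tree encoding and the function names are assumptions of this illustration, and the toy tree is the one obtained by placing the choice between x1 and x2 of Counter-Example 1 next to a branch leading to x3.

```python
# Tree encoding (assumed for this sketch):
#   ("leaf", [u_1, ..., u_p]), ("chance", [(pi, child), ...]),
#   ("decision", name, [child, ...])

def pes_dyn_prog(node, agent, strategy):
    """Classical backward induction for the pessimistic utility of one agent."""
    kind = node[0]
    if kind == "leaf":
        return node[1][agent]
    if kind == "chance":
        return min(max(1 - pi, pes_dyn_prog(child, agent, strategy))
                   for pi, child in node[1])
    _, name, children = node
    values = [pes_dyn_prog(c, agent, strategy) for c in children]
    best = max(range(len(values)), key=values.__getitem__)
    strategy[name] = best
    return values[best]

def multi_dyn_prog(tree, weights):
    """p mono-agent optimisations; keep the strategy with the best min(w_i, U_i^-)."""
    best_value, best_strategy = 0, {}
    for agent, w in enumerate(weights):
        strategy = {}
        value = min(w, pes_dyn_prog(tree, agent, strategy))
        if value > best_value:
            best_value, best_strategy = value, strategy
    return best_value, best_strategy


x1, x2, x3 = ("leaf", [1, 0.6]), ("leaf", [0.8, 0.8]), ("leaf", [0.5, 0.8])
tree = ("chance", [(1, ("decision", "D1", [x1, x2])), (1, x3)])
result = multi_dyn_prog(tree, [1, 1])
print(result)
```

On this instance the strategy that is optimal for agent 2 (choosing x2) is returned, with a collective value of 0.8, whereas a greedy ex-ante choice of x1 at D1 would only reach 0.6.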

4.2.4. Right optimization of $U_{ante}^{+\min}$: a Branch and Bound algorithm

As previously said, the $U_{ante}^{+\min}$ utility does not satisfy monotonicity. So, ex-ante Dynamic Programming (Algorithm 2) can provide a good strategy, but without any guarantee of optimality. Besides, this criterion performs an egalitarian aggregation (it uses the min operator). Unfortunately, as shown in Counter-Example 3, it is then not possible to provide a result similar to Corollary 1 to circumvent the lack of monotonicity (as for $U_{ante}^{-\max}$).

Counter-Example 3. Consider two optimistic agents 1 and 2 having the same importance degree ($w_1 = w_2 = 1$), and consider the two lotteries $L$ and $L'$ on $X = \{x_1, x_2\}$ depicted in Fig. 4. $L$ gives consequence $x_1$ for sure and $L'$ gives consequence $x_2$ for sure. It holds that:

$$U_{ante}^{+\min}(L) = \min\big(\max((1-1), \max(\min(1, 0.7), \min(0, 0.4))),\ \max((1-1), \max(\min(1, 0.3), \min(0, 0.9)))\big) = 0.3.$$

$$U_{ante}^{+\min}(L') = \min\big(\max((1-1), \max(\min(0, 0.7), \min(1, 0.4))),\ \max((1-1), \max(\min(0, 0.3), \min(1, 0.9)))\big) = 0.4.$$

Hence, $L' \succ L$ with respect to the $U_{ante}^{+\min}$ rule. Then, $L'$ is the optimal lottery $L^*$. Now, we look for the strategy that optimizes $U_i^+$ for each agent $i$. We get:

$$U_1^+(L) = 0.7 > U_1^+(L') = 0.4.$$

So, $L_1^* = L$.


Algorithm 4: B&B algorithm for the optimization of $U_{ante}^{+\min}$.

Data: T: a decision tree, δ: a (partial) strategy, u: an upper bound of $U_{ante}^{+\min}$(δ)
Result: u*: the $U_{ante}^{+\min}$ value of the optimal strategy δ* found so far

begin
    if δ(D0) = ⊥ then Dpend ← {D0};
    else Dpend ← {Di ∈ D s.t. ∃Dj, δ(Dj) ≠ ⊥ and Di ∈ Succ(δ(Dj))};
    if Dpend = ∅ then // δ is a complete strategy
        δ* ← δ; u* ← u;
    else
        Dnext ← arg min_{Di ∈ Dpend} i;
        foreach Ci ∈ Succ(Dnext) do
            δ(Dnext) ← Ci;
            u ← UpperBound(T, δ);
            if u > u* then u* ← B&B(u, δ);
    return u*;

We can check that $\max_{i \in \{1,2\}} \min(w_i, U_i(L_i^*)) = 0.7 > \max_{i \in \{1,2\}} \min(w_i, U_i(L^*)) = 0.4$, which proves that a result similar to Corollary 1 does not hold for $U_{ante}^{+\min}$.
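The values used in Counter-Example 3 can be recomputed directly. In the sketch below the utilities of the two agents are read off from the displayed equations (Fig. 4 itself is not reproduced here), and $U_{ante}^{+\min}(L)$ is taken to be $\min_i \max(1 - w_i, U_i^+(L))$, as in those equations; names are illustrative.

```python
def optimistic(lottery, u):
    """Optimistic utility of a simple lottery {consequence: possibility}."""
    return max(min(pi, u[x]) for x, pi in lottery.items())

def u_ante_plus_min(lottery, utils, weights):
    return min(max(1 - w, optimistic(lottery, u)) for u, w in zip(utils, weights))


utils = [{"x1": 0.7, "x2": 0.4},   # agent 1 (values taken from the equations)
         {"x1": 0.3, "x2": 0.9}]   # agent 2
weights = [1, 1]
lotteries = {"L": {"x1": 1, "x2": 0}, "L'": {"x1": 0, "x2": 1}}

collective = {name: u_ante_plus_min(lot, utils, weights)
              for name, lot in lotteries.items()}
per_agent_best = [max(lotteries, key=lambda n: optimistic(lotteries[n], u))
                  for u in utils]
print(collective, per_agent_best)
```

Agent 1 alone prefers L while the collective egalitarian rule ranks L' first, which is why a per-agent optimum says nothing by itself about the $U_{ante}^{+\min}$ optimum.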

To guarantee optimality, we have to propose an exact algorithm and proceed by an implicit enumeration via a Branch and Bound (B&B) algorithm, as done for Rank Dependent Utility [29] and for Possibilistic Choquet integrals [8] (both in the mono-agent case).

The Branch and Bound procedure described by Algorithm 4 takes as argument a partial strategy δ and an upper bound of the $U_{ante}^{+\min}$ value of its best extension. It returns u*, the $U_{ante}^{+\min}$ value of the best strategy δ* found so far. To reduce the search time, we can initialize δ* with any strategy, e.g. the one provided by Dynamic Programming (using Algorithm 2 or even Algorithm 1 proposed for the ex-post approach). At each step of the Branch and Bound algorithm, the current partial strategy δ is developed by the choice of an action for some unassigned decision node. When several decision nodes are candidates, the one with the minimal rank (i.e., the earliest one according to the temporal order) is developed. The recursive procedure backtracks when either the current strategy is complete (then δ* and u* are updated) or proves to be worse than the current δ*.

Function UpperBound(T, δ), outlined by Algorithm 5, provides an upper bound of the best completion of δ. In practice, it builds for each agent $i$ a strategy δi that maximizes $U_i^+$ (using the algorithm of [41,43], which is linear). It then selects, among these strategies, the one with the highest $U_{ante}^{+\min}$. Notice that UpperBound(T, δ) = $U_{ante}^{+\min}$(δ) when δ is complete. Whenever the value returned by UpperBound(T, δ) is lower than or equal to u*, the value of the best current strategy, the algorithm backtracks, yielding the choice of another action for the last considered decision node.
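A compact sketch of such an implicit enumeration is given below. It is not a transcription of Algorithms 4 and 5: the bound used here is $\min_i \max(1 - w_i, \cdot)$ applied to each agent's best optimistic utility over the completions of δ (a valid upper bound since each agent's optimum is computed separately), pending nodes are explored in depth-first order rather than by rank, and the tree encoding and names are assumptions of the illustration.

```python
# Tree encoding (assumed): ("leaf", [u_1..u_p]), ("chance", [(pi, child), ...]),
# ("decision", name, [child, ...]); a strategy maps decision names to child indices.

def best_optimistic(node, agent, delta):
    """Best optimistic utility of `agent` over all completions of `delta`."""
    kind = node[0]
    if kind == "leaf":
        return node[1][agent]
    if kind == "chance":
        return max(min(pi, best_optimistic(c, agent, delta)) for pi, c in node[1])
    _, name, children = node
    if name in delta:                       # action already fixed by delta
        return best_optimistic(children[delta[name]], agent, delta)
    return max(best_optimistic(c, agent, delta) for c in children)

def upper_bound(tree, weights, delta):
    """Exact value when delta is complete, an upper bound otherwise."""
    return min(max(1 - w, best_optimistic(tree, i, delta))
               for i, w in enumerate(weights))

def next_pending(node, delta):
    """First unassigned decision node reachable under delta, or None."""
    kind = node[0]
    if kind == "leaf":
        return None
    if kind == "chance":
        for _, child in node[1]:
            found = next_pending(child, delta)
            if found is not None:
                return found
        return None
    _, name, children = node
    if name not in delta:
        return node
    return next_pending(children[delta[name]], delta)

def branch_and_bound(tree, weights):
    best = {"value": -1, "delta": None}

    def explore(delta, bound):
        pending = next_pending(tree, delta)
        if pending is None:                 # complete strategy: bound is its value
            best["value"], best["delta"] = bound, dict(delta)
            return
        _, name, children = pending
        for k in range(len(children)):
            delta[name] = k
            u = upper_bound(tree, weights, delta)
            if u > best["value"]:           # otherwise prune this branch
                explore(delta, u)
            del delta[name]

    explore({}, upper_bound(tree, weights, {}))
    return best["value"], best["delta"]


# Choice between the lotteries L and L' of Counter-Example 3.
tree = ("decision", "D0", [
    ("chance", [(1, ("leaf", [0.7, 0.3]))]),
    ("chance", [(1, ("leaf", [0.4, 0.9]))]),
])
print(branch_and_bound(tree, [1, 1]))
```

Starting the search from a strategy returned by Dynamic Programming, as suggested above, would simply amount to initializing `best` with that strategy and its value.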

4.2.5. Agents with different attitudes: right optimization of $U_{ante}^{\max}$ and $U_{ante}^{\min}$

Let us finally study the heterogeneous utilities $U_{ante}^{\max}$ and $U_{ante}^{\min}$ proposed in Section 3.2 to solve collective decision problems where the set of decision makers gathers pessimistic and optimistic agents. These decision rules are a generalization of ex-ante utilities, so it is not surprising that they do not satisfy the weak monotonicity (see Counter-Example 2).

4.2.6. Optimization for heterogeneous agents - the max-based rule

When optimizing $U_{ante}^{\max}$, we are looking for a strategy that maximizes the qualitative utility ($U_i^-$ or $U_i^+$) for at least one agent $i$: we get for each agent an optimal strategy with regard to $U_i^{\otimes}$, which is defined according to his/her attitude w.r.t. uncertainty, i.e., $U_i^{\otimes} = U_i^-$ (resp. $U_i^{\otimes} = U_i^+$) if the agent is pessimistic (resp. optimistic). Then, we select the strategy that maximizes the global utility $U_{ante}^{\max}$. Basically, the optimization of this criterion relies on the same idea as the one proposed for the optimization of $U_{ante}^{-\max}$. Formally, we can write:

$$U_{ante}^{\max}(L) = \max_{i=1,\dots,p} \min(w_i, U_i^{\otimes}(L)) \qquad (19)$$

where $U_i^{\otimes}(L)$ denotes either the pessimistic or the optimistic utility of $L$ for agent $i$, his/her attitude w.r.t. uncertainty being captured by ⊗.

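Corollary 2. Let $\mathcal{L}$ be the set of possibilistic lotteries that can be built on $X$, $L$ be any possibilistic lottery and let:
- $\mathcal{L}^* \subset \mathcal{L}$ s.t. $\mathcal{L}^* = \{L_1^*, \dots, L_p^*\}$ and $\forall L \in \mathcal{L}$, $U_i^{\otimes}(L_i^*) \geq U_i^{\otimes}(L)$;
- $L^* \in \mathcal{L}^*$, s.t. $\forall L_i^* \in \mathcal{L}^*$: $\max_{i=1,\dots,p} \min(w_i, U_i^{\otimes}(L^*)) \geq \max_{i=1,\dots,p} \min(w_i, U_i^{\otimes}(L_i^*))$.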


Algorithm 5: UpperBound of B&B algorithm for the optimization of $U_{ante}^{+\min}$.

Data: T: a decision tree, δ: a (partial or complete) strategy, N: a node of T
Result: u(δ): the $U_{ante}^{+\min}$ value of the current strategy δ

1 begin
2     uN = 0; // Initialization
3     u(δ) = 0; // Initialization
4     for i ∈ {1, ..., p} do
5         if N ∈ LN then // Leaf
6             uN ← ui;
7         if N ∈ C then // Chance node: computes the optimistic utility
8             foreach Y ∈ Succ(N) do
9                 uY ← DynProgAnte(T, Y); // Call to Ex-ante Dyn. Prog. (Algorithm 2)
10                uN ← max(uN, min(πY, uY));
11        if N ∈ D then // Decision node
12            if δ(N) ≠ ⊥ then // Prefixed action
13                δi(N) ← δ(N) and uN ← uY
14            else
15                foreach Y ∈ Succ(N) do
16                    uY ← DynProgAnte(T, Y);
17                    if uY > uN then
18                        δi(N) ← Y and uN ← uY;
19        for j ∈ {1, ..., p} do
20            u = 0; // Initialization
21            U+j(δi) ← OptUtil(δi, j); // Computes for each agent j the value of its optimistic utility U+j(δi)
22            u ← min(u, max(U+j(δi), 1 − ωi));
23        if u > u(δ) then u(δ) ← u;
24    return u(δ);

Algorithm 6: ImproHetMultiDynProg: right optimization of $U_{ante}^{\max}$.

Data: T: a decision tree; D0: root of T

begin
    δ = DynProgPost(T, D0, Opt); // Opt is the subset of optimistic agents - returns an optimal strategy δ* for $U_{ante}^{+\max}$ and its optimal value u*
    foreach i ∉ Opt do
        if wi > u* then
            δi = PesDynProg(T, i); // Call to classical poss. Dyn. Prog. [24] - returns an optimal strategy for the pessimistic utility U−i
            ui = min(wi, U−i(δi));
            if ui > u* then δ* ← δi; u* ← ui;
    return u*;

This result allows the use of Multi-Dynamic Programming for the optimization of $U_{ante}^{\max}$. A first idea consists in a direct adaptation of MultiDynProg (Algorithm 3), replacing lines 6 and 7 respectively by:

line 6': δi ← DynProg(T, i); // Call to poss. Dyn. Prog. w.r.t. the agent attitude (Pes or Opt) [24].

line 7': u ← min(U⊗i(δi), wi); // U⊗i = U−i if i is pessimistic and U⊗i = U+i if i is optimistic.

This algorithm (called HetMultiDynProg) can be improved by considering first the optimistic decision makers and then the pessimistic ones, instead of making p calls to Dynamic Programming, one for each of them. This gives rise to the second version of Multi-Dynamic Programming (called ImproHetMultiDynProg) outlined by Algorithm 6. In short, this procedure can be described as follows: for optimistic agents, the optimization of $U_{ante}^{\max}$ comes down to the optimization of $U_{ante}^{+\max}$ using Dynamic Programming (either Algorithm 2 or Algorithm 1, since $U_{ante}^{+\max} = U_{post}^{+\max}$). Then, we consider only pessimistic agents with an importance degree $w_i$ higher than the current optimal value (obtained for optimistic agents), we compute their pessimistic utilities and select the strategy that maximizes $U_{ante}^{\max}$. Obviously, HetMultiDynProg and ImproHetMultiDynProg provide the same optimal value since they are both exact algorithms. However, ImproHetMultiDynProg needs fewer iterations, which reduces the execution time, as will be shown by the experimental study.
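The difference between the two variants can be sketched on a plain set of candidate lotteries, each optimization call being replaced by a maximum over the candidates. The data, the `calls` counter and the function names below are hypothetical and only serve the illustration; in the paper each call is a Dynamic Programming run on the decision tree.

```python
def optimistic(lottery, u):
    return max(min(pi, u[x]) for x, pi in lottery.items())

def pessimistic(lottery, u):
    return min(max(1 - pi, u[x]) for x, pi in lottery.items())

def het_multi(lotteries, agents):
    """One optimisation per agent; agents are (weight, utilities, attitude)."""
    best_value, best_name, calls = 0, None, 0
    for w, u, attitude in agents:
        util = optimistic if attitude == "opt" else pessimistic
        name = max(lotteries, key=lambda n: util(lotteries[n], u))
        calls += 1
        value = min(w, util(lotteries[name], u))
        if value > best_value:
            best_value, best_name = value, name
    return best_value, best_name, calls

def impro_het_multi(lotteries, agents):
    """Optimistic agents in one pass, then only the pessimistic agents whose
    weight exceeds the current best value."""
    opt_agents = [(w, u) for w, u, a in agents if a == "opt"]
    best_value, best_name, calls = 0, None, 0
    if opt_agents:
        def collective(n):
            return max(min(w, optimistic(lotteries[n], u)) for w, u in opt_agents)
        best_name = max(lotteries, key=collective)
        best_value = collective(best_name)
        calls += 1
    for w, u, a in agents:
        if a == "pes" and w > best_value:
            name = max(lotteries, key=lambda n: pessimistic(lotteries[n], u))
            calls += 1
            value = min(w, pessimistic(lotteries[name], u))
            if value > best_value:
                best_value, best_name = value, name
    return best_value, best_name, calls


lotteries = {"A": {"x1": 1, "x2": 0.5}, "B": {"x1": 0.3, "x2": 1}}
agents = [(1, {"x1": 0.9, "x2": 0.2}, "opt"),
          (0.6, {"x1": 0.1, "x2": 0.8}, "pes"),
          (0.8, {"x1": 0.4, "x2": 0.7}, "pes")]
print(het_multi(lotteries, agents), impro_het_multi(lotteries, agents))
```

Both variants return the same value and the same lottery here; the improved one skips the two pessimistic agents because their weights cannot exceed the value already secured by the optimistic agent.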


4.2.7. Optimization for heterogeneous agents - the min-based rule

Let us consider the last awkward criterion, $U_{ante}^{\min}$. Since it is not monotonic, Dynamic Programming comes without guarantee of optimality. Thus, to obtain an optimal strategy we adapt the Branch and Bound procedure (Algorithm 4) proposed for the optimization of $U_{ante}^{+\min}$ by adjusting the computation of UpperBound(T, δ). We retain the same principle, but here we compute the $U_{ante}^{\min}$ of the best completion of δ: for each agent, UpperBound(T, δ) builds a strategy δi that maximizes $U_i^{\otimes}$, taking into account the agent's attitude w.r.t. uncertainty (⊗ = min if the agent is pessimistic and ⊗ = max if he/she is optimistic). Then, it selects among these strategies the one with the highest $U_{ante}^{\min}$. More specifically, we extend Algorithm 5 to deal with heterogeneous decision makers rather than only optimistic ones. Instead of computing the optimistic utility $U_i^+$ for all agents, we compute $U_i^{\otimes}$ for each of them depending on the decision maker's attitude ($U_i^{\otimes} = U_i^+$ (resp. $U_i^-$) if the agent is optimistic (resp. pessimistic)). The modifications concern lines 9 and 21 of the UpperBound function (Algorithm 5), which are respectively replaced by:

line 9': uN ← (uN ⊗ (λY ⊕ uY)); // if i is optimistic: ⊗ = max, λY = π(Y), ⊕ = min; if i is pessimistic: ⊗ = min, λY = 1 − π(Y), ⊕ = max.

line 21': Uj(δi) ← Util(δi, j); // Computes for each agent j the value of its optimistic or pessimistic utility w.r.t. his/her attitude.

5. Experiments

This last section aims at testing the feasibility of the exact algorithms proposed, namely (i) Dynamic Programming for $U_{post}^{+\max}$ and $U_{post}^{-\min}$, and also for $U_{ante}^{+\max}$ and $U_{ante}^{-\min}$ because the latter coincide with the former, (ii) Multi-Dynamic Programming for $U_{ante}^{-\max}$ (purely pessimistic agents) and $U_{ante}^{\max}$ (heterogeneous agents), and (iii) Branch and Bound for $U_{ante}^{+\min}$ (purely optimistic agents) and $U_{ante}^{\min}$ (heterogeneous agents).

Beyond a proof of feasibility of these algorithms, our experiments aim at evaluating to what extent the optimization of the problematic (non-monotonic) utilities can be approximated by Dynamic Programming. For homogeneous agents, ex-post and ex-ante Dynamic Programming algorithms can indeed be used but come without guarantees of optimality: they can be considered as approximation algorithms. However, for heterogeneous agents, the ex-post approach is meaningless and only ex-ante Dynamic Programming shall be considered for approximation purposes.

The implementation has been done in Java, on an Intel Core i7 2670 QM CPU, 2.2 GHz, with 6 GB of RAM. The experiments were performed on complete binary decision trees. We have considered five sets of problems, the number of decisions to be made in sequence (denoted seq) varying from 2 to 6, with an alternation of decision and chance nodes: at each decision level $l$ (i.e., odd level), the tree contains $2^{l-1}$ decision nodes followed by $2^{l}$ chance nodes.³ In the present experiments, the number of agents is set equal to 6 (for heterogeneous agents cases, we set 3 optimistic and 3 pessimistic agents). The utility values as well as the weight degrees are uniformly drawn from the set {0, 0.1, 0.2, ..., 0.9, 1}. Conditional possibilities are chosen randomly in [0,1] and normalized. Each of the five samples of problems contains 1000 randomly generated trees.
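The shape of the benchmark can be reproduced with a short generator. The sketch below is not the authors' Java code: the tuple encoding, the use of `random.Random`, and the normalization by dividing by the largest drawn degree are assumptions. It also checks that the number of decision nodes for seq = 2, ..., 6 matches the tree sizes reported in the tables (5, 21, 85, 341 and 1365).

```python
import random

GRID = [k / 10 for k in range(11)]   # {0, 0.1, ..., 1}

def random_tree(seq, n_agents, rng):
    """Complete binary tree alternating decision and chance nodes, `seq` decisions deep."""
    counter = [0]

    def decision(level):
        name = f"D{counter[0]}"
        counter[0] += 1
        return ("decision", name, [chance(level) for _ in range(2)])

    def chance(level):
        pis = [rng.random() for _ in range(2)]
        top = max(pis) or 1.0        # guard against the (unlikely) all-zero draw
        branches = []
        for pi in pis:
            if level == seq:
                child = ("leaf", [rng.choice(GRID) for _ in range(n_agents)])
            else:
                child = decision(level + 1)
            branches.append((pi / top, child))   # largest degree becomes 1
        return ("chance", branches)

    return decision(1)

def count_decisions(node):
    if node[0] == "leaf":
        return 0
    if node[0] == "chance":
        return sum(count_decisions(child) for _, child in node[1])
    return 1 + sum(count_decisions(child) for child in node[2])


rng = random.Random(0)
weights = [rng.choice(GRID) for _ in range(6)]
sizes = [count_decisions(random_tree(seq, 6, rng)) for seq in range(2, 7)]
print(sizes)
```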

5.1. Feasibility analysis and temporal performances

Table 1 presents, for each criterion, the execution time of each possible algorithm. Obviously, whatever the algorithm, the CPU time increases with the size of the tree. Dynamic Programming is always below the threshold of 1 ms, while the Branch and Bound algorithms are more expensive (up to 16 ms) but remain affordable even for big trees (1365 decision nodes).

For tricky (non-monotonic) decision rules, both the exact algorithm(s) and the approximation algorithm(s) are presented. It can be checked that for these rules the approximation Dynamic Programming is always faster than exact algorithms. Unsurprisingly, and whatever the rule tested, the ex-ante Dynamic Programming is slightly slower than the ex-post Dynamic Programming, both remaining far below the millisecond in any case. Finally, as to the optimization of $U_{ante}^{\max}$, the experimental results verify that ImproHetMultiDynProg is quicker than HetMultiDynProg, both being exact algorithms.

Furthermore, to study the effects of varying the number of agents, we consider the optimization of $U_{ante}^{-\max}$ and $U_{ante}^{\max}$, for reasonable trees (341 decision nodes) with p agents from 3 to 10, using the more time-consuming algorithm (Branch and Bound). Clearly, as shown in Table 2, the average CPU time with 3 agents is about 3 milliseconds for $U_{ante}^{-\max}$ and about 4 milliseconds for $U_{ante}^{\max}$. The maximal CPU time for decision trees with 10 agents is less than 11 milliseconds in both cases. Thus, we can say that the results are good enough to allow the handling of real-size problems.

5.2. Quality of approximation of exact algorithms by Dynamic Programming

As previously said, $U_{ante}^{-\max}$ and $U_{ante}^{+\min}$, relative to homogeneous agents, and $U_{ante}^{\max}$ and $U_{ante}^{\min}$, for heterogeneous ones, are not monotonic. For both cases, right optimization is performed using Multi-Dynamic Programming for max-oriented


Table 1
Average CPU time, in milliseconds, according to the size of the tree (in number of decision nodes).

| Criterion | Algorithm | 5 | 21 | 85 | 341 | 1365 |
|---|---|---|---|---|---|---|
| $U_{post}^{-\min}$, $U_{ante}^{-\min}$ | Post Dyn. Prog. | 0.022 | 0.026 | 0.038 | 0.052 | 0.106 |
| $U_{post}^{+\max}$, $U_{ante}^{+\max}$ | Post Dyn. Prog. | 0.024 | 0.030 | 0.043 | 0.060 | 0.117 |
| $U_{post}^{-\max}$ | Post Dyn. Prog. | 0.025 | 0.027 | 0.039 | 0.053 | 0.112 |
| $U_{post}^{+\min}$ | Post Dyn. Prog. | 0.026 | 0.028 | 0.041 | 0.059 | 0.110 |
| $U_{ante}^{-\max}$ | Multi Dyn. Prog. | 0.063 | 0.074 | 0.102 | 0.129 | 0.605 |
| $U_{ante}^{-\max}$ | Ante Dyn. Prog. | 0.049 | 0.065 | 0.93 | 0.102 | 0.446 |
| $U_{ante}^{+\min}$ | Branch & Bound | 0.359 | 0.794 | 2.044 | 6.095 | 14.198 |
| $U_{ante}^{+\min}$ | Ante Dyn. Prog. | 0.032 | 0.063 | 0.090 | 0.114 | 0.534 |
| $U_{ante}^{\max}$ | Het. Multi. Dyn. Prog. | 0.068 | 0.073 | 0.114 | 0.136 | 0.319 |
| $U_{ante}^{\max}$ | Impro. Het. Multi. Dyn. Prog. | 0.047 | 0.058 | 0.079 | 0.124 | 0.187 |
| $U_{ante}^{\max}$ | Ante Dyn. Prog. | 0.053 | 0.065 | 0.096 | 0.149 | 0.217 |
| $U_{ante}^{\min}$ | Het. Branch & Bound | 0.420 | 0.972 | 2.708 | 8.483 | 16.356 |
| $U_{ante}^{\min}$ | Ante Dyn. Prog. | 0.051 | 0.071 | 0.109 | 0.131 | 0.206 |

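Table 2
Average CPU time (in milliseconds) for $U_{ante}^{-\max}$ and $U_{ante}^{\max}$ using Branch and Bound algorithms (B&B and Het. B&B) for trees with 341 decision nodes.

| # of agents | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|
| $U_{ante}^{-\max}$ | 3.596 | 4.394 | 5.344 | 6.056 | 7.137 | 7.840 | 8.534 | 9.204 |
| $U_{ante}^{\max}$ | 4.520 | 5.444 | 7.023 | 7.824 | 8.785 | 9.521 | 10.457 | 10.987 |

Table 3
Quality of approximation of exact algorithms Multi Dyn. Prog. (for $U_{ante}^{-\max}$) and B&B (for $U_{ante}^{+\min}$) by ex-ante and ex-post Dyn. Prog.

% of success

| Criterion | Algorithm | 5 | 21 | 85 | 341 | 1365 |
|---|---|---|---|---|---|---|
| $U_{ante}^{-\max}$ | Ante Dyn. Prog. | 16.1% | 19.8% | 23.7% | 27.1% | 31.9% |
| $U_{ante}^{-\max}$ | Post Dyn. Prog. | 17% | 24.2% | 28.9% | 33.7% | 39% |
| $U_{ante}^{+\min}$ | Ante Dyn. Prog. | 82% | 78.6% | 71% | 65.4% | 60.2% |
| $U_{ante}^{+\min}$ | Post Dyn. Prog. | 93.2% | 91% | 89.3% | 87.5% | 84.7% |

Closeness value

| Criterion | Algorithm | 5 | 21 | 85 | 341 | 1365 |
|---|---|---|---|---|---|---|
| $U_{ante}^{-\max}$ | Ante Dyn. Prog. | 0.49 | 0.54 | 0.62 | 0.71 | 0.80 |
| $U_{ante}^{-\max}$ | Post Dyn. Prog. | 0.47 | 0.51 | 0.59 | 0.69 | 0.73 |
| $U_{ante}^{+\min}$ | Ante Dyn. Prog. | 0.96 | 0.94 | 0.92 | 0.91 | 0.90 |
| $U_{ante}^{+\min}$ | Post Dyn. Prog. | 0.97 | 0.96 | 0.95 | 0.94 | 0.93 |

(In Tables 1 and 3, the columns 5 to 1365 give the number of decision nodes.)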

aggregation utilities and Branch and Bound for min-oriented ones. For these criteria, Dynamic Programming algorithms can nevertheless be considered as approximation algorithms. The following experiments estimate the quality of these approximations. To this end, we compute for each sample the success rate of the considered approximation algorithm, i.e., the number of trees for which the value provided by the approximation algorithm is actually optimal (i.e., equal to the one computed by the exact algorithm); then, for the trees for which the approximation algorithm fails to reach optimality, we report the average closeness value $U_{Approx} / U_{Exact}$, where $U_{Approx}$ is the utility of the strategy provided by the approximation algorithm and $U_{Exact}$ is the optimal utility, i.e., the one of the solution computed by the exact algorithm: namely, the Branch and Bound algorithm for $U_{ante}^{+\min}$ and its adaptation for the generalized criterion $U_{ante}^{\min}$, and Multi-Dynamic Programming for $U_{ante}^{-\max}$ and its generalization for $U_{ante}^{\max}$. The results are given in Tables 3 and 4.
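The two indicators can be computed as in the following sketch, where each tree of a sample contributes a pair (value of the approximate strategy, optimal value). The function name and the toy data are hypothetical; the success rate is expressed here as a fraction of the sample.

```python
def approximation_quality(pairs):
    """pairs: list of (u_approx, u_exact), one per tree of the sample.

    Returns the success rate and the average closeness U_approx / U_exact
    over the trees where the approximation is not optimal (None if it never fails).
    A failure implies u_exact > u_approx >= 0, so the ratio is well defined.
    """
    failures = [(a, e) for a, e in pairs if a != e]
    success_rate = (len(pairs) - len(failures)) / len(pairs)
    if not failures:
        return success_rate, None
    closeness = sum(a / e for a, e in failures) / len(failures)
    return success_rate, closeness


sample = [(0.8, 0.8), (0.2, 0.8), (0.4, 0.8), (0.5, 0.5)]
rate, closeness = approximation_quality(sample)
print(rate, closeness)
```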