• Aucun résultat trouvé

Belief functions induced by random fuzzy sets: A general framework for representing uncertain and fuzzy evidence

N/A
N/A
Protected

Academic year: 2021

Partager "Belief functions induced by random fuzzy sets: A general framework for representing uncertain and fuzzy evidence"

Copied!
39
0
0

Texte intégral

(1)

HAL Id: hal-03132386

https://hal.archives-ouvertes.fr/hal-03132386

Submitted on 5 Feb 2021

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires

Belief functions induced by random fuzzy sets: A general framework for representing uncertain and fuzzy evidence

Thierry Denoeux

To cite this version:

Thierry Denoeux. Belief functions induced by random fuzzy sets: A general framework for representing uncertain and fuzzy evidence. Fuzzy Sets and Systems, Elsevier, In press, �10.1016/j.fss.2020.12.004�.

�hal-03132386�

(2)

Belief functions induced by random fuzzy sets: A general framework for representing uncertain and fuzzy evidence

Thierry Denœux

a,b,c

a

Universit´ e de technologie de Compi` egne, CNRS UMR 7253 Heudiasyc, Compi` egne, France

b

Shanghai University, UTSEUS, Shanghai, China

c

Institut universitaire de France, Paris, France

Abstract

We revisit Zadeh’s notion of “evidence of the second kind” and show that it provides the foundation for a general theory of epistemic random fuzzy sets, which generalizes both the Dempster-Shafer theory of belief functions and possibility theory. In this perspective, Dempster-Shafer theory deals with belief functions generated by random sets, while possibil- ity theory deals with belief functions induced by fuzzy sets. The more general theory allows us to represent and combine evidence that is both uncertain and fuzzy. We demonstrate the application of this formalism to statistical inference, and show that it makes it possible to reconcile the possibilistic interpretation of likelihood with Bayesian inference.

Keywords: Dempster-Shafer theory, evidence theory, possibility theory, fuzzy mass functions, uncertain reasoning, likelihood, estimation, prediction.

1. Introduction

The theory of belief functions developed by Dempster [7] and Shafer [42] is a theory of uncertain reasoning that puts the emphasis on the concept of evidence. It is based on the rep- resentation of elementary pieces of evidence by belief functions (defined as completely mono- tone set functions) and on their combination by an operator called the product-intersection rule, or Dempster’s rule of combination. A belief function can be constructed by comparing a piece evidence to a scale of canonical examples such as randomly coded messages, whose meanings are determined by chance [43]. A belief function on a set Θ can be seen as be- ing induced by a multi-valued mapping from a probability space to Ω; it is mathematically equivalent to a random set [7, 37]. As rational beliefs are essentially determined by evidence, the Dempster-Shafer (DS) theory can be regarded as a general framework for reasoning with uncertainty [13].

Shortly after the introduction of DS theory, Zadeh independently proposed another for- malism, called Possibility Theory [57], in which the concept of “fuzzy restriction” plays a

Email address: Thierry.Denoeux@utc.fr (Thierry Denœux)

Preprint submitted to Fuzzy Sets and Systems January 26, 2021

(3)

central role (see, e.g., [14] for a recent review). A fuzzy restriction is typically imposed on a variable X taking values in a set Θ by a statement of the form “X is F e ”, where F e is a fuzzy subset of Θ. For instance, the statement “John is young” acts as a flexible constraint on the age of John. Zadeh proposed to identify the membership function of the fuzzy set F e with a possibility distribution: for any θ ∈ Θ, F e (θ) is then both the degree of membership of θ to the fuzzy set F e , and the degree of possibility that X takes value θ, knowing that “X is F e ”. The possibility of a subset A ⊆ Θ is then the supremum of the degrees of possibility π(θ) for all θ ∈ Θ.

The mathematical connections between the two theories were soon pointed out [20], namely: a possibility measure corresponds to a belief function induced by a multi-valued mapping that assigns probabilities only to a collection of nested subsets. Such a belief func- tion is said to be consonant. From this point of view, the formalism of belief functions is more general than that of possibility theory. However, Dempster’s rule does not fit in possibility theory as it does not preserve consonance: the combination of two possibility measures by Dempster’s rule is not a possibility measure. Possibility theory has its own conjunctive and disjunctive combination operators based on triangular norms and conorms [23]: consequently, it is not a special case of DS theory, but a stand-alone framework. An open question is whether DS and possibility theories should be regarded are two “com- peting” models of uncertainty, which can be used interchangeably to reason with partial information, or whether they should rather be considered as complementary, each theory being suited to represent different kinds of uncertain evidence. If one adopts the latter view, as I do in this paper, it makes sense to search for a more general theory that encompasses belief functions, possibility measures, and their combination operators as special cases. In- terestingly, elements of such a theory already exist, but they have received relatively little attention until now.

The idea of combining DS and possibility theories can first be found in an early paper by Zadeh [58]. Zadeh calls “evidence of the second kind” a pair (X, Π

(Y|X)

) in which X is a discrete random variable on a set Ω and Π

(Y|X)

a collection of “conditioned π-granules”, i.e., conditional possibility distributions of Y given X = x, for all x ∈ Ω. The probability distribution on X and the conditional possibility distributions together define a mixture of possibility measures. If the random variable X is certain, then the mixture has only one component and it boils down to a single possibility measure representing fuzzy evidence. If the conditional possibility distributions take values in {0, 1}, we essentially get a DS belief function, representing uncertain evidence. In general, Zadeh’s concept of evidence of the second kind makes it possible to represent evidence that is both fuzzy and uncertain, and it lays down the foundation of a general theory of uncertainty encompassing possibility theory and DS theory as special cases. This theory, only sketched by Zadeh in [58] and mostly overlooked in later work, is further explored in this paper.

The mathematical structure (X, Π

(Y|X)

) is actually related to the notion of random fuzzy

set or fuzzy random variable, a concept that also appeared at the end of the 1970’s [26,

33, 34]. Random fuzzy sets have been used with different interpretations that differ from

Zadeh’s evidence of the second kind, such as a model of a random mechanism for generating

(4)

fuzzy data [41, 27], or a representation of imprecise information about the true probability distribution associated with a random experiment [32, 4]. Baudrit et al [3] use random fuzzy sets to propagate probabilistic and possibilistic uncertainty in risk analysis from an imprecise probability perspective. Couso and S´ anchez [5] propose a framework based on a known probability measure on a set Ω

1

and a family of conditional possibility measures {Π(·|ω) : ω ∈ Ω

1

} on a set Ω

2

, similar to Zadeh’s “conditioned π-granules”. Independently from the literature on fuzzy random variables, a few contributions on “fuzzy belief structures”, defined as DS mass functions with fuzzy focal sets, were also published in the 1980’s and early 1990’s [29, 51, 53]. Applications to classification, regression and missing data imputation were presented, respectively, in [17], [39, 40] and [38]. However, in these papers, the formalism of fuzzy belief structures was seen merely as a fuzzy generalization of DS mass functions, and not as a general concept encompassing DS and possibility theories as special cases, a perspective that is adopted in this paper.

A very important problem for which a general theory of epistemic random fuzzy sets can be useful is statistical inference. In [42], Shafer proposed to treat the relative likelihood function as the contour function or a consonant belief function. This view is quite appealing and it was further justified axiomatically in [12]. However, it was later rejected by Shafer [44] because it is not consistent with Dempster’s rule of combination: the belief function induced by a joint random sample composed of two independent samples is not the orthog- onal sum of the belief functions induced by each of the two samples considered separately.

Rather, relative likelihood functions induced by independent samples must be multiplied and renormalized, an operation that is consistent with a possibilistic interpretation of the likelihood [46, 1]. Yet, the limited expressiveness of possibility theory alone does not allow it to represent probability distributions and their combination with likelihoods to yield poste- rior distributions. The generalized setting proposed in this paper allows us to resolve these difficulties, by viewing the likelihood function as defining a mass function with a single fuzzy focal set. The resulting model still generalizes Bayesian inference as did Shafer’s original model, while being consistent with Dempster’s rule of combination. To highlight the basic principles without being distracted by mathematical intricacies, we will assume the variables to be defined on finite domains throughout this paper. The case of infinite domains will be addressed in a companion paper.

The rest of the paper is organized as follows. DS and possibility theories are first recalled in Section 2, where some new definitions and results are also given. The more general random fuzzy set model is then exposed in Section 3. Finally, the application to statistical inference is addressed in Section 4, and Section 5 concludes the paper.

2. DS and possibility theories

In this section, we consider a variable θ taking values in a finite set Θ, and we review different classical models for representing and combining evidence about θ. The case of logical evidence is first recalled in Section 2.1. This simple case can be generalized to uncertain evidence, leading to DS theory, or to fuzzy evidence, resulting in possibility theory.

These two theories are reviewed, respectively, in Sections 2.2 and 2.3.

3

(5)

2.1. Logical evidence

Let F be a subset of Θ, and assume that we receive a piece of evidence that us that θ ∈ F for sure, and nothing more. We call such evidence “logical” because it consists of a proposition that is known to be true. Given this evidence, a proposition “θ ∈ A” for some A ⊆ Θ is possible if and only if F ∩ A 6= ∅, and it is certain if and only if F ⊆ A. We can define two set functions Π

F

and N

F

from the power set 2

Θ

to {0, 1}, as

Π

F

(A) := I(F ∩ A 6= ∅) and

N

F

(F ) := I (F ⊆ A),

where I(·) is the indicator function, which returns 1 if its argument is true, and 0 otherwise [24]. We can remark that Π

F

(A) = 1 if and only if there is some θ ∈ A that belongs to F . We can thus write, equivalently,

Π

F

(A) = max

θ∈A

F (θ), (1)

where F (·) denotes the characteristic function of set F . The possibility value of {θ}, denoted by π

F

(θ), is

π

F

(θ) := Π

F

({θ}) = F (θ). (2)

Furthermore, the proposition F ⊆ A is equivalent to F ∩ A

c

= ∅, where A

c

denotes the complement of A. Consequently, we have

N

F

(A) = 1 − Π

F

(A

c

). (3)

It is clear that, for any two subsets A and B of Θ, the following equalities hold:

Π

F

(A ∪ B) = Π

F

(A) ∨ Π

F

(B) (4) and

N

F

(A ∩ B) = N

F

(A) ∧ N

F

(B ), (5) where ∨ and ∧ denote, respectively, the maximum and minimum operators. Functions Π

F

and N

F

can be called, respectively, Boolean possibility and necessity measures, and function π

F

: Θ → {0, 1} can be called a Boolean possibility distribution.

A subset A ⊆ Θ is possible if at least one element θ ∈ A is possible (i.e., belongs to F ). This is captured by function Π

F

. We can also define the stronger notion of guaranteed possibility [24]: A ⊆ Θ is “guaranteed possible” if all its elements are possible, i.e., if A ⊆ F . This notion is captured by the guaranteed possibility function

F

(A) := I (A ⊆ F ) = min

θ∈A

F (θ). (6)

We can also define a dual “potential certainty” function, ∇

F

(A) = 1 − ∆

F

(A

c

), which equals

one if and only if there is some θ outside A that is not possible, i.e., if A ∪ F 6= Θ.

(6)

If we now have two pieces of evidence telling us that θ ∈ F for sure and θ ∈ G for sure, and if we consider that both sources can be trusted, then we can infer that θ ∈ F ∩ G. The combined Boolean possibility distribution is, thus,

π

F∩G

(θ) := π

F

(θ) ∧ π

G

(θ). (7)

If, on the other hand, we only consider that at least one of the two sources can be trusted, then we can infer that θ ∈ F ∪ G, and the combined Boolean possibility distribution is

π

F∪G

(θ) := π

F

(θ) ∨ π

G

(θ). (8)

This simple model can be extended in two ways, by allowing the piece of information to be uncertain, or fuzzy. These two extensions correspond, respectively, to Dempster-Shafer (DS) and Possibility theories, which are recalled in Sections 2.2 and 2.3 below.

2.2. Uncertain evidence: DS theory

Let us now assume that we receive a piece of evidence that can be interpreted in different ways, with given probabilities [42, 13]. Let Ω be the set of interpretations, assumed to be finite. If interpretation ω ∈ Ω holds, then the evidence tells us that θ belongs to some nonempty subset Γ(ω) ⊆ Θ, and nothing more. We further assume that we can assess probabilities on Ω, on which we define a probability measure P . The tuple (Ω, 2

, P, Γ), where Γ is a mapping from Ω to 2

Θ

is a random set. We define the corresponding mass function m : 2

Θ

to [0, 1] as

m(A) := P ({ω ∈ Ω : Γ(ω) = A}) ,

for all nonempty subset A ⊆ Θ, and m(∅) = 0. A subset F such that m(F ) > 0 is called a focal set of m. We denote the focal sets by F

1

, . . . , F

f

, and their masses by m

i

:= m(F

i

) for i = 1, . . . , f . A mass function is said to be logical if it has only one focal set, and Bayesian if all of its focal sets are singletons.

If interpretation ω holds, we know that θ ∈ Γ(ω). The possibility and necessity that θ ∈ A are then, respectively, Π

Γ(ω)

(A) and N

Γ(ω)

(A). We can then compute the expected possibility and the expected necessity of the proposition “θ ∈ A” as, respectively,

P l

m

(A) = X

ω∈Ω

P ({ω})Π

Γ(ω)

(A) =

f

X

i=1

m

i

Π

Fi

(A) =

f

X

i=1

m

i

I(F

i

∩ A 6= ∅) (9) and

Bel

m

(A) = X

ω∈Ω

P ({ω})N

Γ(ω)

(A) =

f

X

i=1

m

i

N

Fi

(A) =

f

X

i=1

m

i

I(F

i

⊆ A). (10) These two functions are called, respectively, plausibility and a belief functions. If m has only one focal set F , it is clear that functions P l

m

and Bel

m

boil down, respectively, to the Boolean possibility and necessity functions Π

F

and N

F

reviewed in Section 2.1. In

5

(7)

the general case, they are mixtures of such functions. The restriction of function P l

m

to singletons, denoted as pl

m

(θ) := P l

m

({θ}), is called the contour function associated to m.

It is equal to function π

F

when m has only one focal set F .

Functions P l

m

and Bel

m

are linked by the duality relation Bel

m

(A) = 1 − P l

m

(A

c

), which is a direct consequence of (3). Shafer [42] shows that a mapping Bel : 2

Θ

→ [0, 1] can be written in the form (10) if and only if Bel(∅) = 0, Bel(Θ) = 1, and Bel is completely monotone, i.e.,

Bel

k

[

i=1

A

i

!

≥ X

∅6=I⊆{1,...,k}

(−1)

|I|+1

Bel \

i∈I

A

i

!

(11) for any k ≥ 2 and any collection A

1

, . . . , A

k

of subsets of Θ.

In addition to the expected possibility and necessity, we can also compute the expected guaranteed possibility of the proposition “θ ∈ A” as

Q

m

(A) = X

ω∈Ω

P ({ω})∆

Γ(ω)

(A) =

f

X

i=1

m

i

Fi

(A) =

f

X

i=1

m

i

I(A ⊆ F

i

). (12) Function Q

m

is called the commonality function induced by m [42]; it plays an important role in DS theory, as will be shown below. The dual function ∇

m

(A) = 1 − Q

m

(A

c

) could also be defined, but its interpretation in DS theory is less clear. Functions m, Bel

m

, P l

m

and Q

m

are in one-to-one correspondence and any one of them allows us to recover the other three. They can thus be considered as different facets of the same information.

Dempster’s rule. Let us now assume that we have two mass functions m

1

and m

2

on Θ induced by independent pieces of evidence, i.e., by independent random sets (Ω

1

, 2

1

, P

1

, Γ

1

) and (Ω

2

, 2

2

, P

2

, Γ

2

). We further assume that both pieces of evidence are reliable, i.e., if the pair of interpretations (ω

1

, ω

2

) ∈ Ω

1

×Ω

2

holds, then we know for sure that θ ∈ Γ

1

, ω

2

) :=

Γ

1

1

) ∩ Γ

2

2

), provided that Γ

1

1

) ∩ Γ

2

2

) 6= ∅. To compute the probability that a particular event in Ω

1

× Ω

2

holds, we need to compute the product measure P :=P

1

⊗ P

2

and to condition it on the event

Θ

:= {(ω

1

, ω

2

) ∈ Ω

1

× Ω

2

: Γ

1

, ω

2

) 6= ∅}. (13) The orthogonal sum [42] of m

1

and m

2

is then defined, for all A ∈ 2

Θ

, as

(m

1

⊕ m

2

)(A) := P ({(ω

1

, ω

2

) ∈ Ω

1

× Ω

2

: Γ

1

, ω

2

) = A} | Θ

) (14a)

= P ({(ω

1

, ω

2

) ∈ Ω

1

× Ω

2

: Γ

1

, ω

2

) = A} ∩ Θ

)

P (Θ

) (14b)

=

 

 

0 if A = ∅

P

F∩G=A

m

1

(F )m

2

(G) P

F∩G6=∅

m

1

(F )m

2

(G) otherwise, (14c)

(8)

which is well defined on the condition that P (Θ

) > 0. The binary operation ⊕ is called Dempster’s rule. It is commutative and associative. The quantity

κ := 1 − P (Θ

) = X

F∩G=∅

m

1

(F )m

2

(G) is called the degree of conflict between m

1

and m

2

.

The following two propositions are important in practice [42].

Proposition 1. Let m

1

and m

2

be two mass functions on Θ. The commonality function Q

m1⊕m2

corresponding to the orthogonal sum m

1

⊕ m

2

is proportional to the product of the commonality functions Q

m1

and Q

m2

corresponding to m

1

and m

2

:

Q

m1⊕m2

= 1

1 − κ Q

m1

Q

m2

.

As Q

m

({θ}) = pl

m

(θ) for all θ ∈ Θ, we also have pl

m1⊕m2

= (1 − κ)

−1

pl

m1

pl

m2

.

Proposition 2. Let m

1

be a mass function and let m

2

be a Bayesian mass function. Then, the mass function m

1

⊕ m

2

is Bayesian; it is defined as

(m

1

⊕ m

2

)({θ}) = pl

m1

(θ)m

2

({θ}) P

θ0∈Θ

pl

m1

0

)m

2

({θ

0

}) for all θ ∈ Θ.

As mentioned above, Dempster’s rule is based on the assumption that both pieces of evidence are reliable. We can also weaken this assumption and only assume that at least one of them is reliable [22, 47]. If the pair of interpretations (ω

1

, ω

2

) ∈ Ω

1

× Ω

2

holds, we can then deduce that θ ∈ Γ

1

, ω

2

) := Γ

1

1

) ∪ Γ

2

2

). Still assuming the two pieces of evidence to be independent, we get the combined mass function m

1

d m

2

defined, for all A ∈ 2

Θ

, as

(m

1

d m

2

)(A) := P ({(ω

1

, ω

2

) ∈ Ω

1

× Ω

2

: Γ

1

, ω

2

) = A}) (15a)

= X

F∪G=A

m

1

(F )m

2

(G). (15b)

We note that normalization is not needed in this case, as Γ

1

, ω

2

) cannot be empty as long as Γ

1

1

) and Γ

2

2

) are nonempty.

Degree of belief in a fuzzy event. The notions of a fuzzy event and its probability were first defined by Zadeh [55]. Given a probability space (Θ, A, P ), a fuzzy event is a fuzzy subset A e of Θ with measurable membership function, and the probability of A e is defined as the expectation of its membership function. When Θ is finite, as assumed throughout his paper, and A = 2

Θ

, the measurability of A e is always satisfied and we have

P ( A) := e X

θ∈Θ

P ({θ}) A(θ), e (16)

7

(9)

where A(θ) denotes the degree of membership of e θ in the fuzzy set A. It can be shown [19] e that P ( A) can also be written as: e

P ( A) = e Z

1

0

P (

α

A)dα, e (17)

where

α

A e = {θ ∈ Θ : A(θ) e ≥ α} is the α-cut of A. e

Smets [45] extended this definition to the case where uncertainty on Θ is defined by a mass function m. He defined the degrees of belief and plausibility of fuzzy event A e as, respectively, the lower and upper expectations of its membership function:

Bel

m

( A) = e

f

X

i=1

m

i

min

θ∈Fi

A(θ) e (18a)

P l

m

( A) = e

f

X

i=1

m

i

max

θ∈Fi

A(θ). e (18b)

Similarly to (17), we have

Bel

m

( A) = e Z

1

0

Bel

m

(

α

A)dα e (19a)

P l

m

( A) = e Z

1

0

P l

m

(

α

A)dα. e (19b) We note that Bel

m

( A) and e P l

m

( A) are the Choquet integrals of e A(·) with respect to e Bel

m

and P l

m

, respectively. It is clear that definition (18) coincides with (16) when m is Bayesian.

Let F (Θ) be the set of all fuzzy subsets of Θ. It becomes a lattice when equipped with fuzzy intersection and union defined, respectively, as

( A e ∧ B)(θ) = e A(θ) e ∧ B e (θ) (20a) and

( A e ∨ B)(θ) = e A(θ) e ∨ B e (θ) (20b) for all θ ∈ Θ, where ∨ and ∧ denote, respectively, the minimum and the maximum. As shown by Smets [45], the mapping Bel from F (Θ) to [0, 1] defined by (18) is a belief function on (F(Θ), ∧, ∨), i.e., it verifies the following inequalities for any k ≥ 2 and countable collection A e

1

, . . . , A e

k

of fuzzy subsets of Θ:

Bel

m k

_

i=1

A e

i

!

≥ X

∅6=I⊆{1,...,k}

(−1)

|I|+1

Bel

m

^

i∈I

A e

i

!

. (21)

(10)

Although Smets did not consider extending the commonality function to fuzzy events, this can also be done as follows:

Q

m

( A) := e Z

1

0

Q(

α

A)dα e (22a)

= Z

1

0 f

X

i=1

m

i

I(

α

A e ⊆ F

i

)

!

dα (22b)

=

f

X

i=1

m

i

Z

1

0

I(

α

A e ⊆ F

i

)dα (22c)

=

f

X

i=1

m

i

1 − max

θ6∈Fi

A(θ) e

. (22d)

2.3. Fuzzy evidence: possibility theory

As recalled in Section 1, possibility theory introduced by Zadeh [57] extends the simple model outlined in Section 2.1 by considering statements of the form “θ is F e ”, where F e si a normal fuzzy subset of Θ, i.e., a fuzzy subset verifying F e (θ) = 1 for some θ ∈ Θ (see also [23, 24, 14]). Such fuzzy statements are good representations of evidence that can be fully trusted but that does not have a precise meaning because, e.g., it is conveyed through natural language. For instance, I may know from a fully reliable witness that “John is old”. Zadeh [57] defines the possibility measure Π

Fe

: 2

Θ

→ [0, 1] induced by the piece of information “θ is F e ” as

Π

Fe

(A) := max

θ∈A

F e (θ), (23)

which generalizes (2). The dual necessity measure is defined as N

Fe

(A) := 1 − Π

Fe

(A

c

) = min

θ6∈A

h

1 − F e (θ) i

(24) for all A ⊆ Θ, which extends (3). The corresponding possibility distribution is defined as

π

Fe

(θ) := Π

Fe

({θ}) = F e (θ) for all θ ∈ Θ. It is, thus, numerically equal to F e .

It can easy be seen that

Π

Fe

(A) = Z

1

0

Π

αFe

(A)dα (25a)

and

N

Fe

(A) = Z

1

0

N

αFe

(A)dα. (25b)

Trivially, we have Π

Fe

(∅) = N

Fe

(∅) = 0, Π

Fe

(Θ) = N

Fe

(Θ) = 1, and N

Fe

(A) ≤ Π

Fe

(A) for all A ⊆ Θ. Furthermore, the following equalities hold:

Π

Fe

(A ∪ B) = Π

Fe

(A) ∨ Π

Fe

(B) (26a)

9

(11)

and

N

Fe

(A ∩ B ) = N

Fe

(A) ∧ N

Fe

(B) (26b) for all A, B ⊆ Θ. These equalities generalize (4) and (5).

In addition to the possibility and necessity measures defined by (26), Dubois and Prade [24] also generalize the notion of guaranteed possibility (6) as

Fe

(A) := min

θ∈A

F e (θ) = Z

1

0

αFe

(A)dα, (27)

and that of potential certainty as

Fe

(A) := 1 − ∆

Fe

(A

c

) = max

θ6∈A

h

1 − F e (θ) i

= Z

1

0

αFe

(A)dα. (28) The quantity ∆

Fe

(A) measures the extent to which all values in A are actually possible given the statement “θ is F e ”, while ∇

Fe

(A) measures the extent to which at least one value θ outside A has a low degree of possibility. Clearly, the following equality holds for any subsets A and B of Θ:

Fe

(A ∪ B) = ∆

Fe

(A) ∧ ∆

Fe

(B ). (29) Combination of possibility measures. Let us now assume that we receive two pieces of evi- dence that tell us that “θ is F e ” and “θ is G”, where e F e and G e are two normal fuzzy subsets of Θ. These fuzzy sets induce two possibility distributions π

Fe

and π

Ge

. How should they be combined? If both sources are assumed to be reliable, then it makes sense to infer that

“θ is F e ∩ G”, where e F e ∩ G e denotes the intersection of fuzzy sets F e and G. There are, how- e ever, two difficulties. First, there are several possible definitions of fuzzy set intersection.

Zadeh [54] defines the intersection of fuzzy sets F e and G e as ( F e ∧ G)(θ) = e F e (θ) ∧ G(θ), e but he also proposes the following alternative definition:

( F e · G)(θ) = e F e (θ) · G(θ). e

It is clear that both definitions extend the usual set intersection. Later, these two definitions have been generalized as ( F e ∩

>

G)(θ) := e F e (θ)> G(θ), where e > is a triangular norm (or t- norm for short) [18]. The minimum is the largest t-norm. It is consistent with the definition of inclusion as F e ⊆ G e iff F e ≤ G: e F e ∧ G e is then the largest fuzzy set included in F e and G. It e is also idempotent, i.e., F e ∧ F e = F e ; consequently, there is no reinforcement effect when two pieces of evidence are identical, which makes minimum-intersection combination useful for combining evidence from dependent and possibly redundant sources. In contrast, product intersection has a reinforcement effect that is appropriate when the sources are assumed to be independent [25, page 352].

The second difficulty when combining possibility distributions conjunctively is that the

intersection of two fuzzy sets F e and G e may not be normal, and some normalization step

(12)

has to take place. The standard normalization procedure divides F e ∩ G e by its height (i.e., supremum). As a consequence, the associativity property is usually lost. However, the normalized product intersection, defined as

F e G e := F e · G e

h( F e · G) e , (30)

is associative [25]. We give a simple proof of this well-known result below for completeness.

Proposition 3. Let F e , G e and H e be fuzzy subsets of Θ, and let denote the normalized intersection based on the product t-norm. Then,

( F e G) e H e = F e ( G e H). e

Proof. The key property of the product-based intersection is that, for any α ≥ 0, h((α F e ) · G) = e h( F e · (α G)) = e αh( F e · G). Using this property, we have e

( F e G) e H e =

Fe·Ge h(FeG)e

· H e h

Fe·Ge

h(Fe·G)e

· H e = F e · G e · H/h( e F e · G) e

h( F e · G e · H)/h( e F e · G) e = F e · G e · H e h( F e · G e · H) e . and, by the commutativity of ,

F e ( G e H) = ( e G e H) e F e = F e · G e · H e h( F e · G e · H) e .

The normalized intersection (30) is only defined if h( F e · G) e > 0. A value of h( F e · G) e close to zero signals conflicting evidence. Several authors have warned agains the use of normalized intersection in this case (see, e.g., [25, page 354]). Indeed, the division by a small number in (30) may result in high sensitivity of the combination to small changes in F e or G. Actually, conflict in the evidence may sometimes lead us to questioning its reliability. e If we only assume that at least one source is reliable, we can only deduce that “θ is F e ∪ G”, e where F e ∪ G e denotes the union of fuzzy sets F e and G. The union e F e ∪ G e is usually defined as ( F e ∪

G)(θ) := e F e (θ)⊥ G(θ), where e ⊥ is a t-conorm. To each t-norm > corresponds a dual t-conorm ⊥ defined as u⊥v = 1 − [(1 − u)>(1 − v )]. The t-conorms corresponding to the minimum and the product are, respectively, the maximum and the probabilistic sum u⊥v = u + v − uv.

Relation between possibility measures and belief functions. It can easily be shown that the mapping N

Fe

: 2

→ [0, 1] is completely monotone (11), i.e., it is a belief function, and Π

Fe

is the dual plausibility function [24]. Actually, a belief function Bel

m

and the associated plausibility function P l

m

verify the equalities

Bel

m

(A ∩ B) = Bel

m

(A) ∨ Bel

m

(B )

11

(13)

and

P l

m

(A ∪ B) = P l

m

(A) ∨ P l(

m

B)

iff the corresponding mass function m is consonant [42], i.e., if for any pair of focal sets F

i

and F

j

, we have F

i

⊂ F

j

or F

j

⊂ F

i

. To each possibility distribution π

Fe

thus corresponds a unique consonant mass function m

Fe

, which can be recovered as follows [20]. Let θ

1

, . . . , θ

q

denote the elements of Θ, assumed to be indexed in such a way that 1 = π

Fe

1

) ≥ π

Fe

2

) ≥ . . . ≥ π(

Fe

θ

q

).

The corresponding mass function m

Fe

is given by

m

Fe

({θ

1

, . . . , θ

i

}) = π

Fe

i

) − π

Fe

i+1

) (31) for i = 1, . . . , q − 1, and m

Fe

(Θ) = π

Fe

q

). It can easily be checked that functions Bel

m

Fe

, P l

m

Fe

, pl

m

Fe

and Q

m

Fe

are equal, respectively, to N

Fe

, Π

Fe

, π

Fe

and ∆

Fe

.

However, the combination of two consonant mass functions by Dempster’s rule is no longer consonant. To combine two consonant belief functions Bel

1

and Bel

2

induced by independent sources, we must, therefore, consider the evidence on which they are based:

• If they are based on fully reliable but vague (fuzzy) evidence such as θ is F e and θ is G, then they should be combined by a conjunctive operator of possibility theory; e the normalized product intersection (30) seems to be a good choice as it is associative;

• If they are based on uncertain but crisp (nonfuzzy) evidence pointing to consonant focal sets, then the corresponding consonant mass functions should be combined by Dempster’s rule (14).

The two combination mechanisms yield different results but, as a consequence of Proposition 1, the contour functions are proportional, as illustrated by the following example.

Example 1. Let Θ = {θ

1

, θ

2

, θ

3

, θ

4

}. Assume that we receive the following pieces of evidence from two independent and reliable sources:

• First piece of evidence: “θ is F e ”, with

F e = θ

1

0.5 , θ

2

1 , θ

3

0.8 , θ

4

0.3

;

• Second piece of evidence: “θ is G”, with e

G e = θ

1

0.3 , θ

2

0.7 , θ

3

1 , θ

4

0.2

;

(14)

These pieces of evidence can be represented by possibility distributions π

Fe

and π

Ge

and com- bined by (30), resulting in the combined possibility distribution π

Fe

Ge

, with F e G e equal to

F e G e =

θ

1

0.15/0.8 , θ

2

0.7/0.8 , θ

3

1 , θ

4

0.06/0.8

=

θ

1

0.1875 , θ

2

0.875 , θ

3

1 , θ

4

0.075

. The consonant mass function m

FeGe

corresponding to π

FeGe

is m

Fe

Ge

({θ

3

}) = 1 − 0.875 = 0.125 m

Fe

Ge

({θ

2

, θ

3

}) = 0.875 − 0.1875 = 0.6875 m

Fe

Ge

({θ

1

, θ

2

, θ

3

}) = 0.1875 − 0.075 = 0.1125 m

Fe

Ge

(Θ) = 0.075.

Consider now another situation in which we receive two pieces of evidence represented by the following consonant mass functions:

m

Fe

({θ

2

}) = 1 − 0.8 = 0.2 m

Fe

({θ

2

, θ

3

}) = 0.8 − 0.5 = 0.3 m

Fe

({θ

1

, θ

2

, θ

3

}) = 0.5 − 0.3 = 0.2

m

Fe

(Θ) = 0.3, and

m

Ge

({θ

3

}) = 1 − 0.7 = 0.3 m

Ge

({θ

2

, θ

3

}) = 0.7 − 0.3 = 0.4 m

Ge

({θ

1

, θ

2

, θ

3

}) = 0.3 − 0.2 = 0.1

m

Ge

(Θ) = 0.2.

Mass functions m

Fe

and m

Ge

induce the same belief functions as, respectively, F e and G. Yet, e they are combined differently by Dempster’s rule. Their orthogonal sum m

Fe

⊕ m

Ge

is

(m

Fe

⊕ m

Ge

)({θ

3

}) = 0.24/(1 − 0.06) ≈ 0.255 (m

Fe

⊕ m

Ge

)({θ

2

}) = 0.14/(1 − 0.06) ≈ 0.149 (m

Fe

⊕ m

Ge

)({θ

2

, θ

3

}) = 0.41/(1 − 0.06) ≈ 0.436 (m

Fe

⊕ m

Ge

)({θ

1

, θ

2

, θ

3

}) = 0.09/(1 − 0.06) ≈ 0.0957 (m

Fe

⊕ m

Ge

)(Θ) = 0.06/(1 − 0.06) ≈ 0.0638, which is different from m

Fe

Ge

. In particular, m

Fe

⊕m

Ge

is not consonant. Its contour function of m

Fe

⊕ m

Ge

is pl

m

Fe⊕m

Ge

1

) = (0.09 + 0.06)/(1 − 0.06) ≈ 0.160 pl

m

Fe⊕m

Ge

2

) = (0.14 + 0.41 + 0.09 + 0.06)/(1 − 0.06) ≈ 0.745 pl

m

Fe⊕m

Ge

3

) = (0.24 + 0.41 + 0.09 + 0.06)/(1 − 0.06) ≈ 0.851 pl

m

Fe⊕m

Ge

4

) = 0.06/(1 − 0.06) ≈ 0.0638.

It can be checked that π

Fe

Ge

and pl

m

Fe⊕m

Ge

are proportional, with π

Fe

Ge

/pl

m

Fe⊕m

Ge

= 1.175.

13

(15)

The above considerations show that it is important, in practice, to determine whether a piece of evidence should be represented by a possibility distribution or by a consonant mass function. It seems reasonable to use the former representation for reliable but fuzzy evidence such as conveyed, e.g., by natural language, as in the proposition “John is tall”.

More surprisingly, it appears that statistical evidence should also be represented in that way, as will be shown in Section 4. In contrast, consonant evidence usually arises when combining several elementary pieces of evidence, such as simple mass functions of the form m

i

(F

i

) = p

i

, m

i

(Θ) = 1 − p

i

with F

1

⊆ . . . ⊆ F

f

. Such simple mass functions may be obtained by combining expert opinions or sensor readings telling us that F

i

⊆ Θ with confidence degree p

i

.

Possibility and necessity of a fuzzy event. In [57], Zadeh defines the possibility and the necessity of a fuzzy event A e ∈ F (Θ) as

Π

(S)

Fe

( A) := max e

θ∈Θ

( A e ∧ F e )(θ) = h( A e ∧ F e ). (32) and

N

(S)

Fe

( A) := 1 e − Π

(S)

Fe

( A e

c

) (33a)

= 1 − max

θ∈Θ

[1 − A(θ)] ∧ F e (θ) (33b)

= min

θ∈Θ

n

1 − [1 − A(θ)] e ∧ F e (θ) o

(33c)

= min

θ∈Θ

A(θ) e ∨ [1 − F e (θ)] (33d)

= min

θ∈Θ

( A e ∨ F e

c

)(θ). (33e)

We note that Π

(S)

Fe

( A) and e N

(S)

Fe

( A) are Sugeno integrals of the mapping e A e : Θ → [0, 1] with respect to Π

Fe

and N

Fe

, respectively [18, Section 7.6.2]. We can also remark that Int( A, e F e ) :=

h( A∧ e F e ) can be seen as a degree of intersection of A e and F e , whereas Incl( F , e A) := min e

θ∈Θ

( A∨ e F e

c

)(θ) can be seen as a degree of inclusion of F e in A. e

It is easy to check that Π

(S)

Fe

and N

(S)

Fe

are still, respectively, possibility and necessity measures in the lattice (F (Θ), ∧, ∨), as

Π

(S)

Fe

( A e ∨ B) = Π e

(S)

Fe

( A) e ∨ Π

(S)

Fe

( B e ) (34a)

and

N

(S)

Fe

( A e ∧ B e ) = N

(S)

Fe

( A) e ∧ N

(S)

Fe

( B e ) (34b)

for all fuzzy sets A e and B, which generalize (4)-(5). It is also easy to show that e N

(S)

Fe

( A) e ≤ Π

(S)

Fe

( A) for all e A e ∈ F(Θ) [21]. Dubois and Prade [21] also show that N

(S)

Fe

is a belief function in the lattice (F (Θ), ∧, ∨), and Π

(S)

Fe

is the dual plausibility function.

(16)

Dubois et al. review generalizations of (32)-(33) as well as alternative definitions in [18, Section 7.6]. In particular, they remark that we can identify Π

Fe

and N

Fe

to, respectively, the plausibility function P l

m

Fe

and the belief function Bel

m

Fe

induced by the consonant mass function m

Fe

defined by (31). We can then define the possibility and necessity of a fuzzy event A e from (18) and (19) by the Choquet integrals of the mapping ˜ A : Θ → [0, 1] with respect to the non-additive functions Π

Fe

and N

Fe

: Π

(C)

Fe

( A) := e Z

1

0

Π

Fe

(

α

A)dα e = X

A⊆Θ

m

Fe

(A) max

θ∈A

A(θ) = e P l

m

Fe

( A) e (35a) N

(C)

Fe

( A) := e Z

1

0

N

Fe

(

α

A)dα e = X

A⊆Θ

m

Fe

(A) min

θ∈A

A(θ) = e Bel

m

Fe

( A). e (35b) From (25), we then have

Π

(C)

Fe

( A) = e Z

1

0

Z

1 0

Π

βFe

(

α

A)dβ e

dα = Z

1

0

Π

βFe

( A)dβ e (36a) and

N

(C)

Fe

( A) = e Z

1

0

Z

1 0

N

βFe

(

α

A)dβ e

dα = Z

1

0

N

βFe

( A)dβ, e (36b) with

Π

βFe

( A) = e Z

1

0

Π

βFe

(

α

A)dα e = max

θ∈βFe

A(θ) = e max

{θ:Fe(θ)≥β}

A(θ) e (36c)

and

N

βFe

( A) = e Z

1

0

N

βFe

(

α

A)dα e = min

θ∈βFe

A(θ) = e min

{θ:Fe(θ)≥β}

A(θ). e (36d) As N

(S)

Fe

, function N

(C)

Fe

defined by (35b) is a belief function on the lattice (F (Θ), ∧, ∨), and Π

(C)

Fe

is its dual plausibility function. However, Π

(C)

Fe

and N

(C)

Fe

are no longer possibility and necessity measures, as they fail to satisfy the basic axioms (34) [21].

The following example illustrates the difference between Π

(S)

Fe

and Π

(C)

Fe

. Example 2. Consider the fuzzy sets F e and G e of Example 1. We have

Π

(S)

Fe

( G) = e h( F e ∧ G) = e h θ

1

0.3 , θ

2

0.7 , θ

3

0.8 , θ

4

0.2

= 0.8 and

N

(S)

Fe

( G) = 1 e − h( F e ∧ G e

c

) = 1 − h θ

1

0.5 , θ

2

0.3 , θ

3

0 , θ

4

0.3

= 1 − 0.5 = 0.5, but

Π

(C)

Fe

( G) = 0.2 e × 0.7 + 0.3 × 1 + 0.2 × 1 + 0.3 × 1 = 0.94.

and

N

(C)

Fe

( G) = 0.2 e × 0.7 + 0.3 × 0.7 + 0.2 × 0.3 + 0.3 × 0.2 = 0.47.

15

(17)

To conclude this section, we can remark that the guaranteed possibility function (27) can be extended to fuzzy events as well. It can easily be seen that, for crisp A,

min

θ∈A

F e (θ) = min

θ∈Θ

(A

c

∨ F e )(θ) = Incl(A, F e ).

A natural definition for the guaranteed possibility of fuzzy event A, in the spirit of Zadeh’s e definitions (32)-(33) for the possibility and belief of a fuzzy event is, thus,

(S)

Fe

( A) := Incl( e A, e F e ) = min

θ∈Θ

( A e

c

∨ F e )(θ). (37) As a generalization of (29), we have

(S)

Fe

( A e ∨ B e ) = ∆

(S)

Fe

( A) e ∧ ∆

(S)

Fe

( B). e (38)

Alternatively, we can define the guaranteed possibility of fuzzy event A e from (22) by the Choquet integral

(C)

Fe

( A) := e Z

1

0

Fe

(

α

A)dα e = X

A⊆Θ

m

Fe

(A)

1 − max

θ6∈A

A(θ) e

dα = Q

m

Fe

( A). e (39) Unfortunately, equality (38) does not hold anymore with this alternative definition.

Example 3. Consider again the fuzzy sets F e and G e of Examples 1 and 2. From

G e

c

∨ F e = θ

1

0.7 , θ

2

1 , θ

3

0.8 , θ

4

0.8

,

we have ∆

(S)

Fe

( G) = 0.7, but e

(C)

Fe

( G) = 0.2(1 e − 1) + 0.3(1 − 0.3) + 0.2(1 − 0.2) + 0.3 = 0.67.

Now, let

H e = θ

1

1 , θ

2

0.6 , θ

3

0. , θ

4

0.1

. We have

H e

c

∨ F e = θ

1

0.5 , θ

2

1 , θ

3

0.8 , θ

4

0.9

, hence ∆

(S)

Fe

( H) = 0.5, and e

( G e ∨ H) e

c

∨ F e = θ

1

0.5 , θ

2

1 , θ

3

0.8 , θ

4

0.8

,

hence ∆

(S)

Fe

( G e ∨ H) = 0.5 = ∆ e

(S)

Fe

( G) e ∧ ∆

(S)

Fe

( H). But we have e

(C)

Fe

( H) = 0.2(1 e − 1) + 0.3(1 − 1) + 0.2(1 − 0.1) + 0.3 = 0.48 and

(C)

Fe

( G e ∨ H) = 0.2(1 e − 1) + 0.3(1 − 1) + 0.2(1 − 0.2) + 0.3 = 0.46 6= ∆

(C)

Fe

( G) e ∧ ∆

(C)

Fe

( H). e

(18)

3. Uncertain and fuzzy information: fuzzy mass functions

To handle cases where one consonant belief function is based on fuzzy evidence and the other one on uncertain evidence, or to handle evidence that is both uncertain and fuzzy, we need to generalize DS and possibility theories. Such a generalization is exposed in this section. We first review the notion of fuzzy mass function (Section 3.1). In Section 3.2, we propose a combination operator that extends both Dempster’s rule (14) and the normalized product (30); we also define extensions of the disjunctive operators reviewed in Section 2.2 and 2.3. The degrees of belief and plausibility of fuzzy events are then defined in Section 3.3.

3.1. Fuzzy mass functions

Following [58], let us assume that we receive uncertain and fuzzy information, which can be modeled as follows. As in Section 2.2, we assume that the evidence can be interpreted in different ways, and the set of interpretations is denoted by Ω. If interpretation ω ∈ Ω holds, then we know for sure that proposition “θ is e Γ(ω)” is true, where Γ(ω) is a normal fuzzy e subset of Θ. Denoting by F

(Θ) the set of all normal fuzzy subsets of Θ, we thus have a mapping e Γ : Ω → F

(Θ). If, as before, we assume the existence of a probability measure P on (Ω, 2

), then the tuple (Ω, 2

, P, e Γ) is a random fuzzy set [26, 33, 34] (also called a

“fuzzy random variable” when Θ is R

p

).

Let m e be the mapping from F

(Θ) to [0, 1] defined as m( e F e ) := P ({ω ∈ Ω : Γ(ω) = e F e }).

Because Ω is assumed to be finite, there is only a finite number of fuzzy subsets F e such that m( e F e ) > 0, called the (fuzzy) focal sets of m. The set of focal sets is denoted as F (m) = { F e

1

, . . . , F e

f

}. We also use the notation m

i

:= m( F e

i

), i = 1, . . . , n. Mapping m e is called a fuzzy mass function

1

. The notion of fuzzy mass function extends that of DS mass function recalled in Section 2.2. The number m( e F e

i

) is interpreted as the degree with which the evidence supports the proposition θ is F e

i

, without supporting any more specific proposition.

If interpretation ω holds, we know that “θ is e Γ(ω)”. The possibility and necessity of a subset A ⊆ Θ are, respectively, Π

eΓ(ω)

(A) and N

Γ(ω)e

(A) defined by (23) and (24). As we only know that interpretation ω holds with probability P ({ω}), we can compute the expected possibility and the expected necessity [58] as

P l

me

(A) = X

ω∈Ω

P ({ω})Π

Γ(ω)e

(A) =

f

X

i=1

m

i

Π

Fe

i

(A) =

f

X

i=1

m

i

max

θ∈A

F e

i

(θ) (40)

1

This notion should not be confused with that of fuzzy-valued mass function introduced in [10]. A fuzz- valued mass function assigns fuzzy numbers to crisp focal sets, and can be interpreted as a fuzzy set of crisp mass functions. We could, of course, “fuzzify” both the masses and the focal sets; such a generalization will not be considered in this paper.

17

(19)

and

Bel

me

(A) = X

ω∈Ω

P ({ω})N

Γ(ω)e

(A) =

f

X

i=1

m

i

N

Fe

i

(A) =

f

X

i=1

m

i

min

θ6∈A

[1 − F e

i

(θ)]. (41) Functions P l

me

and Bel

me

are, respectively, mixtures of possibility and necessity measures.

As we have seen that each necessity measure N

Fe

i

is a belief function, and the set of belief functions is convex, Bel

me

is still a belief function, and P l

me

(which verifies P l

me

(A) = 1 − Bel

me

(A

c

) for all A ⊆ Θ) is the dual plausibility function. The contour function of m e is equal to the mean of the membership functions of its focal sets:

pl

me

(θ) =

f

X

i=1

m

i

F e

i

(θ). (42)

We can also define the commonality of A as its expected guaranteed possibility from (27):

Q

me

(A) = X

ω∈Ω

P ({ω})∆

Γ(ω)e

(A) =

f

X

i=1

m

i

Fe

i

(A) =

f

X

i=1

m

i

min

θ∈A

F e

i

(θ). (43) A fuzzy mass function m e with focal sets F e

1

, . . . , F e

f

and masses m

1

, . . . , m

f

can also be described as a collection of standard (crisp) mass functions

α

m e with focal sets

α

F e

1

, . . . ,

α

F e

f

and the same masses m

1

, . . . , m

f

. Each mass function

α

m e is induced by the random set (Ω, 2

, P,

α

Γ), with e

α

e Γ(ω) =

α

[e Γ(ω)]. From (25) and (27), we have

Bel

me

(A) = Z

1

0

Bel

α

me

(A)dα, (44a)

P l

me

(A) = Z

1

0

P l

αme

(A)dα, (44b)

and

Q

me

(A) = Z

1

0

Q

α

me

(A)dα (44c)

for all A ⊆ Θ. We note that equalities (44a) and (44b) are proved in [5] in a more general setting.

It is important to remark that the concept of fuzzy mass functions allows us to generalize both DS theory and possibility theory. In particular, a logical fuzzy mass function m e with a single fuzzy focal set F e is equivalent to a possibility distribution π = F e .

Remark 1. In “classical” DS theory there is a one-to-one correspondence between (crisp) mass functions and belief functions. The fundamental reason is that, for a given belief function Bel, the system of linear equations Bel(A) = P

B⊆A

m(B ) for all ∅ 6= A ⊆ Ω

has a unique solution. In the “fuzzified” version of DS theory, there is a many-to-one

correspondence between fuzzy mass functions and belief functions, i.e., a given belief function

can be obtained from many fuzzy mass functions. The simplest example is that of a belief

(20)

function verifying Bel(A∩B) = Bel(A)∧Bel(B) for all A, B ⊆ Θ (i.e., a necessity measure), which, as shown in Section 2.3, can be induced both by a crisp consonant mass function and by a logical fuzzy mass function.

In the following section, we define a combination operator for fuzzy mass functions that extends both the normalized product of possibility distributions (30) and Dempster’s rule (14).

3.2. Combination of fuzzy mass functions

Let us now assume that we have two fuzzy mass functions m e

1

and m e

2

induced by inde- pendent random fuzzy sets (Ω

1

, 2

1

, P

1

, e Γ

1

) and (Ω

2

, 2

2

, P

2

, Γ e

2

). If interpretations ω

1

∈ Ω

1

and ω

2

∈ Ω

2

both hold, then we can infer the fuzzy proposition “θ is Γ e

>

1

, ω

2

)”, with e Γ

>

1

, ω

2

) := Γ e

1

1

) ∩

>

e Γ

2

2

), where ∩

>

is a fuzzy intersection operator based on a t-norm >. However, the random fuzzy set (Ω

1

× Ω

2

, 2

1×Ω2

, P

1

⊗ P

2

, e Γ

>

) may not be nor- malized, as some fuzzy sets e Γ

>

1

, ω

2

) may not be normal. To obtain a normal random fuzzy set, we can replace Γ e

>

by the mapping

e Γ

>

1

, ω

2

) := Γ e

1

1

)∩

>

e Γ

2

2

) = e Γ

1

1

) ∩

>

e Γ

2

2

)

h(e Γ

1

1

) ∩

>

Γ e

2

2

)) , (45) where ∩

>

denotes normalized >-intersection, and we can condition P

1

⊗ P

2

by the following fuzzy subset Θ e

of Ω

1

× Ω

2

, which is a natural generalization of Θ

in (13):

Θ e

1

, ω

2

) = h(e Γ

>

1

, ω

2

)), ∀(ω

1

, ω

2

) ∈ Ω

1

× Ω

2

. (46) Using Zadeh’s definition of the probability of a fuzzy event (16), the conditional probability (P

1

⊗ P

2

)(· | Θ e

) is

(P

1

⊗ P

2

)(B | Θ e

) = (P

1

⊗ P

2

)(B ∩ Θ e

) (P

1

⊗ P

2

)( Θ e

)

= P

12)∈B

P

1

1

)P

2

2

)h(e Γ

>

1

, ω

2

)) P

12)∈Ω1×Ω2

P

1

1

)P

2

2

)h(e Γ

>

1

, ω

2

)) ,

for all B ⊆ Ω

1

× Ω

2

. The fuzzy mass function generated by the random fuzzy set (Ω

1

× Ω

2

, 2

1×Ω2

, (P

1

⊗ P

2

)(· | Θ e

), Γ e

0

>

) is then ( m e

1

e

>

m e

2

)( F e ) :=

P

G∩e >H=e Fe

h( G e ∩

>

H) e m e

1

( G) e m e

2

( H) e P

(G,eH)∈e F(me1)×F(me2)

h( G e ∩

>

H) e m e

1

( G) e m e

2

( H) e (47) for all F e ∈ F

(Θ). This operation was proposed by Yen [53], using > = min. The normal- ization operation in (47) was called “soft normalization” by Yager [52].

In general, the operation e

>

defined by (47) is not associative. However, it is associative if we use the product t-norm for defining the fuzzy set intersection. In the following, we will

19

(21)

thus define the orthogonal sum m e

1

⊕ m e

2

of two fuzzy mass functions m e

1

and m e

2

(where the symbol ⊕ has the same meaning as e

>

with > = product) as

( m e

1

⊕ m e

2

)( F e ) :=

P

GeH=e Fe

h( G e · H) e m e

1

( G) e m e

2

( H) e P

(G,eH)∈e F(me1)×F(me2)

h( G e · H) e m e

1

( G) e m e

2

( H) e , (48) for all F e ∈ F

(Θ). The operation defined by (48) will be called the generalized product- intersection rule. The denominator of the right-hand side of (48) can be denoted as 1 − κ, where κ can be interpreted as the degree of conflict between fuzzy mass functions m e

1

and m e

2

. It is clear that (48) is a proper generalization of (14), i.e., it gives the same result when the focal sets of m e

1

and m e

2

are crisp. It also boils down to normalized intersection (30) when combining two logical fuzzy mass functions m e

1

and m e

2

such that m e

1

( F e ) = 1 and m e

2

( G) = 1. As a consequence of Proposition 3, the generalized product-intersection rule is e associative, as stated in the following proposition.

Proposition 4. Let m

1

, m

2

and m

3

be fuzzy mass functions of Θ, and let ⊕ denote the generalized product-intersection defined by (48). Then,

(m

1

⊕ m

2

) ⊕ m

3

= m

1

⊕ (m

2

⊕ m

3

).

Proof. We have

[(m

1

⊕ m

2

) ⊕ m

3

]( A) = e P

HeK=e Ae

(m

1

⊕ m

2

)( H)m e

3

( K)h( e H e · K) e P

H,eKe

(m

1

⊕ m

2

)( H)m e

3

( K)h( e H e · K) e

= P

HeK=e Ae

P

FeG=e Hem1(Fe)m2(G)h(e Fe·G)e P

F ,eGem1(Fe)m2(G)e

m

3

( K)h( e H e · K) e P

H,eKe

P

Fe G=e Hem1(Fe)m2(G)h(e Fe·G)e P

F ,eGem1(Fe)m2(G)e

m

3

( K)h( e H e K) e

= P

FeGeK=e Ae

m

1

( F e )m

2

( G)m e

3

( H)h( e F e · G)h(( e F e G) e · K) e P

F ,eG,eKe

m

1

( F e )m

2

( G)m e

3

( H)h( e F e · G)h(( e F e G) e · K) e . Now,

h( F e · G)h(( e F e G) e · K e ) = h( F e · G)h e F e · G e h( F e · G) e · K e

!

= h( F e · G e · K). e Hence,

[(m

1

⊕ m

2

) ⊕ m

3

]( A) = e P

FeGeK=e Ae

m

1

( F e )m

2

( G)m e

3

( H)h( e F e · G e · K) e P

F ,eG,eKe

m

1

( F e )m

2

( G)m e

3

( H)h( e F e · G e · K) e . (49) Consequently, we can permute the indices in (49) and write

(m

1

⊕ m

2

) ⊕ m

3

= (m

2

⊕ m

3

) ⊕ m

1

= m

1

⊕ (m

2

⊕ m

3

).

Références

Documents relatifs

and Garcia-Valdez M., A comparative study of type-1 fuzzy logic systems, interval type-2 fuzzy logic systems and generalized type-2 fuzzy logic systems in control

An appli- cation shows the plausibility curves and conflict values obtained in case of combination operations done on Gaussian based agents and at last, an example of

A possible alternative is to search for subsets of conjunctively merged belief structures jointly satisfying the idempotent contour function merging principle, thus working with sets

• Belief degrees of consequence. For a complex classification problem, it is likely that the consequence of a rule may take a few values with different belief degrees. As indicated

Basically, each of the alternative partitional clustering structures (i.e., hard, fuzzy, possibilistic and rough par- titions) correspond to a special form of the mass functions

Next, in section 3 we propose a general framework of comparison measures of fuzzy sets (i.e. inclu- sion, similarity and distance) based on logical considerations, i.e.. using

Abstract —Embedding non-vectorial data into a normed vec- torial space is very common in machine learning, aiming to perform tasks such classification, regression, clustering and so

Our main idea to design kernels on fuzzy sets rely on plugging a distance between fuzzy sets into the kernel definition. Kernels obtained in this way are called, distance