
www.imstat.org/aihp 2009, Vol. 45, No. 3, 611–625

DOI: 10.1214/08-AIHP175

© Association des Publications de l’Institut Henri Poincaré, 2009

Stochastic domination for iterated convolutions and catalytic majorization 1

Guillaume Aubrun and Ion Nechita

Université de Lyon, Université Lyon 1, CNRS, UMR 5208 Institut Camille Jordan, Bâtiment du Doyen Jean Braconnier, 43, boulevard du 11 novembre 1918, F-69622 Villeurbanne Cedex, France. E-mail: aubrun@math.univ-lyon1.fr; nechita@math.univ-lyon1.fr

Received 24 September 2007; revised 26 March 2008; accepted 4 April 2008

Abstract. We study how iterated convolutions of probability measures compare under stochastic domination. We give necessary and sufficient conditions for the existence of an integer $n$ such that $\mu^{*n}$ is stochastically dominated by $\nu^{*n}$, for two given probability measures $\mu$ and $\nu$. As a consequence we obtain a similar theorem on the majorization order for vectors in $\mathbb{R}^d$. In particular we prove results about catalysis in quantum information theory.

Résumé. We study how iterated convolutions of probability measures compare under stochastic domination. We give necessary and sufficient conditions for the existence of an integer $n$ such that $\mu^{*n}$ is stochastically dominated by $\nu^{*n}$, for two given probability measures $\mu$ and $\nu$. As a corollary we obtain a similar theorem for vectors of $\mathbb{R}^d$ and the majorization relation. More specifically, we prove results about catalysis in quantum information theory.

MSC: Primary 60E15; secondary 94A05

Keywords: Stochastic domination; Iterated convolutions; Large deviations; Majorization; Catalysis

Introduction and notations

This work is a continuation of [1], where we study the phenomenon of catalytic majorization in quantum information theory. A probabilistic approach to this question involves stochastic domination, which we introduce in Section 1, and its behavior with respect to the convolution of measures. We give in Section 2 a condition on measures $\mu$ and $\nu$ for the existence of an integer $n$ such that $\mu^{*n}$ is stochastically dominated by $\nu^{*n}$. We gather further topological and geometrical aspects in Section 3. Finally, we apply these results to our original problem of catalytic majorization. In Section 4 we introduce the background for quantum catalytic majorization and we state our results. Section 5 contains the proofs, and in Section 6 we consider an infinite-dimensional version of catalysis.

We introduce now some notation and recall basic facts about probability measures. We write $\mathcal{P}(\mathbb{R})$ for the set of probability measures on $\mathbb{R}$. We denote by $\delta_x$ the Dirac mass at the point $x$. If $\mu \in \mathcal{P}(\mathbb{R})$, we write $\operatorname{supp}\mu$ for the support of $\mu$. We write respectively $\min\mu \in [-\infty, +\infty)$ and $\max\mu \in (-\infty, +\infty]$ for $\min \operatorname{supp}\mu$ and $\max \operatorname{supp}\mu$. We also write $\mu(a,b)$ and $\mu[a,b]$ as shortcuts for $\mu((a,b))$ and $\mu([a,b])$. The convolution of two measures $\mu$ and $\nu$ is denoted $\mu * \nu$. Recall that if $X$ and $Y$ are independent random variables with respective laws $\mu$ and $\nu$, the law of $X + Y$ is given by $\mu * \nu$. The results of this paper are stated for convolutions of measures; they admit immediate translations into the language of sums of independent random variables. For $\lambda \in \mathbb{R}$, the function $e_\lambda$ is defined by $e_\lambda(x) = \exp(\lambda x)$.

1Research was supported in part by the European Network Phenomena in High Dimensions, FP6 Marie Curie Actions, MCRN-511953.


1. Stochastic domination

A natural way of comparing two probability measures is given by the following relation.

Definition 1.1. Let $\mu$ and $\nu$ be two probability measures on the real line. We say that $\mu$ is stochastically dominated by $\nu$, and we write $\mu \leq_{st} \nu$, if

$$\forall t \in \mathbb{R}, \quad \mu[t, \infty) \leq \nu[t, \infty). \qquad (1)$$

Stochastic domination is an order relation on $\mathcal{P}(\mathbb{R})$ (in particular, $\mu \leq_{st} \nu$ and $\nu \leq_{st} \mu$ imply $\mu = \nu$). The following result [9,16] provides useful characterizations of stochastic domination.

Theorem. Let $\mu$ and $\nu$ be probability measures on the real line. The following are equivalent:

(1) $\mu \leq_{st} \nu$.

(2) Sample path characterization. There exist a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and two random variables $X$ and $Y$ on $\Omega$ with respective laws $\mu$ and $\nu$, so that
$$\forall \omega \in \Omega, \quad X(\omega) \leq Y(\omega).$$

(3) Functional characterization. For any increasing function $f : \mathbb{R} \to \mathbb{R}$ such that both integrals exist,
$$\int f \,\mathrm{d}\mu \leq \int f \,\mathrm{d}\nu.$$
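For finitely supported measures, both condition (1) and the functional characterization are easy to test numerically: it suffices to compare the survival functions at the atoms. The following sketch is ours, not part of the paper; the dictionary representation of measures and all helper names are arbitrary choices.

```python
from itertools import chain

def survival(measure, t):
    """Mass of [t, +infinity) for a finitely supported measure {atom: weight}."""
    return sum(w for x, w in measure.items() if x >= t)

def dominates(mu, nu, tol=1e-12):
    """Check mu <=_st nu, i.e. mu[t, inf) <= nu[t, inf) for every t.
    For finitely supported measures it is enough to test t at the atoms."""
    atoms = set(chain(mu, nu))
    return all(survival(mu, t) <= survival(nu, t) + tol for t in atoms)

# The measures of Example 1.4 below: mu = 0.4 delta_0 + 0.6 delta_2, nu = 0.8 delta_1 + 0.2 delta_3.
mu = {0.0: 0.4, 2.0: 0.6}
nu = {1.0: 0.8, 3.0: 0.2}
print(dominates(mu, nu))   # False: mu[2, inf) = 0.6 > nu[2, inf) = 0.2

# Functional characterization with one increasing f (here f(x) = x): necessary, not sufficient.
mean = lambda m: sum(x * w for x, w in m.items())
print(mean(mu), mean(nu))  # 1.2 <= 1.4, yet mu is not dominated by nu
```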

It is easily checked that stochastic domination is well behaved with respect to convolution.

Lemma 1.2. Let $\mu_1, \mu_2, \nu_1, \nu_2$ be probability measures on the real line. If $\mu_1 \leq_{st} \nu_1$ and $\mu_2 \leq_{st} \nu_2$, then $\mu_1 * \mu_2 \leq_{st} \nu_1 * \nu_2$.

Lemma 1.3. Let $\mu$ and $\nu$ be two probability measures on the real line such that $\mu \leq_{st} \nu$. Then, for all $n \geq 2$, $\mu^{*n} \leq_{st} \nu^{*n}$.

For fixed $\mu$ and $\nu$, it follows from Lemma 1.2 that the set of integers $k$ such that $\mu^{*k} \leq_{st} \nu^{*k}$ is stable under addition.

In general, $\mu^{*n} \leq_{st} \nu^{*n}$ does not imply $\mu^{*(n+1)} \leq_{st} \nu^{*(n+1)}$. Here is a typical example:

Example 1.4. Let $\mu$ and $\nu$ be the probability measures defined as
$$\mu = 0.4\,\delta_0 + 0.6\,\delta_2, \qquad \nu = 0.8\,\delta_1 + 0.2\,\delta_3.$$
It is straightforward to verify (see Fig. 1) that:

• For $k = 2$, and therefore for all even $k$, we have $\mu^{*k} \leq_{st} \nu^{*k}$.

• For $k$ odd, we have $\mu^{*k} \leq_{st} \nu^{*k}$ only for $k \geq 9$.
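The behavior described in Example 1.4 can be reproduced numerically. The sketch below is ours (not part of the paper); it computes the iterated convolutions $\mu^{*k}$ and $\nu^{*k}$ as dictionaries of atoms and tests stochastic domination for small $k$.

```python
from itertools import chain

def convolve(a, b):
    """Convolution of two finitely supported measures given as {atom: weight} dicts."""
    out = {}
    for x, p in a.items():
        for y, q in b.items():
            out[x + y] = out.get(x + y, 0.0) + p * q
    return out

def power(a, k):
    """k-fold convolution a^{*k}, with a^{*0} = delta_0."""
    out = {0.0: 1.0}
    for _ in range(k):
        out = convolve(out, a)
    return out

def dominates(a, b, tol=1e-12):
    surv = lambda m, t: sum(w for x, w in m.items() if x >= t)
    return all(surv(a, t) <= surv(b, t) + tol for t in set(chain(a, b)))

mu = {0.0: 0.4, 2.0: 0.6}
nu = {1.0: 0.8, 3.0: 0.2}
for k in range(1, 12):
    print(k, dominates(power(mu, k), power(nu, k)))
# Expected pattern from Example 1.4: True for every even k, and for odd k only from k = 9 on.
```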

Other examples show that the minimal $n$ such that $\mu^{*n} \leq_{st} \nu^{*n}$ can be arbitrarily large. This is the content of the next proposition.

Proposition 1.5. For every integer $n$, there exist compactly supported probability measures $\mu$ and $\nu$ such that $\mu^{*n} \leq_{st} \nu^{*n}$ and, for all $1 \leq k \leq n-1$, $\mu^{*k} \not\leq_{st} \nu^{*k}$.


Fig. 1. Cumulative distribution functions of $\mu^{*k}$ (solid line) and $\nu^{*k}$ (dotted line) from Example 1.4, for $k = 1, 2, 3, 9$.

Proof. Let $\mu = \varepsilon\,\delta_{-2n} + (1-\varepsilon)\,\delta_1$ and let $\nu$ be the uniform measure on $[0,2]$, where $0 < \varepsilon < 1$ will be chosen later. For $k \geq 1$,
$$\mu^{*k} = \sum_{i=0}^{k} \binom{k}{i} (1-\varepsilon)^{i} \varepsilon^{k-i}\, \delta_{i - 2n(k-i)}.$$
Note that $\operatorname{supp}(\nu^{*k}) \subset \mathbb{R}_+$, while for $1 \leq k \leq n$, the only part of $\mu^{*k}$ charging $\mathbb{R}_+$ is the Dirac mass at the point $k$. This implies that
$$\mu^{*k} \leq_{st} \nu^{*k} \iff \mu^{*k}[k, +\infty) \leq \nu^{*k}[k, +\infty).$$
We have $\mu^{*k}[k, +\infty) = (1-\varepsilon)^{k}$ and $\nu^{*k}[k, +\infty) = 1/2$. It remains to choose $\varepsilon$ so that $(1-\varepsilon)^{n} < 1/2 < (1-\varepsilon)^{n-1}$.

2. Stochastic domination for iterated convolutions and Cramér's theorem

In light of previous examples, we are going to study the following extension of stochastic domination:

Definition 2.1. We define a relation $\leq_{st}^{*}$ on $\mathcal{P}(\mathbb{R})$ as follows:
$$\mu \leq_{st}^{*} \nu \iff \exists\, n \geq 1 \text{ s.t. } \mu^{*n} \leq_{st} \nu^{*n}.$$

It turns out that, when defined on $\mathcal{P}(\mathbb{R})$, this relation is not an order relation, because of pathological, poorly integrable measures. Indeed, there exist two probability measures $\mu$ and $\nu$ such that $\mu \neq \nu$ and $\mu * \mu = \nu * \nu$ (see [7], p. 479).

Therefore, the relation $\leq_{st}^{*}$ is not anti-symmetric. For this reason, we restrict ourselves to sufficiently integrable measures (however, most of what follows generalizes to wider classes of measures). This is quite usual when studying orderings of probability measures; see [16] for examples of such situations.

Definition 2.2. A measure $\mu$ on $\mathbb{R}$ is said to be exponentially integrable if $\int e_\lambda \,\mathrm{d}\mu < +\infty$ for all $\lambda \in \mathbb{R}$ [recall that $e_\lambda(x) = \exp(\lambda x)$]. We write $\mathcal{P}_{\exp}(\mathbb{R})$ for the set of exponentially integrable probability measures.

Notice that the space of exponentially integrable measures is stable under convolution.

Proposition 2.3. When restricted to $\mathcal{P}_{\exp}(\mathbb{R})$, the relation $\leq_{st}^{*}$ is a partial order.

Proof. One has to check only the antisymmetry property, the other two being obvious. Let $k$ and $l$ be two integers such that $\mu^{*k} \leq_{st} \nu^{*k}$ and $\nu^{*l} \leq_{st} \mu^{*l}$. Then $\mu^{*kl} \leq_{st} \nu^{*kl} \leq_{st} \mu^{*kl}$ and therefore $\mu^{*kl} = \nu^{*kl}$. But if $\mu$ and $\nu$ are exponentially integrable, this implies that $\mu = \nu$. One can see this in the following way: if we denote the moments of $\mu$ by $m_p(\mu) = \int x^{p} \,\mathrm{d}\mu(x)$, one checks by induction on $p$ that $m_p(\mu) = m_p(\nu)$ for all $p \in \mathbb{N}$. On the other hand, exponential integrability implies that $m_{2p}(\mu)^{1/2p} \leq Cp$ for some constant $C$, so that Carleman's condition is satisfied (see [7], p. 224). Therefore $\mu$ is determined by its moments, and $\mu = \nu$.

We would like to give a description of the relation $\leq_{st}^{*}$, for example one similar to the functional characterization of $\leq_{st}$. We start with the following lemma.

Lemma 2.4. Let $\mu, \nu \in \mathcal{P}_{\exp}(\mathbb{R})$ be such that $\mu \leq_{st}^{*} \nu$. Then the following inequalities hold:

(a) $\forall \lambda > 0$, $\int e_\lambda \,\mathrm{d}\mu \leq \int e_\lambda \,\mathrm{d}\nu$;

(b) $\forall \lambda < 0$, $\int e_\lambda \,\mathrm{d}\mu \geq \int e_\lambda \,\mathrm{d}\nu$;

(c) $\int x \,\mathrm{d}\mu(x) \leq \int x \,\mathrm{d}\nu(x)$;

(d) $\min\mu \leq \min\nu$;

(e) $\max\mu \leq \max\nu$.

Proof. Let $\mu \leq_{st}^{*} \nu$ and $\lambda > 0$. Since $\mu^{*n} \leq_{st} \nu^{*n}$ for some $n$, we get from the functional characterization of $\leq_{st}$ that
$$\int e_\lambda \,\mathrm{d}\mu^{*n} \leq \int e_\lambda \,\mathrm{d}\nu^{*n}.$$
It remains to notice that
$$\int e_\lambda \,\mathrm{d}\mu^{*n} = \left( \int e_\lambda \,\mathrm{d}\mu \right)^{\! n},$$
and we get (a). The proof of (b) is completely symmetric, while (c) also follows from the functional characterization. Conditions (d) and (e) are obvious since $\min(\mu^{*n}) = n \min(\mu)$ and $\max(\mu^{*n}) = n \max(\mu)$.

The following proposition shows that the necessary conditions of Lemma 2.4 are “almost sufficient.”

Proposition 2.5. Let $\mu, \nu \in \mathcal{P}_{\exp}(\mathbb{R})$. Assume that the following inequalities hold:

(a) $\forall \lambda > 0$, $\int e_\lambda \,\mathrm{d}\mu < \int e_\lambda \,\mathrm{d}\nu$;

(b) $\forall \lambda < 0$, $\int e_\lambda \,\mathrm{d}\nu < \int e_\lambda \,\mathrm{d}\mu$;

(c) $\int x \,\mathrm{d}\mu(x) < \int x \,\mathrm{d}\nu(x)$;

(d) $\max\mu < \max\nu$;

(e) $\min\mu < \min\nu$.

Then $\mu \leq_{st}^{*} \nu$, and more precisely there exists an integer $N \in \mathbb{N}$ such that for any $n \geq N$, $\mu^{*n} \leq_{st} \nu^{*n}$.

We give in Proposition 3.6 a counter-example showing that Proposition 2.5 does not remain true when the strict inequalities are replaced by non-strict ones.

We are going to use Cramér's theorem on large deviations. The cumulant generating function $\Lambda_\mu$ of the probability measure $\mu$ is defined for any $\lambda \in \mathbb{R}$ by
$$\Lambda_\mu(\lambda) = \log \int e_\lambda \,\mathrm{d}\mu.$$
It is a convex function taking values in $\mathbb{R}$. Its convex conjugate $\Lambda_\mu^{*}$, sometimes called the Cramér transform, is defined as
$$\Lambda_\mu^{*}(t) = \sup_{\lambda \in \mathbb{R}} \bigl( \lambda t - \Lambda_\mu(\lambda) \bigr).$$
Note that $\Lambda_\mu^{*} : \mathbb{R} \to [0, +\infty]$ is a smooth convex function which takes the value $+\infty$ on $\mathbb{R} \setminus [\min\mu, \max\mu]$. Moreover, for $t \in (\min\mu, \max\mu)$, the supremum in the definition of $\Lambda_\mu^{*}(t)$ is attained at a unique point $\lambda_t$; furthermore, $\lambda_t > 0$ if $t > \int x \,\mathrm{d}\mu(x)$ and $\lambda_t < 0$ if $t < \int x \,\mathrm{d}\mu(x)$. Also, $\Lambda_\mu^{*}\bigl(\int x \,\mathrm{d}\mu(x)\bigr) = 0$, since $\Lambda_\mu'(0) = \int x \,\mathrm{d}\mu(x)$. We now state Cramér's theorem; it can be equivalently stated in the language of sums of i.i.d. random variables [5,9].

Theorem (Cramér's theorem). Let $\mu \in \mathcal{P}_{\exp}(\mathbb{R})$. Then for any $t \in \mathbb{R}$,
$$\lim_{n \to \infty} \frac{1}{n} \log \mu^{*n}[tn, +\infty) =
\begin{cases} 0 & \text{if } t \leq \int x \,\mathrm{d}\mu(x), \\ -\Lambda_\mu^{*}(t) & \text{otherwise}, \end{cases} \qquad (2)$$
$$\lim_{n \to \infty} \frac{1}{n} \log \bigl( 1 - \mu^{*n}(tn, +\infty) \bigr) =
\begin{cases} 0 & \text{if } t \geq \int x \,\mathrm{d}\mu(x), \\ -\Lambda_\mu^{*}(t) & \text{otherwise}. \end{cases} \qquad (3)$$
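As a rough numerical illustration (ours, not from the paper), the sketch below approximates $\Lambda_\mu$ and its Cramér transform on a grid and compares the empirical decay rate of $\mu^{*n}[tn, +\infty)$ with $\Lambda_\mu^{*}(t)$ for a two-atom measure. All helper names, the grid bounds and the choice of $t$ and $n$ are arbitrary.

```python
import math

mu = {0.0: 0.4, 2.0: 0.6}   # a two-atom measure with mean 1.2

def cgf(measure, lam):
    """Cumulant generating function Lambda_mu(lambda) = log int e_lambda d mu."""
    return math.log(sum(w * math.exp(lam * x) for x, w in measure.items()))

def cramer_transform(measure, t, lam_max=60.0, steps=20001):
    """Crude grid approximation of Lambda_mu^*(t) = sup_lambda (lambda * t - Lambda_mu(lambda))."""
    grid = (-lam_max + 2 * lam_max * i / (steps - 1) for i in range(steps))
    return max(lam * t - cgf(measure, lam) for lam in grid)

def convolution_power(measure, n):
    out = {0.0: 1.0}
    for _ in range(n):
        nxt = {}
        for x, p in out.items():
            for y, q in measure.items():
                nxt[x + y] = nxt.get(x + y, 0.0) + p * q
        out = nxt
    return out

t, n = 1.5, 200                 # t above the mean, so the upper tail decays exponentially
tail = sum(w for x, w in convolution_power(mu, n).items() if x >= t * n)
print(-math.log(tail) / n)      # empirical rate, about 0.06 here
print(cramer_transform(mu, t))  # Lambda_mu^*(1.5), about 0.05; agreement improves as n grows (Eq. (2))
```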

Proof of Proposition 2.5. Note that the hypotheses imply that the quantities $\max\mu$ and $\min\nu$ are finite. We also write $M_\mu = \int x \,\mathrm{d}\mu(x)$ and $M_\nu = \int x \,\mathrm{d}\nu(x)$. For $n \geq 1$, define $(f_n)$ and $(g_n)$ by
$$f_n(t) = \mu^{*n}[tn, +\infty), \qquad g_n(t) = \nu^{*n}[tn, +\infty).$$
We need to prove that $f_n \leq g_n$ on $\mathbb{R}$ for $n$ large enough. If $t > \max\mu$, the inequality is trivial since $f_n(t) = 0$. Similarly, if $t < \min\nu$ we have $g_n(t) = 1$ and there is nothing to prove.

Fix a real number $t_0$ such that $M_\mu < t_0 < M_\nu$. We first work on the interval $I = [t_0, \max\mu]$. By Cramér's theorem, the sequences $(f_n^{1/n})$ and $(g_n^{1/n})$ converge on $I$ respectively toward $f$ and $g$ defined by
$$f(t) = \exp\bigl(-\Lambda_\mu^{*}(t)\bigr), \qquad
g(t) = \begin{cases} 1 & \text{if } t_0 \leq t \leq M_\nu, \\ \exp\bigl(-\Lambda_\nu^{*}(t)\bigr) & \text{if } M_\nu \leq t \leq \max\mu. \end{cases}$$
Note that $f$ and $g$ are continuous on $I$. We claim also that $f < g$ on $I$. The inequality is clear on $[t_0, M_\nu]$ since $f < 1$. If $t \in (M_\nu, \max\mu]$, note that the supremum in the definition of $\Lambda_\nu^{*}(t)$ is attained for some $\lambda > 0$ (to show this we use hypothesis (d)). Using (a) and the definition of the convex conjugate, this implies that $\Lambda_\nu^{*}(t) < \Lambda_\mu^{*}(t)$, and hence $f < g$. We now use the following elementary fact: if a sequence of non-increasing functions defined on a compact interval $I$ converges pointwise toward a continuous limit, then the convergence is actually uniform on $I$ (for a proof see [15], Part 2, Problem 127; this statement is attributed to Pólya or to Dini depending on the author). We apply this result to both $(f_n^{1/n})$ and $(g_n^{1/n})$; since $f < g$, uniform convergence implies that for $n$ large enough, $f_n^{1/n} < g_n^{1/n}$ on $I$, and thus $f_n \leq g_n$.

Finally, we apply a similar argument on the interval $J = [\min\nu, t_0]$, except that we consider the sequences $(1 - f_n)^{1/n}$ and $(1 - g_n)^{1/n}$, and we use (3) to compute the limit. We omit the details since the argument is totally symmetric.

We eventually showed that for $n$ large enough, $f_n \leq g_n$ on $I \cup J$, and thus on $\mathbb{R}$. This is exactly the conclusion of the proposition.

3. Geometry and topology of $\leq_{st}^{*}$

We investigate here the topology of the relation $\leq_{st}^{*}$. We first need to define an adequate topology on $\mathcal{P}_{\exp}(\mathbb{R})$. This space can be topologized in several ways, an important point for us being that the maps $\mu \mapsto \int e_\lambda \,\mathrm{d}\mu$ should be continuous.

Definition 3.1. A function $f : \mathbb{R} \to \mathbb{R}$ is said to be subexponential if there exist constants $c, C$ such that for every $x \in \mathbb{R}$,
$$|f(x)| \leq C \exp\bigl(c|x|\bigr).$$


Definition 3.2. Let $\tau$ be the topology on the space of exponentially integrable measures generated by the family of seminorms $(N_f)$,
$$N_f(\mu) = \left| \int f \,\mathrm{d}\mu \right|,$$
where $f$ ranges over the class of continuous subexponential functions.

The topology $\tau$ is a locally convex vector space topology. It can be shown that the relation $\leq_{st}^{*}$ is not $\tau$-closed (see Proposition 3.6). However, we can give a functional characterization of its closure. This is the content of the following theorem.

Theorem 3.3. Let $\mathcal{R} \subset \mathcal{P}_{\exp}(\mathbb{R})^2$ be the set of couples $(\mu, \nu)$ of exponentially integrable probability measures such that $\mu \leq_{st}^{*} \nu$. Then
$$\overline{\mathcal{R}} = \Bigl\{ (\mu, \nu) \in \mathcal{P}_{\exp}(\mathbb{R})^2 \text{ s.t. } \forall \lambda \geq 0, \int e_\lambda \,\mathrm{d}\mu \leq \int e_\lambda \,\mathrm{d}\nu \text{ and } \forall \lambda \leq 0, \int e_\lambda \,\mathrm{d}\mu \geq \int e_\lambda \,\mathrm{d}\nu \Bigr\}, \qquad (4)$$
the closure being taken with respect to the topology $\tau$.

Proof. Let us write $X$ for the set on the right-hand side of (4). We get from Lemma 2.4 that $\mathcal{R} \subset X$. Moreover, it is easily checked that $X$ is $\tau$-closed, therefore $\overline{\mathcal{R}} \subset X$. Conversely, we are going to show that the set of couples $(\mu, \nu)$ satisfying the hypotheses of Proposition 2.5 is $\tau$-dense in $X$. Let $(\mu, \nu) \in X$. We get from the inequalities satisfied by $\mu$ and $\nu$ that:

• $\int x \,\mathrm{d}\mu(x) \leq \int x \,\mathrm{d}\nu(x)$ (taking derivatives at $\lambda = 0$),

• $\min\mu \leq \min\nu$ (taking $\lambda \to -\infty$),

• $\max\mu \leq \max\nu$ (taking $\lambda \to +\infty$).

We want to define two sequences $(\mu_n, \nu_n)$ which $\tau$-converge toward $(\mu, \nu)$, with $\mu_n \leq_{st} \mu$ and $\nu \leq_{st} \nu_n$, and for which the above inequalities become strict. Assume for example that $\max\mu = \max\nu = +\infty$ and $\min\mu = \min\nu = -\infty$. Then we can define $\mu_n$ and $\nu_n$ as follows: let $\varepsilon_n = \mu[n, +\infty)$ and $\eta_n = \nu(-\infty, -n]$, and set
$$\mu_n = \mu|_{(-\infty, n)} + \varepsilon_n \delta_n, \qquad \nu_n = \nu|_{(-n, +\infty)} + \eta_n \delta_{-n}.$$
We check using dominated convergence that $\lim \mu_n = \mu$ and $\lim \nu_n = \nu$ with respect to $\tau$, while by Proposition 2.5 we have $\mu_n \leq_{st}^{*} \nu_n$. The other cases are treated in a similar way: we can always play with small Dirac masses to make all inequalities strict (for example, if $\max\mu = \max\nu = M < +\infty$, replace $\nu$ by $(1-\varepsilon)\nu + \varepsilon\,\delta_{M+1}$, and so on).

A more convenient way of describing the relation $\leq_{st}^{*}$ is given by the following sets:

Definition 3.4. Let $\nu \in \mathcal{P}_{\exp}(\mathbb{R})$. We define $D(\nu)$ to be the following set:
$$D(\nu) = \bigl\{ \mu \in \mathcal{P}_{\exp}(\mathbb{R}) \text{ s.t. } \mu \leq_{st}^{*} \nu \bigr\}.$$

Using the ideas in the proof of Theorem 3.3, it can easily be shown that for $\nu \in \mathcal{P}_{\exp}(\mathbb{R})$ such that $\min\nu > -\infty$, one has
$$\overline{D(\nu)} = \Bigl\{ \mu \in \mathcal{P}_{\exp}(\mathbb{R}) \text{ s.t. } \forall \lambda \geq 0, \int e_\lambda \,\mathrm{d}\mu \leq \int e_\lambda \,\mathrm{d}\nu \text{ and } \forall \lambda \leq 0, \int e_\lambda \,\mathrm{d}\mu \geq \int e_\lambda \,\mathrm{d}\nu \Bigr\}, \qquad (5)$$
where the closure is taken in the topology $\tau$. However, for measures $\nu$ with $\min\nu = -\infty$, condition (e) of Proposition 2.5 is violated and we do not know whether relation (5) holds.


Another consequence of Eq. (5) is that the $\tau$-closure of $D(\nu)$ is a convex set. It is not clear that the set $D(\nu)$ itself is convex. We shall see in Proposition 3.7 that this is not the case in general for measures $\nu \notin \mathcal{P}_{\exp}(\mathbb{R})$. Note also that for fixed $\nu \in \mathcal{P}(\mathbb{R})$, the set $\{ \mu \in \mathcal{P}(\mathbb{R}) \text{ s.t. } \mu \leq_{st} \nu \}$ is easily checked to be convex.

Remark 3.5. One can analogously define, for $\mu \in \mathcal{P}_{\exp}(\mathbb{R})$, the "dual" set
$$E(\mu) = \bigl\{ \nu \in \mathcal{P}_{\exp}(\mathbb{R}) \text{ s.t. } \mu \leq_{st}^{*} \nu \bigr\}.$$
Results about $D(\nu)$ or $E(\mu)$ are equivalent. Indeed, let $\check\mu$ be the measure defined for a Borel set $B$ by $\check\mu(B) = \mu(-B)$. We have $\mu \leq_{st}^{*} \nu \iff \check\nu \leq_{st}^{*} \check\mu$, and therefore $E(\mu) = \{ \nu \text{ s.t. } \check\nu \in D(\check\mu) \}$.

We now give an example showing that the relation $\leq_{st}^{*}$ is not $\tau$-closed.

Proposition 3.6. There exists a probability measure $\nu \in \mathcal{P}_{\exp}(\mathbb{R})$ such that the set $D(\nu)$ is not $\tau$-closed. Consequently, the set $\mathcal{R}$ appearing in (4) is not closed either.

Proof. Let us start with a simplified sketch of the proof. By the examples of Section 1, for each positive integer $k$ one can find probability measures $\mu_k$ and $\nu_k$ such that $\mu_k \in D(\nu_k)$, while $\mu_k^{*k} \not\leq_{st} \nu_k^{*k}$. We sum properly rescaled and normalized versions of these measures in order to obtain two probability measures $\mu$ and $\nu$ such that $\mu \notin D(\nu)$. However, successive approximations $\tilde\mu_n$ of $\mu$ are shown to satisfy $\tilde\mu_n \leq_{st}^{*} \nu$, which implies $\mu \in \overline{D(\nu)}$ and thus $D(\nu) \neq \overline{D(\nu)}$.

We now work out the details. For $k \geq 1$, let $a_k = (k+2)!$, $b_k = (k+2)! + 1$ and $\gamma_k = c \exp(-k^{k})$, where the constant $c$ is chosen so that $\sum \gamma_k = 1$. We check that $(a_k)$ and $(b_k)$ satisfy the following inequalities:
$$(k-1) b_k + b_{k-1} < k a_k, \qquad (6)$$
$$k b_k < a_{k+1}. \qquad (7)$$

It follows from Proposition 1.5 that for each $k \in \mathbb{N}$ there exist $\mu_k$ and $\nu_k$, probability measures with compact support, such that $\mu_k \in D(\nu_k)$ while $\mu_k^{*k} \not\leq_{st} \nu_k^{*k}$. Moreover, we can assume that $\operatorname{supp}(\mu_k) \subset (a_k, b_k)$ and $\operatorname{supp}(\nu_k) \subset (a_k, b_k)$. Indeed, we can apply to both measures a suitable affine transformation (increasing affine transformations preserve stochastic domination and are compatible with convolution). We now define $\mu$ and $\nu$ as
$$\mu = \sum_{k=1}^{\infty} \gamma_k \mu_k \quad\text{and}\quad \nu = \sum_{k=1}^{\infty} \gamma_k \nu_k.$$

Note that the sequence $(\gamma_k)$ has been chosen to tend very quickly to $0$ in order to ensure that $\mu$ and $\nu$ are exponentially integrable. We also introduce the following sequences of measures:
$$\tilde\mu_n = \sum_{k=1}^{n} \gamma_k \mu_k + \Bigl( \sum_{k=n+1}^{\infty} \gamma_k \Bigr) \delta_0, \qquad
\tilde\nu_n = \sum_{k=1}^{n} \gamma_k \nu_k + \Bigl( \sum_{k=n+1}^{\infty} \gamma_k \Bigr) \delta_0.$$
One checks using Lebesgue's dominated convergence theorem that the sequences $(\tilde\mu_n)$ and $(\tilde\nu_n)$ converge respectively toward $\mu$ and $\nu$ for the topology $\tau$. Note also that these sequences are increasing with respect to stochastic domination, so that $\tilde\nu_n \leq_{st} \nu$. For fixed $k$, $\mu_k$ and $\nu_k$ satisfy the hypotheses of Proposition 2.5, and thus the same holds for $\tilde\mu_n$ and $\tilde\nu_n$. Therefore $\tilde\mu_n \in D(\tilde\nu_n) \subset D(\nu)$. This proves that $\mu \in \overline{D(\nu)}$.

We now prove by contradiction that $\mu \notin D(\nu)$. Assume that $\mu \in D(\nu)$, i.e., $\mu^{*k} \leq_{st} \nu^{*k}$ for some $k \geq 1$. Let $s_k = k a_k$ and $t_k = k b_k$. Fix a sequence $i_1, \ldots, i_k$ of non-zero integers. Set $m = \mu_{i_1} * \cdots * \mu_{i_k}$ or $m = \nu_{i_1} * \cdots * \nu_{i_k}$. We know that $\operatorname{supp}(m) \subset (a, b)$, with $a = \sum_{j=1}^{k} a_{i_j}$ and $b = \sum_{j=1}^{k} b_{i_j}$. It is possible to locate precisely $\operatorname{supp}(m)$ using the inequalities (6) and (7).


(a) If $i_j > k$ for some $j$, then $a \geq a_{k+1} > t_k$ and therefore $\operatorname{supp}(m) \subset (t_k, +\infty)$.

(b) If $i_j = k$ for all $j$, then $a = s_k$ and $b = t_k$ and therefore $\operatorname{supp}(m) \subset (s_k, t_k)$.

(c) If $i_j \leq k$ for all $j$ and $i_{j_0} < k$ for some $j_0$, then $b \leq b_{k-1} + (k-1) b_k < s_k$ and therefore $\operatorname{supp}(m) \subset [0, s_k)$.

Consequently,
$$\mu^{*k}[t_k, +\infty) = \sum_{i_1, \ldots, i_k} \gamma_{i_1} \cdots \gamma_{i_k}\, \mu_{i_1} * \cdots * \mu_{i_k}[t_k, +\infty) = \sum_{i_1, \ldots, i_k \text{ satisfying (a)}} \gamma_{i_1} \cdots \gamma_{i_k} = \nu^{*k}[t_k, +\infty).$$
Moreover, because of (b) and (c), we get that for $s_k \leq t \leq t_k$,
$$\mu^{*k}[t, t_k) = \gamma_k^{k}\, \mu_k^{*k}[t, t_k) = \gamma_k^{k}\, \mu_k^{*k}[t, +\infty)$$
and similarly
$$\nu^{*k}[t, t_k) = \gamma_k^{k}\, \nu_k^{*k}[t, +\infty).$$
We assumed that $\mu^{*k} \leq_{st} \nu^{*k}$, i.e., $\mu^{*k}[t, +\infty) \leq \nu^{*k}[t, +\infty)$ for all $t$. If $t \leq t_k$, since $\mu^{*k}[t_k, +\infty) = \nu^{*k}[t_k, +\infty)$, we get that $\mu^{*k}[t, t_k) \leq \nu^{*k}[t, t_k)$. Since $\gamma_k > 0$, this implies that for all $t \geq s_k$, $\mu_k^{*k}[t, +\infty) \leq \nu_k^{*k}[t, +\infty)$; as both tails equal $1$ for $t < s_k$, this means $\mu_k^{*k} \leq_{st} \nu_k^{*k}$. This contradicts the fact that $\mu_k^{*k} \not\leq_{st} \nu_k^{*k}$. Therefore $\mu \in \overline{D(\nu)} \setminus D(\nu)$, and so $D(\nu)$ is not closed.

We now give an example of what can happen if we consider measures with poor integrability properties.

Proposition 3.7. There exists a probability measure $\nu \in \mathcal{P}(\mathbb{R})$ such that the set
$$\bigl\{ \mu \in \mathcal{P}(\mathbb{R}) \text{ s.t. } \mu \leq_{st}^{*} \nu \bigr\} \qquad (8)$$
is not convex.

The difference between Eq. (8) and our definition of $D(\nu)$ is that here we do not suppose the measures to be exponentially integrable.

Proof of Proposition 3.7. We rely on the following fact, which we already alluded to (see [7], p. 479): there exist two distinct real characteristic functions $\varphi_1$ and $\varphi_2$ such that $\varphi_1^2 = \varphi_2^2$ identically. Consider now the measures $\mu$ and $\nu$ with respective characteristic functions $\varphi_1$ and $\varphi_2$, i.e., $\varphi_1(t) = \int e^{itx} \,\mathrm{d}\mu(x)$ and $\varphi_2(t) = \int e^{itx} \,\mathrm{d}\nu(x)$. Obviously, we have $\nu \leq_{st}^{*} \nu$ and $\mu \leq_{st}^{*} \nu$, since $\mu^{*2} = \nu^{*2}$. Let $\chi = \frac{1}{2}\mu + \frac{1}{2}\nu$ and let us show that $\chi \not\leq_{st}^{*} \nu$. We have
$$\chi^{*2n} = \frac{1}{2^{2n}} \sum_{i=0}^{2n} \binom{2n}{i}\, \mu^{*i} * \nu^{*(2n-i)}
= \frac{1}{2^{2n}} \Bigl( \sum_{i \text{ even}} \binom{2n}{i}\, \nu^{*2n} + \sum_{i \text{ odd}} \binom{2n}{i}\, \nu^{*(2n-1)} * \mu \Bigr).$$
Thus $\chi^{*2n} \leq_{st} \nu^{*2n}$ is equivalent to $\nu^{*(2n-1)} * \mu \leq_{st} \nu^{*2n}$. Let us show that this is impossible. Indeed, the measures $\nu^{*(2n-1)} * \mu$ and $\nu^{*2n}$ have real characteristic functions and thus they are symmetric probability measures. Note however that two symmetric probability distributions cannot be compared with $\leq_{st}$ unless they are equal. But it cannot be that $\nu^{*(2n-1)} * \mu = \nu^{*2n}$, because their characteristic functions $\varphi_2^{2n-1}\varphi_1$ and $\varphi_2^{2n}$ are different (at any point $\xi$ where $\varphi_1(\xi) \neq \varphi_2(\xi)$, one has $\varphi_1(\xi) = -\varphi_2(\xi) \neq 0$). A similar argument shows that $\chi^{*(2n+1)} \not\leq_{st} \nu^{*(2n+1)}$.

We conclude this section with a few remarks on a relation which is very similar to $\leq_{st}^{*}$. It is the analogue of catalytic majorization in quantum information theory (see Section 4).

Definition 3.8. Let $\mu, \nu \in \mathcal{P}_{\exp}(\mathbb{R})$. We say that $\mu$ is catalytically stochastically dominated by $\nu$, and we write $\mu \leq_{st}^{C} \nu$, if there exists a probability measure $\pi \in \mathcal{P}_{\exp}(\mathbb{R})$ such that $\mu * \pi \leq_{st} \nu * \pi$.

The following lemma shows a connection between the two relations.


Lemma 3.9. Let $\mu, \nu \in \mathcal{P}_{\exp}(\mathbb{R})$. Assume $\mu \leq_{st}^{*} \nu$. Then $\mu \leq_{st}^{C} \nu$.

Proof. Assume that $\mu^{*n} \leq_{st} \nu^{*n}$ for some $n$. Let $\pi$ be the probability measure defined by
$$\pi = \frac{1}{n} \sum_{k=0}^{n-1} \mu^{*k} * \nu^{*(n-1-k)}.$$
Let also $\rho$ be the measure defined by
$$\rho = \frac{1}{n} \sum_{k=1}^{n-1} \mu^{*k} * \nu^{*(n-k)};$$
then one has $\mu * \pi = \frac{1}{n}\mu^{*n} + \rho$ and $\nu * \pi = \frac{1}{n}\nu^{*n} + \rho$, and since $\mu^{*n} \leq_{st} \nu^{*n}$ this implies $\mu * \pi \leq_{st} \nu * \pi$. Since $\pi \in \mathcal{P}_{\exp}(\mathbb{R})$, we get $\mu \leq_{st}^{C} \nu$.
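The catalyst $\pi$ in this proof is completely explicit, so it can be checked numerically. The sketch below is ours: it builds $\pi$ for the measures of Example 1.4 with $n = 9$ (for which $\mu^{*9} \leq_{st} \nu^{*9}$ is claimed) and verifies that $\mu * \pi \leq_{st} \nu * \pi$ even though $\mu \not\leq_{st} \nu$.

```python
from itertools import chain

def convolve(a, b):
    out = {}
    for x, p in a.items():
        for y, q in b.items():
            out[x + y] = out.get(x + y, 0.0) + p * q
    return out

def power(a, k):
    out = {0.0: 1.0}                       # a^{*0} = delta_0
    for _ in range(k):
        out = convolve(out, a)
    return out

def mix(weighted_measures):
    out = {}
    for m, w in weighted_measures:
        for x, p in m.items():
            out[x] = out.get(x, 0.0) + w * p
    return out

def dominates(a, b, tol=1e-9):
    surv = lambda m, t: sum(w for x, w in m.items() if x >= t)
    return all(surv(a, t) <= surv(b, t) + tol for t in set(chain(a, b)))

mu = {0.0: 0.4, 2.0: 0.6}
nu = {1.0: 0.8, 3.0: 0.2}
n = 9                                      # Example 1.4: mu^{*9} <=_st nu^{*9}

# Catalyst from the proof of Lemma 3.9: pi = (1/n) sum_{k=0}^{n-1} mu^{*k} * nu^{*(n-1-k)}
pi = mix([(convolve(power(mu, k), power(nu, n - 1 - k)), 1.0 / n) for k in range(n)])

print(dominates(mu, nu))                              # False
print(dominates(power(mu, n), power(nu, n)))          # True (claim of Example 1.4)
print(dominates(convolve(mu, pi), convolve(nu, pi)))  # True: pi acts as a catalyst
```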

From Theorem 3.3 and Lemma 3.9 one can easily derive

Corollary 3.10. The analogue of Theorem 3.3 is true if we substitute $\leq_{st}^{*}$ with $\leq_{st}^{C}$.

4. Catalytic majorization

This section is dedicated to the study of the majorization relation, the notion which was the initial motivation of this work. The majorization relation provides, much as the stochastic domination for probability measures, a partial order on the set of probability vectors. Originally introduced in linear algebra [3,12], it has found many applications in quantum information theory with the work of Nielsen [13]. We shall not focus on quantum-theoretical aspects of majorization; we refer the interested reader to [1] and references therein. Here, we study majorization by adapting previously obtained results for stochastic domination.

The majorization relation is defined for probability vectors, i.e., vectors $x \in \mathbb{R}^N$ with non-negative components ($x_i \geq 0$) which sum up to one ($\sum_i x_i = 1$). Before defining majorization precisely, let us introduce some notation. For $d \in \mathbb{N}$, let $P_d$ be the set of $d$-dimensional probability vectors: $P_d = \{ x \in \mathbb{R}^d \text{ s.t. } x_i \geq 0, \sum x_i = 1 \}$. Consider also the set of finitely supported probability vectors $P_{<\infty} = \bigcup_{d > 0} P_d$. We equip $P_{<\infty}$ with the $\ell^1$ norm defined by $\|x\|_1 = \sum_i |x_i|$. For a vector $x \in P_{<\infty}$, we write $x_{\max}$ for the largest component of $x$ and $x_{\min}$ for its smallest non-zero component. In this section we shall consider only finitely supported vectors; for the general case, see Section 6. We shall identify an element $x \in P_d$ with the corresponding element in $P_{d'}$ ($d' > d$) or $P_{<\infty}$ obtained by appending null components at the end of $x$.

Next, we define $x^{\downarrow}$, the decreasing rearrangement of a vector $x \in P_d$, as the vector which has the same coordinates as $x$ up to permutation and such that $x^{\downarrow}_i \geq x^{\downarrow}_{i+1}$ for all $1 \leq i < d$. We can now define majorization in terms of the ordered vectors:

Definition 4.1. For $x, y \in P_d$, we say that $x$ is majorized by $y$, and we write $x \prec y$, if for all $k \in \{1, \ldots, d\}$,
$$\sum_{i=1}^{k} x^{\downarrow}_i \leq \sum_{i=1}^{k} y^{\downarrow}_i. \qquad (9)$$

Note however that there are several equivalent definitions of majorization which do not use the ordering of the vectors $x$ and $y$ (see [3] for further details):

Proposition 4.2. The following assertions are equivalent:

(1) $x \prec y$;

(2) $\forall t \in \mathbb{R}$, $\sum_{i=1}^{d} |x_i - t| \leq \sum_{i=1}^{d} |y_i - t|$;

(3) $\forall t \in \mathbb{R}$, $\sum_{i=1}^{d} (x_i - t)_+ \leq \sum_{i=1}^{d} (y_i - t)_+$, where $z_+ = \max(z, 0)$;

(4) there is a bistochastic matrix $B$ such that $x = By$.
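Both the partial-sum criterion (9) and condition (3) of Proposition 4.2 are straightforward to test numerically for small vectors. The sketch below is ours (helper names are hypothetical); for equal-length vectors, condition (3) only needs to be tested with $t$ at the components, since $t \mapsto \sum_i (x_i - t)_+$ is piecewise linear.

```python
def majorized(x, y, tol=1e-12):
    """Check x ≺ y via the partial-sum criterion (9) on decreasing rearrangements."""
    d = max(len(x), len(y))
    xs = sorted(list(x) + [0.0] * (d - len(x)), reverse=True)
    ys = sorted(list(y) + [0.0] * (d - len(y)), reverse=True)
    sx = sy = 0.0
    for a, b in zip(xs, ys):
        sx, sy = sx + a, sy + b
        if sx > sy + tol:
            return False
    return True

def majorized_hinge(x, y, tol=1e-12):
    """Cross-check with condition (3) of Proposition 4.2 (vectors of the same length)."""
    hinge = lambda v, t: sum(max(a - t, 0.0) for a in v)
    return all(hinge(x, t) <= hinge(y, t) + tol for t in set(x) | set(y))

x, y = [0.4, 0.4, 0.2], [0.5, 0.3, 0.2]
print(majorized(x, y), majorized_hinge(x, y))   # True True
print(majorized(y, x))                          # False: majorization is only a partial order
```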

There are two operations on probability vectors which are of particular interest to us: the tensor product and the direct sum. For $x = (x_1, \ldots, x_d) \in P_d$ and $x' = (x'_1, \ldots, x'_{d'}) \in P_{d'}$, we define the tensor product $x \otimes x'$ as the vector $(x_i x'_j)_{ij} \in P_{dd'}$. We also define the direct sum $x \oplus x'$ as the concatenated vector $(x_1, \ldots, x_d, x'_1, \ldots, x'_{d'}) \in \mathbb{R}^{d+d'}$. Note that if we take $\oplus$-convex combinations, we get probability vectors: $\lambda x \oplus (1-\lambda) x' \in P_{d+d'}$.

The construction which permits us to use tools from stochastic domination in the framework of majorization is the following (inspired by [11]): to a probability vector $z \in P_{<\infty}$ we associate a probability measure $\mu_z$ defined by
$$\mu_z = \sum_i z_i\, \delta_{\log z_i},$$
the sum running over the non-zero components of $z$. These measures behave well with respect to tensor products:
$$\mu_{x \otimes y} = \mu_x * \mu_y.$$
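The identity $\mu_{x \otimes y} = \mu_x * \mu_y$ can be checked directly on small vectors. The sketch below is ours; the dictionary representation and the rounding used to compare atom positions are arbitrary choices.

```python
import math
from collections import defaultdict

def measure_of(z):
    """mu_z = sum_i z_i * delta_{log z_i}, over the non-zero components of z."""
    mu = defaultdict(float)
    for zi in z:
        if zi > 0:
            mu[math.log(zi)] += zi
    return dict(mu)

def convolve(a, b):
    out = defaultdict(float)
    for u, p in a.items():
        for v, q in b.items():
            out[u + v] += p * q
    return dict(out)

def tensor(x, y):
    return [xi * yj for xi in x for yj in y]

x, y = [0.5, 0.3, 0.2], [0.7, 0.3]
lhs = measure_of(tensor(x, y))                # mu_{x tensor y}
rhs = convolve(measure_of(x), measure_of(y))  # mu_x * mu_y

canon = lambda m: sorted((round(pos, 9), round(w, 9)) for pos, w in m.items())
print(canon(lhs) == canon(rhs))               # True, up to floating-point rounding
```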

The connection between majorization and stochastic domination is provided by the following lemma.

Lemma 4.3. Let $x, y \in P_{<\infty}$. Assume that $\mu_x \leq_{st} \mu_y$. Then $x \prec y$.

Proof. We can assume that $x = x^{\downarrow}$ and $y = y^{\downarrow}$. Note that
$$\mu_x[t, \infty) = \sum_{i : \log x_i \geq t} x_i = \sum_{i : x_i \geq \exp(t)} x_i.$$

Thus, for all $u > 0$,
$$\sum_{i : x_i \geq u} x_i \leq \sum_{i : y_i \geq u} y_i.$$
To start, use $u = x_1$ to conclude that $x_1 \leq y_1$. Notice that it suffices to show that $\sum_{i=1}^{k} x_i \leq \sum_{i=1}^{k} y_i$ only for those $k$ such that $x_k > y_k$ (indeed, if $x_{k+1} \leq y_{k+1}$, the $(k+1)$th inequality in (9) can be deduced from the $k$th inequality). Consider such a $k$ and let $x_k > u > y_k$. We get:

$$\sum_{i=1}^{k} x_i \leq \sum_{i : x_i \geq u} x_i \leq \sum_{i : y_i \geq u} y_i \leq \sum_{i=1}^{k} y_i,$$

which completes the proof of the lemma.

Remark 4.4. The converse of this lemma does not hold. Indeed, consider $x = (0.5, 0.5)$ and $y = (0.9, 0.1)$. Obviously, $x \prec y$, but $1 = \mu_x[\log 0.5, \infty) > \mu_y[\log 0.5, \infty) = 0.9$, and thus $\mu_x \not\leq_{st} \mu_y$.

We can describe the majorization relation by the sets
$$S_d(y) = \{ x \in P_d \text{ s.t. } x \prec y \},$$
where $y$ is a finitely supported probability vector. Mathematically, such a set is characterized by the following lemma, which is a simple consequence of Birkhoff's theorem on bistochastic matrices:

Lemma 4.5. For $y$ a $d$-dimensional probability vector, the set $S_d(y)$ is a polytope whose extreme points are $y$ and its permutations.

The initial motivation for our work was the following phenomena discovered in quantum information theory (see [10] and [2], respectively). It turns out that additional vectors can act as catalysts for the majorization relation: there are vectors $x, y, z \in P_{<\infty}$ such that $x \not\prec y$ but $x \otimes z \prec y \otimes z$; in such a situation we say that $x$ is catalytically majorized (or trumped) by $y$, and we write $x \prec_T y$. Another form of catalysis is provided by multiple copies of vectors: we can find vectors $x$ and $y$ such that $x \not\prec y$ but still, for some $n \geq 2$, $x^{\otimes n} \prec y^{\otimes n}$; in this case we write $x \prec_M y$. We thus have two new order relations on probability vectors, analogues of $\leq_{st}^{C}$ and $\leq_{st}^{*}$, respectively. As before, for $y \in P_d$, we introduce the sets

$$T_d(y) = \{ x \in P_d \text{ s.t. } x \prec_T y \} \quad\text{and}\quad M_d(y) = \{ x \in P_d \text{ s.t. } x \prec_M y \}.$$

It turns out that the relations $\prec_T$ and $\prec_M$ (and thus the sets $T_d(y)$ and $M_d(y)$) are not as simple as $\prec$ and $S_d(y)$.

It is known that the inclusion $M_d(y) \subset T_d(y)$ holds (this is the analogue of Lemma 3.9) and that it can be strict [8].

In general, the sets $T_d(y)$ and $M_d(y)$ are neither closed nor open, and although $T_d(y)$ is known to be convex, nothing is known about the convexity of $M_d(y)$ (such questions have been intensively studied in the physics literature; see [4,6] and the references therein). As explained in [1], it is natural from a mathematical point of view to introduce the sets $T_{<\infty}(y) = \bigcup_{d \in \mathbb{N}} T_d(y)$ and $M_{<\infty}(y) = \bigcup_{d \in \mathbb{N}} M_d(y)$. A key notion in characterizing them is Schur-convexity:

Definition 4.6. A function $f : P_d \to \mathbb{R}$ is said to be

• Schur-convex if $f(x) \leq f(y)$ whenever $x \prec y$,

• Schur-concave if $f(x) \geq f(y)$ whenever $x \prec y$,

• strictly Schur-convex if $f(x) < f(y)$ whenever $x \prec y$ and $x^{\downarrow} \neq y^{\downarrow}$,

• strictly Schur-concave if $f(x) > f(y)$ whenever $x \prec y$ and $x^{\downarrow} \neq y^{\downarrow}$.

Examples are provided as follows: if $\Phi : \mathbb{R} \to \mathbb{R}$ is a (strictly) convex/concave function, then the function $h : P_d \to \mathbb{R}$ defined by $h(x_1, \ldots, x_d) = \Phi(x_1) + \cdots + \Phi(x_d)$ is (strictly) Schur-convex/Schur-concave.

For $x \in P_d$ and $p \in \mathbb{R}$, we define $N_p(x)$ as
$$N_p(x) = \sum_{\substack{1 \leq i \leq d \\ x_i > 0}} x_i^{p}.$$

We will also use the Shannon entropy $H$,
$$H(x) = -\sum_{i=1}^{d} x_i \log x_i.$$
Note that $-H(x)$ is the derivative of $p \mapsto N_p(x)$ at $p = 1$, and that $N_0(x)$ is the number of non-zero components of the vector $x$. These functions satisfy the following properties:

(1) If $p > 1$, $N_p$ is strictly Schur-convex on $P_{<\infty}$.

(2) If $0 < p < 1$, $N_p$ is strictly Schur-concave on $P_{<\infty}$.

(3) If $p < 0$, $N_p$ is strictly Schur-convex on $P_d$ for any $d$. However, for $p < 0$, it is not possible to compare vectors with a different number of non-zero components.

(4) $H$ is strictly Schur-concave on $P_{<\infty}$.
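These quantities are simple to compute. The sketch below is ours; it evaluates $N_p$ and $H$ on the small pair $x \prec y$ used in the sketch after Proposition 4.2 and illustrates properties (1)-(4) numerically.

```python
import math

def N(p, x):
    """N_p(x) = sum of x_i^p over the non-zero components of x."""
    return sum(xi ** p for xi in x if xi > 0)

def H(x):
    """Shannon entropy H(x) = -sum x_i log x_i (with 0 log 0 = 0)."""
    return -sum(xi * math.log(xi) for xi in x if xi > 0)

x, y = [0.4, 0.4, 0.2], [0.5, 0.3, 0.2]   # here x is majorized by y
for p in (3.0, 2.0, 0.5, -1.0):
    print(p, N(p, x), N(p, y))
# p > 1 and p < 0: N_p is Schur-convex, so N_p(x) <= N_p(y);
# 0 < p < 1: N_p is Schur-concave, so N_p(x) >= N_p(y).
print(H(x), H(y))                         # H is Schur-concave, so H(x) >= H(y)
```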

One possible way of describing the relations $\prec_M$ and $\prec_T$ is to find a family (the smallest possible) of Schur-convex functions which characterizes them. In this direction, Nielsen conjectured the following result:

Conjecture 4.7. Fix a vector $y \in P_d$ with non-zero coordinates. Then $\overline{T_d(y)} = \overline{M_d(y)}$, and both are equal to the set of $x \in P_d$ satisfying:

(C1) For $p \geq 1$, $N_p(x) \leq N_p(y)$.

(C2) For $0 < p \leq 1$, $N_p(x) \geq N_p(y)$.

(C3) For $p < 0$, $N_p(x) \leq N_p(y)$.

Here, the closures are taken in $\mathbb{R}^d$ (recall that neither $M_d(y)$ nor $T_d(y)$ is closed). By the previous remarks, any vector in $T_d(y)$ or $M_d(y)$ (and by continuity, also in the closures) must satisfy conditions (C1)-(C3). Recently, Turgut [17] provided a complete characterization of the set $T_d(y)$, which implies in particular that Nielsen's conjecture is true for $T_d(y)$. His method, completely different from ours, consists in solving a discrete approximation of the problem using elementary algebraic techniques. Note however that the inclusion $M_d(y) \subset T_d(y)$ is strict in general, and thus the characterization of $M_d(y)$ is still open. We shall now focus on the set $M_d(y)$. Conjecture 4.7 can be reformulated as follows: if $x, y \in P_d$ satisfy (C1)-(C3), then there exists a sequence $(x_n)$ in $M_d(y)$ such that $(x_n)$ converges to $x$. If we relax the condition that $x_n$ and $y$ have the same dimension, we can prove the following two theorems.

Theorem 4.8. If $x, y \in P_d$ satisfy (C1), then there exists a sequence $(x_n)$ in $M_{<\infty}(y)$ such that $(x_n)$ converges to $x$ in $\ell^1$-norm.

Theorem 4.9. If $x, y \in P_d$ satisfy (C1)-(C2), then there exists a sequence $(x_n)$ in $M_{d+1}(y)$ such that $(x_n)$ converges to $x$.

Since $M_d(y) \subset T_d(y)$, both theorems have direct analogues for $T_{<\infty}(y)$ and $T_{d+1}(y)$, respectively. Theorem 4.8 restates the authors' previous result in [1]; however, the proof presented in the next section is more transparent than the previous one. Theorem 4.9 answers a question of [1]. It is an intermediate result between Theorem 4.8 and Conjecture 4.7.

5. Proof of the theorems

We show here how to derive Theorems 4.8 and 4.9. We first state a proposition which is the translation of Proposition 2.5 in terms of majorization.

Proposition 5.1. Let $x, y \in P_{<\infty}$. Assume that $x$ and $y$ have non-zero coordinates, with respective dimensions $d_x$ and $d_y$. Assume that:

(1) $x_{\min} < y_{\min}$;

(2) $x_{\max} < y_{\max}$;

(3) $H(x) > H(y)$;

(4) $N_p(x) < N_p(y)$ for all $p \in (1, +\infty)$;

(5) $N_p(x) > N_p(y)$ for all $p \in (-\infty, 1)$.

Then there exists an integer $N$ such that for all $n \geq N$, we have $x^{\otimes n} \prec y^{\otimes n}$.

It is important to notice that, since $N_0(x) = d_x$ and $N_0(y) = d_y$, the conditions of the proposition can be satisfied only when $d_x > d_y$. This is the main reason why our approach fails to prove Conjecture 4.7.

Proof. One checks that the probability measures $\mu_x$ and $\mu_y$ associated to the vectors $x$ and $y$ satisfy the hypotheses of Proposition 2.5. Indeed, for $p \in \mathbb{R}$, one has
$$N_p(x) = \int e_\lambda \,\mathrm{d}\mu_x, \quad \text{with } \lambda = p - 1.$$
As $\mu_{x^{\otimes n}} = \mu_x^{*n}$, there exists an integer $N$ such that for $n \geq N$, we have $\mu_x^{*n} \leq_{st} \mu_y^{*n}$. It remains to apply Lemma 4.3 in order to complete the proof.
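The identity $N_p(x) = \int e_\lambda \,\mathrm{d}\mu_x$ with $\lambda = p - 1$, which drives this proof, can be checked numerically. A minimal sketch (ours):

```python
import math

def N(p, x):
    return sum(xi ** p for xi in x if xi > 0)

def measure_of(x):
    """mu_x as a list of (atom, weight) pairs: atoms log x_i, weights x_i."""
    return [(math.log(xi), xi) for xi in x if xi > 0]

def exp_moment(mu, lam):
    """int e_lambda d mu for a finitely supported measure."""
    return sum(w * math.exp(lam * a) for a, w in mu)

x = [0.4, 0.3, 0.2, 0.1]
for p in (2.0, 0.5, -1.0):
    print(p, N(p, x), exp_moment(measure_of(x), p - 1.0))
# The two columns agree: taking lambda = p - 1 turns N_p into an exponential moment of mu_x,
# and since mu_{x tensor n} = mu_x^{*n}, the hypotheses of Proposition 2.5 translate into (1)-(5).
```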

The main idea used in the following proofs is to slightly modify the vector $x$ so that the couple $(x, y)$ satisfies the hypotheses of Proposition 5.1.
