HAL Id: hal-01977655
https://hal.archives-ouvertes.fr/hal-01977655
Submitted on 25 Jan 2019
Dirichlet and quasi-Bernoulli laws for perpetuities
Pawel Hitczenko, Gérard Letac
To cite this version:
Pawel Hitczenko, Gérard Letac. Dirichlet and quasi-Bernoulli laws for perpetuities. Journal of Applied
Probability, Cambridge University Press, 2014. ⟨hal-01977655⟩
Dirichlet and Quasi Bernoulli laws for perpetuities
Paweł Hitczenko*, Gérard Letac†

Abstract

Let $X$, $B$ and $Y$ be three Dirichlet, Bernoulli and beta independent random variables such that $X \sim D(a_0, \dots, a_d)$, such that $\Pr(B = (0, \dots, 0, 1, 0, \dots, 0)) = a_i/a$ with $a = \sum_{i=0}^d a_i$, and such that $Y \sim \beta(1, a)$. We prove that $X \sim X(1-Y) + BY$. This gives the stationary distribution of a simple Markov chain on a tetrahedron.
1 Introduction
In a recent paper [1], Ambrus, Kevei and Vígh make the following interesting observation: if $V, Y, W$ are independent random variables such that $V \sim \frac{1}{\pi}(\frac{1}{4} - v^2)^{-1/2}\mathbf{1}_{(-1/2,1/2)}(v)\,dv$, $Y$ is uniform on $(0,1)$ and $\Pr(W = 1) = \Pr(W = -1) = 1/2$, then $V \sim V(1-Y) + \frac{W}{2}Y$. Random variables $V$ satisfying $V \sim VM + Q$, where the pair $(M, Q)$ is independent of $V$ on the right-hand side, are often called perpetuities. Thus, another way of stating the observation from [1] is that an arcsine random variable on $(-1/2, 1/2)$ is a solution of the perpetuity equation with $(M, Q) \sim (1-Y, WY/2)$. Part of the reason we found it interesting is that there are relatively few examples of exact solutions to perpetuity equations in the literature. We will generalize this result, and our generalization will provide more examples of perpetuities, including the power semicircle distributions (see [2]).
To carry out the generalization, we reformulate this result on $(0,1)$ by writing $X = V + \frac{1}{2}$ and $B = (1+W)/2$. Clearly $X, Y, B$ are independent with $X \sim \beta(1/2, 1/2)$, $B \sim \frac{1}{2}(\delta_0 + \delta_1)$ and $X \sim X(1-Y) + BY$. In Theorem 1 below, we give a generalization of the result in [1] expressed in terms of $X$, $Y$ and $B$.
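As a quick sanity check of this reformulated identity (our added illustration, not part of the paper), one can compare the first few moments of $X \sim \beta(1/2,1/2)$ with those of $X(1-Y)+BY$ by simulation; note that $\beta(1,1)$ is the uniform law on $(0,1)$:

```python
import random

# Monte Carlo sketch (not from the paper): check X ~ X(1-Y) + B*Y with
# X ~ beta(1/2, 1/2), Y ~ beta(1, 1) (uniform on (0,1)),
# Pr(B=0) = Pr(B=1) = 1/2, all three independent.
random.seed(0)
N = 200_000
lhs_moments = [0.0] * 4   # E(X^n), n = 1..4
rhs_moments = [0.0] * 4   # E((X(1-Y) + B*Y)^n), n = 1..4
for _ in range(N):
    x = random.betavariate(0.5, 0.5)
    y = random.random()              # beta(1, 1) is uniform on (0, 1)
    b = random.randint(0, 1)         # Bernoulli(1/2) vertex choice
    z = x * (1 - y) + b * y
    for n in range(4):
        lhs_moments[n] += x ** (n + 1) / N
        rhs_moments[n] += z ** (n + 1) / N
for n in range(4):
    print(f"E(X^{n+1}) = {lhs_moments[n]:.3f}  vs  E(X'^{n+1}) = {rhs_moments[n]:.3f}")
```

The printed pairs should agree up to Monte Carlo error (e.g. both first moments near $1/2$).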
We will need the following notation. The natural basis of $\mathbb{R}^{d+1}$ is denoted by $e_0, \dots, e_d$. If $p_0, \dots, p_d$ are positive numbers with sum equal to one, the distribution $\sum_{i=0}^d p_i \delta_{e_i}$ of $B = (B_0, \dots, B_d) \in \mathbb{R}^{d+1}$ is called a Bernoulli distribution. It satisfies $\Pr(B = e_i) = p_i$ and $B_0 + \cdots + B_d = 1$. If $a_0, \dots, a_d$ are positive numbers, the Dirichlet distribution $D(a_0, \dots, a_d)$ of $X = (X_0, \dots, X_d)$ is such that $X_0 + \cdots + X_d = 1$ and the law of $(X_1, \dots, X_d)$ is
$$\frac{\Gamma(a_0 + \cdots + a_d)}{\Gamma(a_0) \cdots \Gamma(a_d)}\,(1 - x_1 - \cdots - x_d)^{a_0 - 1} x_1^{a_1 - 1} \cdots x_d^{a_d - 1}\,\mathbf{1}_{T_d}(x_1, \dots, x_d)\,dx_1 \cdots dx_d$$
where $T_d$ is the set of $(x_1, \dots, x_d)$ such that $x_i > 0$ for all $i = 0, 1, \dots, d$, with the convention $x_0 = 1 - x_1 - \cdots - x_d$. For instance, if the real random variable $X_1$ has the beta distribution $\beta(a_1, a_0)(dx) = \frac{1}{B(a_1, a_0)}\, x^{a_1 - 1}(1-x)^{a_0 - 1}\mathbf{1}_{(0,1)}(x)\,dx$ then $(1 - X_1, X_1) \sim D(a_0, a_1)$.
Theorem 1: Let $a_0, \dots, a_d$ be positive numbers. Denote $a = a_0 + \cdots + a_d$. Let $X$, $Y$ and $B$ be three Dirichlet, beta and Bernoulli independent random variables such that $X \sim D(a_0, \dots, a_d)$ and $B \sim \sum_{i=0}^d \frac{a_i}{a}\delta_{e_i}$ are valued in $\mathbb{R}^{d+1}$ and such that $Y \sim \beta(1, a)$. Then $X \sim X(1-Y) + BY$.
Comments. Considering each coordinate, Theorem 1 says that for all $i = 0, \dots, d$ we have $X_i \sim X_i(1-Y) + B_i Y$. Since $1 = \sum_{i=0}^d X_i = \sum_{i=0}^d B_i$, the statement for $i = 0$ is true if it is verified for $i = 1, \dots, d$. For instance, for $d = 1$ a reformulation of Theorem 1 is
*Department of Mathematics, Drexel University, Philadelphia, PA 19104, U.S.A. Partially supported by a grant from the Simons Foundation (grant #208766 to Paweł Hitczenko).
†Laboratoire de Statistique et Probabilités, Université Paul Sabatier, 31062 Toulouse, France.
Theorem 2: Let $a_0, a_1 > 0$. Let $X_1, Y, B_1$ be three independent random variables such that $X_1 \sim \beta(a_1, a_0)$, $Y \sim \beta(1, a_0 + a_1)$ and $B_1 \sim \frac{a_0}{a_0 + a_1}\delta_0 + \frac{a_1}{a_0 + a_1}\delta_1$. Then $X_1 \sim X_1(1-Y) + B_1 Y$.
Therefore the initial remark contained in [1] is a particular case of Theorem 1 for $d = 1$ and $a_0 = a_1 = 1/2$. More generally, the case $d = 1$ and $a_0 = a_1$ covers the power semicircle distributions discussed in [2] (with $\theta = a_0 - 3/2$). In particular, $a_0 = a_1 = 3/2$ gives the classical semicircle distribution.
One can prove Theorem 2 by showing $E((X_1(1-Y) + B_1 Y)^n) = E(X_1^n)$ for all integers $n$. Our proof of the more general Theorem 1 is somewhat linked to this method of moments.
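This method of moments can be carried out in exact rational arithmetic for small $n$. The sketch below (our addition; the helper names `rising` and `perpetuity_moments_agree` are ours) expands $E((X_1(1-Y)+B_1Y)^n)$ by the binomial theorem and independence, using $E(X_1^m) = (a_1)_m/(a)_m$ for $X_1 \sim \beta(a_1,a_0)$ and $E(Y^j(1-Y)^m) = a\,j!/(a+m)_{j+1}$ for $Y \sim \beta(1,a)$:

```python
from fractions import Fraction
from math import comb, factorial

def rising(x, k):
    """Pochhammer symbol (x)_k = x (x+1) ... (x+k-1)."""
    p = Fraction(1)
    for i in range(k):
        p *= x + i
    return p

def perpetuity_moments_agree(a0, a1, nmax=6):
    """Exact check that E((X1(1-Y)+B1*Y)^n) = E(X1^n) for n = 1..nmax."""
    a0, a1 = Fraction(a0), Fraction(a1)
    a = a0 + a1
    for n in range(1, nmax + 1):
        ex_n = rising(a1, n) / rising(a, n)               # E(X1^n), X1 ~ beta(a1, a0)
        ez_n = Fraction(0)
        for j in range(n + 1):                            # expand (X1(1-Y) + B1*Y)^n
            ex = rising(a1, n - j) / rising(a, n - j)     # E(X1^(n-j))
            eyw = a * factorial(j) / rising(a + (n - j), j + 1)  # E(Y^j (1-Y)^(n-j))
            eb = Fraction(1) if j == 0 else a1 / a        # E(B1^j), B1 in {0, 1}
            ez_n += comb(n, j) * ex * eyw * eb
        if ez_n != ex_n:
            return False
    return True

print(perpetuity_moments_agree(Fraction(1, 2), Fraction(1, 2)))  # arcsine case of [1]
print(perpetuity_moments_agree(2, 5))
```

Both calls should print `True`, confirming Theorem 2 exactly for the first six moments.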
2 A Markov chain on the tetrahedron
This section gives an application of Theorem 1. Consider the homogeneous Markov chain $(X(n))_{n \geq 0}$ valued in the convex hull $E_{d+1}$ of $(e_0, \dots, e_d)$ with the following transition process: given $X(n) \in E_{d+1}$, choose a vertex $B(n+1)$ randomly among $\{e_0, \dots, e_d\}$ such that $\Pr(B(n+1) = e_i) = a_i/a$, and a random number $Y_{n+1} \sim \beta(1, a)$. Now draw the segment $(X(n), B(n+1))$ and take the point
$$X(n+1) = X(n)(1 - Y_{n+1}) + B(n+1)\,Y_{n+1}$$
on this segment. Theorem 1 says that the Dirichlet distribution $D(a_0, \dots, a_d)$ is a stationary distribution for the Markov chain $(X(n))_{n \geq 0}$. Recall the following principle (see [3] Proposition 1):
Theorem 3: If $E$ is a metric space and if $C$ is the set of continuous maps $f : E \to E$, let us fix a probability $\nu(df)$ on $C$. Let $F_1, \dots, F_n, \dots$ be a sequence of independent random variables of $C$ with the same distribution $\nu$. Define $W_n = F_n \circ \cdots \circ F_2 \circ F_1$ and $Z_n = F_1 \circ \cdots \circ F_{n-1} \circ F_n$. Suppose that almost surely $Z = \lim_n Z_n(x)$ exists in $E$ and does not depend on $x \in E$. Then

1. the distribution $\mu$ of $Z$ is a stationary distribution of the Markov chain $(W_n(x))_{n \geq 0}$ on $E$;

2. if $X$ and $F_1$ are independent and if $X \sim F_1(X)$ then $X \sim \mu$.
Choose $E = E_{d+1}$. Apply Theorem 3 to the distribution $\nu$ of the random map $F_1$ on $E_{d+1}$ defined by $F_1(x) = (1 - Y_1)x + Y_1 B(1)$, where $Y_1 \sim \beta(1, a)$ and $B(1) \sim \sum_{i=0}^d \frac{a_i}{a}\delta_{e_i}$ are independent. If the $F_n$ defined by $F_n(x) = (1 - Y_n)x + Y_n B(n)$ are independent with distribution $\nu$, clearly
$$Z_n(x) = \Big(\prod_{j=1}^n (1 - Y_j)\Big)x + \sum_{k=1}^n \Big(\prod_{j=1}^{k-1}(1 - Y_j)\Big) Y_k B(k)$$
converges almost surely to the sum of the following convergent series
$$Z = \sum_{k=1}^{\infty} \Big(\prod_{j=1}^{k-1}(1 - Y_j)\Big) Y_k B(k) \qquad (1)$$
and therefore the hypotheses of Theorem 3 are met. As a consequence, the Dirichlet law $D(a_0, \dots, a_d)$ is the unique stationary distribution of the Markov chain $(X(n))_{n \geq 0}$ and is the distribution of $Z$ defined by (1). Finally, recall the definition of a perpetuity [5] on an affine space $A$. Let $\nu(df)$ be a probability on the space of affine transformations $f$ mapping $A$ into itself. We say that the probability $\mu$ on $A$ is a perpetuity (relative to $\nu$) if $X \sim F(X)$ when $F \sim \nu$ and $X \sim \mu$ are independent. If the conditions of Theorem 3 are met for $\nu$, there is exactly one perpetuity relative to $\nu$. Theorem 1 says that the Dirichlet distribution is a perpetuity for the random affine map $F(x) = (1-Y)x + YB$ on the affine hyperplane $A$ of $\mathbb{R}^{d+1}$ containing $e_0, \dots, e_d$.
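The Markov chain is easy to simulate. The following sketch (our illustration, with the arbitrary choice $(a_0, a_1, a_2) = (1, 2, 3)$) runs the transition $X(n+1) = X(n)(1 - Y_{n+1}) + B(n+1)Y_{n+1}$ on the simplex and compares the empirical mean of the chain with the Dirichlet mean $a_i/a$:

```python
import random

# Simulation sketch (our illustration, not from the paper) of the Markov
# chain on the tetrahedron whose stationary law is D(a_0, ..., a_d).
random.seed(1)
a = [1.0, 2.0, 3.0]            # assumed example parameters
atot = sum(a)
d1 = len(a)
x = [1.0 / d1] * d1            # arbitrary starting point in the simplex
mean = [0.0] * d1
N, burn = 200_000, 1_000
for step in range(N + burn):
    i = random.choices(range(d1), weights=a)[0]   # vertex e_i with prob a_i/a
    y = random.betavariate(1.0, atot)             # Y ~ beta(1, a)
    x = [(1 - y) * xj for xj in x]                # shrink toward the vertex...
    x[i] += y                                     # ...then move a fraction y to e_i
    if step >= burn:
        for j in range(d1):
            mean[j] += x[j] / N
print("empirical mean:", [round(m, 3) for m in mean])
print("Dirichlet mean:", [ai / atot for ai in a])
```

The empirical means should approach $(1/6, 1/3, 1/2)$, the mean of $D(1,2,3)$; the chain samples are correlated, but the time average still converges.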
3 Proof of Theorem 1
If $f = (f_0, \dots, f_d)$ and $x = (x_0, \dots, x_d)$ we write $\langle f, x\rangle = \sum_{i=0}^d f_i x_i$. We need a lemma:

Lemma 4. Suppose that $a_i > 0$ for all $i = 0, \dots, d$ and denote $a = \sum_{i=0}^d a_i$. Let $X$ be valued in the convex hull $E_{d+1}$ of $(e_0, \dots, e_d)$. The three following conditions are equivalent:

1. the random variable $X$ has the Dirichlet distribution $D(a_0, \dots, a_d)$;

2. for all $f = (f_0, \dots, f_d)$ such that $f_i > 0$ we have $E(\langle f, X\rangle^{-a}) = \prod_{i=0}^d f_i^{-a_i}$;

3. for all $f = (f_0, \dots, f_d)$ such that $f_i > 0$ we have $E(\langle f, X\rangle^{-a-1}) = \big(\prod_{i=0}^d f_i^{-a_i}\big)\sum_{i=0}^d \frac{a_i}{a f_i}$.
Proof of Lemma 4. $1 \Leftrightarrow 2$ is standard (see for instance Proposition 2.1 in [4]). Statement $3 \Rightarrow 1$ is proved in the same way as $2 \Rightarrow 1$, by observing that if $X$ satisfies 3, then $X$ has the moments of $D(a_0, \dots, a_d)$. A short way to prove $2 \Rightarrow 3$ is to differentiate the formula in 2 with respect to $f_i$, obtaining
$$-a\,E\big(X_i \langle f, X\rangle^{-a-1}\big) = -\frac{a_i}{f_i}\prod_{j=0}^d f_j^{-a_j}.$$
Summing these equalities for $i = 0, \dots, d$ and using $\sum_{i=0}^d X_i = 1$ gives 3.
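Parts 2 and 3 of the Lemma can be checked numerically; the sketch below (our addition, with arbitrary test values for $a$ and $f$) samples a Dirichlet vector via normalized independent gamma variables and compares the Monte Carlo expectations with the closed forms:

```python
import random

# Monte Carlo sketch (our illustration) of Lemma 4 parts 2 and 3:
# E(<f,X>^{-a})   = prod_i f_i^{-a_i}
# E(<f,X>^{-a-1}) = (prod_i f_i^{-a_i}) * sum_i a_i/(a f_i)
random.seed(2)
a = [1.5, 2.0, 0.5]            # assumed example Dirichlet parameters
f = [1.0, 2.0, 3.0]            # assumed test point with f_i > 0
atot = sum(a)
N = 400_000
m2 = m3 = 0.0
for _ in range(N):
    g = [random.gammavariate(ai, 1.0) for ai in a]   # Dirichlet via gammas
    s = sum(g)
    fx = sum(fi * gi / s for fi, gi in zip(f, g))    # <f, X>
    m2 += fx ** (-atot) / N
    m3 += fx ** (-atot - 1) / N
prod = 1.0
for fi, ai in zip(f, a):
    prod *= fi ** (-ai)                              # prod_i f_i^{-a_i}
exact3 = prod * sum(ai / (atot * fi) for fi, ai in zip(f, a))
print(f"part 2: MC {m2:.4f}  vs exact {prod:.4f}")
print(f"part 3: MC {m3:.4f}  vs exact {exact3:.4f}")
```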
We now prove Theorem 1. Fixing $f_i > 0$ for $i = 0, \dots, d$, we can write
$$E(\langle f, B\rangle^{-1}) = \sum_{i=0}^d \frac{a_i}{a f_i}. \qquad (2)$$
Observe also that for $-\infty < t < 1$ we have
$$E\big((1 - tY)^{-a-1}\big) = (1-t)^{-1}. \qquad (3)$$
One way to see this is to apply formula 2 of the Lemma to the Dirichlet random variable $(1-Y, Y) \sim D(a, 1)$ and to $f_0 = 1$ and $f_1 = 1-t$. Denote $X' = X(1-Y) + BY$. We now write
$$E\big(\langle f, X'\rangle^{-a-1}\big) = E\big((\langle f, X\rangle - Y\langle f, X-B\rangle)^{-a-1}\big)$$
$$= E\bigg(\langle f, X\rangle^{-a-1}\, E\bigg(\Big[1 - Y\frac{\langle f, X-B\rangle}{\langle f, X\rangle}\Big]^{-a-1}\,\Big|\, X, B\bigg)\bigg) \qquad (4)$$
$$= E\bigg(\langle f, X\rangle^{-a-1}\Big[1 - \frac{\langle f, X-B\rangle}{\langle f, X\rangle}\Big]^{-1}\bigg) = E\big(\langle f, X\rangle^{-a}\langle f, B\rangle^{-1}\big) \qquad (5)$$
$$= \Big(\prod_{i=0}^d f_i^{-a_i}\Big)\sum_{i=0}^d \frac{a_i}{a f_i} = E\big(\langle f, X\rangle^{-a-1}\big). \qquad (6)$$
Equality (4) is obtained by conditioning, using the independence of $X, Y, B$; equality (5) comes from (3) applied to $t = \langle f, X-B\rangle/\langle f, X\rangle$, which is indeed $< 1$; equality (6) comes from the independence of $X$ and $B$, from (2), and from parts $1 \Rightarrow 2, 3$ of the Lemma. From part $3 \Rightarrow 1$ we get that $X' \sim X$.
4 Extension to the quasi Bernoulli distributions
In this section, Theorem 6 generalizes Theorem 1. This choice of order of exposition comes from the fact that this extension of Theorem 1 is somewhat complicated. For this we need to define, in Theorem 5 below, the quasi Bernoulli distributions of order $k$ on the tetrahedron $E_{d+1}$ with parameters $a_0, \dots, a_d$.
On the open set $U_{d+1} = \{f = (f_0, \dots, f_d)\,;\, f_i > 0\ \forall i\}$ we define the function $T(f) = \prod_{i=0}^d f_i^{-a_i}$ and the differential operator
$$H = \sum_{i=0}^d \frac{\partial}{\partial f_i}.$$
In the following statement $(a)_k$ is the Pochhammer symbol: $(a)_0 = 1$ and $(a)_{k+1} = (a+k)(a)_k$.

Lemma 5. Let $X$ be a random variable of $E_{d+1}$, let $a_0, \dots, a_d$ be a sequence of positive numbers and let $k$ be a non-negative integer. Then $X \sim D(a_0, \dots, a_d)$ if and only if for all $f \in U_{d+1}$ we have
$$(-1)^k (a)_k\, E(\langle f, X\rangle^{-a-k}) = H^k(T)(f). \qquad (7)$$

Proof. The 'if' part is proved by induction. The case $k = 0$ comes from [4] Proposition 2.1. If (7) is correct for $k$, we apply $H$ to both sides and use the fact that $X_0 + \cdots + X_d = 1$ to claim that $H\,E(\langle f, X\rangle^{-a-k}) = -(a+k)\,E(\langle f, X\rangle^{-a-k-1})$. This extends the induction hypothesis to $k+1$. The 'only if' part is proved by the same moment method used in [4] Proposition 2.1 for the particular case $k = 0$.
Formula (7) is crucial for defining our quasi Bernoulli distributions. For simplicity denote $L = \log T$. Then
$$H(T) = T\,H(L), \qquad H^2(T) = T\,H(L)^2 + T\,H^2(L),$$
$$H^3(T) = T\,H(L)^3 + 3T\,H^2(L)H(L) + T\,H^3(L),$$
$$H^4(T) = T\,H(L)^4 + 6T\,H^2(L)H(L)^2 + 4T\,H^3(L)H(L) + 3T\,(H^2(L))^2 + T\,H^4(L).$$
More generally, an easy induction on $k$ shows that $T^{-1}H^k(T)$ is a polynomial $P_k$ in $H^j(L)$, $j = 1, \dots, k$, with non-negative integer coefficients $c_k(\alpha_1, \dots, \alpha_k)$ such that $c_k(\alpha_1, \dots, \alpha_k) \neq 0$ implies $\alpha_1 + 2\alpha_2 + \cdots + k\alpha_k = k$:
$$T^{-1}H^k(T) = P_k(H(L), \dots, H^k(L)) = \sum_{\alpha_1, \dots, \alpha_k} c_k(\alpha_1, \dots, \alpha_k)\, H(L)^{\alpha_1} \cdots (H^k(L))^{\alpha_k}. \qquad (8)$$
Now we observe that for $j > 0$ we have
$$H^j(L) = (-1)^j (j-1)! \sum_{i=0}^d \frac{a_i}{f_i^j}. \qquad (9)$$
The next theorem defines the quasi Bernoulli distribution $\mathcal{B}_k(a_0, \dots, a_d)$.

Theorem 5. Let $a_0, \dots, a_d > 0$ with $a = \sum_{i=0}^d a_i$. Fix a positive integer $k$. Then there exists a unique probability distribution $\mathcal{B}_k(a_0, \dots, a_d)$ for the random variable $B$ on $E_{d+1}$ such that for all $f_i > 0$, $i = 0, \dots, d$, we have
$$E(\langle f, B\rangle^{-k}) = \frac{1}{(a)_k}\sum_{\alpha_1, \dots, \alpha_k} c_k(\alpha_1, \dots, \alpha_k) \prod_{j=1}^k \Big((j-1)! \sum_{i=0}^d \frac{a_i}{f_i^j}\Big)^{\alpha_j} \qquad (10)$$
$$= \frac{1}{(a)_k}\sum_{b_0, \dots, b_d \in \mathbb{N};\; b_0 + \cdots + b_d = k} \frac{k!}{b_0! \cdots b_d!} \prod_{i=0}^d \frac{(a_i)_{b_i}}{f_i^{b_i}}. \qquad (11)$$
In particular, if $X \sim D(a_0, \dots, a_d)$ we have
$$E(\langle f, X\rangle^{-a-k}) = E(\langle f, X\rangle^{-a})\, E(\langle f, B\rangle^{-k}). \qquad (12)$$
Proof. The uniqueness is easily proved by moments. For proving the existence, it is worth explaining the idea first by considering the cases $k = 1$ and $k = 2$. The case $k = 1$ gives the familiar Bernoulli distribution, since for $B \sim \sum_{i=0}^d \frac{a_i}{a}\delta_{e_i}$ equality (2) holds. For $k = 2$, formula (10) becomes
$$E(\langle f, B\rangle^{-2}) = \frac{1}{a(a+1)}\bigg(\Big(\sum_{i=0}^d \frac{a_i}{f_i}\Big)^2 + \sum_{i=0}^d \frac{a_i}{f_i^2}\bigg)$$
$$= \frac{1}{a(a+1)}\bigg(\sum_{i=0}^d \frac{a_i(a_i+1)}{f_i^2} + \sum_{0 \leq i < j \leq d} \frac{2 a_i a_j}{f_i f_j}\bigg). \qquad (13)$$
We now make two observations:

• If $0 \leq i < j \leq d$ and if $B \sim \lambda_{ij}$, which is the uniform probability on the edge $(e_i, e_j)$ of the tetrahedron $E_{d+1}$, we have $E(\langle f, B\rangle^{-2}) = 1/(f_i f_j)$.

• The second observation is that $\frac{1}{a(a+1)}\Big(\sum_{i=0}^d a_i(a_i+1) + \sum_{0 \leq i < j \leq d} 2 a_i a_j\Big) = 1$.

Therefore we see that $\mathcal{B}_2(a_0, \dots, a_d)$ is the mixture
$$\frac{1}{a(a+1)}\bigg(\sum_{i=0}^d a_i(a_i+1)\,\delta_{e_i} + \sum_{0 \leq i < j \leq d} 2 a_i a_j\,\lambda_{ij}\bigg).$$
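This vertex-and-edge mixture description of $\mathcal{B}_2$ is easy to sample from. The sketch below (our illustration, with arbitrary parameters and test point) draws from the mixture and checks formula (13) by Monte Carlo:

```python
import random

# Sampling sketch (our illustration) of the quasi Bernoulli law B_2(a_0,...,a_d):
# point masses at vertices e_i with weights a_i(a_i+1)/(a(a+1)), and uniform
# laws on the edges (e_i, e_j) with weights 2 a_i a_j / (a(a+1)).
random.seed(3)
a = [1.0, 2.0, 3.0]            # assumed example parameters
atot = sum(a)
d1 = len(a)
comps, weights = [], []
for i in range(d1):            # vertex components (i, i)
    comps.append((i, i))
    weights.append(a[i] * (a[i] + 1))
for i in range(d1):            # edge components (i, j), i < j
    for j in range(i + 1, d1):
        comps.append((i, j))
        weights.append(2 * a[i] * a[j])
# the weights sum to a(a+1), so random.choices normalizes them correctly
f = [1.0, 2.0, 3.0]            # assumed test point with f_i > 0
N = 300_000
m = 0.0
for _ in range(N):
    i, j = random.choices(comps, weights=weights)[0]
    b = [0.0] * d1
    if i == j:
        b[i] = 1.0                         # point mass at vertex e_i
    else:
        u = random.random()                # uniform point on the edge (e_i, e_j)
        b[i], b[j] = 1 - u, u
    fb = sum(fi * bi for fi, bi in zip(f, b))
    m += fb ** (-2) / N
exact = (sum(ai / fi for ai, fi in zip(a, f)) ** 2
         + sum(ai / fi ** 2 for ai, fi in zip(a, f))) / (atot * (atot + 1))
print(f"MC E(<f,B>^-2) = {m:.4f}  vs formula (13) = {exact:.4f}")
```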
We now pass to the general case of an arbitrary $k$. We slightly extend the definition of a Dirichlet distribution $D(a_0, \dots, a_d)$ by allowing $a_i \geq 0$ instead of $a_i > 0$, while keeping $a = \sum_{i=0}^d a_i > 0$. For such a sequence $(a_0, \dots, a_d)$ we define $T = \{i\,;\, a_i > 0\}$, and $D(a_0, \dots, a_d)$ is the Dirichlet distribution concentrated on the tetrahedron $E_T$ generated by $(e_i)_{i \in T}$, with parameters $(a_i)_{i \in T}$. If $X \sim D(a_0, \dots, a_d)$, the formula $E(\langle f, X\rangle^{-a}) = \prod_{i=0}^d f_i^{-a_i}$ still holds. If $T$ contains only one element $i_0$, then $D(a_0, \dots, a_d)$ is simply $\delta_{e_{i_0}}$ and does not depend on $a$.
To finish the proof of the theorem, we observe that putting $f_i = 1$ for all $i = 0, \dots, d$ in (7) gives $H^k(T)(1, \dots, 1) = (-1)^k (a)_k$; as a consequence, putting $f_i = 1$ for all $i = 0, \dots, d$ in the right-hand side of (10) gives 1. The right-hand side of (10) can be considered as a homogeneous polynomial of degree $k$ with respect to the variables $1/f_i$, namely
$$\sum_{b_0, \dots, b_d} w_k(b_0, \dots, b_d) \prod_{i=0}^d \frac{1}{f_i^{b_i}}.$$
In fact, it will follow from (11) that $w_k(b_0, \dots, b_d) = \frac{1}{(a)_k}\frac{k!}{b_0! \cdots b_d!}\prod_{i=0}^d (a_i)_{b_i}$, and hence these weights satisfy
$$\sum_{b_0, \dots, b_d} w_k(b_0, \dots, b_d) = 1.$$
Therefore, once we establish (11), we can claim that $\mathcal{B}_k(a_0, \dots, a_d)$ exists and is the following mixture of extended Dirichlet distributions:
$$\mathcal{B}_k(a_0, \dots, a_d) = \sum_{b_0, \dots, b_d} w_k(b_0, \dots, b_d)\, D(b_0, \dots, b_d).$$
To prove (11), recall that (8) and (9) imply that
$$H^k(T) = (-1)^k\, T \sum_{\alpha_1, \dots, \alpha_k} c_k(\alpha_1, \dots, \alpha_k) \prod_{j=1}^k \Big((j-1)! \sum_{i=0}^d \frac{a_i}{f_i^j}\Big)^{\alpha_j}.$$
Alternatively, applying $H^k$ directly to the product $T = \prod_{i=0}^d f_i^{-a_i}$, we see that $H^k(T)$ is the sum over all possible choices of $k$ applications of the partial derivatives $\frac{\partial}{\partial f_i}$, $i = 0, \dots, d$, to $T$. If $\frac{\partial}{\partial f_i}$ is applied $b_i$ times, $i = 0, \dots, d$, with $\sum b_i = k$, then the result is
$$(-1)^k \prod_{i=0}^d \frac{(a_i)_{b_i}}{f_i^{a_i + b_i}} = (-1)^k\, T \prod_{i=0}^d \frac{(a_i)_{b_i}}{f_i^{b_i}}.$$
Since there are $\binom{k}{b_0, \dots, b_d}$ ways of applying $\frac{\partial}{\partial f_i}$ $b_i$ times for $i = 0, \dots, d$, it follows that
$$H^k(T) = (-1)^k\, T \sum_{b_0, \dots, b_d} \frac{k!}{b_0! \cdots b_d!} \prod_{i=0}^d \frac{(a_i)_{b_i}}{f_i^{b_i}},$$
which proves (11).
Finally, formula (12) is a consequence of Lemma 4 and (10).
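The multinomial identity (11) can be verified symbolically for small $k$: the sketch below (our addition; function names are ours) represents $H^k(T)/T$ as an exact polynomial in the $1/f_i$, computed both by repeated differentiation and by the multinomial formula, and compares the two:

```python
from fractions import Fraction
from math import factorial
from itertools import product

# Exact symbolic check (our sketch) of formula (11). A polynomial in the
# variables 1/f_i is stored as a dict: exponent tuple (b_0,...,b_d) -> coefficient.
def hk_by_differentiation(a, k):
    """H^k(T)/T obtained by applying H = sum_i d/df_i to T = prod f_i^{-a_i},
    using d/df_i (T * prod f_j^{-b_j}) = -(a_i + b_i) f_i^{-1} * T * prod f_j^{-b_j}."""
    q = {tuple([0] * len(a)): Fraction(1)}
    for _ in range(k):
        new = {}
        for b, c in q.items():
            for i, ai in enumerate(a):
                bi = list(b)
                bi[i] += 1
                key = tuple(bi)
                new[key] = new.get(key, Fraction(0)) - (ai + b[i]) * c
        q = new
    return q

def rising(x, k):
    """Pochhammer symbol (x)_k."""
    p = Fraction(1)
    for i in range(k):
        p *= x + i
    return p

def hk_by_formula(a, k):
    """H^k(T)/T from (11): (-1)^k sum over b_0+...+b_d = k of the multinomial
    coefficient times prod_i (a_i)_{b_i} / f_i^{b_i}."""
    out = {}
    for b in product(range(k + 1), repeat=len(a)):
        if sum(b) != k:
            continue
        coef = Fraction(factorial(k))
        for bi in b:
            coef /= factorial(bi)
        for ai, bi in zip(a, b):
            coef *= rising(ai, bi)
        out[b] = (-1) ** k * coef
    return out

a = [Fraction(1, 2), Fraction(3), Fraction(2, 5)]   # arbitrary rational parameters
for k in range(1, 5):
    print(k, hk_by_differentiation(a, k) == hk_by_formula(a, k))
```

Each line should print `True`, confirming (11) exactly for $k = 1, \dots, 4$ at these parameters.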
Comments. The complexity of $H^k(T)$ makes the general combinatorial description of $\mathcal{B}_k(a_0, \dots, a_d)$ difficult. However, it is worthwhile to write the details when $d = 1$ for small values of $k$. Observe first that if $B = (1-B_1, B_1) \sim D(b_0, b_1)$, or $E(\langle f, B\rangle^{-b_0-b_1}) = f_0^{-b_0} f_1^{-b_1}$, then $B_1 \sim \delta_0$ if $b_1 = 0$, $B_1 \sim \delta_1$ if $b_0 = 0$, and $B_1 \sim \beta(b_1, b_0)$ if $b_0, b_1 > 0$. As a consequence, if $B = (1-B_1, B_1)$ has the quasi Bernoulli distribution $\mathcal{B}_k(a_0, a_1)$, the distribution of $B_1$ has the form
$$B_1 \sim \frac{(a_0)_k}{(a)_k}\,\delta_0(dx) + g_k(x)\,dx + \frac{(a_1)_k}{(a)_k}\,\delta_1(dx)$$
where the function $g_k$ is zero outside of $(0,1)$. The first values of $g_k(x)$ on $(0,1)$ are the following:
$$g_1(x) = 0, \qquad g_2(x) = \frac{2 a_0 a_1}{a(a+1)}, \qquad g_3(x) = \frac{6 a_0 a_1}{(a)_3}\big(a_0 + 1 + (a_1 - a_0)x\big).$$
Theorem 6. Let $a_0, \dots, a_d > 0$ with $a = \sum_{i=0}^d a_i$. Fix a positive integer $k$. Consider three independent random variables $X \sim D(a_0, \dots, a_d)$, $Y \sim \beta(k, a)$ and $B \sim \mathcal{B}_k(a_0, \dots, a_d)$. Then $X \sim (1-Y)X + YB$.
Proof. It is quite similar to the proof of Theorem 1. Denote $X' = (1-Y)X + YB$ and take $f_i > 0$. Then
$$E(\langle f, X'\rangle^{-a-k}) = E\bigg(\langle f, X\rangle^{-a-k}\Big[1 - Y\frac{\langle f, X-B\rangle}{\langle f, X\rangle}\Big]^{-a-k}\bigg) \qquad (14)$$
$$= E\bigg(\langle f, X\rangle^{-a-k}\, E\bigg(\Big[1 - Y\frac{\langle f, X-B\rangle}{\langle f, X\rangle}\Big]^{-a-k}\,\Big|\, X, B\bigg)\bigg) \qquad (15)$$
$$= E\bigg(\langle f, X\rangle^{-a-k}\Big[1 - \frac{\langle f, X-B\rangle}{\langle f, X\rangle}\Big]^{-k}\bigg) = E\big(\langle f, X\rangle^{-a}\langle f, B\rangle^{-k}\big) \qquad (16)$$
$$= E(\langle f, X\rangle^{-a-k}). \qquad (17)$$
Step (16) comes from the fact that $E((1-tY)^{-a-k}) = (1-t)^{-k}$, applied to $t = \langle f, X-B\rangle/\langle f, X\rangle < 1$; this fact follows from Lemma 4 part 2 applied to $d = 1$, $(1-Y, Y) \sim D(a, k)$, $f_0 = 1$ and $f_1 = 1-t$. Step (17) comes from the independence of $X$ and $B$ together with (12), and Lemma 5 then gives $X' \sim X$.
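For $d = 1$ and $k = 2$, Theorem 6 can be checked by simulation using the explicit form of $\mathcal{B}_2(a_0, a_1)$, the mixture $\frac{1}{a(a+1)}\big(a_0(a_0+1)\delta_0 + 2a_0a_1\,\mathrm{Unif}(0,1) + a_1(a_1+1)\delta_1\big)$ for $B_1$. The following sketch (our illustration, with the arbitrary choice $a_0 = 2$, $a_1 = 3$) compares moments of $X_1$ and of $(1-Y)X_1 + YB_1$:

```python
import random

# Monte Carlo sketch (our illustration) of Theorem 6 for d = 1, k = 2:
# X1 ~ beta(a1, a0), Y ~ beta(2, a), B1 ~ B_2(a0, a1); then
# Z = X1(1-Y) + B1*Y should again be beta(a1, a0).
random.seed(4)
a0, a1 = 2.0, 3.0              # assumed example parameters
a = a0 + a1
w = [a0 * (a0 + 1), 2 * a0 * a1, a1 * (a1 + 1)]   # mixture weights (unnormalized)
N = 300_000
mom_x = [0.0] * 3
mom_z = [0.0] * 3
for _ in range(N):
    x = random.betavariate(a1, a0)
    y = random.betavariate(2.0, a)
    c = random.choices([0, 1, 2], weights=w)[0]   # pick a mixture component
    b = 0.0 if c == 0 else (1.0 if c == 2 else random.random())
    z = x * (1 - y) + b * y
    for n in range(3):
        mom_x[n] += x ** (n + 1) / N
        mom_z[n] += z ** (n + 1) / N
for n in range(3):
    print(f"E(X1^{n+1}) = {mom_x[n]:.3f}  vs  E(Z^{n+1}) = {mom_z[n]:.3f}")
```

The printed pairs should agree up to Monte Carlo error; for instance both first moments should be near $a_1/a = 0.6$.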
5 References

[1] Gergely Ambrus, Péter Kevei and Viktor Vígh (2012) 'The diminishing segment process.' Statist. Probab. Letters 82, 191-195.

[2] Octavio Arizmendi and Víctor Pérez-Abreu (2010) 'On the non-classical infinite divisibility of power semicircle distributions.' Commun. Stoch. Anal. 4, 161-178.

[3] Jean-François Chamayou and Gérard Letac (1991) 'Explicit stationary sequences for compositions of random functions and products of random matrices.' J. Theoret. Probab. 4, 3-36.

[4] Jean-François Chamayou and Gérard Letac (1994) 'A transient random walk on stochastic matrices with Dirichlet distributions.' Ann. Probab. 22, 424-430.

[5] Charles M. Goldie and Ross A. Maller (2000) 'Stability of perpetuities.' Ann. Probab. 28, 1195-1218.