HAL Id: hal-01950665
https://hal.archives-ouvertes.fr/hal-01950665
Submitted on 11 Dec 2018
Beta and Dirichlet sub-Gaussianity
Olivier Marchal, Julyan Arbel
Olivier Marchal, Univ. Lyon, Univ. Monnet, Institut Camille Jordan, France
Julyan Arbel, Inria, Mistis, Grenoble, France,
www.julyanarbel.com
Abstract
• Optimal proxy variance $\sigma^2_{\mathrm{opt}}$ for the sub-Gaussianity of the Beta distribution, improving on the recent conjecture $\sigma_0^2$ by Elder (2016): $\sigma^2_{\mathrm{opt}} \le \sigma_0^2 = \frac{1}{4(\alpha+\beta+1)}$.
• Provide different proof techniques for
(i) symmetric case ($\alpha = \beta$): direct coefficient comparison of entire series
(ii) non-symmetric case ($\alpha \neq \beta$): ordinary differential equation satisfied by the moment-generating function, aka the confluent hypergeometric function
• Derive optimal proxy variance for Dirichlet.
Sub-Gaussianity & optimal proxy variance
• A random variable $X$ with finite mean $\mu = E[X]$ is sub-Gaussian if there is a positive number $\sigma^2$ (called a proxy variance) such that:
$$E[\exp(x(X - \mu))] \le \exp\left(\frac{x^2\sigma^2}{2}\right) \quad \text{for all } x \in \mathbb{R}. \tag{1}$$
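• If $X$ is sub-Gaussian, the optimal proxy variance is:
$$\sigma^2_{\mathrm{opt}}(X) = \min\{\sigma^2 \ge 0 \text{ such that } X \text{ is } \sigma^2\text{-sub-Gaussian}\}.$$
The variance is a lower bound on the optimal proxy variance: $\mathrm{Var}[X] \le \sigma^2_{\mathrm{opt}}(X)$ (compare the second-order terms in $x$ on both sides of (1)). When $\sigma^2_{\mathrm{opt}}(X) = \mathrm{Var}[X]$, $X$ is said to be strictly sub-Gaussian.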
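As a small illustration of definition (1) and of strict sub-Gaussianity, the sketch below checks the inequality on a grid for $X \sim \mathrm{Bernoulli}(1/2)$, for which $E[\exp(x(X - 1/2))] = \cosh(x/2)$. The Bernoulli example, the grid, and the function names are assumptions made for illustration only; a grid check gives numerical evidence, not a proof.

```python
import math

def centered_mgf_bernoulli_half(x):
    """E[exp(x (X - 1/2))] for X ~ Bernoulli(1/2), which equals cosh(x / 2)."""
    return math.cosh(x / 2.0)

def satisfies_bound_on_grid(centered_mgf, sigma2, xs):
    """Check inequality (1) at every grid point (evidence only, not a proof)."""
    return all(centered_mgf(x) <= math.exp(0.5 * sigma2 * x * x) for x in xs)

# Illustrative grid on [-20, 20]; the range and step are arbitrary choices.
xs = [k / 10.0 for k in range(-200, 201)]

variance = 0.25  # Var[X] for Bernoulli(1/2)

# Proxy variance equal to the variance: the bound holds (strictly sub-Gaussian case).
holds_at_variance = satisfies_bound_on_grid(centered_mgf_bernoulli_half, variance, xs)

# Proxy variance below the variance: the bound must fail for small nonzero x.
holds_below_variance = satisfies_bound_on_grid(centered_mgf_bernoulli_half, 0.2, xs)

print(holds_at_variance, holds_below_variance)
```

The second check fails near $x = 0$, which is the numerical counterpart of the lower bound $\mathrm{Var}[X] \le \sigma^2_{\mathrm{opt}}(X)$.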
Optimal proxy variance illustrated & compared
[Figure: optimal proxy variance compared with the variance and with $\sigma_0^2$; panels titled α+β=1 and α+β=10 have α on the horizontal axis.]
• Top: α varies on the x-axis, with α + β = 1 (left) and α + β = 10 (right).
• Left: α, β ∈ [0.2, 4].
◦ Magenta: $\sigma^2_{\mathrm{opt}}(\alpha, \beta)$
◦ Green: $\mathrm{Var}[\mathrm{Beta}(\alpha, \beta)]$
◦ Black (dots): $\sigma_0^2 = \frac{1}{4(\alpha+\beta+1)}$
Looking for connections with literature
◦ Compare with Bernoulli setting (Berend and Kontorovich, 2013;
Buldygin and Moskvichova, 2013)
◦ Possible links of our non-uniform sub-Gaussian result with transportation inequalities and logarithmic Sobolev inequalities
Optimal proxy variance for the Beta
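Theorem. Beta(α, β) is $\sigma^2_{\mathrm{opt}}(\alpha, \beta)$-sub-Gaussian with:
$$\sigma^2_{\mathrm{opt}}(\alpha, \beta) = \frac{\alpha}{(\alpha+\beta)\,x_0}\left(\frac{{}_1F_1(\alpha+1;\, \alpha+\beta+1;\, x_0)}{{}_1F_1(\alpha;\, \alpha+\beta;\, x_0)} - 1\right)$$
where $x_0$ is the unique solution of the equation
$$\log\big({}_1F_1(\alpha;\, \alpha+\beta;\, x_0)\big) = \frac{\alpha x_0}{2(\alpha+\beta)}\left(1 + \frac{{}_1F_1(\alpha+1;\, \alpha+\beta+1;\, x_0)}{{}_1F_1(\alpha;\, \alpha+\beta;\, x_0)}\right).$$
A simple explicit upper bound on $\sigma^2_{\mathrm{opt}}(\alpha, \beta)$ is $\sigma_0^2(\alpha, \beta) = \frac{1}{4(\alpha+\beta+1)}$:
$$\mathrm{Var}[\mathrm{Beta}(\alpha, \beta)] \le \sigma^2_{\mathrm{opt}}(\alpha, \beta) \le \sigma_0^2(\alpha, \beta) \quad (\text{strict when } \alpha \neq \beta)$$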
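The theorem can be evaluated numerically. The following is a minimal pure-Python sketch, not the authors' implementation: it sums the power series of ${}_1F_1$ directly and finds $x_0$ by bisection. Both sides of the equation vanish at $x = 0$, so the sketch looks for the nonzero root; it assumes $\alpha < \beta$ and searches on $x > 0$ (the case $\alpha > \beta$ reduces to this one because $1 - X \sim \mathrm{Beta}(\beta, \alpha)$). The function names, the starting bracket, and the parameter choice $(\alpha, \beta) = (1, 3)$ are hypothetical.

```python
import math

def hyp1f1(a, b, x, tol=1e-17, max_terms=5000):
    """Kummer's function 1F1(a; b; x) from its power series (sketch, needs 0 < a < b).

    For x < 0, Kummer's transformation 1F1(a; b; x) = e^x 1F1(b - a; b; -x)
    is used so that every summed term is positive.
    """
    if x < 0:
        return math.exp(x) * hyp1f1(b - a, b, -x, tol, max_terms)
    term, total = 1.0, 1.0
    for j in range(max_terms):
        term *= (a + j) / (b + j) * x / (j + 1)
        total += term
        if term < tol * total:
            break
    return total

def beta_optimal_proxy_variance(alpha, beta):
    """Return (x0, sigma2_opt) from the theorem, assuming alpha < beta."""
    if not alpha < beta:
        raise ValueError("this sketch assumes alpha < beta")
    mu = alpha / (alpha + beta)

    def ratio(x):
        return hyp1f1(alpha + 1, alpha + beta + 1, x) / hyp1f1(alpha, alpha + beta, x)

    def f(x):
        # Left-hand side minus right-hand side of the equation defining x0.
        return math.log(hyp1f1(alpha, alpha + beta, x)) - 0.5 * mu * x * (1.0 + ratio(x))

    lo, hi = 0.1, 0.2  # hypothetical starting bracket
    if f(lo) >= 0:
        raise ValueError("starting point is not below the root; choose a smaller one")
    while f(hi) < 0:   # expand the bracket until the sign changes
        lo, hi = hi, 2.0 * hi
    for _ in range(200):  # plain bisection
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    x0 = 0.5 * (lo + hi)
    return x0, mu / x0 * (ratio(x0) - 1.0)

x0, sigma2_opt = beta_optimal_proxy_variance(1.0, 3.0)
print(x0, sigma2_opt)
```

For these parameters the result can be compared with $\mathrm{Var}[X] = 0.0375$ and $\sigma_0^2 = 0.05$, which bracket the optimal proxy variance according to the theorem.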
Sketch of proof
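• $\alpha = \beta$: direct coefficient comparison of the entire series representations of both sides of (1)
• $\alpha \neq \beta$: study the MGF, aka the confluent hypergeometric function or Kummer's function,
$$y(x) \overset{\mathrm{def}}{=} E[\exp(xX)] = {}_1F_1(\alpha;\, \alpha+\beta;\, x) = \sum_{j=0}^{\infty} \frac{\Gamma(\alpha+j)\,\Gamma(\alpha+\beta)}{j!\,\Gamma(\alpha)\,\Gamma(\alpha+\beta+j)}\, x^j,$$
using the ordinary differential equation it satisfies,
$$x\,y''(x) + (\alpha+\beta-x)\,y'(x) - \alpha\,y(x) = 0,$$
via the difference
$$u_t(x) \overset{\mathrm{def}}{=} \exp\left(\mu x + \sigma_t^2 x^2/2\right) - E[\exp(xX)], \qquad \sigma_t^2 = t\,\mathrm{Var}[X] + (1-t)\,\sigma_0^2.$$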
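[Figure: curves of $u_t(x)$ for $x$ from 0 to 3 (vertical scale $\times 10^{-3}$): $u_0$, $u_{t_{\mathrm{opt}}}$, $u_{t_{\mathrm{non\,opt}}}$ and $u_1$, with the point $x_0$ marked.]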
◦ For $t = 0$ ($\sigma_0^2$), the dotted black curve remains > 0
◦ For $t = t_{\mathrm{opt}}$ ($\sigma^2_{\mathrm{opt}}$), the magenta curve has minimum = 0 at $x_0$
◦ For $t = 1$ ($\mathrm{Var}[X]$), the dashed green curve has negative second derivative at $x = 0$, directly negative around 0
◦ For $t_{\mathrm{non\,opt}} \in (t_{\mathrm{opt}}, 1)$, the orange dash-dotted curve is first positive, then negative, and positive again
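The two extreme curves of the family $u_t$ can be reproduced with a short script. This is a sketch under stated assumptions: the parameters $(\alpha, \beta) = (1, 3)$, the grid, and the function names are illustrative choices, and the MGF is summed from its series rather than taken from a library.

```python
import math

def beta_mgf(alpha, beta, x, tol=1e-17, max_terms=5000):
    """E[exp(xX)] for X ~ Beta(alpha, beta), i.e. the series of 1F1(alpha; alpha+beta; x)."""
    if x < 0:
        # 1 - X ~ Beta(beta, alpha), so E[exp(xX)] = e^x E[exp(-x (1 - X))];
        # this keeps every summed term positive.
        return math.exp(x) * beta_mgf(beta, alpha, -x, tol, max_terms)
    a, b = alpha, alpha + beta
    term, total = 1.0, 1.0
    for j in range(max_terms):
        term *= (a + j) / (b + j) * x / (j + 1)
        total += term
        if term < tol * total:
            break
    return total

def u(t, x, alpha, beta):
    """u_t(x) = exp(mu x + sigma_t^2 x^2 / 2) - E[exp(xX)]."""
    mu = alpha / (alpha + beta)
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
    sigma0_sq = 1.0 / (4.0 * (alpha + beta + 1))
    sigma_t_sq = t * var + (1.0 - t) * sigma0_sq
    return math.exp(mu * x + 0.5 * sigma_t_sq * x * x) - beta_mgf(alpha, beta, x)

alpha, beta = 1.0, 3.0  # illustrative parameters with alpha != beta

# t = 0 uses sigma_0^2: u_0 should stay positive away from x = 0.
xs = [k / 2.0 for k in range(-40, 41) if k != 0]
u0_values = [u(0.0, x, alpha, beta) for x in xs]

# t = 1 uses Var[X]: u_1 takes negative values, so Var[X] is not a valid proxy variance here.
u1_at_one = u(1.0, 1.0, alpha, beta)

print(min(u0_values) > 0, u1_at_one < 0)
```

Intermediate values of $t$ can be explored with the same function `u` to look for the tangency described for $t_{\mathrm{opt}}$.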
References
Berend, D. and Kontorovich, A. (2013). On the concentration of the missing mass. Electronic Communications in Probability, 18(3):1–7.
Buldygin, V. V. and Moskvichova, K. (2013). The sub-Gaussian norm of a binary random variable. Theory of Probability and Mathematical Statistics, 86:33–49.
Elder, S. (2016). Bayesian adaptive data analysis guarantees from subgaussianity. arXiv preprint: arXiv:1611.00065.
Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30.