ANNALES DE L'I. H. P., SECTION B

WERNER FLACKE
NORBERT THERSTAPPEN

Bayesian sufficient statistics and invariance

Annales de l'I. H. P., section B, tome 15, no 4 (1979), p. 303-314
<http://www.numdam.org/item?id=AIHPB_1979__15_4_303_0>
© Gauthier-Villars, 1979, all rights reserved.
Bayesian sufficient statistics and invariance
Werner FLACKE and Norbert THERSTAPPEN (*)
Lehrstuhl für Mathematik, Fachrichtung Operations Research,
RWTH Aachen, D-5100 Aachen, Templergraben 57
Ann. Inst. Henri Poincaré, Vol. XV, n° 4, 1979, p. 303-314.
Section B : Calcul des Probabilités et Statistique.
SUMMARY. - The relations between Bayesian sufficient statistics and sufficient statistics are examined, and Bayesian sufficiency is composed with the invariance reduction scheme. An essential sufficient statistic is defined and conditions are given under which it is equivalent to a Bayesian sufficient statistic. Various counterexamples show that the statement "Bayesian sufficiency implies sufficiency" is not true in general.
RÉSUMÉ. - The connection between a sufficient statistic and a Bayesian sufficient statistic is studied. The Bayesian statistical model is composed with the notion of invariant reduction. After the definition of an essentially sufficient statistic, conditions are given under which this statistic is equivalent to a Bayesian sufficient statistic. We show by examples that Bayesian sufficiency does not imply sufficiency.

1. PRELIMINARIES
Let $(X, \mathfrak{A}, P)$ be a basic probability space associated with an observed random variable $X$, where $P = \{P_\theta \mid \theta \in \Theta\}$ is an identifiable family of probability measures on $(X, \mathfrak{A})$; that means: $\theta \neq \theta' \Rightarrow P_\theta \neq P_{\theta'}$.

(*) Mailing address: N. Therstappen, Cyprianusweg 6, D-5100 Aachen.

$\Theta$ can be
Annales de l'Institut Henri Poincaré - Section B - Vol. XV, 0020-2347/1979/303/$ 5.00 © Gauthier-Villars.
assumed to be a metric space. If there is no given metric, take $\rho(\theta_1, \theta_2) = \sup\{\, |P_{\theta_1}(A) - P_{\theta_2}(A)| \mid A \in \mathfrak{A} \,\}$.
Let $\mathfrak{X}$ be the $\sigma$-algebra of Borel sets of $\Theta$ and $\mathbf{H}$ a family of probability distributions on $(\Theta, \mathfrak{X})$. Demand that the families $P$ and $\mathbf{H}$ be dominated by $\sigma$-finite measures $\mu$, $\zeta$ and denote the corresponding versions of the Radon-Nikodym densities by $f(x, \theta)$ and $h(\theta)$ respectively.
According to Bayes' theorem, the posterior density of $\theta$, as a version of the conditional density of $\theta$ given $X = x$, is:
$$h(\theta \mid x) = \frac{f(x, \theta)\, h(\theta)}{\int_\Theta f(x, \theta')\, h(\theta')\, \zeta(d\theta')}\, \chi_{N_H^c}(x) + h(\theta)\, \chi_{N_H}(x)$$
with
$$N_H = \Big\{ x \in X \;\Big|\; \int_\Theta f(x, \theta)\, h(\theta)\, \zeta(d\theta) = 0 \Big\};$$
$N^c$ denotes the complement and $\chi_N$ the indicator function of a set $N$.

The following notation is used in the sequel: $P_H$ a.e. means $P_\theta$ a.e. for $H$-almost all $\theta \in \Theta$; $P_H \times K$ a.e. means $P_\theta \times K$ a.e. for $H$-almost all $\theta \in \Theta$; $H, K \in \mathbf{H} \cup \{\zeta\}$. In the same way $P_H$ and $P_H \times K$ null sets may be defined. Every $\mu \times \zeta$ null set is a $P_H \times \zeta$ null set for $H \in \mathbf{H} \cup \{\zeta\}$.
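The posterior construction above can be sketched numerically. The following minimal example is not from the paper: it assumes a finite $\Theta$ with counting measure as $\zeta$, a hypothetical Bernoulli likelihood and a two-point prior, and falls back to the prior on the null set $N_H$:

```python
# Illustrative sketch (assumptions: finite Theta, counting measure as zeta,
# hypothetical Bernoulli likelihood and two-point prior -- not from the paper).
def posterior(f, h, thetas, x):
    """Posterior {theta: h(theta | x)}; on N_H (zero marginal) return the prior."""
    joint = {t: f(x, t) * h[t] for t in thetas}   # f(x, theta) * h(theta)
    norm = sum(joint.values())                    # integral over Theta
    if norm == 0.0:                               # x lies in the null set N_H
        return dict(h)
    return {t: joint[t] / norm for t in thetas}

f = lambda x, t: t**x * (1 - t)**(1 - x)          # Bernoulli density f(x, theta)
h = {0.2: 0.5, 0.7: 0.5}                          # prior zeta-density h(theta)
post = posterior(f, h, list(h), 1)                # posterior after observing x = 1
```

Observing $x = 1$ shifts mass toward the larger success probability, as the displayed formula predicts.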
2. BAYESIAN SUFFICIENCY

Let $Y$ be a statistic on $(X, \mathfrak{A})$ into $(Y, \mathfrak{B})$. $Y(P_\theta)$, $\theta \in \Theta$, and $Y(\mu)$ are the image measures under the mapping $Y$; denote by $f_Y(y, \theta)$ a version of the density of $Y(P_\theta)$ relative to $Y(\mu)$. Thus the posterior density of $\theta$, given $Y = y$, becomes:
$$h_Y(\theta \mid y) = \frac{f_Y(y, \theta)\, h(\theta)}{\int_\Theta f_Y(y, \theta')\, h(\theta')\, \zeta(d\theta')}\, \chi_{(N_H^Y)^c}(y) + h(\theta)\, \chi_{N_H^Y}(y)$$
with
$$f_Y(Y(x), \theta) = E(f(x, \theta) \mid \mathfrak{A}(Y)) \quad P_\theta \text{ a.e.}$$
for $H$-almost all $\theta \in \Theta$.
Now a Bayesian definition of a sufficient statistic can be given.

DEFINITION 0. - $Y$ on $(X, \mathfrak{A}, P)$ to a measurable space $(Y, \mathfrak{B})$ is called Bayesian sufficient for $\mathbf{H}$, if for all $H \in \mathbf{H}$:
$$h(\theta \mid x) = (h_Y(\theta \mid \cdot) \circ Y)(x) \quad P_H \times \zeta \text{ a.e.}$$
($h_Y \circ Y$ denotes the composition of $h_Y$ and $Y$).
The usual theorem about the relation between the Bayesian and non-Bayesian definitions of sufficiency states: a statistic is Bayesian sufficient for $\mathbf{H}$ if and only if it is sufficient for $P$ [5] [6].
The following example shows that this theorem is not valid for arbitrary $\mathbf{H}$.

EXAMPLE 1. - Let $\theta_0 \in \Theta$ and $\zeta = \varepsilon_{\theta_0}$, where $\varepsilon_{\theta_0}$ is the Dirac measure. Under these conditions every statistic on $(X, \mathfrak{A})$ is Bayesian sufficient for $\mathbf{H}$: $h(\theta)$ is $\zeta$ a.e. determined by $h(\theta_0) = 1$, so $h(\theta \mid x) = 1$ for $\theta = \theta_0$ and $0$ otherwise, $\mu \times \zeta$ a.e.

As is easily shown, the statement "sufficiency implies Bayesian sufficiency" is true in general.
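For well-behaved families $\mathbf{H}$, the cited equivalence between sufficiency and Bayesian sufficiency can be checked numerically in a toy model. A sketch (model and numbers are illustrative, not from the paper) for a Bernoulli pair with the sufficient statistic $Y(x) = x_1 + x_2$:

```python
from itertools import product

# Numerical check (illustrative, not from the paper): for a Bernoulli pair,
# Y(x) = x1 + x2 is sufficient, so the posterior given X = x coincides with
# the posterior given Y = Y(x) -- i.e. Y is Bayesian sufficient here.
thetas = [0.3, 0.6]
prior = {t: 0.5 for t in thetas}

def f(x, t):                      # likelihood of the pair x = (x1, x2)
    return (t**x[0] * (1-t)**(1-x[0])) * (t**x[1] * (1-t)**(1-x[1]))

def fY(y, t):                     # density of Y = X1 + X2 under theta
    return sum(f(x, t) for x in product((0, 1), repeat=2) if sum(x) == y)

def post(like, arg):
    joint = {t: like(arg, t) * prior[t] for t in thetas}
    n = sum(joint.values())
    return {t: joint[t] / n for t in thetas}

for x in product((0, 1), repeat=2):
    px, py = post(f, x), post(fY, sum(x))
    assert all(abs(px[t] - py[t]) < 1e-12 for t in thetas)
```

The loop verifies the defining equality of Definition 0 at every sample point; the counterexamples below show where this mechanism breaks for degenerate priors.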
Because of the following principal difficulty the converse statement can't be shown. Fundamental for the Bayesian statistical model is the existence of prior measures; therefore the equality of the densities in Definition 0 can only be required for $\zeta$-almost all $\theta$. Sufficiency omits prior measures and postulates a version of the conditional probability independent of all $\theta$. The following example demonstrates the conceptual difference.

EXAMPLE 2. - Let $\theta$ be real and $(X_1, X_2)$ independent normal: $N(\theta, 1)$ if $\theta$ is irrational, a different normal law if $\theta$ is rational. With $\zeta = N(0, 1)$, $S = (X_1 + X_2)/2$ is Bayesian sufficient but not sufficient.

This leads to the following definition.

DEFINITION 2. - Given
$(X, \mathfrak{A}, P)$, $P = \{P_\theta \mid \theta \in \Theta\}$, and $(\Theta, \mathfrak{X}, \zeta)$. A statistic $Y : (X, \mathfrak{A})$ to $(Y, \mathfrak{B})$ is called essentially sufficient (ess. sufficient) with respect to $(P, \zeta)$, if for all sets $A \in \mathfrak{A}$, $B \in \mathfrak{B}$,
$$\int_B P(A \mid Y = y)\, Y(P_\theta)(dy) = P_\theta(A \cap Y^{-1}(B))$$
for $\zeta$-almost all $\theta$, where $P(A \mid Y = y)$ is independent of the class of probability measures $P$. Definition 2 is valid for sufficient statistics. If $\Theta$ is countable and $\zeta(\{\theta\}) > 0$ $\forall \theta \in \Theta$, ess. sufficiency implies sufficiency.
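The conceptual gap that Definition 2 addresses admits a small discrete sketch (constructed here, not from the paper): a prior that puts mass 0 on part of $\Theta$ makes even a constant statistic Bayesian sufficient, while ordinary sufficiency fails for the full family $P$:

```python
# Discrete analogue (constructed for illustration) of the phenomenon behind
# Definition 2: a prior with a zeta-null piece of Theta lets a data-discarding
# statistic be Bayesian sufficient although sufficiency fails for P itself.
thetas = ["a", "b"]
prior = {"a": 1.0, "b": 0.0}          # the set {b} is zeta-null
# sample space {0, 1, 2}; P_a uniform, P_b not
P = {"a": [1/3, 1/3, 1/3], "b": [0.5, 0.25, 0.25]}
Y = lambda x: 0                        # constant statistic: discards the data

def post_given_x(x):
    joint = {t: P[t][x] * prior[t] for t in thetas}
    n = sum(joint.values())
    return {t: joint[t] / n for t in thetas}

# Posterior given X = x equals the prior for every x, and so does the
# posterior given Y(x): the constant Y is Bayesian sufficient for this prior.
assert all(post_given_x(x) == prior for x in range(3))
# But Y is not sufficient for P: a theta-free conditional of X given Y would
# need P_a(X=0) = 1/3 and P_b(X=0) = 0.5 to coincide, and they do not.
assert P["a"][0] != P["b"][0]
```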
To prove that a Bayesian sufficient statistic is ess. sufficient, three conditions are imposed.

CONDITION I. For all $\Theta' \subset \Theta$ with $\zeta(\Theta') = 1$ let $P' = \{P_\theta \mid \theta \in \Theta'\}$ satisfy $P \sim P'$ (this condition insures that $N_H$ is a $\mu$-null set; $P \sim P'$ signifies $P \ll P'$ and $P' \ll P$).
Choose the $\zeta$-densities of the probability measures $H \in \mathbf{H}$ so that there exists a countable covering of $\Theta$ by the sets $\Theta_H = \{\theta \in \Theta \mid h(\theta) > 0\}$, $H \in \mathbf{H}^*$, where $\mathbf{H}^*$ is countable. Now distinguish two cases. II: one $\Theta_H$ forms the covering; III: the covering is formed by several $\Theta_H$.
CONDITION II. There exists an $H \in \mathbf{H}$ so that a version of the density of $H$ is positive ($h(\theta) > 0$) $\forall \theta \in \Theta$.

CONDITION III. There is a countable subset $\mathbf{H}^* \subset \mathbf{H}$ to which corresponds a set $\mathfrak{H}$ consisting of the versions of the $\zeta$-densities of $H$, $H \in \mathbf{H}^*$, with
i) $\exists(\theta_0 \in \Theta)\ \forall(h \in \mathfrak{H})$: $f(x, \theta_0) > 0$ $\mu$ a.e. and $h(\theta_0) > 0$;
ii) $\forall(\theta \in \Theta)\ \exists(h \in \mathfrak{H})$: $h(\theta) > 0$.
Now we can prove

THEOREM 3. - a) Let $\mathbf{H}$ be a family of probability distributions on $(\Theta, \mathfrak{X})$; then the measurable mapping $Y : (X, \mathfrak{A}) \to (Y, \mathfrak{B})$ is Bayesian sufficient for $\mathbf{H}$ if it is ess. sufficient for $P$.
b) If either conditions I and II or conditions I and III hold, then $Y$ is ess. sufficient for $P$ if it is Bayesian sufficient for $\mathbf{H}$.

Proof. - a) (analogously to Zacks [6]).
b) (i) Assume I and II. Let $H \in \mathbf{H}$ be the measure with $\zeta$-density $h > 0$. The equality of Definition 0 then holds for all $(x, \theta) \in N_H^c \times \Theta$ up to a $P_H \times \zeta$ null set. It follows that $h(\theta \mid x)$ factors through $Y$, which means that $\forall (\theta \in \Theta')$ with $\zeta(\Theta') = 1$ the factorization of Definition 2 holds for $\zeta$-almost all $\theta \in \Theta'$. Hence, because of I, it follows for $\zeta$-almost all $\theta \in \Theta$.

(ii) Assume I and III. III ii) states that the sets $\Theta_h = \{\theta \mid h(\theta) > 0\}$, $h \in \mathfrak{H}$, form a countable partition of $\Theta$. Hence $h(\theta \mid x)$ can be written as $W(y, \theta) \circ Y(x)$ on each $\Theta_h$. It follows as in the proof of part b (i) of this theorem that the factorization holds for $\zeta$-almost all $\theta$, with $K(x) = f(x, \theta_0)$.
Example 2 showed that Theorem 3 b) isn't valid without assumptions. In particular, condition I can't be dispensed with without substitute. The next two examples show the same is true for conditions II, III ii) and III i).
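A finite analogue (constructed for illustration; the paper's Examples 4 and 5 work with countable $\Theta$ and the real line) of the mechanism the next examples exploit: when every prior in $\mathbf{H}$ is a Dirac measure, the posterior equals the prior whatever the data, so every statistic is Bayesian sufficient for $\mathbf{H}$, while collapsing part of the sample space destroys ordinary sufficiency:

```python
# Sketch in the spirit of Examples 4-5 (finite analogue, constructed here):
# with only Dirac priors, EVERY statistic is Bayesian sufficient, yet gluing
# the set M = {2, 3} to one point destroys ordinary sufficiency.
import fractions
F = fractions.Fraction
X = [0, 1, 2, 3]
P = {1: [F(1, 2), F(1, 2), F(0), F(0)],      # f(x, 1) = 0 on M = {2, 3}
     2: [F(1, 4), F(1, 4), F(1, 4), F(1, 4)],
     3: [F(1, 8), F(1, 8), F(1, 2), F(1, 4)]}
M = {2, 3}
YM = lambda x: 2 if x in M else x             # glue M to the point x0 = 2

def posterior(prior, x):
    joint = {t: P[t][x] * prior[t] for t in P}
    n = sum(joint.values())
    return {t: v / n for t, v in joint.items()} if n else dict(prior)

# Dirac priors: posterior == prior for every x, so h(.|x) = h_Y(.|YM(x)).
for t0 in P:
    dirac = {t: F(int(t == t0)) for t in P}
    assert all(posterior(dirac, x) == dirac for x in X)

# Ordinary sufficiency fails: a theta-free version of P(X = 2 | YM) must be
# one constant b on M, but P_2 forces b = 1/2 while P_3 forces b = 2/3.
b2 = P[2][2] / (P[2][2] + P[2][3])
b3 = P[3][2] / (P[3][2] + P[3][3])
assert b2 != b3
```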
EXAMPLE 4. - Let $\Theta$ be a countable set, say $\Theta = \{\theta_i \mid i \in \mathbb{N}\}$. These assumptions fulfil the conditions of I, but the assertions of Example 2 are preserved (the calculation is parallel to that of Example 2). With the densities modified, I and III ii) but not III i) are fulfilled, and again every statistic on $(X, \mathfrak{A})$ is Bayesian sufficient for $\mathbf{H}$.

In Example 4 the requirement that there exists a $\theta_0 \in \Theta$ so that for all $h \in \mathfrak{H}$: $h(\theta_0) > 0$ was not correct, while the condition $f(x, \theta_0) > 0$ $\mu$ a.e. can be thoroughly fulfilled. The next example shows that even if this requirement is correct and conditions I, III i) hold, $f(x, \theta_0) > 0$ can't be dispensed with without substitute.
EXAMPLE 5. - Let $\Theta$, $\zeta$ be as in Example 4, $(X, \mathfrak{A}) = (\mathbb{R}^1, \mathfrak{B}^1)$ the Lebesgue-Borel measurable space and $\lambda_1$ the Lebesgue measure on $(\mathbb{R}^1, \mathfrak{B}^1)$. Let $M \in \mathfrak{A}$ with $\mu(M) > 0$ and let $f(x, \theta_1) = 0$ on $M$. Define
$$Y_M(x) = \begin{cases} x, & x \notin M \\ x_0, & x \in M \end{cases} \qquad (x_0 \in M).$$
Then
$$h(\theta_j \mid X = x) = h_{Y_M}(\theta_j \mid Y = \cdot) \circ Y_M(x) \quad \mu \times \zeta \text{ a.e.,}$$
which means $Y_M$ is Bayesian sufficient for $\mathbf{H}$. Here I, III ii) and $h(\theta_0) > 0$ $\forall (h \in \mathfrak{H})$ (choose $\theta_0 = \theta_1$) are fulfilled. This doesn't include sufficiency; $P$ can still be chosen arbitrarily. Let $P_{\theta_i} = P_i$ be a measure with $\lambda_1$-density $f(x, i)$. Define $M = [2, 4]$, $x_0 = 3$, $f(x', 1) = 0$ $\forall x' \in M$. $Y_M$ generates the $\sigma$-subalgebra spanned by $\{M\}$ and the trace of $\mathfrak{B}^1$ in $M^c$. Suppose $Y_M$ is a sufficient statistic for $P$. For each $B \in \mathfrak{B}^1$ denote by $P(B \mid Y_M)$ a version of the conditional expectation of $\chi_B$ given $Y_M$ which is independent of $P_i$. Because of the $\sigma(Y_M)$ measurability, $P(B \mid Y_M)$ must be constant on $M$. Let $B = [2, 3]$ and $P(B \mid Y_M) = b$ on $M$; then $b$ would have to equal $P_i(B)/P_i(M)$ for every $i$ with $P_i(M) > 0$, which the chosen densities rule out. That's a contradiction to sufficiency and likewise to ess. sufficiency, for $\Theta$ is countable and $\zeta(\{\theta\}) > 0$ $\forall \theta \in \Theta$.

3. BAYESIAN INVARIANTLY SUFFICIENT STATISTICS

Now invariance
properties are needed. Let $G$ be a group of one-to-one measurable transformations of $X$ onto $X$ and let $P = \{P_\theta \mid \theta \in \Theta\}$ be invariant under $G$. The invariance of $P$ means that $\forall (g \in G)$ and $\forall (\theta \in \Theta)$ there exists a unique $\theta' \in \Theta$ ($P$ is identifiable, so that the uniqueness of $\theta'$ will be satisfied) such that the distribution of $g(X)$ is given by $P_{\theta'}$ whenever the distribution of $X$ is given by $P_\theta$. The parameter $\theta'$, uniquely determined by $g$ and $\theta$, is denoted by $\bar g \theta$. Then $\bar G = \{\bar g \mid g \in G\}$ is a group of measurable transformations of $\Theta$ onto itself. The group of transformations induces a partition of $X$ into orbits, where an orbit of $x_0$, $x_0 \in X$, relative to $G$ is the set $\{g x_0 \mid g \in G\}$.
A statistic $Y$ on $X$ is invariant if it is constant on orbits, i.e. $Y(gx) = Y(x)$ $\forall (g \in G)$. An invariant function $Y$ on $X$ is called maximal invariant if $Y(x) = Y(x')$ implies the existence of a $g \in G$ for which $x = g x'$. Maximal invariants always exist and all invariants are functions of a maximal invariant; if $Y$ is invariant under $G$ then its distribution depends only on a maximal invariant function $\alpha$ on $\Theta$ under $\bar G$, with range $\Gamma = \alpha(\Theta)$.
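The orbit and maximal-invariant notions can be made concrete for the translation group on $\mathbb{R}^n$ (an illustration, not from the paper):

```python
# Illustration (not from the paper): for the translation group g_c(x) = x + c
# on R^n, the vector of differences U(x) = (x_1 - x_n, ..., x_{n-1} - x_n) is
# maximal invariant: U is constant on orbits, and equal U-values differ by a
# common shift. Any invariant statistic then factors through U.
def U(x):
    return tuple(xi - x[-1] for xi in x[:-1])

g = lambda x, c: tuple(xi + c for xi in x)    # group action: shift by c

x = (1.0, 4.0, 2.5)
assert U(g(x, 7.25)) == U(x)                  # invariance under the action
x2 = g(x, -3.0)
assert U(x2) == U(x)                          # same orbit, same value

# an invariant statistic, e.g. the sum of squared deviations, factors through U:
mean = lambda v: sum(v) / len(v)
ss = lambda v: sum((vi - mean(v))**2 for vi in v)
assert abs(ss(x) - ss(x2)) < 1e-12
```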
Let $U : (X, \mathfrak{A}) \to (U, \mathfrak{U})$ be a maximal invariant statistic on $X$ under $G$ and let $f_U(u, \theta)$ be the density of $U(P_\theta)$. Define $U(P_\theta) = P_\gamma$, $\gamma = \alpha(\theta)$, and $f_U(u, \theta) = f_U(u, \gamma)$. Hence
$$E(f(x, \theta) \mid \mathfrak{A}(U)) = f_U(\cdot, \gamma) \circ U(x) \quad P_\theta \text{ a.e.,}$$
where $E(f(x, \theta) \mid \mathfrak{A}(U))$ denotes a version of the conditional expectation given $\mathfrak{A}(U)$. Following from the factorization of conditional expectations [6], this version only depends on $\gamma$.

DEFINITION 6. - The posterior density of $\gamma$, given $U = u$, is
$$h_\alpha(\gamma \mid u) = \frac{f_U(u, \gamma)\, h_\alpha(\gamma)}{\int_\Gamma f_U(u, \gamma')\, h_\alpha(\gamma')\, \alpha(\zeta)(d\gamma')},$$
with $h_\alpha$ a version of the $\alpha(\zeta)$-density of $H_\alpha = \alpha(H)$.
Applying the definition of the posterior density directly, we would receive a different denominator; nevertheless the following identity holds.

LEMMA 7. -
$$h(\theta \mid U = u) = h_\alpha(\alpha(\theta) \mid U = u) \quad \mu \times \zeta \text{ a.e.;}$$
that means $P_\gamma \times \zeta$ a.e. for $\alpha(H)$-almost all $\gamma \in \Gamma$.

Proof. -

Let $Y : (X, \mathfrak{A}) \to (Y, \mathfrak{B})$ be a Bayesian sufficient statistic for which $Y(gx) = Y(gx')$ whenever $Y(x) = Y(x')$; then $G$ induces a group $G_Y$ of transformations $g_Y$ on $Y$. Here $g_Y$ is defined by $g_Y y = Y(gx)$ $\forall (y \in Y)$ and all $x$ satisfying $Y(x) = y$. If $Z$ is invariant on $Y$ under $G_Y$, then $Z \circ Y$ is invariant on $X$ under $G$. With the following definition the possible reduction of data is given.
DEFINITION 8. - A function $V : (X, \mathfrak{A}) \to (V, \mathfrak{D})$ is invariantly Bayesian sufficient for $\mathbf{H}$ under $G$, if
i) $V$ is invariant under $G$ ($V = W \circ U$);
ii) $\forall (H \in \mathbf{H})$: $h_\alpha(\gamma \mid V = v) \circ W(u) = h_\alpha(\gamma \mid U = u)$ $P_\gamma \times \alpha(\zeta)$ a.e., with $h_\alpha(\gamma \mid V = v) = h_W(\gamma \mid W = v)$.

The following scheme represents the routes of reduction.
Definition 8 states that $V$ is Bayesian sufficient for $\mathbf{H}_\alpha$, where $\alpha : \Theta \to \Gamma$ is a maximal invariant function under $\bar G$. Regard $\Gamma$ as a quotient space formed by the $\bar G$ orbits, equipped with the quotient topology, which is the finest topology keeping $\alpha$ continuous. The purpose is to transfer condition I from $\Theta$ to $\Gamma$, but the quotient topology is too coarse in general.
LEMMA 9. - If condition I is valid for $\zeta$, it is valid too for $\alpha(\zeta)$.

Proof. - Assume $\Gamma' \subset \Gamma$ with $\alpha(\zeta)(\Gamma') = 1$; then $\{P_\gamma \mid \gamma \in \Gamma'\}$ and $U(P)$ are equivalent. Let $N_U$ be a set with $P_\gamma(N_U) = 0$ for all $\gamma \in \Gamma'$; then $P_\theta(U^{-1}(N_U)) = 0$ for all $\theta \in \alpha^{-1}(\Gamma')$. Thus $1 = \alpha(\zeta)(\Gamma') = \zeta(\alpha^{-1}(\Gamma'))$, and hence, by condition I for $\zeta$, $P_\theta(U^{-1}(N_U)) = 0$ for all $\theta$, and finally $P_\gamma(N_U) = 0$ for all $\gamma \in \Gamma$.

LEMMA 10. - If condition II holds, it also holds for $\mathbf{H}_\alpha$ and $\alpha(\zeta)$.

Proof. - Let $H \in \mathbf{H}$ fulfil condition II with $\zeta$-density $h > 0$. The induced function $h_\alpha$, with $h_\alpha \circ \alpha > 0$ $\zeta$ a.e., is also a density of $H_\alpha$ relative to $\alpha(\zeta)$.
JEXAMPLE 11. It is
impossible
to transfer condition III in the same way.Be 0 = r
== { [n - 1, n [/n
and letbe ,
theÀ1-continuous
pro-bability
measure withdensity
c, c = withc ~
= 1 andc~ > 0 for all i E ~I. Let
Hi
be the measure with’-density
Then condition III
ii)
is fulfilled for H while the item x{o}synthesizes
IIIi) (given f (x, 0)
>0). Transfering
the conditions to theimage
space we get IIIii)
withhia
= but IIIi)
isn’t valid anylonger
because ofTHEOREM 12. - Let V :
(X, 9T) -~ (V, ~)
be aninvariantly
ess. sufficientstatistic for P under
G,
then V isinvariantly Bayesian
sufficient for Hunder G.If conditions I and II are
fulfilled,
theninvariantly Bayesian sufficiency implies invariantly
ess.sufficiency.
Vol. XV, no 4-1979.
312 W. FLACKE AND N. THERSTAPPEN
Proof -
V isinvariantly Bayesian
sufficient for H under G if andonly
if W is
Bayesian
sufficient forHa.
Let V beinvariantly
ess. sufficient for P under G. Then W is ess. sufficient forU(P).
Hence W isBayesian
sufficientfor Now let W be
Bayesian
sufficient for ThenHa
fulfils I and II.Hence W is ess. sufficient for
U(P).
The theorem of Stein holds under several conditions (see [4]). One of these conditions states:

CONDITION IV. There is a set $A_0 \in \mathfrak{A}(V)$ of $P$ measure 1 and an invariant conditional probability distribution $Q$ on $\mathfrak{A} \times X$ with $Q(A, x) = 0$ $\forall (A \in \mathfrak{A})$ and $x \notin A_0$. The invariance of the conditional probability distribution means:
$$Q(gA, gx) = Q(A, x) \quad \forall (g \in G),\ A \in \mathfrak{A},\ x \in X.$$

Now we get a Bayesian version for the theorem of Stein.

THEOREM 13. - Let conditions I, II, IV or I, III, IV be valid; then $Z \circ Y$ is invariantly Bayesian sufficient for $\mathbf{H}$ under $G$ ((*) condition IV can be substituted by any of the other conditions of [4]).

Proof. - $Y$ is ess. sufficient for $P$. It follows that $Z \circ Y$ is invariantly ess. sufficient for $P$ under $G$ [4]. Hence $Z \circ Y$ is invariantly Bayesian
sufficient.

EXAMPLE 14. - Let $(X, \mathfrak{A}) = (\mathbb{R}^n, \mathfrak{B}^n)$; $X = (X_1, \ldots, X_n)$; $X_1, \ldots, X_n$ independent identically distributed random variables with distribution $N(\theta_1, \theta_2)$; we define $\Theta = \mathbb{R} \times \mathbb{R}_+$. Hence $Z(y) = y_2$. Let $\alpha : \Theta \to \Gamma$, $\Gamma = \mathbb{R}_+$, be defined by $\alpha(\theta_1, \theta_2) = \theta_2$; then $\alpha$ is continuous (and IV is fulfilled); $\zeta$ must be a probability measure on $\Theta$ with a positive $\lambda \times \lambda$-density (II fulfilled). Let $\mathbf{H} = \{H \mid H$ a probability measure on $\Theta$ with $\zeta$-density $h(\theta_1, \theta_2) > 0\}$. With some calculations it follows that the maximal invariant $U$ (a version given by the differences $X_i - X_n$, $i < n$) has the distribution $N(0, \theta_2 M)$, where
$$M = \begin{pmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & & \ddots & \vdots \\ 1 & 1 & \cdots & 2 \end{pmatrix}$$
is a $(n-1) \times (n-1)$ matrix. Hence $u' M^{-1} u / \theta_2$ is $\chi^2_{n-1}$-distributed, and $Y_2$ has the $\lambda_1$-density $f_{Y_2}(y_2; \theta_2)$. It holds $y_2 = u' M^{-1} u$ with
$$M^{-1} = \frac{1}{n}\begin{pmatrix} n-1 & -1 & \cdots & -1 \\ -1 & n-1 & \cdots & -1 \\ \vdots & & \ddots & \vdots \\ -1 & -1 & \cdots & n-1 \end{pmatrix}$$
and finally $h_\alpha(\theta_2 \mid U = u) = h_\alpha(\theta_2 \mid Y_2 = y_2)$.
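The matrix identities in Example 14 can be verified numerically. The sketch below assumes $U(x) = (x_1 - x_n, \ldots, x_{n-1} - x_n)$, so that $\mathrm{Cov}(U) = \theta_2 M$ for $M$ with 2 on the diagonal and 1 elsewhere; it checks $M^{-1} = I - J/n$ and that $u' M^{-1} u$ equals $\sum_i (x_i - \bar x)^2$:

```python
# Numerical check (small n) of the matrix claims of Example 14, assuming
# U(x) = (x_1 - x_n, ..., x_{n-1} - x_n): M has 2 on the diagonal and 1 off
# it, M^{-1} = I - J/n, and u' M^{-1} u = sum_i (x_i - xbar)^2.
n = 4
M = [[2 if i == j else 1 for j in range(n - 1)] for i in range(n - 1)]
Minv = [[(1 if i == j else 0) - 1.0 / n for j in range(n - 1)]
        for i in range(n - 1)]

# M @ Minv should be the identity matrix
prod = [[sum(M[i][k] * Minv[k][j] for k in range(n - 1)) for j in range(n - 1)]
        for i in range(n - 1)]
assert all(abs(prod[i][j] - (i == j)) < 1e-12
           for i in range(n - 1) for j in range(n - 1))

# the quadratic form in U equals the sum of squared deviations
x = [0.5, 2.0, -1.0, 3.5]
u = [xi - x[-1] for xi in x[:-1]]
quad = sum(u[i] * Minv[i][j] * u[j]
           for i in range(n - 1) for j in range(n - 1))
xbar = sum(x) / n
ss = sum((xi - xbar)**2 for xi in x)
assert abs(quad - ss) < 1e-9
```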
REFERENCES

[1] D. BASU, On Sufficiency and Invariance. In: Essays in Probability and Statistics, Chapel Hill, N. C., University of North Carolina Press, 1970, p. 61-84.
[2] H. BAUER, Probability Theory and Elements of Measure Theory. New York, Holt, Rinehart and Winston, 1972.
[3] N. BOURBAKI, Éléments de Mathématique. Intégration, Chap. 1-4. Paris, Hermann, 1965.
[4] W. J. HALL, R. A. WIJSMAN, J. K. GHOSH, The Relationship between Sufficiency and Invariance with Applications in Sequential Analysis. Ann. Math. Statist., t. 36, 1965, p. 575-614.
[5] H. RAIFFA, R. SCHLAIFER, Applied Statistical Decision Theory. Cambridge, Mass., The M. I. T. Press, 1961.
[6] S. ZACKS, The Theory of Statistical Inference. New York, J. Wiley, 1971.

(Manuscript received May 3, 1978)