WORKING PAPER
ALFRED P. SLOAN SCHOOL OF MANAGEMENT

Bayesian Parametric Models

Peter J. Kempthorne
Max B. Mendel

Sloan Working Paper #3021-89

MASSACHUSETTS INSTITUTE OF TECHNOLOGY
50 MEMORIAL DRIVE
CAMBRIDGE, MA 02139

May 30, 1989
1 Introduction
From a Bayesian perspective, uncertainty in a random quantity, say X, is modeled in terms of P(·), a (joint) probability distribution of (all the components of) X. Parametric models are used as a convenient way to specify such a joint distribution; see, e.g., Lindley (1971), Box and Tiao (1973), or Berger (1985). For example, let θ denote an unknown parameter taking on a value in Θ, the parameter space, and for each θ ∈ Θ, let the conditional distribution of X given θ be specified by P(· | θ). These distributions can be given in terms of densities which, as functions of θ for fixed X, are known as the likelihood function. The unconditional distribution of X is then given by specifying π(·), a probability distribution on Θ.
While apparently easier than specifying the unconditional distribution of X directly, this approach raises two problems. First, how should one choose the likelihood function? Common practice leads to those likelihood functions which have been traditional choices in the past or which satisfy exogenous needs for mathematical tractability or simplicity. This approach is natural for investigators comfortable with frequentist probability models who assume that there is some fixed, unknown parameter which fully characterizes the distribution of the observed random variables. However, this raises a second problem: the specification of the prior distribution. Bayesian approaches are often abandoned in specific problems because specifying the prior is so daunting. Much of the difficulty is due to the abstract nature of the unknown parameter.

In this paper, we address the problem of defining parameters and specifying parametric probability models. Section 2 provides a formal development of Bayesian parametric models. For any class P of probability models for specific data X, a parameter is defined as any measurable function of the data which, as a conditioning variable, leads to consensus beliefs for the randomness in the data; that is, the conditional distribution of X given the parameter does not vary with the model P ∈ P. The prior distribution on the parameter then has a simple definition as the marginal distribution on collections of outcomes of X defined by values of the parameter, the measurable function of X. Similarly, the likelihood function for the parameter has a natural definition in terms of conditional expectations of (measurable) functions of the data given the σ-field on the sample space of the data generated by the parameter. With definitions of a parameter, its prior distribution, and its likelihood function, we are able to construct the "Bayesian parametric representation" of any Bayesian probability model. This is akin to a generalization of de Finetti's representation when modeling data other than as an infinite sequence of exchangeable random variables.

Section 2 also proposes two useful concepts for Bayesian modeling. The first, "minimal parameters," characterizes parametrizations of a class of probability models which are most parsimonious in the sense that no unobservable quantities are needed to specify the models and the extent of prior introspection necessary to identify specific models is minimized. The second concept, a "maximal class," comprises the totality of Bayesian probability models consistent with a given parametrization. The robustness of a class of Bayesian parametric probability models is determined in part by the range of subjective beliefs which are consistent with the parametrization. Section 3 demonstrates the practical scope of the theory in the context of several applications to finite sequences of random variables.

The Bayesian perspective does provide some insight on the existence of parametric models when the random quantity is a sequence of exchangeable random variables. De Finetti's theorem (and its various generalizations, e.g., Hewitt and Savage, 1955) says that if the sequence of exchangeable random variables is potentially infinite in size, then there exists a parametric model for such a sequence with independent and identically distributed components. Unfortunately, the theory does not address the issue of how such models and their underlying parameters should be defined. Moreover, the size of the sequence must be potentially infinite in order for the parametric probability model to be well defined. In contrast, our development leads to definitions of parameters and parametric models for any random variable or sequence of random variables.

The notion of treating functions of the data, i.e., statistics, as parameters is not new. Lauritzen (1974) discusses the problem of predicting the value of a minimal totally sufficient statistic for a sequence ("population") given the observation of a subsequence ("sample"). Then, the value of the "sufficient statistic in the larger part of the population will play the role of an unknown 'parameter'. This 'parameter' is then estimated by the method of maximum likelihood (pp. 129-130)." He notes, however, that this likelihood approach to inference may "give unreasonable results" (p. 130) because the maximum-likelihood predictions will not vary with changes in the family of distributions P, so long as the minimal totally sufficient statistic and the conditional distributions given it remain the same.
Lauritzen (1974) also introduces the notion of "extreme models" (p. 132). These models correspond to our Bayesian models in which the prior distributions are degenerate at particular parameter values. In these cases, he notes that "the 'parameter' is in a certain sense equal to the value of the statistic when calculated over the entire population" (p. 132). His likelihood approach leads to the conclusion that such a class P "is as broad as one would want" since it consists of all models which might be true in any sense. This interpretation of the adequacy of models highlights the distinction of the Bayesian approach, however. Without knowledge of which model is true, the Bayesian solution reflects this uncertainty in terms of a probability distribution over possible models. The problems Lauritzen notes, with maximum-likelihood predictions being unaffected by changes in the class P, are resolved coherently with Bayesian models, where changes in P lead to changes in the definition of the parameter and/or the specification of prior distributions on the parameter.

Our development of Bayesian parametric models is closely related to Dawid's (1979, 1982) "inter-subjective models."¹ When a sufficient statistic T can be found for a class P of probability models, then Dawid calls T an "I-parameter" for the family of conditional distributions given T. Dawid notes that "it will often be appropriate to embed the class of events under consideration in a larger, real or conceptual, class before constructing the I-model" (p. 221), and that "there may be many different but equally satisfactory models" (p. 219). Arguing subjectively for the applicability of propensity-type models, he details specific I-models for exchangeable binary sequences and orthogonally invariant measures, and discusses the relevant theory of I-models for sequences with repetitive structure, extending the theory of Lauritzen (1974).

The major distinction between our approach and Dawid's is that we focus on the specification of models and the definition of parameters for finite sequences of observable random quantities. We develop models for a sequence of infinite length as the limiting case of a sequence of models for finite sequences. Parameters defined as measurable functions of the observables are then seen to be interpretable for infinite sequences only if the limit of the sequence of measurable functions exists in any meaningful way. When the infinite-length sequence is taken as primary, there may be considerable flexibility in defining parameters in terms of unobservable variables. Rather than view alternative parametrizations as equally satisfactory, we advocate applying the concept of minimal parametrizations as a guiding principle with which to define model parameters, parsimoniously, in terms of observables.

2 Bayesian Parametric Models
Let X be a random, observable quantity.² Denote the sample space of X by 𝒳.

¹We thank Richard Barlow for bringing Dawid's work to our attention.

²Often X is a sequence of random variables, i.e., X = (x₁, ..., x_N). For example, X could represent the outcomes of a sequence of N coin tosses or the states of a discrete-time dynamic system as it evolves from time 1 to time N. This interpretation of X as a finite sequence is taken up in Section 3.
From a Bayesian perspective, uncertainty in the random quantity X is modeled in terms of P(·), a (joint) probability distribution of (all the components of) X. To define such a distribution, let 𝒥 denote a σ-field on 𝒳 for which X is measurable. The class 𝒥 is the set of all events pertaining to X for which the Bayesian can specify probabilities through introspection (and deduction applying the probability calculus).

Let P = {P} be a class of probability models on the measurable space (𝒳, 𝒥) for a group of Bayesians. That is, P constitutes the Bayesians' spectrum of possible choices for a distribution of X. The size of such a class will depend on the degree of possible heterogeneity in the Bayesians' (prior) beliefs for X. While P could be so general as to encompass all distributions on (𝒳, 𝒥), a non-trivial special case is the class of all exchangeable distributions.
We shall define a parameter θ for a given class P of Bayesian models as a measurable function θ : 𝒳 → Θ such that the conditional distribution of X given θ is the same for all models P ∈ P. Such a definition makes the parameter an operationally defined random variable, that is, one defined in terms of observables alone.
A parameter can be found by first introducing a subfield 𝒥_θ of the basic field 𝒥 such that the conditional expectation, relative to 𝒥_θ, of the indicator of any event in 𝒥 is the same for all P ∈ P. Any function on 𝒳 that induces such a subfield can then be used as a parameter. Formally, with 1_A denoting the indicator of an event A and E_P denoting the expectation with respect to the distribution P, we propose

Definition: A 𝒥-measurable function θ is a parameter for the class P if, for any A ∈ 𝒥, the conditional expectation E_P(1_A | 𝒥_θ) can be defined to be equal for all P ∈ P.

We shall refer to the field 𝒥_θ as the parameter field for the class P. Any subfield of 𝒥 that differs from 𝒥_θ only by sets of measure zero can serve equally well as a parameter field for the class. In practice, any one of these fields can be used. Moreover, there are typically many functions on 𝒳 that induce the same subfield. Therefore, there will be many functions θ that can serve as the parameter.

The definition of the parameter provides an 𝒥_θ-measurable function E(1_A | 𝒥_θ), for any A ∈ 𝒥, which does not depend on the particular P in P; that is, an 𝒥_θ-measurable function which is the same for every Bayesian in the group. This function can be used to provide a generalized definition of the likelihood for a class of models.
Definition: The parametric likelihood function for the class P with respect to a parameter θ is a function L(·, ·) : 𝒥 × Θ → [0, 1] whose values are given by

L(A, θ′) = E(1_A | 𝒥_θ)(x), for almost all x with θ(x) = θ′.

This definition is consistent with conventional usage when A consists of a single component of X.
Given a parameter θ for the class P, any distribution P ∈ P specifies a distribution on (𝒳, 𝒥_θ), which we call the prior distribution of θ corresponding to P. Identifying the prior distribution with a distribution on observables contrasts with its usual interpretation as a distribution on an abstract parameter space Θ with σ-field 𝒢. However, the latter can be formalized with the present definition of θ as a measurable function of X, with 𝒢 = {θ(A), A ∈ 𝒥_θ}. Consider then

Definition: The prior distribution of a parameter θ for a model P ∈ P is the measure π on (Θ, 𝒢) induced by P on (𝒳, 𝒥_θ), that is,

π(B) = P(θ⁻¹(B)), B ∈ 𝒢.
These definitions allow us to write a parametric representation for any Bayesian model, generalizing de Finetti's representation for exchangeable sequences to arbitrary random variables. The identity for conditional expectations

P(A) = E(E(1_A | 𝒥_θ)),

for any A ∈ 𝒥, when expressed in terms of the parametric likelihood and prior distribution, provides

Definition: For any parameter θ of a class P of models on (𝒳, 𝒥), the parametric representation of the probability of an event A ∈ 𝒥 is

P(A) = ∫_Θ L(A, θ) π(dθ).

The likelihoods can, therefore, be interpreted as the extreme probability assignments pertaining to the class P in the sense that any probability assignment in the class can be expressed as a convex mixture of the likelihoods.
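To make the representation concrete, the following sketch (an illustration of ours, not from the paper) checks the identity P(A) = Σ_θ L(A, θ) π(θ) numerically for three exchangeable binary variables, with the count of ones as the parameter. The two-point mixture in `weights` is an arbitrary assumption standing in for one Bayesian's model P.

```python
from itertools import product
from math import comb

# One exchangeable model P: a two-point mixture of i.i.d. Bernoulli(p)^3.
def model_prob(x, weights=((0.5, 0.2), (0.5, 0.7))):
    """P({x}) for a 0-1 triple x."""
    s = sum(x)
    return sum(w * p**s * (1 - p)**(3 - s) for w, p in weights)

outcomes = list(product([0, 1], repeat=3))

# Prior: the distribution P induces on the parameter (the count of ones).
prior = {s: sum(model_prob(x) for x in outcomes if sum(x) == s)
         for s in range(4)}

# Likelihood of the point event A = {x} given count s: uniform over the
# C(3, s) arrangements -- the same for EVERY exchangeable P (consensus).
def likelihood(x, s):
    return (1 / comb(3, s)) if sum(x) == s else 0.0

# Parametric representation: P(A) = sum_theta L(A, theta) * prior(theta).
for x in outcomes:
    reconstructed = sum(likelihood(x, s) * prior[s] for s in range(4))
    assert abs(reconstructed - model_prob(x)) < 1e-12
```

Note that `likelihood` never consults the particular model: it is common to every exchangeable P, which is exactly the consensus property that makes the count a parameter; only the prior carries the subjective beliefs.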
A parameter always exists for any class P. The identity function θ(X) = X is such that the conditional expectation of the indicator of any A ∈ 𝒥 with respect to 𝒥_θ is the indicator itself, since

E[1_A | 𝒥_θ] = E[1_A | 𝒥] = 1_A.

It follows that this conditional expectation is equal for all P, no matter what P is, and, hence, that the identity is a parameter for any class P.

A class P may admit multiple parameters. For any two such parameters, say θ₁, θ₂, the intersection 𝒥_θ = 𝒥_θ₁ ∩ 𝒥_θ₂ defines a parametrization which is no finer than those given by θ₁ and θ₂. The most parsimonious parametrization is obtained by taking the intersection over all parameter fields. This leads to:

Definition: A parameter θ is minimal for a class P if 𝒥_θ = ∩ 𝒥_θ′, where the intersection is taken over all parameters θ′ of P.

By defining parameters as measurable functions of the random variable, the correspondence between Bayesian models P ∈ P and prior measures π on Θ is one-to-one. The relationship need not be "onto," however. Some distributions π on Θ may specify distributions which are not in P. This will occur, for example, when the class P is not convex under probabilistic mixtures.

Definition: A class P of Bayesian models is maximal with respect to a parameter θ if Π, the class of all distributions π on (Θ, 𝒢), is isomorphic to P.

A class P is maximal for a parameter θ only if θ is a minimal parameter. Such classes reflect the entire range of Bayesian beliefs which are consistent with the given minimal parametrization.
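The prior-to-model correspondence in a maximal class can be sketched numerically (our illustration; the prior vector is arbitrary): for exchangeable 0-1 sequences with the count of ones as minimal parameter, every prior on the count yields an exchangeable model, and that model induces the same prior back.

```python
from itertools import product
from math import comb

N = 3
outcomes = list(product([0, 1], repeat=N))

def model_from_prior(prior):
    """Build P on {0,1}^N from a prior on the count parameter:
    P({x}) = prior[s(x)] / C(N, s(x)) spreads each prior mass uniformly
    over the arrangements with that count."""
    return {x: prior[sum(x)] / comb(N, sum(x)) for x in outcomes}

prior = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}   # an arbitrary prior on Theta
P = model_from_prior(prior)

# P is exchangeable: permuting coordinates leaves probabilities unchanged.
assert abs(P[(1, 0, 0)] - P[(0, 0, 1)]) < 1e-12

# And P induces exactly the prior we started from (the map is one-to-one
# and onto for this maximal class).
induced = {s: sum(p for x, p in P.items() if sum(x) == s)
           for s in range(N + 1)}
assert all(abs(induced[s] - prior[s]) < 1e-12 for s in range(N + 1))
```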
REMARKS:

(2.1) Our definition of a parameter is similar to the frequentist definition of a sufficient statistic when the class P is indexed by a frequentist parameter φ ∈ Φ, the frequentist parameter space. However, the parameters in frequentist models are generally not measurable functions of the signal.³

(2.2) While there may be substantial divergence in opinion regarding the probabilities of two Bayesians who choose their models from P, a parameter characterizes a basis for consensus beliefs: conditional on the value of the parameter, they agree on the randomness in X.

(2.3) A minimal parametrization of a class P minimizes the extent of prior introspection necessary to identify a Bayesian probability model P ∈ P; the measure P need only be specified on the coarsest subfield 𝒥_θ which still provides a basis for consensus opinion or full knowledge of the remaining randomness in X.

(2.4) There is no redundancy in the model specification when the parameter is minimal. The conditional distributions X | θ′ and X | θ″ are distinct almost everywhere (for every P ∈ P) for two distinct values θ′ ≠ θ″ of a minimal parameter. Otherwise, a parameter with a coarser σ-field could be defined which does not distinguish θ′ and θ″.

(2.5) A frequentist class P like that in Remark (2.1) would be inappropriate as a class of models for a group of Bayesians unless each Bayesian was completely certain about the randomness in the signal. Bayesians would typically prefer to choose probabilistic mixtures of distributions in such a class. Expressing beliefs as such a mixture indicates how the Bayesian framework enriches upon the frequentist approach. The Bayesian analogue to a frequentist class P would be P*, its convex hull under probabilistic mixtures. Then, every prior distribution π on Φ would correspond to a distribution P ∈ P*. Depending on how the parameter φ is defined, however, distinct prior distributions on Φ do not necessarily lead to distinct models P ∈ P*. However, if the frequentist parameter φ is also a Bayesian parameter, i.e., a measurable function of the data, then distributions π on Φ will uniquely identify models P ∈ P*.

An example of such a case is when X is a sequence of three exchangeable binary variables. Traditional Bayesian analyses and frequentist analyses would model such a sequence as independent and identically distributed Bernoulli random variables. The Bayesian analysis then requires the specification of the distribution of the success probability over the range [0, 1]. Our analysis would define parameters as functions of the observables. The count of successes is a minimal parameter for this problem. It takes on only four values, 0, 1, 2, or 3, and the prior distribution is specified by specifying the probability of any three such outcomes for the sequence.

(2.6) While there is a prior distribution π on Θ for every distribution P ∈ P, it is not necessary that every distribution π on Θ corresponds to a distribution P ∈ P. Necessary conditions for this to be the case are: (i) the parameter is minimal; and (ii) the class P equals P*, its convex hull under probabilistic mixtures.

(2.7) If a class P is maximal with respect to the (minimal) parameter θ, then all possible heterogeneous beliefs consistent with the parametric model defined by θ can be expressed in terms of a member distribution of P.

(2.8) When P is a "maximal class" of probability models for a given parametrization, the Bayesian parameter is equivalent to a sufficient statistic in the sense of Dynkin (1978). When such a statistic exists for P, a convex set of probability measures, then P contains a subset P_e of extreme points and any measure in P can be characterized by a probabilistic mixture ("barycentre") of extreme measures. Dynkin's set P_e is just the sub-class of models corresponding to prior distributions which are degenerate at specific values of the parameter.

³Notable exceptions arise in the case when X is an infinite sequence of random variables.
3 Applications with Finite Sequences

Suppose that X consists of a finite sequence of component variables, X = (X₁, ..., X_N), having outcome space 𝒳 = 𝒳₁ × 𝒳₂ × ... × 𝒳_N, the Cartesian product of the outcome spaces of the coordinates. Let, as before, 𝒥 denote the σ-field for which the Bayesians can specify probabilities.

Example 1: Arbitrary Sequences

First, let P represent the class of all possible distributions on the measurable space (𝒳, 𝒥). The minimal parametrization is obtained by letting the parameter field equal the basic field 𝒥. The obvious parameter is then the identity function, that is,

θ(x) = x,

in which case Θ is a copy of 𝒳 and the field 𝒢 is equivalent to 𝒥. The likelihood function of any A ∈ 𝒥 reduces to the indicator of A interpreted as an element of 𝒢, or

L(A, θ) = 1, if θ ∈ A; 0, otherwise.

The prior for a distribution P is P itself on (Θ, 𝒢), and the parametric representation becomes the integral of the indicator of A ⊆ Θ with respect to P. Since there is no consensus of opinion within the class of all distributions on (𝒳, 𝒥), the parameter field is not a strict subfield of the basic field, and specifying a prior on (Θ, 𝒢) is equivalent to specifying a distribution on (𝒳, 𝒥).
Example 2: Deterministic Sequences

Suppose that X is a "deterministic" sequence; that is, suppose that the Bayesians in the group agree on a transformation T : 𝒳₁ → 𝒳 such that X = T(x₁). For instance, the components of X could represent the states of a dynamic system progressing from time 1 to N with unknown initial condition x₁. The minimal parametrization is obtained by letting the parameter field equal 𝒥₁, the field induced by X₁. The projection of X onto X₁ then serves as a minimal parameter, or

θ(x) = x₁,

with Θ equal to a copy of 𝒳₁ and 𝒢 equivalent to the field on 𝒳₁. The likelihood function is then the indicator of the initial conditions leading to sequences in some event A ∈ 𝒥:

L(A, θ) = 1, if T(θ) ∈ A; 0, otherwise.

The prior distribution is the distribution of the initial condition X₁, and the parametric representation gives the distribution of a sequence as the probabilistic mixture of the possible sequences over the initial condition.

This example can be extended in various ways. For instance, instead of finding agreement on a single T, the Bayesians may only be able to agree on some set of transformations {T}, such as the set of all linear transformations. The parametrization will then have to be extended to encompass the field induced by T.
Example 3: Exchangeable 0-1 Sequences

Suppose that the components of X each take on either the value "0" or "1," and suppose that 𝒥 is the set of all subsets of 𝒳. Let P be the class of exchangeable distributions on (𝒳, 𝒥), that is, distributions P that are invariant under permutations of the components of X. For instance, X can be thought of as a sequence of "random" drawings without replacement from an urn containing N balls which are either marked "0" or marked "1."

The minimal parameter field is generated by the sets consisting of sequences having an equal number of "1"s. A convenient parameter is the relative frequency of "1"s in the signal, that is,

θ(x) = (1/N) Σᵢ₌₁ᴺ xᵢ,

having outcome space Θ = {0, 1/N, ..., 1}, with 𝒢 the set of all subsets of Θ. The likelihood function is well known in connection with urn problems. In particular, the likelihood for any subsequence or cylinder set A = [x₁, ..., xₙ], where n < N, is

L(A, θ) = C(N − n, Nθ − Σᵢ₌₁ⁿ xᵢ) / C(N, Nθ), when Σᵢ₌₁ⁿ xᵢ ≤ Nθ; 0, otherwise,

where C(a, b) denotes the binomial coefficient "a choose b." In other words, this is the number of ways of arranging the remaining balls marked "1" after the first n are removed, divided by the total number of arrangements of such balls before the first n were removed. The prior distribution represents the distribution over the composition of the urn, and the parametric representation says that the distribution of any finite sample is a mixture of the distributions conditional on the urn's composition.

This parametric representation is also known as de Finetti's representation (for finite sequences); see de Finetti (1937).
Example 4: Spherically-Symmetric Sequences

Suppose now that the components of X take on values in the real numbers and suppose that 𝒥 is the Borel field. Let P be the class of spherically-symmetric distributions on (𝒳, 𝒥), that is, distributions which are invariant under rotations of X. For instance, the components of X could represent a set of orthogonal coordinates of some vector quantity, such as the velocity of a point mass in space. If the Bayesians agree that the particular choice of coordinate frame is irrelevant, then P consists of the spherically-symmetric distributions.

The minimal parameter field is generated by the sets of sequences having the same Euclidean norm; these sets are surfaces of N-dimensional spheres centered at the origin. The second sample moment of the sequence can serve as a minimal parameter, that is,

θ(x) = (1/N) Σᵢ₌₁ᴺ xᵢ².

The parameter's outcome space then consists of the non-negative real numbers. The likelihood function for cylinder sets of the form A = [dx₁, ..., dxₙ] is found to be

L(A, θ) = [Γ(N/2) / (Γ((N − n)/2) (πNθ)^(n/2))] (1 − Σᵢ₌₁ⁿ xᵢ² / (Nθ))^((N−n)/2 − 1) dx₁ ... dxₙ, when Σᵢ₌₁ⁿ xᵢ² < Nθ; 0, otherwise.

(This will be proved in the context of distributions which are uniform on general ℓᵖ-spheres in the sequel. For the present ℓ² case, see also Eaton, 1981.)

The prior distribution consists of the distribution of the second sample moment of the sequence, and the parametric representation gives the distributions in P as a mixture over the conditional distributions given the second sample moment.

The analysis remains valid for sequences whose components take values in some subset of the real numbers, although the domain of definition of the likelihood and its norming constant may have to be changed. When the components are restricted to the set {0, 1}, the example reduces to the previous example.
4 Infinite Sequences

Most applications of parametric models in statistics involve infinite sequences of random variables, that is, sequences whose length N is not bounded beforehand. Observations in the infinitely far future cannot be called observable in any practical sense. However, they can be viewed as limits of real observations and, therefore, parametric models for infinite sequences are, at best, limits of operationally defined parametric models.

Parameters for infinite sequences are defined as limits of parameters for finite sequences when the length is increased without bound. For instance, for the exchangeable 0-1 sequences in Example 3, we can define the parameter to yield the limiting relative frequency, or propensity, of a sequence, that is,

θ(x) = lim_{N→∞} (1/N) Σᵢ₌₁ᴺ xᵢ, whenever this limit exists; −1, say, otherwise.

However, it is well known (see de Finetti, 1937) that the set for which the limiting frequency does not converge has P-probability zero. This implies, for instance, that a parameter which assigns the value −1 to the non-convergent sequences induces a parameter field differing from the former only on sets of P-probability zero. Therefore, we need only consider the parameter space consisting of the interval [0, 1].

For these values of θ one finds, for the likelihood function for sets A = [x₁, ..., xₙ], that

L(A, θ) = θ^(Σᵢ₌₁ⁿ xᵢ) (1 − θ)^(n − Σᵢ₌₁ⁿ xᵢ),

by taking the limit of the expression in Example 3 as N → ∞. The prior becomes the distribution of the limiting relative frequency, and the parametric representation becomes de Finetti's representation.

Parametric models for infinite sequences can be used in practice as approximations for models for large sequences. For the exchangeable case, an advantage of using this approximation is the fact that the likelihood simplifies to a product measure.
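The limit can be watched numerically (a sketch of ours; the values of θ, n, and s are arbitrary): the finite-N urn likelihood of Example 3 approaches the Bernoulli product θ^s (1 − θ)^(n−s) as N grows.

```python
from math import comb

def finite_L(s, n, theta, N):
    """Exchangeable likelihood of a 0-1 subsequence with s ones out of n,
    for a finite sequence of length N (the urn formula of Example 3)."""
    K = round(N * theta)            # number of ones in the full sequence
    if s > K or n - s > N - K:
        return 0.0
    return comb(N - n, K - s) / comb(N, K)

theta, n, s = 0.3, 5, 2
limit = theta**s * (1 - theta)**(n - s)   # the product-measure likelihood

# The approximation error shrinks as the sequence length N grows.
gap_small_N = abs(finite_L(s, n, theta, 100) - limit)
gap_large_N = abs(finite_L(s, n, theta, 100_000) - limit)
assert gap_large_N < gap_small_N and gap_large_N < 1e-4
```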
A disadvantage is that one now has to specify a prior on the entire [0, 1] interval instead of on just N + 1 points.

References
Berger, J. O. (1985). Statistical Decision Theory and Bayesian Analysis. Springer, New York.

Box, G. E. P. and Tiao, G. C. (1973). Bayesian Inference in Statistical Analysis. Addison-Wesley, Reading.

Dawid, A. P. (1979). Conditional independence in statistical theory (with Discussion). J. Royal Statist. Soc. B 41: 1-31.

Dawid, A. P. (1982). Intersubjective statistical models. In Exchangeability in Probability and Statistics, G. Koch and F. Spizzichino (editors). North-Holland, 217-232.

de Finetti, B. (1937). La prévision: ses lois logiques, ses sources subjectives. Annales de l'Institut Henri Poincaré; English translation in Kyburg, Jr., H. E. and Smokler, H. E., eds., Studies in Subjective Probability, New York: Wiley, 1964.

Dynkin, E. B. (1978). Sufficient statistics and extreme points. The Annals of Probability 6: 705-730.

Eaton, M. L. (1981). On the projections of isotropic distributions. Ann. Statist. 9: 391-400.

Hewitt, E. and Savage, L. J. (1955). Symmetric measures on Cartesian products. Trans. Amer. Math. Soc. 80: 470-501.

Lauritzen, S. L. (1974). Sufficiency, prediction and extreme models. Scand. J. Statist. 1: 128-134.

Lindley, D. V. (1971). Bayesian Statistics, A Review. Regional Conference Series in Applied Mathematics, S.I.A.M.