Identifiability of Finite Mixture Models
Jang SCHILTZ (University of Luxembourg), joint work with
Cédric NOEL (University of Lorraine & University of Luxembourg)
SMTDA 2020, June 4, 2020
Outline
1 Introduction
2 Characterization of Identifiability
3 Identifiability of finite mixture models
Jang SCHILTZ, Identifiability of Finite Mixture Models with underlying Normal Distribution, June 4, 2020
Identifiability
Definition:
A finite mixture model is identifiable if a given dataset leads to a uniquely determined set of model parameter estimates, up to a permutation of the clusters.
Identifiability of the parameters is a necessary condition for the existence of consistent estimators for any statistical model.
Without identifiability, there might be several solutions for the parameter estimation problem.
Literature Review
Teicher (1963): The class of all mixtures of one-dimensional normal distributions is identifiable.
Yakowitz and Spragins (1968): Extension to the class of all Gaussian mixtures.
Hennig (2000): Identifiability for linear regression mixtures with Gaussian errors under certain conditions.
Finite Mixture Models
Data:
Variable of interest \(Y_i = (y_{i1}, y_{i2}, \dots, y_{iT})\)
Covariates \(x_1, \dots, x_M\) and \(z_{i1}, \dots, z_{iT}\)
\(a_{it}\): age of subject \(i\) at time \(t\)
Model:
\(K\) groups of size \(\pi_k\) with trajectories
\[
y_{it} = \sum_{j=0}^{s_k} \left( \beta_j^k + \sum_{m=1}^{M} \alpha_m^k x_m + \gamma_j^k z_{it} \right) a_{it}^j + \varepsilon_{it}^k, \quad (1)
\]
where \(\varepsilon_{it}^k \sim \mathcal{N}(0, \sigma_k)\).
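As a minimal sketch (not the authors' code), trajectories from model (1) can be simulated directly; the sketch below assumes K = 2 groups with linear age trends (s_k = 1), one x covariate and one z covariate, and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters: K = 2 linear groups (s_k = 1),
# one time-constant covariate x and one time-varying covariate z.
pi = np.array([0.5, 0.5])                    # group sizes pi_k
beta = np.array([[3.0, -2.0], [0.0, 2.0]])   # beta_j^k (rows: groups)
alpha = np.array([0.5, -0.5])                # alpha^k for the single x_m
gamma = np.array([[0.2, 0.0], [-0.1, 0.0]])  # gamma_j^k for the single z_it
sigma = np.array([0.1, 0.1])                 # sigma_k
ages = np.array([1.0, 2.0])                  # a_it, same for every subject

A = np.vander(ages, N=beta.shape[1], increasing=True)  # A[t, j] = a_t^j

def simulate(n):
    """Draw n trajectories (y_i1, ..., y_iT) from the mixture."""
    g = rng.choice(len(pi), size=n, p=pi)    # latent group memberships
    x = rng.normal(size=n)
    z = rng.normal(size=(n, len(ages)))
    y = np.empty((n, len(ages)))
    for i, k in enumerate(g):
        # coefficient of a_t^j: beta_j^k + alpha^k x_i + gamma_j^k z_it
        coef = beta[k] + alpha[k] * x[i] + gamma[k] * z[i][:, None]
        y[i] = (coef * A).sum(axis=1) + rng.normal(0.0, sigma[k], len(ages))
    return y, g

y, g = simulate(100)
```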
Notations
Distribution \(f\) of a finite mixture model:
\[
f(y_i; \Omega) = \sum_{k=1}^{K} \pi_k\, g_k(y_i; \theta_k).
\]
Cumulative distribution function \(F\) of a finite mixture model:
\[
F(y_i; \Omega) = \sum_{k=1}^{K} \pi_k\, G_k(y_i; \theta_k).
\]
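As an illustration (not part of the talk), the mixture density f(y_i; Ω) can be evaluated directly for Gaussian components with diagonal covariance σ_k² I; the component means and variances below are hypothetical.

```python
import numpy as np

# Hypothetical mixture: two T = 2 dimensional Gaussian components with
# means mu_k and common diagonal covariance sigma_k^2 * I.
pi = np.array([0.5, 0.5])
mu = np.array([[1.0, -1.0], [2.0, 4.0]])   # mu_k, one row per group
sigma = np.array([0.1, 0.1])

def mixture_density(y):
    """Evaluate f(y; Omega) = sum_k pi_k g_k(y; theta_k) at a point y."""
    y = np.asarray(y, dtype=float)
    T = y.size
    total = 0.0
    for k in range(len(pi)):
        quad = np.sum((y - mu[k]) ** 2) / sigma[k] ** 2
        norm = (2 * np.pi * sigma[k] ** 2) ** (T / 2)
        total += pi[k] * np.exp(-0.5 * quad) / norm
    return total

print(mixture_density([1.0, -1.0]))
```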
Mixtures and mixing distributions
Let \(\mathcal{F} = \{ F(y;\omega),\ y \in \mathbb{R}^T,\ \omega \in \mathbb{R}^{s+2K} \}\) be a family of \(T\)-dimensional cdf's indexed by a parameter \(\omega\), such that \(F(y;\omega)\) is measurable in \(\mathbb{R}^T \times \mathbb{R}^{s+2K}\).
Then the cdf
\[
H(y) = \int_{\mathbb{R}^{s+2K}} F(y;\omega)\, dG(\omega)
\]
is the image, under this mapping \(Q\), of the \((s+2K)\)-dimensional cdf \(G\).
The distribution \(H\) is called the mixture of \(\mathcal{F}\), and \(G\) its mixing distribution.
Let \(\mathcal{G}\) denote the class of all \((s+2K)\)-dimensional cdf's \(G\) and \(\mathcal{H}\) the induced class of mixtures \(H\).
Then \(\mathcal{H}\) is identifiable if \(Q\) is a one-to-one map from \(\mathcal{G}\) onto \(\mathcal{H}\).
Characterization of identifiability
The set \(\mathcal{H}\) of all finite mixtures of the class \(\mathcal{F}\) of distributions is the convex hull of \(\mathcal{F}\):
\[
\mathcal{H} = \left\{ H(y) : H(y) = \sum_i c_i F(y, \omega_i),\ c_i > 0,\ \sum_i c_i = 1,\ F(y, \omega_i) \in \mathcal{F} \right\}. \quad (2)
\]
Theorem (Yakowitz and Spragins, 1968)
A necessary and sufficient condition for the class \(\mathcal{H}\) of all finite mixtures of the family \(\mathcal{F}\) to be identifiable is that \(\mathcal{F}\) is a linearly independent family over the field of real numbers.
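The linear-independence criterion can be probed numerically: evaluate candidate component densities on a grid and inspect the rank of the resulting matrix. Full row rank already implies the densities are linearly independent as functions, while a rank deficiency on one particular grid is not conclusive. A sketch with illustrative one-dimensional normal densities:

```python
import numpy as np

def normal_pdf(y, mean, sd):
    """Density of N(mean, sd^2) evaluated at the points y."""
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

# Evaluate each candidate component density on a common grid; a linear
# dependence among the functions would force a dependence among the rows.
grid = np.linspace(-5.0, 5.0, 200)
params = [(0.0, 1.0), (1.0, 1.0), (0.0, 2.0)]   # hypothetical (mean, sd)
M = np.array([normal_pdf(grid, m, s) for m, s in params])

print(np.linalg.matrix_rank(M))  # full rank: no linear dependence detected
```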
The Model
\[
Y_{it} = f(a_{it}; \beta^k, \delta^k) + \varepsilon_{it}^k = \beta^k A_{it} + \delta^k W_{it} + \varepsilon_{it}^k. \quad (3)
\]
We can write
\[
\mathcal{L}\left((Y_i)_{i \in I}\right) = \bigotimes_{i \in I} F_{A_i, W_i, J}. \quad (4)
\]
Identifiability of the model means that, knowing the data distribution \(\mathcal{L}(Y_i)\), \(i \in I\), one can uniquely identify the mixing distribution \(J\).
That is, no two distinct sets of parameters lead to the same data distribution.
Nagin’s base model
\[
\mathcal{C}_1 = \left\{ F_{A,J} : F_{A,J} = \bigotimes_{i \in I} F_{A_i, J},\ J \in \Omega_1 \right\}
\]
Theorem
Let \( h_j = \min \left\{ q : \{A_{ij},\ i \in I\} \subseteq \cup_{i=1}^{q} H_i,\ H_i \in \mathcal{H}_{n-1} \right\} \).
If there exists \(j\) such that \(|S(J)| < h_j\) for all \(J\), then \(\mathcal{C}_1\) is identifiable.
Adding covariates independent of cluster membership
\[
\mathcal{C}_2 = \left\{ F_{A,J} : F_{A,J} = \bigotimes_{i \in I} F_{A_i, W_i, J},\ J \in \Omega_1 \right\}, \quad (5)
\]
\[
\mathcal{C}_2^A = \left\{ F_{A,J} : F_{A,J} = \bigotimes_{i \in I} F_{A_i, J},\ J \in \Omega_1 \right\}, \quad (6)
\]
\[
\mathcal{C}_2^W = \left\{ F_{A,J} : F_{A,J} = \bigotimes_{i \in I} F_{W_i, J},\ J \in \Omega_1 \right\}. \quad (7)
\]
Theorem
If \(\mathcal{C}_2^A\) and \(\mathcal{C}_2^W\) are identifiable and \(W_{ij}\) is not a multiple of \(A_{ij}\), for all \(i, j\), then \(\mathcal{C}_2\) is identifiable.
Numerical Example
Two clusters with sizes \(\pi_1 = \pi_2 = 1/2\). Two time points, 1 and 2.
Same variability in both clusters: \(\sigma = 0.1\).
We simulate 50 samples of 100 trajectories with parameters
\(\beta^1 = (3, -2)\) and \(\beta^2 = (0, 2)\) (linear model),
\(\beta^1 = (10, -12.5, 3.5)\) and \(\beta^2 = (-2, 5, -1)\) (polynomial model).
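The linear-model experiment can be reproduced in a few lines; the sketch below uses scikit-learn's GaussianMixture as a stand-in estimator rather than the authors' software. With the parameters above, the implied cluster means of \((y_{i1}, y_{i2})\) at time points 1 and 2 are \((1, -1)\) and \((2, 4)\).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# Simulate: pi_1 = pi_2 = 1/2, sigma = 0.1, beta^1 = (3, -2), beta^2 = (0, 2).
# At time points 1 and 2 the cluster means of (y_i1, y_i2) are (1, -1), (2, 4).
n = 100
g = rng.choice(2, size=n, p=[0.5, 0.5])
true_means = np.array([[1.0, -1.0], [2.0, 4.0]])
y = true_means[g] + rng.normal(0.0, 0.1, size=(n, 2))

# Refit the mixture; up to label switching the parameters are recovered,
# illustrating that this model is identified from the data distribution.
gm = GaussianMixture(n_components=2, random_state=0).fit(y)
fitted = gm.means_[np.argsort(gm.means_[:, 0])]   # sort to undo label switching
print(np.round(fitted, 2))
```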
Parallel coordinate plots of the estimated parameters
[Figure: two panels. Linear model, axes \(\pi\), intercept, slope, \(\sigma\); parabolic model, axes \(\pi\), intercept, slope, degree-2 coefficient, \(\sigma\). In both panels the estimates concentrate near the true parameter values.]
The generalized model
Theorem
The model is identifiable if
1. \(d_k < T\) for all \(1 \le k \le K\), and all \(a_{it}\) are distinct, for all \(i, t\);
2. \(W_k\) has full rank for all \(1 \le k \le K\);
3. \(\mathrm{rk}(A_k, W_k) = \mathrm{rk}(A_k) + \mathrm{rk}(W_k)\), for all \(1 \le k \le K\).
Numerical Example
Two clusters with sizes \(\pi_1 = \pi_2 = 1/2\). Two time points, 1 and 2.
Same variability in both clusters: \(\sigma = 0.1\).
Shape description parameters \(\beta^1 = (3, -2)\), \(\beta^2 = (0, 2)\), \(\delta^1 = 2\) and \(\delta^2 = -3\).
We simulate 50 samples of 100 trajectories for 3 types of models:
1. the covariate is independent of time and only takes values 0 or 1;
2. the covariate is time dependent, but in a nonlinear way;
3. the covariate is time dependent in a linear way.
Parallel coordinate plots of the estimated parameters
[Figure: three panels (Models 1–3), axes \(\pi\), intercept, slope, \(\sigma\), \(\delta\). For Models 1 and 2 the estimates concentrate near the true values; for Model 3 (covariate linear in time) they degenerate: \(\pi\) collapses towards 0 and 1 and the coefficient estimates are wildly dispersed.]
Bibliography
Nagin, D.S. (2005). Group-Based Modeling of Development. Cambridge, MA: Harvard University Press.
Schiltz, J. (2015). A generalization of Nagin's finite mixture model. In: M. Stemmler, A. von Eye & W. Wiedermann (Eds.), Dependent Data in Social Sciences Research: Forms, Issues, and Methods of Analysis. Springer.
Teicher, H. (1963). Identifiability of finite mixtures. Annals of Mathematical Statistics, 34(4), 1265–1269.
Yakowitz, S.J. and Spragins, J.D. (1968). On the identifiability of finite mixtures. Annals of Mathematical Statistics, 39, 209–214.
Hennig, C. (2000). Identifiability of models for clusterwise linear regression. Journal of Classification, 17, 273–296.