Analysis of the salary trajectories in Luxembourg: a finite mixture approach

(1)

Analysis of the salary trajectories in Luxembourg

A finite mixture model approach

Jang SCHILTZ (University of Luxembourg)

joint work with Jean-Daniel GUIGOU (University of Luxembourg)

& Bruno LOVAT (University Nancy II)

(2)

includegraphics[height=0.5cm]lsf.eps

Outline

1 Nagin’s Finite Mixture Model

2 The Luxemburgish salary trajectories

3 Description of the groups

(3)

Outline

1 Nagin’s Finite Mixture Model

2 The Luxemburgish salary trajectories

3 Description of the groups

(4)

Outline

1 Nagin’s Finite Mixture Model

2 The Luxemburgish salary trajectories

3 Description of the groups

(5)

Outline

1 Nagin’s Finite Mixture Model

2 The Luxemburgish salary trajectories

3 Description of the groups

(6)

Outline

1 Nagin’s Finite Mixture Model

2 The Luxemburgish salary trajectories

3 Description of the groups

(7)

General description of Nagin’s model

We have a collection of individual trajectories.

We try to divide the population into a number of homogenous

(8)

General description of Nagin’s model

We have a collection of individual trajectories.

We try to divide the population into a number of homogenous

subpopulations and to estimate a mean trajectory for each subpopulation.

(9)

General description of Nagin’s model

We have a collection of individual trajectories.

We try to divide the population into a number of homogenous

(10)

The Likelihood Function (1)

Consider a population of size N and a variable of interest Y .

Let Y

i

= y

i

1

, y

i

2

, ..., y

i

T

be T measures of the variable, taken at times

t

1 , ...t

T

for subject number i .

P(Y

i

) denotes the probability of Y

i

count data ⇒ Poisson distribution

binary data ⇒ Binary logit distribution

censored data ⇒ Censored normal distribution

(11)

The Likelihood Function (1)

Consider a population of size N and a variable of interest Y .

Let Y

i

= y

i

1

, y

i

2

, ..., y

i

T

be T measures of the variable, taken at times

t

1 , ...t

T

for subject number i .

P(Y

i

) denotes the probability of Y

i

count data ⇒ Poisson distribution

binary data ⇒ Binary logit distribution

censored data ⇒ Censored normal distribution

(12)

The Likelihood Function (1)

Consider a population of size N and a variable of interest Y .

Let Y

i

= y

i

1

, y

i

2

, ..., y

i

T

be T measures of the variable, taken at times

t

1 , ...t

T

for subject number i .

P(Y

i

) denotes the probability of Y

i

count data ⇒ Poisson distribution

binary data ⇒ Binary logit distribution

censored data ⇒ Censored normal distribution

(13)

The Likelihood Function (1)

Consider a population of size N and a variable of interest Y .

Let Y

i

= y

i

1

, y

i

2

, ..., y

i

T

be T measures of the variable, taken at times

t

1 , ...t

T

for subject number i .

P(Y

i

) denotes the probability of Y

i

count data ⇒ Poisson distribution

binary data ⇒ Binary logit distribution

censored data ⇒ Censored normal distribution

(14)

The Likelihood Function (1)

Consider a population of size N and a variable of interest Y .

Let Y

i

= y

i

1

, y

i

2

, ..., y

i

T

be T measures of the variable, taken at times

t

1 , ...t

T

for subject number i .

P(Y

i

) denotes the probability of Y

i

count data ⇒ Poisson distribution

binary data ⇒ Binary logit distribution

censored data ⇒ Censored normal distribution

(15)

The Likelihood Function (1)

Consider a population of size N and a variable of interest Y .

Let Y

i

= y

i

1

, y

i

2

, ..., y

i

T

be T measures of the variable, taken at times

t

1 , ...t

T

for subject number i .

P(Y

i

) denotes the probability of Y

i

count data ⇒ Poisson distribution

binary data ⇒ Binary logit distribution

censored data ⇒ Censored normal distribution

(16)

The Likelihood Function (1)

Consider a population of size N and a variable of interest Y .

Let Y

i

= y

i

1

, y

i

2

, ..., y

i

T

be T measures of the variable, taken at times

t

1 , ...t

T

for subject number i .

P(Y

i

) denotes the probability of Y

i

count data ⇒ Poisson distribution

binary data ⇒ Binary logit distribution

censored data ⇒ Censored normal distribution

(17)

The Likelihood Function (2)

π

j

: probability of a given subject to belong to group number j

⇒ π

_j

is the size of group j .

We try to estimate a set of parameters Ω =

n

β

₀

j

, β

₁

j

, β

₂

j

, β

₃

j

, β

₄

j

, π

j

o

which

allow to maximize the probability of the measured data.

P

j

_(Y

i

) : probability of Y

i

if subject i belongs to group j

⇒ P(Y

_i

) =

r

X

j =1

π

j

P

j

(Y

i

).

(1)

Finite mixture model Daniel S. Nagin (Carnegie Mellon University)

finite : sums across a finite number of groups

(18)

The Likelihood Function (2)

π

j

: probability of a given subject to belong to group number j

⇒ π

_j

is the size of group j .

We try to estimate a set of parameters Ω =

n

β

₀

j

, β

₁

j

, β

₂

j

, β

₃

j

, β

₄

j

, π

j

o

which

allow to maximize the probability of the measured data.

P

j

_(Y

i

) : probability of Y

i

if subject i belongs to group j

⇒ P(Y

_i

) =

r

X

j =1

π

j

P

j

(Y

i

).

(1)

Finite mixture model Daniel S. Nagin (Carnegie Mellon University)

finite : sums across a finite number of groups

(19)

The Likelihood Function (2)

π

j

: probability of a given subject to belong to group number j

⇒ π

_j

is the size of group j .

We try to estimate a set of parameters Ω =

n

β

₀

j

, β

₁

j

, β

₂

j

, β

₃

j

, β

₄

j

, π

j

o

which

allow to maximize the probability of the measured data.

P

j

_(Y

i

) : probability of Y

i

if subject i belongs to group j

⇒ P(Y

_i

) =

r

X

j =1

π

j

P

j

(Y

i

).

(1)

Finite mixture model Daniel S. Nagin (Carnegie Mellon University)

finite : sums across a finite number of groups

(20)

The Likelihood Function (2)

π

j

: probability of a given subject to belong to group number j

⇒ π

_j

is the size of group j .

We try to estimate a set of parameters Ω =

n

β

₀

j

, β

₁

j

, β

₂

j

, β

₃

j

, β

₄

j

, π

j

o

which

allow to maximize the probability of the measured data.

P

j

_(Y

i

) : probability of Y

i

if subject i belongs to group j

⇒ P(Y

_i

) =

r

X

j =1

π

j

P

j

(Y

i

).

(1)

Finite mixture model Daniel S. Nagin (Carnegie Mellon University)

finite : sums across a finite number of groups

(21)

The Likelihood Function (2)

π

j

: probability of a given subject to belong to group number j

⇒ π

_j

is the size of group j .

We try to estimate a set of parameters Ω =

n

β

₀

j

, β

₁

j

, β

₂

j

, β

₃

j

, β

₄

j

, π

j

o

which

allow to maximize the probability of the measured data.

P

j

_(Y

i

) : probability of Y

i

if subject i belongs to group j

⇒ P(Y

_i

) =

r

X

j =1

π

j

P

j

(Y

i

).

(1)

Finite mixture model Daniel S. Nagin (Carnegie Mellon University)

finite : sums across a finite number of groups

(22)

The Likelihood Function (2)

π

j

: probability of a given subject to belong to group number j

⇒ π

_j

is the size of group j .

We try to estimate a set of parameters Ω =

n

β

₀

j

, β

₁

j

, β

₂

j

, β

₃

j

, β

₄

j

, π

j

o

which

allow to maximize the probability of the measured data.

P

j

_(Y

i

) : probability of Y

i

if subject i belongs to group j

⇒ P(Y

_i

) =

r

X

j =1

π

j

P

j

(Y

i

).

(1)

Finite mixture model Daniel S. Nagin (Carnegie Mellon University)

finite : sums across a finite number of groups

(23)

The Likelihood Function (2)

π

j

: probability of a given subject to belong to group number j

⇒ π

_j

is the size of group j .

We try to estimate a set of parameters Ω =

n

β

₀

j

, β

₁

j

, β

₂

j

, β

₃

j

, β

₄

j

, π

j

o

which

allow to maximize the probability of the measured data.

P

j

_(Y

i

) : probability of Y

i

if subject i belongs to group j

⇒ P(Y

_i

) =

r

X

j =1

π

j

P

j

(Y

i

).

(1)

Finite mixture model Daniel S. Nagin (Carnegie Mellon University)

finite : sums across a finite number of groups

(24)

The Likelihood Function (2)

π

j

: probability of a given subject to belong to group number j

⇒ π

_j

is the size of group j .

We try to estimate a set of parameters Ω =

n

β

₀

j

, β

₁

j

, β

₂

j

, β

₃

j

, β

₄

j

, π

j

o

which

allow to maximize the probability of the measured data.

P

j

_(Y

i

) : probability of Y

i

if subject i belongs to group j

⇒ P(Y

_i

) =

r

X

j =1

π

j

P

j

(Y

i

).

(1)

Finite mixture model Daniel S. Nagin (Carnegie Mellon University)

finite : sums across a finite number of groups

(25)

The Likelihood Function (3)

Hypothesis: In a given group, conditional independence is assumed for the

sequential realizations of the elements of Y

i

!!!

⇒ P

j

(Y

i

) =

T

Y

t=1

p

j

(y

i

t

),

(2)

where p

j

(y

i

t

) denotes the probability of y

i

t

given membership in group j .

Likelihood of the estimator:

(26)

The Likelihood Function (3)

Hypothesis: In a given group, conditional independence is assumed for the

sequential realizations of the elements of Y

i

!!!

⇒ P

j

(Y

i

) =

T

Y

t=1

p

j

(y

i

t

),

(2)

where p

j

(y

i

t

) denotes the probability of y

i

t

given membership in group j .

Likelihood of the estimator:

(27)

The Likelihood Function (3)

Hypothesis: In a given group, conditional independence is assumed for the

sequential realizations of the elements of Y

i

!!!

⇒ P

j

(Y

i

) =

T

Y

t=1

p

j

(y

i

t

),

(2)

where p

j

(y

i

t

) denotes the probability of y

i

t

given membership in group j .

Likelihood of the estimator:

(28)

The Likelihood Function (3)

Hypothesis: In a given group, conditional independence is assumed for the

sequential realizations of the elements of Y

i

!!!

⇒ P

j

(Y

i

) =

T

Y

t=1

p

j

(y

i

t

),

(2)

where p

j

(y

i

t

) denotes the probability of y

i

t

given membership in group j .

Likelihood of the estimator:

(29)

The Likelihood Function (3)

Hypothesis: In a given group, conditional independence is assumed for the

sequential realizations of the elements of Y

i

!!!

⇒ P

j

(Y

i

) =

T

Y

t=1

p

j

(y

i

t

),

(2)

where p

j

(y

i

t

) denotes the probability of y

i

t

given membership in group j .

Likelihood of the estimator:

(30)

The case of a censored normal distribution (1)

Y

∗

: latent variable measured by Y .

y

_i

∗

_t

= β

₀

j

+ β

₁

j

Age

i

t

+ β

j

2 Age

2 i

t

+ β

j

3 Age

3 i

t

+ β

j

4 Age

4 i

t

+ ε

i

t

,

(4)

where ε

i

t

∼ N (0, σ), σ being a constant standard deviation.

Hence,

y

i

t

= S

min

si

y

∗

i

t

< S

min

,

y

i

t

= y

∗

i

t

si

S

min

≤ y

∗

i

t

≤ S

max

,

y

i

t

= S

max

si

y

∗

i

t

> S

max

,

where S

min

and S

max

dennote the minimum and maximum of the censored

(31)

The case of a censored normal distribution (1)

Y

∗

: latent variable measured by Y .

y

_i

∗

_t

= β

₀

j

+ β

₁

j

Age

i

t

+ β

j

2 Age

2 i

t

+ β

j

3 Age

3 i

t

+ β

j

4 Age

4 i

t

+ ε

i

t

,

(4)

where ε

i

t

∼ N (0, σ), σ being a constant standard deviation.

Hence,

y

i

t

= S

min

si

y

∗

i

t

< S

min

,

y

i

t

= y

∗

i

t

si

S

min

≤ y

∗

i

t

≤ S

max

,

y

i

t

= S

max

si

y

∗

i

t

> S

max

,

where S

min

and S

max

dennote the minimum and maximum of the censored

(32)

The case of a censored normal distribution (1)

Y

∗

: latent variable measured by Y .

y

_i

∗

_t

= β

₀

j

+ β

₁

j

Age

i

t

+ β

j

2 Age

2 i

t

+ β

j

3 Age

3 i

t

+ β

j

4 Age

4 i

t

+ ε

i

t

,

(4)

where ε

i

t

∼ N (0, σ), σ being a constant standard deviation.

Hence,

y

i

t

= S

min

si

y

∗

i

t

< S

min

,

y

i

t

= y

∗

i

t

si

S

min

≤ y

∗

i

t

≤ S

max

,

y

i

t

= S

max

si

y

∗

i

t

> S

max

,

where S

min

and S

max

dennote the minimum and maximum of the censored

(33)

The case of a censored normal distribution (2)

Notations :

β

j

x

i

t

= β

j

0 + β

j

1 Age

i

t

+ β

j

2 Age

i

2

t

+ β

j

3 Age

i

3

t

+ β

j

4 Age

i

4

t

.

φ: density of standard centered normal law.

Φ: cumulative distribution function of standard centered normal law.

Hence,

p

j

(y

i

t

= S

min

) = Φ

S

min

− β

j

x

i

t

σ

,

(5)

p

j

(y

i

t

) =

1 σ

φ

y

i

t

− β

j

_x

it

σ

pour

S

min

≤ y

it

≤ S

max

,

(6)

p

j

(y

i

t

= S

max

) = 1 − Φ

S

max

− β

j

x

i

t

σ

(34)

The case of a censored normal distribution (2)

Notations :

β

j

x

i

t

= β

j

0 + β

j

1 Age

i

t

+ β

j

2 Age

i

2

t

+ β

j

3 Age

i

3

t

+ β

j

4 Age

i

4

t

.

φ: density of standard centered normal law.

Φ: cumulative distribution function of standard centered normal law.

Hence,

p

j

(y

i

t

= S

min

) = Φ

S

min

− β

j

x

i

t

σ

,

(5)

p

j

(y

i

t

) =

1 σ

φ

y

i

t

− β

j

_x

it

σ

pour

S

min

≤ y

it

≤ S

max

,

(6)

p

j

(y

i

t

= S

max

) = 1 − Φ

S

max

− β

j

x

i

t

σ

(35)

The case of a censored normal distribution (2)

Notations :

β

j

x

i

t

= β

j

0 + β

j

1 Age

i

t

+ β

j

2 Age

i

2

t

+ β

j

3 Age

i

3

t

+ β

j

4 Age

i

4

t

.

φ: density of standard centered normal law.

Φ: cumulative distribution function of standard centered normal law.

Hence,

p

j

(y

i

t

= S

min

) = Φ

S

min

− β

j

x

i

t

σ

,

(5)

p

j

(y

i

t

) =

1 σ

φ

y

i

t

− β

j

_x

it

σ

pour

S

min

≤ y

it

≤ S

max

,

(6)

p

j

(y

i

t

= S

max

) = 1 − Φ

S

max

− β

j

x

i

t

σ

(36)

The case of a censored normal distribution (2)

Notations :

β

j

x

i

t

= β

j

0 + β

j

1 Age

i

t

+ β

j

2 Age

i

2

t

+ β

j

3 Age

i

3

t

+ β

j

4 Age

i

4

t

.

φ: density of standard centered normal law.

Φ: cumulative distribution function of standard centered normal law.

Hence,

p

j

(y

i

t

= S

min

) = Φ

S

min

− β

j

x

i

t

σ

,

(5)

p

j

(y

i

t

) =

1 σ

φ

y

i

t

− β

j

_x

it

σ

pour

S

min

≤ y

it

≤ S

max

,

(6)

p

j

(y

i

t

= S

max

) = 1 − Φ

S

max

− β

j

x

i

t

σ

(37)

The case of a censored normal distribution (2)

Notations :

β

j

x

i

t

= β

j

0 + β

j

1 Age

i

t

+ β

j

2 Age

i

2

t

+ β

j

3 Age

i

3

t

+ β

j

4 Age

i

4

t

.

φ: density of standard centered normal law.

Φ: cumulative distribution function of standard centered normal law.

Hence,

p

j

(y

i

t

= S

min

) = Φ

S

min

− β

j

x

i

t

σ

,

(5)

p

j

(y

i

t

) =

1 σ

φ

y

i

t

− β

j

_x

it

σ

pour

S

min

≤ y

it

≤ S

max

,

(6)

p

j

(y

i

t

= S

max

) = 1 − Φ

S

max

− β

j

x

i

t

σ

(38)

The case of a censored normal distribution (2)

Notations :

β

j

x

i

t

= β

j

0 + β

j

1 Age

i

t

+ β

j

2 Age

i

2

t

+ β

j

3 Age

i

3

t

+ β

j

4 Age

i

4

t

.

φ: density of standard centered normal law.

Φ: cumulative distribution function of standard centered normal law.

Hence,

p

j

(y

i

t

= S

min

) = Φ

S

min

− β

j

x

i

t

σ

,

(5)

p

j

(y

i

t

) =

1 σ

φ

y

i

t

− β

j

_x

it

σ

pour

S

min

≤ y

it

≤ S

max

,

(6)

p

j

(y

i

t

= S

max

) = 1 − Φ

S

max

− β

j

x

i

t

σ

(39)

The case of a censored normal distribution (2)

Notations :

β

j

x

i

t

= β

j

0 + β

j

1 Age

i

t

+ β

j

2 Age

i

2

t

+ β

j

3 Age

i

3

t

+ β

j

4 Age

i

4

t

.

φ: density of standard centered normal law.

Φ: cumulative distribution function of standard centered normal law.

Hence,

p

j

(y

i

t

= S

min

) = Φ

S

min

− β

j

x

i

t

σ

,

(5)

p

j

(y

i

t

) =

1 σ

φ

y

i

t

− β

j

_x

it

σ

pour

S

min

≤ y

it

≤ S

max

,

(6)

p

j

(y

i

t

= S

max

) = 1 − Φ

S

max

− β

j

x

i

t

σ

(40)

The case of a censored normal distribution (2)

Notations :

β

j

x

i

t

= β

j

0 + β

j

1 Age

i

t

+ β

j

2 Age

i

2

t

+ β

j

3 Age

i

3

t

+ β

j

4 Age

i

4

t

.

φ: density of standard centered normal law.

Φ: cumulative distribution function of standard centered normal law.

Hence,

p

j

(y

i

t

= S

min

) = Φ

S

min

− β

j

x

i

t

σ

,

(5)

p

j

(y

i

t

) =

1 σ

φ

y

i

t

− β

j

_x

it

σ

pour

S

min

≤ y

it

≤ S

max

,

(6)

p

j

(y

i

t

= S

max

) = 1 − Φ

S

max

− β

j

x

i

t

σ

(41)

The case of a censored normal distribution (3)

If all the measures are in the interval [S

min

, S

max

], we get

L =

1 σ

N

Y

i =1

r

X

j =1

π

j

T

Y

t=1

φ

y

i

t

− β

j

_x

i

t

σ

.

(8)

It is too complicated to get closed-forms equations

⇒ quasi-Newton procedure maximum research routine

Software:

SAS-based Proc Traj procedure

(42)

The case of a censored normal distribution (3)

If all the measures are in the interval [S

min

, S

max

], we get

L =

1 σ

N

Y

i =1

r

X

j =1

π

j

T

Y

t=1

φ

y

i

t

− β

j

_x

i

t

σ

.

(8)

It is too complicated to get closed-forms equations

⇒ quasi-Newton procedure maximum research routine

Software:

SAS-based Proc Traj procedure

(43)

The case of a censored normal distribution (3)

If all the measures are in the interval [S

min

, S

max

], we get

L =

1 σ

N

Y

i =1

r

X

j =1

π

j

T

Y

t=1

φ

y

i

t

− β

j

_x

i

t

σ

.

(8)

It is too complicated to get closed-forms equations

⇒ quasi-Newton procedure maximum research routine

Software:

SAS-based Proc Traj procedure

(44)

The case of a censored normal distribution (3)

If all the measures are in the interval [S

min

, S

max

], we get

L =

1 σ

N

Y

i =1

r

X

j =1

π

j

T

Y

t=1

φ

y

i

t

− β

j

_x

i

t

σ

.

(8)

It is too complicated to get closed-forms equations

⇒ quasi-Newton procedure maximum research routine

Software:

SAS-based Proc Traj procedure

(45)

The case of a censored normal distribution (3)

If all the measures are in the interval [S

min

, S

max

], we get

L =

1 σ

N

Y

i =1

r

X

j =1

π

j

T

Y

t=1

φ

y

i

t

− β

j

_x

i

t

σ

.

(8)

It is too complicated to get closed-forms equations

⇒ quasi-Newton procedure maximum research routine

Software:

SAS-based Proc Traj procedure

(46)

A computational trick

The estimations of π

j

must be in [0, 1].

It is difficult to force this constraint in model estimation.

Instead, we estimate the real parameters θ

j

such that

(47)

A computational trick

The estimations of π

j

must be in [0, 1].

It is difficult to force this constraint in model estimation.

Instead, we estimate the real parameters θ

j

such that

(48)

A computational trick

The estimations of π

j

must be in [0, 1].

It is difficult to force this constraint in model estimation.

Instead, we estimate the real parameters θ

j

such that

(49)

A computational trick

The estimations of π

j

must be in [0, 1].

It is difficult to force this constraint in model estimation.

Instead, we estimate the real parameters θ

j

such that

(50)

A computational trick

The estimations of π

j

must be in [0, 1].

It is difficult to force this constraint in model estimation.

Instead, we estimate the real parameters θ

j

such that

(51)

Muth´

en’s model (1)

Muth´

en and Shedden (1999): Generalized growth curve model

Elegant and technically demanding extension of the uncensored normal

model.

Adds random effects to the parameters β

j

that define a group’s mean

trajectory.

Trajectories of individual group members can vary from the group

trajectory.

Software:

(52)

Muth´

en’s model (1)

Muth´

en and Shedden (1999): Generalized growth curve model

Elegant and technically demanding extension of the uncensored normal

model.

Adds random effects to the parameters β

j

that define a group’s mean

trajectory.

Trajectories of individual group members can vary from the group

trajectory.

Software:

(53)

Muth´

en’s model (1)

Muth´

en and Shedden (1999): Generalized growth curve model

Elegant and technically demanding extension of the uncensored normal

model.

Adds random effects to the parameters β

j

that define a group’s mean

trajectory.

Trajectories of individual group members can vary from the group

trajectory.

Software:

(54)

Muth´

en’s model (1)

Muth´

en and Shedden (1999): Generalized growth curve model

Elegant and technically demanding extension of the uncensored normal

model.

Adds random effects to the parameters β

j

that define a group’s mean

trajectory.

Trajectories of individual group members can vary from the group

trajectory.

Software:

(55)

Muth´

en’s model (1)

Muth´

en and Shedden (1999): Generalized growth curve model

Elegant and technically demanding extension of the uncensored normal

model.

Adds random effects to the parameters β

j

that define a group’s mean

trajectory.

Trajectories of individual group members can vary from the group

trajectory.

Software:

(56)

Muth´

en’s model (2)

Advantage of GGCM

Fewer groups are required to specify a satisfactory model.

Disadvantages of GGCM

1

Difficult to extend to other types of data.

2

Group cross-over effects.

(57)

Muth´

en’s model (2)

Advantage of GGCM

Fewer groups are required to specify a satisfactory model.

Disadvantages of GGCM

1

Difficult to extend to other types of data.

2

Group cross-over effects.

(58)

Muth´

en’s model (2)

Advantage of GGCM

Fewer groups are required to specify a satisfactory model.

Disadvantages of GGCM

1

Difficult to extend to other types of data.

2

Group cross-over effects.

(59)

Muth´

en’s model (2)

Advantage of GGCM

Fewer groups are required to specify a satisfactory model.

Disadvantages of GGCM

1

Difficult to extend to other types of data.

2

Group cross-over effects.

(60)

Model Selection

Bayesian Information Criterion:

BIC = log(L) − 0, 5k log(N),

(11)

where k denotes the number of parameters in the model.

Rule:

(61)

Model Selection

Bayesian Information Criterion:

BIC = log(L) − 0, 5k log(N),

(11)

where k denotes the number of parameters in the model.

Rule:

(62)

Model Selection

Bayesian Information Criterion:

BIC = log(L) − 0, 5k log(N),

(11)

where k denotes the number of parameters in the model.

Rule:

(63)

Posterior Group-Membership Probabilities

Posterior probability of individual i ’s membership in group j : P(j /Y

i

).

Bayes’s theorem

⇒ P(j/Y

i

) =

P(Y

i

/j )ˆ

π

j

r

X

j =1

P(Y

i

/j )ˆ

π

j

.

(12)

Bigger groups have on average larger probability estimates.

(64)

Posterior Group-Membership Probabilities

Posterior probability of individual i ’s membership in group j : P(j /Y

i

).

Bayes’s theorem

⇒ P(j/Y

i

) =

P(Y

i

/j )ˆ

π

j

r

X

j =1

P(Y

i

/j )ˆ

π

j

.

(12)

Bigger groups have on average larger probability estimates.

(65)

Posterior Group-Membership Probabilities

Posterior probability of individual i ’s membership in group j : P(j /Y

i

).

Bayes’s theorem

⇒ P(j/Y

i

) =

P(Y

i

/j )ˆ

π

j

r

X

j =1

P(Y

i

/j )ˆ

π

j

.

(12)

Bigger groups have on average larger probability estimates.

(66)

Posterior Group-Membership Probabilities

Posterior probability of individual i ’s membership in group j : P(j /Y

i

).

Bayes’s theorem

⇒ P(j/Y

i

) =

P(Y

i

/j )ˆ

π

j

r

X

j =1

P(Y

i

/j )ˆ

π

j

.

(12)

Bigger groups have on average larger probability estimates.

(67)

Posterior Group-Membership Probabilities

Posterior probability of individual i ’s membership in group j : P(j /Y

i

).

Bayes’s theorem

⇒ P(j/Y

i

) =

P(Y

i

/j )ˆ

π

j

r

X

j =1

P(Y

i

/j )ˆ

π

j

.

(12)

Bigger groups have on average larger probability estimates.

(68)

Use for Model diagnostics (2)

Diagnostic 1: Average Posterior Probability of Assignment

AvePP should be at least 0, 7 for all groups.

Diagonostic 2: Odds of Correct Classification

OCC

j

=

AvePP

j

/1 − AvePP

j

ˆ

π

j

/1 − ˆ

π

j

.

(13)

(69)

Use for Model diagnostics (2)

Diagnostic 1: Average Posterior Probability of Assignment

AvePP should be at least 0, 7 for all groups.

Diagonostic 2: Odds of Correct Classification

OCC

j

=

AvePP

j

/1 − AvePP

j

ˆ

π

j

/1 − ˆ

π

j

.

(13)

(70)

Use for Model diagnostics (2)

Diagnostic 1: Average Posterior Probability of Assignment

AvePP should be at least 0, 7 for all groups.

Diagonostic 2: Odds of Correct Classification

OCC

j

=

AvePP

j

/1 − AvePP

j

ˆ

π

j

/1 − ˆ

π

j

.

(13)

(71)

Use for Model diagnostics (2)

Diagonostic 3: Comparing ˆ

πj

to the Proportion of the Sample

Assigned to Group j

The ratio of the two should be close to 1.

Diagonostic 4: Confidence Intervals for Group Membership

Probabilities

(72)

Use for Model diagnostics (2)

Diagonostic 3: Comparing ˆ

πj

to the Proportion of the Sample

Assigned to Group j

The ratio of the two should be close to 1.

Diagonostic 4: Confidence Intervals for Group Membership

Probabilities

(73)

Outline

1 Nagin’s Finite Mixture Model

2 The Luxemburgish salary trajectories

3 Description of the groups

(74)

The data

Salaries of workers in the private sector in Luxembourg from 1940 to 2006.

About 7 million salary lines corresponding to 718.054 workers.

Some sociological variables:

gender (male, female)

nationality and residentship (luxemburgish residents, foreign residents,

foreign non residents)

working status (white collar worker, blue collar worker)

year of birth

(75)

The data

Salaries of workers in the private sector in Luxembourg from 1940 to 2006.

About 7 million salary lines corresponding to 718.054 workers.

Some sociological variables:

gender (male, female)

nationality and residentship (luxemburgish residents, foreign residents,

foreign non residents)

working status (white collar worker, blue collar worker)

year of birth

(76)

The data

Salaries of workers in the private sector in Luxembourg from 1940 to 2006.

About 7 million salary lines corresponding to 718.054 workers.

Some sociological variables:

gender (male, female)

nationality and residentship (luxemburgish residents, foreign residents,

foreign non residents)

working status (white collar worker, blue collar worker)

year of birth

(77)

The data

Salaries of workers in the private sector in Luxembourg from 1940 to 2006.

About 7 million salary lines corresponding to 718.054 workers.

Some sociological variables:

gender (male, female)

nationality and residentship (luxemburgish residents, foreign residents,

foreign non residents)

working status (white collar worker, blue collar worker)

year of birth

(78)

The data

Salaries of workers in the private sector in Luxembourg from 1940 to 2006.

About 7 million salary lines corresponding to 718.054 workers.

Some sociological variables:

gender (male, female)

nationality and residentship (luxemburgish residents, foreign residents,

foreign non residents)

working status (white collar worker, blue collar worker)

year of birth

(79)

The data

Salaries of workers in the private sector in Luxembourg from 1940 to 2006.

About 7 million salary lines corresponding to 718.054 workers.

Some sociological variables:

gender (male, female)

nationality and residentship (luxemburgish residents, foreign residents,

foreign non residents)

working status (white collar worker, blue collar worker)

year of birth

(80)

The data

Salaries of workers in the private sector in Luxembourg from 1940 to 2006.

About 7 million salary lines corresponding to 718.054 workers.

Some sociological variables:

gender (male, female)

nationality and residentship (luxemburgish residents, foreign residents,

foreign non residents)

working status (white collar worker, blue collar worker)

year of birth

(81)

Dataset transformations

Mathematica programming

1 row per year → 1 row per worker

selection of the period we are interested in

taking out the years without years up to a maximum of five years

selection of the people who worked the number of years we are

interested in

Transformations in SPSS

elimination of all the workers who had monthly salaries above 15.000

transforming all the salaries above 7.577 to 7.577

(82)

Dataset transformations

Mathematica programming

1 row per year → 1 row per worker

selection of the period we are interested in

taking out the years without years up to a maximum of five years

selection of the people who worked the number of years we are

interested in

Transformations in SPSS

elimination of all the workers who had monthly salaries above 15.000

transforming all the salaries above 7.577 to 7.577

(83)

Dataset transformations

Mathematica programming

1 row per year → 1 row per worker

selection of the period we are interested in

taking out the years without years up to a maximum of five years

selection of the people who worked the number of years we are

interested in

Transformations in SPSS

elimination of all the workers who had monthly salaries above 15.000

transforming all the salaries above 7.577 to 7.577

(84)

Dataset transformations

Mathematica programming

1 row per year → 1 row per worker

selection of the period we are interested in

taking out the years without years up to a maximum of five years

selection of the people who worked the number of years we are

interested in

Transformations in SPSS

elimination of all the workers who had monthly salaries above 15.000

transforming all the salaries above 7.577 to 7.577

(85)

Dataset transformations

Mathematica programming

1 row per year → 1 row per worker

selection of the period we are interested in

taking out the years without years up to a maximum of five years

selection of the people who worked the number of years we are

interested in

Transformations in SPSS

elimination of all the workers who had monthly salaries above 15.000

transforming all the salaries above 7.577 to 7.577

(86)

Dataset transformations

Mathematica programming

1 row per year → 1 row per worker

selection of the period we are interested in

taking out the years without years up to a maximum of five years

selection of the people who worked the number of years we are

interested in

Transformations in SPSS

elimination of all the workers who had monthly salaries above 15.000

transforming all the salaries above 7.577 to 7.577

(87)

Dataset transformations

Mathematica programming

1 row per year → 1 row per worker

selection of the period we are interested in

taking out the years without years up to a maximum of five years

selection of the people who worked the number of years we are

interested in

Transformations in SPSS

elimination of all the workers who had monthly salaries above 15.000

transforming all the salaries above 7.577 to 7.577

(88)

Dataset transformations

Mathematica programming

1 row per year → 1 row per worker

selection of the period we are interested in

taking out the years without years up to a maximum of five years

selection of the people who worked the number of years we are

interested in

Transformations in SPSS

elimination of all the workers who had monthly salaries above 15.000

transforming all the salaries above 7.577 to 7.577

(89)

Dataset transformations

Mathematica programming

1 row per year → 1 row per worker

selection of the period we are interested in

taking out the years without years up to a maximum of five years

selection of the people who worked the number of years we are

interested in

Transformations in SPSS

(90)

Proc Traj procedure

Selection of the time period for macroeconomic reasons

(Crisis in the steel

industry and emergence of the financial market place of Luxembourg)

20 years of work for workers beginning their carrier between 1982 and 1987

Proc Traj Macro:

DATA TEST;

INPUT ID O1-O20 T1-T20;

CARDS;

data

RUN;

PROC TRAJ DATA=TEST OUTPLOT=OP OUTSTAT=OS OUT=OF

OUTEST=OE ITDETAIL;

ID ID; VAR O1-O20; INDEP T1-T20;

(91)

Proc Traj procedure

Selection of the time period for macroeconomic reasons (Crisis in the steel

industry and emergence of the financial market place of Luxembourg)

20 years of work for workers beginning their carrier between 1982 and 1987

Proc Traj Macro:

DATA TEST;

INPUT ID O1-O20 T1-T20;

CARDS;

data

RUN;

PROC TRAJ DATA=TEST OUTPLOT=OP OUTSTAT=OS OUT=OF

OUTEST=OE ITDETAIL;

ID ID; VAR O1-O20; INDEP T1-T20;

(92)

Proc Traj procedure

Selection of the time period for macroeconomic reasons (Crisis in the steel

industry and emergence of the financial market place of Luxembourg)

20 years of work for workers beginning their carrier between 1982 and 1987

Proc Traj Macro:

DATA TEST;

INPUT ID O1-O20 T1-T20;

CARDS;

data

RUN;

PROC TRAJ DATA=TEST OUTPLOT=OP OUTSTAT=OS OUT=OF

OUTEST=OE ITDETAIL;

ID ID; VAR O1-O20; INDEP T1-T20;

(93)

Proc Traj procedure

Selection of the time period for macroeconomic reasons (Crisis in the steel

industry and emergence of the financial market place of Luxembourg)

20 years of work for workers beginning their carrier between 1982 and 1987

Proc Traj Macro:

DATA TEST;

INPUT ID O1-O20 T1-T20;

CARDS;

data

RUN;

PROC TRAJ DATA=TEST OUTPLOT=OP OUTSTAT=OS OUT=OF

OUTEST=OE ITDETAIL;

ID ID; VAR O1-O20; INDEP T1-T20;

(94)

Proc Traj procedure

Selection of the time period for macroeconomic reasons (Crisis in the steel

industry and emergence of the financial market place of Luxembourg)

20 years of work for workers beginning their carrier between 1982 and 1987

Proc Traj Macro:

DATA TEST;

INPUT ID O1-O20 T1-T20;

CARDS;

data

RUN;

PROC TRAJ DATA=TEST OUTPLOT=OP OUTSTAT=OS OUT=OF

OUTEST=OE ITDETAIL;

ID ID; VAR O1-O20; INDEP T1-T20;

(95)

Proc Traj procedure

Selection of the time period for macroeconomic reasons (Crisis in the steel

industry and emergence of the financial market place of Luxembourg)

20 years of work for workers beginning their carrier between 1982 and 1987

Proc Traj Macro:

DATA TEST;

INPUT ID O1-O20 T1-T20;

CARDS;

data

RUN;

PROC TRAJ DATA=TEST OUTPLOT=OP OUTSTAT=OS OUT=OF

OUTEST=OE ITDETAIL;

(96)

(97)

(98)

(99)

Outline

1 Nagin’s Finite Mixture Model

2 The Luxemburgish salary trajectories

3 Description of the groups

(100)

1

st

group

13.4 % of the population

(101)

1

st

group

13.4 % of the population

(102)

1

st

group

13.4 % of the population

(103)

(104)

(105)

(106)

(107)

1

st

group

Men:

(108)

1

st

group

Men:

(109)

(110)

(111)

(112)

(113)

(114)

2

nd

group

16.9 % of the population

(115)

2

nd

group

16.9 % of the population

(116)

2

nd

group

16.9 % of the population

(117)

(118)

(119)

(120)

(121)

2

nd

group

Men:

(122)

2

nd

group

Men:

(123)

(124)

(125)

(126)

(127)

(128)

3

rd

group

20.8 % of the population

(129)

3

rd

group

20.8 % of the population

(130)

3

rd

group

20.8 % of the population

(131)

(132)

(133)

(134)

(135)

3

rd

group

Men:

(136)

3

rd

group

Men:

(137)

(138)

(139)

(140)

(141)

(142)

4

th

group

7.9 % of the population

(143)

4

th

group

7.9 % of the population

(144)

4

th

group

7.9 % of the population

(145)

(146)

(147)

(148)

(149)

(150)

(151)

(152)

(153)

(154)

5

th

group

14.9 % of the population

(155)

5

th

group

14.9 % of the population

(156)

5

th

group

14.9 % of the population

(157)

(158)

(159)

(160)

(161)

5

th

group

Men: