Analysis of ordinal longitudinal data

(1)

Analysis of ordinal longitudinal

data

Anne-Françoise Donneau

af.donneau@ulg.ac.be

(2)

Plan

(3)

Plan

1. Ordinal longitudinal data

2. Analysis of longitudinal data - Marginal

approach

(4)

Plan

1. Ordinal longitudinal data

2. Analysis of longitudinal data - Marginal

approach

(5)

Plan

1. Ordinal longitudinal data

2. Analysis of longitudinal data - Marginal

approach

3. Global odds ratio as measure of association

(6)

Plan

1. Ordinal longitudinal data

2. Analysis of longitudinal data - Marginal

approach

3. Global odds ratio as measure of association

4. Analysis of ordinal longitudinal data

(7)

Plan

1. Ordinal longitudinal data

2. Analysis of longitudinal data - Marginal

approach

3. Global odds ratio as measure of association

4. Analysis of ordinal longitudinal data

5. Application

(8)

(9)

Ordinal longitudinal data

(10)

Ordinal longitudinal data

What is longitudinal data?

-

Longitudinal data correspond to repeated observations of an outcome variable and a set of covariates for each of many subjects

(11)

Ordinal longitudinal data

What is longitudinal data?

-

(12)

Ordinal longitudinal data

What is longitudinal data?

-

What is ordinal variable?

-

Ordinal variable is a categorical variable presenting a logical ordering of the categories

(13)

Ordinal longitudinal data

What is longitudinal data?

-

What is ordinal variable?

-

Ordinal variable is a categorical variable presenting a logical ordering of the categories

(14)

Ordinal longitudinal data - Example

524 patients, randomized into two treatment groups (RT or RT+PCV) and observed at 9 occassions time.

(15)

Ordinal longitudinal data - Example

(16)

Ordinal longitudinal data - Example

Ordinal variable: Global health status (QL)

(17)

Ordinal longitudinal data - Example

N=524 patients

Y =QL (Yi = (yi1, ..., yini)

′ _, _i _{= 1, ..., N} _with _n

(18)

Ordinal longitudinal data - Example

N=524 patients

Y =QL (Yi = (yi1, ..., yini)

′ _, _i _{= 1, ..., N} _with _n

i ≤ 9)

(19)

Ordinal longitudinal data - Example

N=524 patients

Y =QL (Yi = (yi1, ..., yini)

′ _, _i _{= 1, ..., N} _with _n

i ≤ 9)

X=(Treat,Time,TreatxTime)’

(20)

(21)

Analysis of longitudinal data - Marginal approach

(22)

Analysis of longitudinal data - Marginal approach

Objective

-

Describe the marginal expectation of the outcome

variable as a function of the covariates while

accounting for the correlation among the repeated observations for a given subject

(23)

Analysis of longitudinal data - Marginal approach

Objective

-

(24)

Analysis of longitudinal data - Marginal approach

Objective

-

Tools

(25)

Analysis of longitudinal data - Marginal approach

Objective

-

Tools

-

Gaussian outcome _→ large class of linear models

- Non Gaussian outcome _→ limited number of models,

(26)

Analysis of longitudinal data - Marginal approach

Objective

-

Tools

-

Gaussian outcome _→ large class of linear models

(27)

(28)

Analysis of longitudinal data - Marginal approach

(29)

Analysis of longitudinal data - Marginal approach

Solution

(30)

Analysis of longitudinal data - Marginal approach

Solution

-

Alternative to the likelihood-based analysis

↓

(Liang and Zeger, 1986)

Generalized estimating equations GEE

(31)

Analysis of longitudinal data - GEE1

In GLM, score equation writes

S(β) = X i ∂µ_i ∂β v −1 i (yi − µi) = 0 , with vi = V ar(Yi)

(32)

Analysis of longitudinal data - GEE1

S(β) = X i ∂µ_i ∂β v −1 i (yi − µi) = 0 , with vi = V ar(Yi) Multivariate extension, Y_i = (y_i1, ..., y_in_i)′

(33)

Analysis of longitudinal data - GEE1

S(β) = X i ∂µ_i ∂β v −1 i (yi − µi) = 0 , with vi = V ar(Yi) Multivariate extension, Y_i = (y_i1, ..., y_in_i)′ S(β) = N X i=1 ni X j=1 ∂µ_ij ∂β v −1 ij (yij − µij) = N X ∂µ_i′ V −1(Y _{− µ} ) = 0

(34)

Analysis of longitudinal data - GEE1

S(β) = X i ∂µ_i ∂β v −1 i (yi − µi) = 0 , with vi = V ar(Yi) Multivariate extension, Y_i = (y_i1, ..., y_in_i)′ S(β) = N X i=1 D_i′V_i−1(Yi − µi) = 0

(35)

Analysis of longitudinal data - GEE1

S(β) = X i ∂µ_i ∂β v −1 i (yi − µi) = 0 , with vi = V ar(Yi) Multivariate extension, Y_i = (y_i1, ..., y_in_i)′ S(β) = N X i=1 D_i′V_i−1(Yi − µi) = 0

With µ_i = E(Yi) and Vi = diag(vi1, ..., vini)

(36)

Analysis of longitudinal data - GEE1

-

Problem : Correlations among the repeated

(37)

Analysis of longitudinal data - GEE1

-

observations for a given subject

- Solution: Allow non-diagonal V_i, _→ ’working’ correlation matrix

(38)

Analysis of longitudinal data - GEE1

-

Let R_i(α) be a n_i _{× n}_i symmetric matrix which fullfills the condition to be a correlation matrix , and let α be an

(39)

Analysis of longitudinal data - GEE1

-

Define,

V_i = A1/2_i (β)R_i(α)A1/2_i (β)

(40)

Analysis of longitudinal data - GEE1

-

Define,

V_i = A1/2_i (β)R_i(α)A1/2_i (β)

in which A_i_{(β) = diag {v}_ij(µ_ij_(β))}

(41)

Analysis of longitudinal data - GEE1

-

Using the new definition of V_i, consistent estimates of the model parameters, β, can be obtained from the solution of the estimating equations

N

X

(42)

Analysis of longitudinal data - GEE1

-

Using the new definition of V_i, consistent estimates of the model parameters, β, can be obtained from the solution of the estimating equations

(43)

Large sample properties

As _{N → ∞} , √N ( ˆβ _{− β) ≈} N(0, I₀−1I₁I₀−1) With I₀ = N X i=1 D_i′V_i−1D_i I₁ = N X i=1 D_i′V_i−1V ar(Yi)V_i−1Di

(44)

Large sample properties

(45)

Large sample properties

In practice, V ar(Y_i) is replaced by

(46)

Large sample properties

When R_i(α) is the true correlation matrix of the Y_i’s,

(47)

Large sample properties

(48)

Estimation of working correlation

Pearson residual to estimate α

Sensitive Corr(Y_ij, Y_ik) Estimate

Independence 0 -Exchangeable α αˆ = _N1 PN i=1 ni(n1i−1) P j6=k eijeik AR(1) αt αˆ = _N1 PN i=1 ni1−1 P j≤ni−1 eijei,j+1 Unstructured α_jk αˆ_jk = _N1 PN i=1 eijeik e = yij − µij

(49)

Fitting GEE1

(50)

Fitting GEE1

1.

Compute initial estimate for β

(51)

Fitting GEE1

1.

2. Compute Pearson residual e_ij

(52)

Fitting GEE1

1.

3. Compute estimate for α

(53)

Fitting GEE1

1.

4. Compute R_i(α)

(54)

Fitting GEE1

1.

4. Compute R_i(α)

5. Compute V_i = A1/2_i (β)R_i(α)A1/2_i (β)

6. Update estimate for β:

ˆ β(m+1) = ˆβ(m) ₋ " N X ∂µ′_i ∂β V −1 i ∂µi ∂β #−1 " N X ∂µ′_i ∂β V −1 i (Yi − µi) #−1

(55)

Fitting GEE1

1.

4. Compute R_i(α)

ˆ β(m+1) = ˆβ(m) ₋ " N X i₌₁ ∂µ′ i ∂β V −1 i ∂µi ∂β #−1 " N X i₌₁ ∂µ′ i ∂β V −1 i (Yi − µi) #−1

(56)

Fitting GEE1

1.

4. Compute R_i(α)

ˆ β(m+1) = ˆβ(m) ₋ " N X i₌₁ ∂µ′ i ∂β V −1 i ∂µi ∂β #−1 " N X i₌₁ ∂µ′ i ∂β V −1 i (Yi − µi) #−1

(57)

GEE1 - Conclusions

(58)

GEE1 - Conclusions

-

Mean structure needs to be correctly specified

- Consistent estimates (βˆ) even when the working correlation matrix is misspecified

(59)

GEE1 - Conclusions

-

- Misspecification of working correlation matrix can affect the efficiency of βˆ

(60)

GEE1 - Conclusions

-

(61)

GEE1 - Conclusions

-

- Price to pay: correlation structure should not be interpreted

(62)

Analysis of longitudinal data - GEE2

Scientific interest of GEE2 is on modeling

- the expectation of Y and

(63)

Analysis of longitudinal data - GEE2

Scientific interest of GEE2 is on modeling

- the expectation of Y and

- the pairwise association of y_ij and y_ik

Let γ_jk denotes the pairwise association and assume that

(64)

Analysis of longitudinal data - GEE2

(65)

Analysis of longitudinal data - GEE2

δ = (β, α) are the parameters of interest.

(66)

Analysis of longitudinal data - GEE2

Define ω_i the vector of all pairwise products of outcomes.

(67)

Analysis of longitudinal data - GEE2

Define ω_i the vector of all pairwise products of outcomes.

Let µ_i = E[Yi] and ηi = E[ωi].

δ can be estimated by solving the set of equations

N

X

i=1

(68)

Analysis of longitudinal data - GEE2

N X i=1 D_iV_i−1f_i = 0 _→ GEE2 With D_i =   ∂µi ∂β 0 ∂ηi ∂β ∂ηi ∂α   , Vi =   V ar(Yi) cov(Yi, ωi) cov(ω_i, Y_i) V ar(ω_i)   and f_i =   Y_i _{− µ}_i ω_i _{− η}_i  

(69)

Analysis of longitudinal data - GEE2

Consistency of δˆ depends on the correct specification of

(70)

Analysis of longitudinal data - GEE2

(71)

Analysis of longitudinal data - GEE2

(72)

Analysis of longitudinal data - GEE2

N X i=1 D_iV_i−1f_i = 0 ↓ N X i=1 ∂µ′ i ∂β V ar[Yi] −1_(Y i − µi) = 0 N X ∂η_i′ ∂α V ar[ωi] −1_(ω i − ηi) = 0

(73)

Global odds ratio as measure of association

Measure of association between pairs of outcomes:

(74)

Global odds ratio as measure of association

- Continuous variable : correlation coefficient

(75)

Global odds ratio as measure of association

- Binary variable : odds ratio (OR)

(76)

Global odds ratio as measure of association

(77)

Global odds ratio as measure of association

- Ordinal variable: Global odds ratio (GOR)

Global odds ratio is an extension of the simple odds ratio for a _{2 × 2} contingency table

(78)

Global odds ratio as measure of association

It can be defined as the odds ratio of a _{2 × 2} table when

adjacent rows and columns of a K _{× K} contingency table

(79)

Global odds ratio as measure of association

(80)

Global odds ratio as measure of association - notation

For each subject i (i = 1, ..., N ) and at each time

t = 1, ..., n_i, we have an ordinal outcome with K levels

(81)

Global odds ratio as measure of association - notation

Z_it = k , k = 1, ..., K.

Consider the ordinal outcomes of the subject i at the two time occasions s and t, Z_is and Z_it.

(82)

Global odds ratio as measure of association - notation

Z_it = k , k = 1, ..., K.

Consider the ordinal outcomes of the subject i at the two time occasions s and t, Z_is and Z_it.

γ_itk = P [Zit ≤ k|Xit = xit] and

(83)

Global odds ratio as measure of association - definition

The joint probabilities of the outcomes of the subject i at

occasions s and t can be arranged in a K _{× K} contingency

(84)

Global odds ratio as measure of association - definition

table.           P [Zis = 1, Zit = 1] . . . P [Zis = 1, Zit = K] . . . . . . . . . . . . . . .          

(85)

Global odds ratio as measure of association - definition

table.

Dichotomizing this contingency table at (j, k) leads to the following collapsed _{2 × 2} contingency tables

(86)

Global odds ratio as measure of association - definition

          P [Z_is = 1, Z_it = 1] . . . P [Z_is = 1, Z_it = K] . . . . . . . . . . . . . . . P [Z_is = K, Z_it = 1] . . . P [Z_is = K, Z_it = K]           ↓  P [Z_is _{≤ j, Z}_it _{≤ k] P [Z}_is _{≤ j, Z}_it _{≻ k]} 

(87)

Global odds ratio as measure of association - definition

  P [Z_is _{≤ j, Z}_it _{≤ k] P [Z}_is _{≤ j, Z}_it _{≻ k]} P [Z_is _{≻ j, Z}_it _{≤ k] P [Z}_is _{≻ j, Z}_it _{≻ k]}  

(88)

Global odds ratio as measure of association - definition

The global odds ratio for subject i at occasion times s and

t, at the cutpoint (j, k) is defined as

Ψ_ijk(s, t) = P[Zis ≤ j, Zit ≤ k]P [Zis ≻ j, Zit ≻ k] P[Z_is _{≤ j, Z}_it _{≻ k]P [Z}_is _{≻ j, Z}_it _{≤ k]}

(89)

Global odds ratio as measure of association - definition

Using definition of F_ijk(s, t), γ_isj and γ_itk, the global odds ratio can be reexpressed as follow,

Ψ_ijk(s, t) = Fijk(s, t){1 − γisj − γitk + Fijk(s, t)} {γisj − Fijk(s, t)}{γitk − Fijk(s, t)}

(90)

Global odds ratio as measure of association - definition

Using definition of F_ijk(s, t), γ_isj and γ_itk, the global odds ratio can be reexpressed as follow,

(91)

(92)

Analysis of ordinal longitudinal data

(93)

Analysis of ordinal longitudinal data

Estimation of β

Let Y_i = (Y ′

i1, ..., Yin′ i)

′ _with _Y

it = (Yit1, ..., Yit,(K−1))′ where

Y_itk =    1 if Zit = k 0 otherwise

(94)

Analysis of ordinal longitudinal data

Estimation of β

Let Y_i = (Y ′

i1, ..., Yin′ i)

′ _with _Y

it = (Yit1, ..., Yit,(K−1))′ where

Y_itk =    1 if Zit = k 0 otherwise Let π_i = (π′

i1, ..., πin′ _i) with πit′ = (πit1, .., πit,(K−1))′ where

(95)

Analysis of ordinal longitudinal data

First set of estimating equations,

N

X

i=1

(96)

Analysis of ordinal longitudinal data

N X i=1 D_i′V_i−1(Yi − πi(β)) = 0 with D′ i = dπi(β)/dβ

(97)

Analysis of ordinal longitudinal data

N

X

i=1

D_i′V_i−1(Yi − πi(β)) = 0

(98)

Analysis of ordinal longitudinal data

N X i=1 D_i′V_i−1(Yi − πi(β)) = 0        V_11i V_12i . . . V_1n_i_i . . . . . . . . V_n_i_1i V_n_i_2i . . . V_n_i_n_i_i       

(99)

Analysis of ordinal longitudinal data

(100)

Analysis of ordinal longitudinal data

(101)

Analysis of ordinal longitudinal data

(102)

Analysis of ordinal longitudinal data

Estimation of α Let ω_i = [ω_i(1, 2)′_{, ..., ω} i(s, t)′, ..., ωi(ni − 1, ni)′]′ with

ωi(s, t)′ = (ωi11(s, t), ωi12(s, t), ..., ωi1(K−1)(s, t), ωi21(s, t), ..., ωi(K−1)(K−1)(s, t))′

where

(103)

Analysis of ordinal longitudinal data

Estimation of α

Let

η_i = [ηi(1, 2)′, ..., ηi(s, t)′, ..., ηi(n_i−1, ni)]′

with

ηi(s, t)′ = (ηi11(s, t), ηi12(s, t), ..., ηi1,(K−1)(s, t), ηi21(s, t), ..., ηi,(K−1),(K−1)(s, t))′

where

(104)

Analysis of ordinal longitudinal data

Estimation of α

Let

η_i = [ηi(1, 2)′, ..., ηi(s, t)′, ..., ηi(n_i−1, ni)]′

with

ηi(s, t)′ = (ηi11(s, t), ηi12(s, t), ..., ηi1,(K−1)(s, t), ηi21(s, t), ..., ηi,(K−1),(K−1)(s, t))′

where

(105)

Analysis of ordinal longitudinal data

Second set of estimating equations,

N

X

i=1

(106)

Analysis of ordinal longitudinal data

N X i=1 C_i′W_i−1(ωi − ηi(α, β)) = 0 with C′ i = ∂ηi(α) ∂α

(107)

Analysis of ordinal longitudinal data

N

X

i=1

C_i′W_i−1(ωi − ηi(α, β)) = 0

(108)

Analysis of ordinal longitudinal data

N X i=1 C_i′W_i−1(ωi − ηi(α, β)) = 0 Wi =          Wi12 0 . . . 0 0 W_i23 . . . 0 . . . . . . . . 0 0 . . . W_i,(n_i_−1),n_i          for i = 1, ..., N

(109)

Analysis of ordinal longitudinal data

N X i=1 C_i′W_i−1(ωi − ηi(α, β)) = 0 Wi =          Wi12 0 . . . 0 0 W_i23 . . . 0 . . . . . . . . 0 0 . . . W_i,(n_i_−1),n_i          for i = 1, ..., N

V ar(ωijk(s, t)) = P [Yisj = 1, Yitk = 1][1 − P [Yisj = 1, Yitk = 1]]

(110)

Analysis of ordinal longitudinal data

In conclusion, the estimation of (β, α) is obtained from

N X i=1 D_i′V_i−1(Y_i _{− π}_i(β)) = 0 N X i=1 C_i′W_i−1(ωi − ηi(α, β)) = 0

(111)

Analysis of ordinal longitudinal data

In conclusion, the estimation of (β, α) is obtained from

(112)

Analysis of ordinal longitudinal data

(113)

Analysis of ordinal longitudinal data

Using this definition, F_ijk(s, t) = P [Z_is _{≤ j, Z}_it _{≤ k]} can be expressed in terms of Ψ_ijk(s, t), γ_isj and γ_itk

(114)

Analysis of ordinal longitudinal data

Using this definition, F_ijk(s, t) = P [Z_is _{≤ j, Z}_it _{≤ k]} can be expressed in terms of Ψ_ijk(s, t), γ_isj and γ_itk

P [Yisj = 1, Yitk = 1] = Fijk(s, t) − Fij,k−1(s, t) − Fi,j−1,k(s, t) + Fi,j−1,k−1(s, t)

(115)

Analysis of ordinal longitudinal data

(116)

Analysis of ordinal longitudinal data

(117)

Analysis of ordinal longitudinal data

How to solve this set of equations?

(118)

Analysis of ordinal longitudinal data

A Fisher-scoring type algorithm may be used to obtain ( ˆβ, ˆα)

ˆ β(m+1) = ˆβ(m)₋ " _N X i=1 ˆ D′ iVî −1 _ˆ D′ i #−1 " _N X i=1 ˆ D′ iVî −1 (Yi − πi(β)) #−1 and ˆ α(m+1) = ˆα(m)− " _N X i=1 ˆ C′ iWî −1 _ˆ C′ i #−1 " _N X i=1 ˆ C′ iWî −1 ³ ωi − ηi( ˆβ (m+1) ,αˆ(m))´ #−1

(119)

Large sample properties

As _{N → ∞} , N1/2[( ˆβ _{− β)}′, ( ˆα _{− α)}′_{] ≈ N(0, V}_β,α), with V_β,α =   B₁₁−1 0 B₂₁ B₂₂−1     Σ₁₁ Σ₁₂ Σ′₁₂ Σ₂₂     B₁₁−1 B₂₁′ 0 B₂₂−1  

(120)

Large sample properties

As _{N → ∞} , N1/2[( ˆβ _{− β)}′, ( ˆα _{− α)}′_{] ≈ N(0, V}_β,α), with V_β,α =   B₁₁−1 0 B₂₁ B₂₂−1     Σ₁₁ Σ₁₂ Σ′₁₂ Σ₂₂     B₁₁−1 B₂₁′ 0 B₂₂−1   Where B11 = N−1 N X i=1 D′_iV_i−1D_i′ B22 = N−1 N X i=1 C_i′W_i−1C_i′

(121)

Large sample properties

As _{N → ∞} , N1/2[( ˆβ _{− β)}′, ( ˆα _{− α)}′_{] ≈ N(0, V}_β,α), with V_β,α =   B₁₁−1 0 B₂₁ B₂₂−1     Σ₁₁ Σ₁₂ Σ′₁₂ Σ₂₂     B₁₁−1 B₂₁′ 0 B₂₂−1   Where Σ11 = N−1 N X i=1 D_i′V_i−1 {V ar(Yi)} V_i−1Di Σ22 = N−1 N X i=1 C_i′W_i−1 {V ar(ωi)} W_i−1Ci Σ₁₂ = N−1 N X D_i′V −1 _{Cov(Y i, ωi)} W−1Ci

(122)

Large sample properties

The ’robust’ asymptotic covariance matrix of N1/2( ˆβ _{− β)} is

(123)

Large sample properties

The ’robust’ asymptotic covariance matrix of N1/2( ˆβ _{− β)} is

V_β = ¡B₁₁−1Σ₁₁B₁₁−1¢

The ’robust’ asymptotic covariance matrix of N1/2( ˆα _{− α)} is

(124)

Large sample properties

The ’naive’ asymptotic covariance matrix of N1/2( ˆβ _{− β)} is

(125)

Large sample properties

The ’naive’ asymptotic covariance matrix of N1/2( ˆβ _{− β)} is

V_β = B₁₁−1

The ’naive’ asymptotic covariance matrix of N1/2( ˆα _{− α)} is

(126)

Application - GEE2 - Visual impairment

Data from the Wisconsin Epidemiologic Study of Diabetic

N = 720 patients

Y = Severity of diabetic retinopathy (none, mild, moderate, proliferative)

(127)

N = 720 patients

X= (Gender (0=male, 1=female), Dose of insulin (/day))

Association between eyes

Covariates Estimates (SD) P-value

int 0.60

Gender -0.861 (0.35) 0.01

(0=male, 1=female)

(128)

N = 720 patients

int 0.60

Gender -0.861 (0.35) 0.01

(0=male, 1=female)

(129)

N = 720 patients

int 0.60

Gender -0.861 (0.35) 0.01

(0=male, 1=female)

Dose of insulin -0.807 (0.335) 0.02

(130)

Further perspectives

(131)

Further perspectives

Application to longitudinal analysis of quality of life data:

(132)

Further perspectives

- Continuous VS. Categorical VS. Ordinal

What happens when ordinal variable is treated as continuous or categorical?

(133)

Further perspectives

(134)

Further perspectives

- Treatment of missing values

Longitudinal analysis or competing risk?

- Administrative reasons, patient’s refusal, ...missing not planned

(135)

Further perspectives

- Death, data no more collected after disease progression,...

In some situations, patients are not able to go until the last planned timepoint. Are the assumptions of models treating missing data still valid?

(136)

Further perspectives

- Death, data no more collected after disease progression,...

(137)

References

1. Liang,K. and Zeger L. (1986) Longitudinal Data Analysis for Discrete and Continuous Outcomes, Biometrics, 42,

121-130

2. Liang,K. and Zeger L. (1986) Longitudinal Data Analysis using Generalized Linear Models, Biometrika, 73, 13-22 3. Lipsitz S. and Laird N. (1991) Generalized estimating

equations for correlated binary data Using the odds ratio as measure of association, Biometrika, 78, 153-160

4. Williamson J et al. (1995) Analyzing Bivariate Ordinal Data Using a Global Odds Ratio, JASA, 90, 1432-1437

(138)