HAL Id: hal-02750398
https://hal.inrae.fr/hal-02750398
Submitted on 3 Jun 2020
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
How to analyse multiple ordinal scores in a clinical trial? Multivariate vs. univariate analysis
Céline M. Laffont, Didier Concordet
To cite this version:
Céline M. Laffont, Didier Concordet. How to analyse multiple ordinal scores in a clinical trial? Multivariate vs. univariate analysis. 2. Biostatistics Conference of the Central European Network, Sep 2011, Zurich, Switzerland. ⟨hal-02750398⟩
C. Laffont & D. Concordet
INRA, UMR 1331, Toxalim, F-31027 Toulouse, France.
Université de Toulouse, INPT, ENVT, UPS, EIP, F-31076 Toulouse, France.
Project funded by Novartis Pharma AG, Switzerland
How to analyse multiple ordinal
scores in a clinical trial?
Multivariate vs univariate analysis
Case study
In dogs, the effect of robenacoxib (NSAID) on osteoarthritis is assessed using several scores
Lameness at walk: None / Mild / Obvious / Marked
Posture (at stand): Normal / Slightly abnormal / Markedly abnormal / Severely abnormal
Pain at palpation: No pain / Mild / Moderate / Severe
Lameness at trot: None / Mild / Obvious / Marked
Willingness to raise contralateral limb: No resistance / Mild resistance / Moderate resistance / Strong resistance / Refuses
Posture + Lameness at walk + Lameness at trot + Willingness to raise contralateral limb + Pain at palpation
Sum of investigator scores: 0-16
Compute the sum of scores and analyse it as a continuous variable
Why is this approach not appropriate?
It ignores the actual metric of each score and assumes that all categories are equidistant
Posture: 0 Normal, 1 Slightly abnormal, 2 Markedly abnormal, 3 Severely abnormal
Lameness at trot: 0 None, 1 Mild, 2 Obvious, 3 Marked
Within a score, the distance between 0 and 1 is not the same as the distance between 2 and 3
Between scores, the distance between 1 and 2 for posture is not the same as the distance between 1 and 2 for lameness at trot
What should be done
Analyse the data as ordered categorical data using
appropriate models (logit, probit…)
Many publications on ordinal data analysis
Applications to assess drug effect
pain relief, nicotine craving scores, sedation, diarrhea, neutropenia…
Estimation/modelling issues
AAPS 2004, JPKPD 2001, 2 articles in JPKPD 2004, JPKPD 2008…
But published models are restricted to the analysis of only one score!
Limits of univariate analyses
They only estimate marginal distributions

Drug A
                     Efficacy = 0   Efficacy = 1   Sum
Side effects = 1          1             14          15
Side effects = 0         29             56          85
Sum                      30             70         100

Drug B
                     Efficacy = 0   Efficacy = 1   Sum
Side effects = 1         14              1          15
Side effects = 0         16             69          85
Sum                      30             70         100

Drugs A and B have the same marginal distributions but different benefit-risk ratios!
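The point of this example can be checked numerically with a minimal Python sketch. The cell layout (rows = side effects yes/no, columns = efficacy no/yes) is an assumption about how the counts are arranged on the slide, and the function name `summarize` is purely illustrative.

```python
# Counts for 100 patients per drug.
# Assumed layout: rows = side effects (1, then 0), columns = efficacy (0, then 1).
drug_a = [[1, 14],
          [29, 56]]
drug_b = [[14, 1],
          [16, 69]]


def summarize(table):
    """Return two marginal proportions and one joint proportion for a 2 x 2 table."""
    total = sum(sum(row) for row in table)
    p_side_effects = sum(table[0]) / total               # marginal: side effects = 1
    p_efficacy = (table[0][1] + table[1][1]) / total     # marginal: efficacy = 1
    p_efficacy_no_side_effects = table[1][1] / total     # joint: efficacy = 1 and side effects = 0
    return p_side_effects, p_efficacy, p_efficacy_no_side_effects


summary_a = summarize(drug_a)
summary_b = summarize(drug_b)
print("Drug A:", summary_a)
print("Drug B:", summary_b)
```

The first two values (the marginal proportions) are identical for both drugs, while the third (a joint proportion) differs. A univariate analysis of each score only sees the first two.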
Univariate analyses assume that the scores are independent, whereas in many cases they are expected to be correlated
Multivariate analysis: background
Very few approaches exist to analyse jointly several ordinal scores
In 2007, Todem et al. proposed a probit mixed effects model for longitudinal bivariate data
Statist. Med. 26:1034
Application: Fluvoxamine data
Responses: efficacy and safety; covariates: age, mental illness, initial severity, time
Objectives
Extend this previous model (Todem et al.)
To analyse more than two scores (model estimation issue)
To apply to population PK/PD data
Identify similarities between scores
Model based on latent variable approach
[Figure: the lameness score (None, Mild, Obvious, Marked) as a categorisation of the continuous latent lameness intensity Y*; vertical axis from 0% to 100%]
Cut-off points: a1, a2, a3
Probit link function
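As an illustration of the latent variable idea, the sketch below computes the probability of each category when a latent variable Y* with unit variance is cut at a1 < a2 < a3. The cut-off values and the latent means are hypothetical, and the unit variance is an assumption made for the example.

```python
from statistics import NormalDist


def category_probabilities(latent_mean, cutoffs):
    """P(Y = m) for an ordinal score obtained by cutting Y* ~ N(latent_mean, 1)."""
    phi = NormalDist().cdf  # standard normal cdf (probit link)
    # Cumulative probabilities P(Y* <= a_m), padded with 0 and 1 at both ends.
    cumulative = [0.0] + [phi(a - latent_mean) for a in cutoffs] + [1.0]
    return [upper - lower for lower, upper in zip(cumulative, cumulative[1:])]


# Hypothetical cut-off points a1 < a2 < a3 (None / Mild / Obvious / Marked).
cutoffs = [-1.0, 0.0, 1.5]
for mean in (-1.0, 0.0, 2.0):
    probs = category_probabilities(mean, cutoffs)
    print(mean, [round(p, 3) for p in probs])
```

Increasing the latent mean shifts probability mass towards the higher categories, without assuming that the categories are equidistant.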
The K scores $Y_1, Y_2 \ldots Y_K$ are obtained by categorisation of K latent variables $Y_1^*, Y_2^* \ldots Y_K^*$

$$Y^*_{kij} = f_k(x_{kij}, \beta_k) + \eta_{ki} + \varepsilon_{kij}$$

$f_k$: (non)linear function for response k = 1…K
$x_{kij}$: covariates for subject i, response k and time $t_{kij}$
$\beta_k$: fixed effects for response k
$\eta_{ki}$: random effects for inter-individual variability
$\varepsilon_{kij}$: random effects for intra-individual variability
Modelling correlations between scores

$$(\eta_{1i}, \ldots, \eta_{Ki}) \sim N\big((0, \ldots, 0), \Omega\big), \qquad (\varepsilon_{1ij}, \ldots, \varepsilon_{Kij}) \sim N\big((0, \ldots, 0), \Gamma\big)$$
$$\eta_{ki} \perp \varepsilon_{kij}, \qquad \varepsilon_{kij} \perp \varepsilon_{kij'} \quad (j \neq j')$$

The correlations between the scores across time are modelled as correlations between latent variables Y*
- η: overall correlation between scores within subjects
- ε: correlation within subjects at a given occasion
Likelihood function
$$L(y_i / \eta_i) = \prod_{j=1}^{n_i} \prod_{m_1=1}^{c_1} \cdots \prod_{m_K=1}^{c_K} P\left[Y_{1ij} = m_1; \ldots; Y_{Kij} = m_K / \eta_i\right]^{I(y_{1ij} = m_1) \times \cdots \times I(y_{Kij} = m_K)}$$
$$L(y_i) = \int L(y_i / \eta_i)\, P(\eta_i)\, d\eta_i$$
$$L(y, \theta) = \prod_{i=1}^{N} L(y_i)$$
where $\eta_i = (\eta_{1i}, \ldots, \eta_{Ki})$
Parameter estimation
No closed form (nonlinear mixed effects model)
Need to compute the multivariate normal cdf Φ_K
No current software can be used: own program written in C++
Approximation of the multivariate normal cdf Φ_K: Gauss-Legendre quadratures (8 nodes)
Stochastic EM algorithm (SAEM-like): efficient for ordinal data analysis (Kuhn & Lavielle, CSDA 2005; Savic et al., AAPS 2011)
Metropolis-Hastings algorithm
Gauss-Newton/gradient method for optimisation
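To give an idea of how a Gauss-Legendre rule with 8 nodes can approximate a multivariate normal cdf, here is a minimal Python sketch for the bivariate case only. It is one possible scheme, not necessarily the one used in the authors' C++ program: it relies on the identity P(Z1 ≤ a, Z2 ≤ b) = ∫ φ(z) Φ((b − ρz)/√(1 − ρ²)) dz over z ≤ a, after the change of variable u = Φ(z). The function names are illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess for the i-th root
        for _ in range(100):  # Newton iterations on the Legendre polynomial P_n
            p_prev, p = 1.0, x
            for k in range(2, n + 1):
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)  # derivative of P_n at x
            step = p / dp
            x -= step
            if abs(step) < 1e-14:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def bivariate_normal_cdf(a, b, rho, n_nodes=8):
    """Approximate P(Z1 <= a, Z2 <= b), standard normals with correlation |rho| < 1."""
    upper = STD_NORMAL.cdf(a)
    scale = math.sqrt(1.0 - rho * rho)
    nodes, weights = gauss_legendre(n_nodes)
    total = 0.0
    for x, w in zip(nodes, weights):
        u = 0.5 * upper * (x + 1.0)      # map the node from [-1, 1] to [0, Phi(a)]
        z1 = STD_NORMAL.inv_cdf(u)       # back to the scale of Z1
        total += w * STD_NORMAL.cdf((b - rho * z1) / scale)
    return 0.5 * upper * total


approx = bivariate_normal_cdf(0.0, 0.0, 0.8)
exact = 0.25 + math.asin(0.8) / (2.0 * math.pi)  # closed form available at a = b = 0
print(approx, exact)
```

For K scores the same idea has to be nested over several dimensions, which is part of what makes the estimation computationally demanding.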
Method evaluation with simulation studies
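Bivariate analysis
$$Y^*_{1ij} = \mathrm{Slope} \times C_{ij} + \eta_{1i} + \varepsilon_{1ij}$$
$$Y^*_{2ij} = \frac{E_{max} \times C_{ij}}{EC_{50} + C_{ij}} + \eta_{2i} + \varepsilon_{2ij}$$
100 subjects, 5 obs./subject at 1, 3, 6, 12 and 24 h
2 scores Y1 and Y2 (3 and 4 categories resp.)
$$\mathrm{corr}(\eta_{1i}, \eta_{2i}) = 0.8, \qquad \mathrm{corr}(\varepsilon_{1ij}, \varepsilon_{2ij}) = 0.8$$
[Figure; x-axis: Time (h)]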
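A data set with the structure of this bivariate design can be simulated as in the sketch below. Only the design (100 subjects, 5 sampling times, 3 and 4 categories, correlations of 0.8) comes from the slides. The parameter values, the cut-off points, the unit variances and the concentration profile are all hypothetical and serve only to make the example runnable.

```python
import math
import random

TIMES = [1, 3, 6, 12, 24]          # sampling times (h), as in the design
RHO_ETA, RHO_EPS = 0.8, 0.8        # correlations, as in the design
# Everything below is hypothetical (not given in the slides).
SLOPE, EMAX, EC50 = 0.5, 4.0, 2.0
CUTOFFS_Y1 = [1.0, 2.5]            # 2 cut-offs -> 3 categories
CUTOFFS_Y2 = [0.5, 1.5, 2.5]       # 3 cut-offs -> 4 categories


def correlated_pair(rho, rng):
    """Two standard normal draws with correlation rho (2 x 2 Cholesky factor)."""
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    return z1, rho * z1 + math.sqrt(1 - rho ** 2) * z2


def categorise(latent, cutoffs):
    """Category 1..len(cutoffs)+1, i.e. one plus the number of cut-offs exceeded."""
    return 1 + sum(latent > a for a in cutoffs)


def simulate(n_subjects=100, seed=1):
    rng = random.Random(seed)
    rows = []
    for i in range(n_subjects):
        eta1, eta2 = correlated_pair(RHO_ETA, rng)      # inter-individual effects
        for t in TIMES:
            conc = 10.0 * math.exp(-0.1 * t)            # hypothetical concentration C_ij
            eps1, eps2 = correlated_pair(RHO_EPS, rng)  # intra-individual effects
            y1_star = SLOPE * conc + eta1 + eps1
            y2_star = EMAX * conc / (EC50 + conc) + eta2 + eps2
            rows.append((i, t, categorise(y1_star, CUTOFFS_Y1),
                         categorise(y2_star, CUTOFFS_Y2)))
    return rows


data = simulate()
print(len(data), data[:3])
```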
Trivariate analysis
200 subjects, 4 obs./subject, i.e. one per dose: 2.5, 5, 10, 20 mg
3 scores Y1, Y2 and Y3 (3 categories each)
$$Y^*_{kij} = \mathrm{Slope}_k \times \mathrm{Dose}_j + \eta_{ki} + \varepsilon_{kij}, \qquad k = 1 \ldots 3, \; j = 1 \ldots 4$$
Corr(η) and Corr(ε): correlations of 0.85 and 0.90 between scores 1 and 2; other correlations 0 and −0.10
Model estimation
Multivariate analysis: Stochastic EM
Univariate analyses: Stochastic EM; NONMEM 6 (Laplace)
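In a stochastic EM algorithm, the random effects are simulated given the observed scores, typically with a Metropolis-Hastings step. The sketch below shows such a step in the simplest possible setting: one score, one random effect η, a latent variable Y* = η + ε with ε ~ N(0, 1). It is an illustration of the principle under these assumptions, not the authors' implementation; the cut-offs, the proposal step and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf


def category_probability(y, eta, cutoffs):
    """P(Y = y | eta) for categories 1..len(cutoffs)+1 and Y* = eta + eps, eps ~ N(0, 1)."""
    upper = 1.0 if y > len(cutoffs) else PHI(cutoffs[y - 1] - eta)
    lower = 0.0 if y == 1 else PHI(cutoffs[y - 2] - eta)
    return upper - lower


def log_posterior(eta, scores, cutoffs, omega):
    """log p(eta | scores) up to an additive constant, with prior eta ~ N(0, omega^2)."""
    log_prior = -0.5 * (eta / omega) ** 2
    log_lik = sum(math.log(max(category_probability(y, eta, cutoffs), 1e-300))
                  for y in scores)
    return log_prior + log_lik


def sample_eta(scores, cutoffs, omega=1.0, n_iter=5000, step=0.8, seed=0):
    """Random-walk Metropolis-Hastings chain for the random effect of one subject."""
    rng = random.Random(seed)
    eta = 0.0
    current = log_posterior(eta, scores, cutoffs, omega)
    samples = []
    for _ in range(n_iter):
        proposal = eta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, scores, cutoffs, omega)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            eta, current = proposal, candidate
        samples.append(eta)
    return samples


samples = sample_eta([3, 3, 2, 3, 3], cutoffs=[-0.5, 0.5])
print(sum(samples) / len(samples))
```

In the multivariate setting the same step would be applied to the vector of random effects of each subject, with the joint category probabilities in place of `category_probability`.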
Results:
Bivariate analysis: parameter estimates
[Figure: departure from true values (true value = 100%) + SE for a11, a12, a21, a22, a23, Slope (Y1), Emax (Y2), EC50 (Y2), omega1, omega2, corr etas and corr epsilons; Multivariate (Stoch. EM) vs Univariate (Stoch. EM) vs Univariate (NONMEM 6)]
Figure annotations: same marginal distribution; correlations
Bivariate analysis: VPC
Joint probability P(Y1 ≤ 2; Y2 ≤ 1, 2 or 3)
[Figure: panels for univariate analyses assuming independence and for the bivariate analysis; 95% CI for model predictions, median, observations]
Results:
Trivariate analysis: estimation of marginal distribution
[Figure: departure from true values (true value = 100%) + SE for a11, a12, a21, a22, a31, a32, Slope (Y1), Slope (Y2), Slope (Y3), omega1, omega2 and omega3; Multivariate (Stoch. EM) vs Univariate (Stoch. EM) vs Univariate (NONMEM 6)]
Trivariate analysis: estimation of correlations
[Figure: true values vs Multivariate (Stoch. EM) estimates of the correlations for η (IIV) and ε (IOV)]
The multivariate analysis captures the high correlations between scores 1 and 2
Trivariate analysis: VPC
Joint distribution P(Y1 = 3; Y2 = 3)
[Figure: 95% CI for model predictions, median, observations]
Trivariate analysis: VPC
Joint distribution P(Y1 = 3; Y2 = 1)
[Figure: 95% CI for model predictions, median, observations]
Objectives
Generalise this previous model (Todem et al.)
To apply to population PK/PD data
To analyse more than two scores in practice (model estimation issue)
Identify similarities between scores
To identify scores that document the same physio-pathological process and possible redundancies
Ex: trivariate analysis
Principal Component Analysis (PCA)
[Figure: PCA of the latent variables Y1*, Y2* and Y3*]
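How a PCA can reveal redundant scores may be illustrated on a correlation matrix of three latent variables. The matrix below is hypothetical (two strongly correlated variables and a third nearly uncorrelated one), and the power iteration is just a dependency-free way to obtain the first principal component in this sketch.

```python
import math

# Hypothetical correlation matrix of three latent variables (illustrative values).
CORR = [[1.0, 0.9, 0.0],
        [0.9, 1.0, -0.1],
        [0.0, -0.1, 1.0]]


def first_principal_component(matrix, n_iter=200):
    """Leading eigenvalue and unit eigenvector of a symmetric positive definite matrix."""
    size = len(matrix)
    vector = [1.0] * size
    eigenvalue = 0.0
    for _ in range(n_iter):
        # One power-iteration step: multiply by the matrix, then renormalise.
        product = [sum(matrix[r][c] * vector[c] for c in range(size)) for r in range(size)]
        eigenvalue = math.sqrt(sum(v * v for v in product))
        vector = [v / eigenvalue for v in product]
    return eigenvalue, vector


eigenvalue, loadings = first_principal_component(CORR)
explained = eigenvalue / len(CORR)  # share of total variance (trace = number of variables)
print(round(eigenvalue, 3), [round(v, 3) for v in loadings], round(explained, 3))
```

With these illustrative values the first component is dominated by the first two variables, which is the kind of pattern that would point to two scores documenting the same process.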
Conclusion
Univariate analyses
Pros
- Rapid
- Easy to understand and interpret
Cons
- Assess marginal distributions only
- Can lead to some bias and wrong conclusions
Multivariate analysis
Pros
- Avoids bias and wrong conclusions in clinical trials
- Identification of redundancies between scores (PCA)
Cons
- Computation time (bivar. = 3 h; trivar. = 18 h)
- Homemade program