
the bootstrap samples and replications of the variance components (see the simulations of Field and Welsh 2007 and Field et al. 2010). On the other hand, the same authors show that TB and CB properly reproduce the structure of the higher moments of the sums of squares under a fixed cluster size regime, i.e. when $n \to \infty$ for fixed $n_i$ [Field and Welsh, 2007, Corollary 3], regardless of the data-generating mechanism. In later work, they show that this result carries over to the asymptotics of an estimator based on estimating equations, provided that some regularity conditions are fulfilled [Field et al., 2008, Theorem 2].

With regard to the properties of the GCB, Field et al. [2010, Theorem 2] show that the assumptions on the weights needed for the validity of the bootstrap can be reduced to conditions on their expectation and variance. Namely, when $E[w_i] = \mathrm{Var}[w_i] = 1$, the bootstrap distributions provide asymptotically proper approximations to the sampling distribution under a fixed cluster regime, conditions that are mostly inherited from those of the estimating equations [Field et al., 2008]. As for the choice of the weights, simulation studies in Samanta and Welsh [2012] and Ding and Welsh [2017] show good approximations with exponentially distributed weights $\mathcal{E}(1)$, whose expectation and variance are both equal to 1.

1.5 A Random Weighted Bootstrap for GLMM

Over time, the idea of bootstrapping via random weighting has been explored in various contexts. We refer the interested reader to the works of Freedman and Peters [1984] and Rao and Zhao [1992] for illustrations in linear regression, Rubin [1981] in a Bayesian context and Newton and Raftery [1994] in likelihood-based inference. To the best of our knowledge, however, the extension of this bootstrapping approach to GLMM, or to Latent Variable Models for that matter, has not yet been undertaken. We are inclined to believe that this is due to the need to approximate the integrals defining the likelihood, and to the use of approximate techniques that make it hard to study the theoretical properties of this framework. Moreover, we conjecture that the lack of explicit estimating equations makes the direct extension of the approaches of Chatterjee and Lahiri [2011] and Field et al. [2010] all the more challenging.

To overcome these drawbacks, our strategy consists in inserting the random weights at the level of the exponent of the joint distribution (1.10), thereby proposing the following procedure:

RWLB Algorithm

1. Generate random weights $w_i$ such that $E[w_i] = \mathrm{Var}[w_i] = 1$.

2. Insert the weights in the contributions to the joint likelihood, as in:

$$\ell_i^*(\theta, u_i) = w_i\,\ell_i(\theta, u_i), \qquad (1.23)$$

3. Use the Laplace approximation on the resulting weighted likelihood, yielding a Laplace-approximated Weighted log-Likelihood (LAWLL).

4. Optimize the LAWLL to obtain RWLB replicates (a schematic implementation is sketched below).
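To fix ideas, here is a minimal sketch of this loop in R. The objective builder `make_lawll()` is a hypothetical placeholder for any routine that returns the LAWLL of steps 2 and 3 for a given weight vector (for instance, a TMB objective as in section 1.5.3); $\mathcal{E}(1)$ weights satisfy $E[w_i] = \mathrm{Var}[w_i] = 1$.

```r
## A minimal sketch of the RWLB loop. `make_lawll()` is hypothetical:
## it stands in for any routine returning the Laplace-approximated
## weighted log-likelihood for a given weight vector.
rwlb <- function(data, n_clusters, B = 1000, make_lawll, theta_init) {
  replicates <- matrix(NA_real_, nrow = B, ncol = length(theta_init))
  for (b in seq_len(B)) {
    # Step 1: Exp(1) weights have E[w_i] = Var[w_i] = 1
    w <- rexp(n_clusters)
    # Steps 2-3: build the weighted Laplace-approximated objective
    lawll <- make_lawll(data, weights = w)
    # Step 4: optimize to obtain one RWLB replicate
    fit <- nlminb(theta_init, objective = lawll)
    replicates[b, ] <- fit$par
  }
  replicates
}
```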

In order to better understand the consequences of this adaptation, we start by providing the expressions of the derivatives of $\ell_i^*$, since all the features of the LAWLL depend on these quantities. A first conclusion can be drawn by looking at Equation (1.25), namely that the modes $\tilde{u}_i$ are not affected by the random weights in the first step of the procedure. One can see that, when (1.25) is equated to zero to retrieve the mode, the weight becomes irrelevant, thus yielding the same value as a non-weighted contribution. That said, a perturbation does occur once the optimization in the second step of the algorithm reaches convergence, because of the dependence of the mode on the model parameters, yielding a sort of bootstrap replicate of the mode, $\tilde{u}_i^* = \tilde{u}_i(y_i, \hat{\theta}^*)$, as a by-product.
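The invariance of the modes can be verified in one line: since the weighted contribution merely rescales the unweighted one, and the weights are strictly positive (e.g. $\mathcal{E}(1)$ draws), the stationary points in $u_i$ coincide:

$$\frac{\partial \ell_i^*}{\partial u_i}(\theta, u_i) = w_i\,\frac{\partial \ell_i}{\partial u_i}(\theta, u_i) = 0 \quad\Longleftrightarrow\quad \frac{\partial \ell_i}{\partial u_i}(\theta, u_i) = 0.$$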

After noticing that the modes are not affected, we can deduce a second consequence from Equation (1.26). As we have seen previously, $\tilde{\ell}_i$ determines the expression of the matrix $V_i(\theta)$, which enters the LALL via the logarithm of its determinant, as in Equation (1.16). It transpires that weighting $\ell_i$ does not simply entail a weighted version of the Laplace-approximated log-likelihood (1.16), but has a different, and arguably more relevant, effect in some contexts. A final consequence is the effect on the approximation error which, as we have seen in Equation (1.13), is dictated by the higher-order derivatives of $\ell_i$ through the terms $T_{ik}$. Equation (1.27) shows that these terms are replaced by weighted versions $T_{ik}^* = w_i T_{ik}$, and hence $\varepsilon_i^*(\theta) := \log E[\exp(w_i R_i)]$, which can then be successively expanded into infinite series for the exponential and the logarithm to gain some more insight. Given that the evaluation of such expanded terms can become cumbersome, we address this discussion with an example in a random intercept model in section 1.5.2.
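Although we do not pursue this double expansion in full generality, a convenient way to organize it, leaving convergence aside as we do in section 1.5.2, is through the cumulant generating function of $R_i$: conditionally on the weights, $\varepsilon_i^*$ is this function evaluated at $w_i$,

$$\varepsilon_i^*(\theta) = \log E\left[\exp(w_i R_i)\right] = \sum_{m \geq 1} \kappa_m(R_i)\,\frac{w_i^m}{m!},$$

where $\kappa_m(R_i)$ denotes the $m$-th cumulant of $R_i$; the non-weighted error corresponds to $w_i = 1$.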

1.5.1 Relation with the GCB for a Random Effect LMM

In order to study the relationship of our scheme with the GCB for LMM of Pang and Welsh [2014], we shall concentrate on a Random Effect Model of Equation (1.6) and contrast the scores of our method with the expressions found in Samanta and Welsh [2012]. For such a model, the contributions $\ell_i^*$ can be written concisely using matrix notation, as in:

$$\ell_i^*(\theta) = -\frac{w_i}{2}\,(y_i - X_i\beta)^T \Sigma_i^{-1}(y_i - X_i\beta) - \frac{1}{2}\log|\Sigma_i(\theta)| + \text{const}.$$

It is known that the LALL of a LMM is exact, a fact that transpires from the non-weighted versions of Equations (1.25) through (1.27): with $b(\eta_{ij}) = \eta_{ij}^2/2$, we have $b^{(k)}(\eta_{ij}) = 0$ for $k > 2$ and hence $\varepsilon_i(\theta) = 0$. Owing to this, the only derivatives needed to provide a characterization of the LALL are those of first and second order, and the same can be said about the LAWLL. Hence, the derivatives (1.25) and (1.26) for a LMM yield the following individual contributions to the scores:

$$\tilde{\Psi}_{\beta_i}^*(\theta) = w_i\, X_i^T \Sigma_i^{-1} (y_i - X_i\beta), \qquad (1.35)$$

$$\tilde{\Psi}_{\sigma_{ir}^2}^*(\theta) = \frac{w_i}{2}\, (y_i - X_i\beta)^T \Sigma_i^{-1} Z_{ir} Z_{ir}^T \Sigma_i^{-1} (y_i - X_i\beta) - \frac{1}{2}\operatorname{tr}\!\left(\Sigma_i^{-1} Z_{ir} Z_{ir}^T\right). \qquad (1.36)$$

As we can see, Equation (1.35) is equivalent to the expression of the score for $\beta$ in Equation (5) of Samanta and Welsh [2012]. On the other hand, Equation (1.36) is similar up to the weighting of the term with the trace, yielding, as predicted, a nuanced effect that differs from a plain weighting of the LALL. As we show in the technical appendix, this property has a mild effect on the approximations obtained with the bootstrap procedure, since it only impacts the estimating equations of the variance components, yielding an increase in the estimate of the variance-covariance of this set of parameters.

The extent of this effect, however, will also depend on the relationship between $\Sigma_i$ and the design matrices for the random effects $Z_i$.
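For concreteness, the following R sketch evaluates the two score contributions (1.35) and (1.36) for a single cluster; the function and argument names are ours, and the marginal covariance $\Sigma_i$ is supplied directly rather than rebuilt from the variance components.

```r
## Weighted score contributions (1.35)-(1.36) for one cluster.
## Inputs are cluster-level quantities; Sigma_i is the marginal
## covariance of y_i, supplied directly for simplicity.
score_contribs <- function(y_i, X_i, Z_ir, Sigma_i, w_i, beta) {
  Sinv <- solve(Sigma_i)
  r    <- y_i - X_i %*% beta                  # marginal residual
  M    <- Z_ir %*% t(Z_ir)                    # Z_ir Z_ir^T
  psi_beta  <- w_i * t(X_i) %*% Sinv %*% r    # Eq. (1.35): fully weighted
  psi_sigma <- (w_i / 2) * t(r) %*% Sinv %*% M %*% Sinv %*% r -  # weighted quadratic form
    (1 / 2) * sum(diag(Sinv %*% M))           # unweighted trace term
  list(beta = drop(psi_beta), sigma2_r = drop(psi_sigma))
}
```

The asymmetry between the two terms of `psi_sigma` is precisely the nuanced effect discussed above.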

1.5.2 Properties in GLMM with an Individual Random Effect

In what follows, we develop the expressions related to the Laplace-approximated likelihood for a GLMM with only one random effect, customarily associated with the effect of the observational unit. This simplification of the analysis is done to avoid the complicated expressions that compose the approximation error $\varepsilon_i(\theta)$, and follows the remarks formulated by Shun and McCullagh [1995], Raudenbush et al. [2000] and Ogden [2016].

In this context, $q = 1$, making $D_\sigma = \sigma$ and $u_i \sim \mathcal{N}(0, 1)$, hence the individual contributions to the LAWLL of Equation (1.15) become:

$$\log L_i^*(\theta) = \frac{1}{2}\log(2\pi) + \frac{1}{2}\log|V_i(\theta)| - n_i\tilde{\ell}_i^*(\theta) + \varepsilon_i^*(\theta), \qquad (1.37)$$

where $V_i(\theta) = [n_i\tilde{\ell}_i^{*(2)}(\theta)]^{-1} = w_i^{-1}[n_i\tilde{\ell}_i^{(2)}(\theta)]^{-1}$. Leaving aside for the moment any considerations on the convergence of the infinite series representations of the exponential and logarithm functions, we can provide the following representation for the approximation error of the LAWLL:

$$\varepsilon_i^*(\theta) = \log E\{\exp[w_i R_i]\},$$

where the expectation is taken with respect to the distribution $\mathcal{N}[\tilde{u}_i, V_i(\theta)]$. From the definition of $B$, the terms in the expectation are centered moments of this normal distribution, and therefore vanish when the exponent is odd. This simplifies the expression of $E[w_i R_i]$ by reducing the set of indexes to those having an even sum, implying that the sums are carried over the partitions of indexes such that their sum is an even number.

These computations are difficult to carry out, so we shall use an even more simplified example, found in Ogden [2016], to formulate a conjecture on these effects. Consider a logit GLMM with only one random effect, therefore with a likelihood contribution $L_i(\sigma^2) = \int_{\mathbb{R}} \exp[-n_i \ell_i(u_i, \sigma^2)]\, du_i$, yielding a LALL contribution whose approximation error involves sums over partitions $Q$ of $2l$ indices into $l$ blocks of size 2, such that $Q$ is complementary to $P$. From the definition of $n_i\ell_i$, it is clear that $n_i\ell_i^{(k)}(\sigma^2) = O_p(n_i)$ for all $k$, and hence:

$$h_P(\sigma^2) := [n_i\ell_i^{(|p_1|)}(\sigma^2)] \cdots [n_i\ell_i^{(|p_v|)}(\sigma^2)]\, [n_i\ell_i^{(2)}(\sigma^2)]^{-l} = O_p(n_i^{v-l}), \qquad (1.42)$$

with the following derivative with respect to $\sigma^2$:

$$\frac{\partial h_P}{\partial \sigma^2}(\sigma^2) = O_p(n_i^{v-l-1}), \qquad (1.43)$$

where the dominant terms arise when $(l, v) = (2, 1)$ or $(l, v) = (3, 1)$, implying that the contribution to the error in the score is given by $\frac{\partial \varepsilon_i}{\partial \sigma^2}(\sigma^2) = O_p(n_i^{-2})$, for a total approximation error in the score of:

$$\frac{\partial \varepsilon_n}{\partial \sigma^2}(\sigma^2) = O_p(n_i^{-2}\, n), \qquad (1.44)$$

which is shown to be of the same order as the uniform error in the score, see [Ogden, 2016, Sec. 3.1]. With these remarks, we conjecture that, conditioned on the data, the random weighted contributions $n_i w_i \ell_i$ would see their uniform approximation error (and consequently the rate of convergence to the asymptotic normality of the MLE) reduced to $O_p[(w_i n_i)^{-2} n]$; however, more work needs to be undertaken to formalize this conjecture.

1.5.3 Implementation

For our implementations we use the R environment [R Core Team, 2015], where many tools are available to draw inference on GLMM. In particular, we could use the packages nlme [Pinheiro and Bates, 2009] and lme4 [Bates et al., 2015]; however, these implementations are not suited for an extension to our proposed bootstrap scheme via the argument called weights, since they insert the random weights only on the first term of the contributions $n_i\ell_i$, see Equation (15) in Bates et al. [2015], which in our notation for the LMM case would yield:

$$\ell_i(\theta, u_i) \propto \|W_i^{1/2}[y_i - \eta_i]\|_2^2 + \phi\|u_i\|_2^2, \qquad (1.45)$$

for a weighting matrix $W_i$. In LMM, this procedure equates to a resampling, or perturbation via random weighting, of the conditional distribution of the responses given the random effects only. Moreover, simulation experiments that we do not report in this manuscript have shown that this method is far from providing an adequate perturbation as well.
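To make the limitation concrete, this is how random weights would enter through lme4's weights argument; the data frame and formula are hypothetical, and, as argued above, only the residual term of (1.45) is perturbed, which is why this route does not yield the RWLB.

```r
## Illustration only: cluster-level Exp(1) weights passed through
## lme4's `weights` argument perturb the residual term of (1.45)
## but leave the random-effects term untouched.
library(lme4)
w_cluster <- rexp(nlevels(df$cluster))           # one weight per cluster
df$w      <- w_cluster[as.integer(df$cluster)]   # expand to observation level
fit_b     <- lmer(y ~ x + (1 | cluster), data = df, weights = w)
```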

We have therefore implemented our bootstrap scheme with the help of a novel tool, the Template Model Builder TMB by Kristensen et al. [2016], a very flexible R package specifically designed for latent variable problems, in a manner similar to the AD Model Builder package (ADMB) of Fournier et al. [2012].

TMB works on the grounds of two components: user-specified templates and the Automatic Differentiation (AD) paradigm [Griewank and Walther, 2008]. Without getting into details, we can roughly summarize the workings of TMB as follows. By means of C++ templates, the user defines the joint distribution of responses and random effects which, after compilation and loading into the R environment, takes part in the construction of a list of elements that can serve as input for optimization functions such as optim() and nlminb(), containing the parameters, the objective function and the gradients obtained with AD. Moreover, by declaring the random effects via the argument random, the resulting list contains the Laplace-approximated likelihood as the objective function, with the gradient obtained using the AD approach.
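In our setting, this workflow reduces to a handful of calls. The following is a minimal sketch for one RWLB replicate, assuming a template file rwlb_glmm.cpp (a hypothetical name; the actual templates we use are given in Listings 1.1 and 1.2) that receives the random weights as part of its data:

```r
## One RWLB replicate via TMB; the template name and the
## data/parameter lists are illustrative placeholders.
library(TMB)
compile("rwlb_glmm.cpp")             # compile the C++ template
dyn.load(dynlib("rwlb_glmm"))        # load the shared object into R

obj <- MakeADFun(
  data       = list(y = y, X = X, w = rexp(n_clusters)),      # Exp(1) weights
  parameters = list(beta = beta0, log_sigma = 0, u = rep(0, n_clusters)),
  random     = "u",                  # triggers the Laplace approximation
  DLL        = "rwlb_glmm"
)
opt <- nlminb(obj$par, obj$fn, obj$gr)   # opt$par is one RWLB replicate
```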

We have thus taken advantage of the flexibility of TMB to implement our method in the simulation studies of section 1.6. For illustration, the templates for the likelihood of the LMM and Mixed Logit models that we have considered are to be found in the Appendix (see Listings 1.1 and 1.2).