The Gaussian process emulator can also be interpreted as an approximation of the computer model by kernel interpolation with radial basis functions, as in Schaback (1995, 2007). In this framework, a point-wise control on the approximation error is provided. Hence, we are able to guarantee a control on the distance between the maximum likelihood estimates in the approximate mixed meta-models and the maximum likelihood estimates obtained with the exact computer model. This bound decreases to zero as the design of numerical experiments becomes more space-filling. The paper is organized as follows. Section 2 introduces the standard non-linear mixed model and Section 3 recalls the principles and the main results of Gaussian process emulation. Section 4 introduces three mixed models approximated by a Gaussian process emulator. In Section 5, three versions of the SAEM algorithm coupled to a Gaussian process emulator are proposed. Theoretical results are given in Section 6. A simulation study illustrates these results (Section 7). Section 8 concludes the paper with some extensions. Proofs are gathered in the Appendix.
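As a rough illustration of kernel interpolation with a radial basis function, the sketch below interpolates a toy function on two one-dimensional designs of different density. The Gaussian kernel, the length-scale, the function `toy_model` and all helper names are assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np

def gaussian_kernel(a, b, length_scale=0.2):
    """Gaussian radial basis function kernel between two 1-D point sets."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def fit_interpolant(design, values, length_scale=0.2, jitter=1e-10):
    """Solve K w = values so the interpolant matches the model on the design."""
    gram = gaussian_kernel(design, design, length_scale)
    gram += jitter * np.eye(len(design))  # tiny jitter for numerical stability
    return np.linalg.solve(gram, values)

def predict(x_new, design, weights, length_scale=0.2):
    """Evaluate the kernel interpolant at new points."""
    return gaussian_kernel(x_new, design, length_scale) @ weights

def toy_model(x):
    """Hypothetical stand-in for an expensive computer model."""
    return np.sin(2.0 * np.pi * x)

def max_error(n_points):
    """Largest absolute interpolation error on a fine grid for an even design."""
    design = np.linspace(0.0, 1.0, n_points)
    weights = fit_interpolant(design, toy_model(design))
    grid = np.linspace(0.0, 1.0, 201)
    return np.max(np.abs(predict(grid, design, weights) - toy_model(grid)))

coarse_error = max_error(5)   # sparse design
fine_error = max_error(10)    # denser, more space-filling design
```

Comparing `coarse_error` with `fine_error` gives a feel for how the approximation error shrinks as the design fills the input space more densely.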

5. Conclusion
We presented two approaches to fit linear mixed models accounting for left-censoring of the response, and we showed with an example that they gave the same results. Thus, from a practical point of view, to fit mixed models for left-censored repeated measures, one can choose between NLMIXED and CENSAD. The main elements for choosing between the approaches are the structure of the data and the model used. CENSAD will be limited when numerous measures are censored, while too many random effects will limit NLMIXED because of the numerical integration. Another point is the potential extension of the estimation to more general models. CENSAD allows a Gaussian process, such as a first-order auto-regressive process or a Brownian motion, to be included in the error term. The extension to a bivariate model is direct with NLMIXED (see section 3.2) and possible using the other method [22]. Using NLMIXED, the main limitation is then the number of random effects. In our experience, the procedure was reliable with up to four random effects, leading, for example, to a bivariate model with two random intercepts and two random slopes.
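To make the role of the numerical integration concrete, the likelihood contribution of subject $i$ in a linear mixed model with a left-censored response can be written as follows. The notation is assumed for illustration and is not taken from the paper: $c$ is the censoring threshold, $\delta_{ij} = 1$ when measure $j$ is censored, $\mu_{ij} = x_{ij}^\top \beta + z_{ij}^\top b_i$, the errors are independent with variance $\sigma^2$, and $b_i \sim \mathcal{N}(0, G)$.

```latex
L_i(\beta, \sigma, G) = \int \prod_{j=1}^{n_i}
  \left[ \frac{1}{\sigma}\,\varphi\!\left(\frac{y_{ij} - \mu_{ij}}{\sigma}\right) \right]^{1 - \delta_{ij}}
  \left[ \Phi\!\left(\frac{c - \mu_{ij}}{\sigma}\right) \right]^{\delta_{ij}}
  f(b_i; G)\, \mathrm{d}b_i
```

Here $\varphi$ and $\Phi$ denote the standard normal density and distribution function. The integral is over the random effects, so its dimension grows with their number, which is consistent with the limitation reported for NLMIXED.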

Sarah Ola Moreira 1 , Karin Tesch Kuhlcamp 2 , Fabíola Lacerda de Souza Barros 2 ,
Moises Zucoloto 3 , Alyce Carla Rodrigues Moitinho 4
Abstract – Few cultivars of papaya from the Formosa group are available to producers, and the development of new genotypes is indispensable. Thus, the use of effective selection strategies to obtain more productive cultivars and better quality fruits is also necessary. The aim of this study was to select half-sib families (HSF) of papaya using the methodology of mixed models. Nineteen HSFs from Incaper’s papaya breeding program were evaluated in a randomized block design with five replicates and nine plants per plot. The selection was based on fruit mass (FM), pulp thickness (PT), soluble solids content in pulp (SS) and number of fruits (NF). The genetic parameters and genotypic values were estimated by the REML/BLUP procedure. The selected HSFs increased FM by 26.1%, PT by 10.5%, SS by 7.5% and NF by 13.0%. The additive heritability within the progenies and the individual variation coefficient obtained indicate that selection between and within the HSFs can provide greater genetic gains. The selection based on the REML/BLUP methodology was efficient in obtaining simultaneous genetic gains for all variables under study despite the negative correlation between them.

An interesting direction for further research would be to develop the statistical methodology for semi-Markov switching generalized linear mixed models. Since the hidden semi-Markov chain likelihood cannot be written as a simple product of matrices, the MCEM algorithm proposed by Altman (2007) for the MS-GLMM cannot be directly extended to the semi-Markovian case. In our MCEM-like algorithm proposed for MS-LMM and SMS-LMM, the difficulty lies mainly in the prediction of the random effects.

1. Introduction
In experimental sciences (agronomy, biology, experimental psychology, ...), analysis of variance (ANOVA) is often used to explain one continuous response with respect to different experimental conditions, assuming homoscedastic errors. In studies where individuals contribute more than one observation, such as longitudinal or repeated-measures studies, classical ANOVA is no longer appropriate since the assumption of data independence is not valid. The linear mixed model (Laird and Ware, 1982) then provides a better framework to take the correlation between these observations into account. By introducing random effects, mixed models make it possible to account for the variability of the response among the different individuals and the possible within-individual correlation. Published case studies using a mixed model approach (Baayen et al., 2008; Onyango,

Applications of linear mixed models (LMMs) to problems in genomics include phenotype prediction, correction for confounding in genome-wide association studies, estimation of narrow sense heritability, and testing sets of variants (e.g., rare variants) for association. In each of these applications, the LMM uses a genetic similarity matrix, which encodes the pairwise similarity between every two individuals in a cohort. Although ideally these similarities would be estimated using strictly variants relevant to the given phenotype, the identity of such variants is typically unknown. Consequently, relevant variants are excluded and irrelevant variants are included, both having deleterious effects. For each application of the LMM, we review known effects and describe new effects showing how variable selection can be used to mitigate them.
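The sketch below shows one common way to build a genetic similarity matrix from genotype data and how restricting the variant set changes it. The standardized cross-product formula, the simulated allele counts, and the choice of the first 50 variants as the "selected" subset are all assumptions for illustration; the excerpt does not specify how the matrix is computed or how variants are selected.

```python
import numpy as np

def genetic_similarity_matrix(genotypes):
    """Return an n x n similarity matrix from an n x m matrix of allele counts.

    Each variant (column) is centred and scaled to unit variance, and the
    matrix is the average cross-product over variants. Columns without any
    variation carry no information and are dropped.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    std = genotypes.std(axis=0)
    varying = std > 0
    kept = genotypes[:, varying]
    standardized = (kept - kept.mean(axis=0)) / std[varying]
    return standardized @ standardized.T / standardized.shape[1]

rng = np.random.default_rng(0)
snps = rng.integers(0, 3, size=(6, 200))  # 6 individuals, 200 simulated variants
selected = snps[:, :50]                   # hypothetical subset kept by variable selection

K_all = genetic_similarity_matrix(snps)
K_selected = genetic_similarity_matrix(selected)
```

With this scaling the diagonal of the matrix averages one, so `K_all` and `K_selected` are directly comparable even though they are built from different numbers of variants.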

and the regression function is the solution of an ODE. The estimation problem in the case where the ODE has no analytical solution has already been solved in [17].
However, most of the time, the studied biological process is not fully understood or is too complex to be modeled deterministically. So, to account for time-dependent or serially correlated residual errors and to handle real-life variations in model parameters occurring over time, mixed models described by stochastic differential equations (SDEs) have been introduced in the literature (see [32] or [43] for instance). These models are a natural extension of the models defined by ODEs, making it possible to account for errors associated with misspecifications and approximations in the dynamic system.

Abstract
Bivariate linear mixed models are useful when analyzing longitudinal data of two associated markers. In this paper, we present a bivariate linear mixed model including random effects or a first-order auto-regressive process and independent measurement error for both markers. Codes and tricks to fit these models using SAS Proc MIXED are provided. Limitations of this program are discussed and an example in the field of HIV infection is shown. Despite some limitations, SAS Proc MIXED is a useful tool that may be easily extended to multivariate responses in longitudinal studies.

2.1 Introduction
Nonlinear mixed effects models (NLMM) are increasingly used for the analysis of longitudinal data and repeated measurements in pharmacokinetics, growth and other studies. Compared with linear mixed models, the parameters of such models offer a better biological interpretation of the mechanisms involved, and the corresponding models are also more parsimonious. The main interest of this paper is to obtain good parameter estimates using maximum likelihood estimation in nonlinear mixed effects models. Several procedures have already been proposed to estimate the parameters of NLMM. The first ones were based on linearization of the log-likelihood, such as the first-order (FO) and first-order conditional expectation (FOCE) approximations (Sheiner and Beal, 1980; Lindstrom and Bates, 1990). Since errors can be large in the approximation of the observed log-likelihood (Davidian and Giltinan, 1995), methods based on exact maximum likelihood (ML) were proposed, such as Gaussian quadrature and Monte Carlo methods. However, integration via Gaussian quadrature can be difficult and inaccurate in high dimensions, so stochastic tools may be a powerful alternative. Wei and Tanner (1990) proposed the MCEM algorithm, in which the E-step of the EM algorithm is approximated using a large sample of simulated data, which makes it highly time-consuming. For instance, Booth and Hobert (1999) reported some results from a study on a real data set: they simulated around 60,000 samples for the final iteration. Delyon et al. (1999) proposed a method which promises convergence with fewer simulations: the SAEM algorithm. In this method, the E-step of the EM algorithm is replaced by a Simulation step and a Stochastic Approximation step.
When the conditional distribution of the missing effects given the observations is unknown, Kuhn and Lavielle (2004, 2005) combined the SAEM algorithm with an MCMC procedure, such as the Metropolis-Hastings algorithm, and called the result the SAEM-MCMC algorithm. In practice, the main problem of this method is to adequately calibrate its parameters to obtain good parameter estimates.
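For orientation, iteration $k$ of the SAEM algorithm is usually written as the three steps below. The notation is assumed here rather than taken from the paper: $y$ denotes the observations, $\phi$ the unobserved individual effects, $\theta$ the parameters, and $(\gamma_k)$ a decreasing sequence of positive step sizes.

```latex
\begin{align*}
\text{Simulation:} \quad & \phi^{(k)} \sim p(\,\cdot \mid y;\, \theta_{k-1}) \\
\text{Stochastic approximation:} \quad & Q_k(\theta) = Q_{k-1}(\theta)
   + \gamma_k \left( \log p(y, \phi^{(k)}; \theta) - Q_{k-1}(\theta) \right) \\
\text{Maximization:} \quad & \theta_k = \arg\max_{\theta} \, Q_k(\theta)
\end{align*}
```

In the SAEM-MCMC variant, the exact draw in the simulation step is replaced by a few iterations of a Markov chain, for example Metropolis-Hastings, whose stationary distribution is $p(\cdot \mid y; \theta_{k-1})$. The step sizes and the number of chain iterations are among the settings that need to be calibrated in practice.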


The estimation algorithms proposed in this paper can be directly transposed to other families of hidden Markov models, such as hidden Markov tree models; see Durand et al. (2005) and references therein. Another interesting direction for further research would be to develop the statistical methodology for semi-Markov switching generalized linear mixed models to take into account non-normally distributed response variables (for instance, number of growth units, apex death/life, non flowering/flowering character in the plant architecture context). Since the conditional expectation of the random effects given the state sequences cannot be analytically derived, the proposed MCEM-like algorithm for the semi-Markov switching linear mixed model cannot be transposed to the case of non-normally distributed observed data, and other conditional restoration steps, for instance based on a Metropolis-Hastings algorithm, have to be derived for the random effects.

Besides, regularisation methods have already been developed for GLMM, in which the random effects make it possible to model complex dependence structures. Eliot et al. [3] proposed to extend the classical ridge regression to Linear Mixed Models (LMM). The Expectation-Maximisation algorithm they suggest includes a new step to find the best shrinkage parameter, in the Generalised Cross-Validation (GCV) sense, at each iteration. More recently, Groll and Tutz [4] proposed an L1-penalised algorithm for fitting a high-dimensional

corresponding curves on Figure 1. Such a data set has been previously analyzed by Mignon-Grasteau et al. (1999), Jaffrézic et al. (2006) and Meza et al. (2007), who concluded that, among the standard growth models, the monotonic mixed Gompertz model is the most appropriate one. This model suits most subjects; however, it fails to model the

sequencing procedures for mixed model assembly lines in just-in-time production system[r]


Comparison of the estimated Gaussian hidden semi-Markov chain (GHSMC) parameters (i.e. where the influence of covariates and the inter-individual heterogeneity are not taken into account) with the estimated semi-Markov switching linear mixed model (SMS-LMM) parameters (state occupancy distributions and marginal observation distributions). The regression parameters, the cumulative rainfall effect and the variability decomposition are given

[Figure 3 about here.] [Figure 4 about here.]
6 Discussion
The main original element of this study is the development of the SAEM algorithm for two-level non-linear mixed effects models. We extend the SAEM algorithm developed by Kuhn and Lavielle (16), which was not yet adapted to the case of MNLMEMs with two levels of random effects. This algorithm will be implemented in version 3.1 of the monolix software, freely available on the following website: http://monolix.org. The two levels of random effects are the between-subject variance and the within-subject (or between-unit) variance, with N subjects and K units, with no restriction on N or K. We show that the SAEM algorithm is split into two parts: an explicit EM algorithm and a stochastic EM part. The integration of the term p(b|φ; θ) in the likelihood results in the derivation of two additional sufficient statistics compared to the original algorithm. Furthermore, it uses two intermediate quantities, the conditional expectation and variance of the between-subject random effects parameters b. The addition of higher levels of variability would therefore require other extensions of the algorithm.


Bootstrapping the raw random effects and residuals does not take into account variance underestimation, leading to shrinkage in the individual parameter estimates. To account for this issue, we employed the correction using the ratio between the estimated and empirical variance-covariance matrices for the random effects and the residuals. It was shown to be an appropriate method for linear mixed-effects models because it improves the estimation of variance components. These ratios account for the degree of two shrinkages, η-shrinkage and ϵ-shrinkage, which quantify the amount of information in the individual data about the parameters [27, 28, 29]. When the data are not informative, the random effects and residuals are shrunk toward 0, and a high degree of η-shrinkage and ϵ-shrinkage will be obtained. Sampling in the raw distribution will therefore underestimate the actual level of variability in the data, while correcting both empirical random effects and residuals for shrinkage restores this level. This idea of accounting for the difference between the estimated and empirical variance of residuals through an estimate of the shrinkage was proposed in bootstrapping ordinary linear models [21], and was extended to the two levels of variability found in mixed models by Wang et al. [23].
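A minimal sketch of such a correction for the random effects is given below; the residuals could be handled in the same way. It assumes that shrinkage is measured as one minus the ratio of empirical to estimated variance, and that the rescaling uses Cholesky factors of the two covariance matrices. The function names, the simulated data and the covariance `omega` are hypothetical, and the exact form of the correction used in the paper may differ.

```python
import numpy as np

def eta_shrinkage(empirical_effects, omega):
    """Shrinkage per random effect: 1 - var(empirical estimates) / estimated variance."""
    empirical_var = np.var(empirical_effects, axis=0)
    return 1.0 - empirical_var / np.diag(omega)

def correct_for_shrinkage(empirical_effects, omega):
    """Rescale centred empirical random effects so their covariance equals omega."""
    centred = empirical_effects - empirical_effects.mean(axis=0)
    empirical_cov = np.cov(centred, rowvar=False, bias=True)
    chol_emp = np.linalg.cholesky(empirical_cov)  # empirical_cov = chol_emp @ chol_emp.T
    chol_est = np.linalg.cholesky(omega)          # omega = chol_est @ chol_est.T
    # Each row eta_i is mapped to chol_est @ inv(chol_emp) @ eta_i.
    return np.linalg.solve(chol_emp, centred.T).T @ chol_est.T

rng = np.random.default_rng(1)
omega = np.array([[0.30, 0.05], [0.05, 0.20]])  # hypothetical estimated covariance
true_effects = rng.multivariate_normal(np.zeros(2), omega, size=200)
shrunk_effects = 0.6 * true_effects             # mimic estimates shrunk toward 0

shrinkage = eta_shrinkage(shrunk_effects, omega)
corrected = correct_for_shrinkage(shrunk_effects, omega)
```

Resampling from `corrected` rather than from `shrunk_effects` draws bootstrap random effects whose empirical covariance matches the estimated one, which is the level of variability the correction is meant to restore.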
