HAL Id: hal-00894084
https://hal.archives-ouvertes.fr/hal-00894084
Submitted on 1 Jan 1995
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Genetic variation of traits measured in several
environments. II. Inference on between-environment
homogeneity of intra-class correlations
C Robert, Jl Foulley, V Ducrocq
To cite this version:
Original
article
Genetic
variation
of traits measured
in
several
environments.
II.
Inference
on
between-environment
homogeneity
of
intra-class
correlations
C Robert
JL
Foulley
V
Ducrocq
Institut national de la recherche agronomique, station de
g6n6tique quantitative
et
appliquee,
centre de recherche deJouy-en-Josas,
78352Jouy-en-Josas cedex,
Rance(Received
28April 1994; accepted
26September
1994)
Summary - This paper describes a further contribution to the
problem
oftesting
homo-geneity
of intra-class correlations among environments in the case of univariate linearmodels,
withoutmaking
anyassumption
about thegenetic
correlation betweenenviron-ments. An iterative
generalized expectation-maximization
(EM)
algorithm,
as described inFoulley
andQuaas
(1994),
ispresented
forcomputing
restricted maximum likelihood(REML)
estimates of the residual andbetween-family
components of variance andco-variance. Three different
parameterizations
(cartesian,
polar
andspherical
coordinates)
are
proposed
to compute EM-REML estimators under the reduced(constant
intra-class correlation betweenenvironments)
model. Thisprocedure
is illustrated with theanalysis
of simulated data.heteroskedasticity
/
parameterization/
intra-class correlation/
expectation-maximization
/
restricted maximum likelihoodRésumé - Variation génétique de caractères mesurés dans plusieurs milieux. II. Infé-rence relative à des corrélations intra-classe constantes entre milieux. Cet article décrit
une
approche
permettant d’estimer les composantes de variance-covariance entre milieux dans le cas de corrélation intra-classehomogènes
entremilieux, sans faire d’hypothèse
surles corrélations
génétiques
entre milieux pris 2 à 2. Unalgorithme itératif
d’espérance-maximisation(EM),
comparable
à celui décrit parFoulley
etQuaas
(1994),
estproposé
pour calculer les estimations du maximum de vraisemblance restreinte(REML)
des com-posantes résiduelleset familiales
de variance covariance. Troisparamétrisations différentes
(coordonnées
cartésiennes, polaires
etsphériques)
sontproposées
pour calculer les esti-mateurs EM-REML sous le modèle réduit(les
corrélations intra-classe sontsupposées
touteségales
à une mêmeconstante).
Cetteprocédure
est illustrée parl’analyse
de données simulées.hétéroscédasticité
/
paramétrisation/
corrélation intra-classe/
INTRODUCTION
Statistical
procedures
based on thetheory
of thegeneralized
likelihoodratio,
previously proposed by Foulley
et al(1994),
Shaw(1991)
and Visscher(1992),
have beenapplied
to test thehomogeneity
ofgenetic
andphenotypic
parameters
against
Falconer’s(1952)
saturated model. Inparticular,
Robert et al(1995)
have described aprocedure
forestimating
components
of variance and covariancebetween environments and for
testing
thehomogeneity
of thefollowing
parameters:
(a)
a constantgenetic
correlation betweenenvironments;
and(b)
constantgenetic
and intra-class correlations between environments.
The
objective
of this article is topresent
aprocedure
fordealing
withhomo-geneous intra-class correlations among environments without
making
anyas-sumption
about thegenetic
correlations between environments. The method is based on restricted maximum likelihood estimators(REML)
and on ageneral-ized
expectation-maximization
(EM)
algorithms
asproposed
initially by Foulley
and
Quaas
(1994)
for heteroskedastic univariate linear models. Threeparameteri-zations of variance-covariance
components
aresuggested
forsolving
thisproblem.
A simulated
example
ispresented
to illustrate thisprocedure.
THEORY
A model often used to deal with
genotypic
variation in different environments is the2-way
crossedgenotype
(random)
x environment(fixed)
linear model with interaction. Inparticular,
this model has beenproposed
as an alternative to amultiple-trait
approach
when variance and covariancecomponents
arehomogeneous
and
genetic
correlations between environments arepositive
(Foulley
andHenderson,
1989).
It has also beenemployed by
Visscher(1992)
tostudy
the power of likelihood ratio tests forheterogeneity
of intra-class correlations between environments whengenetic
correlations among them are assumedequal
tounity.
The aim of this paperis to go one
step
further inaddressing
the sameproblem
with the same model but with aheterogeneous
structure of variance-covariancecomponents.
The full model
Let us assume that records are
generated
from a cross-classifiedlayout.
The model is defined as follows:where It
is themean, h
i
is the fixed effect of the ith environment:a Si sj is
the randomfamily j
contribution such thats! !
NID(0,1)
andQ
s
v
is thefamily
variance for records in the ith
environment;
0’!;!!,
is the randomfamily
xModel
[1]
can be written moregenerally
using
matrix notation as:where Yiis a
(n
2
x1)
vector of observations in environmenti;
13
is a(p
x1)
vector of fixed effects with incidence matrixX
i
;
ui
=(s) )
and
u2
={h,s !
} are
2independent
random normal
components
of the model with incidence matrices for standardized effectsZit
andZ
2i
respectively;
cr! !
andQ
u
2
,.
are thecorresponding
components
of
variance, pertaining
to stratum i and ei isthe
vector of residuals for stratum i assumedN( 0 , a
f
l, In, ) .
The reduced model
The null
hypothesis
(H
o
)
consists ofassuming
homogeneous
intra-class correlations between environments(ie, d i, ti
=(a;i +a!8i) / (!9!+!hsi+!e!)
= t). The
variance-covariance structure of the residual is assumed to be
diagonal
and heteroskedastic. Under model[I],
thishypothesis
is tantamount toassuming
a constant ratio of variances between environments: Vi,
afl
/
(as.
+a!8i)
=8
2
,
where 8 is a constant. Under thishypothesis,
3 differentparameterizations
will be considered to solve thisproblem.
Cartesian coordinates
where 6 is a
positive
real number. Polar coordinateswhere pi and 6 are
positive
real numbers.Spherical
coordinateswhere
!2
is apositive
real number. Under thisparameterization
6’
=tan’
a.An EM-REML
algorithm
A
generalized expectation-maximization
(EM)
algorithm
tocompute
REMLwhere y is the set of estimable
parameters
for each of the 3 models(under
eachparameterization
considered).
Ei
l
[.]
represents
the conditionalexpectation
taken withrespect
to the distribution of fixed and random effectsgiven
the data vector andy =
y[
t
].
Ei
l
(.!
can beexpressed
as a function of bilinear forms and a trace ofparts
of the inverse coefficient matrix of the mixed-modelequations
(as
described inFoulley
and
Quaas,
1994).
So,
for eachparameterization,
we derive function[3]
withrespect
to each
parameter
of y
and we solve theresulting
system
8Q(Yly[t])
/
9
y
= 0. Aftersome
algebra
andusing
the method of’cyclic
ascent’(Zangwill, 1969),
we obtainthe 3
following algorithms.
For model
[2]
andusing
cartesiancoordinates,
thealgorithm
at iteration[t,
I+1]
can be summarized as follows. Let
8
2ft
,
l]
,
0
,[t,l]
andQ!t2!!.
be the values at iteration[t, 1].
The next iterates are obtained as:0
![tlc+i1
is theonly
positive
root of thefollowing
cubicequation:
with
with
For model
[2]
andpolar coordinates,
thealgorithm
at iteration!t,
I +
1]
can besummarized as follows. Let
8
2
[
t
,
1
],
ll
p
,
ft
and0&dquo; !
be the values at iteration[t, I].
The next iterates are obtained as:v
.
p!t,l+11
is theonly positive
root of thefollowing quadratic
equation:
with:
.
0i’!!U
is the solution of theequation
7-!!
=tan(!’!!/2)
whereZ
f
t
,
t+11
is
theonly
positive
root of thequartic equation:
For model
[2]
andspherical
coordinates,
thealgorithm
at iteration[t,
l +1]
canbe summarized as follows. Let
1/1l
t
,
l]
,
o
pi
’
andal!,4
the values at iteration[t, l!.
Thenext iterates are obtained as:
9
1/
1l
t
,
l+1]
is theonly
positive
root of thefollowing
quadratic
equation:
with:
with:
.
a!t,!+1!
is the solution of theequation
,!!t’t+1!
=tan!(a!-’+!/2)
wherexi’!!U
is theonly positive
root of the cubicequation:
The convergence of the EM-REML
procedure
is measured as the norm of thevector of
changes
in variance-covariancecomponents
between iterations. In oursimulation and for the 3
parameterizations,
convergence is assumed when the normis less than
10-
6
.
Inpractice,
the number of inner iterations is reduced toonly
one in the method of
’cyclic
ascent’. Thealgebraic
solution ofquadratic,
cubic orquartic equations, using
the discriminantmethod,
demonstrates that each timeonly
one root is
possible
in theparameter
space. In the simulatedexample,
thepolar
parameterization
converged
the fastest.Testing
procedure
Let
L(y;
y)
be thelog-restricted
likelihood,
F be thecomplete
parameter
space andr
o
a subset of itpertaining
to the nullhypothesis H
o
.
H
o
isrejected
at the level a if the statistic((y) =
2Max
r
L(y;
y) - 2Maxr
o
L(y;
y)
exceeds(o
where(
0
corresponds
toPr[X2 r , > (
o]
= a(
X2
is
thechi-square
distribution with rdegrees
of freedom
given
by
difference between the number ofparameters
estimated under the full and the reducedmodels).
Formulae to evaluate-2MaxL(y; y)
caneasily
be made
explicit:
where B is the coefficient matrix of the mixed-model
equations.
NUMERICAL EXAMPLE
This
procedure
is illustrated from ahypothetical
data setcorresponding
to abalanced,
crosseddesign
with 3environments,
20 families per environment and50
replicates
perfamily
(p
=3,
s = 20 and n =50).
The 20 families wererandomized within each environment. Basic ANOVA statistics for the
between-family
andwithin-family
sums of squares andcross-products
aregiven
in table I.Table II
presents
the estimation ofgenetic
and residualparameters
under the full and reduced(hypothesis
of a constant intra-class correlation betweenenvironments)
modelsrespectively,
and the likelihood ratio test of the reduced modelagainst
the full model. The P values in table II indicate that there are nosignificant
differences*
1,2,3
3 = the 3 environments.8
Sums of
cross-products
betweenfamilies: n !(y2 j. -
!/t..)(yt’?. !
Yi
’
..)
8 n j=1 8 n
Sums of squares within
families: L L(Yijk -
Yijf
2j=1 k=1
DISCUSSION AND CONCLUSION
In this paper, estimation and
testing
ofhomogeneity
of intra-class correlations among environments have been studied with heteroskedastic univariate linear models. Anotherpossible approach
to account for’genotype
x environment’ effectswould be to consider the
multiple-trait
linearapproach,
definedby
Falconer(1952).
As describedhereafter,
these 2approaches
may or may not beequivalent.
In thisdiscussion,
the conditionsrequired
to haveequivalence
between themultiple-trait
and the univariate linear models will be established.In Falconer’s
approach,
expressions
of the trait in different environments(i, i’)
are those of 2
genetically
correlatedtraits,
with a coefficient of correlationd(i,
i’),
Pii
’ =
!s!!,
/
aBaB.,.
The model is defined as follows:where lJ2!k
is
theperformance
of the kth individual(k
=1, 2, ... ,
n)
of thejth family
(j
=1,2,...,
s)
evaluated in the ith environment(i = 1, 2, ... , p);
b
ij
is the random effect of thejth family
in the ithenvironment,
assumednormally
distributed such thatVar(b
ij
)
=a
1
i,
Cov(b
ij
,
bi!!)
= aBiil for i
7! i’
andCov(bi!, bi.!!)
= 0for j #
j’
and any i andi’;
ljk is a residual effectpertaining
to the kth individual in the subclassij,
assumednormally
andindependently
distributed with mean zero andvariance
o,2 wi
Under the
hypothesis
ofhomogeneity
of intra-class correlations betweena
Likelihood ratio test;
b
degrees
of freedom =2;
* same EM-REML estimates under themultiple
traitapproach.
number of
parameters.
Model[1]
has[2p
+1]
genetic
and residualparameters
and model[4]
has[(p(p
+1)/2)
+1]
parameters.
For p =
3,
whatever thehypotheses considered,
eventhough
these 2 models havethe same number of estimable
parameters,
theparameter
spaces are notexactly
the same. Two conditions must be added to
satisfy
theequivalence
between themultiple-trait
and the univariate linear models. The univariate linear model doesnot allow the estimation of a
negative genetic
correlation betweenenvironments,
since it is a ratio of variances.Thus,
we have thefollowing
condition:Then we have:
and
By definition,
or
2
Si
anda!8i
arepositive
parameters,
so thefollowing
relation mustbe satisfied:
&dquo; &dquo;
It is worth
noticing
that the condition in[6]
means that thepartial
genetic
correlation between any
pair
( j, k)
of environments for environments i fixed is alsopositive.
The
problem
oftesting
homogeneity
of intra-class correlations betweenenviron-ments was
finally
solved under 3 differentassumptions
about thegenetic
correla-tions between environments:
equal
to one(Visscher, 1992);
constant andpositive
(Robert
etal,
1995);
andjust positive
(this work).
For more than 3
traits,
model[1]
is nolonger equivalent
to themultiple
traitapproach
of Falconer. As a matter offact,
itgenerates
fewerparameters
than
!4!,
2p
vsp(p
+1)!2
for[1]
and[4]
respectively.
This
parsimony
might
be aninteresting feature,
because the difference in numbers ofparameters
increases with the number of traits considered(eg,
10 vs15
parameters
for 5traits).
Comparison
ofapproaches
on realgenetic
evaluationproblems
such as sire evaluation ofdairy
cattle in several countries would be ofgreat
interest.REFERENCES
Falconer DS
(1952)
Theproblem
of environment and selection. Am Nat86,
293-298Foulley JL,
Henderson CR(1989)
Asimple
model to deal with sireby
treatmentinteractions when sires are related. J
Dairy
Sci 72, 167-172Foulley JL, Quaas
RL(1994)
Statisticalanalysis
ofheterogeneous
variances in Gaussian linear mixed models. Proc 5th WorldCongress
GenetAppl
LivestProd,
UnivGuelph,
Guelph, ON, Canada,
18, 341-348Foulley JL,
HébertD, Quaas
RL(1994)
Inference onhomogeneity
ofbetween-family
components of variance and covariance among environments in balanced cross-classified
designs.
Genet SelEvol 26,
117-136Lawley DN,
Maxwell AE(1963)
FactorAnalysis
as a Statistical Method. Butterworths MathematicalTexts, London,
UKRobert
C, Foulley JL, Ducrocq
V(1995)
Genetic variation of traits measured in several environments. I. Estimation andtesting
ofhomogeneous
and intra-class correlations between environments. Genet SelEvol 27,
111-123Shaw RG
(1991)
Thecomparison
ofquantitative genetic
parameters betweenpopulations.
Evolution45,
143-151Visscher PM
(1992)
On the power of likelihood ratio tests fordetecting heterogeneity
of intra-class correlations and variances in balanced half-sib