Mapping linked quantitative trait loci via residual maximum likelihood

(1)

HAL Id: hal-00894186

https://hal.archives-ouvertes.fr/hal-00894186

Submitted on 1 Jan 1997

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Mapping linked quantitative trait loci via residual

maximum likelihood

Fe Grignola, Q Zhang, I Hoeschele

To cite this version:

(2)

Original

article

Mapping

linked

_quantitative

trait loci via

residual

maximum

likelihood

FE

_{Grignola Q Zhang}

I

Hoeschele

Department of Dairy Science, Virginia Polytechnic

Institute and State

_University,

Blacksburg,

VA

_2l!061-OS15,

USA

(Received

2

_{September 1996; accepted}

19

_August

₁₉₉₇₎

Summary - A residual maximum likelihood method is

_presented

for estimation of the

positions

and variance contributions of two linked

_QTLs.

The method also

provides

tests for zero versus one

QTL

linked to a _groupof markers and for one versus two

_(aTLs

linked. A

_{deterministic,}

derivative-free

algorithm

is

_employed.

The variance-covariance matrix of the allelic effects at each

_QTL

and its inverse is

_computed

conditional on

_incomplete

information from

_multiple

linked markers. Covariances between effects at different

(aTLs

and between

_CaTLs

and

_polygenic

effects are assumed to be zero. A simulation

_study

was

_performed

to

_investigate

_parameterestimation and likelihood ratio tests. The

_design

was a

_{granddaughter design}

with 2 000 sons, 20 sires of sons and nine ancestors of sires. Data were simulated under a normal-effects and a biallelic model for variation at each

QTL. Genotypes

at five or nine

_{equally spaced}

markers were

_generated

for all sons and

their ancestors. Two linked

(aTLs

accounted

_jointly

for 50 or 25% of the additive

_genetic

variance,

and distance between

QTLs

varied from 20 to 40 cM. Power of

detecting

a

second

QTL

exceeded 0.5 all the time for the 50%

QTLs

and when the distance was

(at

least 30 cM for the 25%

QTLs.

An intersection-union test is

_preferred

over a likelihood ratio _test,which was found to be rather conservative. Parameters were estimated

quite

accurately

except for a

_slight

overestimation of the distance between two close

QTLs.

quantitative trait loci

_/

_multipoint_mapping

_/

residual maximum likelihood

_/

outcross

population

Résumé - Détection de _gènesliés à effets quantitatifs

(QTL)

grâce au maximum

de vraisemblance résiduelle. On

_présente

une méthode de maximum de vraisemblance résiduelle _{pour estimer}les

positions

et les contributions à la variabilité

_génétique

de deux

QTLs

liés. La méthode

_{fournit également}

des tests de l’existence d’un seul

_QTL

lié à un _groupede _marqueurs

(par

_rapportà

zéro)

ou de deux

_QTLs

(par

_rapportà un

seul).

Un

_algorithme

déterministe sans calcul de dérivées est utilisé. La matrice de variance-covariance des

_{effets alléliques}

à

_{chaque QTL}

et son inverse est calculée conditionnellement

à

l’information incomplète

sur les _marqueurs

_multiples

liés. Les covariances entre les

effets

aux

_{différents QTLs}

et entre les

effets

aux

_QTLs

et les

effets polygéniques

sont

*

(3)

supposées

nulles. Une étude de simulation a été

effectuée

pour

analyser

les

paramètres

estimés et les tests de _rapportsde vraisemblance. Le schéma

_{expérimental}

a été un schéma

«

_{petites-filles}

» avec 2 000

_fils,

20

_pères

des

_fils

et 9 ancêtres de ces

_pères.

Des données

ont été simulées avec un modèle de variation au

_QTL

de type Gaussien ou

_{biallélique.}

Les

_génotypes

pour cinq ou

_neuf

_marqueurs

_{également espacés}

ont été

_générés

_{pour tous} les

_,fils

et leurs anchêtres. Deux

_QTLs

liés

_expliquaient

conjointement 50 % ou 25 % de la variance

_génétique

additive et la distance entre les

QTLs

variait de 20 CM à 40 CM. La

puissance de détection d’un second

_QTL

a

_{dépassé 0,5,}

dans tous les cas _pourla situation 50

_%,

et

_quand

la distance entre

_QTLs

était

_supérieure

ou

égale

à 30 CM _pourla situation 25 %. Le test d’un

QTL

par rapport à deux

QTLs correspond

à la réunion de deux tests.

On l’a trouvé

_{plutôt conservatif.}

Les

paramètres

ont été estimés avec une

_{grande précision}

excepté

la distance entre deux

QTLs proches

qui a été

_légèrement

surestimée.

locus de caractère quantitatif

/

cartographie multipoint

/

maximum de vraisemblance résiduelle

_/

_population

_consanguine

INTRODUCTION

A

_variety

of methods for the statistical

_mapping

of

_quantitative

trait loci

_(QTL)

exist. While some methods

_{analyze squared phenotypic}

differences of relative

_pairs

(eg,

Haseman and

_{Elston, 1972;}

Gotz and

_Ollivier,

_1994),

most methods

_analyze

the individual

_phenotypes

of

_pedigree

members. Main methods

_applied

to livestock

populations

are maximum likelihood

_{(ML) (eg,}

_Weller,

_1986;

Lander and

_Botstein,

1989;

Knott and

_Haley,

_1992),

_{least-squares}

_(LS)

as an

_{approximation}

to ML

_(eg,

Weller et

_al,

_1990;

_Haley

et

al, 1994;

Zeng,

1994),

and a combination of ML

and LS referred to as

_composite

interval

_mapping

_(Zeng,

₁₉₉₄₎

or

_{multiple QTL}

mapping

(Jansen, 1993).

These methods were

developed mainly

for line

_crossing

and, hence,

cannot

_fully

account for the more

complex

data structures of outcross

populations,

such as data on several families with

_{relationships}

across

_families,

incomplete

marker

_information,

unknown number of

_QTL

alleles in the

_population

and

_varying

amounts of data on different

_(aTLs

or in different families.

Recently,

Thaller and Hoeschele

_{(1996a, b)}

and Uimari et al

₍₁₉₉₆₎

_implemented

a

_Bayesian

method for

QTL

_{mapping using}

single

markers or all markers on a

chromosome,

respectively,

via Markov chain Monte Carlo

_algorithms,

and

_applied

the

_analyses

to simulated

_{granddaughter designs}

identical to those in the

_present

study.

Hoeschele et al

₍₁₉₉₇₎

showed that the

_Bayesian

_analysis

can accommodate

either a biallelic or a normal-effects

_QTL

model. While the

_{Bayesian analysis}

was

able to account for

_{pedigree relationships}

both at the

_QTL

and for the

_polygenic

component,

and _gave

_good

_{parameter estimates,}

it was _very

_demanding

in terms

of

_{computing time,}

in

_particular

when

_fitting

two

_(aTLs

_(Uimari

and

_Hoeschele,

1997).

Therefore,

Grignola

et al

_(1996a)

_developed

a residual maximum likelihood

method,

using

a

_{deterministic,}

derivative-free

_algorithm,

to _mapa

_{single QTL.}

Hoeschele et al

₍₁₉₉₇₎

showed that this method can be considered as an

approxi-mation to the

_{Bayesian analysis fitting}

a normal-effects

_QTL

model. In the

normal-effects

QTL

model

postulated

by

the REML

analysis,

the vector of

QTL

allelic effects is random with a

_prior

normal distribution. The REML

_analysis

builds on

(4)

it to the estimation of

QTL, polygenic,

and residual variance

_components

and of

QTL location, using incomplete

information from

multiple

linked markers.

Xu and

_{Atchley (1995)}

_performed

interval

_{mapping using}

maximum likelihood based on a mixed model with random

QTL

effects,

but these authors fitted additive

genotypic

effects rather than allelic effects at the

_QTL,

with variance-covariance

matrix

_proportional

to a matrix of

_proportions

of alleles

identical-by-descent,

and assumed that this matrix was known. Their

analysis

was

_applied

to unrelated full-sib

_pairs.

In order to account for several

QTLs

on the same

_chromosome,

Xu and

Atchley

(1995)

used the idea behind

composite

interval

_mapping

and fitted variances

at the two markers

_flanking

the marker bracket for a

_QTL.

This

_{approach, however,}

is not

_appropriate

for

_{multi-generational}

_pedigrees,

as effects associated with marker

alleles erode across

_{generations owing}

to recombination. It is also

problematic

for

outbred

_populations,

where

incomplete

marker information causes the

_flanking

and

next-to-flanking

markers to differ _amongfamilies.

In this _paper,we extend the REML method of

_Grignola

et al

_(1996a)

to the

fitting

of

multiple

linked

_QTLs.

While the extension is

_general

for _anynumber of linked

_QTLs,

we

_apply

the method to simulated

_{granddaughter designs by}

fitting

either one or two

_QTLs.

METHODOLOGY Mixed linear model

The model is identical to that of

_Grignola

et al

_(1996a),

_except

that it includes effects at several

_(t)

QTLs,

and it can be written as:

where y is a vector of

_phenotypes,

X is a

_{design-covariate matrix, j3}

is a vector of fixed

effects,

Z is an incidence matrix

relating

records to

individuals,

u is a vector

of residual additive

_(polygenic)

_effects,

T is an incidence matrix

relating

individuals

to

_alleles,

vi is a vector of

QTL

allelic effects at

_QTL

_i,

e is a vector of

residuals,

A is the additive

_{genetic relationship matrix,}

_c

₇

_’

_is

the

_polygenic

variance,

_Q,

₀

_{,2 V(}

_i)

is the variance-covariance matrix of the allelic effects at

_QTL

i conditional on

marker

_information,

_Qv!2!

is the allelic variance at

_QTL

_{i (or}

half of the additive variance at

_QTL

_i),

R is a known

_diagonal

_matrix,

and

Q

e

is

residual variance. Each

matrix

_Q

_i

_depends

on one unknown

_parameter,

the _map

position

of

QTL

i (d

i

).

Parameters related to the marker _map

_(marker

_positions

and allele

_frequencies)

are assumed to be known. The model is

parameterized

in terms of the unknown

parameter’s heritability

(h

2

=

_{0&dquo;!/0&dquo;2),}

with

_{aj_}

_being

_phenotypic

and

_U2

additive

(5)

effects at

_QTL

i

_(v?

=

o,’ v(i) /’a 2;

i =

_{l, ... ,}

_t),

the residual variance

_{0,2, e}

and

_QTL

map locations

d

l

, ... ,

di, ... , dt.

A model

_equivalent

to the animal model in

_[1]

is

_(Grignola

et

_al,

_1996a):

where W has at most two non-zero elements

equal

to 0.5 in each row in columns

pertaining

to the known

_parents

of an

individual,

F

i

is a matrix with _upto four

non-zero elements per row

_pertaining

to the

_QTL

effects of an individual’s

parents

(Wang

et

_al,

_1995;

_Grignola

et

_al,

_{1996a), Ap}

and

_Qp(

_j

₎

are sub-matrices of A

and

_{Q, respectively, pertaining}

to all animals that are

_parents,

and m and ei

are Mendelian

_sampling

terms for

_polygenic

and

_QTL

_{effects, respectively,}

with covariance matrices as

_specified

in

_equation

_!2!.

While

_Var(m)

is

_diagonal,

_Var(e

_i

₎

can have some

off-diagonal

elements in inbred

_populations

_(Hoeschele,

_{1993; Wang}

et

al,

1995).

Note that models

_[1]

and

_[2]

are conditional on a set of

_QTL

_map

_positions

(and

on marker

_positions

which are assumed to be

_known).

_Dependent

on the

map

positions

are the matrices

Q

i

in model

_[1]

and the matrices

_F

_i

and

_Qp(

_j

₎

in

model

_!2!.

Note furthermore that models

_[1]

and

_[2]

assume zero covariances between effects

at different

_QTLs,

and between

_polygenic

and

_QTL

effects.

_However,

selection tends

to introduce

_negative

covariances between

_(aTLs

_(Bulmer,

_1985).

A reduced animal model

_(RAM)

can be obtained from model

_[2]

_{by combining}

m,

the e

i

(i

=

1, ... ,

t) and

e into the residual. Mixed model

_equations

_(MME)

can be formed

directly

for the

RAM,

or

_by

_setting

_upthe MME for model

_[2]

and

absorbing

the

_equations

in m and the ei

(i

=

(6)

with the A matrices defined in

_equation

_!2!.

Matrix

_D,

which results from successive

absorption

of the Mendelian

_sampling

terms for the

_polygenic

_component

and the

QTLs,

can be shown to be

_{always diagonal}

and _very

_simple

to

_compute,

even when

several

_{(aTLs (t}

>

₂₎

are fitted. Let

_6v!i!!!

_represent

the Mendelian

sampling

term

pertaining

to v effect k

(k

=

1, 2)

of

individual j

at

QTL

i,

and

6,,(

j

)

the Mendelian

sampling

term for the

_polygenic

effect of

_{j. Then,}

the element of D

_pertaining

to

individual j

(d

jj

)

is

_computed

as follows:

where

r

jj

is the

_jth

_diagonal

element of

R-

1 .

REML

_analysis

The REML

_analysis

was

performed using

interval

mapping

and a derivative-free

algorithm

to maximize the likelihood for _any

_given

set of

QTL positions,

as

described

_{by Grignola}

et al

_(1996a)

for a

_single

_QTL

model. The

_log

residual

likelihood for the animal model was obtained

by adding

correction terms to the residual likelihood formed

_directly

from the RAM MME

_(Grignola

et

_al,

_1996a).

The RAM residual likelihood is:

where N is the number of

_phenotypic

_{observations,}

NF the number of estimable fixed effects

_(rank

of

_X),

NR

RA

,yI

the number of random

genetic

effects of the

parents

((1

+

_2t)

times the number of

_parents),

CRAM

is the coefficient matrix in

the left-hand-side of

_[3],

P =

V-’ -

1 ,

1 X’V-

1 X)-

X(X’V-

1 V-

V =

Var(Y)/(7!,

and

_GRA,!,I

is a

block-diagonal

with blocks

Ap(7! and

Qp(i)(7!(i)

for i =

1, ... ,

t (see

(7)

The RAM residual likelihood is modified to obtain the residual likelihood for the animal model as follows

(Grignola

et

_{al 1996a):}

where A is the

_{block-diagonal}

with blocks

A

u

and

_!v(i)

_(i

=

_{1, ... ,}

t)

from

!2!,

_Czz

is the

_part

of the MME for model

_[2]

_pertaining

to m

_{and e}

_i

(i

= 1, ... ,

t

),

_and

NR

is total number of random

_genetic

effects

_!(1

!-

_2t)

times the number of

_animals]

in the animal model.

The

_analysis

is conducted in the form of interval

_mapping

as in

_Grignola

et al

(1996a),

except

that now a t-dimensional search on a

_grid

of combinations of

positions

of the

_{t CaTLs}

must be

_performed.

More

_precisely,

we

_{performed cyclic}

maximization

_by

_optimizing

the

_position

of the first

_QTL

while

_holding

the

_position

of the second

_QTL

constant and

_{subsequently fixing}

the

_position

of the first

QTL

while

_optimizing

the

_position

of the second

_QTL,

etc. A minimum distance

was allowed between the

_QTLs,

which was determined such that the

_(aTLs

were

always separated by

two markers. Whittaker et al

₍₁₉₉₆₎

showed that for

_regression

analysis

and F2 or backcross

_designs,

the two locations and effects of two

_(aTLs

in

_adjacent

marker intervals are not

_jointly

estimable. With other methods and

designs,

locations and variances of two

_(aTLs

in

_adjacent

intervals should be either

not estimable or

_poorly

estimated. At each combination of

_d

₁

and

d

2 values,

the residual likelihood is maximized with

_respect

to the

_parameters

_h

_z

_,

_v2

_(i

=

_1,...,

t)

and

_er!.

Matrices

_Qp(

_j

_),

_F

_i

and

_Ov!2!

were calculated for each

_QTL

as described in

Grignola

et al

_(1996a).

Hypothesis testing

The _presenceof at least one

QTL

on the chromosome

_harboring

the marker

_linkage

group can be tested

_by

_maximizing

the likelihood under the

_one-QTL

model and under a

_polygenic

model with no

QTL

fitted

(Grignola

et

_al,

_1996a).

The distribution of the likelihood ratio statistic for these two models can be obtained

via simulation or data

_permutation

(Churchill

and

_Doerge,

_1994;

_Grignola

et

_al,

1996a, b;

Uimari et

_al,

_1996).

_Here,

we consider

_testing

the

_one-QTL

model

_against

the

_two-QTL

model. This test is

_{performed by comparing}

the maximized residual likelihood under the

_two-QTL

model with

_(i)

the maximized residual likelihood under the

_{one-QTL model,}

_(ii)

the residual likelihood maximized under the

one-QTL

model with

_{QTL position}

fixed at the REML estimate of

_d

₁

obtained under the

_two-QTL

_model,

and

_(iii)

the residual likelihood maximized under the

_one-QTL

model with

_QTL

_position

fixed at the REML estimate of

_d

₂

obtained under the

two-QTL

model. The distribution of these likelihood ratio statistics is not

_known,

and

_obtaining

it via data

_permutation

would be difficult

_{computationally,}

as _many

permutations

would need to be

_analyzed,

and as the two-dimensional search took

1-2 h of run-time for the

_design

described below. The likelihood ratios

_{corresponding}

to

_{(i) (LR}

_d

_),

_(ii)

_(LR

_dl

_),

and

_(iii)

_(LR

_d2

₎

should have an

_asymptotic

_chi-square

distribution within 1 and 3

_degrees

of freedom. When

_using

_LR

_dl

and

_LR

_dz

_,

both ratios have to be

_significant

in order to

_reject

the null

_hypothesis

of one

_QTL.

This

(8)

for the first likelihood ratio the

_hypotheses

are:

_H

_o

_:

&OElig;!(l) -I-

0 and

&OElig;!(2)

= 0 _versus

H

1 :

_{&OElig;!(l) -I-}

0 and

_{&OElig;!(2) -I-}

0,

and for the second likelihood ratio the

hypotheses

are:

Ho:

_{&OElig;!(1) =}

0 and

_{&OElig;!(2) -I-}

0 versus

_H

₁

_:

_{&OElig;!(1) -I-}

0 and

_{&OElig;!(2) -I-}

0. The intersection-union test constructed in this way can be

_{quite conservative,}

as its size _maybe

much less than its

_specified

value ce. For

_{genome-wide testing,}

the

_significance

level

should also be

_adjusted

for the number of

_independent

tests

_performed

_(the

number of chromosomes

_analyzed

times the number of

_independent

_traits).

SIMULATION

Design

The

_design

simulated was a

_{granddaughter}

_design

(GDD)

as in the

_{single QTL}

study

of

_Grignola

et al

_(1996b),

where marker

genotypes

are available on sons and

phenotypes

on

daughters

of the sons. The structure resembled the real GDD of the US

_public

_gene

_{mapping project}

for

_dairy

cattle based on the

_dairy

bull DNA

repository

(Da

et

_al,

_1994).

The simulated GDD consisted of 2 000 sons, 20

sires,

and nine ancestors of the sires

_{(fig 1)..}

The

_phenotype

simulated was

_daughter

_yield

deviation

(DYD)

of sons

(Van-Raden and

_Wiggans,

_1991).

DYD is an _averageof the

_phenotypes

of the

_daughters

adjusted

for

_systematic

environmental effects and

_genetic

values of the

_daughters’

dams. For details about the

_analysis

of

_DYDs,

see

_Grignola

et al

_(1996b).

Marker and

_QTL

_genotypes

were simulated

according

to

_{Hardy-Weinberg}

fre-quencies

and the _map

_positions

of all loci. All loci were in the same

linkage

_group.

Each marker locus had five alleles at

_{equal frequencies.}

Several

_designs

were

con-sidered which differed in the map

positions

of the two

QTLs,

in the number of

markers,

and in the

_proportion

of the additive

_genetic

variance

_explained

_by

the

two

_QTLs.

These

_designs

are defined in table II. Also simulated was a

_{single QTL}

at 45 cM to test the

_{two-QTL analysis}

with data

_generated

under the

_{single QTL}

model.

Polygenic

and

QTL

effects were simulated

according

to the

_pedigree

in

_figure

1.

Data were

_{analyzed by}

_using

the

_pedigree

information on the sires. Note that

in the

_simulation,

no

_linkage

_{disequilibrium}

(across

families)

was

_{generated, ie,}

(9)

polygenic

effects were zero.

_Therefore,

an additional

_design

was simulated where

linkage disequilibrium

was

_generated

_by

_simulating

DYDs also for

_{sires, creating}

a

_larger

number of sires and

_culling

those sires with DYD lower than the 90th

percentile

of the DYD distribution.

_{QTL positions}

for this

_design

were 30 cM

(interval 2)

and 70 cM

_{(interval 3)}

with five

_markers,

and the

QTL

model was

the normal-effects model

_{(see below).}

Estimates of the simulated correlations

_(SE

in

_{parentheses),}

across 30

replicates,

were -0.20

(0.05),

-0.33

_(0.04),

and -0.32

(0.04),

between

_pairs

of v effects at

_QTL

1 and

_{QTL 2,}

between

_pairs

of v effects

at

_QTL

1 and

_polygenic

_effects,

and between

_pairs

of v effects at

_QTL

2 and

polygenic effects,

respectively.

The effects of one or several

_generations

_{of phenotypic}

truncation selection on additive

_genetic

variance in a finite locus model has been

(10)

QTL

models

Two different

QTL

models were used to simulate data. Under both

models,

phenotypes

were simulated as

where n

j

was the number of

daughters

of son

_j,

_gi!kwas the sum of the v effects

in

_daughter

k of

_{son j at}

_QTL

_i,

_u_j was a

_normally

distributed

_{polygenic effect,}

e

j was a

normally

distributed

residual, polygenic

variance

(

0 &dquo;)

was

equal

to the difference between additive

_genetic

variance

_{(afl )}

and the variance

_explained

_by

the

QTLs,

and

_{afl was}

environmental variance. Number of

_daughters

_person was set to

50,

corresponding

to a

_reliability

_(Van

Raden and

Wiggans,

1991)

near 0.8. Narrow

sense

heritability

of individual

_phenotypes

was

h

2

=

_0.3,

and

_phenotypic

SD was

Q P = 100.

Note that the

_QTL

contribution to the DYDs of sons was

_{generated by sampling}

individual

QTL

allelic effects of

_daughters

under each of the two

_genetic

models described below. This

_sampling

of

_QTL

effects ensures that DYD of a

_heterozygous

son, or of a son with substantial difference in the additive effects of the alleles at

a

QTL,

has

larger

variance _among

daughters

due to the

_QTL

than a

_homozygous

son or a son with similar

_QTL

allelic effects.

Two different models were used to describe variation at the

_QTL,

which are

identical to two of the models considered

_{by Grignola}

et al

_(1996b).

Normal-effects model

For each individual with both or one

parent(s)

_unknown,

both or one

effect(s)

at

QTL

k(k

=

1, 2)

were drawn from

_N(O,

_{a v 2(k)).}

For the

pedigree

in

_figure

_1,

there

were 32 base

alleles,

and each

_QTL

was treated as a locus with 32 distinct alleles

in

_passing

on alleles to descendants. The

_parameter

_a

_v

_2(k)

was set to

_{0.125or or}

0.625(J&dquo;!,

ie,

QTL

k accounted for

25%

₍₂

_V2

=

0.25)

or

12.5%

(2v!

=

0.125)

of the

total

additive

genetic variance,

respectively.

k k

Consequently,

the two

_(aTLs

accounted

_jointly

for between 25 and

50%

of the additive

_genetic

variance.

Biallelic model

Each

_QTL

was biallelic with allele

_frequency

_p_i =

p 2 =

P= 0.5. The variance _at

QTL

k was

where for p = 0.5 and

2v!

= 0.25 _or

2v!

=

_0.125,

half the

homozygote

difference _at

(11)

RESULTS

The

_designs

studied are described in table II and differ in the

QTL positions,

in the number of

_markers,

and in the

_proportion

of the additive

_genetic

variance

explained jointly by

two linked

_QTLs.

Overall,

the

QTL

parameters

were estimated

quite accurately

as in the

single-CaTL

_analysis

of

Grignola

et al

_(1996b),

_except

that there was a

_tendency

to overestimate the distance between the

CaTLs

with

decreasing

true distance.

Parameter estimates for all

_designs

in table II and for the normal-effects

_QTL

model used in the data simulations are

_presented

in table III. There

appeared

to

be a

_{slight tendency}

to overestimate the

_QTL

variance contributions

_(v

₂

_),

_but,

in most cases not

_{significantly.}

The

_QTL

map

positions

and the distance between

the

QTLs

were estimated

accurately

when the true _mapdistance between the

(aTLs

was 30 or 40 cM. When the true _mapdistance was

_only

20

_cM,

there was a

tendency

to overestimate the

_QTL

distance. This overestimation was

_{significantly}

more

_pronounced

when the number of markers was reduced from nine

(every

10

_cM,

designs IIIA,

B)

to five

_(every

20

_{cM, designs IVA,}

_B).

To

_investigate

whether the overestimation of the

QTL

distance was related to the search

_{strategy requiring}

a minimum distance between the

QTLs

such that these were

always

separated

by

two markers

_(with

the

_exception

of

_{designs IVA, B),}

the minimum distance

was reduced to 10 and 2 cM.

_However,

_parameter

estimates and likelihood ratios remained

_unchanged.

When the

_(aTLs

accounted

_jointly

for

_only

25%

of the additive

_genetic

variance

as

_compared

to

_50%,

there was little

_change

in the

_precision

of the estimates of the

QTL

variance contributions. Standard errors of the

_{QTL positions}

were

_higher,

and

overestimation of the distance between

_{(aTLs only}

20 cM

_apart

was

_slightly

more

pronounced.

Parameter estimates for

_designs

simulated under the biallelic

_QTL

model are

shown in table V.

Except

for the

_{QTL model,}

these

designs

are identical to

designs IA,

B and

IIIA,

B in table II. Parameters were estimated with an _accuracy

not

_noticeably

lower than for the normal-effects

_{QTL model,}

an observation in

agreement

with the

_single-(aTL

_study

of

_Grignola

et al

_(1996b).

When

_analyzing

the

_designs

in table II with the

_single-(aTL

_model,

the most

likely QTL

position

(d

in tables III and

_V)

was

always

somewhere in between the

(12)

the estimated

_QTL

_position

was _verynear the mean of the true

_positions.

This result was

_expected,

as both

_(aTLs

had

_equal

variance contributions and on _average

equally

informative

_flanking

markers.

Likelihood ratio statistics for all

_designs

in table II and for the normal-effects

QTL

model are

_presented

in table IV. The _averagevalues of the likelihood ratio statistics for

_testing

between the

_single-

and

_two-QTL

models declined as

_expected

with

_decreasing

distance between the two

_QTLs

_(designs

_IIA,

B versus

_{designs IIIA,}

B),

with

_decreasing

number of markers

_(designs

IA,

B versus

IIA,

B,

and

designs

IIIA,

B versus

_{designs IVA,}

B),

and with

_{decreasing joint}

variance contribution of the

_QTLs

_(A

versus

_B).

The _averagevalue of the likelihood ratio statistic

_LR

_d

was

always considerably

lower than those of

LR

dl

and

LR

d2

.

The _power

_figures

in table IV were calculated

_assuming

that

_LR

_dl

_,

_LR

_d2

and

LR

d

follow either a

_chi-square

distribution with 1 df or 3 df and

_using

an cx value of

_0.05/29

= 0.0017. To allow for

any

interpretation

of these power

figures,

we

estimated the

_type-I

error

_{by simulating}

data with a

_single

_QTL

_explaining

either

25,

12.5 or

6.25%

of the additive

_genetic

variance. For the

_type-I

error

_estimation,

a was set to

_0.05,

and the number of

replicates

was 200. Estimates of

type-I

errors are in table

_VI,

for the two tests

_(LR

_d

and

_LR

_d

_,

and

_LR

_d2

₎

and for thresholds from

chi-square

distributions with

1,

2 and 3 df.

Type-I

errors tended to increase

_slightly

with size of the

_QTL

_variance,

and were

consistently

lower for the test

_using

_LR

_d

_.

Based on these

results,

the

_{empirical type-I}

error was close to the

_{pre-specified}

value of 0.05 for the

LR

dl

and

LR

d2

tests when

_using

the

_{xi-threshold,}

while it was

(13)

With this

_background,

the _powerof

_rejecting

the

_single-(aTL

model based on

requiring

both

LR

dl

and

_LR

_d2

to exceed the

_significance

threshold was as

expected

always

higher

than or

_equal

to the power of the

LR

d

statistic. For the test based on

LR

dI

and

_LR

_d2

_,

_powerdeclined as

_expected

with

_decreasing

distance between

_QTLs

and with

_decreasing

true

QTL

variance contribution. For the

_{joint QTL}

variance contribution of

50%

and the test based on

_LR

_d

_,

and

_LR

_d2

_,

_powerwas

_equal

to or

higher

than 0.5

_always

for the

_xi

threshold

and

always

except

for

_design

IVA for the

_X2

_threshold.

For the

_joint

_QTL

25%,

a _powerof at

least 0.5 was achieved

_only

for the 30 cM distance between

_QTLs

and the

X2

and

X3

thresholds.

This

_finding

must be

interpreted by keeping

in mind the choice of

a = 0.0017 and the fact that the

_test,

_ascontructed

_here,

is rather conservative.

For the data simulated with a

_{single QTL explaining}

either 25 or

12.5%

of the

additive

_{genetic variance,}

the map

positions

estimated under the

two-QTL

model

were near the true

_position

and a

_ghost

_position

to either side of the true

_position

in most

_replicates.

This behavior of the REML

analysis

seems to

_support

the use

of

_LR

_dl

and

_LR

_d2

instead of

_LR

_d

_.

Likelihood ratio statistics for the biallelic

QTL

model and some of the

_designs

in

table II are

_presented

in table V.

_Overall,

likelihood ratios and _power

_figures

were

similar to those for the normal-effects

_QTL

_model,

with somewhat lower power for

design

IA but

_{slightly higher}

_powerfor other

_designs.

These differences are most

likely

due to the limited number of

_replications

_(30).

The

_cyclic

maximization

_strategy

for the

two-QTL

model took about 20 min

of serial wall-clock time on a

_21-processor

IBM SP2

_system

for a chromosome of

80 cM

_{length, compared}

with 1.5 h for a two-dimensional search. Run-time for the

single-QTL

analysis

was at most 8 min.

For the

_designs

in table

_II,

the two

_QTLs

had

_equal

variance contributions.

(14)

and 25 and 45 cM

_(nine

_{markers), respectively,}

were simulated

_using

the

normal-effect

_QTL

_model,

with

_QTL

1

_explaining

25%

and

_QTL

2

12.5%

of the additive

genetic

variance. The average estimates

(with

SE in

parentheses)

of

QTL

position

from the

_single

_QTL

_analysis

were 0.396 M

(0.023)

and 0.298 M

(0.010)

for the

30 and 70 cM and 25 and 45 cM

_{designs, respectively, being}

closer to the first locus with the

_larger

variance contribution. Estimated

QTL positions

from the

two-QTL

analysis

were 0.285 M

_(0.008)

and 0.720 M

_(0.010)

for the 30 and 70 cM

_design,

(15)

and 0.077

_(0.012)

for the second

_{design, respectively.}

For the 30 and 70 cM

_design,

power was 0.47

(0.40)

for the

Xi

2 (X’)

threshold and the test based on

LR

dl

and

LR

d2

.

For the 25 and 45 cM

_design,

_powerwas

_only

0.33

_(0.10)

for the same tests.

When

_linkage

disequilibrium

was

generated

by phenotypic

truncation selection

of sires for the

_design

with

_QTL

_positions

of 30 and 70 cM and

_{joint QTL genetic}

_50%,

_QTL

_parameters

and their estimates were

_clearly

affected.

Heritability

was estimated low

(0.213 ! 0.020),

and the

vfl (k

=

1, 2)

_were

estimated

_{high (0.184 ±0.014,}

_{0.211 ±0.014),}

but

_{QTL positions}

were estimated

accurately

(0.292 t

0.010,

0.716 ±

_0.007).

Power

_appeared

to be somewhat reduced

compared

to the same

_design

without

_selection,

and was estimated at 0.86 and 0.73 for the

_X

_i

_{1 and X thresholds,}

_{respectively.}

Reduction _{in power}was

_probably

due to

the

_high

estimate of error variance

(1737.9 t 55.3).

CONCLUSIONS

The REML

_{analysis of Grignola}

et al

_{(1996a, b),}

based on a mixed linear model with

random and

_normally

distributed

_QTL

allelic effects and conditional on

_incomplete

information from

_multiple

linked

markers,

has been extended here to fit

_multiple

linked

QTLs.

This extension is necessary to eliminate biases in the estimates of the

_QTL

_parameters

_position

and

_variance,

which occur when

_fitting

a

_single

_QTL

and other linked

QTLs

are

_present.

For the

_present

study,

the

analysis

had been

implemented

for two

_QTLs

on the same chromosome

_using

a two-dimensional

search. When

_fitting

_QTLs

or additional unlinked

_QTLs,

a more efficient search

_strategy

may be

required,

or alternative

algorithms

(eg,

Meyer

and

_Smith,

_1996).

In the

_meantime,

a

_{cyclic optimization approach}

was

implemented,

with one

QTL position

held constant while

_optimizing

the

other,

and vice versa, which reduced the number of likelihood evaluations and hence CPU time

_considerably

relative to a two-dimensional search. As likelihood maximizations

at different

_position

combinations are

_independent

of each

_other,

use of

_multiple

processors, if

available,

would reduce run-times

substantially.

For the

_one-QTL

model,

relationships

between the REML

_analysis,

the

equiv-alent method of Xu and

_{Atchley (1995),}

the method of Schork

_(1993),

and the

Bayesian analysis

of Uimari et al

₍₁₉₉₆₎

were discussed in

Grignola

et al

_(1996a).

The method of Xu and

_Atchley

₍₁₉₉₅₎

_fitting

variances associated with the

next-to-flanking

markers to account for additional linked

_QTLs

would not have worked well for the

_designs

studied here. A first reason is the inclusion of ancestors of the

sires in the

_analysis,

as their method fits random effects associated with the marker

alleles in founders which erode across

_generations

due to recombination. Another

reason is the small number of families

differing

in the

flanking

and

next-to-flanking

markers,

resulting

in too little information for estimation of variances associated with the

_{next-to-flanking}

markers in the method of Xu and

_Atchley

_(1995).

For similar _reasons,

_composite

interval

_mapping

_{(Zeng, 1994)}

and

_{multiple QTL}

map-ping

(Jansen, 1993),

which are based on the inclusion of markers as

cofactors,

are

not suitable for the

_analysis

of

_{multi-generational pedigrees.}

REML

_analysis

under the

_two-QTL

model

yielded

fairly

accurate

_parameter

estimates.

_Map

distance between the

QTLs

was

_{overestimated,}

with

_decreasing

(16)

study,

the REML

_analysis

was robust to the number of alleles at the

_QTLs,

as there was

_relatively

little difference in

_parameter

estimates and likelihood ratio statistics

between

_{designs generated}

with the normal-effects and biallelic

_QTL

_{models, given}

the number of

_{replicates performed.}

_(eg,

Knott and

_Haley,

1992;

Haley

et

_al,

₁₉₉₄₎

lead to the conclusion that a minimum distance of 20 cM

was

_required

between linked

_QTLs

for their

_separate

detection. This result was

confirmed in the

present

study. However,

discrimination among different numbers

of

_QTL

_(eg,

one versus

_two)

_requires

additional

research,

including

an

_{investigation}

of alternative

_approaches

such as an

_adaptation

of

_composite

interval

_mapping

to

pedigree analysis.

Gains in power from

_fitting

additional unlinked

QTL

and selection of

_QTL

to be included in the

_analysis

are related

_{topics warranting}

further research.

If there is

_{linkage disequilibirum}

due to

_{selection, QTL positions}

will still be estimated

_accurately,

while variance estimates and power may be affected.

Accounting

for

_{disequilibrium}

in the

_analysis

should be

_{investigated.}

ACKNOWLEDGMENTS

This research was

supported by

Award No 92-01732 of the National Research Initiative

Competitive

Grants

Program

of the US

Department

of

Agriculture, by

the Holstein Association

_USA,

and

_by

ABS Global Inc.

REFERENCES

Berger

RL

₍₁₉₉₆₎

Likelihood ratio tests and intersection-union tests. In: Advances in Statistical Decision

_Theory

_(N

Balakreshnan and S

_{Panchapahesan,}

_eds),

_Birkhauser,

Boston

_{(in press)}

Bulmer MG

₍₁₉₈₅₎

The Mathematical

_{Theory of Quantitative}

Genetics. Clarendon

_Press,

Oxford

Cantet

RJC,

Smith C

₍₁₉₉₁₎

Reduced animal model for marker assisted selection

_using

best linear unbiased

prediction.

Genet Sel

Evol 23,

221-233

Casella

_{G, Berger}

RL

₍₁₉₉₀₎

Statistical

_Inference.

Wadsworth &

_Brooks/Cole,

Advanced Books &

_Software,

Pacific

Grove,

CA

Churchill

G, Doerge

R

₍₁₉₉₄₎

_Empirical

threshold values for _quantitativetrait _mapping.

Genetics 138,

963-971

Da

_Y,

Ron

_M,

Yanai

_A,

Band

M,

Everts

RE,

Heyen DW,

Weller

JI,

Wiggans GR,

Lewin HA

₍₁₉₉₄₎

The

_Dairy

Bull DNA

_Repository:

A resource for

mapping

quanti-tative trait loci. Proc 5th World

_Congr

Genetics

_Appl

Livest Prod 21, 229-232 Fernando

_RL,

Grossman M

₍₁₉₈₉₎

Marker-assisted selection

_using

best linear unbiased

prediction.

Genet Sel Evol 21, 467-477

Goddard M

₍₁₉₉₂₎

A mixed model for

_analyses

of data on

_{multiple genetic}

markers. Theor

t

l

pPl

Genet

_83,

878-886

Gotz

_KU,

Ollivier L

₍₁₉₉₄₎

Theoretical aspects of

_applying

_{sib-pair linkage}

tests to

livestock species. Genet Sed Evol 24, 29-42

Grignola

FE,

Hoeschele

_I,

Tier B

_(1996a)

_{Mapping quantitative}

trait loci via residual maximum likelihood: 1.

_Methodology.

Genet Sel

_{Evol 28,}

479-490

Grignola FE,

Hoeschele

I, Zhang Q,

Thaller G

_(1996b)

Mapping quantitative

trait loci via residual maximum likelihood: 1. A simulation

_study.

Genet Sel Evol 28, 491-504

Haley CS,

Knott

_SA,

Elsen J-M

₍₁₉₉₄₎

_Mapping

quantitative trait loci in crosses between

(17)

Haseman

_JK,

Elston RC

₍₁₉₇₂₎

The

_{investigation}

of

linkage

between a _quantitativetrait and a marker locus. Behav Genet _{2, 3-19}

Hoeschele I

₍₁₉₉₃₎

Elimination of quantitative trait loci

_equations

in an animal model

incorporating genetic

marker data. J

_Dairy

Sci

_76,

1693-1713

Hoeschele

_I,

Uimari

_{P, Grignola FE, Zhang Q, Gage}

KM

₍₁₉₉₆₎

Statistical

_mapping

of

polygene

loci in livestock. Proc Int Biometrics Soc

_(in

_press)

Hospital F,

Chevalet C

₍₁₉₉₆₎

Interactions of

_{selection, linkage}

and drift in the

_dynamics

of

_polygenic

characters. Genet Res Camb 67, 77-87

Jansen RC

₍₁₉₉₃₎

Interval

_mapping

of

_{multiple quantitative}

trait loci. Genetics

_135,

252-324

Knott

_{SA, Haley}

CS

₍₁₉₉₂₎

Maximum likelihood

_mapping

of quantitative trait loci

_using

full-sib families. Genetics 132, 1211-1222

Lander

_ES,

Botstein D

₍₁₉₈₉₎

_Mapping

Mendelian factors

_underlying

quantitative traits

using

RFLP

_linkage

_maps.Genetics 121, 185-199

Meyer

K

₍₁₉₈₉₎

Restricted Maximum Likelihood to estimate variance _componentsfor animal models with several random effects

_using

a derivative-free

_algorithm.

Genet Sel Evol _21,317-340

Meyer K,

Smith SP

₍₁₉₉₆₎

Restricted Maximum Likelihood estimation for animal models

using

derivatives. Genet Sel Evol 28, 23-50

Schork NJ

₍₁₉₉₃₎

Extended

_{multi-point identity-by-descent analysis}

of human

quanti-tative traits:

_Efficiency,

_power,and

_modeling

considerations. Am J Hum Genet 53,

1306-1319

Thaller

G,

Hoeschele I

_(1996a)

A Monte Carlo method for

_{Bayesian analysis}

of

linkage

between

_single

markers and

_quantitative

trait loci: I.

_Methodology.

Theor

Appl

Genet

93, 1161-1166

Thaller

_G,

Hoeschele I

_(1996b)

A Monte Carlo method for

_{Bayesian analysis}

of

_linkage

between

_single

markers and _quantitativetrait loci: I. A simulation

_study.

Theor

_Appl

Genet

_{93, 1167-1174}

Uimari

_P,

Thaller

_G,

Hoeschele I

₍₁₉₉₆₎

The use of

_multiple

linked markers in a

_Bayesian

method for

_{mapping quantitative}

trait loci. Genetics

_143,

1831-1842

Uimari

_P,

Hoeschele I

₍₁₉₉₆₎

_Mapping

linked quantitative trait loci with

_{Bayesian analysis}

and Markov chain Monte Carlo

algorithms.

Genetics 146, 735-743

VanRaden

PM, Wiggans

GR

₍₁₉₉₁₎

_{Derivation, calculation,}

and use of national animal model information. J

_Dairy

Sci

74,

2737-2746

Wang

T, Fernando

_RL,

van der Beek

_S,

Grossman M

(1995)

Covariance between relatives for a marked

_quantitative

trait locus. Genet Sel

_{Evol 27,}

251-274

Weller JI

₍₁₉₈₆₎

Maximum likelihood

_techniques

for the

_mapping

and

_analysis

of

quanti-tative trait loci with the aid of

genetic

markers. Biometrics 42, 627-640

Weller

_JI,

Kashi

_Y,

Soller M

₍₁₉₉₀₎

Power of

_daughter

and

_{granddaughter designs}

for

determining linkage

between marker loci and

_quantitative

trait loci in

_dairy

cattle. J

Dairy

Sci

73,

2525-2537

Whittaker

JC, Thompson R,

Visscher PM

₍₁₉₉₆₎

On the

mapping

of

_{QTL by regression}

of

phenotype

on

_{marker-type. Heredity}

_77,23-32

Xu

_{S, Atchley}

WR

₍₁₉₉₅₎

A random model

approach

to interval

_mapping

of

_Quantitative

Trait Loci. Genetics

141,

1189-1197