Genotypic covariance matrices and their inverses for models allowing dominance and inbreeding

(1)

HAL Id: hal-00893829

https://hal.archives-ouvertes.fr/hal-00893829

Submitted on 1 Jan 1990

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Genotypic covariance matrices and their inverses for

models allowing dominance and inbreeding

Sp Smith, A Mäki-Tanila

To cite this version:

(2)

Original

article

Genotypic

covariance matrices

and their

inverses for

models

allowing

dominance

and

_inbreeding

SP Smith

A Mäki-Tanila*

Animal Genetics and _{Breeding Unit, University}

_of

New

_England,

_Armidade, NSW _2351,Australia

(Received

11 June _{1988; accepted}3 _July

₁₉₈₉₎

Summary - Dominance models are _{parameterized}under conditions of

_inbreeding.

The

properties of an infinitesimal dominance model are reconsidered. It is shown that mixed-model

_methodology

is _justifiableas _{normality assumptions}can be met. Tabular methods for _{calculating genotypic}_{covariances among inbred relatives}are described. These methods

employ 5 parameters required to accommodate _additivity,dominance and

_inbreeding.

Rules for _calculatinginverse _genotypiccovariance matrices are _presented.These inverse matrices can be used _directlyto set up the mixed-model equations. The mixed-model

methodology allowing

for dominance and

_inbreeding

_providesa powerful framework to

better _explainand utilize the observed variation in _quantitativetraits.

dominance _/inbreeding / infinitesimal models _/inverse _/mixed _{model / recursion}

Résumé - Matrices de covariances génotypiques et leurs inverses dans les modèles incluant dominance et consanguinité. Les modèles de _{génétique quantitative}incluant la dominance sont considérés dans des conditions de _{consanguinité.}

_Après

une discussion des _propriétésdu modèle _{infinitésimal,}on _{montre que}la _{méthodologie}des modèles mixtes

peut être _appliquéeà cette _situation,dans la mesure où les _hypothèsesde normalité

peuvent être _satisfaites.On décrit des méthodes tabulaires pour calculer les covariances

génotypiques parmi des _{apparentés consanguins,}dont _l’emploinécessite l’introduction de 5 composantes de variances. On

_présente

les

_règles

du calcul direct de l’inverse de ces matrices de covariances

_{génotypiques,}

connaissant la _{généalogie,}et ces 5 composantes de la variance. Ces matrices inverse peuvent être utilisées directement pour établir les équations

du modèle mixte. La _{méthodologie}du modèle _mixte,_prenanten _compteles interactions

de dominance et la

_{consanguinité, fournit}

un cadre _pourune meilleure _explicationet une meilleure utilisation de la variabilité des caractères

_{quantitatifs.}

dominance _/_{consanguinité}_/modèle infinitésimal _/inverse _/modèle mixte /

algorithme récursif

*

Permanent address: Department of Animal Breeding, Agricultural Research Centre _(MTTK), 31600 Jokioinen, Finland.

**

(3)

INTRODUCTION

The mixed linear model has

_{enjoyed widespread}

_acceptancein animal

_breeding.

Most

_applications

have been restricted to models which

_depict

additive _{gene action.}

However,

there is also concern with non-additive effects within and between breeds and crosses

_(eg,

_Hill,

_1969;

_{Kinghorn, 1987;}

Miki-Tanila and

_Kennedy,

_1986).

Henderson

₍₁₉₈₅₎

_provided

a

statistical framework

for

_modelling

additive and non-additive

_genetic

effects when there is no

_inbreeding.

With

_inbreeding,

the mixed model allows statistical

_{analysis, however,}

considerable

_{developmental}

work

remains.

_Inbreeding

_complicates

covariance structures

_{(Harris, 1964).}

_Moreover,

inbreeding

depression

is a manifestation of interactions like dominance and

_epistasis.

Models which include

_only

additive effects and covariates for

_inbreeding

_(eg,

Hudson and Van

_Vleck,

₁₉₈₄₎

are

rough

_{approximations.}

The proper treatment of

_inbreeding

and dominance involves 6

_genetic

_parameters

(Gillois,

1964; Harris,

1964).

These _parametersdefine the first and second moments

of

_genotypic

values in the absence of

epistasis.

A

_{genetic analysis}

is

_{possible by}

repetitive

sampling

of lines derived from one

_population

_through

a fixed

_pedigree

(eg,

Chevalet and

_Gillois,

_1977).

_However,

we should like to

_perform

an

_analysis

where the

_pedigrees

are realized with selection

_and/or

random

_mating.

This could be done if an infinitesimal model was feasible and we could

_apply

normal

_theory

and the mixed model.

_Furthermore,

it would be useful to build covariance matrices

and inverse structures

_easily,

to enable use of Henderson’s

₍₁₉₇₃₎

mixed-model

equations.

This _papershows how to

_justify

and

_implement

these activities. It is

an extension of Smith’s

₍₁₉₈₄₎

_attemptto

_generalize

models with dominance and

inbreeding.

DOMINANCE MODELS Finite loci

In this section we introduce the 6

_genetic

_parametersneeded to model

_additivity,

dominance and

_{inbreeding depression.}

These _parametersare functions of _gene

frequency

(p

i

for the

i

th

_allele)

in much the same way that

heritability depends

on _gene

_frequency

for

_purely

additive traits.

First,

consider the

_{genotypic effect,}

_g_ij for 1 locus

_{represented by}

where p

is the _{mean, a}i and aj are the additive effects for the

i

th

and

_j

_th

_allele,

and

d

ij

is the

_{corresponding}

dominance deviation.

_Equation

₍₁₎

_representsa _systemof

r(r

+

_1)/2

equations

in r + 1 +

_r(r

+

_1)/2

unknows

_(ie,

_!,_a_i_,_a_j_,

_d

_ij

₎

where r is

the number of alleles. To

_uniquely

determine _p,aiand

_d

_ij

requires

additional r + 1 constraints

_given

as:

These constraints are derived from effectual definitions

_applied

to

_populations

in

(4)

It follows that in

_{populations undergoing}

random

_mating,

the additive variance is:

and the dominance variance is:

To accommodate

_inbreeding

_requires

3 additional _parameters:

_(i)

the

_complete

inbreeding

depression:

(ii)

the dominance variance among

homozygotes:

and

_(iii)

the covariance between additive and dominance effects among

homozy-gotes:

It is convenient to work with the

_{parameter 6}

₂

=

06 + 6

which

is a second moment.

The

_symbol

&dquo;&dquo;’&dquo;

is a reminder that the associated

_{parameter refers}

to 1 locus. When there are n

_loci,

_{then the}

_parametersof

_interest,

_{say v,}

are

the

_single

locus

terms v

_{= (8fl, Qa, u6, 7.2}

82 _{-2 -}

&dquo;summed&dquo; over loci. All _parameters

terms

fol2,

fa2 a,

2, a2 - d, Ub, 2, U67 1 62

62

or

a2 bi

aa6l

&dquo;summed&dquo; over loci.

All parameters

in v =

{!a,

a-d,

U6

,

U

6

2

or

a-6,

_aa

₆₎

are formal sums. The column vector of

inbreeding

depression

(u

6 )

which is defined as a list of

E

6

for loci

1, 2, ... , n,

is

also

very

_useful.

Among

the

parameters,

we have the

dependancies

ua

=

u u

6

and

or 6 2 = 62 _ U2 6*

The _{parameters v}describe a

_{hypothetical population}

of infinite size

_undergoing

random

_mating

and

_inlinkage

_equilibrium.

This

_population

is sometimes referred

to as the base

population,

but we find this usage

misleading.

In the

spirit

of

Bulmer

_(1971),

let us introduce

_segregation

effects

defined as

deviations

from

mid-parent

values. In

_fact,

both additive and dominance effects have

_mid-parents

values,

as will be seen later. Now we can define v as _parametersthat determine the

stochastic

_properties

of

_segregation

effects for an observed

_sample

of animals from a

known

_pedigree.

Whether or not these

_segregation

effects are

_{representative}

of some

ancestral

_population

_(perhaps

several

_generations

_old)

_is,

of course,

questionable.

Indeed,

ancestral effects associated with a

_sample

of animals can be treated as fixed

(Graser

et

_al,

₁₉₈₇₎

_{and, hence, segregation}

effects and estimates of v can be far

removed from the ancestral base. This

_{interpretation}

is robust under

_selection,

with the added

_assumption

that

_linkage

_{disequilibrium}

in one

_generation

influences the

_through

the

_mid-parent

values. Our

_assumption

need

_only

be

approximately

correct over a few

_generations

_(perhaps

far removed from the

base).

It is

_important

to

_point

out that these views are definitional and no method of

(5)

The

_disruptive

forces of

_genetic

drift on our _usageof v are

_probably

of

_negligible

importance;

a small

_population

is

_just

another

_repetition

of a fixed

pedigree sampled

from the base

_population.

Infinite loci

It is feasible to define an infinitesimal model with dominance

(Fisher,

1918).

When

there is directional

_dominance,

we

_might

observe

1 U6

I

_going

to

_infinity

or

U2

_and

a!d

_d2

going

to zero

_(Robertson

and

Hill,

1983).

However,

it is our belief that this

_problem

is characteristic of

particular

infinitesimal

models,

not all infinitesimal models. To

show

_this,

we have constructed a counter

_example.

Because

_{o, 2,a}

₂

_,

_a2,

_a_a6and _U6 are formal _sums,_{it is necessary}

(but

not

_sufficient)

for the contributions from

_single

loci to be of the order

n-

1

where n is the number

of

_{loci; ie,}

if the limit of v is finite.

_Whereas,

it

_might

seem reasonable to

_require

location effects like

_U

₆

to

_approach

0 at a rate of

_n-

₁

_/

₁

_,

this is not necessary and it _mayresult in infinite

_inbreeding

_depression.

Now let us

_imagine

an infinite number of

_loci,

each with 4

_possible

_alleles,

that

could be

sampled

with

equal

likelihood.

_Assuming

that the dominance deviations

for each locus are as

_given

in Table

I,

these deviations are consistent with constraints

(2).

In this

_example

it is

_possible

to use _anyadditive effects also consistent with

_(2),

where

_a2

_is

_proportional

to n-1. For a

_{particular locus,}

the

inbreeding

depression

and dominance variances are:

_U

₆

=

-1/(2n);

a’

=

1/(4n

2 )

₊

3/(8n);

a2

=

1/(4n

2 )

+

_1/(2n).

Summed over n loci these

become:

u6 =

-1/2;

Qd

=

1/(4n)

+

_3/8;

_a

₆

=

_1/(4n)

_{+ 1/2.}

_Letting

n drift to

_infinity

_gives

the

_following

non-trivial _parameters:u6 =

-1/2;

a

§

=

3/8;

or2

=

1/2.

This

provides

our counter

example.

There does not not seem to be an

_analogous

_example

_involving

_only

two alleles.

_However,

the biallelic situation is

_{uninteresting}

because it

_implies

a

singularity:

-2

= a2a2

smgu arlty:

Uab - uau8’ 6* >

The above demonstration may seem artificial because it is

spoiled by

global

changes

in _gene

_frequency

_(WG

_{Hill, 1988,}

_personal

_{communication).}

_However,

we can construct other more elaborate counter

examples.

For

_instance,

let loci _{vary in} their contribution to the _parameters.Let there be infinite loci indexed

1, 2, ... ,

n,

where

_Qd

_,

_{!6 !}

_0,

and there is no directional

_dominance;

_ie,

u6 = 0.

Among

(6)

where k

2 < n

<

_(k

+

₁₎

₂

_.

_By

_redefining

the contributions from

_single

loci to

E

6

=

-n-

l

/

2

₊

u

s

and

a2 a

2 <

n-

1 /

2 _(perhaps

_{by altering}

gene

frequency),

we notice that ub =

-1,

and

Qd

and

!6 are

non-zero at the limit when n _goesto

infinity.

We can create _yetanother

_subsequence

with indices

2, 5, ... ,

k2

_{+ 1,}

where

k

2 +1 <_

n <

_{(k+ 1)}

₂

_{+ 1.}

_Taking

the

_previous

alterations and

_assuming

for the latter

sequence

E

6

=

1/2n-

1 /

2 + E

6 ,

the limit value

of u

6

is

-1/2.

It is

possible

_toselect _a

finite number of

_subsequences

and make similar alterations. Each of these sequences

becomes

_{infinitely long}

as n

_{approaches infinity. Hence,}

infinitesimal

_models,

where

0 <

_J

_U61

< ₀₀_,

U2

q 0

0 and

_{Qa 5!}

_0,

are feasible.

With an infinitesimal

_model,

v is a function of summary statistics that involves gene

frequency.

Individual gene

frequencies

have little or no effect on v.

Moreover,

genotypic

effects summed over loci follow a normal distribution. This

_implies

that selection and

_genetic

drift can be accommodated

_by

the mixed

_model,

as

suggested by

Bayesian

arguments

(eg,

Gianola and

_Fernando,

_1986).

In

_particular,

the

_assumption

about the influence of

_linkage

_{disequilibrium,}

discussed

_earlier,

is valid under the infinitesimal model.

The real issue is not whether

_[u

₆

_is

infinite or dominance variances are zero, but

whether

_normality

and

_linearity

are

_appropriate

_assumptions

_given

a finite number

of loci. If

_[u

₆

is estimated from real

data,

it will be found to be

_infinite,

_although

it

may be very

large.

Furthermore,

if dominance variances are

_found

to be non-zero,

and if _manyloci are

_involved,

then it would seem that a contrived infinitesimal

model

_(like

the ones

_above)

is

_appropriate.

Normal

_{approximations}

are

_adequate

under most realistic models for

_genetic

_variation;

there

_being

a small number of

major

loci and a

_large

number of minor loci

_(Robertson,

_1967).

_However,

with a very small number of

loci,

these

approximations

become less

adequate

with each

additional

_generation

of selection.

GENOTYPIC COVARIANCE STRUCTURES

Harris

₍₁₉₆₄₎

_developed

recursion formulae for

_evaluating

the

_identity

coefficients needed to determine covariances _amonginbred relatives. In a later paper,

Cocker-ham

₍₁₉₇₁₎

elaborated on these methods.

_{Using zygotic}

_networks,

Gillois

(1964)

also devised a scheme to evaluate

_{identity coefficients,}

and Nadot and

_Vaysseix

(1973)

published

an

_algorithm

for

_implementing

Gillois’s

_procedure.

In this _paper,tabular methods for

_evaluating

second moments are

presented.

These

_techniques

allow the exact evaluation of

_genotypic

covariances without

cal-culating

individual

_identity

coefficients. The first class of methods are

_conceptually

easy and are modelled after the

_genomic

table described

by

Smith and Allaire

(1985).

The second class

_(those

based on

compression)

are

conceptually

more

(7)

Methods based on

_gametes

Each animal in a

_pedigree

receives 1

_genomic

half or _gametefrom each of its _parents.

Thus,

every animal has 2

genomic

halves and the total number of such halves is

r =

_2s,

where _sis the number of animals.

Let ai be a column vector of additive

_effects,

such that the

It

h

element of ai

equals

the additive contribution of the

l

th

locus in the

i

th

_gamete.If there are n

loci,

then _a_ihas

_length

n. Under an infinitesimal

_model,

aiis

infinitely long.

Define

d

ij

as a vector of dominance

_{deviations, typical}

of the union of

_gametes

i and

_j.

The

it

h

element of

_d

_ij

_equals

the dominance contribution of the

E!h

locus. If i and

j

are

_genomic

halves from different

_animals,

the vector

_d

_ij

_depicts

the dominance deviations for a

_phantom

animal.

Like

_animals,

_gameteshave a

_{pedigree; genomic}

halves in one animal form a

parental pair

for

_producing

_gametes.Let us assume the _gametesare ordered such

that i >

_j,

if _gametei is a descendant of _gamete

_j.

_Furthermore,

let us assume

i

_{> j}

_implies

that

_{gamete j}

is a base

_population

_gamete

if i is.

_Next,

_imagine

the ordered _sequence:

where I is an

_identity

matrix of order n. This is a _very

_long

list

_comprising

of

(r+1)(r+2)/2

arrays.

Fortunately,

we need

only

select a much smaller

subsequence,

G =

{I,

_g_l_,_g₂_,...,

gp}

from this

list;

_ie,

the arrays that _are

_actually

needed for

recursive calculations. An

_algorithm

for

_extracting

G is

_presented

in

_Appendix

A. The elements of G are used as row and column

_headings

in a table

depicting

the

second moments

_E{G’G}

which is

_represented

_by:

This table is referred to as the extended

_genomic

table

_(cf

Smith and

_Allaire,

1985),

and is denoted

_by

E.

Elements of E are

_{computed by}

recursion.

_Starting

with the first _row,elements

are evaluated from left to

_right.

When the first row is

_completed,

the first column is

filled in

_using

_symmetry.The

_remaining

elements in the second row are evaluated

from left to

_right

and the second column is then filled in

_using

_symmetry.

This

process is continued for each additional row and column. The recursions used to

compute E are listed

_below,

where B is defined as the index set of all base

_gametes,

i >

_{j, k,}

_m,k > _m,and _{parent gametes}of i are x and y. The

proofs

of these formulae

are due to

_properties

_involving

sums of

_expectations

and conditional

_{expectations.}

(8)

where i - x _or_{y represents}the event that the

t

th

locus of

_gamete

i is identical

_by

descent to that x or _y,

_{respectively.}

The

_product

_gig

_w

is intended to involve the

gametes

i, j, k

and m

(v

and w are used to

_identify

the associated columns of

_G).

(i)

First n rows:

(ii)

Subsequent

rows:

(a)

Additive and additive

(9)

(c)

Dominance and dominance

the recursive formulae in

_(c)

_appearsin Smith

_(1984).

When

_ie0,

the above recursions are initialized

_assuming

that

_gametes

are

_sampled

at random from a

_{single population.}

For this case, we have additional

_{simplification}

for all values of i:

Now that the recursive structure of E has been

_shown,

it is

_possible

to describe the

al

f

orithm

of

_Appendix

A. Define

_{f (v)}

as the _youngest

_gamete

associated with

the vtcolumn of

_G,

_sayg&dquo;. The matrix or list G is said to be closed under

_gametic

recursion if the terms used to

_expand

any gv

by

parent gametes of

_{f (v)}

are also

of G. More

_formally,

_(g

_v

_{I f (v) =}

_x)

and

_(g

_v

_{I f (v)}

=

y)

_arecolumns of G when

f(v);(),

and has _parent

_gametes

x and _y.The

_algorithm

in

_Appendix

A is called

a

_depth-first

search and it

_produces

sets of vectors closed under recursion.

_Any

element needed to evaluate any recursion can

_always

be

_found

in E. The

_algorithm

of

_Appendix

A can also be used to define the

_subsequence

G

introduced below.

It is

_possible

to combine additive and dominance effects into

_{genotypic effects,}

say

i

ij

= _ai₊ a

j+

_d

_ij

_,

and use these as row and

column

headings

of a new table.

The

_headings

are ordered as some

subsequence,

_say

G,

of

The recursions for

E{G’G}

are

_exactly

as

_they

are for

Efd

ij

}

and

Eld

’

3d

k

,,,},

(10)

After

_building

a matrix of second moments, the

_(co)variance

matrix

_(for

_genetic

effects summed over

_loci)

is obtained

by

absorbing

the first n rows. The

_resulting

array is a function

of u

6 only through

u2

=

u6 U{j.

The

vw

th

element of the absorbed

array E is:

which reflects the

_assumption

that _genotypesare additive over

_{independently}

segregating

loci. This

_assumption

can be

relaxed,

as

linkage disequilibrium

can

sometimes be accommodated via conditional

_{(ie, Bayesian)}

_analyses.

In

_practice,

we never evaluate the entire _arrayE or

E{G’G}.

In

_particular,

the

first n rows and columns can be

represented

_implicitly

_by

one row and column: rows of:

are

simple multiple

of each other. Our purpose is to show structural

_properties

that allow inversion rules.

_{Nevertheless,}

the above recursions are

_helpful

in

_evaluating

particular

moments; eg, those needed to _computethe inverse. This can be

accom-plished by

adapting

Tier’s

₍₁₉₉₀₎

recursive

_{pedigree algorithm:}

one calculates

only

needed moments and avoids redundant calculation. _{We may}add to our

_recursions,

shortcuts for

_particular

_degenerate

cases:

These remarkable results do not

_depend

on i _>

_{j, k,}

m or

k >

m.

They

are

due

to the

_principle

of conditional

_independence

and to the rule that

_{probabilities}

are

additive for

_mutually

exclusive events. The first rule

_appeared

in Maki-Tanila and

Kennedy

(1986).

It is similar to a rule in Crow and Kimura

(1970,

p

134)

based

on additive

relationship,

_although

rule

(i)

is more robust under

_inbreeding.

We also

(11)

where 77

=

QabQa

2r

’

Y=

82U;

;

2,

a=

o-!o-!!

and

_p=

ubQ! 2.

Because

_E,

_excluding

the first n rows and

_columns,

is at most of the order

_r

₂

_/2

by

r

2 /2,

where r is the number of _gametes,one

_{might incorrectly}

conclude that

proposed

calculations are

_prohibitive

_(of

the order

_r

₄

_/8)

and of no

practical

value.

Recursive

_algorithms,

like the

_depth-first

search in

_{Appendix A,}

can be

_surprisingly

fast. The value

_{of r}

₄

_/8

should be

_regarded

as an _upper

_boundary

that _protectsthe

algorithm

from combinatorial

_{explosion -}

the kind of

_explosion

that

_might

occur

when

_{enumerating genetic pathways}

in a

_{pathological pedigree.}

Compressed

tables

The

_genomic

table

_{given by}

Smith and Allaire

₍₁₉₈₅₎

can be

_compressed.

We may

add

_together

elements in 2

_by

2 blocks

_{corresponding}

to

_animals,

and

_multiply

this matrix

_by

_1/2,

to

_give

the numerator

_relationship

matrix.

_Compression

_{of E}

is

also

_feasible,

and has

_already

been demonstrated above for a case

_involving

G. In

general,

E is

_{compressed by}

_combining

columns of G to create a new matrix C. To

be

_useful,

C should be smaller than G and contain

_pertinent

effects.

It is

_possible

to devise recursive methods for

_evaluating

_E{C’C},

when C

is not closed under recursion.

However,

methods become more meticulous. For

example,

since the vector of additive merits for animals is not closed under

_gametic

recursion,

we need to add

_inbreeding

coefficients to the

_diagonals

when

_calculating

the numerator

_relationship

matrix.

Whereas,

when

_compression

is defined as the addition of all G

_columns,

it is

possible

to

do

this

_{stochastically,}

as Harris

(1964)

has done. For

_{example, Harris,}

by

preferring

a

_{zygotic analysis}

over a

_{gametic analysis,}

devised a scheme where entities were created

_by

a random

_sampling

of _genesfrom

_existing

_genotypes.

Compression

is an

_important

area and it needs to be

_developed

further. Some

concepts will be illustrated later

_by

an

_example.

INVERSE STRUCTURES General rule

(12)

Matrix E contains second moments and not

_{(co)variances}

as

_{required by}

the

mixed model

_{equations. However,}

_deleting

the first n rows and columns of E-1

gives precisely

an inverse matrix of

_{(co)variances.}

The extended

_genomic

table

is characterized

_by

blocks

_along

the

_diagonal.

_{By inspecting}

labels attached to vectors in

_G,

it is seen that

_they

come _{in groups.}For

_example,

the _groupassociated

with _gametei is a

_subsequence

of ai,

,

d

li

d2it ...

d

ii

.

Likewise,

when

considering

E =

E{G’G}

_wefind blocks

_along

the

_diagonal

_{associated with}_gametes._Recursions

above the

_diagonal

blocks are functions of column indices and not of row indices.

Now consider a submatrix

A

k

where

for some L and

_A

_k

contains the first k + 1 blocks. The matrix L is a

_simple

matrix defined

_by

column indices. If k = 1 note that:

for some

_L

_o

_,

where

A

o

corresponds

to the base

_assignments,

and

B

o

is the second block.

Let us assume that

Ao

is given

(perhaps

without the first n rows and

columns)

and note that:

-

-With

_(B

_o

_—

_LoA

_o

_L

_o

_)-

₁

_evaluated,

we find that

_All

is

a

simple

function of

AÕ

1 .

Given

_Ao

₁

_,

it is

_possible

to _compute

_A2

₁

_,

where

and

_B

₁

is the third block. In

_general,

_given

_A-’

we

can evaluate

A-1

1 ,

where

and

B

k

is block k + 2. The

_general

inversion formula is:

To evaluate

_E-

₁

_,

_apply

this rule

_{recursively starting}

with k = 0.

It is

_hoped

that

B

k

-

_k

_L

_k

_L[A

will be

sufficiently

small or

sufficiently

sparse

so

that its inversion is feasible

_(eg,

Tier and

_Smith,

_1989).

For

_evaluating

_E-

₁

_,

the

worst scenario is that the order of

_B

_k

_-

_LkA!L,!

is r +1.

_However,

this occurrence

is

_unlikely.

Note that Henderson’s

₍₁₉₇₅₎

rule for

_calculating

the inverse numerator

(13)

There are some notable

_{simplifications}

when E-1 is to be evaluated.

_First,

evaluation of

_Ao

is

best done

by

absorbing

the first n rows and then

_deleting

the first n rows and columns. The

resulting

matrix is some

_permutation

of a block

diagonal

matrix

_involving

2

_by

2 matrices:

and 1

_by

1 matrices

_{a§ and}

_ad.

This is a trivial matrix to invert.

Second,

k

-

B

_k

_L

_k

_A

_k

_L

has a

_peculiar

structure that can be identified

_by

examining

the recursive definition in Section IIIA. If block

B

k

corresponds

to

gamete i which has _{parent gametes}x and _y,then

L

k

is a matrix that

&dquo;picks&dquo;

appropriate

terms from

A

k

that involve x and _y.

_{Moreover, B}

_k

is also defined

by

terms that involve x and _y.Assume that the column

headings

for Bk are:

It

_might

be that i =

j

m

.

Now define the column

headings

where

_x

₌

_H

_(Fi

_I

i =

_{x) =}

_{{a!, d}

_d

_x

_l’

_j

_Xj

_2’

_....

_d!,,.}

_}

and

_{Hy = (Fi}

_I

-

_{y) _}

_{ay,

_dy

_Ù’

_d

_Y

_j

₂

_{7 ...}

_dyj&dquo;,

_I

In the definition of

_H

_x

and

_H!,

it is understood that

_d

_Xj

_&dquo;,

=

d!!

and

d!j&dquo;,

=

dyy,

if i =

j&dquo;,.

Select elements from

A

k

and build the

matrix,

where

_M!!

=

E{HxH!}, M

_x

_y

=

M!!

=

E{H

x

Hy},

and

Myy

=

EIH’Hy}.

A direct

_application

of the recursions

_gives:

Futhermore,

as

L

k

is a matrix that

_{&dquo;picks&dquo;}

terms under

_headings

H!

and

_H!:

and thus:

Equation

(5)

can also be derived

if B,!-L!A,!Lk

is

recognized

as the

(co)variance

matrix for the

_segregation

effects due to recombination of

_{gametes x}

and y in the

formation of _gametei. The

_mid-parent

values of

_F

_i

are the column vectors of

1/2H!

+

_1/2Hy.

As the

_{segregation effects,}

S =

F! - 1/2H! - 1/2H!,

have a mean

of _{zero, the}

_(co)variance

matrix is:

where S =

1/2H! -

(14)

Finally,

in rule

_{(3) L}

_k

_(B

_k

_-

_{L!A,!Lk)-1L!,}

_k

_-

_k

_(B

_-L

_L’A

_k

_L

_k

_)-

_l

and

(B! - Lj!A!L!) !

are contributions added under H

_by

H

_headings,

H

_{by F}

_i

head-ings,

and

_F

_i

_by

_F

_i

_headings,

_{respectively.}

Existence of inverses

When there is no

_inbreeding,

E-1 can be shown to exist.

_First,

we _presentthe

following

Lemma:

Lemma 1: In the absence of

_inbreeding,

_there

exists a matrix

_M

_k

_,

which is a

submatrix of

_A

_k

_,

and there exists a matrix

_X

_k

of full column rank such that:

where

_B!,,L,!

and

_A

_k

are associated with E.

Proof.

Because

_equ(5)

is

_given,

we

_only

_provethat such

M!

and

X

k

can be found

where XkM

k

X

k

=

1/4(M!! -

M!! - M!!

+

_Myy).

The matrix

_M

_k

_,

defined

_by

eqn(4),

will be ’a submatrix of

_A

_k

if there are no indices

_j

_m

= i,

j

v

= x

and j

w

₌_y

used in the

_definition

of

_F

_i

_.

The

_{algorithm presented}

in

_Appendix

A will not create

indices

_j&dquo;,,

_{= i, j}

_v

= x

and j

w

=

y when there is no

_inbreeding.

For this _case,we can take

_M

_k

=

M

k

,

and ’

=

1/2{I, -I},

where the

identity

matrix I has order

m + 1. Matrix

_X

_k

has full column rank

_((a.E.D).

Theorem 1: If E is constructed

_{by applying}

the recursion rules to some finite and non-inbred

_pedigree,

then E-1

exists,

provided:

Proof.

The matrix

A

o

is

_non-singular

when the condition of the theorem holds. Now assume that

_A-’

_exists,

then

_by

the inversion rule

₍₃₎

_A!+1

exists if

_(B

_k

_-

_L’A

_k

_L

_k

_)-’

exists.

_By

the above

_Lemma,

_B

_k

_-

_LkA

_k

_L

_k

=

3CkM

k

Xk-Because

_M

_k

is a submatrix of

_A

_k

_,

it is

_{non-singular. Therefore,}

(X!M!X!)-1

exists because

_X

_k

has full column rank. We conclude that the existence of

_Ak

1

implies

the existence

_{of A!+1.}

As

_AÕ1

_exists,

the theorem follows

_by

mathematical induction

_(Q.E.D).

The reader

_might

think that the concurrence of identical twins would contradict

Theorem 1.

_However,

this is not the _case,as the

_theory

assumes that _gametesare

distinct and can be ordered

_using

indices.

_Thus,

for identical

_twins,

the recursive

formulae

_presented

earlier are

incomplete.

This is not a

_practical

_problem,

as

identical

_gametes

can be

_{represented only}

once in G.

Henderson

₍₁₉₈₅₎

_considered

a non-inbred

population

and studied a dominance

relationship

matrix D. He used D-1_{in many}formulae without

_proof

of its existence.

However,

as D is a submatrix of

_E,

the theorem

_implies

that D-1 exists.

When there is

_inbreeding,

the

_algorithm

in

_Appendix

A will

_produce

labels like

dii,

di.

and

_diy,

where

_i!9

and has _{parent gametes}x and _y.In

general,

E is

_singular

(15)

Even with

_inbreeding,

4 labels like those in

_eqn(6)

need not occur

_together

as

headings

of E.

_However,

it is

_possible

to _mergethe

_identity

₍₆₎

with the recursion

and remove

d

ii

for

I(0

from the _sequenceG to _get

G -

see

_Appendix

A. The vector

G

is closed under recursion if

_eqn(6)

is used to redefine

_d

_ii

for

_i18.

The matrix of second moments

E

=

Eli7x4U}

has no row and column

_heading

_d

_ii

for

_I(0

and it is

_{non-singular.}

To prove

this,

we introduce Lemma 2 for

E. _{First, let}

us

_imagine

that

B

k

-

_LkAkL!

corresponds

to a

_particular

absorbed block of

E.

Lemma 2: There exists a matrix

M

k

,

which is a submatrix

_{of A}

_k

_,

and there exists

a matrix

X!

of full-column

rank,

such that

B

k

-

_k

_L

_L’A

=

-

4

where

B

k

,

L

k

and

_A

_k

are associated with

E. Proof.-

Because the

_pedigree

is

inbred,

we _expect

_{indices j,}

= x and

_j

_w

= _y

(x,

y parents of

_i)

in the definition of

F

i

- otherwise,

follow the

_proof

of Lemma 1.

By

construction,

there is no index

_j

_m

= i. This

implies

that there will be vectors

d

x

y,

d

xx

=

dxx, +dxx,,-dx!x&dquo;

and

_dyy

_{= d!yx+d!yy-dyzyv}

in

_H,

_{where x}

_x

_,

_{xy and}

y x

, yy are _{parent gametes}of x and _y,

_{respectively.}

With no loss in

_generality,

switch

the vectors aiwith

d

ix

,

and

d

i

y

with

di!&dquo;,

in

F

i

.

So as to maintain

_consistency

with the definition H =

{H

x

=

(Fi I

i =

x),

_{Hy = (F}

i

=

y)},

columns of H _maybe altered

_by

_switching:

These

_permutations

preserve the

identity

where

_X

_k

=

1/2{I, -I}

and

M

k

=

E{H’H}

if _wefurther

stipulate

that selected

columns of

G

have also been

_rearranged.

Note that columns and rows m + 1 and

m+2 2 in

_M

_k

are

_redundant,

as

_they

are both

_represented

_by

_d

_x

_y

=

dy

x

.

Therefore,

we _maydelete row and column m + 2 to create

_M

_k

_(delete

column m + 2 in

_H)

and note that

where

and I has order m — 2.

_Clearly,

i

k

is of

full-column

rank and is a candidate for

X

k

.

When x and _{y are}base

_gametes

(ie,

, xx_x!,Yx

and yy

are

unknown),

we are finished as we can take

M!

=

R

k

,

which is a submatrix of

_A

_k

_.

_{Unfortunately,}

when at

least one of the

_zygotes

_(x

_x

_,xy)

and

_,yy)

_Yx

₍

are

known,

M

k

is not a submatrix of

A

k

because of the

_composite

vectors

d!x

_{and/or dy}

.

We assume that both

(x

x

,

_yy)

and

_(y

_x

_{, y}

_x

₎

are known in the remainder of the

_proof

_(when

_only

1

_zygote

is

_known,

(16)

It

_might

be that

_d

_xx

_&dquo;

and

_d!!&dquo;

exist

_already

between columns 2 and m in the

redefined H.

_Likewise,

_d

_.

_y

_.

and

_d!!&dquo;

_mayexist

_beyond

_position

m. If _anyof the

labels _xx!,_xxy₇_yy_x and _yyydo not

_exist,

we introduce them as dominance vectors

in H.

_Further,

we may redefine the first and last columns to _representlabels _z_z_zy

and _y_x_yy,

_{respectively.}

This

_gives

us a modified matrix

Because

G

is

_{c_losed}

under

recursion,

all columns of

H

are in

G. _Moreover,

all columns of

H

are

_unique;

_XxX

_{y =1=}

_yxy!,because animals cannot mate with

themselves. We conclude that the matrix

is a submatrix of

A

k

.

Now define the matrix

/.. . I ... r. B.

where m and f are null _{vectors, except}for ones at

_positions

_{corresponding}

to zz

z

, Xxyl yyx and yyy; M and F are like

_{identity matrices,}

_exceptthat rows

corresponding

to _zz_z_,_xxy,_YYx and _yyywould be deleted if the

_{corresponding}

columns were introduced into H. Now

X

k

has full-column rank. Since

the Lemma is

_proved

_(Q.E.D).

Theorem 2: If

E

is constructed

_{by applying}

recursion modified

_by

_equ(6)

to some

finite

_pedigree,

then

E

exists,

_provided

Proof.

With the first n rows

_{absorbed, A}

_o

is

_non-singular

when the condition of the theorem holds.

_Therefore,

A

o

is

_non-singular

when the first n rows are intact.

The remainder of the

_proof

is identical to that of Theorem

_1,

_exceptthat reference

is made to Lemma

2,

rather than Lemma 1

_(Q.E.D).

We should _expect

_{singularities}

when

_inbreeding

coefficients

_approach

_unity,

as

occurs with the numerator

_relationship

matrix.

_Indeed,

the matrix

S

=

1/2H

x

-1/2Hy

becomes a null matrix when _{gametes x}

_{and y are identically}

_equal.

As

E{S’S}

equals

eqn(5),

the absorbed block for

_gamete

i is a matrix of zeros and

E

is

_singular.

Our construction of

E

assumes that the base

_population

is a random set of unrelated _gameteswhich have been

_sampled

from some infinite

_population;

that is even

_though

the base

_population

is

_finite,

_inbreeding

in the base does not

exist. Under

_{diploidy, inbreeding}

coefficients cannot be

_unity

with finite

_pedigrees.

Feasibility

It is

_beyond

the _scopeof the _present_paperto demonstrate the calculation of E-1

(17)

pedigree

borrowed from a beef cattle selection

_experiment

conducted at the

Agri-cultural Research

Centre,

Trangie,

NSW

_(PF

Parnell

_(1988),

_personal

communica-tion).

The

_pedigree

involved 1 122 animals with records and 625 ancestors without

records. The _{average number}of

_generations

on the female side was 9 and the maxi-mum was 16. On the male

side,

the average was 8

_generations,

with a maximum of

13. There were

_only

55 base _gametes,and the average

inbreeding

of the

population

was 0.1

_(the

maximum was

0.26).

The order of E and

E

was about 190 000. Matrix

A

o

was of the order 1595.

However,

excluding

A

o

,

the maximum block sizes were 321 and

323,

respectively.

The distribution of block sizes is

_presented

in Table II. Most of the blocks were of

the order 2 and 3. Given the absorbed

blocks,

and

_{ignoring possible singularities,}

matrices E-1 and

E

can

be evaluated. The

computations

are not

_trivial,

but are

feasible.

There are other

_approaches

to

_modelling

dominance that do not involve E-1

or

E-

1 .

For

_example,

we

_might

extract from E

_only

those elements that involve animals or animals with records.

_{If the}

extracted matrix is put into a matrix

_D,

then we could attempt to evaluate

D-

1 _using

sparse matrix

absorption

(Tier

and

Smith,

1989).

Then

U-

1

could be used in the mixed-model

_equations.

_{Alternatively,}

D

could be used

_directly

in the same way that Henderson

(1985)

used D. Because

E is an

_enlarged

matrix with unrealized combinations of _gametes

_(the

so-called

phantom

animals),

alternatives to

E-

1

are attractive.

_However,

breeding

schemes that utilize dominance variation are bound to

_require

the

_prediction

of dominance effects that

_correspond

to untried

_{gamete pairs}

_(say,

among gametes of _parentsthat contribute _genesto the next

_generation).

Further research is needed to evaluate all

(18)

ILLUSTRATION

In this section we demonstrate our

methods, using

the

_simple

pedigree

displayed

in

_Fig.

1. This

_pedigree

involves 4 animals or 8

_genomic

halves. For

_simplicity,

we assume

or2

₌

or =

_1;

U2

₌

a

2

₌0. Given that _gametes

_{pairs 12, 35,}

64 and 78 represent dominance vectors for animals and are to be included in

_G,

the

_algorithm

of

_Appendix

A creates 14 additional

_{pairs -}

_ignoring

first _arrayI and additive vec-tors. The matrix G is

_given

as row and column labels of Table III. Second moments

of Table III are derived

_{by applying}

the recursion of formulae. For

_example,

the

element

_Eld’2ld5il

was calculated as

_1/2EId’

_l

_d

_ll

_}

+

_1/2EId’

_l

_d

_2ll

_{= 0 + 1/2}

= 1/2.

To evaluate E-1

_requires

the determination of

_(A

₂

_-

_LiBiL!)-1

for i =

_{0,1, 2, 3.}

The absorbed blocks can be evaluated

_recursively,

but for _now,the reader _may

obtain these

_{by applying}

Gaussian elimination

_directly

to Table III. The inverse blocks are:

The matrices

_L!

_for

i =

0,1,2,3,

_are

displayed

below the

diagonal

in Table IV. In accordance with formulae

_(3),

the elements of

E-

1

are

_given

above the

_diagonal.

We can now _{compress E}

_{by defining}

_d

_b

=

d

u

₊

d

22 (with

labels,

this is notated

as b: 11 +

_22),

_d,=

d

31

₊

,

32 d

_d

= _d₄₁ ₊

_d

₄₂

_,

_d

f =

dsi

+

d

52 ,

di

=

d

74

₊

d

76 .

These

_definitions,

as well as other

_{symbolic definitions,}

are

_given

as row labels for

the

_compressed

matrix of Table V. As an

_exercise,

the

_compressed

table can be

evaluated

_{by inspecting}

Table III and

_adding

_together

_appropriate

elements. For

(19)

(20)

(21)

The real

_challenge

is to find

_simplifying

recursions that allow evaluation of

com-pressed

tables. In order to

_apply

inversion formulae

_(3),

recursions to the

_right

of blocks

_along

the

_diagonal

should be maintained

_exclusively

as a function of column

index,

as in the _present

_example.

These recursions are

_{given implicitly}

below the

diagonal

in Table VI as matrices

L

i

for i =

_{0,1, 2, 3.}

For

_example,

the element

E(d )di )can

also be obtained

_{by adding}

The blocks

_(A

_i

_-

_LiB

_i

_L

_il

_-

₁

for i =

0,1, 2,

are:

Inverse elements of the

_compressed

matrix are

_given

above the

_diagonal

in

Table VI.

DISCUSSION

Although

we have not

_emphasized

the evaluation of various

_identity

coefficients,

gametic

recursion of such coefficients is feasible.

_{Moreover, gametic recursions,}

like

those used to define

_E,

are _{very easy to}understand and _may

_provide

an

approach

useful in

_teaching,

whilst there may be more

_complicated

(but

_{perhaps numerically}

more

efficient)

alternatives. But

_gametic

recursion need not be

_numerically

ineffi-cient and is

_certainly

not

_subject

to combinatorial

_{explosion. Depth-first searches,}

like the

_algorithm

of

_{Appendix A,}

can be used to

_implement

_gametic

recursion in an

efficient _way.For

example,

to evaluate

_inbreeding

coefficients via

_gametic

_recursion,

one would first conceive of a

_symmetric

table with

_r

₂

_/2

elements.

However,

r

2 /2

operations

are not

required

to evaluate

_inbreeding

coeflicients via

_gametic

recur-sion. All that is needed is to

_identify

labels

_{i j}

of the dominance vectors

_d

_ij

in G.

By moving

down a list of such

_labels,

it is

_possible

to evaluate the coefficients

_using

one work vector. We need

_{only modify}

this scenario for

_{general identity}

coefficients. The

_{simplest explanation}

of the two

_commonly

found and

_{complementary}

phe-nomena

(heterosis

and

_inbreeding

depression)

is that there is dominance of alleles

at many loci.

_Therefore,

dominance should be

_regarded

as an essential feature of

genetic

models for loci

_affecting

_quantitative

traits. Characteristics

_closely

con-nected with

_fitness,

such as

_{reproduction,}

would

_mostly

have

_only

non-additive variation

_left,

as the additive has been exhausted

_(Robertson,

_1955).

There are also

suggestions

that dominance of alleles that _{maintain normal enzyme}

_activity,

is a

universal biochemical _property

_(Kacser

and

_Burns,

_1981).

It is clear that

_additivity,

dominance and

_inbreeding

can be modelled

_by

applying

mixed-model

_methodology.

The ramifications of such a

development

are

(22)

(23)

(24)

(i)

Determination of

_optimum

and

_{dynamic mating}

structures which

_capitalize

on additive and dominance variance while

_providing

for

_inbreeding.

Some of these

breeding strategies

can be studied

_by

the use of

_{moment-generating}

matrices for

regular mating

systems. Cockerham

_{( 1971)}

has

_given

an

_example

of such a transition matrix for full-sib

_mating.

Possible

_application

areas are:

(a)

mate selection

_(Jansen

and

_Wilton,

_1985;

Smith and

_Allaire,

_1985);

(b)

group selection

(Jansen,

1985;

Smith and

Hammond,

1987)

when the

selection of a random

_mating

_gene

_pool

is created

_by

a finite number of _parents.

The

_objective

of _groupselection is to

_improve

both additive merit and the _average

specific

combining ability

of _{genes in}the

_pool;

(c)

crossbreeding plans

to utilize between breed additive and heterotic effects

(Kinghorn,

1987);

(d)

selection of clones and animals created

_by

futuristic

_techniques.

(ii)

To allow dominance variance to be

_partitioned,

and

_thus,

remove some of

the

_confounding

that would otherwise _corruptstatistical

_analyses.

(iii)

To

_improve

our

_{understanding}

of traits which are

_likely

to show considerable

amounts of non-additive

_genetic

variation. This

_{understanding}

can be advanced

_by

simulation

_studies,

_eg,

_concerning

selection bias and finite loci. Other studies may

involve simulation of infinitesimal models with dominance via

_sampling

from normal

distributions.

The

_theory

we

_present

here is still _very

underdeveloped.

New methods are

needed for the estimation of

_genetic

_parameters,as _past

_approaches

have met

with _verylimited success

(eg,

_Gallais,

_1977).

It is our belief that an extension of

the derivative-free

_algorithm

of Graser et al.

₍₁₉₈₇₎

is

_possible,

and

_{consequently,}

genetic

parameters could be estimated

_by

restricted maximum likelihood. Research

is needed in the area of

_{computing strategies}

(eg,

_compression,

inversion).

Further,

models which allow different _parametersfor different

_populations

would be useful in

crossbreeding

studies. A multivariate extension of our work would also be desirable.

The

_forgotten

_papersof Harris

₍₁₉₆₄₎

and Gillois

₍₁₉₆₄₎

have been resurrected. While the methods are

_arguably

_{complicated, they}

_are,

_however,

feasible.

_Whereas,

classical

_quantitative

_genetics

is unable to

_fully

utilize information on dominance

and

_inbreeding

in

_predictions,

the

_appropriate

mixed model under a wide _rangeof

assumptions,

including

selection and environmental

_noise,

does that. ACKNOWLEDGEMENTS

The authors thank Keith Hammond for his _{encouragement.}This work was

sup-ported

by

the Australian Meat and Livestock Research and