• Aucun résultat trouvé

Relationship matrices and Iterative construction of their inverses

N/A
N/A
Protected

Academic year: 2021

Partager "Relationship matrices and Iterative construction of their inverses"

Copied!
42
0
0

Texte intégral

(1)

Relationship matrices

and

Iterative construction of their inverses

Pierre Faux & Nicolas Gengler

University of Liège, Gembloux Agro-Bio Tech, Belgium AFR Beneficiary (FNR)

University of Ljubljana, Slovenia November 13th, 2012

(2)

Summary

•  Theoretical measurement of similarity between

related

•  Construction of the matrix A

•  Direct creation of the inverse of A

•  Iterative construction of inverses using the

inverse factorization

(3)

Theoretical measurement of

similarity between related

individuals

(4)

Relationships and Similarity

•  When observing phenotypes:

–  A son has a certain similarity with his father

•  Relationships might help while predicting

unknown animals

•  How to scale this similarity? Theoretical

view:

–  What is similarity at a molecular level?

–  How to express this similarity from molecular to individual level?

(5)

•  Let us have a biallelic gene (either yellow or red)

•  2 possible configurations

–  Homozygous (identical)

–  Heterozygous (non-identical)

(6)

•  Homozygosity is imputed to the same common

ancestor

•  The « allele’s pathway » is known

Identical-By-Descent (IBD)

? A A X Y Z A A ? A ? A

(7)

•  IBD at a molecular level …

•  … is expressed as inbreeding at the individual

level

–  Inbreeding coefficient (F )= P(IBD), is function of d –  Inbreeding coefficient = ½ of relationship coefficient

(aX,Y) between parents,

–  Relationship coefficient is only additive part of relationship

Inbreeding

A A A X Y Z d

(8)

•  How to assess P(IBD) for animal Z?

•  Different gene configurations:

•  Thus, P(IBD) = 2/16 = 0.125 = FZ •  And aX,Y = 2 x 0,125 = 0.25

P(IBD)?

A X Y Z p=1/4 p=1/4 p=1/4 p=1/4 p=1 /4 p=1 /4 p=1 /4 p=1 /4 Given by Y G ive n by X

(9)

•  Count n , lowest number of steps to join X to Y through each of x common ancestors

•  Relationship coefficient between X and Y:

a X,Y = (2-n)

1 + (2-n)2 + ... + (2-n)x

Easier method to assess P(IBD)

A X Y 1 2 Halfsibs: 2-2= 0,25 X Y A B 1 2 1 2 Sibs: 2-2+ 2-2 = 0,5 2 1 A B X Grandchild to grandparent: 2-2=0,25 A X 1 Child to parent: 2-1=0,5

(10)

•  May ease more complicated cases

–  Example: various common ancestors at various levels

  aK,J= 2-5 + 2-3 + 2-3 = 0.28125 = 1/32 + 1/8 + 1/8 = 9/32

Easier Method

A B C F G D E H I J K

(11)

Construction of the additive

genetic relationship matrix

(12)

•  The previous method needs to be streamlined –  May be executed iteratively

–  Use of a symmetric matrix n*n (noted A) that will content

relationships between all individual

•  Requires to re-organize pedigree data

–  A given animal always needs to appear after both parent

•  Proceeds by filling successive squares

–  From square 1*1 to square n*n

–  For animal i, firstly row i from 1 to i, including element (i,i),

secondly column i from 1 to i-1 (using symmetry)

–  At each row, sum of weighted contributions from sire, dam, and

animal himself

(13)

1.  For a given population of n individuals (sorted

by generation order), define a square matrix of size n

2.  At line j, for all i younger than j, the relationship

between i and j is equal to the sum of the half of

the relationship of known parents of j with i

3.  At line j, diagonal element equal 1, plus half of

the relationship between parents of j

4.  At line j, paste line j (from 1 to i) in column j

5.  Go back to point 2.

(14)

•  For animal i, firstly row i from 1 to i, including

element (i,i), secondly column i from 1 to i-1

Tabular Method: Iterative process

1 2 3 4 5 6 1 2 3 4 5 6 (i,i) (i,i) (i,i)

(15)

  Numbered pedigree :

Tabular Method: Example

Animal Père Mère

A - - B - - C - - D - - E - - F B A G B C H D F I E G J E G K I H L J K

Animal Père Mère

1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 2 1 7 2 3 8 4 6 9 5 7 10 5 7 11 9 8 12 10 11

(16)

Tabular Method: Example

1 0 0 0 0 0,5 0 1 0 0 0 0,5 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0,5 0,5 0 0 0 1 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12

(17)

Tabular Method: Example

0 1 0 0 0 0.5 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1+0*1/2 2 3 7 1 2 3 4 5 6 7 0 0.5 0.5 0 0 0.25 1 7 = 0,5 * + 0,5 * + 1 *

(18)

Tabular Method: Example

1 1/2 1/4 1/8 1/16 1 1/2 1/2 1/4 1/4 1/4 1/4 1/4 1 1/2 1/4 1/4 1/8 3/16 1 1/2 1/4 1/8 1 1/2 1/2 1/4 3/8 1/2 1/2 1 1/4 1/2 1/8 1/8 5/16 7/32 1/2 1/2 1/4 1 1/8 1/2 1/2 5/16 13/32 1/4 1/4 1/2 1/2 1/8 1 1/16 1/16 17/32 19/64 1/4 1/4 1/2 1/8 1/2 1/16 1 1/2 17/32 33/64 1/4 1/4 1/2 1/8 1/2 1/16 1/2 1 9/32 41/64 1/8 1/4 1/8 1/4 1/4 5/16 5/16 17/32 17/32 9/32 33/32 42/64 1/16 1/4 3/16 1/8 3/8 7/32 13/32 19/64 33/64 41/64 42/64 73/64 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12

(19)

Direct creation of the inverse

of the matrix A

(20)

•  Additive genetic relationship matrix A will

structure the covariance of breeding values of animals in a related population

•  Used in the LHS of the MME for an animal

model:

Inverse of A

!

X X

X Z

!

!

Z X

Z Z +

!

!

" A

#1

$

%

&

&

'

(

)

)

(21)

•  Root-free Cholesky factorization of matrix A

•  Structures of matrices T and D:

  T: for any animal i, with sire s and dam d, the row is filled in with the same rules as for A :

  D: for any animal i, only one element, equal to

Factorization of A

A = TD !

T

T

i

= 0.5 T

s

+ 0.5 T

d

; T

i,i

= 1

(22)

•  Factorization of the inverse of A

•  Structures of matrices T-1 and D-1:

  T-1: for any animal i, with sire s and dam d, only 3 elements are filled in

  D-1: for any animal i, inverse of Di,i

Inverse of the factorization of A

A

!1

= (T

!1

) D

"

!1

T

!1

(23)

•  Structure of matrix T-1:

d, s are parents of i

(24)

•  Inverse of A can be viewed as a sum of sparse

matrix:

A-1 = (A-1)

1 + (A-1)2 + ... + (A-1)n

•  Each sparse matrix is created as:

(Ti)' x Di,i x Ti

•  Max. 3 non-zeros entries in Ti, each sparse

matrix is thus a block of max 9 non-zeros entries (with known positions: animal, sire, dam)

•  Time for creation is thus linearly related to size

of the population

(25)

Direct creation of the inverse of A

•  Three possible configurations (when no

inbreeding) –  0: = –  1: –  2: 1 1 1 1 -0,5 1 4/3 -0,5 1 = 1/3 -2/3 -2/3 4/3 -0,5 -0,5 1 2 -0,5 -0,5 1 = 1/2 1/2 -1 1/2 1/2 -1 -1 -1 2

(26)

Iterative construction of

inverses using the inverse

(27)

•  Let us define:

–  Zj = A, from row 1 to i and from column 1 to i –  yj = A, from row 1 to i, at column j

–  bj = (Zj)-1 yj

•  For this particular case, computation of bj is

trivial

•  Link with tabular method!

(28)

•  Let us now define Zk and (Zk)-1, (k=j+1)

where:

•  Relation between (Zk)-1 and (Zk)-1

Alternative way of direct creation

Zk = Zj yj y!j ajj " # $ $ $ % & ' ' ' ( (Zk))1 = (Zj) )1 +!jbjb! )j !jbj )!jb!j !j " # $ $ $ % & ' ' ' !j = (ajj ! b"yj j)!1 (Zk)!1 = (Zj) !1 0 0 0 " # $ $ % & ' '+!j !bj 1 " # $ $ % & ' ' !b( 1j " # $ % & '

(29)

•  As you can see ...

bj = - T-1 (at row j, from column 1 to i)

•  Situation is very trivial for the case of A

•  For other matrices, the aim is thus to determine

a set of bj with as much advantages as in A:

 easy to determine

 involving few computations  sparse

 having linear computation cost with size of matrix

(30)

Application to the case of

the inverse of G

(31)

•  A few words about genomic relationship...

–  Observed (or, at least, sampled) vs. expected

–  Some introduction with Bömcke and Gengler (2009)

The genomic relationship matrix

fMx,y;l = 1 4

(

Sac + Sad + Sbc + Sbd

)

; TAx,y;l = 2 fM TAx,y = TAx,y;l l=1 m

!

m ; TA = TA1,1 ! TA1,1 " # " TA1,1 ! TA1,1 " # $ $ $ $ % & ' ' ' ' A B a b d c

(32)

•  When working with a SNP matrix (M; n x m):

•  Van Raden’s G:

The genomic relationship matrix

M = 1 2 1 1 0 … 0 0 1 1 0 … 1 2 0 1 1 … ! " # # # $ % & & & ; Z = M '1 TA = ZZ'+ m m

f =!" MAF1 … MAFm #$ ; d = 2 % &f % (1 ' f )

P = 1 ! 1 ! " ( ( ( # $ ) ) ) *!" 2 f1 '1 … 2 fm '1 #$ ; Zg = M ' P ; G = Z &Z d

(33)

•  Purpose: decrease computation time

•  Solution: create the bj using OLS on

close-related

•  Why? Go back to tabular method...

... Model used in tabular method is back in the inverse decomposition!

•  Why? It ensures D to be diagonal:

Approximation of the inverse of the

factorization of G

(34)

•  Solutions are computed by OLS using a simple

model:

Approximation of the inverse of the

factorization of G

s ... d ... a ... s ... d ... a ... 1 1 1 1 1 1 G T* s ... d ... a ... s ... d ... a ...

- =

b

y

Z

(35)

•  Solutions are computed by OLS using a simple

model:

Approximation of the inverse of the

factorization of G

b = ( !

Z Z)

"1

Z y

!

b = (Z)

"1

(Z)

"1

(Z)y

# b = (Z)

"1

y

(36)

•  Solutions are computed by OLS using a simple

model:

•  Restriction to the close-family (Ω) of an animal

•  p is a « genomic » threshold that defines close-family

Approximation of the inverse of the

factorization of G

b = ( !

Z Z)

"1

Z y

!

b = (Z)

"1

(Z)

"1

(Z)y

# b = (Z)

"1

y

(37)

•  Use of the “backward” equation to get D:

Computation of D

-1

and G

-1

(38)
(39)
(40)
(41)

•  Use of the “backward” equation to get D:

•  Next? 2 options:

1)  Take a diagonal element of D and invert it

2)  Recursively process the remaining D as G

•  Final equation:

Computation of D

-1

and G

-1

D = T

!1

G(T

!1

)

"

(42)

Acknowledgements

•  Dr. G. Gorjanc (UL), Dr. F. Colinet and Mr. J.

Vandenplas (ULg)

•  Wallonie-Bruxelles International (WBI)

•  Government of Republic of Slovenia

•  Fonds National de la Recherche Luxembourg (FNR)

•  Host institution: Gembloux Agro-Bio Tech - University of Liège

•  Collaborating institutions: •  CONVIS s.c. (Luxembourg)

Références

Documents relatifs

Only two of the 40 systems claiming to practise chlorination were satisfactory (with adequate chlorine residuals) and, among the simple gravity systems, none of the 20 claiming

Proposition 4.2 (Folklore). We shall illustrate the possible underlying fractal nature of the top Lya- punov direction as introduced in Corollary A.. Then, there is no reason for

Reconfigurable FPGA presents a good solution to implement such dynamic reconfigurable system. For the whole system, we can first implement data acquisition and signal

In order to enact strategies to increase nurse safety, it is important to consider all aspects related to violence in health care settings: patient characteristics; the environment

The process of bringing a non-singular matrix A to its definite form, can be obtained by using transposition matrices and diagonal multipliers either on the columns or on the rows

Much study has been devoted to the eigenvalues of random unitary matrices, but little is known about the entries of random unitary matrices and their powers.. In this work, we

By the 3-dimensional Minimal Model Program (see, for instance, [14, 16, 20]), V has a min- imal model X (with K X nef and admitting Q-factorial terminal sin- gularities). From the

is unknown whether muscle activation and/or muscle moment arm counteract these individual 90.. differences in PCSA such that the distribution of torque among the muscle