
HAL Id: hal-01884822

https://hal.inria.fr/hal-01884822

Submitted on 1 Oct 2018


blockcluster, simerge and C++ with R

Serge Iovleff, Seydou Nourou Sylla

To cite this version:

Serge Iovleff, Seydou Nourou Sylla. blockcluster, simerge and C++ with R. Mixture Models: Theory and Applications, Jun 2018, Paris, France. hal-01884822


Modal project-team; G4BBM team, Institut Pasteur de Dakar


blockcluster package

Summary

1. blockcluster package
2. simerge: Block clustering of binary data with Gaussian co-variables
3. C++ Programming with R: The simerge package
4. Preliminary Results
5. References


Co-Clustering

“Aims to organize data-set into a set of homogeneous blocks by simultaneous clustering of individuals and variables.”

Figure: Binary data set (a), data reorganized by a partition on I (b), by partitions on I and J simultaneously (c), and summary matrix (d).


Model Based Approach

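$x$ is a data set doubly indexed by a set $I$ with $n$ elements (individuals) and a set $J$ with $m$ elements (variables).

$z = (z_{11}, \ldots, z_{ng})$ with $z_{ik} = 1$ if $i$ belongs to cluster $k$ and $z_{ik} = 0$ otherwise,

$w = (w_{11}, \ldots, w_{md})$ with $w_{j\ell} = 1$ if $j$ belongs to cluster $\ell$ and $w_{j\ell} = 0$ otherwise,

\[ f(x;\theta) = \sum_{(z,w)\in\mathcal{Z}\times\mathcal{W}} p(z;\theta)\, p(w;\theta)\, f(x \mid z,w;\theta) \tag{1} \]

where $\mathcal{Z}$ and $\mathcal{W}$ denote the sets of all possible labellings $z$ of $I$ and $w$ of $J$. There are $g^n \times d^m$ possible labellings.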
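To make the size of the sum in equation (1) concrete, here is a minimal C++ sketch that evaluates it by brute force on a tiny binary table. It assumes independent labels with proportions `pi` and `rho` and Bernoulli blocks with parameters `alpha[k][l]`; these choices and all function names are illustrative assumptions, not part of blockcluster.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Number of labellings g^n * d^m, as a double (becomes inf for realistic n and m).
double countLabellings(int n, int m, int g, int d)
{
  return std::pow(static_cast<double>(g), n) * std::pow(static_cast<double>(d), m);
}

// Advance a counter written in base `base`; returns false once every value was visited.
bool nextLabelling(std::vector<int>& labels, int base)
{
  for (std::size_t i = 0; i < labels.size(); ++i)
  {
    if (++labels[i] < base) return true;
    labels[i] = 0;
  }
  return false;
}

// Brute-force evaluation of equation (1), assuming Bernoulli blocks:
// f(x | z, w) = prod_ij alpha[z_i][w_j]^x_ij * (1 - alpha[z_i][w_j])^(1 - x_ij).
double bruteForceLikelihood(const std::vector<std::vector<int>>& x,
                            const std::vector<double>& pi,
                            const std::vector<double>& rho,
                            const std::vector<std::vector<double>>& alpha)
{
  assert(!x.empty() && !x[0].empty());
  const std::size_t n = x.size(), m = x[0].size();
  const int g = static_cast<int>(pi.size()), d = static_cast<int>(rho.size());
  double total = 0.0;
  std::vector<int> z(n, 0);
  do
  {
    std::vector<int> w(m, 0);
    do
    {
      double term = 1.0;
      for (std::size_t i = 0; i < n; ++i) term *= pi[z[i]];   // p(z; theta)
      for (std::size_t j = 0; j < m; ++j) term *= rho[w[j]];  // p(w; theta)
      for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < m; ++j)
        {
          const double a = alpha[z[i]][w[j]];
          term *= (x[i][j] == 1) ? a : 1.0 - a;               // f(x | z, w; theta)
        }
      total += term;
    } while (nextLabelling(w, d));
  } while (nextLabelling(z, g));
  return total;
}
```

Even a 2 x 2 table with g = d = 2 already needs 16 terms, so this enumeration only serves as a check on toy data.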


blockcluster: R Package For coclustering

- R interface to the C++ library coclust (using STK++ in the background),
- Simple and robust API,
- Extends four basic functions: "Plot", "Summary", "Show", "Print",
- Implements an "intelligent" estimation strategy.

Example

    data(gaussiandata)
    out <- coclusterGaussian(gaussiandata, model = "pi_rho_sigma2kl", nbcocluster = c(2, 3))
    plot(out)
    plot(out, type = "distribution")


Example : Gaussian distribution

Figure: Simulated and co-clustered data (a), data block-distributions (b).


Example : Binary distribution

Figure: Simulated and co-clustered data (a), data block-distributions (b).


Example : Categorical distribution

Figure: Simulated and co-clustered data (a), data block-distributions (b). Panel (a) shows the original and the co-clustered data with a color scale for levels 1 to 5. Panel (b), titled "Histogram/density for each block", shows the frequency of data values 1 to 5 in blocks (1,1) to (3,2), together with the row mixtures, the column mixtures and the final mixture.


Example : Poisson distribution

Figure: Simulated and co-clustered data (a), data block-distributions (b). Panel (a) shows the original and the co-clustered data with a color scale from 0 to 30. Panel (b), titled "Histograms of classes of contingency data", shows the frequency of data values in blocks (1,1) to (2,3), together with the row mixtures, the column mixture densities and the final mixture density.


Development history

- First versions developed during ADT coclust (October 2011 to October 2013). They implement binary, Poisson and Gaussian models, with the BEM and BCEM algorithms.

- Release 3.0 in 2014:
  1. Added support for categorical data,
  2. Added Bayesian inference estimation algorithms,
  3. But remained unstable in certain situations (crashes).

- Release 4.0 in November/December 2015:
  1. Uses STK++ as the background library (the code became cleaner and more compact),
  2. Fixed many crash issues.

- Enhancements in release 4.2 in November/December 2016 (ADT Massicc):
  1. Added selection criteria,
  2. Added Gibbs estimation algorithms.


simerge: Block clustering of binary data with Gaussian co-variables


SIMERGE: Statistical Inference for the Management of Extreme Risks, Genetics and Global Epidemiology

http://mistis.inrialpes.fr/simerge/index.html

SIMERGE is a LIRIMA project-team started in January 2015. It includes:

- Mistis (Inria Grenoble - Rhône-Alpes, France)
- LERSTAD (Laboratoire d'Etudes et de Recherches en Statistiques et Développement, Université Gaston Berger, Senegal)
- IRD (Institut de Recherche pour le Développement, G4BBM team, Dakar, Senegal)
- LEM (Lille Economie et Management, Université Lille 2)
- Modal (Inria Lille Nord-Europe)

The associate team is built on two research themes:

1. Spatial extremes, with application to the management of extreme risks
2. Classification, with application to genetics and global epidemiology


Challenge

Build statistical models in order to test association between diseases and human host genetics in a context of genome-wide screening.

Figure: Genotypes on 719,656 SNPs (Single Nucleotide Polymorphisms) typed on 481 individuals in Senegal, in a rural area where malaria and arboviral diseases are endemic. One quantitative malaria phenotype on two sites: the individual effect on the risk of having a malaria attack (iPFA).


Statistical Model

"For blockcluster to reveal a genetic cause of iPFA, the populations would need to have been exposed to the disease for several millennia" (Cheick Loucoubar, head of G4BBM)

$x$ is a binary data set.

$y$ is a data set (co-variables) of $\mathbb{R}^p$ indexed by $I$.

The classical block model formulation for binary data is extended to

\[ f(x,y;\theta) = \sum_{(z,w)\in\mathcal{Z}\times\mathcal{W}} p(z;\theta)\, p(w;\theta)\, f(x \mid y,z,w;\theta)\, f(y \mid z;\theta). \tag{2} \]

The dependency between $x_{ij}$ and $y_i$ is modeled by the canonical link for binary response data:

\[ f(x_{ij} \mid y_i; \beta_{z_i w_j}) = \operatorname{logis}(\beta_{z_i w_j}^T y_i)^{x_{ij}} \left(1 - \operatorname{logis}(\beta_{z_i w_j}^T y_i)\right)^{1 - x_{ij}} \tag{3} \]

\[ f(y \mid z;\theta) = \prod_i \phi(y_i; \mu_{z_i}, \Sigma_{z_i}) \]

with $\phi$ the multivariate Gaussian density.
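Taking the logarithm of equation (3) gives $\log f(x_{ij} \mid y_i; \beta) = x_{ij}\,\eta - \log(1 + e^{\eta})$ with $\eta = \beta^T y_i$. The following C++ sketch evaluates this quantity in a numerically stable way; the function names are illustrative assumptions and do not come from the simerge sources.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Linear predictor eta = beta^T y.
double linearPredictor(const std::vector<double>& beta, const std::vector<double>& y)
{
  assert(beta.size() == y.size());
  double eta = 0.0;
  for (std::size_t s = 0; s < beta.size(); ++s) eta += beta[s] * y[s];
  return eta;
}

// log(1 + exp(eta)), written so that exp() never overflows.
double log1pExp(double eta)
{
  return (eta > 0.0) ? eta + std::log1p(std::exp(-eta)) : std::log1p(std::exp(eta));
}

// Log of equation (3): x * eta - log(1 + exp(eta)), with x in {0, 1}.
double logBernoulliLogit(int x, const std::vector<double>& beta, const std::vector<double>& y)
{
  const double eta = linearPredictor(beta, y);
  return x * eta - log1pExp(eta);
}
```

Working on the log scale avoids computing logis() values that round to exactly 0 or 1 when the linear predictor is large in absolute value.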


Estimation

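The EM algorithm is not feasible, as the quantity $e_{ikj\ell} = P(z_{ik} w_{j\ell} = 1 \mid x, y, \theta)$ is not computable.

Take $q(z,w) = t(z)\, r(w) = t \times r$ with $t$ and $r$ matrices of sizes $(n,g)$ and $(m,d)$, then

\[ l(\theta) = \tilde{F}_C(t,r,\theta) + KL\left(q(z,w) \,\|\, p(z,w \mid x,y,\theta)\right) \tag{4} \]

with $KL(q \| p)$ denoting the Kullback-Leibler divergence and $\tilde{F}_C$ denoting the Free Energy or Fuzzy Criterion

\[ \tilde{F}_C(t,r,\theta) = \sum_k t_{.k} \log \pi_k + \sum_\ell r_{.\ell} \log \rho_\ell + \sum_{i,j,k,\ell} t_{ik}\, r_{j\ell} \log f(x_{ij}, y_i; \beta_{k\ell}) + H(t) + H(r) \tag{5} \]

and $H(t)$, $H(r)$ denoting the entropies of $t$ and $r$.

Maximization of the likelihood $l(\theta)$ is replaced by the following maximization:

\[ \operatorname*{argmax}_{t,r,\theta} \tilde{F}_C(t,r,\theta). \]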


BEM algorithm

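Initialization: Set $t^{(0)}$, $r^{(0)}$ and $\theta^{(0)} = (\pi^{(0)}, \rho^{(0)}, \beta^{(0)}, \mu^{(0)}, \Sigma^{(0)})$.

(a) Row-EStep: Compute $t^{(c+1)}$ using the formula

\[ t_{ik}^{(c+1)} = \frac{\pi_k^{(c)} \prod_{j,l} \left[ f(x_{ij} \mid y_i; \beta_{kl}^{(c)})\, \phi(y_i; \mu_k^{(c)}, \Sigma_k^{(c)}) \right]^{r_{jl}^{(c)}}}{\sum_{k} \pi_k^{(c)} \prod_{j,l} \left[ f(x_{ij} \mid y_i; \beta_{kl}^{(c)})\, \phi(y_i; \mu_k^{(c)}, \Sigma_k^{(c)}) \right]^{r_{jl}^{(c)}}}. \]

(b) Row-MStep: Compute $\pi^{(c+1)}$, $\mu^{(c+1)}$, $\Sigma^{(c+1)}$ and estimate $\beta^{(c+1/2)}$.

(c) Col-EStep: Compute $r^{(c+1)}$ using the formula

\[ r_{jl}^{(c+1)} = \frac{\rho_l^{(c)} \prod_{i,k} f(x_{ij} \mid y_i; \beta_{kl}^{(c+1/2)})^{t_{ik}^{(c+1)}}}{\sum_{l} \rho_l^{(c)} \prod_{i,k} f(x_{ij} \mid y_i; \beta_{kl}^{(c+1/2)})^{t_{ik}^{(c+1)}}}. \]

(d) Col-MStep: Compute $\rho^{(c+1)}$ and estimate $\beta^{(c+1)}$.

Iterate: Repeat (a)-(b)-(c)-(d) until convergence.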
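Both E-steps divide a product of many small densities by a sum of such products, which underflows quickly when m or n is large. A common remedy, sketched below in C++ under the assumption that the logarithm of each numerator has already been accumulated, is to normalize on the log scale by subtracting the largest log-weight first. The function name is a hypothetical helper, not taken from the simerge code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Turn unnormalized log-weights (one per cluster) into probabilities summing to 1.
// logWeights[k] is assumed to hold the log of the numerator of t_ik (or r_jl).
std::vector<double> normalizeLogWeights(const std::vector<double>& logWeights)
{
  assert(!logWeights.empty());
  // Subtracting the maximum keeps every exponent <= 0, so exp() cannot overflow
  // and at least one term equals exp(0) = 1, so the sum cannot be zero.
  const double maxLog = *std::max_element(logWeights.begin(), logWeights.end());
  std::vector<double> probs(logWeights.size());
  double sum = 0.0;
  for (std::size_t k = 0; k < logWeights.size(); ++k)
  {
    probs[k] = std::exp(logWeights[k] - maxLog);
    sum += probs[k];
  }
  for (std::size_t k = 0; k < probs.size(); ++k) probs[k] /= sum;
  return probs;
}
```

With two log-weights both equal to -1000, a direct exponentiation would give 0/0, whereas this version returns equal probabilities.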


Measuring contribution of a variable

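$m_l$ denotes the number of columns with label $l$, i.e. $m_l = \#\{w_{jl} = 1,\ j = 1, \ldots, m\}$, and for a fixed row $i$, $m_{il}$ denotes the number of elements such that $w_{jl} = 1$ and $x_{ij} = 1$, i.e. $m_{il} = \#\{w_{jl} x_{ij} = 1,\ j = 1, \ldots, m\}$.

The posterior probability of the co-variable $y$ is

\[ f(y \mid x,z,w,\theta) \propto \prod_{i=1}^{n} \pi_{z_i}\, \phi(y_i; \mu_{z_i}, \Sigma_{z_i}) \prod_{l=1}^{d} \rho_l^{m_l}\, \frac{e^{m_{il}\, y_i^T \beta_{z_i l}}}{\left(1 + e^{y_i^T \beta_{z_i l}}\right)^{m_l}} \tag{6} \]

Taking the log, the contribution of the $j$th variable is computed as

\[ I(j) = \log \rho_{w_j} + \sum_{i=1}^{n} \left( x_{ij}\, y_i^T \beta_{z_i w_j} - \log\left(1 + \exp(y_i^T \beta_{z_i w_j})\right) \right) \tag{7} \]

using the MAP estimator for $z$ and $w$.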
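Equation (7) can be evaluated column by column once the MAP labels are known. The C++ sketch below is one possible direct transcription; the container layout (labels stored as 0-based integers, `beta[k][l]` as a vector of length p) and the function names are assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// log(1 + exp(eta)) without overflow for large eta.
double softplus(double eta)
{
  return (eta > 0.0) ? eta + std::log1p(std::exp(-eta)) : std::log1p(std::exp(eta));
}

// Contribution I(j) of column j, following equation (7).
// x: n x m binary data, y: n x p co-variables, z and w: 0-based MAP labels,
// rho: column proportions, beta[k][l]: regression vector of block (k, l).
double variableContribution(std::size_t j,
                            const std::vector<std::vector<int>>& x,
                            const std::vector<std::vector<double>>& y,
                            const std::vector<int>& z,
                            const std::vector<int>& w,
                            const std::vector<double>& rho,
                            const std::vector<std::vector<std::vector<double>>>& beta)
{
  assert(j < w.size() && x.size() == y.size() && x.size() == z.size());
  double value = std::log(rho[w[j]]);
  for (std::size_t i = 0; i < x.size(); ++i)
  {
    const std::vector<double>& b = beta[z[i]][w[j]];
    double eta = 0.0;                                   // eta = y_i^T beta_{z_i w_j}
    for (std::size_t s = 0; s < b.size(); ++s) eta += y[i][s] * b[s];
    value += x[i][j] * eta - softplus(eta);
  }
  return value;
}
```

Each call costs O(n p) operations, so scanning all m columns stays linear in the size of the data.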


C++ Programming with R: The simerge package


Extreme Programming (XP) [1]

Extreme Programming is a discipline of software development based on values of simplicity, communication, feedback, courage, and respect.

- Simple Design: XP teams build software to a simple but always adequate design. They start simple, and through programmer testing and design improvement, they keep it that way.

- Pair Programming: All production software in XP is built by two programmers, sitting side by side, at the same machine.

- Test-Driven Development: XP is obsessed with feedback, and in software development, good feedback requires good testing.

- Design Improvement (Refactoring): XP focuses on delivering business value in every iteration. To accomplish this over the course of the whole project, the software must be well-designed.

- Coding Standard: XP teams follow a common coding standard, so that all the code in the system looks as if it was written by a single, very competent, individual.


[1] https://ronjeffries.com/xprog/what-is-extreme-programming/


Design and Coding Standard

Use an S4 class on the R side and a mirror C++ class.

    setClass(
      Class = "CoClusterBinary",
      representation = representation(
        # y part
        yid = "matrix",           # covariables
        mukd = "matrix",          # means of yid
        sigmakd = "matrix",       # standard deviations
        isCoMixture = "logical",  # yid is a mixture?
        # x part
        xij = "matrix",
        # ....

    # Constructor of the S4 class
    setMethod(
      f = "initialize",
      signature = c("CoClusterBinary"),
      definition = function(.Object, x, y, nbcocluster, isCoMixture)


    class CoClusterBinaryModel : public STK::IRunnerBase
    {
      public:
        // constructor of the C++ class
        CoClusterBinaryModel(Rcpp::S4 s4Model);
        // ....
        STK::RMatrix<double> yid_;
        STK::RMatrix<double> mukd_;
        STK::RMatrix<double> sdkd_;
        bool isCoMixture_;
        STK::RMatrix<double> xij_;


The C++ constructor gets the R structures and wraps them as STK++ arrays.


    CoClusterBinaryModel::CoClusterBinaryModel(Rcpp::S4 s4Model) :
        // .....
        , yid_((SEXP)s4Model.slot("yid"))
        , mukd_((SEXP)s4Model.slot("mukd"))
        , sdkd_((SEXP)s4Model.slot("sigmakd"))
        , isCoMixture_(s4Model.slot("isCoMixture"))
        , xij_((SEXP)s4Model.slot("xij"))
        // .....


Example: Computation of the Fuzzy Criterion $\tilde{F}_C$

R side

    setMethod(
      f = "logLikelihood",
      signature = "CoClusterBinary",
      definition = function(object) {
        # PACKAGE (upper case) restricts the symbol lookup to the simerge shared library
        .Call("logLikelihood", object, PACKAGE = "simerge")
      }
    )


C++ side

    extern "C" SEXP logLikelihood(SEXP model)
    {
      // wrap the R object as an S4 instance and build the mirror C++ model
      Rcpp::S4 s4model(model);
      CoClusterBinaryModel coclust(s4model);
      coclust.computeLogLikelihood();
      // copy the computed values back into the R object
      coclust.getValues(model);
      return model;
    }


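\[ \tilde{F}_C(t,r;\theta) = \sum_k t_{.k} \log \pi_k + \sum_\ell r_{.\ell} \log \rho_\ell + H(t) + H(r) + \sum_{i,j,k,\ell} t_{ik}\, r_{j\ell} \left[ x_{ij}\, y_i^T \beta_{k\ell} - \log\left(1 + \exp(y_i^T \beta_{k\ell})\right) + \log \phi(y_i; \mu_k, \Sigma_k) \right] \]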

    setMethod(
      f = "entropy",
      signature = "CoClusterBinary",
      definition = function(object) {
        epsilon <- 1e-15
        tik <- object@tik
        rjl <- object@rjl
        object@rowEntropy <- -sum(tik * log(epsilon + tik))
        object@colEntropy <- -sum(rjl * log(epsilon + rjl))
        return(object)
      }
    )


C++ side

    rowEntropy_ = -tik_.prod((tik_ + RealMin).log()).sum();
    colEntropy_ = -rjl_.prod((rjl_ + RealMin).log()).sum();


    for (k in 1:K) {
      for (l in 1:L) {
        eta <- yid %*% betakld[k, l, ]  # linear predictors y_i^T beta_kl (n x 1)
        object@likelihoodkl[k, l] <-
          (t(tik[, k] * eta) %*% xij
           + drop(crossprod(tik[, k], plogis(eta, 0, 1, lower.tail = FALSE, log.p = TRUE)))
          ) %*% rjl[, l]
      }
    }


    for (int k = 0; k < K_; ++k)
    {
      for (int l = 0; l < L_; ++l)
      {
        likelihoodkl_(k, l)
          = ( tik_.col(k).prod(yid_ * betakld_(k, l)).transpose() * xij_
            + tik_.col(k).dot((yid_ * betakld_(k, l)).lcdfc(logis_))
            ) * rjl_.col(l);
      }
    }


    gaussianLogLikelihood_ = computeGaussianLogLikelihood();
    fuzzyLogLikelihood_ = likelihoodkl_.sum()
                        + tk_.dot(pik_.log())
                        + rl_.dot(rhol_.log())
                        + gaussianLogLikelihood_;
    fuzzyCriterion_ = fuzzyLogLikelihood_ + rowEntropy_ + colEntropy_;


Preliminary Results


Data set

$n = 444$ individuals and $m = 515721$ SNPs retained.

Figure: Histogram of the iPFA variable and fitted Gaussian mixture models obtained with the MixAll package.


Model selection

ICL BIC-like approximations lead to the following $BIC(g,d)$:

\[ BIC(g,d) = -2 \max_\theta \log f(x,y;\theta) + (g-1) \log n + \lambda \log n + (d-1) \log m + g\,d\,(p+1) \log(mn) \]

with $\lambda$ the number of parameters of the $y$ distribution.
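As a sanity check on the penalty terms, the criterion can be written as a small function of the maximized log-likelihood and the model dimensions. The C++ sketch below transcribes the formula exactly as stated on this slide; the function name and argument order are assumptions, and it is not the implementation used to produce the results.

```cpp
#include <cassert>
#include <cmath>

// BIC(g, d) as written above.
// maxLogLik: max over theta of log f(x, y; theta)
// n, m: numbers of rows and columns; g, d: numbers of row and column clusters
// p: dimension of the co-variables; lambda: number of parameters of the y distribution
double bicCriterion(double maxLogLik, int n, int m, int g, int d, int p, int lambda)
{
  const double logN = std::log(static_cast<double>(n));
  const double logM = std::log(static_cast<double>(m));
  return -2.0 * maxLogLik
       + (g - 1) * logN                                   // row proportions
       + lambda * logN                                    // parameters of the y distribution
       + (d - 1) * logM                                   // column proportions
       + static_cast<double>(g) * d * (p + 1) * (logM + logN);  // block term, log(mn)
}
```

Writing log(mn) as log m + log n avoids forming the product mn, which is already above 2 x 10^8 for the data set described here.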

Figure: Choosing the number of blocks (note: the implemented criterion was wrong).


Results with (g, d) = (2, 22) and y Gaussian mixture

Figure: iPFA density (a), proportion of mutation (b). BIC = 290551317


Results with (g, d) = (2, 22) and y Gaussian random variable

Figure: iPFA density (a), proportion of mutation (b). BIC = 287770996


Influence Measure

Figure: Distribution of the influence within clusters (by columns).


References


Thanks to the G4BBM team

- Maryam DIARRA, Biomathematician: PhD in Applied Mathematics, Saint Louis University (UGB)
- Mamadou DIOP, Computer Scientist, Bioinformatician: Master in Computer Science, Saint Louis University (UGB)
- Cheikh LOUCOUBAR, Biomathematician, Head of the Group: PhD in Statistical Genetics, Dakar University / Paris 5
- Amadou DIALLO, Biomathematician: Bachelor in Mathematics, Minot State University, USA
- Mareme S. THIAM, Master Fellow in Mathematics: M2 Mathematics - Big Data, AIMS
- Seydou Nourou SYLLA, Biomathematician: PhD in Applied Mathematics, Saint Louis University (UGB)
- Dame SY, Data Manager: DTS in Computer Science
- Aboubacry GAYE, Master Fellow in Mathematics: M2 Mathematics, Saint Louis University (UGB)
- Mame Malick DIENG, Computer Scientist: Master in Computer Science, Saint Louis University (UGB)

Main Activities

- Research on human host genetic diversity and its implication in malaria phenotypes
- New grant application

Other Activities

- Support IPD units in data management and analysis
- Teaching in collaboration with universities


Links

- http://www.pasteur.sn/recherche/biostatistique-bio-informatique-et-modelisation/
- https://cran.r-project.org/package=blockcluster
- https://cran.r-project.org/package=rtkore
- https://cran.r-project.org/package=MixAll
- http://www.stkpp.org
- https://modal.lille.inria.fr/wikimodal/doku.php
