• Aucun résultat trouvé

Large-scale significance testing of high thoroughput Data with FAMT

N/A
N/A
Protected

Academic year: 2021

Partager "Large-scale significance testing of high thoroughput Data with FAMT"

Copied!
2
0
0

Texte intégral

(1)

HAL Id: hal-02746752

https://hal.inrae.fr/hal-02746752

Submitted on 3 Jun 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Large-scale significance testing of high thoroughput Data with FAMT

Magalie Houée-Bigot, Chloé Friguet, Sandrine Lagarrigue, Anne Blum, David Causeur

To cite this version:

Magalie Houée-Bigot, Chloé Friguet, Sandrine Lagarrigue, Anne Blum, David Causeur. Large-scale significance testing of high thoroughput Data with FAMT. 14. Conference of the Applied Stochastic Models and Data Analysis International Society, Jun 2011, Rome, Italy. �hal-02746752�

(2)

Large-scale significance testing of high throughput data with FAMT

Magalie Houée-Bigot*, Chloé Friguet, Sandrine Lagarrigue, Yuna Blum, David Causeur

(*)Agrocampus Ouest- Applied Mathematics Department, INRA- UMR598, Animal Genetics, Rennes, France

Email: magalie.houee@rennes.inra.fr

Abstract: Analysis of complex systems using high throughput technologies offers new challenges for statistics. In systems biology for example, microarray technology gives access to whole genome transcription datasets. It has therefore turned out to be a powerful tool to find out genes which expression variations are significantly related to a given trait using large-scale significance testing. Since Benjamini and Hochberg (1995)'s procedure to control the False Discovery Rate (FDR), the multiple testing theory has been deeply renewed. But the heterogeneity of microarray data has long been ignored in statistical models. However, some recent papers (see Friguet et al., 2009) suggest that unmodeled heterogeneity factors may generate some dependence across gene expressions and affect consistency of the multiple testing results. Friguet et al. (2009) propose a supervised factor model to identify the latent heterogeneity components, method implemented in the R package FAMT (Factor Analysis for Multiple Testing).

The talk aims both at presenting the statistical handling of multiple testing dependence as proposed in Friguet et al. (2009) and at illustrating the performance of the method by a microarray data analysis using the R package FAMT. As described in Blum et al. (2010), this microarray study analyses the relationships between the abdominal fatness of chickens and hepatic transcriptome profiles. Some heterogeneity components are extracted from the data by an EM algorithm. Additional functionalities optimize the procedure, such as the estimation of the proportion of true null hypotheses or the optimal number of factors.

Keywords: factor analysis, multiple testing, dependence, high dimension, R

ASMDA 2011 - pag n° 632

Références

Documents relatifs

Note also that the mean difference, among the significant cycles, between the daily numbers of births within the full Moon period and within the rest of the moon cycle is about

In genomic data analysis, many researchers [Friguet, Kloareg and Causeur (2009), Leek and Storey (2008), Sun, Zhang and Owen (2012)] proposed modeling the dependence among tests using

The Fisher test statistics and the corresponding p-values are obtained using the modelFAMT function with arguments x = c(3, 6) to give the column numbers of the explanatory variables

The comparison between the asymptotic optimal allocation, in the case of sub-exponential and exponential distributions, underscores the impact of the risks nature on the behavior of

The main properties of semisimple Lie groups and of their discrete subgroups become therefore challeng- ing questions for more general, sufficiently large, isometry groups

On the figure 1 and figure 2 and figure 3 there are shown the calculated concentrational dependencies of the average magnetic moment and the experimental data

Figure I.10: Structures chimiques de quelques flavonoïdes et isoflavonoïdes isolés de différentes espèces du genre Genista

Sur le relevé de la caractéristique électrique avec présence du média traité par DBD, on distingue bien que le courant de la décharge a bel et bien diminué par