Interactive Platform for Semantic Gene Expression Analysis of Alzheimer's Disease

Academic year: 2022

Toshiaki Katayama1, Masataka Kikuchi2, Soichi Ogishima3

1 Database Center for Life Science, Research Organization of Information and Systems, Tokyo, Japan

ktym@dbcls.jp

2 Department of Molecular Genetics, Center for Bioresources, Brain Research Institute, Niigata University, Niigata, Japan

kikuchi@bri.niigata-u.ac.jp

3 Department of Bioclinical Informatics, Tohoku Medical Megabank Organization, Tohoku University, Miyagi, Japan

ogishima@megabank.tohoku.ac.jp

Abstract. Interpreting gene expression data requires reference to knowledge bases of genetics, pathways, diseases and drugs. However, because these external resources are often stored in distributed databases in various formats, it is hard for biomedical scientists to use them in combination. Semantic Web technologies are well suited to integrating such heterogeneous datasets using the Resource Description Framework (RDF) and to providing a faceted search interface over them. In this work, we report an application of semantic data integration with faceted search for the interactive analysis of gene expression data for Alzheimer’s disease. The framework can readily be extended to other biomedical domains.

Keywords: data integration, linked open data, faceted search, gene expression analysis, Alzheimer’s disease

Introduction

Differences in gene expression between target and control samples can be measured with microarray technology. These data are first processed to find genes whose fold change is large and statistically significant. Interpreting the relationship between these candidate genes and the hypothesis behind the experimental design depends on supporting information from existing knowledge. This includes genetics, pathways, diseases and drugs, which are stored independently in different databases and are available only in incompatible data formats. For the integration of heterogeneous datasets, RDF is becoming a standard in the life sciences [1]. With Semantic Web technologies, relations among data can be represented as a linked graph, which provides a powerful means of exploring a set of related information, often referred to as faceted search. Here we report an application for exploring gene expression data together with integrated semantic datasets for Alzheimer’s disease.
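The candidate-gene selection step mentioned above (large fold change with statistical significance) can be illustrated with a minimal Python sketch. The thresholds, the tuple layout and the probe identifiers are assumptions made for illustration; the paper does not state which cut-offs or statistical test were used.

```python
import math


def select_candidates(rows, min_abs_log2_fc=1.0, max_p=0.05):
    """Return probes whose expression changes strongly and significantly.

    rows: iterable of (probe_id, target_mean, control_mean, p_value).
    The thresholds are illustrative defaults, not values from the paper.
    """
    selected = []
    for probe_id, target, control, p_value in rows:
        # log2 fold change of target relative to control
        log2_fc = math.log2(target / control)
        if abs(log2_fc) >= min_abs_log2_fc and p_value <= max_p:
            selected.append((probe_id, log2_fc, p_value))
    # Largest absolute change first
    return sorted(selected, key=lambda r: abs(r[1]), reverse=True)


# Hypothetical input rows (identifiers and values are made up)
rows = [
    ("probe_A", 400.0, 100.0, 0.001),  # 4-fold up, significant
    ("probe_B", 110.0, 100.0, 0.300),  # small change, not significant
    ("probe_C", 50.0, 200.0, 0.010),   # 4-fold down, significant
    ("probe_D", 800.0, 100.0, 0.200),  # large change, not significant
]
candidates = select_candidates(rows)
```

Only probes that pass both filters remain; these are the entries that would then be explored against the integrated knowledge.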

Results

We used RDF to integrate Affymetrix microarray probes, Human Genome Organization (HUGO) gene annotations, UniProt protein annotations, gene-drug interactions in DrugBank, disease susceptibility genes in AlzGene [2] and molecular interactions in AlzPathway [3] for Alzheimer’s disease. Most of these datasets, with the exception of UniProt, were converted into RDF by scripts developed in-house. Based on these datasets, stored in a triple store, we developed a new Web application (Fig. 1) that accepts gene expression data and provides an interactive interface for exploring the list of candidate genes through various facets, including gene expression ratio, chromosomal positions of genes, protein interactions with drugs, known genetic diseases, and pathway information.
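As a rough illustration of what converting a tabular record into RDF can look like, the following Python sketch serializes one probe-to-gene mapping and its drug interactions as N-Triples. The `http://example.org/` namespace, the predicate names and the probe and drug identifiers are hypothetical; the in-house scripts and the vocabularies actually used are not described in the paper.

```python
BASE = "http://example.org/"  # hypothetical namespace, not from the paper


def iri(path):
    """Wrap a path under BASE as an N-Triples IRI term."""
    return f"<{BASE}{path}>"


def literal(value):
    """Serialize a plain string literal, escaping backslashes and quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def probe_to_triples(probe_id, gene_symbol, drug_ids):
    """Build triples linking a probe to its gene, and drugs to that gene."""
    gene = iri(f"gene/{gene_symbol}")
    triples = [
        (iri(f"probe/{probe_id}"), iri("vocab/targetsGene"), gene),
        (gene, iri("vocab/symbol"), literal(gene_symbol)),
    ]
    for drug_id in drug_ids:
        triples.append((iri(f"drug/{drug_id}"), iri("vocab/interactsWith"), gene))
    return triples


def to_ntriples(triples):
    """Render (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Placeholder identifiers for illustration only
triples = probe_to_triples("probe_0001", "APP", ["drug_0001"])
ntriples = to_ntriples(triples)
```

Because every source is reduced to the same subject-predicate-object form, records from different databases become joinable through shared gene IRIs once loaded into a triple store.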

Figure 1. Workflow of interactive faceted analysis.

The server is implemented in Node.js and accesses a backend triple store to perform faceted search using the SPARQL Protocol and RDF Query Language (SPARQL). The Web interface is developed using D3.js, Bootstrap and the Google Maps API.
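To make the idea of faceted search over SPARQL more concrete, the sketch below assembles a query from a set of selected facets. It is written in Python for brevity and is not the authors' Node.js implementation; the `ex:` prefix and the `ex:symbol`, `ex:chromosome` and `ex:interactsWith` predicates are assumed names, since the actual schema is not given in the paper.

```python
import re

PREFIX = "PREFIX ex: <http://example.org/vocab/>"  # hypothetical vocabulary
_SAFE = re.compile(r"[A-Za-z0-9_-]+")


def _check(value):
    """Reject values that could break out of a quoted SPARQL literal."""
    if not _SAFE.fullmatch(value):
        raise ValueError(f"unsafe facet value: {value!r}")
    return value


def build_facet_query(gene_symbols, chromosome=None, require_drug=False):
    """Build a SELECT query; each selected facet adds one triple pattern."""
    values = " ".join(f'"{_check(s)}"' for s in gene_symbols)
    patterns = [
        f"VALUES ?symbol {{ {values} }}",
        "?gene ex:symbol ?symbol .",
    ]
    if chromosome is not None:
        patterns.append(f'?gene ex:chromosome "{_check(chromosome)}" .')
    if require_drug:
        patterns.append("?drug ex:interactsWith ?gene .")
    select_vars = "?gene ?symbol" + (" ?drug" if require_drug else "")
    body = "\n  ".join(patterns)
    return f"{PREFIX}\nSELECT DISTINCT {select_vars}\nWHERE {{\n  {body}\n}}"


query = build_facet_query(["APP", "PSEN1"], chromosome="21", require_drug=True)
```

Each facet the user toggles narrows the result set by adding a pattern to the `WHERE` clause, so the candidate gene list and the remaining facet values can be recomputed by re-running the query against the triple store's SPARQL endpoint.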

Discussions

Our application is designed for exploring candidate genes for Alzheimer’s disease based on user-supplied gene expression data, by referencing existing knowledge through faceted navigation. We found that Semantic Web technologies are well suited to data integration and to the implementation of a faceted search interface. This suggests that the method and the interface we developed can be extended to other biomedical domains that require the integration of heterogeneous data and in which interactive search can assist a data-driven approach.

References

1. Katayama T, Wilkinson MD, Micklem G, et al.: The 3rd DBCLS BioHackathon: improving life science data integration with semantic Web technologies. Journal of Biomedical Semantics 2013, 4:6.

2. Bertram L, McQueen MB, Mullin K, Blacker D, Tanzi RE: Systematic meta-analyses of Alzheimer disease genetic association studies: the AlzGene database. Nature Genetics 2007, 39:17–23.

3. Mizuno S, Iijima R, Ogishima S, et al.: AlzPathway: a comprehensive map of signaling pathways of Alzheimer’s disease. BMC Systems Biology 2012, 6:52.
