• Aucun résultat trouvé

Next Generation Data Integration (for the Life Sciences)

N/A
N/A
Protected

Academic year: 2022

Partager "Next Generation Data Integration (for the Life Sciences)"

Copied!
1
0
0

Texte intégral

(1)

Next Generation Data Integration (for the Life Sciences)

[Abstract]

Ulf Leser

Humboldt-Universität zu Berlin Institute for Computer Science

leser@informatik.hu-berlin.de

ABSTRACT

Ever since the advent of high-throughput biology (e.g., the Human Genome Project), integrating the large number of diverse biological data sets has been considered as one of the most important tasks for advancement in the biolog- ical sciences. The life sciences also served as a blueprint for complex integration tasks in the CS community, due to the availability of a large number of highly heterogeneous sources and the urgent integration needs. Whereas the early days of research in this area were dominated by virtual inte- gration, the currently most successful architecture uses ma- terialization. Systems are built using ad-hoc techniques and a large amount of scripting. However, recent years have seen a shift in the understanding of what a ”data integra- tion system” actually should do, revitalizing research in this direction. In this tutorial, we review the past and current state of data integration (exemplified by the Life Sciences) and discuss recent trends in detail, which all pose challenges for the database community.

About the Author

Ulf Leser obtained a Diploma in Computer Science at the Technische Universit¨at M¨unchen in 1995. He then worked as database developer at the Max-Planck-Institute for Molec- ular Genetics before starting his PhD with the Graduate School for ”Distributed Information Systems” in Berlin. Since 2002 he is a professor for Knowledge Management in Bioin- formatics at Humboldt-Universit¨at zu Berlin.

Copyrightc by the paper’s authors. Copying permitted only for private and academic purposes.

In: G. Specht, H. Gamper, F. Klan (eds.): Proceedings of the 26thGI- Workshop on Foundations of Databases (Grundlagen von Datenbanken), 21.10.2014 - 24.10.2014, Bozen, Italy, published at http://ceur-ws.org.

Références

Documents relatifs

Frequencies of mutations at the ALS gene detected using individual plant analysis. 606 (dCAPS or Sanger sequencing) plotted against the frequencies assessed using

We provide a short introduction to the laboratory work required and then describe the possible steps in data processing for the detection of plant viruses, including quality control

The aim of the project is to develop accurate models that substantially outperform current state -of-the -art methods, for onshore and offshore wind power forecasting, exploiting both

b Service de Biochimie Métabolomique et Protéomique, Hôpital Universitaire Necker Enfants Malades, AP-HP, Paris, France c INSERM UMR-S973, Molécules Thérapeutiques in Silico,

The processing of healthcare information involves the use of different sources that usually contain a lot of redundant data so integration step has to address these issues and

The first studies using molecular methods to characterize the microbiome of the lower airways concluded that while the lower airways of healthy subjects are not sterile, the amount

In order to explain the edges at zeroth level, we need to have two paths that explain edges of the cross gad- get corresponding to each variable x j at level 0; the two additional

Or, lors de l'injection de charges sous un faisceau électronique, ou bien lors d'un test de claquage diélectrique, ou encore lors d'un frottement important entre deux matériaux,