High performance text indexing and applications in life sciences

Academic year: 2021

HAL Id: lirmm-02093450

https://hal-lirmm.ccsd.cnrs.fr/lirmm-02093450

Submitted on 8 Apr 2019

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Eric Rivals

To cite this version:

Eric Rivals. High performance text indexing and applications in life sciences. UK-France Bilateral International Meeting on High Performance Computing and Biomathematics, Feb 2019, Chicheley, United Kingdom. ⟨lirmm-02093450⟩

December 17, 2018

Address:

Laboratoire d’Informatique de Microélectronique et de Robotique de Montpellier (LIRMM) and Institute of Computational Biology (IBC), CNRS and Université de Montpellier, France

http://www.lirmm.fr/~rivals/

Large corpora of texts or of sequences serve as references and are interrogated through websites or programming interfaces. In life sciences, new sequencing technologies have revolutionised the acquisition of genomic sequences and triggered an exponential accumulation of reference sequences in international, public databases. Several kinds of text queries form the basic operations of programs that analyse genomic sequences. For instance, the webserver of EMBL-EBI receives 27 million queries a day. A typical sequencing experiment yields a hundred million sequencing reads (about 150 nucleotides long), each of which needs to be compared to a reference genome. Analysing such data or mining public sequence repositories demands very efficient programs and algorithms. Only the use of complex and specific indexing data structures allows us to meet the needs of life sciences communities. I will present some indexing data structures that enable high performance computational analyses in genomics, and mention their practical applications. Beyond text data, such data structures can be adapted to index other types of discrete data like trees or graphs. This will be key for the development of computational pan-genomics.
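The abstract does not name the indexing data structures presented in the talk. As one illustration of the general idea only, the sketch below uses a suffix array, a classical text index: the reference is indexed once, and each read is then located by binary search instead of a full scan. The function names, the toy reference and the naive construction are assumptions made for this example; they are not taken from the talk, and real genome indexes rely on far more space- and time-efficient constructions.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction (sorting full suffixes); fine for a toy example,
    far too slow and memory-hungry for a real genome.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: list[int], pattern: str) -> list[int]:
    """Return all start positions of pattern in text, in increasing order.

    Suffixes that start with pattern form one contiguous block of the
    suffix array; two binary searches locate the block boundaries.
    """
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


# Hypothetical toy "reference genome" and "reads".
reference = "GATTACAGATTACA"
index = build_suffix_array(reference)
hits_tta = find_occurrences(reference, index, "TTA")
hits_missing = find_occurrences(reference, index, "CCC")
```

Once the index is built, each query performs a number of string comparisons that grows only logarithmically with the reference length, which is what makes an index worthwhile when a hundred million reads are compared to the same reference.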

Funding acknowledgement: "GEM" flagship project from Labex NUMEV; travel to the workshop was supported by the Royal Society (UK), the Académie des Sciences, and CNRS.

Abstract of speech for the UK-France Bilateral International Meeting on High Performance Computing and Biomathematics, held at Chicheley, UK, on the 20-21 Feb. 2019, which was co-organised by the Royal Society and the French Académie des sciences.
