
Spectral Database: from data model to web interface


HAL Id: hal-02742064

https://hal.inrae.fr/hal-02742064

Submitted on 3 Jun 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.



Nils Paulhe, Christophe Duperier, Daniel Jacob, Etienne Thévenot, Jean-Francois Martin, Claire Lopez, Franck Giacomoni

To cite this version:

Nils Paulhe, Christophe Duperier, Daniel Jacob, Etienne Thévenot, Jean-Francois Martin, et al.

Spectral Database: from data model to web interface. 8. Journées Scientifiques du Réseau Français de Métabolomique et Fluxomique RFMF, May 2014, Lyon, France. 2014. ⟨hal-02742064⟩


METABOHUB’S SPECTRAL DATABASE: FROM DATA MODEL TO WEB INTERFACE

NILS PAULHE 1, CHRISTOPHE DUPERIER 1, DANIEL JACOB 2, ÉTIENNE THÉVENOT 3, JEAN-FRANCOIS MARTIN 4, CLAIRE LOPEZ 1, FRANCK GIACOMONI 1

1 METABOLISM EXPLORATION PLATFORM, INRA (CLERMONT-FERRAND/THEIX)

2 BORDEAUX METABOLOME PLATFORM, INRA (BORDEAUX)

3 THE METABOLOMEIDF, CEA (SACLAY)

4 METABOLOMICS AND FLUXOMICS PLATFORM, INRA (TOULOUSE)

INRA - Unité de Nutrition Humaine

Plate-Forme “Exploration du Métabolisme”

Centre de Theix

63122 Saint-Genès-Champanelle, France

http://www.clermont.inra.fr/plateforme_exploration_metabolisme

Methods

For the code architecture of MetaboHUB’s Spectral Database, we use modelling concepts and object-to-object relations to represent the complexity of mass spectrometry and nuclear magnetic resonance (NMR) spectroscopy. The entities handled and their links are based on the data and metadata that chemists use in laboratory routine. Our work is split into three levels: data-model building, user-service design and code engineering.

Conclusion

The project is still in an early development phase, but some functional blocks are already being tested.

The first challenge was to validate the data-model and the use cases with chemists. We are currently testing the chemical library module with data provided by MetaboHUB partners.

Code architecture: the MVC pattern was chosen because it helps code maintainability, team development and security.

Light view of the data-model with its main entities: both reference and putative compounds are linked to spectra. A specific entity, metadata, is dedicated to linking spectra by their shared properties. The user entity is independent: it is used only for web-portal access control, not for ownership of spectra or compound data.

[Diagram labels: NMR spectrum, NMR peak, NMR peak pattern, coupling constant, UV spectrum, peaks, users, putative compound, compound name, reference compound, spectrum, metadata, chromatography]
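The light view above can be sketched in a few lines. This is only an illustration, written in Python rather than in the project's Java code: the class names, field names and sample values are assumptions, not the actual data-model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Compound:
    """A chemical compound; `kind` is "reference" or "putative" (hypothetical field)."""
    name: str
    kind: str


@dataclass
class Spectrum:
    """A spectrum linked to compounds and described by metadata."""
    spectrum_id: str
    technique: str  # e.g. "MS" or "NMR"
    compounds: List[Compound] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class User:
    """Portal account, used for access control only: it holds no spectra or compounds."""
    login: str


def spectra_sharing(spectra: List[Spectrum], key: str, value: str) -> List[str]:
    """Return the ids of the spectra whose metadata share the given property."""
    return [s.spectrum_id for s in spectra if s.metadata.get(key) == value]


# Hypothetical sample data.
pinene = Compound("pinene", "reference")
unknown = Compound("unknown-1", "putative")
spectra = [
    Spectrum("s1", "NMR", [pinene], {"solvent": "CDCl3"}),
    Spectrum("s2", "NMR", [unknown], {"solvent": "CDCl3"}),
    Spectrum("s3", "MS", [pinene], {"ionization": "ESI"}),
]
print(spectra_sharing(spectra, "solvent", "CDCl3"))  # ['s1', 's2']
```

Keeping `User` free of any reference to spectra or compounds mirrors the statement that users serve access control, not data ownership.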

Introduction

MetaboHUB is a metabolomics and fluxomics infrastructure that provides tools to research teams and partners. A full work package focuses on the design and development of bioinformatics tools, including one called “MetaboHUB’s Spectral Database”. This tool is designed to gather and share knowledge and data from both mass and NMR spectra.

The core of the database, called the “data-model”, is a computational representation of each entity involved in spectra analysis and chemical compound annotation.

[Diagram labels (NMR-specific entities): NMR spectrum, NMR peak, NMR peak pattern, correlation spot]

Link spectra to compounds: spectra match four types of compounds, whose common properties and relations are given by a virtual “parent” object in the data-model. We provide reference compounds via the chemical library manager.

Putative compounds are annotated with an interactive identification tool and stored in the database. Three different entities for each reference compound allow us to match putative compounds that have missing properties (such as the 3D structure).
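One way to picture the virtual parent and the matching of incomplete putative compounds is sketched below. It is a Python illustration under stated assumptions, not the project's implementation: the property names, the placeholder structure strings and the rule "a reference entity matches when every property it defines agrees" are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical set of properties compared during matching.
PROPERTIES = ("formula", "structure_2d", "structure_3d")


@dataclass
class ChemicalCompound:
    """Virtual parent: properties shared by every compound type."""
    name: str
    formula: Optional[str] = None
    structure_2d: Optional[str] = None
    structure_3d: Optional[str] = None

    def defined(self) -> List[str]:
        """Names of the properties that are actually filled in."""
        return [p for p in PROPERTIES if getattr(self, p) is not None]


class ReferenceCompound(ChemicalCompound):
    """Compound added through the chemical library manager."""


class PutativeCompound(ChemicalCompound):
    """Compound added through the interactive identification tool."""


def matches(putative: ChemicalCompound, reference: ChemicalCompound) -> bool:
    """True when every property the reference defines agrees with the putative one."""
    return all(getattr(putative, p) == getattr(reference, p) for p in reference.defined())


def most_specific_match(putative, references):
    """Pick the matching reference entity that defines the most properties."""
    candidates = [r for r in references if matches(putative, r)]
    return max(candidates, key=lambda r: len(r.defined()), default=None)


# Placeholder strings stand in for real 2D/3D structure encodings.
library = [
    ReferenceCompound("pinene", formula="C10H16"),
    ReferenceCompound("(1R)-(+)-α-pinene", formula="C10H16",
                      structure_2d="2d-placeholder", structure_3d="3d-placeholder"),
]
# The putative compound has no 3D structure, so only the less specific entity matches.
candidate = PutativeCompound("unknown-1", formula="C10H16", structure_2d="2d-placeholder")
print(most_specific_match(candidate, library).name)  # pinene
```

Because both compound types inherit from the same parent, the matching code never needs to know which kind it is handling.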

NMR spectra specificity: the data-model must support one-dimensional and multi-dimensional spectra (schema 4) and specific entities such as peak patterns and coupling constants (schema 3).
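A minimal sketch of how one structure could cover both one-dimensional and multi-dimensional spectra is given below, in Python for illustration only. The class names, fields and numeric values are assumptions. The one relation taken as fact is that a coupling constant in Hz equals the shift difference in ppm multiplied by the spectrometer frequency in MHz.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class NMRPeak:
    """A peak (1D) or correlation spot (2D and more): one chemical shift per dimension."""
    shifts_ppm: Tuple[float, ...]
    intensity: float


@dataclass
class NMRPeakPattern:
    """A group of peaks forming a multiplet, with its coupling constants."""
    multiplicity: str  # e.g. "d" for doublet
    peaks: List[NMRPeak]
    coupling_constants_hz: List[float]


@dataclass
class NMRSpectrum:
    """Spectrum whose dimensionality is given by the number of observed nuclei."""
    nuclei: Tuple[str, ...]  # ("1H",) for 1D, ("1H", "13C") for 2D
    peaks: List[NMRPeak] = field(default_factory=list)
    patterns: List[NMRPeakPattern] = field(default_factory=list)

    @property
    def dimensions(self) -> int:
        return len(self.nuclei)

    def add_peak(self, peak: NMRPeak) -> None:
        """Reject peaks whose number of shifts does not fit the spectrum."""
        if len(peak.shifts_ppm) != self.dimensions:
            raise ValueError("peak dimensionality does not match the spectrum")
        self.peaks.append(peak)


def coupling_constant_hz(shift_a_ppm: float, shift_b_ppm: float, spectrometer_mhz: float) -> float:
    """J (Hz) = |shift difference (ppm)| x spectrometer frequency (MHz)."""
    return abs(shift_a_ppm - shift_b_ppm) * spectrometer_mhz


# Hypothetical 1D doublet recorded at 500 MHz.
one_d = NMRSpectrum(("1H",))
left, right = NMRPeak((1.50,), 1.0), NMRPeak((1.48,), 1.0)
one_d.add_peak(left)
one_d.add_peak(right)
j = coupling_constant_hz(1.50, 1.48, 500.0)  # about 10 Hz
one_d.patterns.append(NMRPeakPattern("d", [left, right], [j]))

# Hypothetical 2D spectrum with one correlation spot.
two_d = NMRSpectrum(("1H", "13C"))
two_d.add_peak(NMRPeak((1.50, 21.0), 1.0))
```

Storing one shift per dimension lets the same peak entity serve as a 1D peak or as a correlation spot, so only the validation in `add_peak` depends on dimensionality.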

[Diagram labels (compounds model): peak, spectrum, abstract chemical compound (virtual parent); putative compound, added via interactive identification; list of reference compounds, added via the chemical library; structural compound, sub-structural compound, generic compound, chemical compound; properties: structure (canonical SMILES), mass, id in external bank, 3D structure, InChI, 2D structure, mol file (used in NMR and lipidomics); relations: parent, children, match (score), name; example names: carboxylic acid, pinene, (1R)-(+)-α-pinene]

The code architecture and frameworks are an original feature of this project. Developers use Java frameworks and a common generic toolbox to keep to schedule and ensure interoperability. The architecture has three levels: the spectral database core (figure 2), which is embedded in both the webservices and the web-portal, the latter following the model-view-controller (MVC) pattern (schema 5).

There are two ways to access the data: through a web-portal or through a webservice. Each service on the web-portal has its own specific interface, with user access control.

Additional interfaces, such as spectral annotation tools, will be implemented later as plugins.

The core of the database is split into stand-alone blocks, each specific to one function, so that the community can re-use them in other projects.

It was designed with the knowledge of spectrometry experts; their experience will also be required to build an interactive identification interface for spectral analysis.

The database will be available through a web-portal, open first to each partner of the MetaboHUB consortium and later to the whole community.

Two milestones are coming: the first is to provide a mechanism to import spectral data into the data-model and feed the database; the second is to define metadata around spectral analysis.

We use a user-friendly framework (Twitter Bootstrap) to design the web interfaces.

Webservices open the data stored in the spectral database to bots, which can submit massive numbers of requests and run heavy computations. We use popular technologies such as REST and JSON.
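To make the REST and JSON idea concrete, here is a small Python sketch of how a read request could be answered. The resource path `/compounds/<id>`, the record layout and the sample data are hypothetical; the poster does not specify the actual webservice interface.

```python
import json
from typing import Tuple

# Hypothetical in-memory stand-in for the spectral database core.
COMPOUNDS = {
    "42": {"name": "pinene", "spectra": ["s1", "s3"]},
}


def handle_get(path: str) -> Tuple[int, str]:
    """Map a GET path to an HTTP status code and a JSON body."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "compounds":
        record = COMPOUNDS.get(parts[1])
        if record is None:
            return 404, json.dumps({"error": "compound not found"})
        return 200, json.dumps({"id": parts[1], **record})
    return 404, json.dumps({"error": "unknown resource"})


status, body = handle_get("/compounds/42")
print(status, json.loads(body)["name"])  # 200 pinene
```

A bot only needs to parse the status code and the JSON body, which is what makes this style of interface suitable for large batches of automated requests.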


To store data in the database, we use the Hibernate framework. It preserves the relations and constraints of the model while abstracting away the technologies used for SQL queries.
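The project relies on Hibernate, a Java object-relational mapper. The Python sketch below does not use Hibernate: it swaps in the standard `sqlite3` module purely to illustrate what keeping the model's relations and constraints in storage means. Table and column names are assumptions.

```python
import sqlite3


def create_store() -> sqlite3.Connection:
    """Create an in-memory database with a compound-to-spectrum relation."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE compound (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE spectrum ("
        "id INTEGER PRIMARY KEY, "
        "technique TEXT NOT NULL, "
        "compound_id INTEGER NOT NULL REFERENCES compound(id))"
    )
    return conn


def add_compound(conn: sqlite3.Connection, name: str) -> int:
    """Insert a compound and return its generated id."""
    cursor = conn.execute("INSERT INTO compound (name) VALUES (?)", (name,))
    return cursor.lastrowid


def add_spectrum(conn: sqlite3.Connection, technique: str, compound_id: int) -> bool:
    """Insert a spectrum; return False when the relation constraint is violated."""
    try:
        conn.execute(
            "INSERT INTO spectrum (technique, compound_id) VALUES (?, ?)",
            (technique, compound_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = create_store()
pinene_id = add_compound(conn, "pinene")
linked = add_spectrum(conn, "NMR", pinene_id)  # refers to an existing compound
orphan = add_spectrum(conn, "MS", 999)         # refers to a compound that does not exist
print(linked, orphan)
```

An object-relational mapper such as Hibernate generates this kind of schema and these statements from annotated classes, so application code works with objects instead of SQL strings.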


Hiding the complexity of the full data-model makes it easier to understand: the first schema gives a light view of the entity relations, and the others focus on parts of it.

Parts of the data-model are specific to spectral techniques (such as MS-MS); others are created for future developments (such as the annotation of biological matrix spectra).

Interfaces and user access: some database queries are available to all users (get data); others require authentication (add or modify elements, data curation).
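The access rule described above (open reads, authenticated writes) can be sketched as follows. This is a Python illustration with hypothetical function names and a plain dictionary standing in for the database; it is not the portal's actual security layer.

```python
from typing import Dict, Optional


class AuthenticationRequired(Exception):
    """Raised when a write is attempted without an authenticated user."""


def get_compound(store: Dict[str, str], compound_id: str,
                 user: Optional[str] = None) -> Optional[str]:
    """Read access: available to everyone, logged in or not."""
    return store.get(compound_id)


def add_compound(store: Dict[str, str], compound_id: str, name: str,
                 user: Optional[str] = None) -> None:
    """Write access: adding or modifying data requires an authenticated user."""
    if user is None:
        raise AuthenticationRequired("log in to add or modify elements")
    store[compound_id] = name


store = {"42": "pinene"}
print(get_compound(store, "42"))  # pinene, no login needed

try:
    add_compound(store, "43", "limonene")  # anonymous write is refused
except AuthenticationRequired as error:
    print("refused:", error)

add_compound(store, "43", "limonene", user="curator")  # authenticated write succeeds
```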

Building “spectral annotation tools” is an aim of MetaboHUB’s Spectral Database. We have to create the structure that will host and link all the data provided by chemists.

Screenshot 1: a basic use of the Spectral Database, showing a compound card

Figure 1: three basic interfaces available in the web-portal

[Diagram labels: user inputs (request id list; request peak list; add compounds list / add peak list), spectral DB core, user outputs (spectra and/or compound list; matching list; matching view; success; fail)]

Schema 1: light view of the data-model

Schema 2: compounds matryoshka dolls model

Schema 3: NMR-specific entities

Schema 4: multi-dimensional NMR spectra

Figure 2: core of the Spectral Database

[Diagram labels: core (spectral db api, data model); files parser (io-spectra-files, io-chem-files); ext. resources connectors (external-tools, io-galaxy, external-banks)]

Schema 5: MVC structure

[Diagram labels: user, view, controller, model, database; arrows: sees, uses, manipulates, updates, load/store; annotations: security, core, user friendly]
