HAL Id: hal-02754634
https://hal.inrae.fr/hal-02754634
Submitted on 3 Jun 2020
HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
The need of meta-database for storing and managing large amount of soil biological data: EcoBioSoil
Daniel Cluzeau, Laurence Rougé, Mario Cannavacciuolo, Anne Bellido, Claudy Jolivet, Muriel Guernion, Charlène Briard, Denis Piron, Guenola Pérès
To cite this version:
Daniel Cluzeau, Laurence Rougé, Mario Cannavacciuolo, Anne Bellido, Claudy Jolivet, et al.. The need of meta-database for storing and managing large amount of soil biological data: EcoBioSoil. 9.
International Symposium on Earthworm Ecology (ISEE), Sep 2010, Xalapa, Mexico. �hal-02754634�
The need of meta database for storing and managing The need of meta-database for storing and managing The need of meta database for storing and managing g g g
l t f il bi l i l d t E Bi S il large amount of soil biological data: EcoBioSoil large amount of soil biological data: g g EcoBioSoil
D Cl
1L R é
1M C i l
2A B llid
1C J li t
3M G i
1C B i d
1D Pi
1G Pé è
1ECOBIO
D. Cluzeau
1, L. Rougé
1, M. Cannavacciuolo
2, A. Bellido
1, C. Jolivet
3, M. Guernion
1, C. Briard
1, D. Piron
1, G. Pérès
1ECOBIO
, g , , , , , , ,
UMR 6553
(1) U i i R 1 L b UMR CNRS 6553 E Bi S i Bi l i P i (F ) d i l l @ i 1 f
(1) University Rennes 1 - Lab. UMR CNRS 6553 EcoBio - Station Biologique, Paimpont (France) daniel.cluzeau@univ-rennes1.fr (2) ESA - Angers (France)
( ) g ( )
(3) INRA US InfoSol - Orléans (France) (3) INRA US InfoSol Orléans (France)
Obj ti
Objectives
Context Objectives
I d t k th il bi l i l i f ti il ibl
Context
In order to make the soil biological informations easily accessible
C tl th i t ti l t t i f bi di it ti (2010 Mill i ) th E
to potential users in France, it is essential to create a tool
Currently, among the international strategies for biodiversity conservation (2010-Millennium), the European
to potential users in France, it is essential to create a tool allowing the harmonisation and pertinent structuring of
Union wants to develop a unified approach of biodiversity’s analysis1,2. During three years (2005-2007),
allowing the harmonisation and pertinent structuring of d t ’ t
Union wants to develop a unified approach of biodiversity s analysis . During three years (2005 2007),
ENVASSO program had investigated the pan European strategies for soil sampling and data management and
data’ sets.
ENVASSO program had investigated the pan-European strategies for soil sampling and data management, and
It has to be evolutive, interoperable with existing databases
had proposed selected recommendations, including for soil biodiversity.
It has to be evolutive, interoperable with existing databases and has to ensure data’s quality for a long term
p p , g y
In the 1960s as soil biodiversity inventories begin in European countries3 4 the same researches begin in
and has to ensure data s quality for a long-term i
In the 1960s, as soil biodiversity inventories begin in European countries3,4, the same researches begin in
conservation.
France for earthworm groups5. These data have been implemented in a first database by Bouché and Soto:
Our aim is to develop a relational meta-database
g p p y
Lombricien2000*
Our aim is to develop a relational meta-database
“ E Bi S il ” d f d l i bj ti t t
Lombricien2000*.
“ EcoBioSoil ” and, for underlying objective, to create an
The EcoBio’s research team of the Rennes 1 University (France) had continued these inventories on several
y g j
integrated environment for improving the information
y ( )
monitoring sites at different scales (local regional national and european) and on various ecosystems and
integrated environment for improving the information sharing and the multidisciplinary exploration of the soil
monitoring sites, at different scales (local, regional, national and european) and on various ecosystems and
sharing and the multidisciplinary exploration of the soil
b d
agrosystems. All these studies were completed by soil parameters and agricultural practices, and sometimes
biota data.
g y p y p g p ,
related to the other biological soil organisms (from microbiology to macrofauna) related to the other biological soil organisms (from microbiology to macrofauna).
It rapidly became evident that a database was needed to integrate and coordinate amount earthworm Frenchp y g datasets coming from different providers Furthermore it was essential to ensure the interoperability with datasets, coming from different providers. Furthermore, it was essential to ensure the interoperability with
h i i il bi di i d b ( i i l “ f d b ”6 h d b O) d i h
Material and methods
others existing soil biodiversity databases (e.g. international “Macrofauna database”6 hosted by FAO) and with
Material and methods
national (Donesol**) European (ESDB7) and international (ISRIC8) soil databases
Database design has been conducted using the entity-relationship national (Donesol ), European (ESDB ) and international (ISRIC ) soil databases.
Database design has been conducted using the entity relationship model9 and the relational theory10,11
* Lombricien2000, the database developed by the CNUSC (INRA / CNRS, Montpellier, France), and hosted on the local server of EcoBio team, which, p y ( / , p , ), , model9 and the relational theory10,11.
contains all the data generated by the work of M.B. Bouché : taxonomic descriptions of earthworm species and their related anatomo-morphologic
At present, the data model is implemented in the relational database
g y p p p g
characteristics, as well as the location and the species lists of Bouché’s samplings in Europe. p , p
management system (RDBMS) Microsoft Access© which is particularly
** Donesol the French Soil Quality Monitoring Network database developed by INRA Infosol (Orléans France) accessible via internet management system (RDBMS) Microsoft Access©, which is particularly suitable for an internal use The team is now working on a migration in
** Donesol, the French Soil Quality Monitoring Network database, developed by INRA Infosol (Orléans, France) , accessible via internet
(http://www gissol fr/outil/donesol/donesol php) suitable for an internal use. The team is now working on a migration in
h d d f b h ( l )
(http://www.gissol.fr/outil/donesol/donesol.php)
another RDBMS more adapted for web use, such as MySQL (Oracle) or PostgreSQL.
Results
PostgreSQL.Results
Structure of the EcoBioSoil meta-database Glossary
Structure of the EcoBioSoil meta database Glossary
« Metadata » is a common term referring to "data about data“
RDB* containing the metadata tables (geo- I t bilit « Metadata » is a common term referring to data about data .
If the data are the results of the analysis of samples collected on field (taxonomic RDB* containing the metadata tables (geo-
location codification used description of the Interoperability Lombricien
If the data are the results of the analysis of samples collected on field (taxonomicdetermination, countings, individual weighing, soil physico-chemical measurements,...), the location, codification used, description of the
research programs methods used )
Lombricien
Metadata Catalog
2000determination, countings, individual weighing, soil physico chemical measurements,...), the metadata are all the other informations allowing to explain these data (geo- research programs, methods used…). Metadata Catalog
g p (g
location, sampling date, land use, land management, methods used, climatic, p g , , g , , conditions,…). Metadata are thus the descriptive variables which are going to allow to) p g g group together or, on the contrary, to separate the datasets for datamining. The types of
D l
Other
Donesol metadata required for ecological databases have been reviewed in numerous articles12,13.Geostats to developmodule(s)
The Biological Data Profile14 provides standards for the different types of metadata
d h b l l d k d d d
Geostats Module
to develop
associated with biological data. In Europe, a working group is dedicated to give
d ti f t d t ’ t d d ithi th f k f th I i
Module
recommendations for metadata’s standards within the framework of the Inspire Di ecti e establishing an inf ast ct e fo spatial info mation in E ope2
… Directive, establishing an infrastructure for spatial information in Europe2.E h d l i RDB* t i i th Grassland &
i lt l Vineyard
In the literature, the term « meta-database» can have two meanings:
Existing
Each module is a RDB* containing the
data of a specific program The RDB of agricultural
land Module Vineyard
Module Bi i di 2 Natural
• A database dedicated to the metadata management (also called metadata catalog),
Existing databases
data of a specific program. The RDB of all the modules has the same
land Module DonEcoSol
Module Bioindic2 Module
Natural
Land cover A database dedicated to the metadata management (also called metadata catalog),
• A database of databases : “Strictly speaking a meta-database can be considered a
databases
which can be connected
all the modules has the same architecture in order to facilitate
o coSo Module
Module Module
A database of databases : Strictly speaking a meta database can be considered a database of databases, rather than any one integration project or technology. It collects
which can be connected, if needed
architecture in order to facilitate
queries database of databases, rather than any one integration project or technology. It collects
data from different sources and usually makes them available in new and more convenient
if needed
queries.
data from different sources and usually makes them available in new and more convenient form, or with an emphasis on a particular (…) organism.”15 This is the definition used
Local Regional N ti l European
* RDB: Relational DataBase Program’ scale , p p ( ) g
for this term in this poster.
Local Regional National European
* RDB: Relational DataBase Program’ scale
p
Applications: some illustrations Applications: some illustrations
m²) 2. EcoBioSoil is implemented in Access©
ind/m
co oSo s p e e ted ccess©
(Microsoft) Directly in this software simple 1. EcoBioSoil can be connected to ArcGis© (ESRI) to generate maps with the geolocated sampling sites.
nce (i
(Microsoft). Directly in this software, simple operations of descriptive statistic can be
© ( ) g p g p g
ndan
operations of descriptive statistic can be p og ammed to anal se datasets
abunprogrammed to analyse datasets.
otal
Figure 2: Histogram representing the earthworm’s total abundance of
rm to
g g p g
the RMQS BioDiv sampling sites (DonEcoSol module; n=109),
thwoseparated by the land cover (see figure 1.1)
Eart
3. EcoBioSoil can be connected to statistic 3. EcoBioSoil can be connected to statistic softwares like R (open source) to make more softwares, like R (open source), to make more advanced analysis as geostatistical or advanced analysis, as geostatistical or
lti i t l i multivariate analysis.
Codes used for earthworm’s species are
Figure 1.1: Map representing the land cover of theg p p g detailed in the poster : “Integration of
geolocated sampling sites of the DonEcoSol module, biodiversity in soil quality monitoring:
h l f QS
Figure 1.2: Map of the whole geolocated sampling sites of EcoBioSoil: DonEcoSol containing the data of the regional Soil Quality
M it i N t k f Bi Di it (RMQS Bi Di earthworm result of RMQS BioDiv
g p g p g ”
module (109 RMQS BioDiv sites), and all the other modules (27 monitoring sites), Monitoring Network for BioDiversity (RMQS BioDiv;
n=109) program”.
connected with the Lombricien2000 database, representing 2000 sampling sites in
F E d l Al i
n=109).
France, Europe and also Algeria.
Figure 3: Graphical display of a multivariate analysis (PCA) discriminating the
th ’ i f th RMQS Bi Di li (D E S l d l )
earthworm’s species of the RMQS BioDiv samplings (DonEcoSol module).
Conclusion - Perspectives Conclusion - Perspectives
For the moment EcoBioSoil is reserved to an internal use allowing the EcoBio team and scientific partners to examine and analyze amount datasets generated either by local studies or multi- For the moment, EcoBioSoil is reserved to an internal use, allowing the EcoBio team and scientific partners to examine and analyze amount datasets generated either by local studies or multi partnership large-scale programs Currently the trend in the biological research community is for information sharing The scientists and politics (regional national and European) are now partnership large-scale programs. Currently, the trend in the biological research community is for information sharing. The scientists and politics (regional, national and European) are now convinced that such an approach is necessary to build real biodiversity observation systems So an additional level of the EcoBioSoil interoperability will be achieved with the open source convinced that such an approach is necessary to build real biodiversity observation systems. So, an additional level of the EcoBioSoil interoperability will be achieved with the open source
b i j d l i ill ll b id i h h d b ( bi i i ) I d i f E i (i l d d i 2010
website project development, as it will allow many bridges with other databases (abiotic, taxonomic...). In order to meet requirements of pan-European strategies (included in 2010 - Millennium) for developing a soil biodiversity analysis unified approach at European level, databases interconnectivity will have to be enhanced, for instance by using the new possibilities) p g y y pp p , y , y g p offered by the web services. Moreover, the development of the open-source website will also focus on data final representation, as communication tools for public, or eco-informatic products offered by the web services. Moreover, the development of the open source website will also focus on data final representation, as communication tools for public, or eco informatic products to support ecological and environmental decision makers
to support ecological and environmental decision makers.
Bibliography
1: ENVASSO (ENVironmental ASseSment Of Soil for mOnitoring) EU 6th Framework Research Program Priority SSP 4 Policies 1 5 7: European Soil DataBase http://eusoils jrc ec europa eu/ESDB Archive/ESDB/index htm
Bibliography
1: ENVASSO (ENVironmental ASseSment Of Soil for mOnitoring) EU 6th Framework Research Program Priority SSP-4 Policies-1.5
Task 6 http://www envasso com/home htm 7: European Soil DataBase http://eusoils.jrc.ec.europa.eu/ESDB_Archive/ESDB/index.htm
8: World Soil Information Database http://www isric org/
Task 6. http://www.envasso.com/home.htm
2: Directive 2007/2/EC of the European Parliament and the Council (14/03/07) establishing an Infrastructure for Spatial 8: World Soil Information Database http://www.isric.org/
9: Chen P., Pin-Shan P., 1976. The Entity-Relationship Model - Toward a Unified View of Data. ACM Transactions on Database 2: Directive 2007/2/EC of the European Parliament and the Council (14/03/07) establishing an Infrastructure for Spatial
Information in the European Community (INSPIRE) http://inspire.jrc.ec.europa.eu/ 9: Chen P., Pin Shan P., 1976. The Entity Relationship Model Toward a Unified View of Data. ACM Transactions on Database Systems 1 (1): 9–36.
p y ( ) p // p j p /
3: Rutgers M., Schouten A. J., Bloem J., Van Eekeren N., De Goede R. G. M., Jagersop Akkerhuis G. A. J. M., Van der Wal A., Mulder y ( )
10: Codd E. F., 1970. A Relational Model of Data for Large Shared Data Banks, CACM 13, No. 6.
C., Brussaard L. and Breure A. M. (2009). Biological measurements in a nationwide soil monitoring network. European Journal of Soil
S i 60 820 832 d i 10 1111/j 1365 2389 2009 01163 11: Tardieu H., Rochfeld A., 1983. MERISE: An information system design and development methodology. Information &
M t V l 6 I 3 143 159
Science, 60: 820–832. doi: 10.1111/j.1365-2389.2009.01163.x
4: Graefe U Beylich A (2002) The German long term soil monitoring program and its implications for the knowledge of Management, Volume 6, Issue 3, p. 143-159.
12: Michener W K Brunt J W Helly J J Kirchner T B Stafford S G 1997 Non geospatial metadata for the ecological sciences 4: Graefe U., Beylich A. (2002). The German long-term soil monitoring program and its implications for the knowledge of
Enchytraeidae Christensen B Standen V (eds ): Proceedings of the 4th International Symposium on Enchytraeidae Mols 12: Michener W.K., Brunt J.W., Helly J.J., Kirchner T.B., Stafford S.G., 1997. Non geospatial metadata for the ecological sciences.
Ecological Applications 7: 330-342 Enchytraeidae. Christensen, B., Standen, V. (eds.): Proceedings of the 4th International Symposium on Enchytraeidae, Mols
Laboratory, Denmark, 2-4 June 2000 (Newsletter on Enchytraeidae No. 7). Natura Jutlandica, Occasional papers No. 2, 2002. Ecological Applications 7: 330 342.
13: Hale S.S., 2000. How to manage data badly (part 2). Bulletin of the Ecological Society of America 81:101-103.
abo ato y, e a , Ju e 000 ( e s ette o c yt ae dae o ) atu a Jut a d ca, Occas o a pape s o , 00
5: Bouché, M.B. (1972). Lombriciens de France. Ecologie et systématique. Ann. Soc. Ecol. Anim. 72,1671. 3 a e S S , 000 o to a age data bad y (pa t ) u et o t e co og ca Soc ety o e ca 8 0 03
14: FGDC Biological Data Working Group and USGS Biological Resources Division, 1999. Content standard for digital geospatial 6: Lavelle P. and Fragoso C. (Eds.). 2000. The IBOY-MACROFAUNA project: Report of an international workshop held at Bondy
( ) d metadata - biological data profile, FGDCSTD- 001.1-1999. Federal Geographic Data Committee. Washington, D.C., U.S.A.
h // k d / k / l l d b d b
(France), 19-23 June 2000. IRD, Bondy, France. 15: http://en.wikipedia.org/wiki/Biological_database#Meta-databases