knowledge resources repositories - An ontology-based repository for combining heterogeneous kno

One characteristic that represents at the same time a strength and a weakness of the Semantic Web is the heterogeneity. The diversity of knowledge repre-sentation formalisms and the diversity of knowledge models enrich knowledge engineering by reflecting a side of the real world (different domains, diverse point of views, different cognitive models, etc.). This is counted as a positive aspect when knowledge engineers can define and create their own models for knowledge representations and design applications to extract and generate knowledge according to specific models.

However, when it comes to the principle of knowledge sharing and com-munication between software agents, specific conditions require to be fulfilled.

Communication and sharing knowledge between agents requires two levels of interoperability:

• structural and syntactic interoperability: knowledge within the seman-tic web and knowledge engineering contexts is machine-readable and models are provided to exchange structured data;

• semantic interoperability: the semantic aspect of shared knowledge requires adjustment, so that agents can reason about a shared knowl-edge without being confronted to inconsistency and ambiguity. The semantic interoperability requires a common reference model and a well-defined semantics.

Formalisms and standards for representing and sharing knowledge al-low representing and storing knowledge, within resources, having different types (ontologies, dictionaries, thesaurus, etc.). Consequently, reasoning with shared knowledge requires systems to organize and index these

knowl-edge resources for easier access. Thus multiple systems for indexing and storing knowledge resources have been created.

2.2.1 Repositories for indexing and retrieving knowledge re-sources

The increasing number of ontological resources on the web became problem-atic. On one hand, resources representing the same concepts are created independently, which leads often to resources that are too customized to be reused and generally application dependent. On the other hand, search en-gines and information retrieval models needed to be adapted to this kind of resources on the Web (see figure 2.4).

Figure 2.4: Architecture of Watson (“a gateway for the Semantic Web”) as described in [d’Aquin et al.2011]

Consequently, new search applications that automatically discover and index Semantic Web documents and answer queries about this kind of re-sources have been developed in the past decade. Swoogle [Finin et al.2005], Watson [d’Aquin et al. 2011], and OntoSelect [Buitelaaret al.2004] are some of the most popular Semantic Web resources repositories of this kind.

These repositories are designed based on the following features:

• Categorizing the semantic richness of semantic data to use it for ranking purposes;

• Representing relations between resources such as “import” or references to provide semantic clustering and navigation of resources;

• Providing access mechanisms and interfaces for software agents and human users;

These repositories solve one side of the problem, which is the aware-ness, and sharing of Semantic Web documents and ontological resources.

Thus, the models for resources representation do not consider the content of resources and do not solve semantic heterogeneity issues. Consequently, another type of repositories such as ontology libraries have been created to offer a centralized hosted approach for collecting knowledge resources.

2.2.2 Repositories for collecting and managing knowledge re-sources

The second type of resources repositories includes the systems that do not discover automatically ontological or knowledge resources on the web.

These systems rely on registered users that upload and maintain their resources. This allows collecting and storing resources from different sources and offer services of exploring and sharing knowledge. In the cat-egory of ontology repositories of this kind (ontology libraries) multiple sys-tems were developed such as BioPortal [Noyet al.2008], DAML Ontol-ogy Library², TONES Ontology Repository³, Semantic Web infrastructure [Baclawski & Schneider 2009]. Other types of repositories offer access to lan-guage resources such as TerminoTrad⁴, etc.

[d’Aquin & Noy 2012] provided a survey of ontology libraries. The au-thors defined a set of features to evaluate their usefulness by reviewing eleven ontology libraries. The criteria that were identified in this survey are:

• Purpose and coverage: each ontology library serves a set of purposes that are related to ontology development and sharing. Some ontology libraries index and collect ontologies from a specific domain;

• Library content: This feature involves the criteria of the type of proce-dure for collecting ontologies (manual, hybrid or automatic), the type of gatekeeping which is related to the validation of the submitted re-sources (manual or automatic), the metadata of ontologies and other key elements about the characteristics and types of content (mappings and relations between ontologies);

• Main functions for users: Ontology libraries are evaluated according to the main services they offer to the user. The basic functions that ontology libraries provide include search, browsing, selecting and evalu-ating ontologies. Some systems offer programmatic access to ontologies through web services and APIs.

2http://www.daml.org/ontologies/

3http://rpc295.cs.man.ac.uk:8080/repository/

4http://terminotrad.com

• Other features: this category represent all the extra features that are not considered as basic features for ontology libraries.

This survey represents a first study of a set of ontology libraries. The main contribution is the categories of features that are defined to evaluate and compare these libraries. This kind of repositories is evolving and becom-ing more and more accurate for an effective sharbecom-ing and reuse of ontological resources.

Another survey tackles the concept of ontology repositories from another perspective. [Heymans et al.2008] study a set of ontology repositories that store ontological resources based on their storage schemes. There are two cat-egories of ontology repositories which are native and database-based stores.

Native stores use the file system as storage mechanisms. This method has the advantage of supporting a large quantity of data. This type of storage is more popular thanks to its effectiveness in terms of loading and to its open-ness to possibilities of optimization. Allegrograph⁵, Jena TDB⁶, sesame⁷ and OWLIM [Kiryakov et al.2005] are ones of the most popular native stores.

Database-based stores use database management systems such as MySQL, PostgreSQL or Oracle. This model is less performant than the native storage for load and update actions but offers more advantages:

• benefit from the use of database systems such as query optimization mechanisms, transactions, persistence, access control, etc.;

• access knowledge within ontologies and other datasets within different databases. RDF queries can be translated into SQL queries which can be integrated within other SQL queries that retrieve data from other sources.

Multiple benchmarks are proposed in order to evaluate RDF triple store technologies⁸. Some ontology repositories are considered more relevant than others depending on their capabilities of performing reasoning and inference.

Evaluating RDF stores is somehow controversial since their performance de-pends on multiple parameters such as the hardware, the cache mechanisms, order of triples within queries, etc.

Dans le document An ontology-based repository for combining heterogeneous knowledge resources (Page 39-42)