Derivation and combination operators


5.1.3 Derivation and combination operators

These operations are intended to generate new resources, either by selecting part of an existing resource or by aggregating or composing multiple resources. Many tools and approaches have been proposed for this kind of knowledge engineering (see surveys [Sattler et al. 2009, Vaníček et al. 2009]).

5.1.3.1 Selection and derivation

This type of operation selects entities from a resource R to generate a new resource represented in the same content model having as content a subset of entities from R. The selection operator applies filters on the entities of the original resource in order to extract only part of this resource. The filtering options may involve restrictions based on the resource’s entities and/or other entities associated to them by means of annotations or alignments.

For instance, in a description logic ontology, this operator can select individuals in the ABox (Assertional Box), leaving the TBox (Terminological Box) untouched (as in a database selection), or it can select a subset of the TBox and hence drop the ABox entities that depend on unselected TBox entities or roles (as in a database projection).
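The two behaviours described above can be illustrated with a minimal Python sketch. It assumes a deliberately simplified ontology in which the TBox is just a set of concept names and the ABox a set of (individual, concept) assertions; the class `Ontology` and the functions `select_abox` and `project_tbox` are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ontology:
    tbox: frozenset  # concept names (a stand-in for terminological axioms)
    abox: frozenset  # (individual, concept) membership assertions


def select_abox(onto, keep_individual):
    """Selection: filter the individuals, leave the TBox untouched."""
    abox = frozenset((i, c) for (i, c) in onto.abox if keep_individual(i))
    return Ontology(onto.tbox, abox)


def project_tbox(onto, kept_concepts):
    """Projection: keep part of the TBox and drop the ABox assertions
    that depend on a concept that was not selected."""
    tbox = onto.tbox & frozenset(kept_concepts)
    abox = frozenset((i, c) for (i, c) in onto.abox if c in tbox)
    return Ontology(tbox, abox)


onto = Ontology(
    tbox=frozenset({"Person", "City"}),
    abox=frozenset({("alice", "Person"), ("bob", "Person"), ("paris", "City")}),
)
selected = select_abox(onto, lambda i: i.startswith("a"))
projected = project_tbox(onto, {"City"})
```

In this toy setting, `selected` keeps both concepts but only the assertion about `alice`, whereas `projected` keeps only the concept `City` and the assertion about `paris`.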

Let $\mathcal{R}$ be the set of resources already imported in the repository, $\mathcal{M}$ the set of content representation models available in the repository and $\mathcal{F}$ the set of filters that can be applied to a resource to extract part of its content.

The signature of the derivation operator is of the form:

$$D_{\mathcal{R},\mathcal{M},\mathcal{F}} : \mathcal{R} \times \mathcal{M} \times \mathcal{F} \to \mathcal{R} \times \mathcal{M}, \qquad (r_1, M_1, f) \mapsto (r_2, M_2), \quad r_2 \sqsubseteq r_1.$$
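As a rough illustration of this signature, the following Python sketch treats a resource as a set of entity names and a filter as a predicate on entities. These representations, the function name `derive` and the model label `"thesaurus-model"` are assumptions made for the example; the output model is taken to be the input model, following the statement that the derived resource is represented in the same content model.

```python
from typing import Callable, FrozenSet, Tuple

Entity = str
Resource = FrozenSet[Entity]
Model = str
Filter = Callable[[Entity], bool]


def derive(r1: Resource, m1: Model, f: Filter) -> Tuple[Resource, Model]:
    """Map (r1, M1, f) to (r2, M2), where r2 contains only entities of r1."""
    r2 = frozenset(e for e in r1 if f(e))
    return r2, m1  # assumption: the content model is unchanged


r1 = frozenset({"heart", "lung", "aspirin"})
r2, m2 = derive(r1, "thesaurus-model", lambda e: e != "aspirin")
```

By construction every entity of `r2` comes from `r1`, which is the containment constraint stated at the end of the signature.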

Applications in the literature propose different implementations of the selection operator under different names, such as module extraction for ontological resources [Doran et al. 2007, d’Aquin et al. 2006] or knowledge extraction for other types of resources [Wimalasuriya & Dou 2010].

5.1.3.2 Composition

Composition operators are applicable only to different resources representing common entities linked to each other using transitive links. This operator may be applied to ontologies, dictionaries, terminologies, parallel corpora, comparable corpora, alignments or annotations. It is applied to resources represented by the same content representation model, and it generates a new resource represented in the same model as the composed resources. Several approaches have proposed tools and methodologies for building knowledge resources by combining or composing entities from other resources. We assume that these tools can be adapted to the context of a knowledge resources repository and used as composition operators. For instance, [Mangeot et al. 2010, Otero & Campos 2010, Nerima & Wehrli 2008, Klavans & Tzoukermann 1995] propose tools for building multilingual lexical resources using transitivity (composition operator for terminological resources) and [Mitra & Wiederhold 2004, Klein 2001, Jannink et al. 1998] propose methodologies for composing ontologies (composition operator for ontological resources).

The composition of two alignment resources from R₁ to R₂ and from R₂ to R₃ results in a new alignment resource from R₁ to R₃. The semantics (relation type) of the resulting alignment depends on the relation types of the representation model of the input resources. We define and propose in chapter 7 some operators for composing alignment resources.

We assume that the facts of the resources to be composed are represented as triples $\langle e_x, l_i, e_y \rangle$, where two entities (node, link or expression entities) $e_x$ and $e_y$ are associated with each other by a role or relation $l_i$ (the semantics of the relation is defined in the content representation model).

Let $\mathcal{R}$ be the set of resources already imported in the repository, $\mathcal{M}$ the set of content representation models available in the repository, and $R_1 \in \mathcal{R}$ and $R_2 \in \mathcal{R}$ two resources. Let $f_1$ and $f_2$ be two facts where $f_1 = \langle e_x, l_i, e_y \rangle \in R_1$ and $f_2 = \langle e_y, l_j, e_z \rangle \in R_2$.

If $l_i$ is a transitive relation, or if the composition of two instances of link entities $l_i$ and $l_j$ is supported by the content representation model $M \in \mathcal{M}$, then the facts $f_1$ and $f_2$ are composable and their composition is a new fact $f_{1 \circ 2}$ where:

$$f_{1 \circ 2} = \langle e_x, l_i \circ l_j, e_z \rangle \in R_3,$$

where $R_3$ is a new resource represented in the model $M$.
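The composition of facts defined above can be sketched in Python as follows. This is a minimal illustration, not the operators of chapter 7: the relation names (`exactMatch`, `broader`) and the table `COMPOSITION`, which stands in for what a content representation model might declare about composable links, are hypothetical.

```python
from typing import NamedTuple, Optional, Set


class Fact(NamedTuple):
    subject: str
    link: str
    obj: str


# Hypothetical table: which pairs of links the model allows to be composed,
# and the link that results from the composition.
COMPOSITION = {
    ("exactMatch", "exactMatch"): "exactMatch",
    ("exactMatch", "broader"): "broader",
    ("broader", "exactMatch"): "broader",
    ("broader", "broader"): "broader",  # broader is treated as transitive
}


def compose_links(l_i: str, l_j: str) -> Optional[str]:
    """Return the composed link, or None if the model does not support it."""
    return COMPOSITION.get((l_i, l_j))


def compose(r1: Set[Fact], r2: Set[Fact]) -> Set[Fact]:
    """Build R3 from every pair of facts that share the middle entity e_y."""
    r3 = set()
    for f1 in r1:
        for f2 in r2:
            if f1.obj != f2.subject:
                continue  # the facts do not chain through a common entity
            link = compose_links(f1.link, f2.link)
            if link is not None:
                r3.add(Fact(f1.subject, link, f2.obj))
    return r3


r1 = {Fact("en:dog", "exactMatch", "fr:chien")}
r2 = {
    Fact("fr:chien", "broader", "de:Tier"),
    Fact("fr:chat", "exactMatch", "de:Katze"),
}
r3 = compose(r1, r2)
```

Only the first fact of `r2` chains with the fact of `r1`, so `r3` contains a single fact linking `en:dog` to `de:Tier`. The nested loop is quadratic; an index on the shared entity would be preferable for large resources.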

5.1.3.3 Aggregation

The aggregation of knowledge resources is an operator that combines multiple resources and generates an aggregated resource (see figure 5.6). The combination of these resources can be seen as a union followed by an operator that solves conflicts and inconsistencies. The idea of the aggregation is to safely combine all the entities and facts imported from a set of resources [Porello & Endriss 2011, Noy & Musen 2003, Pinto & Martins 2001, de Bruijn et al. 2004, Predoiu et al. 2005]. Depending on the representation language, the operation can take different forms.

For example, applying the aggregation operator to two ontologies expressed in a description logic (DL) language reduces to performing the union of their vocabularies and axioms:

• (merge) disjoint union of the vocabularies and axioms plus equivalence and subsumption axioms from both ontologies;

• (replace) if a named concept C of the ontology O₁ is aligned (equivalence) to a named concept D of the ontology O₂ then the operator keeps every axiom that defines C (C ≡ ... and C ⊑ ...), keeps the axioms that define D and adds the axiom C ≡ D. This is a way to replace the definitions given in O₁ by those in O₂ (used, for instance, when O₂ is considered more reliable than O₁);

• (check consistency) apply a reasoning process over the generated ontology and extract all the facts and axioms that generate inconsistencies;

• (solve inconsistency) use an operator that solves the consistency problems, otherwise annotate inconsistent facts and add an explanation.

Figure 5.6: Aggregating (Aggr) two views of resources represented with the same model; this operation gives as a result a new resource represented in the same model and two sets of alignments (A₃₁ and A₃₂) with the original resources.

This operator takes as parameters a list of resources represented using the same content representation model and uses auxiliary resources such as alignments between them (see figure 5.6). Aggregating multiple alignment resources requires that they have the same source and target resources. Aggregating multiple annotation resources requires that they annotate the same resource. We define and propose in chapter 7 some operators for aggregating alignment resources.
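The merge, check consistency and solve inconsistency steps can be sketched in Python under strong simplifying assumptions: resources are sets of (subject, link, object) triples, the alignment is a set of equivalent pairs, and the "reasoning process" is a toy check that flags any `disjointWith` fact whose two ends become equivalent after merging. The function `aggregate` and the link names are hypothetical, and setting the conflicting facts aside is only one possible resolution strategy.

```python
def aggregate(resources, alignment):
    """Return (aggregated facts, conflicting facts set aside for annotation)."""
    # Merge: union of all facts plus one equivalence fact per aligned pair.
    merged = set().union(*resources)
    merged |= {(c, "equivalentTo", d) for (c, d) in alignment}

    # Toy reasoner: group equivalent entities with a union-find structure.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, link, o in merged:
        if link == "equivalentTo":
            parent[find(s)] = find(o)

    # Check consistency: two equivalent entities cannot also be disjoint.
    conflicts = {
        (s, link, o)
        for (s, link, o) in merged
        if link == "disjointWith" and find(s) == find(o)
    }
    # Solve inconsistency (one option): remove the conflicts and report them.
    return merged - conflicts, conflicts


o1 = {
    ("o1:Car", "subClassOf", "o1:Vehicle"),
    ("o1:Car", "disjointWith", "o1:Bike"),
}
o2 = {("o2:Auto", "subClassOf", "o2:Machine")}
alignment = {("o1:Car", "o2:Auto"), ("o1:Bike", "o2:Auto")}

aggregated, conflicts = aggregate([o1, o2], alignment)
```

Here the alignment makes `o1:Car` and `o1:Bike` equivalent through `o2:Auto`, which contradicts the disjointness asserted in `o1`; that fact ends up in `conflicts` while the remaining four facts form the aggregated resource.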

The previous description of operators provides a general framework for representing knowledge engineering tasks applied to the resources that are represented in the repository. The full description of an operator is represented using the model of resource combination and combination operators (previous chapter). For each class of operators described in this taxonomy, we define the type of input, the type of output and the set of parameters.

5.2 Usage of the model and operators to create a repository for combining terminological resources

In this section, we present two example scenarios that reflect the usage of the model proposed in the previous chapter and of the operators described in the previous section. The first scenario has been fully implemented and tested. The second scenario is proposed only to illustrate another type of use case, and no experimentation has been conducted.
