Computing Minimal Projection Modules for Conjunctive Queries


Jieying Chen¹, Michel Ludwig², Yue Ma¹, and Dirk Walther³

1 Laboratoire de Recherche en Informatique, Université Paris-Sud, France {jieying.chen,yue.ma}@lri.fr

3 Fraunhofer IVI, Dresden, Germany [email protected]

Abstract. We consider the problem of extracting modules of an ontology that contain the knowledge represented by a second ontology. The knowledge to be preserved is specified using entailment of conjunctive queries over a given vocabulary. We propose a novel module notion called projection module that preserves the answers to conjunctive queries as they follow from a reference ontology. We present an algorithm for computing minimal projection modules for conjunctive queries. As target and reference ontologies we take ELH^r-terminologies. The algorithm is based on simulation notions developed for detecting logical differences between ELH^r-terminologies.

1 Introduction

Ontology comparison can help in understanding the overlap and the differences among ontologies, which is often desired when a user works with multiple knowledge sources. In this paper, we propose the notion of projection module, which allows us to compare the entailment capacities of two ontologies about a given vocabulary.

A projection module characterizes the relative knowledge of one ontology by taking another one as a reference. This can thus lead to a fine-grained ontology comparison measurement between two ontologies.

Various approaches to comparing ontologies have been suggested, including ontology mapping or alignment [12], and logical difference [19–21, 23]. Ontology matching is the process of determining correspondences, in particular the subsumption, equivalence, or disjointness relations, between two concept or role names from different ontologies. A good concept similarity measure [1, 22] is often helpful for ontology matching. In contrast, logical difference focuses on the comparison of the logical consequences entailed by each ontology and returns difference witnesses if differences are present.

When an ontology has no logical difference compared to another one, our approach further extracts sub-ontologies of the first ontology that contain the knowledge as represented by the second ontology. For example, suppose T₁ = {A₁ ⊑ A₂, A₂ ⊑ A₃}, T₂ = {A₁ ⊑ A₃ ⊓ B₁, B₁ ⊑ ∃r.A₃}, and Σ = {A₁, A₃, r}. Then T₂ entails all queries about Σ that follow from T₁. However, the projection module of T₂ with respect to T₁ and Σ consists of the axiom A₁ ⊑ A₃ ⊓ B₁ only. This means that a strict sub-ontology of T₂ is sufficient to capture all the information of T₁ about Σ. Moreover, T₂ entails the consequence A₁ ⊑ ∃r.A₃, which is not the case for T₁. Intuitively, T₂ is richer in information about Σ than T₁.

⋆ This work is partially funded by the ANR project GoAsQ (ANR-15-CE23-0022).

Ontology modularity [9, 16, 19, 21, 24, 25] is about the extraction of sub-ontologies that preserve all logical consequences over a given signature. The proposed projection module differs from such modules of a single ontology. For the example above, the basic minimal module [9], the ⊥⊤*-module, and the MEX-module [18] of T₂ w.r.t. Σ all coincide with T₂ itself. The extra axiom in T₂ compared to its projection module with respect to T₁ confirms that T₁ conveys less information about Σ than T₂, which gives a way to compare these two ontologies. For the latest results on logical inseparability (in particular, inseparability w.r.t. conjunctive queries), see [7, 8, 13], and for a survey on query inseparability, see [6].

To compute projection modules, in this paper, we generalize the notion of justification to the notion of subsumption justification as a minimal set of axioms that maintains a consequence. Our algorithm employs the classical notion of justification to compute subsumption justifications. Currently, the approaches for computing all the justifications of an ontology w.r.t. a consequence can be classified into two categories: “glass-box” [2, 5, 14, 15] and “black-box” [10, 14, 26].

We proceed as follows. Section 2 gives a brief review of the description logic EL and its extensions as well as of logical difference. In Section 3, we introduce the notion of projection module. In Section 4, the notion and the computation of role subsumption justifications are presented, which are employed in Section 5 to compute projection modules. Finally, Section 6 concludes the paper.

2 Preliminaries

We start by briefly reviewing the description logic EL and several of its extensions with range restrictions, conjunction of roles and the universal role, as well as concept subsumptions based on these extensions. For a more detailed introduction to description logics, we refer to [3, 4].

Let N_C and N_R be sets of concept names and role names. We assume these sets to be mutually disjoint and countably infinite. The sets of EL-concepts C, EL^ran-concepts D, EL^⊓-concepts E, and EL^{⊓,u}-concepts F, and the sets of ELH^r-inclusions α and EL^{ran,⊓,u}-inclusions β are built according to the grammar rules:

C ::= A | C ⊓ C | ∃r.C | dom(r)

D ::= A | D ⊓ D | ∃r.D | dom(r) | ran(r)

E ::= A | E ⊓ E | ∃R.E

F ::= A | F ⊓ F | ∃R.F | ∃u.F

where A ∈ N_C, r, s ∈ N_R, u is a fresh logical symbol (the universal role) and R = r₁ ⊓ ... ⊓ r_n with r₁, ..., r_n ∈ N_R, for n ≥ 1. We refer to inclusions also as axioms. A Γ-TBox is a finite set of Γ-inclusions, where Γ ranges over the sets of ELH^r- and EL^{ran,⊓,u}-inclusions.

The semantics is defined as usual in terms of interpretations, which interpret concept and role names and are inductively extended to complex concepts. The notions of satisfaction of a concept, axiom and TBox as well as the notions of a model and the logical consequence relation are defined as usual. We skip a detailed introduction here.

A signature is a finite set of symbols from N_C and N_R. We write sig^{N_C}(ξ) and sig(ξ) for the set of concept names and the set of concept and role names occurring in a syntactic object ξ, respectively. The symbol Σ is used as a subscript to a set of concepts or inclusions to denote that the elements only use symbols from Σ.

An ELH^r-terminology T is an ELH^r-TBox consisting of axioms α of the form A ⊑ C, A ≡ C, r ⊑ s, ran(r) ⊑ C or dom(r) ⊑ C, where A is a concept name, C is an EL-concept, and no concept name occurs more than once on the left-hand side of an axiom.¹ To simplify the presentation, we assume that terminologies do not contain axioms of the form A ≡ B or A ≡ ⊤ (after removing multiple ⊤-conjuncts) for concept names A and B. For a terminology T, let ≺_T be a binary relation over N_C such that A ≺_T B iff there is an axiom of the form A ⊑ C or A ≡ C in T such that B ∈ sig(C). A terminology T is acyclic if the transitive closure ≺⁺_T of ≺_T is irreflexive; otherwise T is cyclic. A concept name A is said to be conjunctive in T iff there exist concept names B₁, ..., B_n, n > 0, such that A ≡ B₁ ⊓ ... ⊓ B_n ∈ T; otherwise A is said to be non-conjunctive in T. An ELH^r-terminology T is normalised iff it only contains axioms of the forms ϕ ⊑ B₁ ⊓ ... ⊓ B_n, A ⊑ ∃r.B, A ⊑ dom(r), r ⊑ s, A ≡ B₁ ⊓ ... ⊓ B_m, and A ≡ ∃r.B, where ϕ ∈ {A, dom(s), ran(s)}, n ≥ 1, m ≥ 2, A, B, B_i ∈ N_C, r, s ∈ N_R, and each conjunct B_i is non-conjunctive in T. Every ELH^r-terminology T can be normalised in polynomial time such that the resulting terminology is a conservative extension of T [17].

A subset M ⊆ T is called a justification for an ELH-concept inclusion α from T iff M ⊨ α and M′ ⊭ α for every M′ ⊊ M. We denote the set of all justifications for an ELH-concept inclusion α from an ELH-terminology T with Just_T(α). This set may contain exponentially many justifications in the number of axioms in T.

We now recall the notion of logical difference for concept subsumption queries, instance queries and conjunctive queries from [17, 20].

Definition 1 (Logical Difference). The EL^{ran,⊓,u}-subsumption query difference and the conjunctive query difference between T₁ and T₂ w.r.t. Σ are the sets cDiff_Σ(T₁, T₂) and qDiff_Σ(T₁, T₂), where

– ϕ ∈ cDiff_Σ(T₁, T₂) iff ϕ is an EL^{ran,⊓,u}_Σ-inclusion, T₁ ⊨ ϕ and T₂ ⊭ ϕ;

– (A, q(a)) ∈ qDiff_Σ(T₁, T₂) iff A is a Σ-ABox and q(a) is a Σ-conjunctive query such that (T₁, A) ⊨ q(a) and (T₂, A) ⊭ q(a).

¹ A concept equation A ≡ C stands for the inclusions A ⊑ C and C ⊑ A.

The following theorem states that EL^{ran,⊓,u}-subsumption queries are sufficient to detect the absence of conjunctive query differences (Lemmas 62 and 63 in [17]).

Theorem 1. cDiff_Σ(T₁, T₂) = ∅ iff qDiff_Σ(T₁, T₂) = ∅.

If the set cDiff_Σ(T₁, T₂) is not empty, then it typically contains infinitely many concept inclusions. We make use of the primitive witnesses theorems from [17], which state that if there is a concept inclusion difference in cDiff_Σ(T₁, T₂), then there exists an inclusion in cDiff_Σ(T₁, T₂) of one of the following three types δ₁, δ₂, δ₃, which are built according to the grammar rules below:

δ₁ ::= r ⊑ s

δ₂ ::= D ⊑ A

δ₃ ::= A ⊑ E | dom(r) ⊑ E | ran(r) ⊑ E

where δ₁ ranges over role inclusions, δ₂ is an EL^ran-inclusion, and δ₃ is an EL^{ran,⊓,u}-inclusion. Note that each of these inclusions has either a simple left-hand or a simple right-hand side.

The set of all EL^{ran,⊓,u}-subsumption difference witnesses is defined as Wtn_Σ(T₁, T₂) := (roleWtn_Σ(T₁, T₂), lhsWtn_Σ(T₁, T₂), rhsWtn_Σ(T₁, T₂)), where the set roleWtn_Σ(T₁, T₂) consists of all type-δ₁ inclusions in cDiff_Σ(T₁, T₂), and the sets lhsWtn_Σ(T₁, T₂) ⊆ (Σ ∩ N_C) ∪ {dom(r) | r ∈ Σ} ∪ {ran(r) | r ∈ Σ} and rhsWtn_Σ(T₁, T₂) ⊆ N_C ∩ Σ of left-hand and right-hand subsumption query difference witnesses consist of the left-hand sides of the type-δ₃ inclusions in cDiff_Σ(T₁, T₂) and the right-hand sides of the type-δ₂ inclusions in cDiff_Σ(T₁, T₂), respectively. Consequently, the set Wtn_Σ(T₁, T₂) can be seen as a finite representation of the set cDiff_Σ(T₁, T₂) [17], which is typically infinite. As a corollary of the primitive witness theorems in [17], the representation is complete in the following sense: cDiff_Σ(T₁, T₂) = ∅ iff Wtn_Σ(T₁, T₂) = (∅, ∅, ∅). Thus, deciding the existence of concept inclusion differences is equivalent to deciding non-emptiness of the three witness sets.

3 Projection Modules

A terminology T₁ together with a signature Σ and a query language Q determine a set Φ of queries from Q, formulated using only symbols from Σ, that follow from T₁. Intuitively, Φ captures the Q-knowledge of T₁ about Σ. In this paper, we consider conjunctive queries or, equivalently, EL^{ran,⊓,u}-subsumption queries as the query language. A projection module for Φ of another terminology T₂ is a subset of T₂ that entails the queries in Φ. The Q-knowledge in Φ of T₁ about Σ is captured by the projection module of T₂ and it is represented using axioms from T₂. The projection can be seen as transforming the finite representation of Φ with axioms from T₁ into a finite representation of Φ with axioms from T₂. As an application, the projection of Φ onto T₂ allows us to determine the axioms that implement Φ in T₂. For instance, projection modules enable the verification of compliance with certain modelling guidelines and standards by the axioms of T₂ that implement Φ. In particular, projection modules that are minimal w.r.t. set inclusion are relevant for this task. Note that existing module notions suggest computing a module of T₂ for the given signature Σ to obtain the Q-knowledge of T₂ about Σ, whereas we are only interested in Φ, i.e. the Q-knowledge of T₁ about Σ. Consequently, extracting modules of T₂ for Σ will yield modules that generally contain irrelevant axioms and that are, therefore, likely too large for manual inspection.

Definition 2 (Projection Module). Let ρ = ⟨T₁, Σ, T₂⟩ be a projection setting. A set M ⊆ T₂ is a conjunctive query projection module under ρ iff for every Σ-ABox A and every Σ-conjunctive query q(a): (T₁, A) ⊨ q(a) implies (M, A) ⊨ q(a).

There may exist several (even exponentially many) minimal projection modules.

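Example 1. Let T₁ = {A₁ ⊑ A₄}, T₂ = {A₁ ≡ A₂ ⊓ A₃, A₂ ⊑ A₄, A₃ ⊑ A₄} and Σ = {A₁, A₄}. Then T₂ itself is a conjunctive query projection module under ρ = ⟨T₁, Σ, T₂⟩. It is not minimal, however: each of the subsets {A₁ ≡ A₂ ⊓ A₃, A₂ ⊑ A₄} and {A₁ ≡ A₂ ⊓ A₃, A₃ ⊑ A₄} already entails A₁ ⊑ A₄, and neither of them can be reduced further.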

The notion of projection module is not symmetric, i.e., a projection module under ⟨T₁, Σ, T₂⟩ is not necessarily the same as a projection module under ⟨T₂, Σ, T₁⟩. When the reference terminology equals the terminology from which axioms are to be extracted, a reflexive projection setting of the form ρ = ⟨T, Σ, T⟩ is used. We also call a projection module under such a setting ρ an automorphic projection module.

An interesting application of the projection module is to compare entailment capacities of two terminologies, as shown in the following example.

Example 2. Let T₁ = {A_i ⊑ A_{i+1} | 1 ≤ i ≤ n}, T₂ = {A₁ ⊑ A_n, B₁ ⊑ B₂}, and Σ = {A₁, A_n, B₁, B₂}. Intuitively, T₁ contains less information about Σ than T₂. The minimal projection module under ρ = ⟨T₁, Σ, T₂⟩ is M = {A₁ ⊑ A_n}. By using |M|/|T₂| = 1/2 as a measure, we see that only half of T₂ is about the information of T₁ w.r.t. Σ.
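The measure of Example 2 can be reproduced with a small brute-force sketch. It rests on strong simplifying assumptions that are not part of the paper: axioms are pairs (A, B) standing for atomic inclusions A ⊑ B, entailment is plain reachability, and only atomic inclusions between Σ-names are checked instead of arbitrary conjunctive queries. The function names are hypothetical.

```python
from itertools import combinations

def entails(axioms, lhs, rhs):
    """Atomic entailment lhs ⊑ rhs, decided by reachability over the axioms."""
    seen, todo = {lhs}, [lhs]
    while todo:
        cur = todo.pop()
        if cur == rhs:
            return True
        for a, b in axioms:
            if a == cur and b not in seen:
                seen.add(b)
                todo.append(b)
    return False

def minimal_projection_modules(t1, sigma, t2):
    """All subset-minimal M ⊆ t2 preserving the Σ-inclusions entailed by t1."""
    required = [(a, b) for a in sorted(sigma) for b in sorted(sigma)
                if a != b and entails(t1, a, b)]
    axioms, found = sorted(t2), []
    for size in range(len(axioms) + 1):       # smallest candidates first
        for cand in map(frozenset, combinations(axioms, size)):
            if any(m <= cand for m in found):
                continue                       # contains a smaller module
            if all(entails(cand, a, b) for a, b in required):
                found.append(cand)
    return found

n = 4
T1 = {(f"A{i}", f"A{i + 1}") for i in range(1, n + 1)}
T2 = {("A1", f"A{n}"), ("B1", "B2")}
SIGMA = {"A1", f"A{n}", "B1", "B2"}

modules = minimal_projection_modules(T1, SIGMA, T2)
ratio = len(modules[0]) / len(T2)
print(modules, ratio)
```

For n = 4 the only minimal module found is {A₁ ⊑ A₄}, so the ratio is 1/2, as in the example.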

4 Representing Projection Modules using Justifications

In this section, we introduce justification notions for sets of inclusions and we show how they can be combined to obtain minimal projection modules. We start by defining the notion of role subsumption justification for the set of EL^{ran,⊓,u}-inclusions of the form r ⊑ s, where r, s ∈ Σ (cf. Section 2).

Definition 3 (Role Subsumption Justification). Let ρ = ⟨T₁, Σ, T₂⟩. A set M is called a role subsumption module under ρ iff M ⊆ T₂ and for every r, s ∈ N_R ∩ Σ, T₁ ⊨ r ⊑ s implies M ⊨ r ⊑ s. A role subsumption justification under ρ is a role subsumption module under ρ that is minimal w.r.t. ⊊.

We denote the set of all role subsumption justifications under ρ as J_ρ^R.

Lemma 1. Let J ∈ J_ρ^R. Then roleWtn_Σ(T₁, J) = ∅.

We continue by defining the notion of subsumption justification for inclusions of EL^{ran,⊓,u} that are of the form D ⊑ F, where D ranges over EL^ran-concepts and F over EL^{⊓,u}-concepts.

Definition 4 (Subsumption Justification). A subsumption setting is a tuple χ = ⟨T₁, X₁, Σ, T₂, X₂⟩, where T₁ and T₂ are normalised ELH^r-terminologies, Σ is a signature, and X₁, X₂ ∈ N_C ∪ {dom(r), ran(r) | r ∈ N_R}.

A set M is called a subsumer module under χ iff M ⊆ T₂ and for every C ∈ EL^{ran,⊓,u}_Σ, T₁ ⊨ X₁ ⊑ C implies M ⊨ X₂ ⊑ C. M is called a subsumee module under χ iff M ⊆ T₂ and for every C ∈ EL^{ran,⊓,u}_Σ, T₁ ⊨ C ⊑ X₁ implies M ⊨ C ⊑ X₂.

M is called a subsumption module under χ iff M is a subsumer module and a subsumee module under χ as well as a role subsumption module under ⟨T₁, Σ, T₂⟩. A subsumee (resp. subsumer, subsumption) justification under χ is a subsumee (resp. subsumer, subsumption) module under χ that is minimal w.r.t. ⊊.

We denote the set of all subsumee (resp. subsumer, subsumption) justifications under χ as J_χ^← (resp. J_χ^→, J_χ), where χ = ⟨T₁, X₁, Σ, T₂, X₂⟩.

Using Definitions 1 and 4, we obtain the following lemma stating the absence of certain concept names, and domain and range restrictions of role names, as left-hand and right-hand difference witnesses between a reference terminology T₁ and a subsumer and subsumee justification of a second terminology T₂.

For a signature Σ, let Σ^dom = {dom(t) | t ∈ N_R ∩ Σ} and Σ^ran = {ran(t) | t ∈ N_R ∩ Σ} be the sets consisting of concepts of the form dom(t) and ran(t) for every role name t in Σ, respectively. Furthermore, let Σ^ζ = Σ ∪ Σ^dom ∪ Σ^ran for ζ ∈ C^Σ.

Lemma 2. Let ϕ ∈ (Σ ∩ N_C) ∪ Σ^dom ∪ Σ^ran and let A ∈ Σ ∩ N_C. Additionally, let χ = ⟨T₁, ϕ, Σ, T₂, ϕ⟩ and χ′ = ⟨T₁, A, Σ, T₂, A⟩. Then:

– ϕ ∉ lhsWtn_Σ(T₁, J_χ) for every J_χ ∈ J_χ^→;

– A ∉ rhsWtn_Σ(T₁, J_χ′) for every J_χ′ ∈ J_χ′^←.

To obtain subsumption modules, we can use an operator ⊗ to combine sets of role, subsumer and subsumee justifications, one justification for each potential difference witness that needs to be prevented; cf. Lemmas 1 and 2. Given a set S and 𝕊₁, 𝕊₂ ⊆ 2^S, let 𝕊₁ ⊗ 𝕊₂ := {S₁ ∪ S₂ | S₁ ∈ 𝕊₁, S₂ ∈ 𝕊₂}. For instance, if 𝕊₁ = {{α₁, α₂}, {α₃}} and 𝕊₂ = {{α₁, α₃}, {α₄, α₅}}, then 𝕊₁ ⊗ 𝕊₂ = {{α₁, α₂, α₃}, {α₁, α₂, α₄, α₅}, {α₃, α₄, α₅}, {α₁, α₃}}. For a set 𝕄 of sets, we define a function Minimise_⊆(𝕄) as follows: M ∈ Minimise_⊆(𝕄) iff M ∈ 𝕄 and there does not exist a set M′ ∈ 𝕄 such that M′ ⊊ M. Continuing the previous example, Minimise_⊆(𝕊₁ ⊗ 𝕊₂) = {{α₁, α₃}, {α₁, α₂, α₄, α₅}, {α₃, α₄, α₅}}, since {α₁, α₃} ⊊ {α₁, α₂, α₃}.
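The two operations can be sketched directly over Python frozensets. This is only an illustration of the definitions; the names `otimes` and `minimise` are chosen here, and the strings "a1" to "a5" stand for the axioms α₁ to α₅ of the running example.

```python
def otimes(family1, family2):
    """Pairwise unions: {S1 ∪ S2 | S1 ∈ family1, S2 ∈ family2}."""
    return {s1 | s2 for s1 in family1 for s2 in family2}

def minimise(family):
    """Keep exactly the members that have no proper subset in the family."""
    return {m for m in family if not any(other < m for other in family)}

S1 = {frozenset({"a1", "a2"}), frozenset({"a3"})}
S2 = {frozenset({"a1", "a3"}), frozenset({"a4", "a5"})}

product = otimes(S1, S2)
minimal = minimise(product)
for member in sorted(minimal, key=sorted):
    print(sorted(member))
```

The product has four members; minimisation drops {α₁, α₂, α₃} because it properly contains {α₁, α₃}. Note that combining with an empty family yields the empty family, whereas {∅} is the neutral element of ⊗.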

We now use ⊗ and Minimise_⊆(·) to combine sets of role, subsumer and subsumee justifications to obtain the set of all minimal projection modules.

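Theorem 2. Let 𝕄_ρ be the set of all projection modules under ρ = ⟨T₁, Σ, T₂⟩ that are minimal w.r.t. ⊊. Then the following holds, where χ(ψ) = ⟨T₁, ψ, Σ, T₂, ψ⟩:

$$\mathbb{M}_{\rho} = \mathrm{Minimise}_{\subseteq}\Bigl( J^{R}_{\rho} \;\otimes \bigotimes_{\varphi \in (\Sigma \cap N_C) \cup \Sigma^{dom} \cup \Sigma^{ran}} J^{\rightarrow}_{\chi(\varphi)} \;\otimes \bigotimes_{A \in \Sigma \cap N_C} J^{\leftarrow}_{\chi(A)} \Bigr)$$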

5 Computing Projection Modules

In this section, we present algorithms for computing role, subsumer and subsumee justifications. The algorithms use the following notion of a cover of a set of sets. For a finite set S and a set 𝕋 ⊆ 2^S, we say that a set 𝕄 ⊆ 2^S is a cover of 𝕋 iff 𝕄 ⊆ 𝕋 and for every M ∈ 𝕋, there exists M′ ∈ 𝕄 such that M′ ⊆ M. In other words, a cover is a subset of 𝕋 containing all sets from 𝕋 that are minimal w.r.t. ⊊. Therefore, a cover of the set of all subsumption modules also contains all subsumption justifications. We will use covers to characterise the output of our algorithms to ensure that all justifications have been computed.
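The cover condition is easy to check mechanically. The following sketch, with hypothetical function names and a made-up family of sets, tests the definition and illustrates that the family of minimal sets is the smallest cover.

```python
def is_cover(candidate, family):
    """candidate ⊆ family, and every member of family contains a member of candidate."""
    return candidate <= family and all(
        any(c <= member for c in candidate) for member in family
    )

def minimal_sets(family):
    """Members of the family without a proper subset in the family."""
    return {m for m in family if not any(other < m for other in family)}

FAMILY = {frozenset("ab"), frozenset("abc"), frozenset("cd"), frozenset("bcd")}
MINIMAL = minimal_sets(FAMILY)

print(is_cover(MINIMAL, FAMILY))               # the minimal sets form a cover
print(is_cover(FAMILY, FAMILY))                # so does the family itself
print(is_cover({frozenset("abc")}, FAMILY))    # misses {a, b}: not a cover
```

Any cover must contain every minimal set, because a minimal set has no other member of the family below it.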

Algorithm 7 shows how to collect the relevant Σ-role inclusions for role subsumption justifications (cf. Def. 3). The following proposition states its correctness.

Proposition 1. J_ρ^R = Cover^R(T₁, Σ, T₂), where ρ = ⟨T₁, Σ, T₂⟩.
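The role case is simple enough to sketch end to end. The code below follows the structure of Algorithm 7 under assumptions made for illustration only: a terminology is reduced to its role inclusions, written as pairs (r, s) for r ⊑ s; entailment of role inclusions is reflexive-transitive reachability; and Just_T₂ is computed by brute-force subset enumeration rather than by a dedicated justification procedure. All function names are hypothetical.

```python
from itertools import combinations

def entails_role(axioms, r, s):
    """r ⊑ s follows iff s is reachable from r (reflexively) via role inclusions."""
    seen, todo = {r}, [r]
    while todo:
        cur = todo.pop()
        if cur == s:
            return True
        for a, b in axioms:
            if a == cur and b not in seen:
                seen.add(b)
                todo.append(b)
    return False

def just(t2, r, s):
    """Brute-force Just_T2(r ⊑ s): all subset-minimal entailing subsets of t2."""
    axioms, found = sorted(t2), []
    for size in range(len(axioms) + 1):
        for cand in map(frozenset, combinations(axioms, size)):
            if not any(j <= cand for j in found) and entails_role(cand, r, s):
                found.append(cand)
    return set(found)

def otimes(f1, f2):
    return {a | b for a in f1 for b in f2}

def minimise(family):
    return {m for m in family if not any(o < m for o in family)}

def cover_r(t1, sigma, t2):
    family = {frozenset()}                        # start with the neutral element
    for r in sorted(sigma):
        for s in sorted(sigma):
            if entails_role(t1, r, s):            # inclusion to be preserved
                family = otimes(family, just(t2, r, s))
    return minimise(family)

T1 = {("r", "s")}
T2 = {("r", "t"), ("t", "s"), ("r", "u"), ("u", "s"), ("s", "v")}
cover = cover_r(T1, {"r", "s"}, T2)
print(cover)
```

Here T₂ offers two independent ways of deriving r ⊑ s, so two role subsumption justifications are returned. If T₂ did not entail a required inclusion, the corresponding set of justifications would be empty and so would the result.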

5.1 Computing Subsumer Justifications

We now present our algorithm for computing subsumer justifications. The algorithm relies on the notion of a subsumer simulation between terminologies from [11, 23]. For defining the simulation notion, we need a specific notion of reachability. For ϕ ∈ N_C ∪ {dom(r), ran(r) | r ∈ N_R} and a normalised ELH^r-terminology T, let F_T(ϕ) be the smallest set closed under the following three conditions: ϕ ∈ F_T(ϕ); Y ∈ F_T(ϕ) if X ∈ F_T(ϕ), T ⊨ X ⊑ X′ and X′ ⋈ ∃r.Y ∈ T with ⋈ ∈ {⊑, ≡}; and dom(r) ∈ F_T(ϕ) if ran(r) ∈ F_T(ϕ).
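The set F_T(ϕ) is a least fixpoint and can be computed with a worklist. The sketch below assumes a hypothetical encoding: concept names are strings, dom(r) and ran(r) are tuples ("dom", r) and ("ran", r), each axiom X′ ⋈ ∃r.Y is a triple (X′, r, Y), and `subsumes` is a stand-in oracle for T ⊨ X ⊑ X′ that a real implementation would delegate to a reasoner.

```python
def reach(phi, exist_axioms, subsumes):
    """Least set containing phi and closed under the three conditions for F_T(phi)."""
    result, todo = {phi}, [phi]
    while todo:
        x = todo.pop()
        # Successors Y with T |= x ⊑ X' and X' ⋈ ∃r.Y ∈ T.
        new = {y for (x_prime, _role, y) in exist_axioms if subsumes(x, x_prime)}
        # ran(r) in the set forces dom(r) into the set.
        if isinstance(x, tuple) and x[0] == "ran":
            new.add(("dom", x[1]))
        for y in new - result:
            result.add(y)
            todo.append(y)
    return result

# Toy terminology: A ⊑ B, B ⊑ ∃r.C, C ≡ ∃s.D, E ⊑ ∃r.F
ATOMIC = {("A", "B")}
EXIST = [("B", "r", "C"), ("C", "s", "D"), ("E", "r", "F")]

def subsumes(x, x_prime):
    """Toy oracle: reflexivity plus the listed atomic inclusions."""
    return x == x_prime or (x, x_prime) in ATOMIC

f_a = reach("A", EXIST, subsumes)
f_ran = reach(("ran", "r"), EXIST, subsumes)
print(f_a, f_ran)
```

In the toy terminology, F_T(A) contains A, C and D, while F is never reached because A is not subsumed by E.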

Definition 5 (Subsumer Simulation). A relation S ⊆ N(T₁, Σ) × N(T₂, Σ), where N(T, Σ) = {X, dom(r), ran(r) | X, r ∈ (sig(T) ∪ Σ), X ∈ N_C, r ∈ N_R}, is a Σ-subsumer simulation from T₁ to T₂ iff the following conditions are satisfied:

(S^→_{N_C}) if (X₁, X₂) ∈ S, then for every ϕ ∈ Σ ∪ Σ^dom with T₁ ⊨ X₁ ⊑ ϕ, it holds that T₂ ⊨ X₂ ⊑ ϕ;

(S^→_∃) if (X₁, X₂) ∈ S and X₁′ ⋈₁ ∃r.Y₁ ∈ T₁ with ⋈₁ ∈ {⊑, ≡} such that T₁ ⊨ X₁ ⊑ X₁′ and T₁ ⊨ r ⊑ s for some s ∈ Σ, then there exists X₂′ ⋈₂ ∃r′.Y₂ ∈ T₂ with ⋈₂ ∈ {⊑, ≡} such that T₂ ⊨ X₂ ⊑ X₂′, (Y₁, Y₂) ∈ S, and T₂ ⊨ r′ ⊑ s for every s ∈ Σ with T₁ ⊨ r ⊑ s.

We write T₁ ∼^→_Σ T₂ iff there exists a Σ-subsumer simulation S from T₁ to T₂ such that for every ϕ ∈ (Σ ∩ N_C) ∪ Σ^dom ∪ Σ^ran: (ϕ, ϕ) ∈ S, and for every ψ₁ ∈ F_T₁(ϕ), there exists a ψ₂ ∈ F_T₂(ϕ) such that (ψ₁, ψ₂) ∈ S.

For X₁, X₂ ∈ N_C, we write ⟨T₁, X₁⟩ ∼^→_Σ ⟨T₂, X₂⟩ iff there exists a Σ-subsumer simulation S from T₁ to T₂ with (X₁, X₂) ∈ S for which T₁ ∼^→_Σ T₂.

A subsumer simulation conveniently captures the set of subsumers in the following sense: if a Σ-subsumer simulation from T₁ to T₂ contains the pair (X₁, X₂), then X₂ entails w.r.t. T₂ all subsumers of X₁ w.r.t. T₁ that are formulated in the signature Σ. Formally, we obtain the following theorems from [23].

Theorem 3. It holds that T₁ ∼^→_Σ T₂ iff lhsWtn_Σ(T₁, T₂) = ∅.

Theorem 4. Let ⟨T₁, X₁⟩ ∼^→_Σ ⟨T₂, X₂⟩. Then for every D ∈ EL^{ran,⊓,u}_Σ: T₁ ⊨ X₁ ⊑ D implies T₂ ⊨ X₂ ⊑ D.

Algorithm 2 collects all the axioms necessary to satisfy the conditions of Definition 5. Observe that Cover_→(T₁, X₁, Σ, T₂, X₂) may be called several times with the same arguments during the execution of the algorithm. A possible optimisation is to store return values in memory in order to retrieve them more quickly for subsequent calls.
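The suggested optimisation is ordinary memoisation. The following toy sketch shows the idea with Python's `functools.lru_cache`; the table `SUCC`, the placeholder axiom names and the function `cover` are hypothetical and only mimic the shape of a recursive cover computation over an acyclic structure, not the actual algorithm.

```python
from functools import lru_cache

# Hypothetical recursion structure: which pairs a call recurses into.
SUCC = {"X": ("Y", "Z"), "Y": ("W",), "Z": ("W",), "W": ()}
calls = []  # records every call that is actually executed

@lru_cache(maxsize=None)
def cover(x1, x2):
    """Toy recursive cover: one placeholder axiom per pair, combined with ⊗."""
    calls.append((x1, x2))
    result = frozenset({frozenset({f"ax({x1},{x2})"})})
    for y in SUCC[x1]:
        sub = cover(y, y)  # repeated arguments are answered from the cache
        result = frozenset(a | b for a in result for b in sub)
    return result

top = cover("X", "X")
print(len(calls), cover.cache_info().hits)
```

The pair (W, W) is requested twice but computed once. Caching by argument tuple is only sound here because the result depends on nothing but the arguments; the terminologies and the signature would have to be part of the key, or fixed, in a real implementation.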

The following theorem shows that Algorithm 2 indeed computes the set of subsumer modules, thus producing a cover of subsumer justifications.

Theorem 5. Let χ = ⟨T₁, X₁, Σ, T₂, X₂⟩ and 𝕄 := Cover^cq_→(T₁, X₁, Σ, T₂, X₂). If T₁ ∼^→_Σ T₂, then 𝕄 is a cover of the set of subsumer justifications under χ.

5.2 Computing Subsumee Justifications

Next, we present the algorithm for computing subsumee justifications based on the notion of a subsumee simulation. The basic idea of the algorithm is to collect as few axioms from T₂ as possible to maintain the subsumee simulation between Σ-concept names.

First we present some auxiliary notions for handling conjunctions on the left-hand side of subsumptions. We define for each concept name X a so-called definitorial forest consisting of sets of axioms of the form Y ≡ Y₁ ⊓ ... ⊓ Y_n, which can be thought of as forming trees. Any subsumee justification under ⟨T₁, X₁, Σ, T₂, X₂⟩ contains the axioms of a selection of these trees, i.e., one tree for every conjunction formulated over Σ that entails X₁ w.r.t. T₁. Formally, we define the set DefForest^⊓_T(X) ⊆ 2^T to be the smallest set closed under the following conditions: ∅ ∈ DefForest^⊓_T(X); {α} ∈ DefForest^⊓_T(X) for α = X ≡ X₁ ⊓ ... ⊓ X_n ∈ T; and Γ ∪ {α} ∈ DefForest^⊓_T(X) for Γ ∈ DefForest^⊓_T(X) with Z ≡ Z₁ ⊓ ... ⊓ Z_k ∈ Γ and α = Z_i ≡ Z_i¹ ⊓ ... ⊓ Z_iⁿ ∈ T. Given Γ ∈ DefForest^⊓_T(X), we set leaves(Γ) := sig(Γ) \ {Y | Y ≡ C ∈ Γ} if Γ ≠ ∅, and leaves(Γ) := {X} otherwise. We denote the maximal element of DefForest^⊓_T(X) w.r.t. ⊆ with max-tree^⊓_T(X). Finally, we set non-conj_T(X) := leaves(max-tree^⊓_T(X)).

For example, let T = {α₁, α₂, α₃}, where α₁ := X ≡ Y ⊓ Z, α₂ := Y ≡ Y₁ ⊓ Y₂, and α₃ := Z ≡ Z₁ ⊓ Z₂. Then DefForest^⊓_T(X) = {∅, {α₁}, {α₁, α₂}, {α₁, α₃}, {α₁, α₂, α₃}}. We have that leaves({α₁, α₃}) = {Y, Z₁, Z₂}, max-tree^⊓_T(X) = {α₁, α₂, α₃}, and non-conj_T(X) = {Y₁, Y₂, Z₁, Z₂}.
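The definitorial forest of this example can be computed with a short closure loop. In the sketch below, which uses an encoding chosen for illustration, a terminology is reduced to its conjunctive definitions, given as a dictionary from the defined name to its conjuncts; since a terminology contains at most one definition per name, each axiom is identified with the name it defines, so α₁, α₂, α₃ appear as "X", "Y", "Z".

```python
def def_forest(defs, x):
    """All partial definitorial trees rooted at x, as sets of defined names."""
    forest = {frozenset()}
    if x in defs:
        forest.add(frozenset({x}))
    todo = list(forest)
    while todo:
        gamma = todo.pop()
        for z in gamma:                      # an axiom already in the tree
            for z_i in defs[z]:              # one of its conjuncts
                bigger = gamma | {z_i}
                if z_i in defs and bigger not in forest:
                    forest.add(bigger)       # extend by the definition of z_i
                    todo.append(bigger)
    return forest

def leaves(defs, gamma, x):
    """Names occurring in gamma that are not defined by an axiom of gamma."""
    if not gamma:
        return {x}
    signature = set(gamma) | {c for z in gamma for c in defs[z]}
    return signature - set(gamma)

def non_conj(defs, x):
    max_tree = max(def_forest(defs, x), key=len)   # the unique largest tree
    return leaves(defs, max_tree, x)

DEFS = {"X": ("Y", "Z"), "Y": ("Y1", "Y2"), "Z": ("Z1", "Z2")}
forest = def_forest(DEFS, "X")
print(sorted(sorted(g) for g in forest))
print(leaves(DEFS, frozenset({"X", "Z"}), "X"), non_conj(DEFS, "X"))
```

The loop terminates for acyclic terminologies, which is the case considered in the paper. For a name without a conjunctive definition the forest is {∅} and the name itself is its only leaf.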

We say that a concept name A is Σ-entailed w.r.t. T iff there is an EL^ran_Σ-concept C such that T ⊨ C ⊑ A; and we say that a role name s is Σ-entailed in T iff there exists s′ ∈ N_R ∩ Σ such that T ⊨ s′ ⊑ s.

We now define the notion of a subsumee simulation from T₁ to T₂ as a subset of sig^{N_C}(T₁) × sig^{N_C}(T₂) × C^Σ_T₁, where C^Σ_T₁ := {ε} ∪ (N_R ∩ (Σ ∪ sig(T₁))) is the range of role contexts.

Definition 6 (Subsumee Simulation). A relation S ⊆ sig^{N_C}(T₁) × sig^{N_C}(T₂) × C^Σ_T₁ is a Σ-subsumee simulation from T₁ to T₂ iff the following conditions hold:

(S^←_{N_C}) if (X₁, X₂, ζ) ∈ S, then for every ϕ ∈ Σ^ζ and for every X₂′ ∈ non-conj_T₂(X₂) with T₂ ⊭ ran(ζ) ⊑ X₂′, T₁ ⊨ ϕ ⊑ X₁ implies T₂ ⊨ ϕ ⊑ X₂′;

(S^←_∃) if (X₁, X₂, ζ) ∈ S and X₁ ≡ ∃r.Y₁ ∈ T₁ such that T₁ ⊨ s ⊑ r for s ∈ Σ and Y₁ is Σ-entailed w.r.t. T₁, then for every X₂′ ∈ non-conj_T₂(X₂) not entailed by dom(s) or ran(ζ) w.r.t. T₂, there exists X₂′ ≡ ∃r′.Y₂ ∈ T₂ such that T₂ ⊨ s ⊑ r′ and (Y₁, Y₂, s) ∈ S;

(S^←_⊓) if (X₁, X₂, ζ) ∈ S and X₁ ≡ Y₁ ⊓ ... ⊓ Y_n ∈ T₁, then for every Y₂ ∈ non-conj_T₂(X₂) not entailed by ran(ζ) in T₂, there exists Y₁ ∈ non-conj_T₁(X₁) not entailed by ran(ζ) w.r.t. T₂ such that (Y₁, Y₂, ε) ∈ S.

We write T₁ ∼^←_Σ T₂ iff there is a Σ-subsumee simulation S from T₁ to T₂ such that for every A ∈ Σ ∩ N_C: (A, A, ε) ∈ S.

For ζ ∈ Σ ∩ N_R, we write ⟨T₁, X₁⟩ ∼^←_Σ,ζ ⟨T₂, X₂⟩ iff there is a Σ-subsumee simulation S from T₁ to T₂ with (X₁, X₂, ζ) ∈ S for which T₁ ∼^←_Σ T₂.

Analogously to subsumer simulations, a subsumee simulation captures the set of subsumees as it is made precise in the following theorems.

Theorem 6. T₁ ∼^←_Σ T₂ iff rhsWtn_Σ(T₁, T₂) = ∅.

Theorem 7. Let ⟨T₁, X₁⟩ ∼^←_Σ,ζ ⟨T₂, X₂⟩. Then for every C ∈ EL^ran_Σ: T₁ ⊨ C ⊑ X₁ implies that T₂ ⊨ C ⊑ X₂.

Before introducing the algorithms, we first extend the notion of Σ-entailment. We say that a concept name X is complex Σ-entailed w.r.t. T iff for every Y ∈ non-conj_T(X) one of the following conditions holds:

– there exists B ∈ Σ such that T ⊨ B ⊑ Y and T ⊭ B ⊑ X; or

– there exists Y ≡ ∃r.Z ∈ T and r, Z are Σ-entailed in T.

Otherwise, X is said to be simply Σ-entailed. For example, let T = {X ≡ X₁ ⊓ X₂, B₁ ⊑ X₁, X₂ ≡ ∃r.Z, B₂ ⊑ Z, s ⊑ r}. We have that non-conj_T(X) = {X₁, X₂}. For Σ = {B₁, B₂, s}, the role name r is Σ-entailed w.r.t. T and X is complex Σ-entailed w.r.t. T; but X is not complex Σ′-entailed w.r.t. T, where Σ′ ranges over {B₁, B₂}, {B₁, s}, {B₂, s}. Additionally, X is not complex Σ-entailed w.r.t. T ∪ {B₁ ⊑ X}.

Algorithm 3 is responsible for computing a cover of all subsumee justifications. It orchestrates three further algorithms in order to collect the axioms necessary to satisfy the three conditions of Definition 6: Algorithm 4 for Case (S^←_{N_C}), Algorithm 5 for Case (S^←_∃) and Algorithm 6 for Case (S^←_⊓). While Algorithm 4 can readily be understood, we provide some additional explanation for the remaining algorithms.

Algorithm 1: Computing a Cover of all Subsumer Justifications for Conjunctive Queries

1 function Cover^cq_→(T₁, X₁, Σ, T₂, X₂)
2   𝕄^→_(X₁,X₂) := {∅}
3   for every ψ₁ ∈ F_T₁(X₁) do
4     𝕄^→_ψ₁ := ∅
5     for every ψ₂ ∈ F_T₂(X₂) such that ⟨T₁, ψ₁⟩ ∼^→_Σ ⟨T₂, ψ₂⟩ do
6       𝕄^→_ψ₁,ψ₂ := Cover_→(T₁, ψ₁, Σ, T₂, ψ₂)
7       𝕄^→_ψ₁ := 𝕄^→_ψ₁ ∪ 𝕄^→_ψ₁,ψ₂
8     𝕄^→_(X₁,X₂) := 𝕄^→_(X₁,X₂) ⊗ 𝕄^→_ψ₁
9   return Minimise_⊆(𝕄^→_(X₁,X₂))

Algorithm 2: Computing a Cover of all Subsumer Justifications (Recursive)

1 function Cover_→(T₁, X₁, Σ, T₂, X₂)
2   𝕄^→_(X₁,X₂) := {∅}
3   for every B ∈ (Σ ∩ N_C) ∪ {dom(r) | r ∈ Σ} such that T₁ ⊨ X₁ ⊑ B do
4     𝕄^→_(X₁,X₂) := 𝕄^→_(X₁,X₂) ⊗ Just_T₂(X₂ ⊑ B)
5   for every Y ⋈₁ ∃r.Z ∈ T₁ (⋈₁ ∈ {⊑, ≡}) with T₁ ⊨ X₁ ⊑ Y and T₁ ⊨ r ⊑ s for some s ∈ Σ ∩ N_R do
6     𝕄^→_∃r.Z := ∅
7     for every Y′ ⋈₂ ∃r′.Z′ ∈ T₂ (⋈₂ ∈ {⊑, ≡}) with T₂ ⊨ X₂ ⊑ Y′, T₂ ⊨ r′ ⊑ s for every s ∈ {s′ ∈ Σ ∩ N_R | T₁ ⊨ r ⊑ s′}, and ⟨T₁, Z⟩ ∼^→_Σ ⟨T₂, Z′⟩ do
8       𝕄^→_r′ := {∅}
9       for every s ∈ Σ ∩ N_R with T₁ ⊨ r ⊑ s do
10        𝕄^→_r′ := 𝕄^→_r′ ⊗ Just_T₂(r′ ⊑ s)
11      𝕄^→_Z′ := Cover_→(T₁, Z, Σ, T₂, Z′)
12      𝕄^→_∃r.Z := 𝕄^→_∃r.Z ∪ (Just_T₂(X₂ ⊑ Y′) ⊗ {{Y′ ⋈₂ ∃r′.Z′}} ⊗ 𝕄^→_r′ ⊗ 𝕄^→_Z′)
13    𝕄^→_(X₁,X₂) := 𝕄^→_(X₁,X₂) ⊗ 𝕄^→_∃r.Z
14  return 𝕄^→_(X₁,X₂)

Algorithm 3: Computing a Cover of all Subsumee Justifications

1 function Cover^cq_←(T₁, X₁, Σ, T₂, X₂, ζ)
2   if X₁ is not Σ-entailed w.r.t. T₁ then
3     return {∅}
4   𝕄^←_(X₁,X₂) := Cover^{N_C}_←(T₁, X₁, Σ, T₂, X₂, ζ)
5   if X₁ is not complex Σ-entailed in T₁ then
6     return 𝕄^←_(X₁,X₂)
7   if X₁ ≡ ∃r.Y ∈ T₁, and r, Y are Σ-entailed w.r.t. T₁ then
8     𝕄^←_(X₁,X₂) := 𝕄^←_(X₁,X₂) ⊗ Cover^∃_←(T₁, X₁, Σ, T₂, X₂, ζ)
9   else if X₁ ≡ Y₁ ⊓ ... ⊓ Y_m ∈ T₁ then
10    𝕄^←_(X₁,X₂) := 𝕄^←_(X₁,X₂) ⊗ Cover^⊓_←(T₁, X₁, Σ, T₂, X₂, ζ)
11  return Minimise_⊆(𝕄^←_(X₁,X₂))

Algorithm 4: Computing a Cover of all Subsumee Projection Justifications (S^←_{N_C})

1 function Cover^{N_C}_←(T₁, X₁, Σ, T₂, X₂, ζ)
2   𝕄^←_(X₁,X₂) := {∅}
3   for every B ∈ Σ^ζ such that T₁ ⊨ B ⊑ X₁ do
4     for every X₂′ ∈ non-conj_T₂(X₂) such that ζ = ε or T₂ ⊭ ran(ζ) ⊑ X₂′ do
5       𝕄^←_(X₁,X₂) := 𝕄^←_(X₁,X₂) ⊗ Just_T₂(B ⊑ X₂′)
6   return 𝕄^←_(X₁,X₂)

Algorithm 5: Computing a Cover of all Subsumee Projection Justifications (S^←_∃)

1 function Cover^∃_←(T₁, X₁, Σ, T₂, X₂, ζ)
2   let α_X₁ := X₁ ≡ ∃r.Y₁ ∈ T₁
3   𝕄^←_(X₁,X₂) := {max-tree^⊓_T₂(X₂)}
4   for every s ∈ Σ ∩ N_R such that T₁ ⊨ s ⊑ r do
5     for every X₂′ ∈ non-conj_T₂(X₂) such that ζ ≠ ε implies T₂ ⊭ ran(ζ) ⊑ X₂′ and T₂ ⊭ dom(s) ⊑ X₂′ do
6       let α_X₂′ := X₂′ ≡ ∃r′.Y₂′ ∈ T₂
7       𝕄^←_Y₂′ := Cover^cq_←(T₁, Y₁, Σ, T₂, Y₂′, s)
8       𝕄^←_(X₁,X₂) := 𝕄^←_(X₁,X₂) ⊗ {{α_X₂′}} ⊗ Just_T₂(s ⊑ r′) ⊗ 𝕄^←_Y₂′
9   return 𝕄^←_(X₁,X₂)

Algorithm 6: Computing a Cover of all Subsumee Justifications (S^←_⊓)

1 function Cover^⊓_←(T₁, X₁, Σ, T₂, X₂, ζ)
2   let α_X₁ := X₁ ≡ Y₁ ⊓ ... ⊓ Y_m ∈ T₁
3   𝕄^←_(X₁,X₂) := ∅
4   for every Γ ∈ DefForest^⊓_T₂(X₂) do
5     let δ_Γ := {def^⊓_T₂(X′) | X′ ∈ leaves(Γ) ∩ def^⊓_T₂}
6     𝕄^←_Γ := {Γ}
7     for every X₂′ ∈ leaves(Γ) such that ζ = ε or T₂ ⊭ ran(ζ) ⊑ X₂′ do
8       𝕄^←_X₂′ := ∅
9       for every X₁′ ∈ non-conj_T₁(X₁) such that ζ = ε or T₂ ⊭ ran(ζ) ⊑ X₁′ do
10        if ⟨T₁, X₁′⟩ ∼^←_Σ,ζ ⟨T₂ \ δ_Γ, X₂′⟩ then
11          𝕄^←_X₂′ := 𝕄^←_X₂′ ∪ Cover^cq_←(T₁, X₁′, Σ, T₂ \ δ_Γ, X₂′, ε)
12      𝕄^←_Γ := 𝕄^←_Γ ⊗ 𝕄^←_X₂′
13    𝕄^←_(X₁,X₂) := 𝕄^←_(X₁,X₂) ∪ 𝕄^←_Γ
14  return 𝕄^←_(X₁,X₂)

Algorithm 7: Computing a Cover of all Subsumption Justifications for Role Inclusions

1 function Cover^R(T₁, Σ, T₂)
2   𝕄 := {∅}
3   for every r, s ∈ Σ ∩ N_R such that T₁ ⊨ r ⊑ s do
4     𝕄 := 𝕄 ⊗ Just_T₂(r ⊑ s)
5   return Minimise_⊆(𝕄)

Fig. 1. Algorithms for computing all subsumer and subsumee justifications

The existence of the axiom α_X₁ := X₁ ≡ ∃r.Y₁ ∈ T₁ in Line 2 of Algorithm 5 is guaranteed by Line 7 of Algorithm 3. The axiom α_X₂′ := X₂′ ≡ ∃r′.Y₂′ ∈ T₂ in Line 6 of Algorithm 5 exists as we assume that X₂ in T₂ “subsumee-simulates” X₁ in T₁. Moreover, there is at most one axiom α_X₁ ∈ T₁ and at most one axiom α_X₂′ ∈ T₂ as T₁ and T₂ are terminologies. The concept name X₂ may be defined as a conjunction in T₂ whose conjuncts in turn may also be defined as conjunctions in T₂, and so forth. In Line 3 all axioms forming the maximal resulting definitorial conjunctive tree are collected.

For the next algorithm, we define def^⊓_T := {X ∈ sig^{N_C}(T) | X ≡ Y₁ ⊓ ... ⊓ Y_n ∈ T} to be the set of concept names that are conjunctively defined in T. For every X ∈ def^⊓_T, we set def^⊓_T(X) := α, where α = X ≡ Y₁ ⊓ ... ⊓ Y_n ∈ T.

The axiom α_X₁ := X₁ ≡ Y₁ ⊓ ... ⊓ Y_m ∈ T₁ in Line 2 of Algorithm 6 is guaranteed by Line 9 of Algorithm 3. In case X₂ is defined as a conjunction in T₂, the pair consisting of X₂ and a version of T₂ that contains only a partial conjunctive tree rooted at X₂ needs to be considered, as it may already be sufficient to “subsumee-simulate” X₁ in T₁. Therefore, Algorithm 6 considers every partial conjunctive tree Γ from DefForest^⊓_T₂(X₂) in Line 4 and removes from T₂ the axioms in δ_Γ, which connect the leaves of Γ with the remaining conjunctive tree, in Lines 10 and 11.

The following theorem shows that Algorithm 3 indeed computes a cover of the set of subsumee modules. Thus every subsumee justification is guaranteed to be among the computed sets of axioms.

Theorem 8. Let χ = ⟨T₁, ϕ₁, Σ, T₂, ϕ₂⟩ and ϕ₁, ϕ₂ ∈ (Σ ∩ N_C) ∪ Σ^dom ∪ Σ^ran. Additionally, let 𝕄 := Cover^cq_←(T₁, ϕ₁, Σ, T₂, ϕ₂, ε). If T₁ ∼^←_Σ T₂, then 𝕄 is the set of all subsumee justifications under χ.

The number of (minimal) projection justifications depends on T₁, T₂ and Σ. In general, this number is bounded by an exponential in the size of the terminologies. The simulation checks can be performed in polynomial time [11, 23]. Our algorithm for computing projection justifications, therefore, runs in time exponential in the size of the input.

6 Conclusion

We introduce the notion of projection module for conjunctive queries as a subset of an ontology capturing the knowledge about a given signature as specified in a reference ontology. Here knowledge about a signature means the set of entailed conjunctive queries about the signature. This allows comparing ontologies in a more fine-grained fashion compared to merely extracting modules. Projection modules enable us to check how knowledge is implemented in terms of axioms in different ontologies. In particular, we can verify that and how specifications as defined in reference ontologies have been realised. We have presented algorithms for computing projection modules of acyclic ELH^r-terminologies w.r.t. conjunctive queries. Similar algorithms can be used to deal with projection modules for concept subsumption queries and instance queries. We expect that the algorithms can be extended to deal with cyclic terminologies and even general ELH^r-TBoxes.


References

1. Alsubait, T., Parsia, B., Sattler, U.: Measuring similarity in ontologies: A new family of measures. In: Proceedings of EKAW’14: the 19th International Conference on Knowledge Engineering and Knowledge Management. pp. 13–25 (2014)

2. Arif, M.F., Mencía, C., Ignatiev, A., Manthey, N., Peñaloza, R., Marques-Silva, J.: BEACON: An efficient SAT-based tool for debugging EL⁺-ontologies. In: Proceedings of SAT’16: the 19th International Conference on the Theory and Applications of Satisfiability Testing. pp. 521–530 (2016)

3. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The description logic handbook: theory, implementation, and applications. Cambridge University Press, 2 edn. (June 2010)

4. Baader, F., Horrocks, I., Lutz, C., Sattler, U.: An Introduction to Description Logic. Cambridge University Press (2017)

5. Baader, F., Peñaloza, R., Suntisrivaraporn, B.: Pinpointing in the description logic EL. In: Proceedings of DL’07: the 20th International Workshop on Description Logics (2007)

6. Botoeva, E., Konev, B., Lutz, C., Ryzhikov, V., Wolter, F., Zakharyaschev, M.: Inseparability and conservative extensions of description logic ontologies: A survey. In: Reasoning Web Summer School 2016, LNCS, vol. 9885, pp. 27–89. Springer International Publishing (2017)

7. Botoeva, E., Kontchakov, R., Ryzhikov, V., Wolter, F., Zakharyaschev, M.: Games for query inseparability of description logic knowledge bases. Artificial Intelligence 234, 78–119 (2016)

8. Botoeva, E., Lutz, C., Ryzhikov, V., Wolter, F., Zakharyaschev, M.: Query-based entailment and inseparability for ALC ontologies. In: Proceedings of IJCAI’16: the 25th International Joint Conference on Artificial Intelligence. pp. 1001–1007. AAAI Press (2016)

9. Chen, J., Ludwig, M., Ma, Y., Walther, D.: Zooming in on ontologies: Minimal modules and best excerpts. In: Proceedings Part I of ISWC’17: the 16th International Semantic Web Conference. pp. 173–189 (2017)

10. Domingue, J., Anutariya, C. (eds.): Proceedings of ASWC’08: the 3rd Asian Semantic Web Conference on The Semantic Web, Lecture Notes in Computer Science, vol. 5367. Springer (2008)

11. Ecke, A., Ludwig, M., Walther, D.: The concept difference for EL-terminologies using hypergraphs. In: Proceedings of DChanges’13 (2013)

12. Euzenat, J., Shvaiko, P.: Ontology Matching, Second Edition. Springer (2013)

13. Jung, J.C., Lutz, C., Martel, M., Schneider, T.: Query conservative extensions in Horn description logics with inverse roles. In: Proceedings of IJCAI’17: the 26th International Joint Conference on Artificial Intelligence. pp. 1116–1122. AAAI Press (2017)

14. Kalyanpur, A., Parsia, B., Sirin, E., Hendler, J.A.: Debugging unsatisfiable classes in OWL ontologies. Journal of Web Semantics 3(4), 268–293 (2005)

15. Kazakov, Y., Skocovsky, P.: Enumerating justifications using resolution. In: Proceedings of DL’17: the 30th International Workshop on Description Logics (2017)

16. Konev, B., Kontchakov, R., Ludwig, M., Schneider, T., Wolter, F., Zakharyaschev, M.: Conjunctive query inseparability of OWL 2 QL TBoxes. In: Proceedings of AAAI’11: the 25th Conference on Artificial Intelligence. AAAI Press (2011)

17. Konev, B., Ludwig, M., Walther, D., Wolter, F.: The logical difference for the lightweight description logic EL. Journal of Artificial Intelligence Research 44, 633–708 (2012)

18. Konev, B., Lutz, C., Walther, D., Wolter, F.: Semantic modularity and module extraction in description logics. In: Proceedings of ECAI’08: the 18th European Conference on Artificial Intelligence. pp. 55–59. IOS Press (2008)

19. Konev, B., Lutz, C., Walther, D., Wolter, F.: Model-theoretic inseparability and modularity of description logic ontologies. Artificial Intelligence 203, 66–103 (2013)

20. Konev, B., Walther, D., Wolter, F.: The logical difference problem for description logic terminologies. In: Proceedings of IJCAR’08. pp. 259–274 (2008)

21. Kontchakov, R., Wolter, F., Zakharyaschev, M.: Logic-based ontology comparison and module extraction, with an application to DL-Lite. Artificial Intelligence 174(15), 1093–1141 (2010)

22. Lehmann, K., Turhan, A.: A framework for semantic-based similarity measures for ELH-concepts. In: Proceedings of JELIA’12: the 13th European Conference Logics in Artificial Intelligence. Lecture Notes in Computer Science, vol. 7519, pp. 307–319. Springer (2012)

23. Ludwig, M., Walther, D.: The logical difference for ELH^r-terminologies using hypergraphs. In: Proceedings of ECAI’14: the 21st European Conference on Artificial Intelligence. pp. 555–560 (2014)

24. Romero, A.A., Kaminski, M., Grau, B.C., Horrocks, I.: Module extraction in expressive ontology languages via datalog reasoning. Journal of Artificial Intelligence Research 55, 499–564 (2016)

25. Sattler, U., Schneider, T., Zakharyaschev, M.: Which kind of module should I extract? In: Proceedings of DL’09. CEUR Workshop Proceedings, vol. 477. CEUR-WS.org (2009)

26. Zhou, Z., Qi, G., Suntisrivaraporn, B.: A new method of finding all justifications in OWL 2 EL. In: Proceedings of WI’13: IEEE/WIC/ACM International Conferences on Web Intelligence. pp. 213–220 (2013)