Computing Solutions in OWL 2 QL Knowledge Exchange

Knowledge Base Exchange

M. Arenas¹, E. Botoeva², D. Calvanese², and V. Ryzhikov²

¹ Dept. of Computer Science, PUC Chile & University of Oxford, marenas@ing.puc.cl

² KRDB Research Centre, Free Univ. of Bozen-Bolzano, Italy, lastname@inf.unibz.it

Abstract. The problem of exchanging knowledge bases from a source signature to a target signature connected through a mapping has recently attracted attention in knowledge representation. In this paper, we study this problem for knowledge bases and mappings expressed in OWL 2 QL, one of the profiles of the standard Web Ontology Language OWL 2. More specifically, we consider the membership and non-emptiness problems associated with computing universal solutions, which have been identified as one of the most desirable translations to be materialized. We study two settings: when ABoxes are in OWL 2 QL and when null values are allowed in the ABox language. For the former case, we provide a novel technique based on reachability games on graphs to show that the non-emptiness and membership problems are in PTIME. For the latter case, we report a range of complexity results from NP to EXPTIME. We also consider the problem of computing universal UCQ-solutions, which provide an alternative notion of translation containing sufficient information to properly answer unions of conjunctive queries, reporting a PSPACE lower bound for the membership problem.

1 Introduction

Complex forms of information, maintained in different formats and organized according to different structures, often need to be shared between agents. In recent years, both in the data management and in the knowledge representation communities, several settings have been investigated that address this problem from various perspectives: in information integration, uniform access is provided to a collection of data sources by means of an ontology (or global schema) to which the sources are mapped [17]; in peer-to-peer systems, a set of peers declaratively linked to each other collectively provide access to the information assets they maintain [14,1,13]; in ontology matching, the aim is to understand and derive the correspondences between elements in two ontologies [11,20]; finally, in data exchange, the information stored according to a source schema needs to be restructured and translated so as to conform to a target schema [12,8].

The work we present in this paper is inspired by the latter setting, investigated in databases. We study it, however, under the assumption of incomplete information typical of knowledge representation [5]. Specifically, we investigate the problem of knowledge base exchange, where a source knowledge base (KB) is connected to a target KB by means of a declarative mapping specification, and the aim is to exchange knowledge from the source to the target by exploiting the mapping. We rely on a framework for KB exchange proposed recently in [2,3,4], based on lightweight Description Logics (DLs) of the DL-Lite family [9]: both source and target are KBs constituted by a DL TBox, representing implicit information, and an ABox, representing explicit information, and mappings are sets of DL concept and role inclusions.

In this paper, we adjust the above-mentioned framework to OWL 2 QL [19], one of the profiles of the standard Web Ontology Language OWL 2 [7], and then study the problem of computing universal solutions, which have been identified as one of the most desirable translations to be materialized. We investigate both the task of checking membership, where a candidate universal solution is given and one needs to check its correctness, and non-emptiness, where the aim is to determine the existence of a universal solution. We prove that both problems can be solved in PTIME using a novel reduction to the problem of finding a winning strategy in reachability games on graphs [18].

Then, we argue that for certain natural shapes of source KBs and mappings universal solutions do not exist, unless null values are allowed in ABox languages. So, we consider extended ABoxes that may contain nulls, presenting a number of complexity results ranging from NP to EXPTIME for this setting.

Finally, we consider universal UCQ-solutions, an alternative notion for materialization of knowledge in the target that contains sufficient information to answer unions of conjunctive queries. We show that universal UCQ-solutions exist for certain source KBs and mappings, where universal solutions (even with extended ABoxes) do not. We also report PSPACE-hardness for the membership problem for universal UCQ-solutions.

The paper is organized as follows. We give preliminary notions on DLs and queries in Section 2, and on KB exchange in Section 3. In Section 4, we present the known results and give some intuition about the shape of universal solutions. In Sections 5 and 6, we present the results on the complexity of computing universal solutions for KBs and extended KBs, respectively. In Section 7, we consider universal UCQ-solutions and in Section 8, we present conclusions and outline some future work.

2 Preliminaries

The DLs of the DL-Lite family [6] are characterized by the fact that standard reasoning can be done in polynomial time. We adapt here $\textit{DL-Lite}_{\mathcal{R}}$, and present now its syntax and semantics. Let $N_C$, $N_R$, $N_a$, $N_\ell$ be pairwise disjoint sets of concept names, role names, constants, and labeled nulls, respectively. Assume in the following that $A \in N_C$ and $P \in N_R$; in $\textit{DL-Lite}_{\mathcal{R}}$, $B$ and $C$ are used to denote basic and arbitrary (or complex) concepts, respectively, and $R$ and $Q$ are used to denote basic and arbitrary (or complex) roles, respectively, defined as follows:

$$\begin{array}{ll} R ::= P \mid P^- & \qquad Q ::= R \mid \neg R \\ B ::= A \mid \exists R & \qquad C ::= B \mid \neg B \end{array}$$

Below, for a basic role $R$, we use $R^-$ to denote $P^-$ when $R = P$, and $P$ when $R = P^-$. A TBox is a finite set of concept inclusions $B \sqsubseteq C$ and role inclusions $R \sqsubseteq Q$.

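We call an inclusion of the form $B_1 \sqsubseteq \neg B_2$ or $R_1 \sqsubseteq \neg R_2$ a disjointness assertion. An ABox is a finite set of membership assertions $B(a)$, $R(a, b)$, where $a, b \in N_a$. Here, we also consider extended ABoxes, obtained by allowing labeled nulls in membership assertions. Formally, an extended ABox is a finite set of membership assertions $B(u)$ and $R(u, v)$, where $u, v \in (N_a \cup N_\ell)$. We denote by $Ind(\mathcal{A})$ the set of constants occurring in $\mathcal{A}$. Moreover, a(n extended) KB $\mathcal{K}$ is a pair $\langle \mathcal{T}, \mathcal{A} \rangle$, where $\mathcal{T}$ is a TBox and $\mathcal{A}$ is an (extended) ABox. A signature $\Sigma$ is a finite set of concept and role names. A KB $\mathcal{K}$ is said to be defined over (or simply, over) $\Sigma$ if all the concept and role names occurring in $\mathcal{K}$ belong to $\Sigma$ (and likewise for TBoxes, ABoxes, concept inclusions, role inclusions, and membership assertions).

An interpretation $\mathcal{I}$ of $\Sigma$ is a pair $\langle \Delta^{\mathcal{I}}, \cdot^{\mathcal{I}} \rangle$, where $\Delta^{\mathcal{I}}$ is a non-empty domain and $\cdot^{\mathcal{I}}$ is an interpretation function such that: (1) $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$, for every concept name $A \in \Sigma$; (2) $P^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$, for every role name $P \in \Sigma$; and (3) $a^{\mathcal{I}} \in \Delta^{\mathcal{I}}$, for every constant $a \in N_a$. The function $\cdot^{\mathcal{I}}$ is extended to also interpret concept and role constructs: (1) $(\exists R)^{\mathcal{I}} = \{x \in \Delta^{\mathcal{I}} \mid \exists y \in \Delta^{\mathcal{I}} \text{ such that } (x, y) \in R^{\mathcal{I}}\}$; (2) $(\neg B)^{\mathcal{I}} = \Delta^{\mathcal{I}} \setminus B^{\mathcal{I}}$; (3) $(P^-)^{\mathcal{I}} = \{(y, x) \in \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}} \mid (x, y) \in P^{\mathcal{I}}\}$; and (4) $(\neg R)^{\mathcal{I}} = (\Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}) \setminus R^{\mathcal{I}}$. Note that, consistently with the semantics of OWL 2 QL, we do not make the unique name assumption, i.e., distinct constants $a, b \in N_a$ may be interpreted as the same object. Note also that labeled nulls are not interpreted by $\mathcal{I}$.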
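The semantics of the four constructs can be made concrete on a finite interpretation. The following Python sketch is only an illustration: the domain, the names A and P, and the convention of writing an inverse role as "P-" are assumptions introduced here, not notation from the paper.

```python
from itertools import product

# A toy finite interpretation (hypothetical data): a domain plus
# extensions for one concept name and one role name.
domain = {"d1", "d2", "d3"}
concepts = {"A": {"d1"}}
roles = {"P": {("d1", "d2"), ("d2", "d3")}}


def role_ext(role):
    """Extension of a basic role: 'P', or its inverse written 'P-'."""
    if role.endswith("-"):
        return {(y, x) for (x, y) in roles[role[:-1]]}
    return set(roles[role])


def exists_ext(role):
    """Extension of the concept 'exists R': elements with an R-successor."""
    return {x for (x, _) in role_ext(role)}


def neg_concept_ext(ext):
    """Extension of a negated concept: complement w.r.t. the domain."""
    return domain - ext


def neg_role_ext(role):
    """Extension of a negated role: complement w.r.t. domain x domain."""
    return set(product(domain, repeat=2)) - role_ext(role)
```

For instance, `exists_ext("P-")` collects the elements that have an incoming P-edge, which is how the inverse construct interacts with the existential one.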

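Let $\mathcal{I} = \langle \Delta^{\mathcal{I}}, \cdot^{\mathcal{I}} \rangle$ be an interpretation over a signature $\Sigma$. Then $\mathcal{I}$ is said to satisfy a concept inclusion $B \sqsubseteq C$ over $\Sigma$, denoted by $\mathcal{I} \models B \sqsubseteq C$, if $B^{\mathcal{I}} \subseteq C^{\mathcal{I}}$; $\mathcal{I}$ is said to satisfy a role inclusion $R \sqsubseteq Q$ over $\Sigma$, denoted by $\mathcal{I} \models R \sqsubseteq Q$, if $R^{\mathcal{I}} \subseteq Q^{\mathcal{I}}$; and $\mathcal{I}$ is said to satisfy a TBox $\mathcal{T}$ over $\Sigma$, denoted by $\mathcal{I} \models \mathcal{T}$, if $\mathcal{I} \models \alpha$ for every $\alpha \in \mathcal{T}$. Moreover, satisfaction of membership assertions over $\Sigma$ is defined as follows.

A substitution over $\mathcal{I}$ is a function $h : (N_a \cup N_\ell) \to \Delta^{\mathcal{I}}$ such that $h(a) = a^{\mathcal{I}}$ for every $a \in N_a$. Then $\mathcal{I}$ is said to satisfy an (extended) ABox $\mathcal{A}$, denoted by $\mathcal{I} \models \mathcal{A}$, if there exists a substitution $h$ over $\mathcal{I}$ such that: (1) for every $B(u) \in \mathcal{A}$, it holds that $h(u) \in B^{\mathcal{I}}$; and (2) for every $R(u, v) \in \mathcal{A}$, it holds that $(h(u), h(v)) \in R^{\mathcal{I}}$. Finally, $\mathcal{I}$ is said to satisfy a(n extended) KB $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$, denoted by $\mathcal{I} \models \mathcal{K}$, if $\mathcal{I} \models \mathcal{T}$ and $\mathcal{I} \models \mathcal{A}$. Such $\mathcal{I}$ is called a model of $\mathcal{K}$, and we use $\mathrm{MOD}(\mathcal{K})$ to denote the set of all models of $\mathcal{K}$. We say that $\mathcal{K}$ is consistent if $\mathrm{MOD}(\mathcal{K}) \neq \emptyset$. As is customary, given an (extended) ABox $\mathcal{A}$ over a signature $\Sigma$ and a membership assertion $\alpha$ over $\Sigma$, we use the notation $\mathcal{A} \models \alpha$ to indicate that for every interpretation $\mathcal{I}$ of $\Sigma$, if $\mathcal{I} \models \mathcal{A}$, then $\mathcal{I} \models \alpha$ (and likewise for (extended) KBs).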

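We also need to introduce the notions of $\Sigma$-types and $\Sigma$-homomorphisms. For an interpretation $\mathcal{I}$ and a signature $\Sigma$, the $\Sigma$-types $t^{\mathcal{I}}_{\Sigma}(x)$ and $r^{\mathcal{I}}_{\Sigma}(x, y)$ for $x, y \in \Delta^{\mathcal{I}}$ are given by the set of concepts $B$ and roles $R$ over $\Sigma$, respectively, such that $x \in B^{\mathcal{I}}$ and $(x, y) \in R^{\mathcal{I}}$. We also use $t^{\mathcal{I}}(x)$ and $r^{\mathcal{I}}(x, y)$ to refer to the types over the signature of all $\textit{DL-Lite}_{\mathcal{R}}$ concepts and roles. A $\Sigma$-homomorphism from an interpretation $\mathcal{I}$ to an interpretation $\mathcal{J}$ is a function $h : \Delta^{\mathcal{I}} \to \Delta^{\mathcal{J}}$ such that $h(a^{\mathcal{I}}) = a^{\mathcal{J}}$, for all individual names $a$ interpreted in $\mathcal{I}$, and $t^{\mathcal{I}}_{\Sigma}(x) \subseteq t^{\mathcal{J}}_{\Sigma}(h(x))$ and $r^{\mathcal{I}}_{\Sigma}(x, y) \subseteq r^{\mathcal{J}}_{\Sigma}(h(x), h(y))$ for all $x, y \in \Delta^{\mathcal{I}}$. We say that $\mathcal{I}$ is $\Sigma$-homomorphically embeddable into $\mathcal{J}$ if there exists a $\Sigma$-homomorphism from $\mathcal{I}$ to $\mathcal{J}$. If $\Sigma$ is the set of all $\textit{DL-Lite}_{\mathcal{R}}$ concepts and roles, we call a $\Sigma$-homomorphism simply a homomorphism.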
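For finite interpretations, checking whether a given function is a $\Sigma$-homomorphism is a direct loop over the facts. The sketch below is a minimal illustration under stated assumptions: interpretations are plain dictionaries (a hypothetical encoding), and only concept names and role names of the signature are compared, which suffices because membership in $\exists R$ and in inverse roles is preserved automatically once role names are.

```python
def is_sigma_homomorphism(h, src, tgt, concept_names, role_names):
    """Check that h (a dict from the source domain to the target domain)
    is a Sigma-homomorphism for the signature given by the two name sets."""
    # h must be total on the source domain and land in the target domain.
    if set(h) != src["domain"] or not set(h.values()) <= tgt["domain"]:
        return False
    # Constants interpreted in the source keep their meaning in the target.
    for const, elem in src["constants"].items():
        if tgt["constants"].get(const) != h[elem]:
            return False
    # Every concept name in the type of x is in the type of h(x).
    for name in concept_names:
        for x in src["concepts"].get(name, set()):
            if h[x] not in tgt["concepts"].get(name, set()):
                return False
    # Every role name in the type of (x, y) is in the type of (h(x), h(y)).
    for name in role_names:
        for (x, y) in src["roles"].get(name, set()):
            if (h[x], h[y]) not in tgt["roles"].get(name, set()):
                return False
    return True


# Hypothetical data: a finite Q-chain starting at a, and a single Q-loop on a.
chain = {"domain": {"a", "w1", "w2"}, "constants": {"a": "a"}, "concepts": {},
         "roles": {"Q": {("a", "a"), ("a", "w1"), ("w1", "w2")}}}
loop = {"domain": {"a"}, "constants": {"a": "a"}, "concepts": {},
        "roles": {"Q": {("a", "a")}}}
collapse = {x: "a" for x in chain["domain"]}
shift = {"a": "a", "w1": "w2", "w2": "w2"}
```

Here `collapse` sends the whole chain onto the loop and passes the check, while `shift` fails because the edge from a to w1 has no counterpart from a to w2.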

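$\textit{DL-Lite}_{\mathcal{R}}$ enjoys the canonical model property. Let $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$ be a (non-extended) KB, and $\sqsubseteq^{R}_{\mathcal{T}}$ the reflexive and transitive closure of the role relation on the set of all basic roles over $N_R$ induced by $\mathcal{T}$ (that is, the reflexive and transitive closure of $\{(R_1, R_2) \mid R_1 \sqsubseteq R_2 \in \mathcal{T} \text{ or } R_1^- \sqsubseteq R_2^- \in \mathcal{T}\}$). Then define $[R] = \{S \mid R \sqsubseteq^{R}_{\mathcal{T}} S \text{ and } S \sqsubseteq^{R}_{\mathcal{T}} R\}$, $[R] \leq_{\mathcal{T}} [S]$ if $R \sqsubseteq^{R}_{\mathcal{T}} S$, and a generating relationship $\leadsto_{\mathcal{K}}$ as follows:

– $a \leadsto_{\mathcal{K}} w_{[R]}$, if (1) $\mathcal{K} \models \exists R(a)$; (2) $\mathcal{K} \not\models R(a, b)$ for every $b \in N_a$; (3) $[R'] = [R]$ for every $[R']$ such that $[R'] \leq_{\mathcal{T}} [R]$ and $\mathcal{K} \models \exists R'(a)$.

– $w_{[S]} \leadsto_{\mathcal{K}} w_{[R]}$, if (1) $\mathcal{T} \models \exists S^- \sqsubseteq \exists R$; (2) $[S^-] \neq [R]$; (3) $[R'] = [R]$ for every $[R']$ such that $[R'] \leq_{\mathcal{T}} [R]$ and $\mathcal{T} \models \exists S^- \sqsubseteq \exists R'$.

Denote by $path(\mathcal{K})$ the set of all $\mathcal{K}$-paths, where a $\mathcal{K}$-path is a sequence $a w_{[R_1]} \ldots w_{[R_n]}$ such that $n \geq 0$, $a \in N_a$, $a \leadsto_{\mathcal{K}} w_{[R_1]}$ and $w_{[R_i]} \leadsto_{\mathcal{K}} w_{[R_{i+1}]}$ for $1 \leq i \leq n-1$. Moreover, for every $\sigma \in path(\mathcal{K})$, denote by $tail(\sigma)$ the last element in $\sigma$. Finally, the canonical (or, universal) model of $\mathcal{K}$, denoted $\mathcal{U}_{\mathcal{K}}$, is defined as:

$$\begin{aligned}
\Delta^{\mathcal{U}_{\mathcal{K}}} &= path(\mathcal{K}), \qquad a^{\mathcal{U}_{\mathcal{K}}} = a, \text{ for } a \in N_a, \\
A^{\mathcal{U}_{\mathcal{K}}} &= \{a \in Ind(\mathcal{A}) \mid \mathcal{K} \models A(a)\} \cup \{\sigma \cdot w_{[R]} \in \Delta^{\mathcal{U}_{\mathcal{K}}} \mid \mathcal{T} \models \exists R^- \sqsubseteq A\}, \\
P^{\mathcal{U}_{\mathcal{K}}} &= \{(a, b) \in Ind(\mathcal{A}) \times Ind(\mathcal{A}) \mid \mathcal{K} \models P(a, b)\} \\
&\quad \cup \{(\sigma, \sigma \cdot w_{[R]}) \mid tail(\sigma) \leadsto_{\mathcal{K}} w_{[R]},\ [R] \leq_{\mathcal{T}} [P]\} \\
&\quad \cup \{(\sigma \cdot w_{[R]}, \sigma) \mid tail(\sigma) \leadsto_{\mathcal{K}} w_{[R]},\ [R] \leq_{\mathcal{T}} [P^-]\}.
\end{aligned}$$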

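Theorem 1 ([16]). If $\mathcal{K}$ is consistent, $\mathcal{U}_{\mathcal{K}}$ is a model of $\mathcal{K}$. For every model $\mathcal{I} \models \mathcal{K}$, there exists a homomorphism from $\mathcal{U}_{\mathcal{K}}$ to $\mathcal{I}$.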

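Queries and certain answers. A $k$-ary query $q$ over a signature $\Sigma$, with $k \geq 0$, is a function that maps every interpretation $\langle \Delta^{\mathcal{I}}, \cdot^{\mathcal{I}} \rangle$ of $\Sigma$ into a $k$-ary relation $q^{\mathcal{I}} \subseteq (\Delta^{\mathcal{I}})^k$. Given a (non-extended) KB $\mathcal{K}$ over $\Sigma$, the set of certain answers to $q$ over $\mathcal{K}$, denoted by $cert(q, \mathcal{K})$, is defined as:

$$cert(q, \mathcal{K}) = \bigcap_{\mathcal{I} \in \mathrm{MOD}(\mathcal{K})} \{(a_1, \ldots, a_k) \mid \{a_1, \ldots, a_k\} \subseteq N_a \text{ and } (a_1^{\mathcal{I}}, \ldots, a_k^{\mathcal{I}}) \in q^{\mathcal{I}}\}.$$

A conjunctive query (CQ) over a signature $\Sigma$ is a formula of the form $q(\mathbf{x}) = \exists \mathbf{y}.\, \varphi(\mathbf{x}, \mathbf{y})$, where $\mathbf{x}$, $\mathbf{y}$ are tuples of variables and $\varphi(\mathbf{x}, \mathbf{y})$ is a conjunction of atoms of the form $A(t)$, with $A$ a concept name in $\Sigma$, and $P(t, t')$, with $P$ a role name in $\Sigma$, where each of $t, t'$ is either a constant from $N_a$ or a variable from $\mathbf{x}$ or $\mathbf{y}$. Given an interpretation $\mathcal{I} = \langle \Delta^{\mathcal{I}}, \cdot^{\mathcal{I}} \rangle$ of $\Sigma$, the answer of $q$ over $\mathcal{I}$, denoted by $q^{\mathcal{I}}$, is the set of tuples $\mathbf{a}$ of elements from $\Delta^{\mathcal{I}}$ for which there exists a tuple $\mathbf{b}$ of elements from $\Delta^{\mathcal{I}}$ such that $\mathcal{I}$ satisfies every conjunct in $\varphi(\mathbf{a}, \mathbf{b})$. A union of conjunctive queries (UCQ) over a signature $\Sigma$ is a formula of the form $q(\mathbf{x}) = \bigvee_{i=1}^{n} q_i(\mathbf{x})$, where each $q_i$ ($1 \leq i \leq n$) is a CQ over $\Sigma$, whose semantics is defined as $q^{\mathcal{I}} = \bigcup_{i=1}^{n} q_i^{\mathcal{I}}$.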

3 Knowledge Base Exchange Framework for OWL 2 QL

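Assume that $\Sigma_1, \Sigma_2$ are signatures with no concepts or roles in common. An inclusion $E_1 \sqsubseteq E_2$ is said to be from $\Sigma_1$ to $\Sigma_2$, if $E_1$ is a concept or a role over $\Sigma_1$ and $E_2$ is a concept or a role over $\Sigma_2$. A mapping is a tuple $\mathcal{M} = (\Sigma_1, \Sigma_2, \mathcal{T}_{12})$, where $\mathcal{T}_{12}$ is a TBox consisting of inclusions from $\Sigma_1$ to $\Sigma_2$ [2]. The semantics of such a mapping is defined in [2] in terms of a notion of satisfaction for interpretations, which has to be extended in our case to deal with interpretations not satisfying the unique name assumption (and, more generally, the standard name assumption). More specifically, given interpretations $\mathcal{I}$, $\mathcal{J}$ of $\Sigma_1$ and $\Sigma_2$, respectively, the pair $(\mathcal{I}, \mathcal{J})$ satisfies the TBox $\mathcal{T}_{12}$, denoted by $(\mathcal{I}, \mathcal{J}) \models \mathcal{T}_{12}$, if (1) for every $a \in N_a$, it holds that $a^{\mathcal{I}} = a^{\mathcal{J}}$, (2) for every concept inclusion $B \sqsubseteq C \in \mathcal{T}_{12}$, it holds that $B^{\mathcal{I}} \subseteq C^{\mathcal{J}}$, and (3) for every role inclusion $R \sqsubseteq Q \in \mathcal{T}_{12}$, it holds that $R^{\mathcal{I}} \subseteq Q^{\mathcal{J}}$. Notice that the connection between the information in $\mathcal{I}$ and $\mathcal{J}$ is established through the constants that move from source to target according to the mapping. For this reason, we require constants to be interpreted in the same way in $\mathcal{I}$ and $\mathcal{J}$, i.e., they preserve their meaning when they are transferred. Finally, $\mathrm{SAT}_{\mathcal{M}}(\mathcal{I})$ is defined as the set of interpretations $\mathcal{J}$ of $\Sigma_2$ such that $(\mathcal{I}, \mathcal{J}) \models \mathcal{T}_{12}$, and given a set $\mathcal{X}$ of interpretations of $\Sigma_1$, $\mathrm{SAT}_{\mathcal{M}}(\mathcal{X})$ is defined as $\bigcup_{\mathcal{I} \in \mathcal{X}} \mathrm{SAT}_{\mathcal{M}}(\mathcal{I})$.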

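The main problem studied in the knowledge exchange area is the problem of translating a KB according to a mapping, which is formalized through several different notions of translation. Let $\mathcal{M} = (\Sigma_1, \Sigma_2, \mathcal{T}_{12})$ be a mapping, $\mathcal{K}_1 = \langle \mathcal{T}_1, \mathcal{A}_1 \rangle$ a KB over $\Sigma_1$ and $\mathcal{K}_2 = \langle \mathcal{T}_2, \mathcal{A}_2 \rangle$ an extended KB over $\Sigma_2$. The first such notion is the concept of solution, which is formalized as follows: $\mathcal{K}_2$ is a solution for $\mathcal{K}_1$ under $\mathcal{M}$ if $\mathrm{MOD}(\mathcal{K}_2) \subseteq \mathrm{SAT}_{\mathcal{M}}(\mathrm{MOD}(\mathcal{K}_1))$. Thus, $\mathcal{K}_2$ is a solution for $\mathcal{K}_1$ under $\mathcal{M}$ if every interpretation of $\mathcal{K}_2$ is a valid translation of an interpretation of $\mathcal{K}_1$ according to $\mathcal{M}$. Then, $\mathcal{K}_2$ is a universal solution for $\mathcal{K}_1$ under $\mathcal{M}$ if $\mathrm{MOD}(\mathcal{K}_2) = \mathrm{SAT}_{\mathcal{M}}(\mathrm{MOD}(\mathcal{K}_1))$. Thus, $\mathcal{K}_2$ is designed to exactly represent the space of interpretations obtained by translating the interpretations of $\mathcal{K}_1$ under $\mathcal{M}$ [2].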

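A second class of translations is obtained in [2] by observing that solutions and universal solutions are too restrictive for some applications, in particular when one only needs a translation storing enough information to properly answer some queries. For the particular case of UCQ, this gives rise to the notions of UCQ-solution and universal UCQ-solution. Let $\mathcal{K}_1$, $\mathcal{M}$ be as above and $\mathcal{K}_2$ a KB over $\Sigma_2$. Then $\mathcal{K}_2$ is a UCQ-solution for $\mathcal{K}_1$ under $\mathcal{M}$ if for every query $q \in$ UCQ over $\Sigma_2$: $cert(q, \langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle) \subseteq cert(q, \mathcal{K}_2)$, while $\mathcal{K}_2$ is a universal UCQ-solution for $\mathcal{K}_1$ under $\mathcal{M}$ if for every query $q \in$ UCQ over $\Sigma_2$: $cert(q, \langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle) = cert(q, \mathcal{K}_2)$.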

Arguably, the most important problem in knowledge exchange [5,2], as well as in data exchange [12,15], is the task of computing a translation of a KB according to a mapping. To study the complexity of this task for the notions of solution just presented, we introduce the following decision problems. The membership problem for universal solutions (resp. universal UCQ-solutions) has as input a mapping $\mathcal{M} = (\Sigma_1, \Sigma_2, \mathcal{T}_{12})$, a KB $\mathcal{K}_1$ over $\Sigma_1$, and an extended KB $\mathcal{K}_2$ (resp. a KB $\mathcal{K}_2$) over $\Sigma_2$. Then the question to answer is whether $\mathcal{K}_2$ is a universal solution (resp. universal UCQ-solution) for $\mathcal{K}_1$ under $\mathcal{M}$. In the non-emptiness problem for universal solutions the input is the same, except for $\mathcal{K}_2$, and the question to answer is whether a universal solution $\mathcal{K}_2$ exists (analogously for universal UCQ-solutions). The non-emptiness problems are directly related to the problem of computing translations of a KB according to a mapping.

4 The Shape of Universal Solutions

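In what follows, we show some known results and examples of universal solutions.

First of all, it was shown in [3] that a KB $\mathcal{K}_2 = \langle \mathcal{T}_2, \mathcal{A}_2 \rangle$ (extended or not) over $\Sigma_2$ is a universal solution for $\mathcal{K}_1 = \langle \mathcal{T}_1, \mathcal{A}_1 \rangle$ over $\Sigma_1$ under $\mathcal{M} = (\Sigma_1, \Sigma_2, \mathcal{T}_{12})$ only if the TBox $\mathcal{T}_2$ is trivial (we call a TBox $\mathcal{T}$ trivial if $\mathcal{T}$ is equivalent to the empty set of formulas). Therefore, in the context of universal solutions, we only consider target KBs of the form $\langle \emptyset, \mathcal{A}_2 \rangle$, and we treat ABoxes $\mathcal{A}_2$ as such KBs.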

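Example 1. Assume that $\mathcal{M} = (\{A(\cdot), B(\cdot)\}, \{A'(\cdot), B'(\cdot)\}, \{A \sqsubseteq A', B \sqsubseteq B'\})$, and let $\mathcal{K}_1 = \langle \mathcal{T}_1, \mathcal{A}_1 \rangle$, where $\mathcal{T}_1 = \{\}$ and $\mathcal{A}_1 = \{A(a), B(b)\}$. Then the ABox $\mathcal{A}_2 = \{A'(a), B'(b)\}$ is a straightforward universal solution for $\mathcal{K}_1$ under $\mathcal{M}$.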
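The translation in Example 1 can be reproduced mechanically. The sketch below covers only the special case of that example (an empty source TBox, a mapping made of inclusions between concept names, no disjointness); the function name and the tuple encoding are hypothetical, and this is not a general procedure for computing universal solutions.

```python
def translate_abox(abox, mapping):
    """Apply inclusions between concept names, given as (source, target)
    pairs, to unary assertions given as (concept_name, constant) pairs."""
    return {(tgt, const)
            for (name, const) in abox
            for (src, tgt) in mapping
            if src == name}


# Data of Example 1: T12 = {A ⊑ A', B ⊑ B'} and A1 = {A(a), B(b)}.
mapping_t12 = {("A", "A'"), ("B", "B'")}
abox_a1 = {("A", "a"), ("B", "b")}
abox_a2 = translate_abox(abox_a1, mapping_t12)
```

The resulting set corresponds to the ABox {A'(a), B'(b)} given in the example.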

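It can be shown (cf. Lemma 1) that in the language without disjointness assertions, an ABox $\mathcal{A}_2$ is a universal solution for $\langle \mathcal{T}_1, \mathcal{A}_1 \rangle$ under $\mathcal{M} = (\Sigma_1, \Sigma_2, \mathcal{T}_{12})$ if and only if $\mathcal{U}_{\mathcal{A}_2}$ is $\Sigma_2$-homomorphically equivalent to $\mathcal{U}_{\langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle}$. This fact is used in the following more involved example:

Example 2. Assume $\mathcal{M} = (\{A(\cdot), R(\cdot,\cdot), S(\cdot,\cdot)\}, \Sigma_2, \mathcal{T}_{12})$, where $\Sigma_2 = \{Q(\cdot,\cdot)\}$ and $\mathcal{T}_{12} = \{R \sqsubseteq Q, S \sqsubseteq Q\}$, and that $\mathcal{K}_1 = \langle \mathcal{T}_1, \mathcal{A}_1 \rangle$, where $\mathcal{T}_1 = \{A \sqsubseteq \exists R, \exists R^- \sqsubseteq \exists R\}$ and $\mathcal{A}_1 = \{A(a), S(a, a)\}$. Let $\mathcal{A}_2 = \{Q(a, a)\}$. In the following picture, it is easy to see that $h$ is a $\Sigma_2$-homomorphism from $\mathcal{U}_{\langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle}$ to $\mathcal{U}_{\mathcal{A}_2}$. The existence of a $\Sigma_2$-homomorphism in the other direction is trivial and, hence, $\mathcal{A}_2$ is a universal solution for $\mathcal{K}_1$ under $\mathcal{M}$.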

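[Figure: $\mathcal{U}_{\mathcal{A}_2}$ (the element $a$ with a $Q$-edge) and the $\Sigma_2$-reduct of $\mathcal{U}_{\langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle}$ (the elements $a$, $aw_{[R]}$, $aw_{[R]}w_{[R]}$, $aw_{[R]}w_{[R]}w_{[R]}$, ... connected by $Q$-edges), with the homomorphism $h$ drawn from the latter to the former.]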

The following example shows that extended ABoxes are necessary to guarantee the existence of universal solutions in certain cases.

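Example 3. Assume that $\mathcal{M} = (\{A(\cdot), R(\cdot,\cdot)\}, \{B(\cdot)\}, \{\exists R^- \sqsubseteq B\})$, and let $\mathcal{K}_1 = \langle \mathcal{T}_1, \mathcal{A}_1 \rangle$, where $\mathcal{T}_1 = \{A \sqsubseteq \exists R\}$ and $\mathcal{A}_1 = \{A(a)\}$. Then the ABox $\mathcal{A}_2 = \{B(n)\}$, where $n$ is a labeled null, is a universal solution for $\mathcal{K}_1$ under $\mathcal{M}$ if nulls are allowed.

Notice that here, a universal solution with non-extended ABoxes does not exist: substituting $n$ by any constant is too restrictive, ruining universality.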

Finally, we discuss the impact of disjointness assertions on the universal solutions.

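Example 4. Consider Example 1 with $\mathcal{T}_1 = \{A \sqsubseteq \neg B\}$. With this seemingly harmless disjointness assertion, $\mathcal{A}_2$ is no longer a universal solution (not even a solution) for $\mathcal{K}_1$ under $\mathcal{M}$. The reason is the lack of the unique name assumption on the one hand, and the presence of the disjointness assertion in $\mathcal{T}_1$, which forces $a$ and $b$ to be interpreted differently in the source, on the other hand. Thus, for a model $\mathcal{J}$ of $\mathcal{A}_2$ such that $a^{\mathcal{J}} = b^{\mathcal{J}}$ and $A'^{\mathcal{J}} = B'^{\mathcal{J}} = \{a^{\mathcal{J}}\}$, there is no model $\mathcal{I}$ of $\mathcal{K}_1$ such that $(\mathcal{I}, \mathcal{J}) \models \mathcal{T}_{12}$ (which would require $a^{\mathcal{I}} = a^{\mathcal{J}}$ and $b^{\mathcal{I}} = b^{\mathcal{J}}$). In general, there is no universal solution for $\mathcal{K}_1$ under $\mathcal{M}$, even though $\mathcal{K}_1$ and $\mathcal{T}_{12}$ are consistent with each other.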

The problem raised by the latter example could be solved by allowing for inequalities between constants in the ABoxes. A similar problem appears with disjointness assertions in the mapping, but it requires negative facts to be present for a universal solution to exist (i.e., facts of the form $\neg A(a)$, $\neg P(a, b)$), which are not part of OWL 2 QL. On the other hand, having disjointness assertions in the source or the mapping does not exclude the existence of universal UCQ-solutions, which is explained in Example 6.

5 Computing Universal Solutions: The Case of Knowledge Bases

In this section, we show that both the membership and the non-emptiness problems for universal solutions without null values are in PTIME.

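Assume that $\Sigma_1, \Sigma_2$ are disjoint signatures, and that $\mathcal{K} = \langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle$ is a KB such that $\mathcal{T}_1$ is defined over $\Sigma_1$ and $\mathcal{T}_{12}$ is a set of inclusions from $\Sigma_1$ to $\Sigma_2$. Moreover, let $\mathcal{U}_{\mathcal{K}}$ be the canonical model of $\mathcal{K}$. Then a basic concept $B$ over $\Sigma_1$ is said to be safe in $\mathcal{U}_{\mathcal{K}}$ if $d \notin N_a$ and $t^{\mathcal{U}_{\mathcal{K}}}_{\Sigma_2}(d) = \emptyset$ for every $d \in B^{\mathcal{U}_{\mathcal{K}}}$. Intuitively, safeness for $B$ means that no constant “associated” with $B$ and no target concept that is “associated” with $B$ via $\mathcal{T}_1$ and $\mathcal{T}_{12}$ will be mentioned in the target; in Example 4 neither $A$ nor $B$ is safe in $\mathcal{U}_{\langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle}$. Furthermore, a pair of basic concepts $(B, C)$ is said to be safe if $B$ or $C$ is safe. Intuitively, if a pair $(B, C)$ is not safe and $(B \sqsubseteq \neg C) \in \mathcal{T}_1$, then a universal solution cannot exist, as explained in Example 4. Similarly, we say that a basic role $R$ over $\Sigma_1$ is safe if either $d \notin N_a$ and $t^{\mathcal{U}_{\mathcal{K}}}_{\Sigma_2}(d) = \emptyset$, or $d' \notin N_a$ and $t^{\mathcal{U}_{\mathcal{K}}}_{\Sigma_2}(d') = \emptyset$, for every $(d, d') \in R^{\mathcal{U}_{\mathcal{K}}}$. A pair of roles $(R, Q)$ is safe if (1) $R$ or $Q$ is safe, and (2) $t^{\mathcal{U}_{\mathcal{K}}}_{\Sigma_2}(d') = \emptyset$ or $t^{\mathcal{U}_{\mathcal{K}}}_{\Sigma_2}(d'') = \emptyset$ for every $d, d', d'' \in \Delta^{\mathcal{U}_{\mathcal{K}}}$ such that $(d, d') \in R^{\mathcal{U}_{\mathcal{K}}}$ and $(d, d'') \in Q^{\mathcal{U}_{\mathcal{K}}}$.

Lemma 1. A KB $\mathcal{K}_2 = \langle \mathcal{T}_2, \mathcal{A}_2 \rangle$ over $\Sigma_2$ is a universal solution for a KB $\mathcal{K}_1 = \langle \mathcal{T}_1, \mathcal{A}_1 \rangle$ under a mapping $\mathcal{M} = (\Sigma_1, \Sigma_2, \mathcal{T}_{12})$ iff the following conditions hold, where $\mathcal{K} = \langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle$:

(tr) $\mathcal{T}_2$ is a trivial TBox,

(hom) $\mathcal{U}_{\mathcal{A}_2}$ is $\Sigma_2$-homomorphically equivalent to $\mathcal{U}_{\mathcal{K}}$,

(ps1) each pair of concepts $(B, C)$ is safe in $\mathcal{U}_{\mathcal{K}}$ whenever $(B \sqsubseteq \neg C) \in \mathcal{T}_1$,

(pm1) $B^{\mathcal{U}_{\mathcal{K}}} = \emptyset$ for each basic concept $B$ such that $(B \sqsubseteq \neg B') \in \mathcal{T}_{12}$,

(ps2) each pair of roles $(R, Q)$ is safe in $\mathcal{U}_{\mathcal{K}}$ whenever $(R \sqsubseteq \neg Q) \in \mathcal{T}_1$,

(pm2) $R^{\mathcal{U}_{\mathcal{K}}} = \emptyset$ for each basic role $R$ such that $(R \sqsubseteq \neg R') \in \mathcal{T}_{12}$.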

It can be readily verified that conditions (tr), (ps1), (ps2), (pm1), (pm2), as well as the existence of a $\Sigma_2$-homomorphism from $\mathcal{U}_{\mathcal{A}_2}$ to $\mathcal{U}_{\langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle}$ required by (hom), can be checked in polynomial time. To decide the existence of a $\Sigma_2$-homomorphism in the opposite direction, we employ the technique of reachability games on graphs. Below we present the required basic notions.

5.1 Reachability games on graphs

A game is defined by a game graph (a playground) and a winning condition. A game graph is a triple $G = (S_0, S_1, T)$, where $S = S_0 \cup S_1$ is a finite set of states, $S_0 \cap S_1 = \emptyset$ and $T \subseteq S \times S$ is a transition relation. The game starts in some state $s_0 \in S$, and it is played in turns. In each turn, if the current state $s$ is in $S_i$ ($i = 0, 1$), then Player $i$ chooses some state $s' \in S$ such that $(s, s') \in T$. Thus, each play in the game is viewed as a path $\pi$, which can be infinite ($\pi = s_0, s_1, s_2, \ldots$, where $s_i \in S$ and $(s_i, s_{i+1}) \in T$ for every $i \geq 0$) or finite ($\pi = s_0, s_1, s_2, \ldots, s_k \in S^{k+1}$, where $(s_i, s_{i+1}) \in T$ for every $i \in \{0, \ldots, k-1\}$ and $\{s \mid (s_k, s) \in T\} = \emptyset$).

The winning condition defines which plays are won by Player 0. We will consider a reachability acceptance condition specified as follows: given a set of accepting states $F \subseteq S$, a play $\pi$ is a win for Player 0 iff some state from $F$ occurs in $\pi$. Finally, a reachability game is a pair $\mathcal{G} = (G, F)$ where $G$ is a game graph and $F$ is a set of accepting states.

A strategy for Player 0 from state $s$ is a (partial) function $f_0 : S^* S_0 \to S$ that assigns to each sequence of states $s_0, s_1, \ldots, s_k$ with $s_0 = s$ and $s_k \in S_0$ a successor state $s_{k+1}$ such that $(s_k, s_{k+1}) \in T$. A play $\pi = s_0 s_1 \cdots$ is said to conform with strategy $f_0$ if $s_{i+1} = f_0(s_0 s_1 \ldots s_i)$ for every $s_i \in S_0$. Then, a strategy $f_0$ is a winning strategy for Player 0 from $s \in S$ if every play that conforms with $f_0$ and starts in $s$ is a win for Player 0.

Proposition 1 ([18], [10]). Given a reachability game $\mathcal{G} = (G, F)$ and a state $s$ in $G$, it can be checked in PTIME whether Player 0 has a winning strategy from $s$.
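One standard way to obtain such a polynomial bound is a backward fixpoint (attractor) computation from the accepting states. The Python sketch below illustrates that general technique and is not claimed to be the algorithm of [18] or [10]; the function name, the argument encoding, and the toy game are assumptions made for this example.

```python
from collections import deque


def player0_winning_region(s0, s1, transitions, accepting):
    """Return the states from which Player 0 can force a visit to
    `accepting`, computed backwards from the accepting states."""
    states = set(s0) | set(s1)
    edges = set(transitions)  # drop duplicate edges
    preds = {s: set() for s in states}
    pending = {s: 0 for s in states}  # successors not yet known to be winning
    for (u, v) in edges:
        preds[v].add(u)
        pending[u] += 1
    win = set(accepting)
    queue = deque(win)
    while queue:
        v = queue.popleft()
        for u in preds[v]:
            if u in win:
                continue
            if u in s0:
                # Player 0 moves here: one winning successor is enough.
                win.add(u)
                queue.append(u)
            else:
                # Player 1 moves here: all successors must be winning.
                pending[u] -= 1
                if pending[u] == 0:
                    win.add(u)
                    queue.append(u)
    return win


# Hypothetical toy game: Player 0 owns "start" and "goal".
toy_s0 = {"start", "goal"}
toy_s1 = {"mid", "trap"}
toy_edges = {("start", "mid"), ("start", "trap"),
             ("mid", "goal"), ("trap", "trap")}
toy_win = player0_winning_region(toy_s0, toy_s1, toy_edges, {"goal"})
```

Each edge is inspected once, so the running time is linear in the size of the game graph. A Player 1 state without successors is never added unless it is accepting, which matches the definition of finite plays above.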

5.2 The reduction

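Assume we are given a mapping $\mathcal{M} = (\Sigma_1, \Sigma_2, \mathcal{T}_{12})$, a KB $\mathcal{K}_1 = \langle \mathcal{T}_1, \mathcal{A}_1 \rangle$ over $\Sigma_1$, and a KB $\mathcal{K}_2 = \langle \emptyset, \mathcal{A}_2 \rangle$ over $\Sigma_2$ (w.l.o.g., we can assume that the TBox of $\mathcal{K}_2$ is empty). Denote $\langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle$ by $\mathcal{K}$. We show how the problem of checking whether there exists a $\Sigma_2$-homomorphism $h$ from $\mathcal{U}_{\mathcal{K}}$ to $\mathcal{U}_{\mathcal{A}_2}$ can be reduced to the problem of existence of a winning strategy for Player 0 in a reachability game.

First, for such a homomorphism $h$ to exist, it should be the case that

$$t^{\mathcal{U}_{\mathcal{K}}}_{\Sigma_2}(a) \subseteq t^{\mathcal{U}_{\mathcal{A}_2}}(a) \quad \text{and} \quad r^{\mathcal{U}_{\mathcal{K}}}_{\Sigma_2}(a, b) \subseteq r^{\mathcal{U}_{\mathcal{A}_2}}(a, b) \quad \text{for all } a, b \in Ind(\mathcal{A}_1). \qquad (1)$$

These conditions can clearly be checked in PTIME.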

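Now, to check how the elements of $\Delta^{\mathcal{U}_{\mathcal{K}}}$ of the form $a\sigma$ with $a \in Ind(\mathcal{A}_1)$ can be mapped on $\mathcal{U}_{\mathcal{A}_2}$, we construct a game $\mathcal{G}_a = (G_a, F_a)$. The game graph $G_a = (S_0, S_1, T)$ has the set of states of the kind $(x, y, p)$, where $x \in \Delta^{\mathcal{U}_{\mathcal{A}_2}}$, $y \in \{tail(a\sigma) \mid a\sigma \in \Delta^{\mathcal{U}_{\mathcal{K}}}\}$, and $p \in \{\mathsf{s}, \mathsf{d}\}$. The states $(x, y, \mathsf{s})$ form $S_0$ and will be called spoiling; intuitively, the moves going out of such states represent various edges of the tree $\mathcal{U}_{\mathcal{K}}$ accessible from the end of the current edge. On the other hand, the states $(x, y, \mathsf{d})$ form $S_1$ and will be called duplicating; the moves from them “show” how the “challenged” edge of the tree $\mathcal{U}_{\mathcal{K}}$ can be “mapped” on $\mathcal{U}_{\mathcal{A}_2}$. Notice that the size of $G_a$ is $O(|\mathcal{T}_1 \cup \mathcal{T}_{12}| \times |\mathcal{A}_2|)$. The transition relation $T$ is defined as follows:

$$\begin{aligned}
T = {} & \{((x, y, \mathsf{s}), (x', y', \mathsf{d})) \mid y \leadsto_{\mathcal{K}} y' \text{ and } x' = x\} \ \cup \\
& \{((x, y, \mathsf{d}), (x', y', \mathsf{s})) \mid y = w_{[R]},\ cl^{\mathcal{T}_1 \cup \mathcal{T}_{12}}_{\Sigma_2}(\exists R^-) \subseteq t^{\mathcal{U}_{\mathcal{A}_2}}(x'),\ cl^{\mathcal{T}_1 \cup \mathcal{T}_{12}}_{\Sigma_2}(R) \subseteq r^{\mathcal{U}_{\mathcal{A}_2}}(x, x'), \text{ and } y' = y\},
\end{aligned}$$

where for a TBox $\mathcal{T}$ and a concept $B$, $cl^{\mathcal{T}}_{\Sigma}(B)$ is the set of all concepts $B'$ over $\Sigma$ such that $\mathcal{T} \models B \sqsubseteq B'$, and for a role $R$, $cl^{\mathcal{T}}_{\Sigma}(R)$ is defined analogously.

The set $F_a$ in the definition of the game is given by the duplicating states that are “dead ends”, i.e.,

$$F_a = \{(x, y, \mathsf{d}) \mid y = w_{[R]} \text{ and for all } x' \in \Delta^{\mathcal{U}_{\mathcal{A}_2}},\ cl^{\mathcal{T}_1 \cup \mathcal{T}_{12}}_{\Sigma_2}(\exists R^-) \not\subseteq t^{\mathcal{U}_{\mathcal{A}_2}}(x') \text{ or } cl^{\mathcal{T}_1 \cup \mathcal{T}_{12}}_{\Sigma_2}(R) \not\subseteq r^{\mathcal{U}_{\mathcal{A}_2}}(x, x')\}.$$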

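Having constructed the game $\mathcal{G}_a = (G_a, F_a)$, we prove that verifying whether the elements $a\sigma \in \Delta^{\mathcal{U}_{\mathcal{K}}}$ can be $\Sigma_2$-homomorphically mapped on $\Delta^{\mathcal{U}_{\mathcal{A}_2}}$ reduces to checking whether Player 0 has a winning strategy in $\mathcal{G}_a$ from the state $(a, a, \mathsf{s})$.

Lemma 2. There exists a $\Sigma_2$-homomorphism from $\mathcal{U}_{\mathcal{K}}$ to $\mathcal{U}_{\mathcal{A}_2}$ iff (1) holds and for each $a \in Ind(\mathcal{A}_1)$, Player 0 does not have a winning strategy in $\mathcal{G}_a$ from $(a, a, \mathsf{s})$.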

The example below illustrates the presented reduction.

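Example 5. Assume $\mathcal{M} = (\{R(\cdot,\cdot), S(\cdot,\cdot), Q(\cdot,\cdot)\}, \Sigma_2, \mathcal{T}_{12})$, where $\Sigma_2 = \{R'(\cdot,\cdot), Q'(\cdot,\cdot)\}$ and $\mathcal{T}_{12} = \{R \sqsubseteq R', S \sqsubseteq R', Q \sqsubseteq Q'\}$, $\mathcal{K}_1 = \langle \mathcal{T}_1, \mathcal{A}_1 \rangle$, where $\mathcal{T}_1 = \{\exists S^- \sqsubseteq \exists R, \exists R^- \sqsubseteq \exists Q, \exists Q^- \sqsubseteq \exists Q\}$ and $\mathcal{A}_1 = \{\exists R(a), \exists S(a)\}$, and $\mathcal{A}_2 = \{R'(a, a), R'(a, b), Q'(b, b)\}$. Then $F_a = \{(b, w_R, \mathsf{d}), (a, w_Q, \mathsf{d})\}$, and the game graph $G_a$ is depicted in a) below (we ignore the states that are not reachable from $(a, a, \mathsf{s})$; the duplicating states forming $S_1$ are shown as ovals and the spoiling states forming $S_0$ are shown as boxes). In b) below we show $\mathcal{U}_{\mathcal{K}}$ (the domain elements $d \in \Delta^{\mathcal{U}_{\mathcal{K}}}$ are shown as dots, the labels next to them represent $tail(d)$, and the labels on the edges $(d, d')$ show the $\Sigma_2$-role names $P$ such that $(d, d') \in P^{\mathcal{U}_{\mathcal{K}}}$), in c) we show $\mathcal{U}_{\mathcal{A}_2}$, and the dashed arrows from $\mathcal{U}_{\mathcal{K}}$ to $\mathcal{U}_{\mathcal{A}_2}$ show the homomorphism $h$.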

[Figure: a) the game graph $G_a$, with the region $W$ and the set $F_a$ marked; b) the $\Sigma_2$-reduct of $\mathcal{U}_{\langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle}$; c) $\mathcal{U}_{\mathcal{A}_2}$, with the homomorphism $h$ drawn from b) to c).]

Observe that in the game $\mathcal{G}_a$ Player 0 does not have a winning strategy from $(a, a, \mathsf{s})$, because there is a way for Player 1 to play (infinitely) so that the game never goes out of the region $W$ shown in a). It is not difficult to see that such a strategy of Player 1 can be used to define the homomorphism $h$, and vice versa.
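The claim about Example 5 can be replayed on an explicit encoding of the reachable part of $G_a$. In the sketch below, the transitions were derived by hand from the definition of $T$ and the data of the example, so they are an assumption of this illustration rather than a transcription of the figure; the state names and the simple fixpoint solver are likewise hypothetical.

```python
def attractor(s0, s1, edges, target):
    """States from which Player 0 can force the play into `target`
    (naive fixpoint iteration, adequate for a graph this small)."""
    win = set(target)
    changed = True
    while changed:
        changed = False
        for s in (s0 | s1) - win:
            succ = {v for (u, v) in edges if u == s}
            if (s in s0 and succ & win) or (s in s1 and succ and succ <= win):
                win.add(s)
                changed = True
    return win


def st(x, y, p):
    """A game state (x, y, p) with p in {'s', 'd'}."""
    return (x, y, p)


# Spoiling states (Player 0) and duplicating states (Player 1).
S0 = {st("a", "a", "s"), st("a", "wS", "s"), st("b", "wS", "s"),
      st("a", "wR", "s"), st("b", "wR", "s"), st("b", "wQ", "s")}
S1 = {st("a", "wS", "d"), st("a", "wR", "d"), st("b", "wR", "d"),
      st("b", "wQ", "d"), st("a", "wQ", "d")}
EDGES = {
    (st("a", "a", "s"), st("a", "wS", "d")),
    (st("a", "a", "s"), st("a", "wR", "d")),
    (st("a", "wS", "d"), st("a", "wS", "s")),
    (st("a", "wS", "d"), st("b", "wS", "s")),
    (st("a", "wR", "d"), st("a", "wR", "s")),
    (st("a", "wR", "d"), st("b", "wR", "s")),
    (st("a", "wS", "s"), st("a", "wR", "d")),
    (st("b", "wS", "s"), st("b", "wR", "d")),
    (st("a", "wR", "s"), st("a", "wQ", "d")),
    (st("b", "wR", "s"), st("b", "wQ", "d")),
    (st("b", "wQ", "d"), st("b", "wQ", "s")),
    (st("b", "wQ", "s"), st("b", "wQ", "d")),
}
FA = {st("b", "wR", "d"), st("a", "wQ", "d")}

WIN0 = attractor(S0, S1, EDGES, FA)
SAFE_FOR_PLAYER1 = (S0 | S1) - WIN0
```

Under this encoding the initial state lies outside Player 0's winning region, and the complement of that region plays the role of the region $W$ from the example: Player 1 can keep every play inside it forever.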

Finally, combining Lemma 2, Lemma 1 and Proposition 1 one obtains:

Theorem 2. The membership problem for universal solutions with (non-extended) KBs is in PTIME.

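We conclude this section by addressing the non-emptiness problem. It follows from Conditions (tr), (hom) that there exists a universal solution for $\mathcal{K}_1$ under $\mathcal{M}$ iff $\mathcal{K}_2 = \langle \emptyset, \mathcal{A}_2 \rangle$ over $\Sigma_2$ is a universal solution for $\mathcal{K}_1$ under $\mathcal{M}$, where $\mathcal{A}_2$ satisfies (i) $\mathcal{A}_2 \models B(a)$ iff $\mathcal{K} \models B(a)$ and (ii) $\mathcal{A}_2 \models R(a, b)$ iff $\mathcal{K} \models R(a, b)$ for all $a, b \in Ind(\mathcal{A}_1 \cup \mathcal{A}_2)$, $\Sigma_2$-concepts $B$, and $\Sigma_2$-roles $R$. The required $\mathcal{A}_2$ can be constructed in PTIME, and it then remains to check whether $\mathcal{K}_2$ above is a universal solution. It follows: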

Theorem 3. The non-emptiness problem for universal solutions with (non-extended) KBs is in PTIME. Moreover, there is an effective algorithm to compute a universal solution in polynomial time (if such a solution exists).

6 Computing Universal Solutions: The Case of Extended Knowledge Bases

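We start with the membership problem for extended KBs. Assume we are given a mapping $\mathcal{M} = (\Sigma_1, \Sigma_2, \mathcal{T}_{12})$, a KB $\mathcal{K}_1 = \langle \mathcal{T}_1, \mathcal{A}_1 \rangle$ over $\Sigma_1$, and a KB $\mathcal{K}_2 = \langle \emptyset, \mathcal{A}_2 \rangle$ over $\Sigma_2$, where $\mathcal{A}_2$ is an extended ABox. It can be shown that an analogue of Lemma 1 holds, provided that the definition of $\mathcal{U}_{\mathcal{A}_2}$ is adjusted in an obvious way to account for nulls, and so is the definition of homomorphism. In this setting, the existence of a $\Sigma_2$-homomorphism from $\mathcal{U}_{\langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle}$ to $\mathcal{U}_{\mathcal{A}_2}$ can still be checked in PTIME; however, the opposite direction cannot be checked efficiently due to nulls in $\mathcal{A}_2$. In fact, it can be shown by reduction from the graph 3-colorability problem that the membership problem for universal solutions with null values is NP-hard. To decide in NP whether there exists a homomorphism $h$ from $\mathcal{U}_{\mathcal{A}_2}$ to $\mathcal{U}_{\langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle}$, we can use the fact that the image $W \subseteq \Delta^{\mathcal{U}_{\langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle}}$ of such a function $h$ on $\Delta^{\mathcal{U}_{\mathcal{A}_2}}$ is of polynomial size. Therefore, for each constant and null in $\mathcal{A}_2$, one needs to guess its homomorphic image, and then check whether the resulting function is a homomorphism. Thus, we obtain:

Theorem 4. The membership problem for universal solutions with extended KBs is NP-complete.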
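The guess-and-check idea can be illustrated with a deterministic stand-in that enumerates candidate images for the nulls and verifies each candidate in polynomial time. This is only a sketch under strong assumptions: the target is a finite structure supplied by the caller (for instance a finite part of the $\Sigma_2$-reduct of the canonical model), constants are mapped to themselves, and all names and encodings are hypothetical. The enumeration is exponential in the number of nulls, consistent with the NP-hardness stated above.

```python
from itertools import product


def check_candidate(h, concept_facts, role_facts, tgt_concepts, tgt_roles):
    """Polynomial-time check that h preserves every assertion."""
    return (all(h[u] in tgt_concepts.get(c, set())
                for (c, u) in concept_facts)
            and all((h[u], h[v]) in tgt_roles.get(r, set())
                    for (r, u, v) in role_facts))


def find_homomorphism(constants, nulls, concept_facts, role_facts,
                      tgt_domain, tgt_concepts, tgt_roles):
    """Try every assignment of target elements to the nulls and return
    the first one that passes the check, or None if there is none."""
    ordered_nulls = sorted(nulls)
    for image in product(sorted(tgt_domain), repeat=len(ordered_nulls)):
        h = {c: c for c in constants}  # constants keep their meaning
        h.update(zip(ordered_nulls, image))
        if check_candidate(h, concept_facts, role_facts,
                           tgt_concepts, tgt_roles):
            return h
    return None


# Data in the spirit of Example 3: the target has the constant a and one
# anonymous element (named "a.wR" here) that belongs to B.
target_domain = {"a", "a.wR"}
target_concepts = {"B": {"a.wR"}}
with_null = find_homomorphism(set(), {"n"}, {("B", "n")}, set(),
                              target_domain, target_concepts, {})
with_constant = find_homomorphism({"a"}, set(), {("B", "a")}, set(),
                                  target_domain, target_concepts, {})
```

With the null, the search maps n to the anonymous element; with the constant a in its place no homomorphism exists, which mirrors the remark after Example 3 that replacing the null by a constant is too restrictive.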

Consider now the problem of checking whether there exists a universal solution $\mathcal{K}_2 = \langle \emptyset, \mathcal{A}_2 \rangle$ for $\mathcal{K}_1$ under $\mathcal{M}$. This problem turns out to be harder than the membership problem as now candidate solutions are not part of the input. In fact, we show by reduction from the validity problem for quantified Boolean formulas that checking the existence of a universal solution is PSPACE-hard. As for the upper bound, first, it can be shown that such an $\mathcal{A}_2$ exists if and only if $\mathcal{U}_{\langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle}$ is $\Sigma_2$-homomorphically embeddable into a finite part of itself. Then, such a finite subset of $\mathcal{U}_{\langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle}$ projected on $\Sigma_2$ can be taken as a universal solution for $\mathcal{K}_1$ under $\mathcal{M}$. As the inclusion of inverse roles is one of the main sources of complexity, we use two-way alternating automata on infinite trees (2ATA), which are a generalization of nondeterministic automata on infinite trees [21] well suited for handling inverse roles in $\textit{DL-Lite}_{\mathcal{R}}$. More precisely, given a KB $\mathcal{K}$, we first show that it is possible to construct the following automata: (1) $\mathcal{A}^{can}_{\mathcal{K}}$ is a 2ATA that accepts trees corresponding to the canonical model of $\mathcal{K}$ with nodes arbitrarily labeled with a special symbol $G$; (2) $\mathcal{A}^{mod}_{\mathcal{K}}$ is a 2ATA that accepts a tree if its subtree labeled with $G$ corresponds to a tree model $\mathcal{I}$ of $\mathcal{K}$ (that is, a model forming a tree on the labeled nulls); and (3) $\mathcal{A}_{fin}$ is a (one-way) non-deterministic automaton that accepts a tree if it has a finite prefix where each node is marked with $G$, and no other node in the tree is marked with $G$. Then to verify whether a KB $\mathcal{K}_1 = \langle \mathcal{T}_1, \mathcal{A}_1 \rangle$ has a universal solution under a mapping $\mathcal{M} = (\Sigma_1, \Sigma_2, \mathcal{T}_{12})$, we solve the non-emptiness problem for an automaton $\mathcal{B}$ defined as the product automaton of $\pi_{\Gamma_{\mathcal{K}}}(\mathcal{A}^{can}_{\mathcal{K}})$, $\pi_{\Gamma_{\mathcal{K}}}(\mathcal{A}^{mod}_{\mathcal{K}})$ and $\mathcal{A}_{fin}$, where $\mathcal{K} = \langle \mathcal{T}_1 \cup \mathcal{T}_{12}, \mathcal{A}_1 \rangle$, $\pi_{\Gamma_{\mathcal{K}}}(\mathcal{A}^{can}_{\mathcal{K}})$ is the projection of $\mathcal{A}^{can}_{\mathcal{K}}$ on a vocabulary $\Gamma_{\mathcal{K}}$ not mentioning symbols from $\Sigma_1$, and likewise for $\pi_{\Gamma_{\mathcal{K}}}(\mathcal{A}^{mod}_{\mathcal{K}})$. If the language accepted by $\mathcal{B}$ is empty, then there is no universal solution for $\mathcal{K}_1$ under $\mathcal{M}$; otherwise a universal solution (of exponential size) exists and it can be extracted from the tree accepted by $\mathcal{B}$.

Universal solutions    ABoxes    Extended ABoxes
Membership             PTIME     NP-complete
Non-emptiness          PTIME     PSPACE-hard, in EXPTIME

Fig. 1. Complexity results for universal solutions

Theorem 5. The non-emptiness problem for universal solutions with extended KBs is PSPACE-hard and in EXPTIME. Moreover, there is an effective algorithm to compute a universal solution in exponential time (if such a solution exists).

7 Universal UCQ-solutions

We start by arguing that universal UCQ-solutions exhibit more robust behavior in the presence of disjointness assertions than universal solutions.

Example 6. Consider $\mathcal{M}$, $\mathcal{K}_1$, and $\mathcal{A}_2$ from Example 4. Recall that $\mathcal{A}_2$ is not a universal solution for $\mathcal{K}_1$ under $\mathcal{M}$. However, $\mathcal{A}_2$ is a universal UCQ-solution for $\mathcal{K}_1$ under $\mathcal{M}$. Moreover, $\mathcal{A}_2$ remains a universal UCQ-solution for $\mathcal{K}_1$ under $\mathcal{M}$ independently of whether the unique name assumption is employed.

Unfortunately, universal UCQ-solutions are also harder to compute, which can be explained by the fact that TBoxes have a bigger impact on the structure of universal UCQ-solutions than on that of universal solutions. In fact, by using a reduction from the validity problem for quantified Boolean formulas, similar to a reduction in [16], we are able to prove the following:

Theorem 6. The membership problem for universal UCQ-solutions is PSPACE-hard.

8 Conclusions

A summary of our results for universal solutions is presented in Figure 1.

In this paper, we have studied the problem of KB exchange for OWL 2 QL, improving on previously known results both w.r.t. the expressiveness of the ontology language and w.r.t. the understanding of the computational properties of the problem. Our main contribution is a novel PTIME algorithm for the membership and non-emptiness problems for universal solutions when OWL 2 QL ABoxes are considered. Our investigation leaves open several issues, which we intend to address in the future. For the computation of universal solutions when extended ABoxes are allowed, while we have pinned down the complexity of the membership problem as NP-complete, an exact characterization for the non-emptiness problem is still missing. Moreover, it is easy to see that allowing for inequalities between terms and for negated atoms in the (target) ABox would allow one to obtain more universal solutions, but a full understanding of this case is still missing. Finally, we intend to investigate the challenging problem of computing universal UCQ-solutions, adopting also here an automata-based approach.

References

1. Philippe Adjiman, Philippe Chatalic, François Goasdoué, Marie-Christine Rousset, and Laurent Simon. Distributed reasoning in a peer-to-peer setting: Application to the Semantic Web. J. of Artificial Intelligence Research, 25:269–314, 2006.

2. Marcelo Arenas, Elena Botoeva, and Diego Calvanese. Knowledge base exchange. In Proc. of the 24th Int. Workshop on Description Logic (DL 2011), volume 745 of CEUR Electronic Workshop Proceedings, http://ceur-ws.org/, 2011.

3. Marcelo Arenas, Elena Botoeva, Diego Calvanese, Vladislav Ryzhikov, and Evgeny Sherkhonov. Exchanging description logic knowledge bases. In Proc. of the 13th Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR 2012), 2012.

4. Marcelo Arenas, Elena Botoeva, Diego Calvanese, Vladislav Ryzhikov, and Evgeny Sherkhonov. Representability in DL-Lite_R knowledge base exchange. In Proc. of the 25th Int. Workshop on Description Logic (DL 2012), volume 846 of CEUR Electronic Workshop Proceedings, http://ceur-ws.org/, 2012.

5. Marcelo Arenas, Jorge Pérez, and Juan L. Reutter. Data exchange beyond complete data. In Proc. of the 30th ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS 2011), pages 83–94, 2011.

6. Alessandro Artale, Diego Calvanese, Roman Kontchakov, and Michael Zakharyaschev. The DL-Lite family and relations. J. of Artificial Intelligence Research, 36:1–69, 2009.

7. Jie Bao et al. OWL 2 Web Ontology Language document overview (second edition). W3C Recommendation, World Wide Web Consortium, December 2012. http://www.w3.org/TR/owl2-overview/.

8. Pablo Barceló. Logical foundations of relational data exchange. SIGMOD Record, 38(1):49–58, 2009.

9. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and Riccardo Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. J. of Automated Reasoning, 39(3):385–429, 2007.

10. Krishnendu Chatterjee and Monika Henzinger. An O(n^2) time algorithm for alternating Büchi games. In Proc. of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2012), pages 1386–1399, 2012.

11. Jérôme Euzenat and Pavel Shvaiko. Ontology Matching. Springer, 2007.

12. Ronald Fagin, Phokion G. Kolaitis, Renée J. Miller, and Lucian Popa. Data exchange: Semantics and query answering. Theoretical Computer Science, 336(1):89–124, 2005.

13. Ariel Fuxman, Phokion G. Kolaitis, Renée J. Miller, and Wang Chiew Tan. Peer data exchange. ACM Trans. on Database Systems, 31(4):1454–1498, 2006.

14. Anastasios Kementsietsidis, Marcelo Arenas, and Renée J. Miller. Mapping data in peer-to-peer systems: Semantics and algorithmic issues. In Proc. of the ACM SIGMOD Int. Conf. on Management of Data, pages 325–336, 2003.

15. Phokion G. Kolaitis. Schema mappings, data exchange, and metadata management. In Proc. of the 24th ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS 2005), pages 61–75, 2005.

16. Boris Konev, Roman Kontchakov, Michel Ludwig, Thomas Schneider, Frank Wolter, and Michael Zakharyaschev. Conjunctive query inseparability of OWL 2 QL TBoxes. In Proc. of the 25th AAAI Conf. on Artificial Intelligence (AAAI 2011), 2011.

17. Maurizio Lenzerini. Data integration: A theoretical perspective. In Proc. of the 21st ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS 2002), pages 233–246, 2002.

18. René Mazala. Infinite games. In Automata, Logics, and Infinite Games, pages 23–42, 2001.

19. Boris Motik, Bernardo Cuenca Grau, Ian Horrocks, Zhe Wu, Achille Fokoue, and Carsten Lutz. OWL 2 Web Ontology Language profiles (second edition). W3C Recommendation, World Wide Web Consortium, December 2012. http://www.w3.org/TR/owl2-profiles/.

20. Pavel Shvaiko and Jérôme Euzenat. Ontology matching: State of the art and future challenges. IEEE Trans. on Knowledge and Data Engineering, 25(1):158–176, 2013.

21. Moshe Y. Vardi. Reasoning about the past with two-way automata. In Proc. of the 25th Int. Coll. on Automata, Languages and Programming (ICALP'98), volume 1443 of Lecture Notes in Computer Science, pages 628–641. Springer, 1998.
