Computing Solutions in OWL 2 QL Knowledge Exchange

(1)

Knowledge Base Exchange

M. Arenas¹, E. Botoeva², D. Calvanese², and V. Ryzhikov²

1 Dept. of Computer Science, PUC Chile & University of Oxford [email protected]

2 KRDB Research Centre, Free Univ. of Bozen-Bolzano, Italy [email protected]

Abstract. The problem of exchanging knowledge bases from a source signature to a target signature connected through a mapping has recently attracted attention in knowledge representation. In this paper, we study this problem for knowledge bases and mappings expressed in OWL 2 QL, one of the profiles of the standard Web Ontology Language OWL 2. More specifically, we consider the membership and non-emptiness problems associated with computing universal solutions, which have been identified as one of the most desirable translations to be materialized. We study two settings: when ABoxes are in OWL 2 QL and when null values are allowed in the ABox language. For the former case, we provide a novel technique based on reachability games on graphs to show that the non-emptiness and membership problems are in PTime. For the latter case, we report a range of complexity results from NP to EXPTIME. We also consider the problem of computing universalUCQ-solutions, which provide an alternative notion of translation containing sufficient information to properly answer union of conjunctive queries, reporting a PSPACElower bound for the membership problem.

1 Introduction

Complex forms of information, maintained in different formats and organized according to different structures, often need to be shared between agents. In recent years, both in the data management and in the knowledge representation communities, several settings have been investigated that address this problem from various perspectives: in information integration, uniform access is provided to a collection of data sources by means of an ontology (or global schema) to which the sources are mapped [17]; in peer-to-peer systems, a set of peers declaratively linked to each other collectively provide access to the information assets they maintain [14,1,13]; in ontology matching, the aim is to un- derstand and derive the correspondences between elements in two ontologies [11,20];

finally, in data exchange, the information stored according to a source schema needs to be restructured and translated so as to conform to a target schema [12,8].

The work we present in this paper is inspired by the latter setting, investigated in databases. We study it, however, under the assumption of incomplete information typi- cal of knowledge representation [5]. Specifically, we investigate the problem ofknowl- edge base exchange, where a source knowledge base (KB) is connected to a target KB by means of a declarative mapping specification, and the aim is to exchange knowledge

(2)

from the source to the target by exploiting the mapping. We rely on a framework for KB exchange proposed recently in [2,3,4], based on lightweight Description Logics (DLs) of theDL-Litefamily [9]: both source and target are KBs constituted by a DL TBox, representing implicit information, and an ABox, representing explicit information, and mappings are sets of DL concept and role inclusions.

In this paper, we adjust the above mentioned framework to OWL 2 QL [19], one of the profiles of the standard Web Ontology Language OWL 2 [7], and then study the problem of computing universal solutions, which have been identified as one of the most desirable translations to be materialized. We investigate both the task of checking membership, where a candidate universal solution is given and one needs to check its correctness, andnon-emptiness, where the aim is to determine the existence of a universal solution. We prove that both problems can be solved in PTIMEusing a novel reduction to the problem of finding awinning strategyin reachability games on graphs [18].

Then, we argue that for certain natural shapes of source KBs and mappings the universal solutions do not exist, unless null values are allowed in ABox languages. So, we considerextendedABoxes that may contain nulls, presenting a number of complexity results ranging from NP to EXPTIMEfor this setting.

Finally, we consideruniversalUCQ-solutions, an alternative notion for materializa- tion of knowledge in the target that contains sufficient information to answer unions of conjunctive queries. We show that universalUCQ-solutions exist for certain source KBs and mappings, where universal solutions (even with extended ABoxes) do not. We also report PSPACE-hardness for the membership problem for universalUCQ-solutions.

The paper is organized as follows. We give preliminary notions on DLs and queries in Section 2, and on KB exchange in Section 3. In Section 4, we present the known results and give some intuition about the shape of universal solutions. In Sections 5 and 6, we present the results on the complexity of computing universal solutions for KBs and extended KBs, respectively. In Section 7, we consider universalUCQ-solutions and in Section 8, we present conclusions and outline some future work.

2 Preliminaries

The DLs of theDL-Litefamily [6] are characterized by the fact that standard reasoning can be done in polynomial time. We adapt hereDL-Lite_R, and present now its syntax and semantics. LetNC,NR,Na,N` be pairwise disjoint sets ofconcept names,role names,constants, andlabeled nulls, respectively. Assume in the following thatA∈NC

andP ∈NR; inDL-LiteR,BandCare used to denote basic and arbitrary (or complex) concepts, respectively, andRandQare used to denote basic and arbitrary (or complex) roles, respectively, defined as follows:

R ::= P | P⁻ Q ::= R | ¬R

B ::= A | ∃R C ::= B | ¬B

Below, for a basic roleR, we useR⁻to denoteP⁻whenR=P, andPwhenR=P⁻. A TBox is a finite set ofconcept inclusionsB v C androle inclusionsR v Q.

We call an inclusion of the formB1v ¬B2orR1v ¬R2adisjointness assertion. An ABox is a finite set ofmembership assertionsB(a),R(a, b), wherea, b ∈ Na. Here,

(3)

we also consider extended ABoxes, obtained by allowing labeled nulls in membership assertions. Formally, anextended ABoxis a finite set of membership assertionsB(u) andR(u, v), whereu, v∈(N_a∪N_`). We denote byInd(A)the set of constants occurring inA. Moreover, a(nextended)KBKis a pairhT,Ai, whereT is a TBox andAis an (extended) ABox. AsignatureΣis a finite set of concept and role names. A KBKis said to bedefined over(or simply,over)Σif all the concept and role names occurring in Kbelong toΣ(and likewise for TBoxes, ABoxes, concept inclusions, role inclusions, and membership assertions). Aninterpretation I of Σ is a pairh∆Î,·Îi, where∆Î is a non-empty domain and·Î is an interpretation function such that: (1)AÎ ⊆ ∆Î, for every concept nameA ∈ Σ; (2)PÎ ⊆ ∆Î ×∆Î, for every role name P ∈ Σ;

and (3)aÎ ∈∆Î, for every constanta∈Na. Function·Îis extended to also interpret concept and role constructs: (1)(∃R)Î={x∈∆Î| ∃y∈∆Îsuch that(x, y)∈RÎ};

(2) (¬B)Î = ∆Î \ BÎ; (3) (P⁻)Î = {(y, x) ∈ ∆Î ×∆Î |(x, y) ∈ PÎ}; and (4)(¬R)Î= (∆Î×∆Î)\RÎ. Note that, consistently with the semantics of OWL 2 QL, we donotmake the unique name assumption, i.e., distinct constantsa, b∈N_amay be interpreted as the same object. Note also that labeled nulls arenotinterpreted byI.

LetI =h∆Î,·Îibe an interpretation over a signatureΣ. ThenIis said to satisfy a concept inclusionB vCoverΣ, denoted byI |=B vC, ifBÎ ⊆CÎ;Iis said to satisfy a role inclusion R v QoverΣ, denoted by I |= R v Q, if RÎ ⊆ QÎ; andI is said to satisfy a TBoxT over Σ, denoted by I |= T, ifI |= αfor every α∈ T. Moreover, satisfaction of membership assertions overΣis defined as follows.

AsubstitutionoverI is a functionh : (N_a∪N_`) → ∆Î such thath(a) = aÎ for everya∈ N_a. ThenI is said to satisfy an (extended) ABoxA, denoted byI |=A, if there exists a substitutionhoverIsuch that: (1) for everyB(u)∈ A, it holds that h(u)∈ BÎ; and (2) for everyR(u, v) ∈ A, it holds that(h(u), h(v))∈ RÎ. Finally, I is said tosatisfya(n extended) KBK =hT,Ai, denoted byI |=K, ifI |=T and I |=A. SuchI is called amodelofK, and we use MOD(K)to denote the set of all models ofK. We say thatKisconsistentif MOD(K)6=∅. As is customary, given an (extended) ABoxAover a signatureΣand a membership assertionαoverΣ, we use notationA |=αto indicate that for every interpretationIofΣ, ifI |=A, thenI |=α (and likewise for (extended) KBs).

We also need to introduce the notions ofΣ-types andΣ-homomorphisms. For an interpretationIand a signatureΣ, theΣ-typestÎ_Σ(x)andrÎ_Σ(x, y)forx, y∈∆Î are given by the set of conceptsBand rolesRoverΣ, respectively, such thatx∈BÎand (x, y) ∈ RÎ. We also usetÎ(x)andrÎ(x, y)to refer to the types over the signature of allDL-Lite_Rconcepts and roles. AΣ-homomorphismfrom an interpretationIto an interpretationJ is a functionh:∆Î 7→∆^J such thath(aÎ) =a^J, for all individual namesainterpreted inI,tÎ_Σ(x) ⊆ t^J_Σ(h(x))andrÎ_Σ(x, y) ⊆r^J_Σ(h(x), h(y))for all x, y ∈∆Î. We say thatIisΣ-homomorphically embeddableintoJ if there exists a Σ-homomorphism fromItoJ. IfΣis the set of allDL-Lite_Rconcepts and roles, we callΣ-homomorphism simplyhomomorphism.

DL-LiteR enjoys the canonical model property. Let K = hT,Ai be a (non- extended) KB, and v^R_T the reflexive and transitive closure of the role relation on the set of all basic roles over NR induced by T (that is, the reflexive and transitive closure of {(R1, R2) | R1 v R2 ∈ T or R⁻₁ v R⁻₂ ∈ T }). Then define

(4)

[R] = {S | R v^R_T SandS v^R_T R}, [R] ≤_T [S] if R v^R_T S, and a generating relationship Kas follows:

– a _Kw_[R], if (1)K |=∃R(a); (2)K 6|=R(a, b)for everyb∈Na; (3)[R⁰] = [R]

for every[R⁰]such that[R⁰]≤_T [R]andK |=∃R⁰(a).

– w_[S] _K w_[R], if (1)T |=∃S⁻ v ∃R; (2)[S⁻]6= [R]; (3)[R⁰] = [R]for every [R⁰]such that[R⁰]≤_T [R]andT |=∃S⁻ v ∃R⁰.

Denote by path(K) the set of all K-paths, where a K-path is a sequence aw[R₁]. . . w[R_n] such thatn ≥ 0,a ∈ Na,a K w[R₁] andw[R_i] K w[R_i+1] for 1≤i≤n−1. Moreover, for everyσ∈path(K), denote bytail(σ)the last element in σ. Finally, thecanonical(or,universal)modelofK, denotedUK, is defined as:

∆^U^K=path(K), a^U^K=a, fora∈N_a,

AÛ^K={a∈Ind(A)| K |=A(a)} ∪ {σ·w_[R]∈∆Û^K | T |=∃R⁻ vA}, PÛ^K={(a, b)∈Ind(A)×Ind(A)| K |=P(a, b)} ∪

{(σ, σ·w_[R])|tail(σ) _Kw_[R],[R]≤_T [P]} ∪ {(σ·w_[R], σ)|tail(σ) _Kw_[R],[R⁻]≤_T [P]}.

Theorem 1 ([16]).IfK is consistent,UKis a model of K. For every modelI |= K, there exists a homomorphism fromUKtoI.

Queries and certain answers. A k-ary query qover a signature Σ, withk ≥ 0, is a function that maps every interpretation h∆Î,·Îi of Σ into ak-ary relation qÎ ⊆ (∆Î)^k. Given a (non-extended) KB K over Σ, the set ofcertain answers toq over K, denoted bycert(q,K), is defined as:T

I∈MOD(K){(a1, . . . , ak) | {a1, . . . , ak} ⊆ Naand(aÎ₁, . . . , aÎ_k)∈qÎ}. Aconjunctive query(CQ)over a signatureΣis a formula of the formq(x) = ∃y. ϕ(x,y), wherex,yare tuples of variables andϕ(x,y)is a conjunction of atoms of the formA(t), withAa concept name inΣ, andP(t, t⁰), with Pa role name inΣ, where each oft, t⁰is either a constant fromNaor a variable fromx ory. Given an interpretationI=h∆Î,·ÎiofΣ, the answer ofqoverI, denoted byqÎ, is the set of tuplesaof elements from∆Î for which there exist a tuplebof elements from∆Îsuch thatIsatisfies every conjunct inϕ(a,b). A union of conjunctive queries (UCQ) over a signatureΣis a formula of the formq(x) =Wn

i=1q_i(x), where eachq_i (1≤i≤n) is aCQoverΣ, whose semantics is defined asq^I=Sn

i=1q^I_i.

3 Knowledge Base Exchange Framework for OWL 2 QL

Assume thatΣ₁,Σ₂are signatures with no concepts or roles in common. An inclusion E₁ v E₂is said to befromΣ₁toΣ₂, ifE₁ is a concept or a role overΣ₁ andE₂is a concept or a role overΣ₂. A mapping is a tupleM = (Σ₁, Σ₂,T₁₂), whereT₁₂is a TBox consisting of inclusions fromΣ₁toΣ₂[2]. The semantics of such a mapping is defined in [2] in terms of a notion of satisfaction for interpretations, which has to be extended in our case to deal with interpretations not satisfying the unique name assumption (and, more generally, the standard name assumption). More specifically,

(5)

given interpretationsI,J ofΣ₁andΣ₂, respectively, pair(I,J)satisfiesTBoxT₁₂, denoted by(I,J) |= T₁₂, if (1) for everya ∈ N_a, it holds that aÎ = a^J, (2) for every concept inclusionB v C ∈ T₁₂, it holds that BÎ ⊆ C^J, and (3) for every role inclusion R v Q ∈ T12, it holds thatRÎ ⊆ Q^J. Notice that the connection between the information in I andJ is established through the constants that move from source to target according to the mapping. For this reason, we require constants to be interpreted in the same way inIandJ, i.e., they preserve their meaning when they are transferred. Finally, SAT_M(I)is defined as the set of interpretationsJ ofΣ2such that(I,J)|=T12, and given a setXof interpretations ofΣ1, SAT_M(X)is defined as S

I∈XSATM(I).

The main problem studied in the knowledge exchange area is the problem of translating a KB according to a mapping, which is formalized through several different notions of translation. LetM= (Σ1, Σ2,T12)be a mapping,K1=hT1,A1ia KB over Σ1 andK2 = hT2,A2i an extended KB overΣ2. The first such notion is the concept of solution, which is formalized as follows:K2is a solutionforK1underMif MOD(K2)⊆SATM(MOD(K1)). Thus,K2is a solution forK1underMif every interpretation ofK2is a valid translation of an interpretation ofK1according toM. Then, K2is auniversal solutionforK1underMif MOD(K2) = SATM(MOD(K1)). Thus, K2is designed to exactly represent the space of interpretations obtained by translating the interpretations ofK1underM[2].

A second class of translations is obtained in [2] by observing that solutions and universal solutions are too restrictive for some applications, in particular when one only needs a translation storing enough information to properly answer some queries. For the particular case ofUCQ, this gives rise to the notions ofUCQ-solution and universal UCQ-solution. LetK1,Mas above andK2 a KB overΣ₂. ThenK2 aUCQ-solution for K1 under M if for every query q ∈ UCQover Σ₂: cert(q,hT1 ∪ T12,A1i) ⊆ cert(q,K₂), whileK₂is auniversalUCQ-solutionforK₁underMif for every query q∈UCQoverΣ₂:cert(q,hT₁∪ T₁₂,A₁i) =cert(q,K₂).

Arguably, the most important problem in knowledge exchange [5,2], as well as in data exchange [12,15], is the task of computing a translation of a KB according to a mapping. To study the complexity of this task for the notions of solution just presented, we introduce the following decision problems. Themembershipproblem for universal solutions (resp. universalUCQ-solutions) has as input a mappingM= (Σ₁, Σ₂,T₁₂), a KBK₁overΣ₁, and an extended KBK₂(resp. a KBK₂) overΣ₂. Then the question to answer is whetherK2is a universal solution (resp. universalUCQ-solution) forK1

underM. In thenon-emptinessproblem for universal solutions the input is the same, except forK2, and the question to answer is whether a universal solution K2 exists (analogously for universal UCQ-solutions). The non-emptiness problems are directly related with the problem of computing translations of a KB according to a mapping.

4 The Shape of Universal Solutions

In what follows, we show some known results and examples of universal solutions.

First of all, it was shown in [3] that a KBK2=hT2,A2i(extended or not) overΣ2is a universal solution forK1=hT2,A2ioverΣ1underM= (Σ1, Σ2,T12)only if TBox

(6)

T₂is trivial (we call a TBoxT trivial ifT is equivalent to the empty set of formulas).

Therefore, in the context of universal solutions, we only consider target KBs of the form h∅,A₂i, and we treat ABoxesA₂as such KBs.

Example 1. Assume thatM = {A(·), B(·)},{A⁰(·), B⁰(·)},{A v A⁰, B v B⁰} , and let K1 = hT1,A1i, whereT1 = {}and A1 = {A(a), B(b)}. Then the ABox A2={A⁰(a), B⁰(b)}is a straightforward universal solution forK1underM.

It can be shown (c.f. Lemma 1) that in the language without disjointness assertions, an ABox A2 is a universal solution forhT1,A1i under M = (Σ1, Σ2,T12)if and only ifUA2isΣ₂-homomorphically equivalent toU_hT₁_∪T₁₂_,A₁_i. This fact is used in the following more involved example:

Example 2. Assume M = {A(·), R(·,·), S(·,·)}, Σ2,T12

whereΣ2 = {Q(·,·)}, T12={RvQ, SvQ}, and thatK1=hT1,A1i, whereT1={Av ∃R,∃R⁻v ∃R}

andA1 = {A(a), S(a, a)}. LetA2 = {Q(a, a)}. In the following picture, it is easy to see his aΣ₂-homomorphism fromU_hT₁_∪T₁₂_,A₁_i toUA2. The existence of a Σ₂- homomorphism in the other direction is trivial and, hence,A2is a universal solution for K1underM.

UA₂ :

a Q

Σ2-reduct of

UhT₁∪T₁₂,A₁i : ^a

aw_[R] aw_[R]w_[R] aw_[R]w_[R]w_[R]

· · ·

Q Q Q

Q

h

The following example shows that extended ABoxes are necessary to guarantee the existence of universal solutions in certain cases.

Example 3. Assume thatM= {A(·), R(·,·)},{B(·)},{∃R⁻ vB}

, and letK1 = hT1,A1i, whereT1 ={Av ∃R}andA1 ={A(a)}. Then the ABoxA2 ={B(n)}, wherenis a labeled null, is a universal solution forK1underMif nulls are allowed.

Notice that here, a universal solution with non-extended ABoxes does not exist: substi- tutingnby any constant is too restrictive, ruining universality.

Finally, we discuss the impact of disjointness assertions on the universal solutions.

Example 4. Consider Example 1 withT1={Av ¬B}. With this seemingly harmless disjointness assertionA2is no longer a universal solution (not even a solution) forK1

underM. The reason for that is the lack of the unique name assumption on the one hand, and the presence of the disjointness assertion in T₁ that forcesa andb to be interpreted differently in the source, on the other hand. Thus, for a modelJ ofA2such thata^J =b^J,A^0J =B^0J ={a^J}, there is no modelIofK1such that(I,J)|=T12

(hence,a^I =a^J andb^I=b^J). In general, there is no universal solution forK1under M, even thoughK1andT12are consistent with each other.

The problem raised by the latter example could be solved by allowing for inequalities between constants in the ABoxes. A similar problem appears with disjointness assertions in the mapping, but it requires negative facts to be present for a universal solution to exist (i.e., facts of the form¬A(a),¬P(a, b)), which are not part of OWL 2 QL.

(7)

On the other hand, having disjointness assertion in the source or the mapping does not exclude the existence of universalUCQ-solutions, which is explained in Example 6.

5 Computing Universal Solutions: The Case of Knowledge Bases

In this section, we show that both the membership and the non-emptiness problems for universal solutions without null values are in PTIME.

Assume thatΣ1,Σ2are disjoint signatures, and thatK =hT1∪ T12,A1iis a KB such thatT1is defined overΣ1andT12is a set of inclusions fromΣ1toΣ2. Moreover, letU_Kbe the canonical model ofK. Then a basic conceptBoverΣ1is said to besafe inU_Kifd6∈Naandt^U_Σ^K

2(d) =∅for everyd∈B^U^K. Intuitively, safeness forBmeans no constant “associated” withB and no target concept that is “associated” withB via T1 andT12will be mentioned in the target; in Example 4 neither AnorB is safe in U_hT₁_∪T₁₂_,A₁_i. Furthermore, a pair of basic concepts(B, C)is is said to besafeifBor Cis safe. Intuitively, if a pair(B, C)is not safe and(B v ¬C)∈ T1, then universal solution cannot exist, as explained in Example 4. Similarly, we say a basic roleRover Σ₁is safe if eitherd 6∈N_a andt^U_Σ^K

2(d) = ∅, ord⁰ 6∈ N_a andt^U_Σ^K

2(d⁰) = ∅, for every (d, d⁰)∈R^U^K. A pair of roles(R, Q)is safe if 1)RorQis safe, and 2)t^U_Σ^K

2(d⁰) =∅ ort^U_Σ^K

2(d⁰⁰) =∅for everyd, d⁰, d⁰⁰∈∆Û^Ksuch that(d, d⁰)∈RÛ^Kand(d, d⁰⁰)∈QÛ^K. Lemma 1. A KB K2 = hT2,A2i over Σ2 is a universal solution for a KB K1 = hT1,A1iunder a mappingM= (Σ1, Σ2,T12)iff the following conditions hold:

(tr) T2is a trivial TBox,

(hom) UA₂isΣ2- homomorphically equivalent toU_hT₁_∪T₁₂_,A₁_i,

(ps1) each pair of concepts(B, C)is safe inU_hT₁_∪T₁₂_,A₁_iwhenever(Bv ¬C)∈ T₁, (pm1) B^U^K =∅for each basic conceptBsuch that(Bv ¬B⁰)∈ T12,

(ps2) each pair of roles(R, Q)is safe inU_hT₁_∪T₁₂_,A₁_iwhenever(Rv ¬Q)∈ T1, (pm2) R^U^K=∅for each basic roleRsuch that(Rv ¬R⁰)∈ T12.

It can be readily verified that conditions(tr),(ps1),(ps2),(pm1),(pm2)and the existence of aΣ2-homomorphism fromUA₂ toU_hT₁_∪T₁₂_,A₁_irequired by(hom), are solv- able in polynomial time. To solve the problem of existence of aΣ2-homomorphism in the opposite direction, we are going to employ the technique of reachability games on graphs. Below we present the required basic notions.

5.1 Reachability games on graphs

A game is defined by a game graph (a playground) and a winning condition. Agame graphis a tripleG= (S₀, S₁, T), whereS=S₀∪S1is a finite set of states,S₀∩S1=∅ andT ⊆S×S is a transition relation. The game starts in some states₀ ∈ S, and it is played in turns. In each turn, if the current statesis inS_i(i = 0,1), then Playeri chooses some states⁰∈Ssuch that(s, s⁰)∈T. Thus, each play in the game is viewed as a pathπ, which can be infinite (π=s₀, s₁, s₂, . . ., wheres_i ∈Sand(s_i, s_i+1)∈T for everyi ≥ 0) or finite (π = s0, s1, s2, . . . , sk ∈ S^k+1, where(si, si+1) ∈ T for everyi∈ {0, . . . , k−1}and{s|(sk, s)∈T}=∅).

(8)

The winning condition defines what are the plays won by Player 0. We will consider areachability acceptance conditionspecified as follows: given a set of accepting states F ⊆S, a playπis awinfor Player 0 iff some vertex from F occurs inπ. Finally, a reachability gameis a pair G = (G, F)whereGis a game graph and F is a set of accepting states.

Astrategy for Player 0 from statesis a (partial) function f0 : S^∗S0 → S such that it assigns to each sequence of statess0, s1, . . . , sk withs0 = sandsk ∈ S0, a successor statesk+1such that(sk, sk+1)∈T. A playπ=s0s1· · · is said toconform with strategyf0ifsi+1 = f0(s0s1. . . si)for everysi ∈ S0. Then, a strategyf0 is a winning strategyfor Player 0froms∈Sif every play that conforms withf0and starts insis a win for Player 0.

Proposition 1 ([18], [10]).Given a reachability gameG= (G, F)and a statesinG, it can be checked inPTIMEwhether Player 0 has a winning strategy froms.

5.2 The reduction

Assume given a mappingM= (Σ₁, Σ₂,T₁₂), a KBK₁=hT₁,A₁ioverΣ₁, and a KB K2=h∅,A2ioverΣ2(w.l.o.g., we can assume that the TBox ofK2is empty). Denote hT1∪ T12,A1i byK. We show how the problem of checking whether there exists a Σ2-homomorphismhfromU_KtoU_A₂can be reduced to the problem of existence of a winning strategy for Player 0 in a reachability game.

First, for such a homomorphismhto exist, it should be the case that t^U_Σ^K

2(a)⊆tÛÂ²(a)andrÛ_Σ^K

2(a, b)⊆t^U^A²(a, b)for alla, b∈Ind(A1). (1) These conditions can be clearly checked in PTIME.

Now, to check how the elementsa ∈ ∆^U^K witha ∈ Ind(A1)can be mapped on

∆ÛÂ², we construct a gameGa = (Ga, Fa). The game graphGa = (S0, S1, T)has the set of states of the kind(x, y,p), wherex∈ ∆ÛÂ²,y ∈ {tail(aσ) |aσ ∈∆Û^K}, andp∈ {s,d}. The states(x, y,s)formS0and will be calledspoiling; intuitively, the moves going out of such states represent various edges of the treeU_Kaccessible from the end of the current edge. On the other hand, the states(x, y,d)formS1and will be calledduplicating; the moves from them “show” how the “challenged” edge of the tree UKcan be “mapped” onUA2. Notice that the size ofGaisO(|T1∪ T12| × |A2|). The transition relationTis defined as follows:

T ={ (x, y,s),(x⁰, y⁰,d)

|y Ky⁰andx⁰=x} ∪ { (x, y,d),(x⁰, y⁰,s)

|y=w_[R], cl_Σ^T¹^∪T¹²

2 (∃R⁻)⊆t^U^A²(x⁰), cl_Σ^T¹^∪T¹²

2 (R)⊆r^U^A²(x, x⁰), andy⁰=y}, where for a TBoxT and a conceptB,cl^T_Σ(B)is the set of all conceptsB⁰overΣsuch thatT |=BvB⁰, and for a roleR,cl_Σ^T(R)is defined analogously.

The setF_a in the definition of the game is given by the duplicating states that are

“dead ends”, i.e.,

F_a ={(x, y,d)|y=w_[R]and for allx⁰∈∆^U^A², cl^T_Σ¹^∪T¹²

2 (∃R⁻)6⊆t^U^A²(x⁰) or cl^T_Σ¹^∪T¹²

2 (R)6⊆r^U^A²(x, x⁰)}.

(9)

Having constructed the game G_a = (G_a, F_a), we prove that verifying whether the elementsa∈∆Û^Kcan beΣ₂-homomorphically mapped on∆ÛÂ² reduces to checking whether Player 0 has a winning strategy inG_afrom the state(a, a,s).

Lemma 2. There exists aΣ2-homomorphism fromUKtoUA₂iff (1)holds and for each a∈Ind(A1), Player 0 does not have a winning strategy inGafrom(a, a,s).

The example below illustrates the presented reduction.

Example 5. AssumeM= {R(·,·), S(·,·), Q(·,·)}, Σ2,T12), whereΣ2 ={R⁰(·,·), Q⁰(·,·)} and T12 = {R v R⁰, S v R⁰, Q v Q⁰}

, K1 = hT1,A1i, where T1 = {∃S⁻ v ∃R,∃R⁻ v ∃Q,∃Q⁻ v ∃Q} and A1 = {∃R(a),∃S(a)}, and A2={R⁰(a, a), R⁰(a, b), Q⁰(b, b)}. ThenFa ={(b, wR,d),(a, wQ,d)}, and the game graph Ga is depicted in a) below (we ignore the states that are not reachable from (a, a,s); the duplicating states forming S1are shown as ovals and the spoiling states forming S0 are shown as boxes). In b) below we show U_K (the domain elements d ∈ ∆^U^K are shown as dots, the labels next to them representtail(d), and the labels on the edges(d, d⁰) show theΣ2-role namesP, such that(d, d⁰) ∈ P^U^K), in c) we showUA₂, and the dashed arrows fromUKtoUA₂show the homomorphismh.

a, a,s W

a, wS,d a, wR,d

b, wS,s a, wS,s b, wR,s a, wR,s

b, w_R,d b, w_Q,d a, w_Q,d

b, w_Q,s

Fa

a) The game graphGa

a

wS wR

wQ

w_Q w_R

w_Q

wQ

···

R⁰ R⁰

Q⁰

Q⁰ R⁰

Q⁰

b)Σ2-reduct ofUhT₁∪T₁₂,A₁i a

b R⁰

Q⁰ R⁰

c)UA₂

h

Observe that in the gameGa Player 0 does not have a winning strategy from(a, a,s), because there is a way for Player 1 to play (infinitely) so that the game never goes out of the regionW shown in a). It is not difficult to see that such strategy of Player 1 can be used to define the homomorphismh, and vice versa.

Finally, combining Lemma 2, Lemma 1 and Proposition 1 one obtains:

Theorem 2. The membership problem for universal solutions with (non-extended) KBs is inPTIME.

We conclude this section by addressing the non-emptiness problem. It follows from Conditions(tr),(hom)that there exists a universal solution forK₁underMiffK₂= h∅,A₂ioverΣ₂is a universal solution forK₁underM, whereA₂satisfies(i)A₂ |= B(a)iffK |=B(a)and(ii)A2|=R(a, b)iffK |=R(a, b)for alla, b∈Ind(A1∪ A2), Σ2-conceptB, andΣ2-roleR. Obviously, we can construct the requiredA2in PTIME, then it remains to check ifK2above is a universal solution. It follows:

(10)

Theorem 3. The non-emptiness problem for universal solutions with (non-extended) KBs is in PTIME. Moreover, there is an effective algorithm to compute a universal solution in polynomial time (if such a solution exists).

6 Computing Universal Solutions: The Case of Extended Knowledge Bases

We start with the membership problem for extended KBs. Assume given a mapping M= (Σ1,Σ2,T12), a KBK1 =hT1,A1ioverΣ1, and a KBK2=h∅,A2ioverΣ2, whereA2is an extended ABox. It can be shown that an analogue of Lemma 1 holds, provided that the definition ofU_A₂ is adjusted in an obvious way to account for nulls, and so is the definition of homomorphism. In this setting, Σ2-homomorphism from U_hT₁_∪T₁₂_,A₁_itoU_A₂can be still checked in PTIME, however, the opposite direction cannot be checked efficiently due to nulls inA2. In fact, it can be shown by reduction from the graph 3-colorability problem that the membership problem for universal solutions with null values is NP-hard. To decide in NP whether there exists a homomorphismh fromUA2 toU_hT₁_∪T₁₂_,A₁_i, we can use the fact that the imageW ⊆ ∆Û^hT¹^∪T¹²^,A¹ⁱ of such a functionhon∆ÛÂ² is of polynomial size. Therefore, for each constant and null inA2, one needs to guess its homomorphic image, and then check whether the resulting function is a homomorphism. Thus, we obtain:

Theorem 4. The membership problem for universal solutions with extended KBs is NP-complete.

Consider now the problem of checking whether there exists a universal solution K2=h∅,A2iforK1underM. This problem turns out to be harder than the membership problem as now candidate solutions are not part of the input. In fact, we show by reduction from the validity problem for quantified Boolean formulas that checking the existence of a universal solution is PSPACE-hard. As for the upper bound, first, it can be shown that such anA2exists if and only ifU_hT₁_∪T₁₂_,A₁_iisΣ2-homomorphically em- beddable into a finite part of itself. Then, such a finite subset ofU_hT₁_∪T₁₂_,A₁_iprojected onΣ2can be taken as a universal solution forK1underM. As the inclusion of inverse roles is one of the main sources of complexity, we usetwo-way alternating automata on infinite trees (2ATA), which are a generalization of nondeterministic automata on infinite trees [21] well suited for handling inverse roles inDL-Lite_R. More precisely, given a KBK, we first show that it is possible to construct the following automata: (1)A^canK is a 2ATA that accepts trees corresponding to the canonical model ofKwith nodes arbitrary labeled with a special symbolG; (2)A^modK is a 2ATA that accepts a tree if its subtree labeled withGcorresponds to a tree modelIofK(that is, a model forming a tree on the labeled nulls); and (3)Afinis a (one-way) non-deterministic automaton that accepts a tree if it has a finite prefix where each node is marked withG, and no other node in the tree is marked withG. Then to verify whether a KBK₁=hT₁,A₁ihas a universal solution under a mappingM= (Σ₁, Σ₂,T₁₂), we solve the non-emptiness problem for an automatonBdefined as the product automaton ofπ_Γ_K(A^canK ),π_Γ_K(A^modK )andAfin, whereK=hT1∪T12,A1i,πΓ_K(A^canK )is the projection ofA^canK on a vocabularyΓ_Knot mentioning symbols fromΣ1, and likewise forπΓ_K(A^mod_K ). If the language accepted

(11)

Universal solutions ABoxes Extended ABoxes

Membership PTIME NP-complete

Non-emptiness PTIME PSPACE-hard, in EXPTIME

Fig. 1.Complexity results for universal solutions

byBis empty, then there is no universal solution forK1underM, otherwise a universal solution (of exponential size) exists and it can be extracted from the tree accepted byB.

Theorem 5. The non-emptiness problem for universal solutions with extended KBs is PSPACE-hard and inEXPTIME. Moreover, there is an effective algorithm to compute a universal solution in exponential time (if such a solution exists).

7 Universal UCQ-solutions

We start by arguing that universalUCQ-solutions exhibit more robust behavior in the presence of disjointness assertions than universal solutions.

Example 6. ConsiderM,K1, andA2from Example 4. Recall thatA2is not a universal solution forK1underM. However,A2is a universalUCQ-solution forK1underM.

Moreover,A2 remains a universal UCQ-solution for K1 under Mindependently of whether the unique name assumption is employed.

Unfortunately, universalUCQ-solutions are also harder to compute, which can be explained by the fact that TBoxes have bigger impact on the structure of universalUCQ- solutions rather than of universal solutions. In fact, by using a reduction from the validity problem for quantified Boolean formulas, similar to a reduction in [16], we are able to prove the following:

Theorem 6. The membership problem for universalUCQ-solutions isPSPACE-hard.

8 Conclusions

A summary of our results for universal solutions is presented in Figure 1.

In this paper, we have studied the problem of KB exchange for OWL 2 QL, improv- ing on previously known results both w.r.t. the expressiveness of the ontology language and w.r.t. the understanding of the computational properties of the problem. Our main contribution is a novel PTIMEalgorithm for the membership and non-emptiness problems for universal solutions when OWL 2 QL ABoxes are considered. Our investigation leaves open several issues, which we intend to address in the future. For the computation of universal solutions when extended ABoxes are allowed, while we have pinned-down the complexity of the membership problem as NP-complete, an exact characterization for the non-emptiness problem is still missing. Moreover, it is easy to see that allowing for inequalities between terms and for negated atoms in the (target) ABox would allow one to obtain more universal solutions, but a full understanding of this case is still missing. Finally, we intend to investigate the challenging problem of computing universal UCQ-solutions, adopting also here an automata-based approach.

(12)

References

1. Philippe Adjiman, Philippe Chatalic, Franc¸ois Goasdou´e, Marie-Christine Rousset, and Lau- rent Simon. Distributed reasoning in a peer-to-peer setting: Application to the Semantic Web.

J. of Artificial Intelligence Research, 25:269–314, 2006.

2. Marcelo Arenas, Elena Botoeva, and Diego Calvanese. Knowledge base exchange. InProc.

of the 24th Int. Workshop on Description Logic (DL 2011), volume 745 ofCEUR Electronic Workshop Proceedings,http://ceur-ws.org/, 2011.

3. Marcelo Arenas, Elena Botoeva, Diego Calvanese, Vladislav Ryzhikov, and Evgeny Sherkhonov. Exchanging description logic knowledge bases. InProc. of the 13th Int. Conf.

on the Principles of Knowledge Representation and Reasoning (KR 2012), 2012.

4. Marcelo Arenas, Elena Botoeva, Diego Calvanese, Vladislav Ryzhikov, and Evgeny Sherkhonov. Representability inDL-Literknowledge base exchange. InProc. of the 25th Int. Workshop on Description Logic (DL 2012), volume 846 ofCEUR Electronic Workshop Proceedings,http://ceur-ws.org/, 2012.

5. Marcelo Arenas, Jorge P´erez, and Juan L. Reutter. Data exchange beyond complete data. In Proc. of the 30th ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS 2011), pages 83–94, 2011.

6. Alessandro Artale, Diego Calvanese, Roman Kontchakov, and Michael Zakharyaschev. The DL-Litefamily and relations.J. of Artificial Intelligence Research, 36:1–69, 2009.

7. Jie Bao et al. OWL 2 Web Ontology Language document overview (second edition). W3C Recommendation, World Wide Web Consortium, December 2012. http://www.w3.

org/TR/owl2-overview/.

8. Pablo Barcel´o. Logical foundations of relational data exchange.SIGMOD Record, 38(1):49–

58, 2009.

9. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and Ric- cardo Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Litefamily. J. of Automated Reasoning, 39(3):385–429, 2007.

10. Krishnendu Chatterjee and Monika Henzinger. An o(n2) time algorithm for alternating B¨uchi games. InProc. of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2012), pages 1386–1399, 2012.

11. J´erˆome Euzenat and Pavel Schwaiko.Ontology Matching. Springer, 2007.

12. Ronald Fagin, Phokion G. Kolaitis, Ren´ee J. Miller, and Lucian Popa. Data exchange: Se- mantics and query answering.Theoretical Computer Science, 336(1):89–124, 2005.

13. Ariel Fuxman, Phokion G. Kolaitis, Ren´ee J. Miller, and Wang Chiew Tan. Peer data exchange. ACM Trans. on Database Systems, 31(4):1454–1498, 2006.

14. Anastasios Kementsietsidis, Marcelo Arenas, and Ren´ee J. Miller. Mapping data in peer-to- peer systems: Semantics and algorithmic issues. InProc. of the ACM SIGMOD Int. Conf. on Management of Data, pages 325–336, 2003.

15. Phokion G. Kolaitis. Schema mappings, data exchange, and metadata management. In Proc. of the 24th ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS 2005), pages 61–75, 2005.

16. Boris Konev, Roman Kontchakov, Michel Ludwig, Thomas Schneider, Frank Wolter, and Michael Zakharyaschev. Conjunctive query inseparability of OWL 2 QL TBoxes. InProc.

of the 25th AAAI Conf. on Artificial Intelligence (AAAI 2011), 2011.

17. Maurizio Lenzerini. Data integration: A theoretical perspective. InProc. of the 21st ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS 2002), pages 233–246, 2002.

18. Ren´e Mazala. Infinite games. InAutomata, Logics, and Infinite Games, pages 23–42, 2001.

(13)

19. Boris Motik, Bernardo Cuenca Grau, Ian Horrocks, Zhe Wu, Achille Fokoue, and Carsten Lutz. OWL 2 Web Ontology Language profiles (second edition). W3C Recommen- dation, World Wide Web Consortium, December 2012. http://www.w3.org/TR/

owl2-profiles/.

20. Pavel Shvaiko and J´erˆome Euzenat. Ontology matching: State of the art and future chal- lenges.IEEE Trans. on Knowledge and Data Engineering, 25(1):158–176, 2013.

21. Moshe Y. Vardi. Reasoning about the past with two-way automata. InProc. of the 25th Int. Coll. on Automata, Languages and Programming (ICALP’98), volume 1443 ofLecture Notes in Computer Science, pages 628–641. Springer, 1998.