• Aucun résultat trouvé

Representability in DL-Lite_R Knowledge Base Exchange

N/A
N/A
Protected

Academic year: 2022

Partager "Representability in DL-Lite_R Knowledge Base Exchange"

Copied!
11
0
0

Texte intégral

(1)

Representability in DL-Lite

R

Knowledge Base Exchange

M. Arenas1, E. Botoeva2, D. Calvanese2, V. Ryzhikov2, and E. Sherkhonov3

1 Dept. of Computer Science, PUC Chile marenas@ing.puc.cl

2 KRDB Research Centre, Free Univ. of Bozen-Bolzano, Italy lastname@inf.unibz.it

3 ISLA, University of Amsterdam, Netherlands e.sherkhonov@uva.nl

Abstract. Knowledge base exchange can be considered as a generalization of data exchange in which the aim is to exchange between a source and a target con- nected through mappings, not only explicit knowledge, i.e., data, but also implicit knowledge in the form of axioms. Such problem has been investigated recently using Description Logics (DLs) as representation formalism, thus assuming that the source and target KBs are given as a DL TBox+ABox, while the mappings have the form of DL TBox assertions. In this paper we are interested in the prob- lem of representing a given source TBox by means of a target TBox that captures at best the intensional information in the source. In previous work, results on representability have been obtained forDL-LiteRDFS, a DL corresponding to the FOL fragment of RDFS. We extend these results to the positive fragment ofDL- LiteR, in which, differently fromDL-LiteRDFS, the assertions in the TBox and the mappings may introduce existentially implied individuals. For this we need to overcome the challenge that the chase, a key notion in data and knowledge base exchange, is not guaranteed anymore to be finite.

1 Introduction

Knowledge base exchange is an extension of the data exchange setting, where the source may contain implicit knowledge by which new data may be inferred. The first frame- work for data exchange with incompletely specified data in the source was proposed in [3]. This framework is based on the general notion ofrepresentation system, as a mechanism to represent multiple instances of a data schema, and considers the problem of incomplete data exchanges under mappings constituted by a set of tuple generating dependencies (tgds), i.e., mappings between pairs of conjunctive queries. Given that the source data may be incompletely specified, (possibly infinitely) many source instances are implicitly represented. This framework was refined in [1, 2] to the case where as a representation system Description Logics (DL) knowledge bases (KBs) were used: the TBox and the ABox of a DL KB represent implicit and explicit information respec- tively, and mappings are sets of DL inclusions. While in the traditional data exchange setting, given a source instance and a mapping specification,(universal) solutionsare target instances derived from the source instance and the mapping, in this case solutions are target DL KBs, derived from the source KB and the mapping.

(2)

In such a setting, in order to minimize the exchange (and hence transfer and materi- alization) of explicit (i.e., ABox) information, one is interested in computing universal solutions that contain as much implicit knowledge as possible. Therefore, the notion of representabilitywas defined, which helps us in understanding the capacity of (univer- sal) solutions to transfer implicit knowledge: we say that a source TBox is representable under a mapping if there exists a target TBox that leads to a universal solution when it is combined with a suitable ABox computed from the source ABox, independently of the actual source ABox.Weak representabilityis concerned with representability under a mapping extended with assertions that are implied by the given mapping and the source TBox. (Weak) representability of a source TBox under a mapping implies that the only knowledge that remains to be transferred explicitly via the (extended) mapping is the one in the source ABox. Therefore, checking (weak) representability and computing a representation of a source TBox under a(n extended) mapping turn out to be crucial problems in the context of KB exchange.

In [1, 2] the problems of deciding representability and weak-representability, and of computing a representation for a given mapping and a TBox was tackled forDL- LiteRDFS, the DL counterpart of RDFS [5] and a member of the DL-Lite family of DLs [6]. It has been shown that these problems can be solved in polynomial time for DL-LiteRDFS mappings and TBoxes. Moreover, due to the simplicity of the logic, the characterization of representations is concise and simple.

In this paper we extend those results to the case ofDL-LiteRwithout disjointness assertions, a DL that we callDL-LiteRpos. The presence of existential quantifiers on the right-hand side of concept inclusions makes the problem considerably more compli- cated than forDL-LiteRDFS. However, we show that also in the presence of existentials on the right-hand side we are able to decide representability and weak representability of aDL-LiteRposTBox under aDL-LiteRposmapping in polynomial time and to construct a polynomial size representation.

2 Preliminaries

2.1 DL-LiteRposKnowledge Bases

The DLs of theDL-Litefamily [6] are characterized by the fact that reasoning can be done in polynomial time, and that data complexity of reasoning and conjunctive query answering is in AC0. We present now the syntax and semantics ofDL-LiteRpos, which is the DL that we adopt here, together with a sub-language of it.

In the following, we useAandP to denote concept and role names, respectively, andBandRto denote generic concepts and roles, respectively. The latter are defined by the following grammar:

R ::= P | P B ::= A | ∃R

For a roleR, we useRto denotePwhenR=P, andP whenR=P.

ADL-LiteRposTBox is a finite set of concept inclusionsB vB0and role inclusions RvR0. ADL-LiteRposABox is a finite set of membership assertions of the formA(u) andP(u, v), whereuandvareindividualsorlabeled nulls. We distinguish between the

(3)

two, since individuals are interpreted under the unique name assumption, while labeled nulls obtain their meaning through assignments (see below). Notice that we include labeled nulls in ABoxes as they are needed in the exchange of KBs. ADL-LiteRposKB Kis a pairhT,Ai, whereT is aDL-LiteRposTBox andAis aDL-LiteRposABox.

Note thatDL-LiteRpos is the fragment of the DL DL-LiteR studied in [6] without disjointness assertions on concepts and roles. InDL-LiteR,B andRare calledbasic concepts andbasicroles, respectively, and for coherence with previous work on theDL- Litefamily, we adopt here this terminology as well. We callDL-LiteRDFSthe fragment of DL-LiteRpos(and hence of DL-LiteR) in which there are only atomic concepts and atomic roles on the right-hand side of inclusions.

The semantics ofDL-LiteRposis, as usual in DLs, based on the notion of first-order interpretationI=h∆IIi, where∆Iis a non-empty domain and·I is an interpreta- tion function such that:(1)AI⊆∆I, for every concept nameA;(2)PI ⊆∆I×∆I, for every role nameP;(3)aI ∈∆I, for every individual namea; and(4)such that:

(∃R)I = {x∈∆I|there existsy∈∆Is.t.(x, y)∈RI} (P)I = {(y, x)∈∆I×∆I|(x, y)∈PI}

Moreover, satisfaction of concept and role inclusions is defined as follows:I |= BvB0ifBI ⊆B0I, andI |=RvR0ifRI⊆R0I. Finally, satisfaction of member- ship assertions is defined as follows. Asubstitutionover an interpretationIis a function hfrom individuals and labeled nulls to∆I such thath(a) =aIfor each individuala.

Then(I, h)|=A(u)ifh(u)∈AI, and(I, h)|=P(u, v)if(h(u), h(v))∈PI. An interpretationIis amodelof aDL-LiteRposTBoxT if for everyα∈ T, it holds thatI |=α, and it is amodelof aDL-LiteRposABoxAif there exists a substitutionh overIsuch that for everyα∈ A, it holds that(I, h) |=α. Finally,Iis amodelof a DL-LiteR KBK =hT,AiifI is a model of bothT andA. The set of all models of Kis denoted MOD(K), andKisconsistentif MOD(K) 6=∅. We observe that inDL- LiteRposone cannot express any form of negative information, and hence a DL-LiteRpos KB is always consistent.

We assume that interpretations satisfy the standard names assumption, that is, we assume given a fixed infinite setUof individual names, and we assume that for every interpretationI, it holds that∆I ⊆UandaI =afor every individual namea. This implies that interpretations satisfy the unique name assumption over individual names.

AsignatureΣis a set of concept and role names. An interpretationI is said to be an interpretation ofΣif it is defined exactly on the concept and role names inΣ. Given a KBK, thesignatureΣ(K)ofKis the alphabet of concept and role names occurring inK, andKis said to bedefined over(or simply,over) a signatureΣifΣ(K)⊆Σ(and likewise for a TBoxT, an ABoxA, inclusionsB vCandR vQ, and membership assertionsA(u)andP(u, v)).

2.2 Queries, Certain Answers, and Chase

Ak-ary queryqover a signatureΣ, withk≥0, is a function that maps every interpreta- tionh∆IIiofΣinto ak-ary relationqI⊆∆k. In particular, ifk= 0, thenqis called a Boolean query, andqIis either a relation containing the empty tuple()(representing

(4)

the value true) or the empty relation (representing the value false). A queryqis said to be a query over a KBKifqis a query over a signatureΣandΣ⊆Σ(K). Moreover, the answer toqoverK, denoted bycert(q,K), is defined ascert(q,K) =T

I∈MOD(K)qI. Each tuple in cert(q,K)is called acertain answer for qover K. Notice that ifq is a Boolean query, then cert(q,K) evaluates to true ifqI evaluates to true for every I ∈MOD(K), and it evaluates to false otherwise.

In this paper, we adopt the class of unions of conjunctive queries as our main query formalism. Aconjunctive query (CQ) over a signatureΣis a first-order formula of the formq(x) = ∃y.conj(x,y), wherex,yare tuples of variables andconj(x,y)is a conjunction of atoms of the form: (1)A(t), withAa concept name inΣandteither an individual fromUor a variable fromxory, or (2)P(t1, t2), withP a role name inΣandti (i = 1,2) either an individual fromUor a variable fromxory. In a CQ q(x) = ∃y.conj(x,y)over a signature Σ,xis the tuple of free variables ofq(x).

Moreover, given an interpretationI =h∆IIiofΣ, the answer ofqoverI, denoted byqI, is defined as the set of tuplesaof elements from∆I for which there exist a tuplebof elements from∆Isuch thatIsatisfies every conjunct inconj(a,b). Finally, a union of conjunctive queries (UCQ) over a signatureΣis a finite set of CQs overΣ that have the same free variables. A UCQq(x)is interpreted as the first-order formula W

qi∈qqi(x), and its semantics is defined asqI =S

qi∈qqiI.

Certain answers inDL-LiteRposcan be characterized through the notion of chase. We call achasea (possibly infinite) set of assertions of the formA(t),P(t, t0), wheret, t0are either individuals fromU, or labeled nulls interpreted as not necessarily distinct domain elements (see the definition of the semantics ofDL-LiteRposin Section 2.1). For DL-LiteRposKBs, we employ the notion of restricted chase as defined in [6]. For such a KBhT,Ai, the chase ofAw.r.t.T, denotedchaseT(A), is a chase obtained fromAby adding facts implied by inclusions inT, and introducing fresh labeled nulls whenever required by an inclusion with∃Rin the right-hand side (see [6] for details).

2.3 Knowledge Base Exchange Framework

Assume that Σ12 are signatures with no concepts or roles in common. Then we say that an inclusionN1 v N2 is aninclusion fromΣ1toΣ2, ifN1 is a concept or a role overΣ1andN2is a concept or a role overΣ2. For a DLL(e.g.,DL-LiteRpos), we define anL-mapping(or justmapping, whenLis clear from the context) as a tuple M= (Σ1, Σ2,T12), whereT12is a TBox inLconsisting of inclusions fromΣ1toΣ2: (1) C1vC2, whereC1,C2are concepts inLoverΣ1andΣ2, respectively, and (2) Q1vQ2, whereQ1andQ2are roles inLoverΣ1andΣ2, respectively.

IfT12is anL-TBox, for a DLL(e.g.,DL-LiteRpos), thenMis called anL-mapping.

The semantics of a mapping is defined in terms of the notion of satisfaction. More specifically, given a mappingM = (Σ1, Σ2,T12), an interpretationI of Σ1 and an interpretation J of Σ2, pair (I,J) satisfies TBox T12, denoted by(I,J) |= T12, if for each concept inclusion C1 v C2 ∈ T12, it holds that C1I ⊆ C2J, and for each role inclusionQ1 v Q2 ∈ T12, it holds thatQ1I ⊆ Q2J. Moreover, given an interpretationIofΣ1, SATM(I)is defined as the set of interpretationsJ ofΣ2such

(5)

that(I,J)|=T12, and given a setX of interpretations ofΣ1, SATM(X)is defined as:

SATM(X) =S

I∈XSATM(I).

LetM= (Σ1, Σ2,T12)be a mapping,K1a KB overΣ1, andK2a KB overΣ2. – K2is called asolutionforK1underMif MOD(K2)⊆SATM(MOD(K1)), and – K2 is called a universal solution for K1 under M if MOD(K2) =

SATM(MOD(K1)).

Universal solutions present several limitations, as argued in [1, 2]. First, a universal solution (inDL-LiteRposandDL-LiteR) does not always exist. Second, if it exists, then its TBox is trivial (that is, equivalent to the empty TBox). Finally, in the worst case the smallest universal solution is exponential in the size of the mapping and the source KB. A notion of solution parametrized w.r.t. a query language was proposed in [1, 2] in order to overcome these limitations. Such a notion, though weaker, is in line with the objective of (data and) KB exchange of providing in the target sufficient information to answer queries that could also be posed over the source.

LetQbe a class of queries,M= (Σ1, Σ2,T12)a mapping,K1=hT1,A1ia KB overΣ1, andK2a KB overΣ2. Then

– K2 is called aQ-solutionfor K1 under Mif for every query q ∈ Qover Σ2, cert(q,hT1∪ T12,A1i)⊆cert(q,K2), and

– K2is called auniversalQ-solutionforK1underMif for every queryq∈ Qover Σ2,cert(q,hT1∪ T12,A1i) =cert(q,K2).

The definitions of solutions are illustrated in the following example.

Example 1. Assume Σ1 = {Painting(·), PaintedBy(·,·),ArtMovement(·,·)} and Σ2={ArtPiece(·),ArtAuthor(·,·),HasStyle(·,·),HasGenre(·,·)}. Consider map- pingM= (Σ1, Σ2,T12), whereT12is the following TBox:

PaintingvArtPiece Paintingv ∃HasGenre

PaintedBy vArtAuthor ArtMovement vHasStyle

Further, assumeT1 ={Painting ≡ ∃PaintedBy,Painting v ∃ArtMovement}and A1={Painting(blacksquare)}. Then, a universal solution for the KBK1=hT1,A1i underM is the KBK2 = hT2,A2i, whereT2 = ∅ andA2 is the following ABox, where n01, n02, and m01are labelled nulls:

ArtPiece(blacksquare) HasGenre(blacksquare, m01)

ArtAuthor(blacksquare, n01) HasStyle(blacksquare, n02) Now, consider KB K02 = hT20,A02i with non-empty TBox, where T20 = {ArtPiece ≡ ∃ArtAuthor,ArtPiece v ∃HasStyle,ArtPiece v ∃HasGenre}and A02={ArtPiece(blacksquare)}. Then we have thatK20 is a solution forK1underM.

However, we also have thatK02 is not a universal solution forK1 underM. Notably, both KBK2and KBK02are universal UCQ-solutions for KBK1under mappingM.

In order to understand the capacity of universal solutions, and also of the query- languages based notions of solutions to transfer implicit knowledge, the notion ofrep- resentability has been introduced in [1, 2]. Here we adapt that definition to the case

(6)

where the KB is always satisfiable, as inDL-LiteRpos. In the definition below we use chaseT(A)to denote the projection ofchaseT(A)on the signatureΣ.

LetLbe a DL,Qa class of queries,M= (Σ1, Σ2,T12)anL-mapping, andT1an L-TBox overΣ1. Then,

– T1 is (Q-)representable under Mif there exists an L-TBoxT2 over Σ2, called a (Q-)representation of T1 under M, such that for every ABox A1 over Σ1, hT2,chaseT122(A1)iis a (Q-)universal solution forhT1,A1iunderM.

– T1 is weakly (Q-)representable under M if there exists a mapping M? = (Σ1, Σ2,T12?)such thatT12 ⊆ T12?,T1∪ T12 |= T12?, andT1 is (Q-)representable underM?.

Example 2. LetM = (Σ1, Σ2,T12)andT1 be as in Example 1. Then we have that T2 ={ArtPiece ≡ ∃ArtAuthor,ArtPiece v ∃HasStyle,ArtPiece v ∃HasGenre}

is a UCQ-representation ofT1underM.

On the other hand, ifM0 = (Σ1, Σ2,T120 )withT120 ={PaintedBy vArtPiece}, then we have that T1 is not UCQ-representable under M0: take ABox A1 = {Painting(blacksquare)}, then chaseT0

122(A1) = ∅ and for no TBox T20, hT20,chaseT0

122(A1)iis a universal UCQ-solution forhT1,A1iunderM0. However, if we considerT12? =T120 ∪ {Paintingv ∃ArtAuthor}, we conclude thatT1is weakly UCQ-representable under M0 since T120 ⊆ T12?,T1 ∪ T120 |= T12? and T1 is UCQ- representable under M? = (Σ1, Σ2,T12?) (in fact,∅ is a UCQ-representation of T1

underM?).

3 Solving UCQ-Representability for DL-Lite

Rpos

In this section, we show that the UCQ-representability problem can be solved in poly- nomial time for the case where TBoxes and mappings are expressed in DL-LiteRpos. More specifically, we give a polynomial time algorithm UCQREPposthat, given aDL- LiteRpos-mapping M = (Σ1, Σ2,T12)and aDL-LiteRpos-TBoxT1, verifies whetherT1 is UCQ-representable underM, and if this is the case computes a UCQ-representation ofT1 underM. Moreover, we also show that this algorithm can be used to solve the UCQ-representability problem for the case ofDL-LiteRDFS. It is important to notice that the algorithm we present can be used to compute universal UCQ-solutions of polyno- mial size, which make good use of the source implicit knowledge. Thus, this algorithm computes solutions with good properties to be used in practice.

A related problem is that of query inseparability [7], which can be formulated as follows: given TBoxesT1andT2, and a signatureΣ, decide whether for each ABoxA overΣand for each queryqoverΣ,cert(q,hT1,Ai) =cert(q,hT2,Ai). In contrast to our polynomial result for UCQ-representability, query inseparability has been proved to be PSPACE-hard forDL-LiteRTBoxes and CQs [7], and an analysis of the proof shows that the same lower bound holds already forDL-LiteRpos.

3.1 Checking Whether a Given Target TBox is a UCQ-Representation

We start by considering the decision problem associated with UCQ-representability:

Given aDL-LiteR-mappingM = (Σ1, Σ2,T12), aDL-LiteR-TBoxT1overΣ1, and

(7)

aDL-LiteR-TBoxT2overΣ2, check whetherT2is a UCQ-representation ofT1under M, i.e., for each ABoxA1overΣ1,hT2,chaseT122(A1)iis a universal UCQ-solution forhT1,A1iunderM. This problem can be solved in two steps:

(C1) Check whether for each ABoxA1 over Σ1,hT2,chaseT122(A1)iis a UCQ- solution forhT1,A1iunderM.

(C2) Check whether for each ABoxA1overΣ1and for each UCQqoverΣ2, we have thatcert(q,hT2,chaseT122(A1)i)⊆cert(q,hT1∪ T12,A1i).

If both checks succeed, thenT2is a UCQ-representation ofT1underM, otherwise not.

We develop now techniques to perform these two checks in polynomial time.

Checking Condition (C1). For a DL-LiteRpos TBox T and a concept or role N, we define the upward closure of N w.r.t. T as the set UT(N) = {N0 | N0is concept or role andT |= N v N0}, and the strict upward closure ST(N) as UT(N)\ {N}. Then, for a set N of concepts and roles we define UT(N) = S

N∈NUT(N), and its strict version ST(N). Notice that both ST12(UT1(N)) and UT2(SM(N))are sets overΣ2, for each concept or roleN overΣ1.

With these notions in place, we can provide a necessary and sufficient condition for the satisfaction of condition (C1).

Proposition 1. Let M = (Σ1, Σ2,T12)be a DL-LiteRpos-mapping,T1 a DL-LiteRpos- TBox overΣ1, andT2a DL-LiteRpos-TBox overΣ2. ThenT1,T2, andMsatisfy condi- tion (C1) iff the following conditions are satisfied:

(A) SM(UT1(B))⊆UT2(SM(B)), for each basic conceptBoverΣ1; (B) SM(UT1(R))⊆UT2(SM(R)), for each basic roleRoverΣ1;

(C) for each basic conceptBand each basic roleRoverΣ1such that∃R∈UT1(B) and SM(UT1(∃R))6=∅, we have that

if SM(UT1(R))6=∅, then there exists a roleQR,BoverΣ2such that (CA) ∃QR,B∈UT2(SM(B)),

(CB) SM(UT1(R))⊆UT2(QR,B), and (CC) SM(UT1(∃R))⊆UT2(∃QR,B), and if SM(UT1(R)) =∅, either

(CD) SM(UT1(∃R))⊆UT2(SM(B)),

or there exist rolesQ1R,B, . . . , QnR,BoverΣ2such that (CE) ∃Q1R,B∈UT2(SM(B)),

(CF) T2|=∃(Q1R,B)v ∃Q2R,B, . . . ,∃(Qn−1R,B) v ∃QnR,B, and (CG) SM(UT1(∃R))⊆UT2(∃(QnR,B)).

It is important to notice that the necessary and sufficient condition in Proposition 1 can be checked in polynomial time, as the implication problem for DL-LiteRpos can be solved in polynomial time. In particular, for a basic concept B and a basic role R overΣ1 such that∃R ∈ UT1(B),SM(UT1(∃R)) 6= ∅,SM(UT1(R)) = ∅, and SM(UT1(∃R)) * UT2(SM(B)), checking the existence of rolesQ1R,B, . . . , QnR,B overΣ2satisfying conditions(CE),(CF), and(CG)can be reduced to checking reach- ability in a directed graph. Indeed, for each pair of basic concepts B2, B20 over Σ2

(8)

such thatB2 ∈ SM(B)andSM(UT1(∃R)) ⊆ UT2(B20), we use the following ap- proach to check for the existence of the roles Q1R,B, . . . , QnR,B over Σ2 such that T2 |= B2 v ∃Q1R,B,T2 |= ∃(Q1R,B) v ∃Q2R,B, . . . ,T2 |= ∃(QnR,B) v B20. Let G= (V, E)be the directed graph defined as:

V = {B2, B20} ∪ {Q2|Q2is a role inΣ2}

E = {(B2, B02)| T2|=B2vB20} ∪ {(B2, Q2)| T2|=B2v ∃Q2} ∪ {(Q2, B20)| T2|=∃Q2 vB20} ∪ {(Q2, Q02)| T2|=∃Q2 v ∃Q02} Then we test for the existence of the rolesQ1R,B, . . . , QnR,Bby verifying whetherB02is reachable fromB2inG. If for some pairB2,B20 the aforementioned two-step test suc- ceed, then we have that there exist rolesQ1R,B, . . . , QnR,Bthat satisfy conditions(CE), (CF), and(CG). Otherwise, we know that such roles do not exist.

Checking Condition (C2). We rely on the following result:

Proposition 2. Let M = (Σ1, Σ2,T12)be a DL-LiteRpos-mapping,T1 a DL-LiteRpos- TBox overΣ1, andT2a DL-LiteRpos-TBox overΣ2. ThenT1,T2, andMsatisfy condi- tion (C2) iff the following conditions are satisfied:

(A) UT2(SM(B))⊆SM(UT1(B))for each basic conceptBoverΣ1; (B) UT2(SM(R))⊆SM(UT1(R))for each roleR∈Σ1;

(C) for each basic roleQoverΣ2and each basic conceptBoverΣ1such that∃Q∈ UT2(SM(B))andUT2(∃Q)6={∃Q}, there exists a roleRQ,BoverΣ1s.t.

(CA) ∃RQ,B ∈UT1(B)and (CB) Q∈SM(RQ,B).

The necessary and sufficient condition in Proposition 2 can be checked in polynomial time, as the implication problem can be solved in polynomial time forDL-LiteRpos.

Thus, given that, by Propositions 1 and 2, both conditions (C1) and (C2) can be tested in polynomial time, we obtain the following result.

Theorem 1. The problem of verifying, given a DL-LiteRpos-mapping M = (Σ1, Σ2,T12), a DL-LiteRpos-TBox T1 over Σ1, and a DL-LiteRpos-TBoxT2 over Σ2, whetherT2is a UCQ-representation ofT1underMcan be solved in polynomial time.

3.2 The Algorithm for Computing a UCQ-Representation

In what follows, we present the algorithm UCQREPpos, which verifies whether a source TBoxT1 is UCQ-representable under a mappingM, and if this is the case returns a UCQ-representation ofT1underM.

Intuitively, given a source TBoxT1and a mappingM, algorithm UCQREPposcon- structs the best possible candidateT2for a UCQ-representation ofT1underM(given the conditions in Propositions 1 and 2), and then checks whetherT2effectively satisfies the properties required for a UCQ-representation (note, thatT2is indeed aDL-LiteRpos TBox). To prove the correctness of this algorithm, we need to show thatT1 is UCQ- representable underMif and only ifT2is a UCQ-representation ofT1underM. This

(9)

Algorithm: UCQREPpos(T1,M)

Input: ADL-LiteRpos-mappingM= (Σ1, Σ2,T12)and aDL-LiteRpos-TBoxT1overΣ1. Output:ADL-LiteRpos-TBoxT2 overΣ2 that is a UCQ-representation ofT1 underM, if

T1is UCQ-representable underM. The keywordfalseotherwise.

1. LetT2be a TBox overΣ2defined as:

T2 = {N2vM2|N1a basic concept or role overΣ1, N2∈SM(N1), M2∈SM(UT1(N1))}

2. Remove fromT2 every inclusionN2 v M2 such that(i)N2 ∈ SM(N1) for some N1 overΣ1, and(ii)for everyM1 overΣ1 such thatM2 ∈ SM(M1), it holds that T16|=N1vM1. Moreover, ifN2=∃R2andM2=∃R02, then also remove inclusions R2vR02andR2 vR02

fromT2.

3. Remove fromT2every inclusion of the form either∃R2 vB2orR2 vR02orR2 v R02for rolesR2,R02and a conceptB2overΣ2, if there exists a conceptB1overΣ1such that(i)∃R2∈SM(B1), and(ii)for every roleR1overΣ1such that∃R1∈UT1(B1) andR2 ∈SM(R1), it holds thatT16|=B1v ∃R1.

4. Verify whetherT2is a UCQ-representation ofT1underM. If the test succeeds, return T2, otherwise returnfalse.

Fig. 1.Algorithm to compute the UCQ-representation of aDL-LiteRposTBoxT1under aDL-LiteRpos mappingM.

is done in the following theorem, where it is also proved that the algorithm works in polynomial time. The latter is a consequence of the fact thatT2is of polynomial size in the sizes ofT1andM, and that, by Theorem 1, it is possible to check in polynomial time whetherT2is a UCQ-representation ofT1underM.

Theorem 2. AlgorithmUCQREPposis correct and runs in polynomial time.

The following examples illustrate how algorithm UCQREPposworks.

Example 3. Let M = (Σ1, Σ2,T12), where Σ1 = {A1(·), B1(·), C1(·)}, Σ2 = {A2(·),B2(·)}, andT12={A1vA2,B1vB2, C1vB2}. Furthermore, assume that T1={B1vA1}. Then, in step 1, the algorithm constructs the TBoxT2={B2vA2}.

In step 2, it removes the only axiom fromT2asB2 ∈ SM(C1)andT1 6|=C1 vA1. In step 3, it does nothing, and finally, at the last step it checks whether the empty TBox T2 is a UCQ-representation of T1 under M. Since A2 ∈ SM(UT1(B1))and A2∈/UT2(SM(B1)), the algorithm returnsfalse.

Example 4. LetM = (Σ1, Σ2,T12), whereΣ1 = {B1(·), P1(·,·), R1(·,·)}, Σ2 = {A2(·), B2(·), R2(·,·)}, andT12 ={∃P1 vA2, B1 vB2, R1 vR2}. Furthermore, assume thatT1 = {B1 v ∃P1, B1 v ∃R1,∃R1 v ∃P1}. Then, in step 1, the al- gorithm constructs the TBox T2 = {B2 v ∃R2,∃R2 v A2}. It does not remove anything in steps 2 and 3. Finally, at the last step it successfully checks that T2 is a UCQ-representation ofT1underMand returnsT2.

(10)

3.3 Solving UCQ-Representability forDL-LiteRDFS

It is not difficult to see that if the input of algorithm UCQREPpos is aDL-LiteRDFS- mapping M = (Σ1, Σ2,T12) and aDL-LiteRDFS-TBox T1 over Σ1, then the set T2

computed by this algorithm is a DL-LiteRpos-TBox over Σ2 that can be easily trans- formed into an equivalentDL-LiteRDFS-TBox. Indeed,T2might contain inclusions be- tween basic concepts of the form∃R2 v ∃R20, but this occurs only ifT2implies also the role inclusionR2 vR02. Hence, all concept inclusions that would fall outsideDL- LiteRDFS are implied by theDL-LiteRDFS fragment ofT2 and can be removed fromT2

without affecting its semantics. Thus, we conclude that algorithm UCQREPposcan also be used to solve in polynomial time the UCQ-representability problem forDL-LiteRDFS

mappings and TBoxes.

4 Solving Weak UCQ-Representability for DL-Lite

Rpos

In this section, we show that also the weak UCQ-representability problem can be solved in polynomial time when TBoxes and mappings are expressedDL-LiteRpos. We first need to introduce some terminology. Given aDL-LiteR-TBoxT over a signatureΣ and a UCQqoverΣ, a UCQqroverΣis said to be aperfect reformulationofqw.r.t.T if for every ABoxAoverΣ, it holds that [6]:cert(q,hT,Ai) =cert(qr,h∅,Ai). That is, the certain answers to the UCQqover a KBhT,Aican be computed by posing the UCQ qrover the ABoxA. It is well-known that every UCQqadmits a perfect reformulation w.r.t. aDL-LiteR-TBoxT, which can be computed in polynomial time [6].

Interestingly, the fundamental notion of perfect reformulation can be used to solve the UCQ-representability problem for DL-LiteRpos. More precisely, let M = (Σ1, Σ2,T12)be aDL-LiteR-mapping andT1aDL-LiteR-TBox overΣ1. Then define a mapping COMP(M,T1) = (Σ1, Σ2,T12?)that extendsMby compiling the knowledge fromT1intoT12. Formally, for a basic concept or roleN overΣ1, letbqN be the CQ defined as follows:bqA(x) =A(x),bq∃P(x) =∃y.P(x, y),bq∃P(x) =∃y.P(y, x), bqP(x, y) = P(x, y), andbqP(x, y) = P(y, x). Then, for every concept inclusion B vC∈ T12and for every CQqin the perfect reformulation ofbqBw.r.t.T1, include Cq vCintoT12?, whereCqis the (unique) basic concept such thatbqC

q =q. Also, for every role inclusionRvQ∈ T12and for every CQqin the perfect reformulation of bqRw.r.t.T1, includeRq vQintoT12?, whereRq is the basic role such thatbqRq =q.

It is important to notice that ifM = (Σ1, Σ2,T12)is aDL-LiteRpos-mapping and T1aDL-LiteRpos-TBox overΣ1, then COMP(M,T1) = (Σ1, Σ2,T12?)is aDL-LiteRpos- mapping that can be computed in polynomial time in the sizes ofMandT1. Therefore, given that the set of inclusions defining COMP(M,T1)contains the set of inclusions definingMandT1∪ T12|=T12?, we conclude that COMP(M,T1)can be used to check in polynomial time whetherT1is weakly UCQ-representable underM.

Theorem 3. LetM = (Σ1, Σ2,T12)be a DL-LiteRpos-mapping andT1a DL-LiteRpos- TBox over Σ1. Then T1 is weakly UCQ-representable under M if and only if T1 is UCQ-representable underCOMP(M,T1).

From this result and Theorem 2 we obtain a polynomial time algorithm for solving the weak UCQ-representability problem forDL-LiteRposmappings and TBoxes.

(11)

The example below shows aDL-LiteRposTBoxT1and aDL-LiteRposmappingMsuch thatT1is not weakly UCQ-representable underM.

Example 5. LetM = (Σ1, Σ2,T12), whereΣ1 = {P1(·,·), B1(·)}, Σ2 = {A2(·), B2(·)}, andT12 ={B1 v B2,∃P1 v A2}. Furthermore, assume thatT1 ={B1 v

∃P1}. ThenT1is not weakly UCQ-representable underM. In fact, given that the perfect reformulation of ∃P1 w.r.t. TBox T1is∃P1 itself, and likewise for conceptB1, we have thatM= COMP(M,T1)and, thus,T1is not weakly UCQ-representable under M, asT1is not UCQ-representable underM.

Instead, as shown in [1, 2], for eachDL-LiteRDFS TBoxT1andDL-LiteRDFS mapping M,T1is weakly UCQ-representable underM.

5 Conclusions

In this paper, we have extended previous results on representability in the knowledge exchange framework toDL-LiteRpos, a DL of theDL-Litefamily that allows for existen- tials in the right-hand side of inclusion assertions, both in the source TBox and in the mapping. We are currently working on extending our results toDL-LiteR, which in- cludes disjointness assertions, and to the other DLs in the extendedDL-Litefamily [4].

A further interesting problem that we are investigating is that of checking the existence of universal solutions forDL-LiteRposand other more expressive DLs.

Acknowledgements. The authors were supported by the EU Marie Curie action FP7- PEOPLE-2009-IRSES (grant 24761), by Fondecyt (grant 1090565), by the EU ICT projects ACSI (grant FP7-257593) and FOX (grant FP7-233599), and by the NWO project DEX (612.001.012).

References

1. Arenas, M., Botoeva, E., Calvanese, D.: Knowledge base exchange. In: Proc. of DL 2011.

CEUR, ceur-ws.org, vol. 745 (2011)

2. Arenas, M., Botoeva, E., Calvanese, D., Ryzhikov, V., Sherkhonov, E.: Exchanging descrip- tion logic knowledge bases. In: Proc. of KR 2012 (2012)

3. Arenas, M., P´erez, J., Reutter, J.L.: Data exchange beyond complete data. In: Proc. of PODS 2011. pp. 83–94 (2011)

4. Artale, A., Calvanese, D., Kontchakov, R., Zakharyaschev, M.: TheDL-Litefamily and rela- tions. J. of Artificial Intelligence Research 36, 1–69 (2009)

5. Brickley, D., Guha, R.V.: RDF vocabulary description language 1.0: RDF Schema.

W3C Recommendation, World Wide Web Consortium (Feb 2004), available at http://www.w3.org/TR/rdf-schema/

6. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning and efficient query answering in description logics: TheDL-Litefamily. J. of Automated Rea- soning 39(3), 385–429 (2007)

7. Konev, B., Kontchakov, R., Ludwig, M., Schneider, T., Wolter, F., Zakharyaschev, M.: Con- junctive query inseparability of OWL 2 QL TBoxes. In: Proc. of AAAI 2011. pp. 221–226 (2011)

Références

Documents relatifs

setting, given a source instance and a mapping specification, (universal) solutions are target instances derived from the source instance and the mapping, in our case solutions

medicine strategy as a framework for the development of national traditional medicine programmes; to develop and implement national policies and regulations on

introduced the setting of knowledge exchange, where one is given a source knowledge base and a mapping between a source schema and a target schema, and the goal is to materialize

1) Strain and clusters from public and corporate sources 2) Alignments on antimicro- bial terms. 3) Mapping of taxon labels. 4) Genome and locus sequence content. 5) Inferring

In the non-emptiness problem for universal solutions the input is the same, except for K 2 , and the question to answer is whether a universal solution K 2 exists (analogously

[19], given a source schema, a target schema, an instance of the source schema, and a set of mapping rules, data exchange is the problem of creating an instance of the target

In particular, a query with shape a rooted tree of size k can be constructed either by a one step extension of an existing query with shape a rooted tree S of size k − 1 (by

1) The roles, the identities and the status of the knowledge are defined in the course of the interaction starting from representations of the situation. These de facto draw on not