On the Data Complexity of Ontology-Mediated Queries with a Covering Axiom

O. Gerasimova¹, S. Kikot², V. Podolskii³,¹, and M. Zakharyaschev²

¹ National Research University Higher School of Economics, Moscow, Russia

² Birkbeck, University of London, U.K.

³ Steklov Mathematical Institute, Moscow, Russia

Abstract. This paper reports on our ongoing work that aims at a classification of conjunctive queries q according to the data complexity of answering ontology-mediated queries ({A ⊑ T ⊔ F}, q). We give examples of queries from the complexity classes C ∈ {AC0, L, NL, P, coNP}, and obtain a few syntactical conditions for C-membership and C-hardness.

1 Introduction

The OWL 2 QL profile of OWL 2, as well as the underlying description logics from the DL-Lite family [4, 2], was designed to ensure that every ontology-mediated query (OMQ, for short) (𝒯, q) with an OWL 2 QL ontology 𝒯 and a conjunctive query (CQ) q is first-order (FO) rewritable. However, when developing ontologies for ontology-based data access (OBDA) [10] applications, domain experts are often tempted to use axioms with constructs that are not available in OWL 2 QL. For example, the NPD FactPages ontology,⁴ which was created to facilitate querying the datasets of the Norwegian Petroleum Directorate,⁵ contains cardinality restrictions and covering axioms of the form A ⊑ B_1 ⊔ · · · ⊔ B_n. Typical answers to the question whether such axioms could have a negative impact on OMQ rewriting are as follows: (i) the data satisfies the axioms anyway (because of the database schema), (ii) our 'real-world queries' are never affected by them, and (iii) OBDA systems such as Ontop drop everything outside OWL 2 QL. Ideally, of course, we would rather want our system to detect automatically whether the given OMQ is FO-rewritable and alert the user if this is not so. Furthermore, in case of non-FO-rewritability, we might want the system to check whether a datalog rewriting is possible, and so on. From the complexity-theoretic point of view, we are thus interested in the data complexity of answering a given OMQ with an expressive ontology.

⁴ http://sws.ifi.uio.no/project/npd-v2/

⁵ http://factpages.npd.no/factpages/

A systematic investigation of this problem was started in [3], which showed among other results that answering OMQs of the form (Dis_n, u), where u is a union of CQs (UCQ) and Dis_n = {A ⊑ B_1 ⊔ · · · ⊔ B_n}, is polynomially equivalent to the constraint satisfaction problems CSP(𝒜). In particular, a P/coNP dichotomy for such OMQs would give a dichotomy for CSPs, thereby confirming the Feder-Vardi conjecture. As shown in [7], answering OMQs with basic schema.org ontologies (in particular, Dis_n) and CQs of qvar-size ≤ 2 is in P for combined complexity, where q is of qvar-size n if the restriction of q to its quantified variables is a disjoint union of CQs with at most n variables each. Moreover, FO- and datalog-rewritability of OMQs of the form (𝒯, u), where 𝒯 is a schema.org ontology and u is a UCQ, are decidable in NExpTime. It has also been recently established in [5] that checking FO-rewritability of OMQs with ontologies formulated in any description logic between ALCI and SHI is 2NExpTime-complete. Datalog rewritability of OMQs with ontologies given in disjunctive datalog has been investigated in [8].

In this paper, we consider one fixed non-Horn ontology Dis = {A ⊑ T ⊔ F}. Ultimately aiming at a complete classification of CQs q according to the data complexity of answering OMQs Q = (Dis, q), here we present our initial observations about this problem. Ideally, we would like to obtain transparent necessary and sufficient conditions relating the structure of q (say, the way in which T and F occur in it) with the complexity of answering Q. For example, one such condition guaranteeing datalog rewritability, and so tractability of answering Q, follows from [8, Theorem 27]: it suffices that q contains at most one occurrence of F or at most one occurrence of T. We obtain a few conditions in the same spirit for the complexity classes AC0, L, NL and P. We also give quite a few simple and instructive CQs distinguishing between NL and P, and develop techniques for establishing P and coNP lower bounds.

2 Preliminaries

In our context, a conjunctive query (CQ) is a first-order (FO) formula of the form q(x) = ∃y ϕ(x, y), where ϕ is a conjunction of unary or binary atoms P(z) with z ⊆ x ∪ y. A union of conjunctive queries (UCQ) is a disjunction of CQs.

Given an ABox (or data instance) 𝒜, we denote by ind(𝒜) the set of individual names that occur in 𝒜. A tuple a ⊆ ind(𝒜) is a certain answer to the OMQ Q = (Dis, q(x)) over 𝒜 if ℳ ⊨ q(a), for every model ℳ of Dis ∪ 𝒜; in this case we write Dis, 𝒜 ⊨ q(a). If the set x of answer variables is empty, a certain answer to Q over 𝒜 is 'yes' if ℳ ⊨ q, for every model ℳ of Dis ∪ 𝒜, and 'no' otherwise. OMQs and CQs without answer variables x are called Boolean. We often regard CQs as sets of their atoms. In this paper, we assume that all CQs are connected.

Let Q = (Dis, q(x)) be a fixed OMQ. By answering Q, we understand the problem of checking, given an ABox 𝒜 and a tuple a ⊆ ind(𝒜), whether Dis, 𝒜 ⊨ q(a). It is readily seen that this problem is always in coNP. It is in the complexity class AC0 if there is an FO-formula q′(x), called an FO-rewriting of Q, such that Dis, 𝒜 ⊨ q(a) iff q′(a) holds in the model given by 𝒜, for any ABox 𝒜 and any tuple a ⊆ ind(𝒜).
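The coNP upper bound can be made concrete by a brute-force check. The following Python sketch is only an illustration under stated assumptions: it handles Boolean CQs only, represents atoms as tuples, and the function names `holds` and `certain` are ours, not the paper's. It relies on the fact that CQs are preserved under homomorphisms, so it suffices to inspect the minimal models obtained by adding T(a) or F(a) for every individual a with A(a) ∈ 𝒜 that is in neither T nor F.

```python
from itertools import product


def holds(query, facts):
    """Return True if the Boolean CQ `query` has a homomorphism into `facts`.

    `query` is a list of atoms such as ("F", "x") or ("R", "x", "y");
    `facts` is a set of ground atoms of the same shape.
    """
    variables = sorted({v for atom in query for v in atom[1:]})
    individuals = sorted({c for fact in facts for c in fact[1:]})
    for image in product(individuals, repeat=len(variables)):
        h = dict(zip(variables, image))
        if all((atom[0], *(h[v] for v in atom[1:])) in facts for atom in query):
            return True
    return False


def certain(query, abox):
    """Decide Dis, A |= q by trying every way of resolving A ⊑ T ⊔ F."""
    # Individuals that are in A but not yet in T or F (hypothetical helper name).
    undecided = sorted(
        fact[1] for fact in abox
        if fact[0] == "A"
        and ("T", fact[1]) not in abox
        and ("F", fact[1]) not in abox
    )
    for choice in product("TF", repeat=len(undecided)):
        model = set(abox) | set(zip(choice, undecided))
        if not holds(query, model):
            return False  # a countermodel was found
    return True
```

The running time is exponential in the number of undecided A-individuals, which matches the guess-a-countermodel reading of the coNP bound; it is not meant as a practical procedure.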

A datalog program, Π, is a finite set of rules of the form ∀z (γ_0 ← γ_1 ∧ · · · ∧ γ_m), where each γ_i is an atom Q(y) with y ⊆ z or an equality (z = z′) with z, z′ ∈ z. (As usual, we omit ∀z.) The atom γ_0 is the head of the rule, and γ_1, . . . , γ_m its body. All variables in the head must occur in the body, and = can only occur in the body. The predicates in the heads of rules in Π are IDB predicates, the rest (including =) EDB predicates. A program Π is called linear if the body of every rule in Π contains at most one IDB predicate.

A datalog query is a pair (Π, G(x)), where Π is a datalog program and G(x) an atom. A tuple a ⊆ ind(𝒜) is an answer to (Π, G(x)) over an ABox 𝒜 if G(a) holds in the first-order structure with domain ind(𝒜) obtained by closing 𝒜 under the rules in Π; in this case we write Π, 𝒜 ⊨ G(a). A datalog query (Π, G(x)) is a datalog rewriting of an OMQ Q = (Dis, q(x)) in case Dis, 𝒜 ⊨ q(a) iff Π, 𝒜 ⊨ G(a), for any ABox 𝒜 and any a ⊆ ind(𝒜). The evaluation problem for (Π, G(x)), that is, checking, given an ABox 𝒜 and a tuple a ⊆ ind(𝒜), whether Π, 𝒜 ⊨ G(a), is known to be in P; for linear Π, this problem is in NL; see [6] and references therein.

3 AC0

By a solitary occurrence of F in a CQ q we mean any F(x) ∈ q such that T(x) ∉ q; likewise, a solitary occurrence of T in q is any T(x) ∈ q such that F(x) ∉ q.

Theorem 1. For any CQ q without solitary occurrences of F (or T), answering the OMQ Q = (Dis, q) is in AC0.

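Proof. We show that Dis, 𝒜 ⊨ q(a) iff 𝒜 ⊨ q(a). Suppose that 𝒜 ⊭ q(a) and that q has no solitary F, that is, F(x) ∈ q implies T(x) ∈ q. Take 𝒜′ = 𝒜 ∪ {F(a) | a ∈ ind(𝒜) ∧ T(a) ∉ 𝒜}. Clearly, 𝒜′ ⊨ Dis. Moreover, 𝒜′ ⊭ q(a): any match of q in 𝒜′ sends every x with F(x) ∈ q to an individual in T, for which no new F-atom was added, and so it would already be a match in 𝒜. The converse direction is trivial.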

In particular, answering any OMQ Q = (Dis, q), where q does not contain one of F or T, is in AC0. This observation can be easily generalised to OMQs with ontologies Dis_n = {A ⊑ B_1 ⊔ · · · ⊔ B_n}, for n ≥ 2:

Theorem 2. Suppose q is any CQ that does not contain an occurrence of B_i, for some i (1 ≤ i ≤ n). Then answering the OMQ Q = (Dis_n, q) is in AC0.

Thus, as far as FO-rewritability is concerned, only those CQs can 'feel' Dis_n that contain all the B_i (which makes them quite complex in practice). Theorem 1 also shows that Q = (Dis, q) satisfying the respective condition has a trivial FO-rewriting, viz. q itself. This is not accidental, as shown by the following observation:

Proposition 1. If Q = (Dis, q) is in AC0, then q is a rewriting of Q.

Proof. By [3, Proposition 5.9], if Q is FO-rewritable, it has a UCQ rewriting. Then there is a homomorphism from q to any CQ q′ in this rewriting.

We do not know yet whether the sufficient condition for FO-rewritability given by Theorem 1 is also a necessary one for minimal CQs q (that are not equivalent to any of their proper subqueries). For non-minimal CQs, this is not the case, as shown by the CQ

[Picture: a path-shaped CQ with five nodes labelled F, ∘, FT, ∘, T and four edges labelled R]

which is in AC0 because it is equivalent to the CQ

[Picture: a path-shaped CQ with three nodes labelled ∘, FT, ∘ and two edges labelled R]

Below we obtain some partial results showing how a single F-atom and a single T-atom in q can cause L- and NL-hardness.

4 L and NL

We say that a Boolean CQ q is an F-T-CQ if it has exactly one atom of the form F(x), exactly one atom of the form T(y), and the variables x and y are distinct.

Theorem 3. Answering any OMQ Q = (Dis, q) with an F-T-CQ q is L-hard.

Proof. The proof is by reduction from the reachability problem for undirected graphs, which is known to be L-complete; see, e.g., [1]. Let q′ be the CQ obtained from q by removing the atoms F(x) and T(y). Suppose we are given an undirected graph G = (V, E) and two vertices s, t ∈ V. It will be convenient to regard G as a directed graph such that (u, v) ∈ E iff (v, u) ∈ E, for any u, v ∈ V. We encode G by means of an ABox 𝒜_G that is obtained from G as follows. For every edge e = (u, v) ∈ E, let q′_e be the set of atoms in q′ with x renamed to u, y to v and all other variables z to z_e. Then 𝒜_G comprises all such q′_e, for e ∈ E, as well as F(s), T(t) and A(v), for v ∈ V \ {s, t}. Our aim is to show that s →_G t iff Dis, 𝒜_G ⊨ q.

Suppose s →_G t, that is, there exists a path s = v_0, v_1, . . . , v_n = t in G with e_i = (v_i, v_{i+1}) ∈ E, for i < n. Consider an arbitrary model ℐ of Dis and 𝒜_G. Since ℐ ⊨ Dis and F(s), T(t), A(v_i), for 1 ≤ i < n, are all in 𝒜_G, we can find some i < n such that ℐ ⊨ F(v_i) and ℐ ⊨ T(v_{i+1}). As q′_{e_i} is an isomorphic copy of q′, we obtain ℐ ⊨ q. Conversely, suppose s ↛_G t. Define an interpretation ℐ by extending the ABox 𝒜_G with F^ℐ = {v ∈ V | s →_G v} and T^ℐ = {v ∈ V | s ↛_G v}. Clearly, ℐ is a model of Dis. By the construction, the elements of the connected component of ℐ containing s cannot be instances of T, while the remaining elements of ℐ cannot be instances of F. Since q is connected, it follows that ℐ ⊭ q.
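The encoding of G used in this proof can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' code: atoms are tuples, q′ is given over the variable names "x" and "y" plus arbitrary other names, and the function name `encode_graph` and the naming scheme for the fresh individuals z_e are ours.

```python
def encode_graph(q_prime, edges, vertices, s, t):
    """Build the ABox A_G from an undirected graph (proof of Theorem 3).

    q_prime:  atoms of q' (q without F(x) and T(y)), e.g. ("R", "x", "z").
    edges:    iterable of pairs (u, v); each pair is read as undirected.
    vertices: iterable of all vertices of G.
    """
    abox = {("F", s), ("T", t)}
    abox |= {("A", v) for v in vertices if v not in (s, t)}
    # Regard each undirected edge as two directed edges.
    directed = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
    for (u, v) in directed:
        def rename(var):
            if var == "x":
                return u
            if var == "y":
                return v
            # Fresh copy z_e of every other variable (assumed naming scheme).
            return f"{var}_({u},{v})"
        abox |= {(atom[0], *map(rename, atom[1:])) for atom in q_prime}
    return abox
```

For instance, with q′ = R(x, z) ∧ R(z, y) and the path s, v, t, the sketch produces two R-atoms for each of the four directed edges, together with F(s), T(t) and A(v).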

We call a Boolean CQ q linear-directed if all of its variables can be arranged in a sequence v_0, . . . , v_m such that all binary predicates in q are of the form R(v_i, v_{i+1}), for some i, 0 ≤ i < m.

Theorem 4. Answering any OMQ Q = (Dis, q) with a linear-directed CQ q containing both a solitary F and a solitary T is NL-hard.

Proof. Suppose F(v_k) ∈ q, T(v_k) ∉ q and F(v_l) ∉ q, T(v_l) ∈ q, for some k, l with 0 ≤ k < l ≤ m. We rename the sequence v_k, . . . , v_l to x_0, . . . , x_n. The proof proceeds by reduction from the reachability problem in directed graphs, which is known to be NL-complete; see, e.g., [1]. Given a directed graph G = (V, E) and vertices s, t ∈ V, we construct the ABox 𝒜_G in the same way as in the proof of Theorem 3, treating x_0 as x and x_n as y. Again, we show that s →_G t iff Dis, 𝒜_G ⊨ q. The implication (⇒) is established exactly as above.

To prove (⇐), we assume that s ↛_G t and consider the same model ℐ as defined in the proof of Theorem 3. Taking account of linear-directedness of q, we immediately conclude that there is no homomorphism h: q → ℐ with h(x_0) ∈ V. It remains to show that there is no homomorphism h: q → ℐ with h(x_0) ∉ V either. Suppose to the contrary that such a homomorphism exists. Then there exist B ∈ {F, T} and a homomorphism f: q → (𝒜_{G_2} ∪ {B(r)}), where G_2 = ({s, r, t}, {(s, r), (r, t)}). We denote the points of 𝒜_{G_2} between s and r by x_0, x_1, . . . , x_n and those between r and t by x_n, x′_1, . . . , x′_n. By comparing the lengths of appropriate segments of q, we obtain f(x_0) = x_i, for some i (0 < i < n). As F(x_0) ∈ q, we must have F(x_i) ∈ q; see the picture below. As f(x_i) = x_{2i} if 2i ≤ n, and f(x_i) = x′_{2i mod n} otherwise, we also have F(x_{2i mod n}) ∈ q; more generally, F(x_{ki mod n}) ∈ q for all natural k. Now, since the equation of the form 'iX = n mod n' always has a solution, F(x_n) ∈ q, which is impossible if B = T. If B = F, we use a similar argument starting from T(x_i) ∈ q and show that T(x_n) ∈ q, which is again a contradiction.

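[Picture: the homomorphism f from q (bottom row: F x_0, . . . , x_i, . . . , x_j, . . . , T x_n) into 𝒜_{G_2} ∪ {B(r)} (top row: F x_0, . . . , x_i, . . . , B x_n, . . . , x′_i, . . . , T x′_n), with f(x_0) = x_i]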

Theorems 1 and 4 give the following dichotomy for OMQs Q = (Dis, q) with linear-directed CQs q:

– either q does not contain a solitary F or a solitary T, and answering Q is in AC0,

– or q contains both solitary F and T, and answering Q is NL-hard.

We now complement the sufficient conditions for L- and NL-hardness obtained above with sufficient conditions for answering OMQs in L and NL.

A CQ q′(x, y) is symmetric if the CQs q′(x, y) and q′(y, x) are equivalent in the sense that q′(a, b) holds in 𝒜 iff q′(b, a) holds in 𝒜, for any ABox 𝒜 and a, b ∈ ind(𝒜).

Theorem 5. Let Q = (Dis, q) be any OMQ such that

q = ∃x, y (F(x) ∧ q′_1(x) ∧ q′(x, y) ∧ q′_2(y) ∧ T(y)),

for some connected CQs q′(x, y), q′_1(x) and q′_2(y) that do not contain solitary T and F, and q′(x, y) is symmetric. Then answering Q can be done in L.

Proof. It is not hard to show that, for any ABox 𝒜, we have Dis, 𝒜 ⊨ q iff there exist v_0, v_1, . . . , v_n ∈ ind(𝒜), for some n ≥ 1, such that the following conditions hold:

– F(v_0), A(v_1), . . . , A(v_{n−1}), T(v_n) ∈ 𝒜;

– 𝒜 ⊨ q′(v_i, v_{i+1}) for 0 ≤ i < n;

– 𝒜 ⊨ q′_1(v_i) for 0 ≤ i < n;

– 𝒜 ⊨ q′_2(v_i) for 1 ≤ i ≤ n.

It remains to observe that checking these conditions reduces to checking V_T–V_F reachability in the undirected graph G_𝒜 = (V_𝒜, E_𝒜) defined below. The vertices in G_𝒜 comprise the set V_𝒜 = V_T ∪ V_A ∪ V_F, where

– V_T = {v ∈ ind(𝒜) | 𝒜 ⊨ T(v) ∧ q′_2(v)};

– V_A = {v ∈ ind(𝒜) | 𝒜 ⊨ A(v) ∧ q′_1(v) ∧ q′_2(v)};

– V_F = {v ∈ ind(𝒜) | 𝒜 ⊨ F(v) ∧ q′_1(v)}.

The edges in G_𝒜 comprise the set E_𝒜 = E_TA ∪ E_AA ∪ E_FA, where

– E_all = {(x, y) | 𝒜 ⊨ q′(x, y)};

– E_TA = {(x, y) ∈ E_all | (x ∈ V_T ∧ y ∈ V_A) ∨ (y ∈ V_T ∧ x ∈ V_A)};

– E_AA = {(x, y) ∈ E_all | x ∈ V_A ∧ y ∈ V_A};

– E_FA = {(x, y) ∈ E_all | (x ∈ V_F ∧ y ∈ V_A) ∨ (y ∈ V_F ∧ x ∈ V_A)}.

It is readily seen that G_𝒜 = (V_𝒜, E_𝒜) is undirected in the sense that, for all of its vertices u and v, (u, v) ∈ E_𝒜 iff (v, u) ∈ E_𝒜.

If we do not require q′(x, y) to be symmetric, the complexity upper bound increases to NL:

Theorem 6. Let Q = (Dis, q) be any OMQ such that

q = ∃x, y (F(x) ∧ T(y) ∧ q′(x, y)),

for some connected CQ q′(x, y) without solitary occurrences of F and T. Then answering Q can be done in NL.

Proof. We claim that the datalog query (Π, G) with the following linear datalog program Π, where q̃′ is the result of omitting all the ∃ from q′:

G ← F(x) ∧ q̃′(x, y) ∧ P(y)

P(x) ← T(x)

P(x) ← A(x) ∧ q̃′(x, y) ∧ P(y)

is a datalog rewriting of Q. Indeed, if Π, 𝒜 ⊨ G then there are v_0, v_1, . . . , v_n ∈ ind(𝒜) such that F(v_0), A(v_1), . . . , A(v_{n−1}), T(v_n) ∈ 𝒜 and q′(v_i, v_{i+1}) holds in 𝒜, for 0 ≤ i < n. Clearly, in any model ℐ of Dis and 𝒜 there is i with ℐ ⊨ F(v_i) ∧ T(v_{i+1}). It follows that Dis, 𝒜 ⊨ q.

Conversely, suppose Π, 𝒜 ⊭ G. Let V_P = {v ∈ ind(𝒜) | Π, 𝒜 ⊨ P(v)}. Define a model ℐ of Dis with domain ind(𝒜) by setting

T^ℐ = {v | T(v) ∈ 𝒜} ∪ {v ∈ V_P | A(v) ∈ 𝒜},   F^ℐ = F^𝒜 ∪ {v ∉ V_P | A(v) ∈ 𝒜}.

We claim that ℐ ⊭ q. Indeed, otherwise there is a homomorphism h: q → ℐ. As h(y) ∈ T^ℐ, we have Π, 𝒜 ⊨ P(h(y)). As h(x) ∈ F^ℐ, we have either F(h(x)) ∈ 𝒜 or A(h(x)) ∈ 𝒜, contrary to Π, 𝒜 ⊭ G.
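To see the linear program Π of Theorem 6 at work, here is a small Python sketch of its evaluation. It is an illustration only: the function name `rewrite_theorem6` is ours, atoms are tuples, and the set `step` of pairs (u, v) with 𝒜 ⊨ q′(u, v) is assumed to be precomputed by the caller.

```python
def rewrite_theorem6(abox, step):
    """Evaluate the linear program Π from Theorem 6 over `abox`.

    abox: set of ground atoms such as ("F", "s") or ("R", "s", "v").
    step: set of pairs (u, v) with A |= q'(u, v) (assumed precomputed).
    Returns True iff the goal G is derived.
    """
    def unary(pred):
        return {f[1] for f in abox if f[0] == pred and len(f) == 2}

    T, F, A = unary("T"), unary("F"), unary("A")

    P = set(T)                      # P(x) <- T(x)
    frontier = list(P)
    while frontier:                 # P(x) <- A(x), q'(x, y), P(y)
        y = frontier.pop()
        for (x, y2) in step:
            if y2 == y and x in A and x not in P:
                P.add(x)
                frontier.append(x)

    # G <- F(x), q'(x, y), P(y)
    return any(x in F and y in P for (x, y) in step)
```

Because every rule body mentions at most one IDB atom, the loop is a plain backward reachability search from the T-individuals through A-individuals, which is the intuition behind the NL bound.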

The sufficient conditions of Theorems 5 and 6 only apply to CQs with exactly one solitary occurrence of F and exactly one solitary occurrence of T. What happens if we allow more than one solitary occurrence of F or T?

5 P

The following result is a consequence of [8, Theorem 27]:

Theorem 7. Let Q = (Dis, q) be any OMQ such that

q = ∃x, y_1, . . . , y_n (F(x) ∧ T(y_1) ∧ · · · ∧ T(y_n) ∧ q′(x, y_1, . . . , y_n)),

for some connected CQ q′(x, y_1, . . . , y_n) without solitary occurrences of T and F. Then answering Q can be done in P.

Indeed, for any ABox 𝒜, we have Dis, 𝒜 ⊨ q iff Π, 𝒜 ⊨ G, where Π is the following datalog program and q̃′ is the result of omitting all the ∃ from q′:

G ← F(x) ∧ q̃′(x, y_1, . . . , y_n) ∧ P(y_1) ∧ · · · ∧ P(y_n)

P(x) ← T(x)

P(x) ← A(x) ∧ q̃′(x, y_1, . . . , y_n) ∧ P(y_1) ∧ · · · ∧ P(y_n).
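The program above is no longer linear, but it can still be evaluated by a naive fixpoint iteration in polynomial time. The Python sketch below is an illustration under stated assumptions: the function name `rewrite_theorem7` is ours, and the set `matches` of tuples (u, w_1, . . . , w_n) with 𝒜 ⊨ q′(u, w_1, . . . , w_n) is assumed to be precomputed.

```python
def rewrite_theorem7(abox, matches):
    """Evaluate the datalog program Π from Theorem 7 over `abox`.

    abox:    set of ground atoms such as ("F", "o") or ("T", "i1").
    matches: set of tuples (u, w1, ..., wn) with A |= q'(u, w1, ..., wn)
             (assumed precomputed by the caller).
    Returns True iff the goal G is derived.
    """
    def unary(pred):
        return {f[1] for f in abox if f[0] == pred and len(f) == 2}

    T, F, A = unary("T"), unary("F"), unary("A")

    P = set(T)                                  # P(x) <- T(x)
    changed = True
    while changed:                              # naive fixpoint iteration
        changed = False
        for (x, *ys) in matches:
            # P(x) <- A(x), q'(x, y1..yn), P(y1), ..., P(yn)
            if x in A and x not in P and all(y in P for y in ys):
                P.add(x)
                changed = True

    # G <- F(x), q'(x, y1..yn), P(y1), ..., P(yn)
    return any(x in F and all(y in P for y in ys) for (x, *ys) in matches)
```

Each pass either adds an individual to P or stops, so there are at most |ind(𝒜)| + 1 passes over `matches`.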

Is the P upper bound of Theorem 7 optimal? The following example gives a typical OMQ in the scope of that theorem, answering which is P-hard.

Example 1. We show that Q = (Dis, q) is P-hard for q shown in the picture below.

[Picture: the CQ q, a path with nodes labelled T, T, F and edges labelled S, R]

The proof is by reduction from the alternating monotone circuit evaluation problem, which is known to be P-complete [9]. An example of an alternating monotone circuit is shown in the picture below. Given such a circuit C and an input α, we define an ABox 𝒜_C^α as the set of the following atoms:

– R(g, h), if a gate g is an input of a gate h;

– S(g, h), if g and h are distinct inputs of some AND-gate;

– S(g, g), if g is an input gate or a non-output AND-gate;

– T(g), if g is an input gate with 1 under α;

– F(g), for the only output gate g;

– A(g), for those g that are neither inputs nor the output.

To illustrate, the picture below shows an alternating monotone circuit C, an input α for it, and the ABox 𝒜_C^α, where the solid arrows represent R and the dashed ones S:

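[Picture: an alternating monotone circuit C with inputs 1, 0, 1, 0, 0, 0 and alternating layers of OR- and AND-gates ending in a single AND output gate, shown next to the ABox 𝒜_C^α, whose nodes are labelled T (inputs with value 1), A (inner gates) and F (output gate)]

One can show that C(α) = 1 iff Dis, 𝒜_C^α ⊨ q.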
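The six clauses defining 𝒜_C^α translate directly into code. The following Python sketch builds the ABox only; it does not evaluate the query. The circuit representation (a dictionary from gate names to a kind and a list of inputs) and the function name `circuit_abox` are assumptions made for illustration.

```python
def circuit_abox(gates, inputs, output, alpha):
    """Build the ABox of Example 1 from a monotone circuit.

    gates:  dict mapping each non-input gate to (kind, [input gates]),
            where kind is "AND" or "OR" (hypothetical representation).
    inputs: list of input gate names.
    output: name of the only output gate.
    alpha:  dict mapping each input gate to 0 or 1.
    """
    abox = set()
    for h, (kind, ins) in gates.items():
        # R(g, h), if a gate g is an input of a gate h
        abox |= {("R", g, h) for g in ins}
        if kind == "AND":
            # S(g, g2), if g and g2 are distinct inputs of some AND-gate
            abox |= {("S", g, g2) for g in ins for g2 in ins if g != g2}
            # S(h, h), if h is a non-output AND-gate
            if h != output:
                abox.add(("S", h, h))
    for g in inputs:
        abox.add(("S", g, g))          # S(g, g), if g is an input gate
        if alpha[g] == 1:
            abox.add(("T", g))         # T(g), if g is an input with 1 under alpha
    abox.add(("F", output))            # F(g), for the only output gate g
    # A(g), for gates that are neither inputs nor the output
    abox |= {("A", g) for g in gates if g != output}
    return abox
```

The construction is clearly computable in logarithmic space, since every atom depends only on a constant number of gates.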

Curiously, by changing S to R in the CQ from Example 1, we obtain an OMQ that is NL-complete, as follows from Theorem 8 below.

6 NL vs. P

Theorem 8. Answering any OMQ Q = (Dis, q_n) with

$$q_n = \exists x_1, \dots, x_n, y \, \Big( \bigwedge_{i=1}^{n-1} \big( T(x_i) \wedge R(x_i, x_{i+1}) \big) \wedge T(x_n) \wedge R(x_n, y) \wedge F(y) \Big),$$

for n ≥ 1, is NL-complete.

Proof. The lower bound follows from Theorem 4. The proof of the upper one is by reduction to directed reachability. We split q_n into two CQs:

$$q'_n = \exists x_1, \dots, x_n \, \Big( \bigwedge_{i=1}^{n-1} \big( T(x_i) \wedge R(x_i, x_{i+1}) \big) \wedge T(x_n) \Big), \qquad q = \exists x, y \, \big( T(x) \wedge R(x, y) \wedge F(y) \big).$$

One can show that, for any ABox 𝒜, we have Dis, 𝒜 ⊨ q_n iff there exist a homomorphism f: q′_n → 𝒜 and a directed R-path f(x_n), v_0, v_1, . . . , v_m ∈ ind(𝒜) such that A(v_i) ∈ 𝒜, for i = 1, . . . , m − 1, and F(v_m) ∈ 𝒜. Clearly, this criterion reduces to directed reachability.

To further illustrate how minor modifications to the structure of CQs can send them to different complexity classes, we collect in Table 1 a number of CQs in the scope of Theorem 7, some of which turn out to be NL-complete, while others are P-complete. (All the omitted labels on the arrows in Table 1 are assumed to be R, −/A means either blank or A, and FT/A means either FT or A.)

Here, we only sketch the proof of P-hardness for the OMQ (Dis, q), where q is

[Picture: the CQ q, a path with nodes labelled T, F, T and two edges labelled R]

The proof is by reduction from the monotone circuit evaluation problem. Given a monotone circuit C and an input α, we define an ABox 𝒜_C^α as the following labelled directed graph, all of whose edges are labelled with R. For each gate g of C except the inputs and output, the graph contains two vertices g and g′ labelled with A; the output gate g gives rise to only one vertex g labelled with F, while each input gate g gives rise to only one vertex g′ labelled according to α. For an OR-gate g = h_1 ∨ h_2, we have the directed edges (h′_1, g), (h′_2, g), (g, r_g), where r_g is a new vertex labelled with T. For an AND-gate g = h_1 ∧ h_2, we have the edges (h′_1, g), (g, h′_2). Also, for each gate g, we have the edges (g, g′), (g′, t_g), where t_g is a new vertex labelled with T. An example illustrating the construction is given below. One can show that C(α) = 1 iff Dis, 𝒜_C^α ⊨ q.

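[Picture: a monotone circuit with inputs 1, 1, 0, an AND-gate g_1, an OR-gate g_2 and an AND output gate g_3, shown next to the corresponding ABox with vertices g_1, g′_1, g_2, g′_2 labelled A, the vertex g_3 labelled F, and the auxiliary vertices r_{g_2}, t_{g_1}, t_{g_2} labelled T]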

[Table 1: CQ diagrams arranged in two columns headed NL-complete and P-complete; the P-complete column includes the CQ from Example 1 and the CQ with nodes T, F, T sketched above]

Table 1. NL- and P-complete OMQs in the scope of Theorem 7.

The membership in NL for the CQs in the left column of Table 1 can be shown by constructing appropriate linear datalog programs. For example, answering the OMQ with the last CQ of the left column can be done by the following linear program:

P(x) ← R(x, y), T(y), R(x, z), R(z, v), T(v)

P(x) ← R(x, y), T(y), R(x, z), R(z, v), P(v), A(v)

P(x) ← R(x, y), P(y), A(y)

G ← P(x), F(x)

Note that the classification problem we deal with in this section can be regarded as an instance of a more general problem of classifying datalog programs in terms of their data complexity, in particular, finding an NL/P dichotomy.

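7 coNP

On the other hand, a minor extension of the CQ from Example 1 can lead to coNP-completeness. First we show that answering the OMQ Q = (Dis, q) with the Boolean CQ q given in the picture below is coNP-complete.

[Picture: the CQ q, a path with nodes labelled T, T, F, F and edges labelled R, S, Q, that is, q = ∃x, y, z, w (T(x) ∧ R(x, y) ∧ T(y) ∧ S(y, z) ∧ F(z) ∧ Q(z, w) ∧ F(w))]

Consider the ABoxes 𝒜_N constructed according to the pattern shown below for N = 3: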

[Picture: the ABox 𝒜_3 with individuals a_n, b_n, c_n, for 0 ≤ n ≤ 3, arranged in a cycle, edges labelled R, S, Q, and nodes labelled T, A, F]

Let V = {a_0, . . . , a_N}. It is not hard to see that (i) for any interpretation ℐ based on 𝒜_N, if ℐ ⊭ q then either V ⊆ T^ℐ or V ⊆ F^ℐ; (ii) the interpretations ℐ and ℐ′ obtained by extending 𝒜_N with T^ℐ = T^𝒜 ∪ V and F^ℐ′ = F^𝒜 ∪ V, respectively, are both models of Dis that do not satisfy q.

Given a 2+2-CNF φ with clauses D_1, . . . , D_N and variables p_1, . . . , p_M, we take M disjoint copies of 𝒜_N, distinguishing between them by the superscripts 1, . . . , M. For example, a^2_3 is the a_3-point of the second copy of 𝒜_N and V^2 = {a^2_0, . . . , a^2_N}. For each D_n of the form ¬p_i ∨ ¬p_j ∨ p_k ∨ p_l, we add to those copies the atoms R(a^i_n, a^j_n), S(a^j_n, a^k_n) and Q(a^k_n, a^l_n), and denote the resulting ABox by 𝒜_φ.

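[Picture: the atoms R(a^i_n, a^j_n), S(a^j_n, a^k_n), Q(a^k_n, a^l_n) added for a clause D_n, linking the A-labelled points a_n of the copies for p_i, p_j, p_k, p_l; the neighbouring points a_{n−1} of each copy are also shown]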

We show that φ is satisfiable iff Dis, 𝒜_φ ⊭ q. Let q′ = R(x, y) ∧ S(y, z) ∧ Q(z, w). Observe that any possible match of q′ in 𝒜_φ falls into one of the two groups:

(A) (a^i_n, b^i_n, c^i_n, a^i_{n+1}), for 0 ≤ n ≤ N, 1 ≤ i ≤ M and addition modulo N + 1;

(B) (a^i_n, a^j_n, a^k_n, a^l_n), for some clause D_n = (¬p_i ∨ ¬p_j ∨ p_k ∨ p_l) in φ.

Suppose φ is satisfiable under an assignment a. We define a model ℐ of Dis by extending 𝒜_φ with T^ℐ = T^{𝒜_φ} ∪ ⋃{V^i | a(p_i) = 1} and F^ℐ = F^{𝒜_φ} ∪ ⋃{V^i | a(p_i) = 0}. We claim that ℐ ⊭ q. Indeed, the tuples in (A) cannot yield a match by (ii) above, while the tuples in (B) do not give a match since a(D_n) = 1, for all n ≤ N. To see this, suppose a tuple (a^i_n, a^j_n, a^k_n, a^l_n) from (B) is a match for q in ℐ. Then {a^i_n, a^j_n} ⊆ T^ℐ and {a^k_n, a^l_n} ⊆ F^ℐ, from which a(p_i) = 1, a(p_j) = 1, a(p_k) = 0 and a(p_l) = 0, and so the clause D_n = ¬p_i ∨ ¬p_j ∨ p_k ∨ p_l is false under a.

Conversely, suppose Dis, 𝒜_φ ⊭ q. Then there is a model ℐ of Dis based on 𝒜_φ such that ℐ ⊭ q. By (i) above applied to the copies of 𝒜_N, for every i ≤ M, we have either V^i ⊆ T^ℐ or V^i ⊆ F^ℐ. In the former case, we set a(p_i) = 1; in the latter one, we set a(p_i) = 0. We claim that φ is satisfiable under a. Indeed, if D_n = ¬p_i ∨ ¬p_j ∨ p_k ∨ p_l is false under a, then a(p_i) = 1, a(p_j) = 1, a(p_k) = 0 and a(p_l) = 0, and so the tuple (a^i_n, a^j_n, a^k_n, a^l_n) would be a match for q in ℐ.
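The reduction can be tried out on tiny formulas with the following Python sketch. It is an illustration under explicit assumptions: the labelling of 𝒜_N (a_n in A, b_n in T, c_n in F, with atoms R(a_n, b_n), S(b_n, c_n), Q(c_n, a_{n+1})) is our reading of the pattern, chosen to agree with group (A) and property (i); clauses mention variables only; and all function names are ours.

```python
from itertools import product


def build_abox(clauses, num_vars):
    """clauses[n-1] = (i, j, k, l) encodes D_n = ¬p_i ∨ ¬p_j ∨ p_k ∨ p_l."""
    N = len(clauses)
    abox = set()
    for i in range(1, num_vars + 1):          # one copy of A_N per variable p_i
        for n in range(N + 1):
            a, b, c = f"a{n}^{i}", f"b{n}^{i}", f"c{n}^{i}"
            nxt = f"a{(n + 1) % (N + 1)}^{i}"  # addition modulo N + 1
            abox |= {("A", a), ("T", b), ("F", c),
                     ("R", a, b), ("S", b, c), ("Q", c, nxt)}
    for n, (i, j, k, l) in enumerate(clauses, start=1):
        abox |= {("R", f"a{n}^{i}", f"a{n}^{j}"),
                 ("S", f"a{n}^{j}", f"a{n}^{k}"),
                 ("Q", f"a{n}^{k}", f"a{n}^{l}")}
    return abox


def satisfies_q(facts):
    """Check q = T(x) ∧ R(x,y) ∧ T(y) ∧ S(y,z) ∧ F(z) ∧ Q(z,w) ∧ F(w)."""
    T = {f[1] for f in facts if f[0] == "T"}
    F = {f[1] for f in facts if f[0] == "F"}
    for (_, x, y) in (f for f in facts if f[0] == "R"):
        if x in T and y in T:
            for (_, y2, z) in (f for f in facts if f[0] == "S"):
                if y2 == y and z in F:
                    if any(f[0] == "Q" and f[1] == z and f[2] in F
                           for f in facts):
                        return True
    return False


def countermodels(abox):
    """Yield, for each minimal model of Dis and abox that falsifies q,
    the set of A-individuals it places in T."""
    open_nodes = sorted(f[1] for f in abox if f[0] == "A")
    for choice in product("TF", repeat=len(open_nodes)):
        model = abox | set(zip(choice, open_nodes))
        if not satisfies_q(model):
            yield {a for c, a in zip(choice, open_nodes) if c == "T"}
```

For the single clause ¬p_1 ∨ ¬p_2 ∨ p_3 ∨ p_4, each countermodel makes every copy uniformly T or uniformly F, and the countermodels correspond exactly to the satisfying assignments of the clause.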

The proposed method is generic in the sense that we can try to apply it to any 'sufficiently asymmetric' CQ q with two T-atoms and two F-atoms: we use a T-F fragment of q for copying the values of the Boolean variables, and the whole q for encoding the clauses of a 2+2-CNF. However, this method does not work for the CQ q′

[Picture: the CQ q′, a path with nodes labelled T, T, F, F and three edges labelled R]

which requires a somewhat different technique. We show coNP-hardness of (Dis, q′) by reduction from 3SAT. Given a 3CNF ψ, we define an ABox 𝒜_ψ as follows. First, for every variable p in ψ, we construct a 'gadget' shown in the picture below, where the number of A-nodes above each of the circles matches the number of clauses in ψ; we refer to these nodes as p-nodes and, respectively, ¬p-nodes (below the circles, there are 2 p-nodes and 2 ¬p-nodes):

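[Picture: the gadget for a variable p, consisting of two circles marked p and ¬p, each with a column of A-nodes above it and further A-nodes below it, and auxiliary nodes labelled T and F attached to the A-nodes]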

Observe that, for any model ℐ of Dis and the constructed gadget for p, if ℐ ⊭ q′ then either (i) the p-nodes are all in F^ℐ and the ¬p-nodes are all in T^ℐ, or (ii) the p-nodes are all in T^ℐ and the ¬p-nodes are all in F^ℐ.

Now, for every clause c = (l_1 ∨ l_2 ∨ l_3) in ψ, we add to the constructed gadgets the atoms T(c), R(c, a^c_{¬l_1}), R(a^c_{¬l_1}, a^c_{l_2}), R(a^c_{l_2}, a^c_{l_3}), where c is a new individual, a^c_{¬l_1} a fresh ¬l_1-node, a^c_{l_2} a fresh l_2-node, and a^c_{l_3} a fresh l_3-node. For example, for the clause c = (p ∨ q ∨ r), we obtain the fragment below. The resulting ABox is denoted by 𝒜_ψ.

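[Picture: the fragment for the clause c = (p ∨ q ∨ r), an R-path from the T-labelled individual c through A-labelled nodes of the ¬p-, q- and r-columns, each of which has auxiliary T- and F-labelled neighbours]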

One can show that ψ is satisfiable iff Dis, 𝒜_ψ ⊭ q′.

Acknowledgements. The work of O. Gerasimova and M. Zakharyaschev was carried out at the National Research University Higher School of Economics and supported by the Russian Science Foundation under grant 17-11-01294; the work of V. Podolskii was supported by the Russian Academic Excellence Project '5-100' and by grant MK-7312.2016.1. Thanks are due to Frank Wolter and Carsten Lutz for comments, suggestions and discussions.

References

1. Arora, S., Barak, B.: Computational Complexity: A Modern Approach. Cambridge University Press, New York, NY, USA, 1st edn. (2009)

2. Artale, A., Calvanese, D., Kontchakov, R., Zakharyaschev, M.: The DL-Lite family and relations. Journal of Artificial Intelligence Research (JAIR) 36, 1–69 (2009)

3. Bienvenu, M., ten Cate, B., Lutz, C., Wolter, F.: Ontology-based data access: A study through disjunctive datalog, CSP, and MMSNP. ACM Transactions on Database Systems 39(4), 33:1–44 (2014)

4. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning and efficient query answering in description logics: the DL-Lite family. Journal of Automated Reasoning 39(3), 385–429 (2007)

5. Feier, C., Kuusisto, A., Lutz, C.: Rewritability in monadic disjunctive datalog, MMSNP, and expressive description logics. CoRR abs/1701.02231 (2017), http://arxiv.org/abs/1701.02231

6. Gottlob, G., Papadimitriou, C.H.: On the complexity of single-rule datalog queries. Inf. Comput. 183(1), 104–122 (2003), http://dx.doi.org/10.1016/S0890-5401(03)00012-9

7. Hernich, A., Lutz, C., Ozaki, A., Wolter, F.: Schema.org as a description logic. In: Calvanese, D., Konev, B. (eds.) Proceedings of the 28th International Workshop on Description Logics, Athens, Greece, June 7-10, 2015. CEUR Workshop Proceedings, vol. 1350. CEUR-WS.org (2015), http://ceur-ws.org/Vol-1350/paper-24.pdf

8. Kaminski, M., Nenov, Y., Grau, B.C.: Datalog rewritability of disjunctive datalog programs and non-Horn ontologies. Artif. Intell. 236, 90–118 (2016), http://dx.doi.org/10.1016/j.artint.2016.03.006

9. Papadimitriou, C.: Computational Complexity. Addison-Wesley (1994)

10. Poggi, A., Lembo, D., Calvanese, D., De Giacomo, G., Lenzerini, M., Rosati, R.: Linking data to ontologies. Journal on Data Semantics X, 133–173 (2008)
