The Complexity of Conjunctive Query Abduction in DL-Lite

(1)

DL-Lite

^?

Diego Calvanese¹, Magdalena Ortiz², Mantas Šimkus², and Giorgio Stefanoni¹²

1 KRDB Research Centre for Knowledge and Data Free University of Bozen-Bolzano, Italy

2 Institute of Information Systems Vienna University of Technology, Austria

Abstract. In order to meet usability requirements, most logic-based applications provide explanation facilities for reasoning services. This holds also for DLs, where research focused on the explanation of both TBox reasoning and, more recently, query answering. Besides explaining the presence of a tuple in a query answer, it is important to explain also why a given tuple is missing. We address this latter problem for (conjunctive) query answering over DL-Lite ontologies, by adopting abductive reasoning, that is, we look for additions to the ABox that force a given tuple to be in the result. As reasoning tasks, we consider existence and recognition of an explanation, and relevance and necessity of a certain assertion for an explanation. We characterize the computational complexity of these problems for subset minimal and cardinality minimal solutions.

1 Introduction

Query answering over ontologies formulated in Description Logics (DLs) has received considerable attention both in research and industry. Given an ontology, users typically pose queries over the conceptual schema and get answers that take into account the constraints specified at the conceptual level. Many efforts have concentrated on lightweight description logics. For instance,DL-LiteAhas been tailored for query answering over large data sets [7]. For this reason, expressive power is traded in favor of a better computational behavior in terms of data-complexity. In fact, conjunctive query answering in DL-LiteAenjoys FOL-rewritability, i.e., it can be reduced to the problem of evaluating a suitably constructed FOL query over a database instance.

In order to meet usability requirements set by domain users, most logic-based applications provide explanation algorithms for reasoning services. This holds also for DLs, where research focused on the explanation of both TBox reasoning [12,6,14,3]

and, more recently, query answering [5]. In addition, the latter paper advocates the importance of tackling the problem of explaining the absence of a tuple in the answers to a query over an ontology. This problem stems from the database community, where it has been solved in the context of databases extended with provenance information [9]. We address this problem by considering explanations for the absence of a

?This work was partially supported by the Austrian Science Fund (FWF) grant P20840.

(2)

tuple in the context of query answering overDL-Lite_Aontologies. We adoptabductive reasoning[10,11], that is, we consider which additions need to be made to the ABox to force the given tuple to be in the result. More precisely, given a TBoxT, an ABoxA, and a queryq, anexplanationfor a given tupletis a new ABoxUsuch that the answer toqoverhT,A ∪ U icontainst. An important aspect in explanations is to provide the user with solutions that are simple to understand and free of redundancy, hence as small as possible. To address this requirement, we study various restrictions on solutions, in particular, we focus on subset minimal and cardinality minimal ones. We consider stan- dard decision problems associated to logic-based abduction:(i) existenceof an explanation,(ii) recognitionof a given ABox as being an explanation, and(iii) relevanceand (iv) necessityof an ABox assertion, i.e., whether it occurs in some or all explanations.

After motivating such problems and formalizing them, we provide algorithms to solve them and a precise characterization of their computational complexity forDL-Lite_A. The complexity results for the various reasoning tasks are summarized in Table 1.

2 Preliminaries

DL-Lite_A. DL-Lite_Ais a member of theDL-Litefamily of DLs [7], which have been designed for dealing efficiently with large amounts of extensional information. InDL- LiteA, concept expressionsC, denoting sets of objects, and role expressionsR, denoting binary relations between objects, are formed as follows:

C −→ A | ∃R, R −→ P | P⁻.

whereAdenotes an atomic concept and P an atomic role³. In a DL-LiteA ontology O=hT,Ai, the TBoxT consists of axioms of the form

C₁vC₂, C₁v ¬C₂,

R₁vR₂,

R₁v ¬R₂, (functR),

and the ABoxAconsists of assertions of the formA(c)andR(c, c⁰), wherec,c⁰ are constants (or,individuals) from a countably infinite setC. An interpretation is a pair I =h∆Î,·Îi, where∆Î is a non-emptydomain, and theinterpretation function·Î is defined as usual. We adopt here theunique name assumption(UNA), i.e.,cÎ₁ 6=cÎ₂ for allc1, c2∈Cwithc16=c2. We refer to [7] for more details.

Conjunctive Queries. LetVbe a countably infinite set of variables. ExpressionsA(t) andP(t, t⁰) are called atoms, wheret, t⁰ ∈ V∪C. Aconjunctive query (CQ)q is an expressionq(x₁, . . . , x_n) ← a₁, . . . , a_m,where eacha_i,1 ≤ i ≤ m, is an atom.

Let V(q)denote the set of variables occurring in q, C(q)the set of constants in q, and letat(q) =S

1≤i≤m{a_i}. Amatchforqin an interpretationI is a mapping π : V(q)∪C(q) → ∆^I such that πis the identity on constants, π(t) ∈ A^I for each

3We ignore here the distinction between data values and objects, since it is immaterial for our results. As a consequence, we do not consider value domains and attributes, which are present inDL-LiteA.

(3)

A(t)∈at(q), andhπ(t), π(t⁰)i ∈PÎfor eachP(t, t⁰)∈at(q). The tuplehx1, . . . , xni is the tuple ofanswer variablesofq. TheanswertoqoverI, denotedans(q,I), is the set of alln-tupleshd1, . . . , dni ∈Cⁿsuch thathdÎ₁, . . . , dÎ_ni=hπ(x1), . . . , π(xn)ifor some matchπforqinI. Aunion of conjunctive queries(UCQ) is a set of CQs with the same answer variable tuple. For a UCQq, we letans(q,I) =S

q⁰∈qans(q⁰,I). The certain answerto a CQ or a UCQqoverOis defined ascert(q,O) ={c∈Cⁿ |c∈ ans(q,I)for each modelIofO}.

3 Explaining Negative Query Answers

We now define the problem considered in this paper:

Definition 1. LetO=hT,Aibe an ontology,qa UCQ, andca tuple of constants. We callP =hO, q,ciaquery abduction problem(QAP). AsolutiontoP(or anexplana- tion forP) is any ABox U such that the ontologyO⁰ =hT,A ∪ U iis consistent and c∈cert(q,O⁰). The set of all explanations forPis denotedexpl(P).

Ifc∈/ cert(q,O), then we callcanegative answertoqoverO. Note that a query over the ontology can have a negative answer only if the ontology is satisfiable. Dif- ferently, if the ontology is unsatisfiable then the QAPP does not have any solution.

In the following, we will examine various restrictions toexpl(P)to reduce redundancy in explanations. This is achieved by the introduction of a preference relation among explanations. This relation is reflexive and transitive, i.e., we have a pre-order among solutions.

Definition 2. Assume a QAPP. Letdenote a pre-order on the setexpl(P)of solutions. We writeU ≺ U⁰ifU U⁰andU⁰U. The preferred explanationsexpl(P)of a QAPP under the pre-orderare defined as follows:expl(P) ={ U ∈expl(P)| there is noU⁰ ∈expl(P)s.t.U⁰ ≺ U }, i.e.,expl(P)contains all the-explanations that areminimalunder.

Two preference orders are considered here: thesubset-minimality order, denoted by

⊆, and theminimum explanation size order, denoted by≤. The latter order is defined byU ≤ U⁰iff|U | ≤ |U⁰|. Observe thatexpl_≤(P)⊆expl_⊆(P).

We define now four decision problems related to (minimal) explanations, which are parametric w.r.t. the chosen preference order. Given a QAPP:

– -EXISTENCE: Does there exist a-explanation forP?

– -RECOGNITION: Is a setU of ABox assertions a-explanation forP? – -RELEVANCE: Does an assertionαoccur in some-explanation forP? – -NECESSITY: Does an assertionαoccur in all-explanations forP?

In the following, whenever no preference is applied (i.e., whenis the identity) we omit to writein front of the problem’s names. We provide an example, in which we highlight the consequences of choosing among the various orderings.

(4)

Table 1: Summary of main complexity results (completeness) -EXISTENCE -RECOGNITION -RELEVANCE -NECESSITY

none PTIME(4.1) NP PTIME(4.3) PTIME(4.2)

≤ PTIME DP P^NP_k P^NP_k (4.2)

⊆ PTIME DP Σ₂^P(4.3) PTIME(4.2)

Example 1. LetO=hT,Aibe an ontology describing a university domain, whereT is

PostGrad vStudent, UnderGrad vStudent, UnderGrad v ¬PostGrad,

PartTime vUnderGrad.

Tutor vProfessor,

∃hasTutor vPartTime,

∃hasTutor⁻ vTutor,

AdvancedvCourse,

∃teachesvProfessor,

∃teaches⁻vCourse,

That is, there are two different kinds of students,PostGradandUnderGrad. More- over,PartTime students are tutored byTutors, who are particular professors. Addi- tionally, the university offers someAdvanced courses. Let the ABoxAconsist of the assertionsteaches(rob,SWT),hasTutor(peter,rob). Now, assume that a user is in- terested in finding all those who both teach an advanced course and tutor a student.

Then, she would write the query

q(x)←teaches(x, y),Advanced(y),hasTutor(z, x).

Moreover, she may expect rob to be part of the result, i.e., rob ∈ cert(q,O), but this is not the case. Intuitively, rob satisfies all the constraints imposed by the query, except that the SWT course is not known to be Advanced. One can easily see that {teaches(rob,TOC),Advanced(TOC),hasTutor(john,rob)} is an explanation, {teaches(rob,ALG),Advanced(ALG)} is a ⊆-minimal explanation, while {Advanced(SWT)}is a≤-minimal explanation.

In the next section, the complexity of the four main problems is studied in the light of the different preference relations.

4 Complexity of Explanations

Table 1 provides an overview of our complexity results. Recall that the class Σ₂^P is a member of the Polynomial Hierarchy [13]; it is the class of all decision problems solvable in non-deterministic polynomial time using an NP oracle. Moreover, the class P^NP_k contains all the decision problems that can be solved in polynomial time with an NP oracle, where all oracle calls must be first prepared and then issued in parallel. The class DP contains all problems that, considered as languages, can be characterized as the intersection of a language in NP and a language inCONP [13]. Note that: PTIME⊆ NP ⊆DP⊆P^NP_k ⊆Σ₂^Pis believed to be a strict hierarchy of inclusions and here we make such an assumption.

(5)

Our results can be explained as follows. We show in the next section that EX-

ISTENCEcan be reduced to the PTIME-complete satisfiability problem forDL-Lite_A without the UNA [1], which justifies our PTIME upper bound. This result can then be used to characterize the complexity of RELEVANCE,NECESSITY, and⊆-NECESSITY.

≤-RELEVANCEand≤-NECESSITYare harder. The reason being that in order to solve these problems one has to compute first the minimal size of a solution and, then, inspect all the solutions of that size. Additionally, there is another increase in complexity when dealing with⊆-RELEVANCE. The intuition is that there is an exponential number of candidate solutions to examine and for each of them one has to check that none of its subsets is itself a solution, which requires aCONP computation. Due to space limita- tions, the results on-RECOGNITIONare not detailed in this paper (see [8] for more details). The intuition for the NP bound forRECOGNITIONis that one needs simply to check consistency and perform query evaluation to solve the problem. In case a preference order is in place, one has to check minimality as well, which is aCONP check for

⊆- and≤-minimal explanations that leads to completeness for DP.

4.1 Complexity of-EXISTENCE

For -EXISTENCE note first that the existence of an explanation for P implies the existence of an explanation under the⊆and≤orderings. Thus, we only considerEX-

ISTENCE. Our first task is to show that we can restrict ourselves to explanations built from the original signature of the input QAP plus a small number of fresh constants.

Proposition 1. If P = hO, q,cihas a solution, thenP has a solutionU⁰ with con- cepts, roles, and attributes only fromOand at mostm=maxq_i∈q|at(qi)|fresh ABox individuals.

Proof. Assume an arbitrary solutionUtoP. Given the consistency ofO⁰=hT,A ∪ U i, it follows that there exists a modelI ofO⁰ under the UNA. W.l.o.g. we assume that

∆^I=Cwithc^I =cfor eachc∈C. Additionally, the interpretationIadmits a match πfor someq_i(x)∈q, such thatπ(x) =c. LetU⁰={A(o)|A(t)∈at(q_i)andπ(t) = o} ∪ {R(o, o⁰) | R(t, t⁰) ∈ at(q_i)andπ(t) = oandπ(t⁰) = o⁰}. Observe that U⁰ has no more individuals thanq_i has variables. It remains to see thatU⁰ is a solution.

Clearly the original matchπwitnesses alsoc∈ans(q_i,A ∪ U⁰). It remains to see that O⁰⁰ =hT,A ∪ U⁰iis consistent. But this follows from the fact thatIis a model ofO⁰

and that the atoms inU⁰hold inI ut

The above restriction allows us to considercanonical explanations, i.e., explanations resulting from suitable instantiations of the bodies of CQsqi ∈ q. Keeping in mind that CQs, seen as FOL formulae, are always satisfiable, an explanation does not exist only if the structure of the query is not compliant with the constraints expressed in the ontology. That is, for all the interpretationsJ ofqwithans(q,J)6=∅, there is no modelIofO, such thatI ∪ J |=O. To check whether a UCQ is compliant with the ontological constraints, a naïve method is to iteratively go through all the CQs inqand instantiate them in the ABox. If for none of the CQs we obtain a consistent ontology, then the query violates some of the constraints imposed at the conceptual level.

(6)

Proposition 2. For DL-Lite_Aontologies,EXISTENCEisPTIME-complete.

Proof. (MEMBERSHIP) Note thatP =hO, q,ci, withqa UCQ, has a solution iffPq⁰ = hO, q⁰,cihas a solution for someq⁰ ∈ q. Hence, it suffices to show the upper bound for CQs. To this end, we provide a logspace reduction fromEXISTENCEto consistency inDL-LiteAwithout UNA, which in turn is PTIME-complete [1]. Assume a QAPP = hO, q,ci, whereqis a CQ andO= hT,Ai. We argue thatP has a solution iffO⁰ = hT ∪ T⁰,A ∪ Uq∪ A⁰i is consistent, whereO⁰ is an ontology obtained fromO and q(c)as follows. The ABoxUq is obtained from at(q(c))by replacing each variable xwith a fresh individual namea_x. The ABoxA⁰ consists of assertionsA_o(o)for all constantso occurring inT,AandUq, where eachA_o is a fresh concept name. The TBoxT⁰consists of axiomsA_ov ¬A_o⁰for all pairso6=o⁰of constants occurring inA andq(c). We now show thatPhas a solution iffO⁰is consistent.

Assume thatP has a solutionU. Then, due to consistency ofO⁰⁰ = (T,A ∪ U), there is a modelIofO⁰⁰under the UNA. Additionally,Iadmits a matchπforq(c). Let I⁰ be the extension ofIthat additionally interprets(i)constants inUq asaÎ_x⁰ =π(x) for all variablesxinq, and(ii)AÎ_o⁰ ={oÎ}for all freshly introducedAo. It remains to show thatI⁰is a model ofO⁰. Observe that sinceIis under the UNA, we have that AÎ_o⁰∩AÎ_o0⁰ =∅for all constant pairso6=o⁰. ThusI⁰satisfies all the disjointness axioms Aov ¬Ao⁰ inT⁰. The assertions inA⁰are satisfied due to(ii), while the assertions in Uqdue to(i)above.

The other direction of the proof is obvious and we omit it here.

(HARDNESS) Let us reduce consistency inDL-LiteAwithout UNA toEXISTENCE. GivenO=hT,Ai, we create QAPP =hO⁰, q(),hiisimply by encoding the ABoxA in the CQqby replacing each constanta∈ Aby a distinct variable namex_ainq, while

the ontologyO⁰consists only of the TBoxT. ut

4.2 Complexity of (-)NECESSITY

The existence of an explanation is most of the times not very informative to the user.

In fact, given a negative answer to a query, it is important to delineate the fundamental reasons leading to the absence of the expected tuple. That is, users would like to know which assertions occur in all the solutions to a QAPP.

Proposition 3. For DL-Lite_Aontologies, theNECESSITYand⊆-NECESSITYproblems arePTIME-complete.

Proof. (MEMBERSHIP) We assume a QAPP =hO, q,ciwithO=hT,Ai, an asser- tionϕ(t)and considerNECESSITYfirst. This problem can be reduced to non-EXISTENCE, which was shown to be in PTIMEin the previous section. We buildO⁰ =hT⁰,A⁰isuch thatϕ(t)occurs in all explanations forP iffP⁰ = hO⁰, q,cihas no explanation. We defineO⁰ by settingA⁰ =A ∪ {ϕ(t)}¯ andT⁰ =T ∪ {ϕ¯v ¬ϕ}, whereϕ¯is a fresh predicate name. It is easy to see that ifP⁰has no explanation, then eitherP has no explanation as well, or, all the explanations forP must containϕ(t). For⊆-NECESSITY, observe thatϕ(t)occurs in all explanation forP iffϕ(t)occurs in all⊆-minimal explanations for P. Thus ⊆-NECESSITY can be decided in polynomial time using our algorithm forNECESSITY.

(7)

(HARDNESS) The lower-bound can be proved through a logspace reduction from

EXISTENCEto non-NECESSITY, that is, deciding whether there exists a solution to a QAP that does not contain the given assertion. LetP = hO, q,cibe a QAP with q a CQ, we buildhP, αisuch thatP has a solution iffhP, αiis a negative instance to

NECESSITY. Letα = A(o), for some fresh conceptA and constantonot occurring inP. Now, assumeP has a solutionU. By Proposition 1, we know that there exists a solutionU⁰toP containing only concept and role names fromO. Hence,A(o)6∈ U⁰, since concept nameAis not in the ontology. Therefore, one can conclude thatA(o)is not a necessary assertion toP.

The other direction is straightforward. ut

Let us now show the complexity of necessity under the≤preference order.

Theorem 1. For DL-LiteAontologies,≤-NECESSITYis P^NP_k -complete.

Proof. (MEMBERSHIPonly,HARDNESSin [8]) Let’s assume a QAPP =hO, q,ciand an assertionα. By the use of canonical explanations, we know that the size mof the largest solution toPcorresponds to the size of the largest CQ inq. Observe that(P, α) is a negative instance of≤-NECESSITYiff there is an0 ≤i ≤msuch that(a)P has an explanationU with|U | =iandα 6∈ U, and(b)U is≤-minimal. Thus, we use an auxiliary problemSIZE-OUT, which is to decide given a tuplehP⁰, α⁰, n⁰i, whereP⁰is a QAP,α⁰ is an assertion, andn⁰ is an integer, whether there exists an explanationU⁰ forP⁰ such that|U⁰|=n⁰andα⁰ 6∈ U⁰. Furthermore, the problemNO-SMALLERis to decide given a tuplehP⁰, n⁰iof a QAP and an integer whether there is no explanation U⁰ forP⁰ such that|U⁰| < n⁰. Observe thatSIZE-OUTis in NP, whileNO-SMALLER

is in CONP. Take the tupleS = hA0, B0, . . . , Am, Bmi, where(a) Ai = (P, α, i), for all 0 ≤ i ≤ m, and (b) Bi = (P, i), for all 0 ≤ i ≤ m. Due to the above observation,αoccurs in all≤-minimal explanationsU forPiff for all0≤i≤m, one of the following holds:(i)Aiis a negative instance ofSIZE-OUT, or(ii)Biis a negative instance ofNO-SMALLER. Note thatScan be built in polynomial time in the size of the input, while the positivity of the instances inScan be decided by making2mparallel calls to an NP oracle. Thus we obtain membership in P^NP_k . ut

4.3 Complexity of-RELEVANCE

A domain user faced with a negative answer to a query may ask herself whether, the absence of a certain ABox assertionαin the ontology is related with the lack of the tuple in the results. That is, she would like to know whetherαoccurs in some explanation to QAPP.

Proposition 4. For DL-LiteAontologies,RELEVANCEisPTIME-complete.

Proof. (MEMBERSHIP) We assume a QAPP = hO, q,ciwithO = hT,Aiand an assertionφ(t). We now provide a reduction fromRELEVANCEtoEXISTENCE. We con- structO⁰=hT,A⁰i, whereA⁰=A∪φ(t). Then,Phas an explanationUwithφ(t)∈ U iff there exists an explanation toP⁰ = (O⁰, q,c). This is because, any explanation to P⁰can be extended by addingφ(t). It is simple to see that any such explanation is an

(8)

explanation toP, as well. Finally, ifP⁰ does not admit explanations, then φ(t)is a source of inconsistency inP.

(HARDNESS) Hardness can again be proved with a reduction fromEXISTENCEin a

similar way as done in Section 4.2. ut

Let us now tackle the problem of⊆-RELEVANCE, for which we recall the FOL- rewritability of query answering inDL-LiteA. Given an ABoxA, letDB(A)be the following interpretation:(i)∆^DB(A)is the set of constants occurring inA,(ii)A^DB(A)= {o∈∆^DB(A)|A(o)∈ A}, for each atomic conceptA, and(iii)P^DB(A)={ho, o⁰i ∈

∆^DB(A)|P(o, o⁰)∈ A}, for each atomic roleP.

Proposition 5 ([7]). Let PerfectRef(q,T)be the perfect reformulation ofq w.r.t. T, which is a UCQ obtained by applying the rewrite rules given in [7]. Then,cert(q,O) = S

r∈PerfectRef(q,T)ans(r, DB(A)).

In other words, the certain answers to a UCQ can be computed by rewriting the query into a FOL query to be evaluated over the ABox.

Theorem 2. For DL-LiteAontologies,⊆-RELEVANCEisΣ₂^P-complete.

Proof. (MEMBERSHIP) The membership inΣ₂^Pis clear from Algorithm 1, which works as follows. An explanation U containing φ(t) is non-deterministically computed by guessing an instantiation of a subquery inPerfectRef(q(c),T), whereAnonis a set of fresh ABox individuals (see Proposition 1). Let HAS-SUBEXPLsolve the problem of deciding whether a solution U has a subset which is itself an explanation. The problem can be easily proved to be in NP. Then, the algorithm checks the complement of HAS-SUBEXPLin order to assure that none of the subsets of U is itself an explanation, from which it follows that φ(t) is ⊆-relevant. Checking the complement of

HAS-SUBEXPLrequires the power of aCONP machine. For this reason, the algorithm is solvable in non-deterministic polynomial time by a TM with an NP oracle.

(HARDNESS) We prove it by a reduction from theΣ₂^P-complete problem co-CERT3COL

[15] (see also [4]). An instance of co-CERT3COLis given by a graphG= (V, E)with vertices V = {0, . . . , n−1} such that every edge is labeled with a disjunction of two literals over the Boolean variables{p(i,j) | i, j < n}.Gis a positive instance if there is a truth value assignmentt to the Boolean variables such that the grapht(G)

Algorithm 1

INPUT: QAPP=hq,O,ciand ABox assertionφ(t) OUTPUT: yes iffφ(t)is relevant toP

1: Guessqi∈ {q1, . . . , qn}=q

2: Guess the derivation of one rewritingr(c)inPerfectRef(qi(c),T) 3: Guess a set of atomsU ⊆at(r)

4: Guess a mappingπfromV(q)to constants inDB(A)andAnon 5: Check that(T,A ∪ U)6|=⊥, whereUis the instantiation ofUthroughπ.

6: Check thatφ(t)∈ Uandπis a match forr(c)overDB(A ∪ U) 7: Check thatHAS-SUBEXPL(P,U) =no.

(9)

obtained fromGby including only those edges whose label evaluates to true undert is not 3-colorable. Assume an instanceGof co-CERT3COL. We show how to build in polynomial time a QAPPG = h(TG,AG), qG,cGiand an ABox assertion αG such that:Gis a positive instance of co-CERT3COL iffαG is⊆-relevant forPG. We use an empty TBox and a Boolean query, thusTG = ∅ andcG = hi. The query qG is a UCQ q_G = q_e₁ ∪ · · · ∪q_e_k ∪q⁰, where {e1, . . . , e_k} = E, each q_e_i is an atomic queryq_e_i()← W_e_i(x, y), andq⁰is defined as follows. AssumeB ={t, f} to be the set of truth values. The queryq⁰has the following atoms for each edgee= (i, j)inE:

(a)B(x_i),R_e(x_i, y_e),R_e(y_e, x_j),B(x_j), and(b)P(y_e, z_p_i),A_p_i(z_p_i),W_p_i(z_p_i, z⁰_p

i), wherep_i ∈ {p₁, p₂}andp₁, p₂are the first and the second proposition in the labeling ofe, respectively. The queryq⁰simply incorporatesGtogether with the disjunctions on the edges. Observe that if two edges have the same proposition in their label, then this will be reflected inq⁰by some shared variableszp_i.

To buildAG we use individualscp andc_¬p for the truth value of proposition p.

Intuitively, the truth value ofpwill be determined by eitherAp(cp)orAp(c_¬p)being in the update. Assume a tuplet=he, v1, v2, a, bi, wheree∈E,{v1, v2} ⊆ B, anda, b are individuals. Letp1, p2be the first and the second propositions ofe. Fori∈ {1,2}

andvi=t, letli=piifpiis positive andli=¬piotherwise. Similarly, fori∈ {1,2}

andvi =f, letli =¬pi ifpi is positive andli =piotherwise. Then, the ABoxA(t) consists of the assertionsR_e(a, d_T),R_e(d_T, b),P(d_T, c_l₁)andP(d_T, c_l₂)depending on the boolean values in input.

The ABoxA_Gis the union of the following ABoxes:

(A1) A(he, v, v⁰, ai, aji)for alle∈E,v, v⁰∈ B,0≤i, j≤2, andi6=j;

(A2) A(he, f, f, ai, aii)for alle∈E, and0≤i≤2;

(A3) A(he, v, v⁰, b, bi)for alle∈E,v, v⁰ ∈ B;

(A4) The ABox{B(a0), B(a1), B(a2)};

(A5) The assertionsWp(cp, c_¬p)andWp(c_¬p, cp)for all propositions.

LetαG = B(b). It is not too difficult to see that Gis a positive instance of co-

CERT3COL iff there exists an ⊆-explanation U to P such that αG ∈ U. Basically, definitions (A1)-(A3) encode a triangular structureTin which edges inGthat evaluate to false according to a given truth assignment can be mapped on any edge ofT, reflexive edges included. If an edge ofGevaluates to true, then it must be mapped to one of the non-reflexive edges. This ensures that ifGcan be mapped toT under truth assignment t, thent(G)is 3-colorable. Instead, definitions (A4)-(A5) define a cyclic structureC into which any graphGcan be embedded. It has to be noted that the nodeb is not asserted to be a member ofB, henceqGcannot be mapped there directly with any truth assignment. We see this more formally next:

“⇒” Suppose there is a truth assignmenttsuch thatt(G)is not 3-colorable. Let U ={B(b)}∪U1, whereU1={Ap(cp)|t(p) =t}∪{Ap(c_¬p)|t(p) =f}. It remains to argue thatU is a⊆-explanation toP. It is not hard to see thatU is an explanation.

IndeedqGmatches already in the ABox obtained by point (A3) (hint: sinceB(b)∈ U, we matchqGby mapping all variables ofqG to (interpretation of)b). Suppose there is a smaller updateU⁰ ⊂ U. Observe thatU1⊆ U⁰. This is because for all propositionsp, the symbolApdoes not occur inAGbut does occur inqG. Then,U \ {B(b)}must be an

(10)

update. If this is the case, thenqGcan be matched inAGwithout the ABox from (A3), i.e. in the triangle part. Thent(G)is 3-colorable, which contradicts the assumption.

“⇐” LetU be a⊆-minimal explanation containingB(b). Due to the presence of qe₁ ∪. . . ∪qe_k inqG and the assumption, the role Wp cannot occur in U for any proposition p. SinceU is an explanation, by the definition of q⁰ and (A5) we have thatA_p(c_p) ∈ U orA_p(c_¬p) ∈ U for all propositionsp. Since for any propositionp we have thatA_poccurs inq_Gwith one and only variablez_p, we know that exactly one ofA_p(c_p) ∈ U andA_p(c_¬p) ∈ U holds. Due to the atomsW_p(z_p, z_p⁰)inq_G, we also have that individuals of the formc_pandc_¬pare the only ones that can get anA_plabel.

Consider the assignmenttdefined as follows:t(e) = tifA_p(c_p) ∈ U, andt(e) = f ifA_p(c_¬p)∈ U. It is not difficult to argue thatt(G)is not 3-colorable and thusGis a positive instance of co-CERT3COL. Indeed, ift(G)was 3-colorable,Qshould be map- pable into the triangle part obtained in (A1)-(A3). ThenU \ {B(b)}would be a smaller

update, which would mean a contradiction. ut

Note that the above lower bound holds already for empty TBoxes.

Proposition 6. For DL-Lite_Aontologies,≤-RELEVANCEis P^NP_k -complete.

Proof. (MEMBERSHIPonly,HARDNESSin [8])≤-RELEVANCEcan be tackled in a way similar to≤-NECESSITY. In fact, the algorithm described in Theorem 1 can be modi- fied in order to solve this problem. LetSIZE-IN solve the following problem: given a tuplehP, α, ni, where P is a QAP,αan assertion, andnan integer, decide whether there exists an explanationU, with|U |=nandα∈ U. Then, we change the positivity condition of the≤-NECESSITYalgorithm as follows:αoccurs in some≤-minimal ex- planationsU forPiff for somei,0≤i≤m, it holds that: (i)Aiis a positive instance of SIZE-IN, and (ii)Bi is a positive instance of NO-SMALLER.It is easy to see that

SIZE-INis solvable in NP, hence the whole problem is again in P^NP_k . ut

5 Conclusions

In this paper we have provided the formalization of a new problem, namely the explanation of negative answers to user queries over ontologies. A tuple is said to be a negative answer, if the user expects it to be part of cert(q,O)but the tuple is actually not. In our framework, an explanation consists of an ABox that when added to the ontology leads the negative answer to be returned in the results of the query. We define various problems that help us in characterizing the complexity of finding explanations, such as the existence of explanations and relevance/necessity of assertions. We further consider a minimality criterion to be applied over explanations, such as subset-minimal and minimum-size preference orders. Within this framework, we provide a characterization of the computational complexity of the various problems for the DLDL-Lite_A.

Future work includes studying the application of this framework to other lightweight description logics, starting with theEL-family. We would also like to investigate the problem in the case where the ontology signature and the explanation signature may be different. That is, the signature over which explanations can be constructed is restricted only to a subset of the ontology signature [2].

(11)

References

1. A. Artale, D. Calvanese, R. Kontchakov, and M. Zakharyaschev. TheDL-Litefamily and relations.J. of Artificial Intelligence Research, 36:1–69, 2009.

2. F. Baader, M. Bienvenu, C. Lutz, and F. Wolter. Query and predicate emptiness in description logics. InProc. of the 12th Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR 2010), 2010.

3. F. Baader and R. Peñaloza. Axiom pinpointing in general tableaux.J. of Logic and Compu- tation, 20(1):5–34, 2010.

4. P. A. Bonatti, C. Lutz, and F. Wolter. The complexity of circumscription in description logics.

J. of Artificial Intelligence Research, 35:717–773, 2009.

5. A. Borgida, D. Calvanese, and M. Rodríguez-Muro. Explanation in theDL-Litefamily of description logics. InProc. of the 7th Int. Conf. on Ontologies, DataBases, and Applications of Semantics (ODBASE 2008), volume 5332 ofLecture Notes in Computer Science, pages 1440–1457. Springer, 2008.

6. A. Borgida, E. Franconi, and I. Horrocks. ExplainingALCsubsumption. InProc. of the 14th Eur. Conf. on Artificial Intelligence (ECAI 2000), 2000.

7. D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, A. Poggi, M. Rodríguez-Muro, and R. Rosati. Ontologies and databases: TheDL-Liteapproach. In S. Tessaris and E. Franconi, editors,Semantic Technologies for Informations Systems – 5th Int. Reasoning Web Summer School (RW 2009), volume 5689 ofLecture Notes in Computer Science, pages 255–356.

Springer, 2009.

8. D. Calvanese, M. Ortiz, M. Šimkus, and G. Stefanoni. The complexity of conjunctive query abduction in DL-Lite. Technical Report INFSYS RR 1843-11-01, Institute of Informa- tion Systems, Vienna University of Technology, 2011. Available at:http://www.kr.

tuwien.ac.at/research/reports/.

9. A. Chapman and H. V. Jagadish. Why not? InProc. of the ACM SIGMOD Int. Conf. on Management of Data, pages 523–534, 2009.

10. T. Eiter and G. Gottlob. The complexity of logic-based abduction.J. of the ACM, 42(1):3–42, 1995.

11. S. Klarman, U. Endriss, and S. Schlobach. ABox abduction in the description logicALC.J.

of Automated Reasoning, 46(1):43–80, 2011.

12. D. L. McGuinness and A. Borgida. Explaining subsumption in description logics. InProc.

of the 14th Int. Joint Conf. on Artificial Intelligence (IJCAI’95), pages 816–821, 1995.

13. C. H. Papadimitriou.Computational Complexity. Addison Wesley Publ. Co., 1994.

14. R. Penaloza and B. Sertkaya. Complexity of axiom pinpointing in theDL-Litefamily. In Proc. of the 23rd Int. Workshop on Description Logic (DL 2010), volume 573 of CEUR Electronic Workshop Proceedings,http://ceur-ws.org/, 2010.

15. I. A. Stewart. Complete problems involving boolean labelled structures and projection trans- actions. J. of Logic and Computation, 1(6):861–882, 1991.