Inconsistency-Tolerant First-Order Rewritability of DL-Lite with Identification and Denial Assertions

(1)

Inconsistency-Tolerant First-Order Rewritability of DL-Lite with Identification and Denial Assertions

Domenico Lembo, Maurizio Lenzerini, Riccardo Rosati, Marco Ruzzi, and Domenico Fabio Savo

Dipartimento di Ingegneria Informatica, Automatica e Gestionale Antonio Ruberti Sapienza Universit`a di Roma

lembo,lenzerini,rosati,ruzzi,savo@dis.uniroma1.it

Abstract. This paper is motivated by two requirements arising in practical applications of ontology-based data access (OBDA): the need of inconsistency- tolerant semantics, which allow for dealing with classically inconsistent speci- fications; and the need of expressing assertions which go beyond the expressive abilities of traditional Description Logics, namely identification and denial assertions. We consider an extension of DL-Lite (the most used DL in OBDA) which allows for the presence of both the aforementioned kinds of assertions in the TBox. We adopt an inconsistency-tolerant semantics for such a DL, specifically the so-called Intersection ABox Repair (IAR) semantics, and we study query answering in this setting. Our main contribution is a query rewriting technique which is able to take into account both identification and denial assertions. By virtue of this technique, we prove that conjunctive query answering under such semantics is first-order rewritable in the considered extension of DL-Lite.

1 Introduction

Ontology-based data access (OBDA) is a computing paradigm which considers the problem of accessing data stored in autonomous databases through the use of an ontology, which provides a conceptual and shared representation of the domain of interest [10]. The main components of an OBDA system are the ontology, the set of data sources to be accessed, and the mapping, which specifies the relationship between the ontology and the data sources. The reasoning service of main interest is query answering, which amounts to process a query expressed in terms of the ontology alphabet, and to compute its answer on the basis of the knowledge specified by the ontology, the mapping, and the data stored at the sources. Various languages for specifying the ontology in this context have been recently proposed [3,2,7], which are designed in order to allow for tractable query answering w.r.t. the size of the data (data complexity). Among these, the members of the DL-Litefamily of light-weight Description Logics (DLs) present the distinguishing feature offirst-order rewritable query answeringfor unions of conjunctive queries (UCQs), which means that such a reasoning service can be reduced to the evaluation of a first-order query (directly translatable into SQL) over a database.

This turns out to be a crucial aspect in OBDA, where data are in general not moved from the sources, and are usually managed by a Relational Database Management System.

In the last years we have experimented (see, e.g., [11]) that two important requirements arise in OBDA applications:(i)the need of declarative mechanisms for dealing

(2)

with data that are inconsistent with respect to the intensional knowledge specified by the ontology, and(ii)the need of expressing assertions which go beyond the expressive abilities of traditional DLs, namelyidentification assertionsanddenial assertions. Solu- tions aiming at fulfilling these two requirements should of course still guarantee enough efficiency. Ideally, they should allow for inconsistency-tolerant first-order rewritable query answering of UCQs. Devising one such solution is the goal of the present paper.

Motivated by the first mentioned requirement, we have recently proposed various inconsistency-tolerant semantics[8], inspired by the studies on inconsistency handling in belief revision [6] and by the work on consistent query answering in databases [5].

Later works [9,1] have then shown that answering UCQs under the so-called Inter- section ABox Repair (IAR) semanticsis first-order rewritable for ontologies specified inDL-LiteA, a member of theDL-Litefamily [9]. In the present paper we extend the above result, and show that first-order rewritability of conjunctive query answering still holds if we enrichDL-LiteAwith identification and denial assertions.

Identification assertions (IdAs) are mechanisms for specifying a set of properties that can be used to identify instances of concepts. IdAs we study here have been origi- nally presented in [4]. Such assertions allow for sophisticated forms of object identification, which may include paths realized through the chaining of roles, their inverses, and attributes. Roughly speaking, when we say that instances of a conceptCare identified by a pathπ, we impose that no two instances ofCwith overlapping sets ofπ-fillers exist, i.e., in every interpretationI, no two instancesoando⁰ofCexist with a non-empty intersection of the set of objects reachable byoand byo⁰ inI. This naturally extends to IdAs involving more than one path. Identification assertions are very useful in prac- tice and are essential to representn-relations and attributes of roles through reification.

Indeed, reification is the only way to modeln-relations and role attributes in ontology languages, such as OWL, which do not natively allow for such constructs.

Denial Assertions (DAs) are instead used to impose that the answer to a certain boolean conjunctive query over the ontology is false, analogous to negative constraints in [2]. This is particularly useful to specify general forms of disjointness, which is again not supported in traditional ontology languages.

The query rewriting algorithm given in this paper elegantly deals with all forms of inconsistency that can arise inDL-LiteAenriched with IdAs and DAs, thus it represents an alternative to the rewriting algorithm given in [9] forDL-LiteA only. Moreover, it properly manages all the inconsistencies caused by erroneous assignments of values to attributes, i.e., arising when a value of typeTiis assigned to an attribute of value-type Tjdisjoint fromTi. This kind of inconsistency has not been considered in [9].

In this paper, for the sake of simplicity, we consider the setting of a single ontology constituted by a TBox and an ABox. However, our technique can be easily extended to the OBDA setting, where mapping assertions relate the TBox to the data sources [10].

The paper is organized as follows. In section 2 we provide some preliminaries. In Section 3 we provide some general properties of theIAR-semantics, useful for the rest of the paper. In Section 4, we show first-order rewritability of query answering of UCQs under theIAR-semantics. In Section 5 we conclude the paper.

(3)

2 Preliminaries

Let Σ be a signature partitioned into ΣP, containing symbols for predicates, i.e., atomic concepts, atomic roles, attributes and value-domains, andΣC, containing symbols for individual (object and value) constants. Given a DL languageL, anL-ontology O =hT,AioverΣconsists of a TBoxT, i.e., a set of intensional assertions overΣ expressed inL, and an ABoxA, i.e., a set of extensional assertions overΣexpressed in L. We assume that ABox assertions are always atomic, i.e., they correspond toground atoms, and therefore we omit to refer toLwhen we talk about ABox assertions.

The semantics of a DL ontologyO is given in terms of first-order (FOL) interpretations. We denote with Mod(O) the set of models of O, i.e., the set of FOL- interpretations that satisfy all the assertions inT andA, where the definition of satisfac- tion depends on the DL language in whichOis specified. An ontologyOissatisfiable ifMod(O)6=∅, andOentailsa FOL-sentenceφ, denotedO |=φ, ifφis satisfied by everyI ∈Mod(O). The above definitions naturally apply to a TBox alone.

In the following, we consider DL ontologieshT,Aiin which the TBoxT is always satisfiable, whereas the ABoxAmay contradictT. When this happens, we say thatA isT-inconsistent,T-consistent otherwise. For ontologies of this kind, theIntersection ABox Repair (IAR)semantics introduced in [8] allows for meaningful reasoning in the presence of inconsistency, which is instead trivialized under the FOL-semantics. The IAR-semantics is defined in terms ofA-repairs: given an ontologyO =hT,Ai, anA- repairforOis a setA⁰⊆ Asuch thatMod(hT,A⁰i)6=∅, and there does not existA⁰⁰ such thatA⁰ ⊂ A⁰⁰⊆ AandMod(hT,A⁰⁰i)6=∅. In other terms, anA-repair forOis a maximalT-consistent subset ofA. The set ofA-repairs forOis denoted byAR-Set(O).

TheIAR-semantics is then given in terms ofIAR-models, as follows.

Definition 1. Let O = hT,Ai be a possibly inconsistent DL ontology. The set of Intersection ABox Repair (IAR)-models of O is defined as ModIAR(O) = M od(hT,T

Ai∈AR-Set(O)Aii)

Notice that ifOis satisfiable under FOL-semantics, thenMod(O) =ModIAR(O).

The notion of consistent entailment is then the natural extension of classical entailment toIAR-semantics: a FOL-sentenceφisIAR-consistently entailed, or simplyIAR- entailed, byO, writtenO |=_IARφ, ifI |=φfor everyI ∈ModIAR(O).

We focus now onDL-LiteA,id, a DL of theDL-Litefamily [3] equipped with identification assertions [4]. SinceDL-LiteA,iddistinguishes between object and value constants, we partition the setΣCinto two disjoint setsΣO, which is the set of constants denoting objects, andΣV, which is the set of constants denoting values.

BasicDL-LiteA,idexpressions are defined as follows:

B −→ A | ∃Q | δ(U)

C −→ B | ¬B

Q −→ P | P⁻

R −→ Q | ¬Q

E −→ ρ(U)

F −→ T1 | · · · | Tn

V −→ U | ¬U

A,P, andP⁻denote anatomic concept, anatomic role, and theinverse of an atomic role;δ(U)(resp.ρ(U)) denotes thedomain (resp. therange) of anattribute U, i.e., the set of objects (resp. values) thatU relates to values (resp. objects);T₁, . . . , T_n are unbounded pairwise disjoint predefined value-domains;Bis calledbasic concept.

(4)

ADL-LiteA,idTBox is a finite set of the following assertions:

BvC QvR U vV EvF (functQ) (functU) (idB π₁, . . . , π_n) The assertions above, from left to right, respectively denote inclusions between concepts, roles, attributes, and value-domains, global functionality on roles and on attributes, and identification assertions (IdAs). In IdAs, πi is apath, which is an expression built according to the following syntax:π −→ S | D? | π1◦π2, where S denotes an atomic role, the inverse of an atomic role, an attribute, or the inverse of an attribute,π1◦π2 denotes the composition of pathsπ1andπ2, andD?, calledtest relation, represents the identity relation on instances ofD, which can be a basic concept or a value-domain. Test relations are used to impose that a path involves instances of a certain concept or value-domain. In our logic, IdAs are local, i.e., at least one πi ∈ {π1, ..., πn}has length 1, i.e., it is an atomic role, the inverse of an atomic role, or an attribute. Intuitively, an IdA of the above form asserts that for any two different instanceso,o⁰ofB, there is at least oneπisuch thatoando⁰differ in the set of their π_i-fillers, that is the set of objects that are reachable fromoby means ofπ_i. For example, the IdA(idMatch homeTeam,visitorTeam)says that there are not two different matches with the same home team and host team (which is indeed what happens, for instance, in a season schedule of a football league).

Inclusion assertions that do not contain (resp. contain) the symbols ’¬’ in the right- hand side are calledpositive inclusions(resp.negative inclusions). The set of positive (resp., negative) inclusions inT will be denoted byT⁺(resp.,T⁻).

ADL-LiteA,id ABox is a finite set of assertions of the form A(a),P(a, b), and U(a, v), whereAis an atomic concept,P is an atomic role,U is an attribute,aandb are constants ofΣ_O, andvis a constant ofΣ_V. In aDL-LiteA,idontologyO=hT,Ai, the following condition holds: each role or attribute that either is functional inT or appears (in either direct or inverse direction) in a path of an IdA inT is not specialized, i.e., it does not appear in the right-hand side of assertions of the form Q v Q⁰ or U vU⁰.

The semantics ofDL-LiteA,id is given in [4]. We only recall here that each value- domain Ti is interpreted by means of the same function, denoted val(Ti), in every interpretation, and each constantc_i ∈Σ_V is interpreted as one specific value, denoted val(c_i). Also, the Unique Name Assumption is adopted.

Aboolean conjunctive query (BCQ)over aDL-LiteAontology is an expression of the formq=∃~y.conj(~t)where~yis a set of variables and~tis a set of terms (i.e., constants or variables) such that each variable in~tis also in~y, andconj(~t)is a conjunction of atoms of the formA(z),T(z⁰),P(z, z⁰)U(z, z⁰)whereAis an atomic concept,T is a value-domain,P is an atomic role andU is an attribute, andz,z⁰are terms. In the following, we will mainly considerboolean unions of conjunctive queries (BUCQs), i.e., first order sentences of the formQ=Wn

i=1q_iwhereq_i=∃y~_i.conj_i(~t_i)is a BCQ.

A BCQqis satisfied by an ABoxA(denoted byh∅,Ai |=q) if there exists a substitution σfrom the variables inq to constants of ΣC such that the formulaσ(q)is satisfied in the FOL-interpretation structure where the ABoxAdetermines the extension of concepts, roles and attributes and the functionvaldetermines the extension of the value-domains. With a little abuse of notation we sometimes useσ(q)to indicate the set of the atoms inσ(q). When queryqis satisfied byA, we say thatσ_A(q) =σ(q)∩ A

(5)

is theimage ofqinA. A BUCQQ=Wn

i=1q_iis satisfied by an ABoxAif there exists i∈ {1, . . . n}such thatqiis satisfied byA.

All the results of the present work can be straightforwardly extended tonon-boolean UCQs, by imposing the (natural) condition that every free variable in the query appears in at least one atom not involving value-domains.

With the notion of query in place we introduce nowdenial assertions (DAs). A DA is an expression of the form∀~y.conj(~t) → ⊥whereconj(~t)is defined as for BCQs.

A DA is satisfied by an interpretationIif the formula∃~y.conj(~t)evaluates tofalse inI. DAs are used to impose general forms of disjointness. For example, the denial

∀x, y.Match(x)∧homeTeam(x, y)∧visitorTeam(x, y)→ ⊥says that a match cannot have the same team as both home and visitor team.

In the following we will considerDL-LiteA,idontologies whose TBoxes are enriched with DAs, and call themDL-LiteA,id,denontologies.

3 Properties of the IAR-Semantics

We discuss here some properties of theIAR-semantics that will be used in the rest of the paper. We recall that theIAR-semantics is defined for DL ontologies composed by a consistent TBox and an ABox comprising only atomic assertions, therefore we consider below only this kind of ontologies. We start with the notion of inconsistent set.

Definition 2. LetT be a TBox and letAbe an ABox.Ais aminimalT-inconsistent set ifAisT-inconsistent, i.e., the ontologyhT,Aiis unsatisfiable, and there is noA⁰⊂ A such thatA⁰isT-inconsistent.

Let O = hT,Ai be a possibly inconsistent DL ontology. We denote with minIncSets(O)the set of minimalT-inconsistent sets contained inA. Notice thatO is satisfiable iffminIncSets(O) =∅. Then, the following property holds.

Proposition 1. LetO=hT,Aibe a DL ontology such thatminIncSets(O)6=∅, and letαbe an ABox assertion inA. An A-repairA⁰ofOsuch thatα6∈ A⁰ exists iff there existsV ∈minIncSets(O)such thatα∈V.

Let nowAR-Set(O) = {A1, . . .An} be the set ofA-repairs of an ontology O = hT,Ai, and letqbe a BCQ. From Definition 1, we derive thatO |=_IARqiff there exists a subsetA⁰ofAsuch thatA⁰ ⊆ Aifor every1≤i ≤nandhT,A⁰i |=q. From the observation above, we derive the following theorem, which characterizes the notion of IAR-entailment of BCQs.

Theorem 1. LetO=hT,Aibe a DL ontology such thatminIncSets(O)6=∅, and let qbe a BCQ.O |=IARqiff there existsA⁰ ⊆ Asuch that:(i)A⁰isT-consistent;(ii) hT,A⁰i |=q;(iii)A⁰∩V =∅for everyV ∈minIncSets(O).

4 Query Rewriting under IAR-Semantics in DL-Lite

_A,id,den

In this section, we focus onDL-LiteA,id,denontologies, and provide our technique for computing FOL-rewritings of BUCQs under theIAR-semantics. The notion of FOL- rewritability is essentially the same used under FOL-semantics [3]. More precisely,

(6)

BUCQs in DL-LiteA,id,den are FOL-rewritable under theIAR-semantics if, for each BUCQs qand each DL-LiteA,id,den TBoxT, there exists a FOL-queryqr such that, for any ABox A hT,Ai,|=IAR qiff h∅,Ai |= qr. Suchqr is called theIAR-perfect reformulationofqw.r.t.T.

To come up with our reformulation method, we exploit Theorem 1. Roughly speaking, in the reformulation of a BUCQqover aDL-LiteA,id,denTBoxT, we encode into a FOL-formula all violations that can involve assertions belonging to images ofqin any ABoxA. Indeed, this can be done by reasoning only on the TBox, and considering each query atom separately. Intuitively, we deal with inconsistency by rewriting each atomαofqinto a FOL-formulaαrin such a way thath∅,Ai |=αronly if there exists a substitutionσof the variables inqto constants ofΣC such thatqis satisfied byA, andσ(α)does not belong to any minimalT-inconsistent set.

This inconsistency-driven rewriting is then suitably casted into the final reformulation, which takes into account also positive knowledge of the TBox, i.e., the inclusion assertions inT⁺. As we will show later in this section, this can be done by means of a slight variation of the algorithmPerfectRefof [3,10]. To formalize the above idea, we need to introduce some preliminary definitions.

We consider Boolean queries corresponding to FOL-sentences of the form:

∃z1, . . . , zk.

n

^

i=1

Hi(t¹_i)∧

m

^

i=1

¬Ti(t²_i)∧

`

^

i=1

Si(t³_i, t⁴_i)∧

h

^

i=1

t⁵_i 6=t⁶_i (1)

where every Hi is an unary predicate, which is either an atomic concept or a value- domain, everyT_iis a value-domain, everyS_i is a binary predicate, which is either an atomic role or an attribute, everyt^j_i is a term (i.e., either a constant or a variable), and z1, . . . , zk are all the variables of the query. Notice that sentences of the form (1) are BCQs enriched with limited forms of inequalities and negation.

Given aDL-LiteA,id,denTBoxT, we denote withcontr(T)the set of all negative inclusions, functionalities, identifications, denials, and value-domain inclusions occurring inT. Intuitively, the setcontr(T)contains all those assertions inT which can be contradicted by assertions in the ABox. Now, to each assertionτ ∈contr(T), we associate a Boolean query, denotedϕ(τ), which encodes the violation ofτ, i.e., it looks for images of the negation ofτ. Below we provide some of such encodings. Missing cases are can be obtained in a similar way.

-ϕ((functP)) : ∃x, x1, x2.P(x, x1)∧P(x, x2)∧x16=x2

-ϕ(A₁v ¬A₂) : ∃x.A₁(x)∧A₂(x)

-ϕ(ρ(U)vT) : ∃x1, x2.U(x1, x2)∧ ¬T(x2) -ϕ(∀~x.conj(~t)→ ⊥) : ∃~x.conj(~t)

For example, ϕ((id Match homeTeam,visitorTeam)) = ∃x, x⁰, y, z.Match(x)∧ homeTeam(x, y) ∧ visitorTeam(x, z) ∧ Match(x⁰) ∧ homeTeam(x⁰, y) ∧ visitorTeam(x⁰, z) ∧ x 6= x⁰, and ϕ(∀x, y.Match(x) ∧ homeTeam(x, y) ∧ visitorTeam(x, y)→ ⊥) =∃x, y.Match(x)∧homeTeam(x, y)∧visitorTeam(x, y).

Then, the algorithmunsatQueries(T)computes the first-order reformulation under FOL-semantics of ϕ(τ) for each τ ∈ contr(T), and returns the union of all such reformulations. To this aim, unsatQueries(T) makes use of the algorithm

(7)

PerfectRef6=(T⁺, ϕ(τ)), which is a slight variation of the algorithmPerfectRefgiven in [3]. In this modified version, inequality is considered as a primitive role and negated value-domains are considered as primitive concepts, thus inequality and negated atoms are never rewritten by the algorithm, and the algorithm does not unify query atoms if this causes a violation of an inequality. Note that the result ofPerfectRef6=is a union of boolean queries of the form (1), represented as a set of queries, as usual.

For example, assume to have a TBox containing the above mentioned IdA τ₁ : (id Match homeTeam,visitorTeam) and the inclusion assertion playedMatch v Match. The algorithmPerfectRef6=(T⁺, ϕ(τ1))returns, among others, the query:

∃x, x⁰, y, z.playedMatch(x)∧homeTeam(x, y)∧visitorTeam(x, z)∧playedMatch(x⁰)∧

homeTeam(x⁰, y)∧visitorTeam(x⁰, z)∧x6=x⁰

Notice that the queryϕ(τ1)cannot be rewritten by unifyingMatch(x)andMatch(x⁰) because of the inequalityx6=x⁰. Such an inequality actually causes that in this example the algorithm can never rewrite queries through unification.

LetO =hT,Aibe aDL-LiteA,id,denontology. SinceOis satisfiable iff there are noT-inconsistent sets inA, the following theorem states that the query produced by the algorithmunsatQueries(T)can be used to check satisfiability ofO.

Theorem 2. LetO=hT,Aibe a DL-LiteA,id,denontology. AT-inconsistent setV ⊆ Aexists iffh∅,Ai |=unsatQueries(T).

Ifh∅,Ai |=unsatQueries(T), then there existsq∈unsatQueries(T)such that h∅,Ai |= q. This implies that there exists a substitutionσfrom the variables inqto constants ofΣ_C such thatσ(q)projected over its positive atoms is contained inA.

Similar to BCQs, we call this set the image of qinA, denotedσ_A(q). It is possible to show thatσ_A(q)is aT-inconsistent set contained inA(in fact, this is part of the proof of Theorem 2). In order to exploit Theorem 1 towards the definition of a FOL- rewriting procedure, we need however to identify those assertions inAthat participate to aminimalT-inconsistent set. From the definition of minimalT-inconsistent set, and from Theorem 2 it follows that σ_A(q) is a minimalT-inconsistent set iff for every V ⊂σA(q)and every queryq⁰ ∈unsatQueries(T), we have thath∅, Vi 6|=q⁰.

Based on the above observations, we introduce the algorithm minUnsatQueries(T) which, starting from the set unsatQueries(T), computes a new setQmin of queries, which enjoys the following properties:(i)For each query q ∈ Qmin, h∅,Ai |= q iff there exists in unsatQueries(T) a query q⁰ such that h∅,Ai |= q⁰. This guarantees that Theorem 2 also holds withminUnsatQueries(T) in place ofunsatQueries(T).(ii)For each queryq∈ Q_min, ifh∅,Ai |=q, then for every imageσ_A(q)ofqinA,h∅, Vi 6|= q⁰, whereV ⊂ σ_A(q)andq⁰ ∈ Qmin. This guarantees that if a queryq∈ Qmin is such thath∅,Ai |=q, then every image ofqin Ais a minimalT-inconsistent set.

Before presenting the algorithmminUnsatQueries(T), we need to introduce some preliminary notions. Given a queryq, we say that a termtoccurs in anobject position ofqifqcontains an atom of the formA(t),P(t, t⁰),P(t⁰, t),U(t, t⁰), whereas we say thattoccurs in avalue positionofqifqcontains an atom of the formT(t),U(t⁰, t), whereA,P,U, andT have the usual meaning. Given two termst₁andt₂occurring in a queryq, we say thatt1andt2arecompatibleinqif either botht1andt2appear only

(8)

in object positions ofqor botht₁andt₂appear only in value positions ofq. Moreover, letqandq⁰be two boolean queries. We say thatqis aproper syntactical subsetofq⁰, writtenq≺Rn q⁰if there exists a renaming functionRn(q, q⁰)of the variables inqto the variables inq⁰, such that every atomS(~t)occurring inRn(q, q⁰)occurs also inq⁰and an analogous renaming fromq⁰toqdoes not exists. The algorithmminUnsatQueries(T) is given below.

Algorithm:minUnsatQueries(T) Input: aDL-LiteA,id,denTBoxT Output: a set of queries

1 Q⁰←unsatQueries(T);

2 Q⁰⁰←saturate(Q⁰);

3 Q⁰⁰⁰← ∅;

4 whileQ⁰⁰⁰6=Q⁰⁰do

5 Q⁰⁰⁰← Q⁰⁰;

6 foreachq∈ Q⁰⁰do

7 foreachatomU(t, t⁰)inqdo

8 if there exist no atomsT(t⁰)and¬T(t⁰)inqthen

9 foreachvalue-domainTin{T1, . . . Tn}do

10 Q⁰⁰← Q⁰⁰\ {q};

11 Q⁰⁰← Q⁰⁰∪ {q∧T(t⁰)};

12 Q⁰⁰← Q⁰⁰∪ {q∧ ¬T(t⁰)};

13 foreachq∈ Q⁰⁰⁰do

14 foreachtermtoccurring inqdo

15 ifbothTi(t)andTj(t)occur inq, withi6=jthen Q⁰⁰⁰← Q⁰⁰⁰\ {q};

16 foreachqandq⁰inQ⁰⁰⁰do

17 ifq≺Rnq⁰then Q⁰⁰⁰← Q⁰⁰⁰\ {q⁰};

18 returnQ⁰⁰⁰

The algorithm proceeds as follows.

Step 1(line 1): the algorithm initializesQ⁰to the setunsatQueries(T).

Step 2 (line 2): the algorithm computes the setQ⁰⁰ through the algorithmsaturate.

Starting from each queryq ∈ Q⁰, such an algorithm first unifies pairs of compatible terms inqin all passible ways; then, for any queryq⁰ computed in this way, for each pair of termst₁ andt₂ occurring inq⁰ that are syntactically different, it adds the inequality atom t₁ 6= t₂ toq⁰; finally it returns the union of Q⁰ with this new set of queries. Notice that such an operation is sound and complete with respect to the set of queries inQ⁰: in particular, no answer to the set of queries is lost by this transformation, since, for every pair ofcompatibletermst₁, t₂which are forced to be not equal inq, there is a queryq⁰ inQ⁰ where the same terms have been unified. For instance, assume that Q⁰ contains only the queryq : ∃x, y.Match(x)∧homeTeam(x, y)∧ visitorTeam(x, y). Then, the algorithm saturate returns a set constituted by the queriesq₁ : ∃x, y.Match(x)∧homeTeam(x, y)∧visitorTeam(x, y)∧x 6= yand q₂: ∃x.Match(x)∧homeTeam(x, x)∧visitorTeam(x, x).

Step 3(lines 3-12): let{T1. . . Tn} be the set of value-domains. The algorithm computes the setQ⁰⁰⁰by producing from each queryq∈ Q⁰⁰the following queries: for each atomU(x, y), whereU is an attribute name, if no atomsT(y)appears inq, whereT is a value-domain, then the algorithm builds2nnew queries by substituting the atom U(x, y)with either the conjunction of atomsU(x, y)∧Ti(y)orU(x, y)∧ ¬Ti(y), for

(9)

each 1 ≤ i ≤ n. In this way, every possible combination is computed. It is easy to verify that such a transformation is also sound and complete. Step 3 is motivated by the need to provide a complete comparison in Step 5, as explained below.

Step 4(lines 13-15): the algorithm removes fromQ⁰⁰⁰every queryqin which a termt occurs in two atoms of the formT₁(t)andT₂(t)(T₁andT₂are disjoint, and therefore for each constantcinΣV,T1(c)∧T2(c)is a contradiction).

Step 5(lines 16-17): the algorithm removes from the setQ⁰⁰⁰ each queryq⁰such that there is inQ⁰⁰⁰a different queryqwhose atoms form, up to renaming of the variables in q, a proper subset of the atoms appearing inq⁰. This simplified form of query contain- ment guarantees that for every ABoxAand every queryqinQ⁰⁰⁰, there does not exist a queryq⁰inQ⁰⁰⁰, such that for every imageσ(q)ofqinA,h∅, Vi |=q⁰whereV ⊂σ(q).

Step 6(line 18): the algorithm terminates by returning the setQ⁰⁰⁰.

Let us now continue the example given at Step 2. Assume that Q⁰ contains also the query q3 : ∃x.homeTeam(x, x) which is associate to the denial assertion

∀x.homeTeam(x, x) → ⊥. Obviously, besides queryq1andq2,Q⁰⁰contains alsoq3, which is natively in a “saturated” form. Step 3 and Step 4 have no effect in this example, since none of the queries in Q⁰⁰ has atoms with attributes or value-domains as predicate. Therefore,Q⁰⁰⁰ = Q⁰⁰, and Step 5 eliminates queryq2 fromQ⁰⁰⁰, since q3 ≺Rn q2. Notice that this step is crucial to ensure that every image of a query inQ⁰⁰⁰ in any ABoxA is aminimal T-inconsistent set. Indeed, let us, for instance, poseA = {Match(a),homeTeam(a, a),visitorTeam(a, a)};q₂ has an image inA, which is A itself, which is also a T-inconsistent set, but not minimal: indeed, the set {homeTeam(a, a)} ⊂ A is a also T-inconsistent set, since it violates the DA

∀x.homeTeam(x, x) → ⊥. This set is also minimal, and is indeed an image inA of the queryq₃.

The following lemma states that the algorithmminUnsatQueries(T)can be used to check if aDL-LiteA,id,denontologyO=hT,Aiis consistent.

Lemma 1. LetO=hT,Aibe a possibly inconsistent DL-LiteA,id,denontology. Then, h∅,Ai |=minUnsatQueries(T)iffh∅,Ai |=unsatQueries(T).

The following crucial lemma guarantees instead that one can use the queries produced by the algorithmminUnsatQueries(T)in order to compute every minimalT- inconsistent set inA.

Lemma 2. LetO = hT,Aibe a possibly inconsistent DL-LiteA,id,den ontology, and letqbe a query inminUnsatQueries(T). Ifh∅,Ai |=q, then every image ofqinAis a minimalT-inconsistent set.

Letα,βbe two atoms. We say thatβis compatible withαif there exists a mapping µ of the variables occurring in β to the terms occurring inα such thatµ(β) = α (and in this case we denote the above mappingµ with the symbolµα/β). Given an atomαand a queryq, we denote byCompSet(α, q)the set of atoms ofqwhich are compatible withα. Then, letT be aDL-LiteA,id,den TBox and letαbe an atom, we defineMinIncSet^T(α)as follows.

MinIncSet^T(α) = _

q∈minUnsatQueries(T)∧CompSet(α,q)6=∅



 _

β∈CompSet(α,q)

µα/β(q)





(10)

The following key property holds.

Theorem 3. LethT,Aibe a possibly inconsistent DL-LiteA,id,denontology, and letα be an ABox assertion. There exists a minimalT-inconsistent setV inAsuch thatα∈V iffh∅,Ai |=MinIncSet^T(α).

LetT be aDL-LiteA,id,denTBox and letqbe a BCQ of the form

∃z1, . . . , zk.

n

^

i=1

Ai(t¹_i)∧

m

^

i=1

Pi(t²_i, t³_i)∧

`

^

i=1

Ui(t⁴_i, t⁵_i) (2)

where all symbols have the usual meaning. We denote byIncRewrIAR(q,T)the following FOL-sentence:

IncRewr_IAR(q,T) =∃z₁, . . . , z_k.

n

^

i=1

A_i(t¹_i)∧ ¬MinIncSet^T(A_i(t¹_i))∧

m

^

i=1

Pi(t²_i, t³_i)∧ ¬MinIncSet^T(Pi(t²_i, t³_i))∧

`

^

i=1

Ui(t⁴_i, t⁵_i)∧ ¬MinIncSet^T(Ui(t⁴_i, t⁵_i))

Let Q be a set of CQs, we define IncRewrUCQ_IAR(Q,T) = W

q_i∈QIncRewrIAR(qi,T). We are then able to give our final results on reformulation of UCQs under theIAR-semantics.

Theorem 4. LetT be a DL-LiteA,id,denTBox and letQbe a UCQ. For every ABoxA, hT,Ai |=IARQiffh∅,Ai |=IncRewrUCQ_IAR(saturate(PerfectRef(T⁺, Q)),T).

In the above theorem,PerfectRefcoincides with the algorithm of [10]. This algorithm takes as input the set of positive inclusionsT⁺ and the queryQ, and computes the perfect reformulation under FOL-semantics of Q w.r.t. T⁺.PerfectRef(T⁺, Q)returns a set of CQs specified overT. Through this reformulation we first preprocess each query according to “positive” knowledge of the TBox, and then manage it to deal with possible inconsistency. Then, the algorithmsaturatepreviously described is ap- plied to the query thus obtained: this step is necessary for technical reasons (roughly speaking, it is needed in order to exactly identify, for every query atomα, the queries fromminUnsatQueries(T)which correspond to inconsistent sets to which an image of αmight belong). This query is finally reformulated according to the definition of IncRewrUCQ_IAR.

The following complexity result is a direct consequence of Theorem 4, since estab- lishing whetherh∅,Ai |=IncRewrUCQ_IAR(saturate(PerfectRef(T⁺, Q)),T)sim- ply amounts to evaluating a FOL-query over the ABox A, which is inAC⁰ in data complexity.

Corollary 1. Let O be a DL-LiteA,id,den ontology and let Q be a UCQ. Deciding whetherO |=_IARQis inAC⁰in data complexity.

We finally notice that the above results directly imply that query answering of UCQs inDL-LiteA,id,denunder standard semantics is FOL-rewritable, and therefore inAC⁰in data complexity, i.e., it has the same computational behavior of all DLs of theDL-Lite family.

(11)

5 Conclusion

Motivated by the requirements arising in applications of ontology-based data access, we have presented a new algorithm for inconsistency-tolerant query answering in a DL obtained by extendingDL-LiteAwith identification and denial assertions. The algorithm is based on a rewriting technique, and shows that query answering under the considered inconsistency-tolerant semantics inDL-Literemains first-order rewritable, even when identification and denial assertions are added to the TBox. We will soon experiment our new technique in the context of several ongoing OBDA projects. Our main goal is to devise optimization techniques that allow for simplifying the rewritten query as much as possible, so that the performance of the query answering process remains acceptable even under the inconsistency-tolerant semantics.

References

1. Bienvenu, M.: First-order expressibility results for queries over inconsistentDL-Liteknowl- edge bases. In: Proc. of DL 2011. CEUR,ceur-ws.org, vol. 745 (2011)

2. Cal`ı, A., Gottlob, G., Lukasiewicz, T.: A general Datalog-based framework for tractable query answering over ontologies. In: Proc. of PODS 2009. pp. 77–86 (2009)

3. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning and efficient query answering in description logics: The DL-Litefamily. J. of Automated Reasoning 39(3), 385–429 (2007)

4. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Path-based identification constraints in description logics. In: Proc. of KR 2008. pp. 231–241 (2008)

5. Chomicki, J.: Consistent query answering: Five easy pieces. In: Proc. of ICDT 2007. LNCS, vol. 4353, pp. 1–17. Springer (2007)

6. G¨ardenfors, P., Rott, H.: Belief revision. In: Handbook of Logic in Artificial Intelligence and Logic Programming, vol. 4, pp. 35–132. Oxford University Press (1995)

7. Kontchakov, R., Lutz, C., Toman, D., Wolter, F., Zakharyaschev, M.: The combined approach to query answering inDL-Lite. In: Proc. of KR 2010. pp. 247–257 (2010)

8. Lembo, D., Lenzerini, M., Rosati, R., Ruzzi, M., Savo, D.F.: Inconsistency-tolerant semantics for description logics. In: Proc. of RR 2010 (2010)

9. Lembo, D., Lenzerini, M., Rosati, R., Ruzzi, M., Savo, D.F.: Query rewriting for inconsistent DL-Liteontologies. In: Proc. of RR 2011 (2011)

10. Poggi, A., Lembo, D., Calvanese, D., De Giacomo, G., Lenzerini, M., Rosati, R.: Linking data to ontologies. J. on Data Semantics X, 133–173 (2008)

11. Savo, D.F., Lembo, D., Lenzerini, M., Poggi, A., Rodr´ıguez-Muro, M., Romagnoli, V., Ruzzi, M., Stella, G.: MASTROat work: Experiences on ontology-based data access. In: Proc. of DL 2010. CEUR,ceur-ws.org, vol. 573, pp. 20–31 (2010)