• Aucun résultat trouvé

On Bounded Positive Existential Rules

N/A
N/A
Protected

Academic year: 2022

Partager "On Bounded Positive Existential Rules"

Copied!
13
0
0

Texte intégral

(1)

Michel Lecl`ere, Marie-Laure Mugnier, and Federico Ulliana University of Montpellier, LIRMM, Inria

Abstract. We consider the existential rule framework, which generalizes Horn description logics. We study and compare several boundedness notions in this framework. Our main result states that (strongly-) bounded rules are exactly those at the intersection of two well-known abstract classes of existential rules, namely fes(finite expansion sets, which ensure the finiteness of the core chase) andfus (finite unification sets, which correspond to UCQ-rewritable rules).

1 Introduction

Existential rules [6, 12, 21] are a fragment of first-order logic generalizing most onto- logical languages studied in ontology-mediated query answering (OMQA), in particular Horn description logics [13, 22, 23]. In the OMQA setting, databases (or fact bases) are added with an ontological layer, which allows to deduce new facts from incomplete datasources, thereby enriching answers to database queries.

As OMQA can be directly implemented on top of relational database systems, many research efforts have been devoted to make the paradigm efficient. At the core of the techniques developed for existential rules, and to some extent for Horn description log- ics, we find the two classical paradigms for processing rules, namely forward chaining and backward chaining. In the OMQA setting, both approaches are recast as ways of reducing the problem to a classical database query answering problem, by embedding the rules into the facts or into the query. Forward chaining is decomposed into a mate- rialization step (applying the rules to the data, hence materializing inferences into the data) followed by the evaluation of the query against the enriched database. Backward chaining is decomposed into a query rewriting step (rewriting the query using the rules) followed by the evaluation of the rewritten query against the database, thereby leaving the data untouched.

Both approaches rely on a fixpoint operator to cope with rule semantics. Indeed, materialization should continue until the point where only redundant facts are added to the dataset, while rewriting should continue until the point where only redundant queries are added to the rewriting set. It is understood that both processes may not terminate since entailment with existential rules is undecidable (e.g., [14]). Deciding halting of these processes is undecidable as well [17, 5]. Following the terminology introduced in [5], we say that a set of existential rules is afinite expansion set (fes)if it ensures that a finite sound and complete materialization can be computed for any fact base, and afinite unification set (fus) if it ensures that any conjunctive query can be rewritten into a finite sound and complete set of conjunctive queries (such a set being seen as a union of conjunctive queries). As concrete examples offes rules, one can

(2)

cite (plain) datalog [1] and sets of rules satisfying various acyclicity conditions [16, 2]. Some prominent classes offus rules are linear rules (which generalize some DL- Lite dialects [13, 12]), sticky rules [10] and more generally rules with the “backward shyness” property [25]. Note that [5] defines a third property ensuring decidability of OMQA, namely bounded-treewidth sets (which contains EL [23] and the family of guarded rules [9, 7]), however this class is out of the scope of this paper.

Tofes(resp.fus) rules is naturally associated a finite breadth-first materialization (resp. query rewriting) process, whose maximal number of steps generally depends not only on the set of rules but also on the data (resp. on the input query). In contrast, bounded rules are exactly those rules that can be evaluatedwithout a fixpoint operator.

A set of rules is said to beboundedif breadth-first materialization computes all conse- quences from a knowledge base (composed of a fact base and the rules) in a predefined number of stepsk. More precisely, there iskindependentfrom any fact base, such that the materialization of any fact base at stepk0 > kis equivalent to that obtained at stepk.

As relational database systems may lack a fixpoint operator (see for example MySQL), bounded rules form an interesting intersection point between databases and ontologies, because they can be seamlessly evaluated on any relational system, which is not the case in general.

The goal of this work is to further investigate the properties of bounded existential rules, in particular the precise relationships betweenfes,fusand bounded sets of rules.

Our contributions are more precisely the following:

1. We first define a breadth-first query rewriting technique which is precisely the dual of breadth-first materialization. LetK = (F,R)be any knowledge base, whereF is a fact base andRa set of existential rules, and letqbe a (Boolean) conjunctive query. From earlier work on existential rules, we know thatKentailsqiff there isk (depending onRandF) such thatFk, the saturation ofFat stepk, entailsq(e.g., [11, 4]); equivalently,Kentailsqiff there isk0(depending onRandq) such thatF entailsQk0, the set of rewritings ofqobtained at stepk0(see [20, 18] for practical algorithms). We define a variant of the breadth-first query rewriting technique from [20] that fulfills the following property: for anyi≥0,FientailsqiffFentailsQi

(Theorem 1), hencek=k0in the above sentence.

2. We point out that boundedness can also be defined in terms of query rewriting instead of materialization, and using Theorem 1, we obtain the same bound for both definitions: for anyk, it holds thatFk is equivalent toFk+1for allF iff it holds thatQkis equivalent toQk+1for allq(Theorem 2). We also show that the notion of “bounded-depth derivation property” introduced in [12] is equivalent tofus, and define variants of this property corresponding tofesand boundedness respectively.

3. By definition, every bounded set of rules is both fes and fus. The question of whether the reciprocal statement holds was open. We show that, indeed, bounded rules are exactly those at the intersection offesandfus(Theorem 3).

(3)

2 Preliminaries

We consider a first-order setting with constants but no other function symbols. Aterm is either a constant or a variable. In the examples we will denote constants by letters at the beginning of the alphabet (a,b, ...,) and variables by letters at the end of the alphabet (v,w,x,y,z). Anatomis of the formp(t1, . . . , tk)wherepis a predicate of aritykand thetiare terms. Given an atom or set of atomsA,vars(A),consts(A)and terms(A)denote its set of variables, constants and terms, respectively. We denote by

|=the classical logical consequence and by≡the logical equivalence. Given two sets of atomsAandA0, ahomomorphismhfromAtoA0 is a substitution ofvars(A)by terms(A0)such thath(A)⊆A0. If there is a homomorphismhfromAtoA0, we say thatAmapstoA0(byh), which is also denoted byA≥A0. It is convenient to extend any substitutionsto unchanged terms (we sets(t) =tfor all considered constants and unchanged variables).

Afactis an existentially closed conjunctions of atoms. We denote byF afact base, that is a set of facts. Since a conjunction of facts is equivalent to a single fact, we also see a fact base as an existentially closed conjunction of atoms. ABoolean conjunctive query(in short CQ) is also an existentially closed conjunction of atoms. Next, fact bases and conjunctive queries will be seen as sets of atoms. The answer to a CQqin a fact baseFistrueiffF |=q. It is well known thatF |=qiffq≥F. Aunion of conjunctive queries(UCQ)Q= q1∨q2. . .∨qn is seen as a set of CQsQ={q1, . . . , qn}. The answer toQinF istrueiffF |=Q, i.e.,F |=qifor someqi∈Q.

Anexistential rule r = ∀x,y(B[x,y] → ∃zH[x,z])is a closed formula where B is a conjunction of atoms constituting the body of the rule,H is a conjunction of atoms for the head of the rule,xandyare sets of universally quantified variables, and zis the set of existentially quantified variables of the rule. The variables inx, i.e., those shared byBandH, are called thefrontier variablesof the rule. In the following we will refer to a rule as a pair of sets of atoms(B,H), interpreting their common variables as the frontier. In the examples, we will use the simplified notationB →H, for instance the rule∀x(q(x)→ ∃y (p(x, y)∧q(y)))will be writtenq(x) →p(x, y), q(y). A set of rules is denoted byR. We implicitly assume that all rules employ disjoint sets of variables. Aknowledge base(KB)K = (F,R)is a pair whereFis a fact base andRis a set of rules. Theconjunctive query entailment problemconsists in deciding for given KBKand CQqwhetherK |=q(whereKis seen as the first order theory associated withF∪ R). It has long been shown that this problem is undecidable (this follows e.g., from [14]).

A ruler = (B,H)isapplicableto a fact baseF if there is a homomorphismh such thath(B)⊆F. The application ofrtoF with respect to a homomorphismhis denoted byα(F, r, h)and defined asα(F, r, h) =F∪hsafe(H)wherehsafeis asafe extension ofhtoH, i.e., it substitutes existential variables fromH with fresh variables (not used elsewhere). It holds thatF,R |=qiff there is a fact baseF0 derived fromF with rules inR,1such thatF0 |=q.

1i.e.,F0is obtained by a sequence of fact basesF(=F0), F1, . . . , Fk(=F0)such that, for all i >0,Fi=α(Fi−1, ri, h)withri∈ Randha homomorphism from the body ofritoFi−1.

(4)

We now define fact materialization by (breadth-first) forward chaining, a useful tool, also known as saturation or (breadth-first) chase. 2 The one-step saturationof F withR, denoted by α(F,R), is defined asα(F,R) = F ∪(r,h)hsafe(H)for all r= (B, H)∈ Randhhomomorphism fromBtoF. Thek-saturationofF withR, denoted byαk(F,R), is inductively defined as follows:α0(F,R) = F and, for any k >0,αk(F,R) =α(αk−1(F,R),R).

Next, whenRis fixed, we will denote the setαk(F,R)byFk. Saturation is sound and complete, i.e., for all F,R andQ,F,R |= Q iffFk |= Q for some positive integerk (see e.g., [11, 4]). A set of rulesRis said to be a fes(finite expansion set) if for all fact baseF there exists a positive constant ksuch thatFk |= Fk+1 (hence Fk≡Fk+1) [6]. Note thatkgenerally depends onF. It follows that whenRisfes, the saturation process is finite, hence it can be used to decide ifF,R |=Q. The following example illustrate thefesproperty.

Example 1 (fes). Let K = (F,R)with F = {p(a, b)} and R = {r : p(x, y) → p(y, z), p(z, y)}. ThenF0=F,F1=F∪ {p(b, z0), p(z0, b)},F2=F1∪ {p(z0, z1), p(z1, z0), p(b, z2), p(z2, b), p(b, z00), p(z00, b)}(the rule applications corresponding to homomorphisms already found at the preceding step are written in gray font, next we will omit them). We haveF2≡F1(note that there is noksuch thatFk =Fk+1, even by considering only new homomorphisms at each step, which shows the importance of checking equivalence and not only equality). Actually it holds thatF2≡F1for anyF, henceRisfes(and even bounded, see Sect. 4). Let us considerR0 ={r: p(x, y)→ p(y, z)}. ThenF0=F,F1=F∪ {p(b, z0)},F2=F1∪ {p(z0, z1), p(b, z2)}and so on. There is noksuch thatFk ≡Fk+1, henceR0is notfes.

Backward chaining is a dual approach to deciding conjunctive query entailment, which involves rewriting the input query using the rules. The key operation is unifica- tion between (a subset of) a query and (a subset of) a rule head, which requires particular care due to the presence of existential variables in the rules. For this reason, we use the notion of a piece-unifier (introduced in [6]). Given a subqueryq0 ⊆ q, we callsepa- ratingvariables ofq0the variables occurring in bothq0and(q\q0); the other variables fromq0are called non-separating. The definition of piece-unifier below ensures that,q0 being the unified part ofq, only non-separating variables fromq0can be unified with an existential variable of the rule.

Definition 1 (Piece Unifier) Letqbe a query andr = (B,H)a rule. Apiece-unifier ofq withris a triple µ = (q0,H0, u)whereq0 6= ∅,q0 ⊆ q,H0 ⊆ H, anduis a substitution ofT = terms(q0∪H0)byT such that(i)u(q0) =u(H0)and(ii)for all existential variablex∈ vars(H0)andt ∈ T, witht 6= x, ifu(x) = u(t), thentis a non-separating variable fromq0.

Given a CQq, a ruler = (B,H)and a piece-unifierµ = (q0,H0, u)the direct rewritingofqwithrandµ, denoted byβ(q, r, µ), is the CQusafe(B)∪u(q\q0), where usafeis a safe extension ofutoB substituting variables invars(B)\vars(H0)with fresh variables.

2Several chase variants have been defined, see the last section.

(5)

Example 2 (Piece Unifier).Letr =r(x)→p(x, y)andq1 ={p(w, v), s(v)}. There is no piece-unifier ofq1withrsince, withq10 ={p(w, v)},vis a separating variable ofq01, hence cannot be unified with the existential variabley. Letq2 ={s(z), p(z, v), p(w, v), t(w)}. The tripleµ= (q0, H0, u)withq0={p(z, v), p(w, v)},H0={p(x, y)}

andu={x7→z, y7→v, w7→z}is a piece-unifier ofq2withr, which yields the direct rewriting{r(z), s(z), t(z)}.

It holds thatF,R |=qiff there is a rewritingq0 ofqwith rules inR,3such that F |=q0. A set of rewritingsQofq(withR) is said to besoundandcompleteif for any Fit holds thatF,R |=qiff there isq0∈Qsuch thatF |=q0. WhenQis finite it can be seen as a UCQ, hence the previous condition can then be recast as follows:F,R |=q iffF |=Q. The setRis said to befus(finite unification set) if for anyq, there is a finite sound and complete set of rewritings ofqwithR[6, 20].

Similarly in spirit to saturation, one can consider breadth-first query rewriting: start- ing from the UCQQ={q}, at each step we compute all the direct rewritings of CQs in the current UCQ. Formally: the one-step rewriting of a UCQQwithRisβ(Q,R) = Q∪(q,r,µ){β(q, r, µ)}whereq∈Q,r∈ Randµis a piece-unifier ofqwithr. Then, the(breadth-first)k-rewritingofQwithR, denoted byβk(Q,R), is inductively defined as follows:β0(Q,R) =Q, and for allk >0,βk(Q,R) =β(βk−1(Q,R),R).

It holds thatF,R |= qiffF |= βk({q},R)for some positive integerk(follows from [4]). This property yields an alternative characterization offus: a set of rules isfus iff for allq, there isksuch thatβk({q},R)≡βk+1({q},R).4

Example 3 (fus).

Letr=p(x, y), p(y, z)→p(z, t). Letq0 ={p(v, a)}, whereais a constant: since tis an existential variable, there is no piece-unifier ofq0withr.

Letq00={p(a, v)}.

β1({q00},{r})={q00} ∪ {{p(x0, y0), p(y0, a)}};

β2({q00},{r}) = β1({q00},{r})∪ {{p(x00, y00), p(y00, a)}}, where the last CQ corre- sponds to the piece-unifier already found in the preceding step. Hence,β2({q00},{r}) = β1({q00},{r})if we restrict the computation to new piece-unifiers.

Finally, letq={p(v, w)}.

β1({q},{r})={q} ∪ {q1={p(x0, y0), p(y0, v)}}.

β2({q},{r}) = β1({q},{r})∪ {{p(x1, y1), p(y1, y0), p(x0, y0)}} (we consider only new piece-unifiers). Since existential variables can only be unified with non-separating variables, there is only one new piece-unifier({p(x0, v)},{p(z, t)},{z7→y0, t7→v}), which yieldsq1.

β3({q},{r})=β2({q},{r}) ∪ {{p(x2, y2), p(y2, y1), p(x1, y1)}}, where the new the piece-unifier is({p(x0, y0), p(y1, y0)},{p(z, t)},{z 7→ y1, x0 7→ y1, t 7→ y0}). And so on. There is noksuch thatβkk+1(up to bijective variable renaming), however any UCQβiis equivalent toq. As these examples suggest it,{r}is indeedfus.

3i.e.,q0is obtained by a sequence of direct rewritingsq(=q0), q1, . . . , qk(=q0)such that, for alli >0,qi=β(qi−1, ri, µ)withri∈ Randµa piece-unifier ofqi−1withri.

4The breadth-first rewriting algorithm in [20] builds β(βi(Q,R),R) by considering only queries that are not contained in a query fromβi(Q,R); in this case the fuscondition be- comes: there isksuch thatβk+1(Q,R) =βk(Q,R).

(6)

Note that there isksuch thatF |=βk({q},R)if and only if there isk0 such that αk0(F,R) |= q. However, the above breadth-first rewriting is not exactly the dual of breadth-first saturation, in the sense thatα(F,R)|=qdoes not implyF |=β({q},R).

In other words, a single step of breadth-first rewriting is not able to “simulate” a step of saturation. Let us illustrate this observation with a simple example.

Example 4 (Saturation vs rewriting step).LetK = (F,R)withF = {p0(a), q0(a)}

andR={rp:p0(x)→p1(x);rq :q0(x)→q1(x)}. Letq={p1(u), q1(u)}. A single saturation stepα(F,R) = {p0(a), q0(a), p1(a), q1(a)}allows to entailq. However, a single rewriting stepβ1({q},R) = {{p1(u), q1(u)},{p0(u), q1(u)},{p1(u), q0(u)}}

does not allow to prove thatK |= q, i.e.,F 6|= β1({q},R). The trouble is that each direct rewriting is performed with respect to a single rule, hence the desired CQ{p0(u), q0(u)}, which requires to rewriteqwith bothrpandrq, is obtained only at the second rewriting step (from{p0(u), q1(u)}or from{p1(u), q0(u)}).

3 A New Breadth-First Query Rewriting

In order to obtain the precise dual of saturation, we define another breadth-first rewrit- ing mechanism able to unify a CQ with several rules fromR at once. Instead of a piece-unifier, we consider an “aggregated unifier” (as introduced in [19] for algorithmic purposes), which aggregates several piece-unifiers of a CQ with rules from R, pro- vided that these piece-unifers are compatible (briefly, they involve disjoint subsets of the query and unify common variables in a way that does not lead to unify different con- stants together). To avoid technical developments, we will consider here an alternative way of defining exactly the same kind of rewriting by modifying the set of rules instead of the unifier (this alternative definition is not practically relevant but it is suitable for our study).

LetR0 = {r1, . . . , rl} be a set of rules with eachri = (Bi,Hi). We recall that distinct rules have disjoint sets of variables. The aggregated rule assigned toR0, denoted byr1. . .rl, is defined asB1∧. . .∧Bl→H1∧. . .Hl. LetRbe the (infinite) set of aggregated rules assigned to multisubsets ofR(i.e., an aggregated rule may involve several “copies” of the same rule fromR, with a safe renaming of variables in each copy).

Then, for any UCQQ,β(Q,R)= Q∪(q,r,µ){β(q, r, µ)}, whereq ∈ Q, r = r1. . .rl ∈ R withl ≤ |q|, andµis a piece-unifier ofq withr. We denote by βk(Q,R)the associated breadth-first k-rewriting.5

Example 5. Consider again Ex. 4. We haveβ1({q},R) =β1({q},R)∪{{p0(u), q0(u)}}, where the additional query is obtained by unifyingqwith the aggregated ruler1r2= p0(x0), q0(x1)→p1(x0), q1(x1).

We now prove that the new breadth-first rewriting fulfills the desired properties.

Lemma 1 For allF,Randq, it holds thatα(F,R)|=qiffF|=β({q},R).

5We could stateβ(Q,R) = β(Q,R), however it must be clear that, for each CQ, only a finite subset ofRneeds to be considered.

(7)

Proof:(Sketch) (⇒) LetF1 =α(F,R). AsF1 |= qthere ishsuch thath(q)⊆ F1. Letq0be the largest subset ofqsuch thath(q0)⊆F and{q1, . . . , ql}be the partition ofq\q0such that eachqi(0 < i≤l) maps to the atoms produced by the application of a ruleri = (Bi,Hi) ∈ RtoF with a homomorphismhi, i.e.,h(qi) ⊆ hsafei (Hi) (andF∪hsafe1 (H1)∪. . .∪hsafel (Hl)⊆F1). For eachi >0, letHi0 ⊆Hi denote the useful part ofHi, i.e.,h(qi) =hsafei (Hi0). Ifq0=q, i.e.,F |=q, thenF |=β({q},R) sinceq ∈ β({q},R). Otherwise, letr = r1 . . .rl andµbe the piece-unifier ofq withr naturally associated with the homomorphismshandh1∪. . .∪hl (i.e., µ = (q1∪. . .∪ql, H10 ∪. . .∪Hl0, u)where for all termseande0 in the domain of u,u(e) = u(e0)iff(h∪hsafe1 ∪. . .∪hsafel )(e) = (h∪hsafe1 ∪. . .∪hsafel )(e0)). We easily check that F |= β(q, r, µ). Sinceβ(q, r, µ) ∈ β({q},R), we obtain that F |=β({q},R).

(⇐) IfF |=qthenα(F,R)|=q. Otherwise, we know there isq16=qinβ1({q},R) s.t. F |= q1 whereq1 is obtained by rewritingq0 ⊆ q with an aggregated ruler = r1. . .rland a piece-unifier(q0, H10∪. . .∪Hl0, u). Leth0be a homomorphism from q1toF. For eachri= (Bi, Hi)composingr,h0◦usafeis a homomorphism fromBi toF. Hence,(h0◦u)safe(Hi) ⊆ α(F,R). We build the following homomorphismh fromqtoα(F,R): for allx∈vars(q), ifx∈vars(q)\vars(q0)thenh(x) =h0(x), otherwise, let anye∈Hi0such thatu(x) =u(e), thenh(x) = (h0◦usafe)safe(e).

Theorem 1. For allF, q,Randk≥0, it holds thatαk(F,R)|=qiffF|=βk({q},R) Proof: The proof is by induction onk. Fork = 0, the property is trivially true. As- sume the property holds for any k < n. First note that, for anyi ≥ 1,αi(F,R) = αi−1(α(F,R),R)and βi({q},R) = βi−1({q},R),R), which directly follows from the definitions. We haveαn(F,R)|=qiff

αn−1(α(F,R),R)|=qiff

α(F,R)|=βn−1 ({q},R)(by induction hypothesis) iff F|=βn−1 ({q},R),R)(by Lemma 1) iff

F|=βn({q},R).

Hence,αk(F,R)|=qiffF|=βk({q},R)holds for anyk≥0.

This correspondence betweenαk andβk allows us to rely on the soundness and completeness of breadth-first forward chaining to establish the soundness and com- pleteness of breadth-first rewriting:

Corollary 1. It holds thatF,R |=qiffF |=βk({q},R)for some positive integerk.

Breadth-first rewriting yields an alternative definition offus.

Proposition 1 A set of rulesRisfusiff for all queryqthere is a positive constantk such thatβk({q},R)≡βk+1({q},R).

From now on, for a UCQQ and a fixed set of rules R, we will denote the set βk(Q,R)simply byQk.

(8)

4 Several Notions of Boundedness

We now define bounded rules and clarify their relationships with other properties found in the literature. Two meaningful notions of boundedness can be provided, with the bound being based on fact saturation or on query rewriting.

Definition 2 A set of existential rulesRis

(saturation-bounded) if there existsk∈Nsuch that for allF,Fk≡Fk+1. (rewriting-bounded) if there existsk∈Nsuch that for allQ,Qk ≡Qk+1.

We first show that these two notions precisely coincide.

Theorem 2. Any set of existential rules Ris saturation-bounded iff it is rewriting- bounded. Moreover, the bound is the same, i.e., for allF and allQ, for allk,Fk ≡ Fk+1iffQk≡Qk+1.

Proof:(⇒) Assume thatRis saturation-bounded and letkbe the bound. Then for all QandF it holds that F,R |= QiffFk |= Q. Letq be any query inQk+1. Since q |=Qk+1, by soundness of rewriting, it holds thatq,R |=Q; sinceRis saturation- bounded,{q}k|=Qhence, by Th. 1,q|=Qk. Since this holds for any queryq∈Qk+1, we haveQk+1|=Qk. Furthermore, by definition,Qk|=Qk+1. ThusQk≡Qk+1.

(⇐) Assume that Ris rewriting-bounded and let kbe the bound. Then for allQ andFit holds thatF,R |=QiffF |=Qk. By soundness of forward chaining, it holds that F,R |= Fk+1; sinceR is rewriting-bounded,F |= (Fk+1)k hence, by Th. 1, Fk|=Fk+1. Furthermore, by definition,Fk+1|=Fk. ThusFk≡Fk+1. In light of this, next we will simply sayboundedrules. In [12], Cal`ı et al. introduced the notion of bounded-depth derivation propertyfor existential rules, for which the number of saturation steps is bounded for all query (in short, Q-BDDP). We define the analogous of the bounded-depth derivation property for facts (F-BDDP), as well as a natural property issued from both from (Q-BDDP) and (F-BDDP), we call strong bounded-depth derivation property (strong BDDP). We show that these three properties coincide with the classesfus,fesandbounded.

Definition 3 (Bounded-Derivation Properties) LetRbe a set of existential rules. Then, Renjoys the property

(F-BDD) if for allFthere isk∈Nsuch that for allQ:F,R |=QiffFk |=Q (Q-BDD) if for allQthere isk∈Nsuch that for allF:F,R |=QiffFk|=Q (strong BDD) if there isk∈Nsuch that for allFandQ:F,R |=QiffFk |=Q.

We now show that these notions respectively correspond tofes,fusand boundedness.6 Proposition 2 Any set of existential rules isfesiff it has the F-BDD Property.

6The three following propositions were first proven by J.-F. Baget and M.-L. Mugnier and presented in a seminar at Oxford University in December 2012.

(9)

Proof:(⇒) LetRbefes: for allF, there isksuch thatFk≡Fk+1. By soundness and completeness of forward chaining, for anyQwe haveF,R |=Qiff there isnsuch that Fn|=Q. For anyF and suchn, it holds thatFk|=Fn. Hence,Fk |=Q.

(⇐) Assume Rhas the F-BDD Property and set Q = {Fk+1} (where k is the bound of the F-BDD Property). We thus haveF,R |=Fk+1iffFk |=Fk+1. Forward chaining is sound, soF,R |=Fk+1. HenceFk |=Fk+1. By definition,Fk+1 |=Fk.

Therefore,Fk ≡Fk+1.

Proposition 3 Any set of existential rules isfusiff it has the Q-BDD Property.

Proof:(⇒) LetRbefus: for allQ, there isksuch thatQk≡Qk+1. By soundness and completeness of breadth-first rewriting (Corollary 1), for anyF, we haveF,R |=Qiff F |=Qk. Equivalently, by Th. 1,Fk |=Q. (⇐) AssumeRhas the Q-BDD Property.

For anyq ∈ Qk+1 (wherekis the bound of the Q-BDD Property), let us setF = q:

we haveq,R |=Qiff(q)k |=Q. Breadth-first rewriting is sound, soq,R |=Q. Hence q|=Qk. Since this holds for anyq∈Qk+1, we haveQk+1|=Qk. SinceQkis included inQk+1, we also haveQk |=Qk+1. Therefore,Qk ≡Qk+1. Proposition 4 Any set of existential rules is bounded iff it has the strong BDD Property.

Proof:The proof goes like that of Prop. 2.

(⇒) LetRbe saturation-bounded and letksuch that for allF,Fk ≡ Fk+1. By soundness and completeness of forward chaining, for anyFandQ, we haveF,R |=Q iffFk |=Q.

(⇐) Assume Rhas the strong BDD Property and setQ = {Fk+1} (wherek is the bound of the strong BDD Property). We thus haveF,R |=Fk+1iffFk |=Fk+1. Forward chaining is sound, so F,R |= Fk+1. Hence Fk |= Fk+1. By definition,

Fk+1|=Fk. Therefore,Fk ≡Fk+1.

5 Boundedness = fes ∩ fus

We now prove that bounded rules are exactly those at the intersection offesandfus classes.

Theorem 3. RisfesandfusiffRis bounded.

One direction of the proof(⇐)is straightforward, and follows by Prop.s 2 and 3, simply picking thek of the bound. We dedicate the remaining of this section to the formal development of the other direction. We first consider the set of all objects of the form( ¯B,H¯)whereB¯ is a rewriting of a rule body inRandH¯ is the saturation of B¯ withR. WhenRisfes, such objects correspond to rules (i.e., eachH¯ can be made finite). When moreoverRisfus, the set of all rewritings to be considered is finite, hence the set of all rules of interest is finite.

Definition 4 (Rule Completion) LetR={r1, . . . , rn}be afesandfusset of rules of the formri= (Bi,Hi). We define the setCRas follows:

(10)

1. For any ri = (Bi,Hi), let ki be the smallest integer such that βk

i(Bi,R) ≡ βk

i+1(Bi,R)(such akiexists sinceRisfus).

2. For anyqij ∈βk

i(Bi,R), letkij be the smallest integer such thatαkij(qij,R)≡ αkij+1(qij,R)(such akij exists sinceRisfes).

3. Then:

CR={( ¯B,H¯)|ri= (Bi, Hi)∈ R, B¯ ∈βki(Bi,R),H¯ =αkij( ¯B,R)} Theboundassociated withRisdR=maxri∈R,qij∈β

ki(Bi,R)(kij).

Note thatCRis finite and unique (up to bijective variable renaming). Most impor- tantly, its size depends solely onR.

Proposition 5 For anyfesandfusset of rulesR, it holds thatCR≡ R.

Proof: The directionCR |= Rholds since, for each ruleri = (Bi,Hi) ∈ R,Bi ∈ βk

i(Bi,R), hence there is a rule ¯r = ( ¯B,H¯) ∈ CR withBi = ¯B andHi ⊆ H¯. DirectionR |=CRfollows from the soundness and completeness of saturation. Indeed, for each rule( ¯B,H¯)∈ CR, let( ¯B0,H¯0)be obtained by replacing each frontier variable with a distinct fresh constant (i.e., that does not occur inR). By definition ofCR, we haveB¯0,R |= ¯H0, and equivalentlyR |= ¯B →H¯. We now show that each saturation step withCR can be computed by a bounded number of saturation steps withR(again with a bound independent from any fact base, Prop. 6). Next, we will show that a single saturation step withCRis actually sufficient to saturate any fact base (Prop. 7), therefore it is equivalent to a bounded number of saturation steps withR.

Proposition 6 LetRbefesandfusandCRbe the associated completion set. Then: for any fact baseF,αdR(F,R)|=α(F,CR)(wheredRis the bound introduced in Def. 4).

Proof:(Sketch) LetF0dR(F,R). We show that for each ruler¯= ( ¯B,H¯)∈ CR and each homomorphism hfromB¯ toF,F0 |= α(F,r, h)¯ holds, which suffices to prove thatF0 |=α(F,CR)since all rules are applied “in parallel” in a single saturation step. Consider the sequenceS of rule applications leading fromB¯ toH¯. The choice of dR implies that αdR( ¯B,R) ≡ H¯. Let hfrom B¯ toF. The sequence S can be applied “similarly” toF(i.e., each homomorphismhiassociated with a rule application is replaced byh◦hi), yieldingS(F)withS(F)|=h( ¯H). SinceH¯ is obtained in at most dRbreadth-first steps, this is also true forS(F), henceF0|=S(F). SinceF ⊆F0, we obtainF0|=F∪hsafe( ¯H) =α(F,r, h).¯

Proposition 7 LetRbe a set of rules bothfesand fus. For any fact baseF, it holds thatα(F,CR)≡αk(F,CR)for allk≥1.

Proof:(Sketch) We focus on proving thatα(F,CR)|=α(α(F,CR),R)which suffices to derive the thesis. Indeed, we know that for anyF0andR0 ifF0 |=α(F0,R0)then

(11)

F0 |=αk(F0,R0), for allk ∈ N. Henceα(F,CR) ≡αk(α(F,CR),R), in particular fork =dR. Moreover, asRis bothfesandfusby Prop. 6 we have thatα(F,CR) |= α(α(F,CR),CR).

Letr= (B,H)be any rule ofR. Ifris applicable toα(F,CR)with a homomor- phismh, thenh(B) ⊆ α(F,CR). By Prop. 6 we thus haveαdR(F,R) |= h(B). By Th. 1, we get thatF |=βd

R({h(B)},R), so there existsh(B)rew ∈βd

R({h(B)},R) such thatF |= h(B)rew. Any rewriting sequence fromh(B)toh(B)rew can be per- formed “similarly” from B yieldingBrew ∈ βd

R({B},R) withBrew ≥ h(B)rew. Besides, by definition ofCR, we know there existsr¯= ( ¯B,H¯)∈ CRwithB¯ =Brew

andH¯ ≡αdR(Brew,R). We show that any derivation sequence fromB¯ toH¯ can be applied toh(B)rewto entailh(B). Sincer= (B,H)∈ Rand this sequence is applied until saturation,h(H)is also entailed. We conclude thatα(F,CR)|=h(H).

Proof:[Proof of Th. 3] We now prove that ifRisfesandfusthen it is bounded.

For any fact base F and any CQ q,F,R |= q iff F,CR |= q (Prop. 5). From Prop. 7, F,CR |= q iff α(F,CR) |= q. Hence, F,R |= q iff α(F,CR) |= q. It remains to show that there is k independent from F and q such that αk(F,R) ≡ α(F,CR). From Prop. 6, we have a constantk(k = dR) independent fromF andq such thatαk(F,R)|=α(F,CR); moreover, sinceF,R |=αk(F,R)for anyk, we have α(F,CR)|=αk(F,R). We conclude thatRis bounded.

6 Concluding Remarks

In this paper, we provide several results that clarify the relationships between funda- mental properties of existential rule sets, namelyfes,fusand boundedness. The main result is that bounded rule sets are exactly those that are bothfesandfus.

Recognizing if a set of rules is bounded is a difficult problem, which is already un- decidable for plain datalog [1], hence for manyfesclasses of existential rules. A signif- icant exception are monadic datalog programs, for which boundedness is recognizable [15]. An open question is whether boundedness is recognizable for specific classes of existential rules, in particular those known to befus. Whether restricting predicate arity to two could have an impact on the problem decidability is also an interesting issue.

Besides, the halting condition for the saturation process considered here relies on logical equivalence (i.e.,Fk ≡Fk+1). This condition corresponds exactly to the halt- ing of the chase variant known as thecorechase [17]. Other chase variants have been proposed in the literature, in particular the restricted chase [8], the skolem chase [24]

and the oblivious chase [9], which are known to halt in (increasingly) fewer cases (see, e.g., [3] for examples illustrating the differences between these mechanisms). With each of these variants could be associated a different boundedness notion. Note however that all the chase variants collapse on rules without existential variables (i.e., plain datalog), hence the undecidability of boundedness recognition in plain datalog applies to all these potential variants of boundedness.

Acknowledgments. This work was partially supported by project PAGODA (ANR-12- JS02-007-01).

(12)

References

1. Abiteboul, S., Hull, R., Vianu, V. (eds.): Foundations of Databases: The Logical Level.

Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1st edn. (1995) 2. Baget, J., Garreau, F., Mugnier, M., Rocher, S.: Extending acyclicity notions for existential

rules (long version). CoRR abs/1407.6885 (2014),http://arxiv.org/abs/1407.6885 3. Baget, J., Garreau, F., Mugnier, M., Rocher, S.: Revisiting chase termination for existential

rules and their extension to nonmonotonic negation. CoRR abs/1405.1071 (2014), http:

//arxiv.org/abs/1405.1071, proc. of NMR 2014

4. Baget, J.F., Lecl`ere, M., Mugnier, M.L., Salvat, E.: On Rules with Existential Variables:

Walking the Decidability Line. Artificial Intelligence 175(9-10), 1620–1654 (2011) 5. Baget, J., Lecl`ere, M., Mugnier, M.: Walking the decidability line for rules with existential

variables. In: KR 2010 (2010)

6. Baget, J., Lecl`ere, M., Mugnier, M., Salvat, E.: Extending decidable cases for rules with existential variables. In: IJCAI 2009. pp. 677–682 (2009)

7. Baget, J., Mugnier, M., Rudolph, S., Thomazo, M.: Walking the complexity lines for gener- alized guarded existential rules. In: IJCAI 2011. pp. 712–717 (2011)

8. Beeri, C., Vardi, M.Y.: A proof procedure for data dependencies. J. ACM 31(4), 718–741 (1984)

9. Cal`ı, A., Gottlob, G., Kifer, M.: Taming the infinite chase: Query answering under expressive relational constraints. In: KR’08. pp. 70–80 (2008)

10. Cal`ı, A., Gottlob, G., Pieris, A.: Query answering under non-guarded rules in datalog+/-. In:

RR’10. pp. 1–17 (2010)

11. Cal`ı, A., Gottlob, G., Kifer, M.: Taming the infinite chase: Query answering under expressive relational constraints. J. Artif. Intell. Res. (JAIR) 48, 115–174 (2013)

12. Cal`ı, A., Gottlob, G., Lukasiewicz, T.: A general datalog-based framework for tractable query answering over ontologies. In: PODS 2009. pp. 77–86 (2009)

13. Calvanese, D., Giacomo, G.D., Lembo, D., Lenzerini, M., Rosati, R.: DL-Lite: Tractable description logics for ontologies. In: AAAI. pp. 602–607 (2005)

14. Chandra, A.K., Vardi, M.Y.: The implication problem for functional and inclusion depen- dencies is undecidable. SIAM J. Comput. 14(3), 671–677 (1985)

15. Cosmadakis, S.S., Gaifman, H., Kanellakis, P.C., Vardi, M.Y.: Decidable optimization prob- lems for database logic programs (preliminary report). In: ACM Symposium on Theory of Computing. pp. 477–490 (1988)

16. Cuenca Grau, B., Horrocks, I., Kr¨otzsch, M., Kupke, C., Magka, D., Motik, B., Wang, Z.:

Acyclicity notions for existential rules and their application to query answering in ontologies.

J. Art. Intell. Res. (JAIR) 47, 741–808 (2013)

17. Deutsch, A., Nash, A., Remmel, J.B.: The chase revisited. In: PODS. pp. 149–158 (2008) 18. Gottlob, G., Orsi, G., Pieris, A.: Query rewriting and optimization for ontological databases.

ACM Trans. Database Syst. 39(3), 25:1–25:46 (2014)

19. K¨onig, M., Lecl`ere, M., Mugnier, M., Thomazo, M.: On the exploration of the query rewrit- ing space with existential rules. In: Web Reasoning and Rule Systems - RR 2013. pp. 123–

137 (2013)

20. K¨onig, M., Lecl`ere, M., Mugnier, M., Thomazo, M.: Sound, complete and minimal ucq- rewriting for existential rules. Semantic Web 6(5), 451–475 (2015)

21. Kr¨otzsch, M., Rudolph, S.: Extending decidable existential rules by joining acyclicity and guardedness. In: Proc. of IJCAI. pp. 963–968 (2011)

22. Kr¨otzsch, M., Rudolph, S., Hitzler, P.: Complexity boundaries for Horn description logics.

In: Proc. of AAAI. pp. 452–457. AAAI Press (2007)

(13)

23. Lutz, C., Toman, D., Wolter, F.: Conjunctive query answering in the description logicEL using a relational database system. In: Proc. of IJCAI. pp. 2070–2075 (2009)

24. Marnette, B.: Generalized schema-mappings: from termination to tractability. In: PODS. pp.

13–22 (2009)

25. Thomazo, M.: Conjunctive Query Answering Under Existential Rules - Decidability, Com- plexity, and Algorithms. Ph.D. thesis, Universit´e Montpellier II (2013), https://tel.

archives-ouvertes.fr/tel-00925722

Références

Documents relatifs

In this talk, we focus on rule-based ontol- ogy languages, and we demonstrate the tight connection between classical and consistent query answering.. More precisely, we focus on

Dans toute la feuille, K désigne un corps commutatif..

For future work, we are working on identifying a more general class of datalog rewritable existential rules, optimising our rewriting algorithm and implementation, and

Answering conjunctive queries (CQs) over knowledge bases (KBs) containing dis- junctive existential rules is a relevant reasoning task which can be addressed using the disjunctive

This invited talk is based on a long line of research that is still ongoing, and several researchers from different institutions have been involved: Pablo Barcel´o (University

In this framework, the main focus is on the problem of answer- ing conjunctive queries over an ontology expressed by a set of rules, and one of the most studied properties is

Besides the theoretical interest of a unified approach and new proofs for the semi-oblivious and core chase variants, we provide the first positive decidability results concerning

We study chase termination by exploiting in a novel way a graph structure, namely the derivation tree, which was originally introduced to solve the ontology-mediated (conjunctive)