• Aucun résultat trouvé

Worst-Case Optimal Querying of Very Expressive Description Logics with Path Expressions and Succinct Counting

N/A
N/A
Protected

Academic year: 2022

Partager "Worst-Case Optimal Querying of Very Expressive Description Logics with Path Expressions and Succinct Counting"

Copied!
12
0
0

Texte intégral

(1)

Worst-Case Optimal Querying of Very Expressive Description Logics with Path Expressions and Succinct Counting

Bartosz Bednarczyk1,2?and Sebastian Rudolph2??

1 Institute of Computer Science, University of Wrocław, Poland

2 Computational Logic Group, TU Dresden, Germany firstname.lastname@tu-dresden.de

Abstract. Among the most expressive knowledge representation formalisms are the description logics of theZfamily. For well-behaved fragments ofZOIQ, en- tailment of positive two-way regular path queries is well known to be 2EXPTIME- complete under the proviso of unary encoding of numbers in cardinality con- straints. We show that this assumption can be dropped without an increase in com- plexity and EXPTIME-completeness can be achieved when bounding the number of query atoms, using a novel reduction from query entailment to knowledge base satisfiability. These findings allow to strengthen other results regarding query en- tailment and query containment problems in very expressive description logics.

Our results also carry over toGC2, the two-variable guarded fragment of first- order logic with counting quantifiers, for which hitherto only conjunctive query entailment has been investigated.

1 Introduction

Recent years have seen a convergence of the fields of logic-based knowledge represen- tation (KR) and databases, with querying over knowledge bases (KBs, aka ontologies) as the archetypical inferencing task considered. Thereby, the query languages of interest have evolved from plain conjunctive queries to more expressive formalisms with path- navigational components such as thepositive two-way regular path queries(P2RPQs) considered here. One popular and practically used way of specifying the to-be-queried knowledge bases is viadescription logics(DLs), whose most expressive exemplars fea- ture advanced constructors for roles and path expressions. Certainly, the DLs from the Zfamily [6] are among the most powerful KR formalisms on the verge of decidability.

For its most expressive proponent,ZOIQ, featuring nominals (O), role inverses (I), and number restrictions (Q), querying is undecidable [8,13] and even decidability of KB satisfiability is open, owing to the intricate interplay of the three mentioned features, but restricting the interaction ofO,I, andQ(or excluding one of the features altogether) leads to beneficial model-theoretic properties, which give rise to upper bounds of EXP- TIME for KB satisfiability and 2EXPTIME for querying, established using elaborate automata techniques [6]. While the first bound holds under the assumption of binary

?Supported by “Diamentowy Grant” no. DI2017 006447

??Supported by ERC Consolidator Grant 771779 (DeciGUT)

(2)

encoding of number restrictions, the proof for the second one requires unary encoding.

While unary encoding of numbers is often silently accepted in the DL community, it is considered “cheating” in others.

In our paper, we overcome this restriction. Leveraging said model-theoretic property of the considered DLs, we provide a novel reduction from the query entailment problem K |=qto the problem of checking unsatisfiability of some other knowledge baseK¬q. Thereby, the size of the latter is exponential in the total size ofqandK(even when as- suming binary encoding), leading to a 2EXPTIME-complete complexity. Our technique also yields the new insight that the complexity drops to EXPTIME, when the number of atoms inqis bounded by a constant. The obtained results allow to correspondingly im- prove a number of previous results on query containment and can be transferred to DLs from theSRfamily. Reaching out to the community researching decidable fragments of first-order logic, we show that our results also extend to entailment ofP2RPQs by GC2databases, i.e. sets of unary and binary ground facts in the presence of a sentence of the guarded two-variable fragment with counting as defined by Pratt-Hartmann [10].

2 Preliminaries

Description Logics. We will very briefly recap the very expressive DLZOIQand its relevant sublogics following Calvaneseet al.[6]. We assume a fixed signature consist- ing of setsNIofindividual names,NCofconcept names, andNRofrole names. We letNCcontain the special concept names>and⊥, andNRthe special role names>

and. The following EBNF grammar definesatomic conceptsB,conceptsC,atomic rolesR,simple rolesSandrolesT, witho, a, b∈NI,A∈NC,P ∈NR\ {>}:

B::=A| {o}

C::=B| ¬C|CuC|CtC| ∀T.C| ∃T.C|>nS.C|6nS.C| ∃S.Self R::=P |P

S::=R|S∩S|S∪S|S\S

T ::=>|S|T∪T |T◦T |T|id(C)

Anassertionis of the formC(a),R(a, b),a≈b, ora6≈b, ageneral concept inclusion (GCI) has the formC1vC2. We useC1≡C2as abbreviation forC1vC2,C2vC1. AZOIQknowledge base(short: KB)K = (A,T)consists of a finite setA(called ABox) of assertions and a finite setT (calledTBox) of GCIs.

The semantics ofZOIQis defined via interpretationsI = (∆II)composed of a non-empty set ∆I called thedomain of I and a function·I mapping individual names to elements of∆I, concept names to subsets of∆I, and role names to subsets of∆I×∆I. This mapping is extended to concepts, simple roles, and roles (cf. Table 1).

Then, a GCIC1 vC2issatisfiedbyIifC1I ⊆C2I, satisfaction of ABox assertions is defined in the obvious way. We say that an interpretationIsatisfiesa KBK = (A,T) (or I is amodelof K, written:I |= K) if it satisfies all axioms of A andT.K is calledsatisfiableif it has a model andunsatisfiableotherwise. FromZOIQ, we obtain ZIQby disallowing nominal concepts{o},ZOQ, by disallowing role inverses(.) andZOI by disallowing number restrictions.

(3)

Table 1: Semantics of concepts (left) and (simple) roles (right) inZOIQ. Name Syntax Semantics

top > I

bottom

nominal {o} {oI}

negation ¬C I\CI intersection C1uC2 CI1C2I union C1tC2 CI1C2I

univ. restriction ∀T.C {x| ∀y.(x, y)TIyCI} exist. restriction ∃T.C {x| ∃y.(x, y)TIyCI} qualified number 6nS.C {x|#{y∈CI|(x, y)∈SI} ≤n}

restriction >nS.C {x|#{y∈CI|(x, y)∈SI} ≥n}

Selfconcept ∃S.Self {x|(x, x)SI}

Name Syntax Semantics universal role > I×I bottom role

inverse role P {(x, y)|(y, x)PI} intersection S1S2 SI1S2I

union S1S2 SI1S2I difference S1\S2 SI1\S2I concatenation T1T2 T1IT2I

Kleene star T S

i≥0(TI)i concept test id(C) {(x, x)|xCI}

Queries. In queries, we use variables from a countably infinite set V. A Boolean positive two-way regular path query(P2RPQ) is an expressionq=∃x.ϕ, whereϕis a positive formula (i.e., one using∧and∨) over expressions of the formT(s, t)orC(t), whereT is a role,C is concept, andsandtare elements ofx∪NI. AP2RPQis a conjunctive two-way regular path query(C2RPQ) if it does not use disjunction. It is a union of conjunctive two-way regular path queries(UC2RPQ) if it is a disjunction of C2RPQs.3Avariable assignmentZforIis a mappingV→∆I. Forx∈V, we set xI,Z := Z(x); forc ∈NI, we setcI,Z :=cI.T(s, t)evaluates to true underZand Iif(sI,Z, tI,Z)∈TI.C(t)evaluates to true underZandIiftI,Z∈CI. AP2RPQ q=∃x.ϕissatisfiedbyI(written:I |=q) if there is a variable assignmentZ(called match) such thatϕevaluates to true underIandZ. AP2RPQqisentailedby a KBK (written:K |=q) if every model ofKsatisfiesq. Ifquses only signature elements ofK, we call it a queryoverK.

Simplifications. AP2RPQissimplifiedif all its atoms are of the formT(x, y)where xandyare variables andT is built from role names using∪,◦, and. For any KBK andP2RPQq, we can construct in polynomial time a KBK0 and simplifiedP2RPQq0 such thatK |=qiffK0|=q0. Hence from here on, we will focus on simplified queries.

Automata. For each atomT(x, y)of a simplifiedP2RPQ,T is a regular expression overNR. Every such expression can be represented by a nondeterministic finite au- tomatonA= (Σ,Q,I,F,T)with an alphabetΣ⊆ {P, P|P∈NR} of possibly in- verted role names, a finite set Qof states, initial statesI ⊆ Q, final statesF ⊆ Q, and transitionsT ⊆ Q×Σ×Q.4 The size ofAis polynomially bounded byT. In the following, we assume queries in automaton form, where the atomsT(x, y)have been replaced by the correspondingA(x, y). GivenA = (Σ,Q,I,F,T), let A = (Σ,Q,F,I,T)withΣ={P|P ∈Σ} be the corresponding reverse automa- ton5with initial and final states swapped and the state transitions flipped:(q0, R,q)∈

3As∃x.W

iϕi≡W

i∃x.ϕi, the order of disjunction and existential quantification is irrelevant.

4The correspondence is such that(δ, δ0) ∈ TI iff there is a role pathδ−P−→ · · ·1P−→k δ0 inI such thatAacceptsP1. . . Pk.

5For convenience, we consider(P)identical toP.

(4)

Tfor every(q, R,q0)∈T. Moreover, we obtain the “state-parameterized” automaton Aq,q0 = (Σ,Q,{q},{q0},T)fromAby setting the initial state toqand the final toq0.

3 A Little Bit of Model Theory

The “well-behavedness” of certain sublogics of ZOIQcan be conveniently charac- terised in model-theoretic terms.

Definition 1 (quasi-forest model, [5]).LetKbe a KB. A modelIofKis aquasi-forest modelif:

– the domain∆IofI is a forest, i.e., a prefix-closed subset ofRoots×Nfor some finite setRoots,

– Roots={oI|ois an individual name inK}, and

– for everyδ, δ0∈∆Iwith(δ, δ0)∈PIfor some role nameP ∈NR\ {>}, either (i) {δ, δ0} ∩Roots6=∅, or (ii)δ=δ0,or (iii)δis a child ofδ0, or (iv)δ0is a child ofδ.

A KBKhas thequasi-forest model property(short: qfmp) ifKis either unsatisfiable or it has a quasi-forest model. A DLLhas the qfmp if everyLKBKhas the qfmp.

For such well-behaved KBs, tight and quite decent complexity bounds for satisfiability checking have been established.

Theorem 1 ([6]).Satisfiability checking ofZOIQKBs having the qfmp isEXPTIME- complete even with binary encoding of number restrictions.

When it comes to query answering, a useful notion for relating interpretations to each other is via homomorphisms. For two interpretationsI= (∆II)andJ = (∆JJ), a functionh:∆I →∆J is ahomomorphismfromItoJ if for allδ, δ0 ∈∆I holds δ∈ AI ⇒h(δ)∈AJ for all concept namesA, and(δ, δ0)∈PI ⇒(h(δ), h(δ0))∈ PJ for all role namesP. If a homomorphism fromItoJ exists, we writeIBJ. We note thatI |=qandIBJ implyJ |=qfor every simplifiedP2RPQq.

Definition 2 (qfhcp, [3]).A KBKhas thequasi-forest homomorphism-cover property (short: qfhcp) if for every modelIofK, there exists a quasi-forest modelJ ofKwith J BI. A description logicLhas the qfhcp if everyLKBKhas the qfhcp. We lettame ZOIQdenote the set of allZOIQKBs having the qfhcp.

Obviously, the qfhcp implies the qfmp. Note that unrestrictedZOIQdoes not even have the qfmp. The most expressive syntactically defined sublogics of tameZOIQ known today areZIQ,ZOQ, andZOI[6,2]. From the fact that the set of models for anyP2RPQis closed under homomorphisms now follows that for any tameZOIQKB withK 6|=q, there must exist a quasi-forest modelI, satisfyingI |=KandI 6|=q.

4 Annotating (Partial) Query Matches

In the following, we consider a tameZOIQKBK and a simplified P2RPQq over K in automaton form. We note thatq can be rewritten into an equivalent disjunction

(5)

W

1≤i≤nqiofn(also simplified)C2RPQs, wherenmay be exponential in the size ofq, but every singleqi’s size is polynomial. Let[q]denote{q1, . . . qn}. Then, testingK 6|=q amounts to determining ifKhas a modelIwhich is a simultaneous countermodel (i.e., I 6|=qi) for allqi∈[q].

We now give an intuitive description of our technique to detect (or refute) matches ofq. Given some modelIofK, we iteratively, deterministically annotate all its domain elementsδwith fresh concept namesQMthat indicate which “parts”M of some query qimatch intoIand howδparticipates in these partial matches. To this end, we employ descriptionsM of query parts which contain information about (i) the query variables matched (ii) (the existence of) paths realising certain state transitions in the query’s au- tomata and (iii) optionally, the “position” of the currentδin the query match by use of a marker•acting like an additional, distinguished query variable. In the annotation process, theQMs will be assigned to domain elementsδbased on their local environ- ment and known annotationsQM0 for “smaller” partial queries to the same element δor its direct (role) neighborhood. As an exception, non-localised query matchesM not containing•will be “broadcast”, i.e., uniformly attached to all domain elements.

This way, in the annotation process,QMs for larger and larger partial queries are as- signed, until finally (after reaching the unique fixpoint) also the full matches for anyqi are recognised by annotationsQqi. This process will accurately detect partial and full query matches, ifIis a quasi-forest model (which is sufficient for our purposes asK has the qfhcp). The annotation process is realised by virtue of a KBKq, the size of which is exponential inq, but only polynomial inK. In the following, we will stepwise introduce the KB, interleaved with some necessary definitions.

Definition 3 (partial query).Consider aC2RPQqi={A1(x1, y1),...,Am(xm, ym)}.6 Apartial queryforqiand a setX ⊆ {x1, y1, . . . , xm, ym}of variables fromqiis a set M consisting of

– all atomsA(x, y)fromqifor which{x, y} ⊆X,

– for everyA(x, y)fromqiwithx∈ X buty 6∈ X, one of the atomsAq,q0(x,•)or Aq,q0(x, o)whereqis initial inA,

– for everyA(x, y)fromqi withx 6∈ X buty ∈ X, one of the atomsAq,q0(y,•)or Aq,q0(y, o)whereqis final inA,

– in case X = ∅, exactly one atom of the form Aq,q0(•, o) or Aq,q0(•, o) (called nominal-anchored path) orAq,q0(•,•) orAq,q0(•,•) (called round trip) for some Afromqi.

A partial query is called monodic, if |X| = 1. A partial query is called localif it contains•andglobalotherwise.

Nominal-Anchored Paths. First, we want to detect role paths that start in the to-be- annotated individual, end in named individualoI, and realise state transitions in one of the query’s automata. Letobe any individual name inK, letA±be eitherAorAfor any automatonAoccurring inqwith statesq,q0,q00. Then letKqcontain the axiom

Q{A±

q,q(•,o)}(o) (1)

6In cases, we may equate aC2RPQqiwith the set of its atoms.

(6)

and wheneverA±has a transition(q, R,q0):

∃R.Q{A±

q0,q00(•,o)}vQ{A±

q,q00(•,o)} (2)

Round Trips. Next, we are concerned with paths which start and end in the to-be- annotated individual. Assuming an automatonAwith statesq,q0,q00,q000and transition setTas well as an individual nameo, we add the axioms:

> vQ{Aq,q(•,•)} (3) Q{A

q,q0(•,•)}uQ{A

q0,q00(•,•)}vQ{A

q,q00(•,•)} (4)

Q{A

q,q0(•,o)}uQ{A

q00,q0(•,o)}vQ{A

q,q00(•,•)} (5)

for(q, R,q0)∈T: ∃R.SelfvQ{A

q,q0(•,•)} (6)

for{(q, R,q0),(q00, R0,q000)} ⊆T:

∃(R∩R0−).Q{Aq0,q00(•,•)}vQ{Aq,q000(•,•)} (7) Initialising monodic partial queries. Next we determine domain elements to which separate query variables could possibly be mapped in a match.

Definition 4. We call a monodic partial queryM original, wheneverq = q0 for ev- eryAq,q0(x,•) ∈M and everyAq,q0(x,•)∈M. ForM an original monodic partial query forqand{x}, the setPrecM of precondition conceptsconsists of:

– Q{A

q,q0(•,o)}for each atomAq,q0(x, o)inM, – Q{A

q,q0(•,o)}for each atomAq,q0(x, o)inM, – F

qinitial q0final

Q{Aq,q0(•,•)}for each atomA(x, x)inM.

Now we can initialise all original monodic partial queriesM by adding the following axioms toKq:

lPrecM vQM (8) Updating partial queries. Partial queries can be updated by roundtrips: Let M be a partial query containing someA±q,q0(x,•). Then we realise the update by adding the following axioms toKq:

QMuQ{A±

q0,q00(•,•)}vQM\{A±

q,q0(x,•)}∪{A±q,q00(x,•)} (9) Furthermore, a partial query can “progress” when making a “step” in the model, moving from the consideredδto some role neighbour. In such a step, all the “unready” paths (corresponding to atoms with•) must be updated in a synchronous manner. Given a partial queryM forqi and nonemptyX as well as a set {R1, . . . , Rk} of (possibly inverted) role names, assumeM0 can be obtained fromM by replacing each atom of the formA±q,q0(x,•)by an atomA±q,q00(x,•)such thatA± has a transition(q0, R,q00) for for anyR∈ {R1, . . . , Rk}. Then add the following axiom to the KBKq:

∃(R1 ∩R2 ∩. . .∩Rm).QM vQM0 (10)

(7)

Joining partial queries. When two partial queries corresponding to the same query qi“meet” in a domain elementδ, they can – under certain circumstances – be “glued together” into a “bigger” partial query ofqi.

Definition 5 (joinable, join).LetM1be a partial query forqi andX1and letM2be a partial query forqiandX2. We callM1andM2joinableif

– X16=∅,X26=∅,X1∩X2=∅, and

– for eachA(x, y)∈qi\(M1∪M2)with{x, y} ⊆X1∪X2, there are statesq,q0,q00of Awithqinitial andq00final, such that either{Aq,q0(x,•),Aq00,q0(y,•)} ∈M1∪M2 or{Aq,q0(x, o),Aq00,q0(y, o)} ∈M1∪M2for someo∈NI.

For joinableM1andM2, theirjoin, denotedM1 ./ M2, is the partial query obtained fromM1∪M2by replacing any pair of atomsAq,q0(x,∗)andAq00,q0(y,∗)byA(x, y), where∗is either an individual nameoor•.

We implement the join operation for every pairM1,M2of joinable partial queries for aqi∈[q]by extendingKq with:

QM1uQM2 vQM1./M2 (11) Broadcasting of global partial queries. Whenever a partial queryM does not have any occurrences of•, which would tie it to a specific element, it will be “broadcast” to all domain elements:

∃>.QM vQM (12)

This concludes the definition ofKq. Revisiting our initial intuition ofKq’s purpose of deterministically annotating a modelI of K, the following lemma formalises this by singling out one unique annotated modelIfor everyI.

Lemma 1. LetKbe aZOIQKB andqa simplifiedP2RPQ. For every modelI ofK there exists a unique modelIofK ∪ Kq which extendsIand satisfiesQIM⊆QJMfor everyQM inKqand for every modelJ ofK ∪ Kqthat extendsI.

We will call modelsIfrom Lemma 1Q-minimal. Note that for every modelI ofK, any concept membershipδ∈QIMholding in the correspondingQ-minimal modelI can be derived from concept and role memberships inI through a finite sequence of

“applications” of axioms fromKq.7

5 Reduction to Satisfiability

Now we establish technical results relating the presence ofQM-annotations in models and the actual semantic satisfaction of the corresponding partial queryM. Note that any partial queryM can be seen as a (set representation of a)C2RPQin automaton form, assuming•is just an ordinary variable.

7To see this, it may help to realise that all axioms ofKq can be easily expressed in monadic Datalog.

(8)

Definition 6 ((tight) satisfaction of a partial query).Given an interpretation I, a domain elementδ ∈ ∆I, and a partial queryM, we sayM is satisfiedor holdsin δ, writtenI, δ|=M, if there is a matchZ forM inI withZ(•) =δ.M issatisfied or holds tightlyinδ, writtenI, δ|=|=M, ifZ is such that everyA±q1,q2(x,•) ∈ M is realized by a pathZ(x);δnot containingoIfor any individual nameo.

Note that|=and|=|=coincide wheneverM is global or does not contain variables. Note also that, as a consequence of this definition, ifM is global, thenQM holds (tightly) everywhere or nowhere throughout the domain. With this notion in place, the next two lemmas can be seen as soundness and completeness results regarding the deduction calculus for (partial) query matches embodied byKq.

Lemma 2 (soundness). LetIbe aQ-minimal model ofK ∪ Kqand letδ∈∆I. Let M be any partial query for anyC2RPQqi∈[q]. Thenδ∈QIM impliesI, δ|=M. Lemma 3 (completeness).LetI be a quasi-forest model ofK ∪ Kq and letδ∈ ∆I. LetM be any partial query for anyC2RPQqi ∈[q]. ThenI, δ|=|=M impliesδ∈QIM. Soundness is proven by induction over the length of the derivation forδ∈QIM. Com- pleteness is shown by induction over thespreadof the match forM, i.e., the number of variables (except•) plus the sum of the lengths of all paths realising the query atoms.

Thanks to these correspondences, we can essentially rule out models with matches of anyq1, . . . , qnby forcing the extensions ofQq1, . . . , Qqnto be empty. Therefore, let K¬q =K ∪ Kq∪ {Qqi v ⊥ |qi∈[q]}. (13) We establish some syntactic and semantic properties ofK¬q: Some easy calculations yield size bounds forK¬q, tameness is not affected by the reduction, and unsatisfiability ofK¬qindeed coincides with query entailment.

Fact 4 (size ofK¬q) The size ofK¬q is single exponential in the total size ofKandq.

It is polynomial, if the number of atoms inqis bounded by a constant.

Lemma 5. IfKis in tameZOIQthen so isK¬q.

Theorem 2. LetKbe a tameZOIQKB and letqbe a simplifiedP2RPQoverK. Then K |=qif and only ifK¬q is unsatisfiable.

Proof. We show the equivalent statement:K 6|=qif and only ifK¬qsatisfiable.

“only if”: AssumeK 6|=q, that is there exists anIwithI |=KandI 6|=q. Consider I, theQ-minimal model ofK ∪ Kq extendingI. If someQIqiwere nonempty, then there would be a query match ofqintoI by Lemma 2, contradicting our assumption.

Therefore,I|=Qqiv ⊥for allqi∈[q]and hence,I|=K¬q.

“if”: AssumeK¬qis satisfiable and hence has a modelI. AsK¬qhas the qfmp thanks to Lemma 5, we can assumeIto be a quasi-forest model.I |=Qqiv ⊥implies empti- ness ofQIq

i for every qi ∈ [q]. If I |= q held true, witnessed byI |= qi for some qi ∈ [q], Lemma 3 would requireδ ∈QIqi for allδ∈ ∆I, leading to a contradiction.

HenceI 6|=q. On the other hand,I |=K¬qimpliesI |=K, thereforeK 6|=q.

(9)

Based on this theorem, we can now prove our central result:

Theorem 3. Checking entailment of simplifiedP2RPQs from tameZOIQKBs with binary number encoding is2EXPTIME-complete. It isEXPTIME-complete if the num- ber of atoms in the query is bounded.

Proof. By Theorem 2, entailment of a simplifiedC2RPQqfrom a tameZOIQKBK can be reduced to checking unsatisfiability ofK¬q, which can be computed in output- polynomial time and the size of which is exponential (polynomial, if number of atoms in qis bounded). By Theorem 1, (un)satisfiability ofK¬q can be checked in EXPTIMEin the size ofK¬q, hence we obtain the desired 2EXPTIMEand EXPTIMEupper bounds.

The matching lower bounds are well-known even for much weaker logics [9].

6 Querying GC

2

Databases

We now consider the problem ofP2RPQentailment fromGC2 databases, i.e., sets of unary and binary ground facts in the presence of a sentence in the guarded two-variable fragment with counting quantifiers as defined by Pratt-Hartmann ([10]). We will show that there is a polytime query-entailment-preserving transformation fromGC2databases toZIQ KBs. This allows us to transfer the results from Theorem 3. Matching lower bounds follow from known results for much weaker logics and query formalisms (such asCQs overALCI[7]).

Definition 7 (GC2, [10]).For a signatureSof nullary (N0), unary (NC), and binary (NR∪ {≈}) predicates,8letGC2be the smallest set of formulae that contains all atoms over Susing only variables from{x, y}, that is closed under Boolean combinations, and that contains

– all formulae∃u.ϕand∀u.ϕwithu∈ {x, y}wheneverϕis aGC2formula with at most one free variable, and

– all formulae∀u.(γ → ϕ)as well as Q

u.(γ∧ϕ)and Q

u.(γ)whereγ is a binary atom (called “guard”) containing bothxandy,ϕis aGC2formula, and Q

is any of

∃,∃6n,∃=n, or∃>n.

AGC2database DBis a theory{Φ} ∪ Aconsisting of aGC2sentenceΦand a finite set Aof unary and binary ground facts over constants fromNI.9 We denote byGC2qef the set ofGC2formulae which are quantifier-free and do not use≈.

AP2RPQover aGC2database is aP2RPQwhere for every atom of the formC(t)holds C∈NCand for every atom of the formT(s, t), every subexpression of the formid(C) satisfiesC ∈NC.P2RPQentailment fromGC2databases is defined in the same way as for DLs.

For convenience, we make use of a special normal form forGC2introduced in the following.

8As unary and binary predicates correspond to concept and role names, respectively, we use the same set symbols to denote them.

9Note that ABoxes, as defined before, are syntactically and semantically identical to finite sets of unary and binary ground facts, so we consider the notions the same and use the same symbol A. Note also that by definition,Φdoes not contain constants.

(10)

Lemma 6 (normal form, [11]).For anyGC2sentenceΦ, one can compute in polyno- mial time aGC2sentenceNF(Φ)of the form

∀xα∧V

1≤h≤`∀x∀y(Eh(x, y)→βh∨x≈y))

∧V

1≤i≤m∀x∃=niy(Fi(x, y)∧x6≈y), (14) withα∈ GC2qef not usingy,{E1, ... , E`, F1, ... , Fm} ⊆NR,{β1, ... , β`} ⊆ GC2qef, and{n1, ... , nm} ⊆N, such that (i)NF(Φ)|=Φand (ii) every model ofΦwith at least maxhnhelements can be extended to a model ofNF(Φ).

Given aGC2database DB={Φ} ∪ A, we letNF(DB) ={NF(Φ)} ∪ A. Thanks to the previous lemma and the known property that models ofGC2sentences are closed under disjoint self-union, the normal form can be shown to be query-entailment-preserving.

Lemma 7. For everyP2RPQqover aGC2databaseDBholdsDB|=qif and only if NF(DB)|=q.

We next provide a query-entailment-preserving polynomial translation ofGC2databases intoZIQKBs.

Definition 8. LetΨ be aGC2 sentence in normal form as in Lemma 6. We define the ZIQTBoxTΨ as follows:

TΨ =T∪ Tps∪ Tα∪ [

1≤h≤l

TEhh∪ [

1≤i≤m

TFi,ni (15)

Thereby (introducing fresh concept namesP ropP and role namesRandREh,at):

– T={> v61R.>, > v ∃R.Self}

– Tps={P ropP ≡ ∃>.P ropP |P ∈N0} – Tα={> vtr(α)}, where

tr(P) =P ropP

tr(P(x)) =P tr(P(x, x)) =∃P.Self

tr(¬ϕ) =¬tr(ϕ) tr(ϕ∧ψ) =tr(ϕ)utr(ψ) tr(ϕ∨ψ) =tr(ϕ)ttr(ψ) – TEhh=S

at∈atoms(βh)(axEh(at)∪ {> v ∀(REh,at\Eh).⊥})

∪ {> v ∀(Eh\(R∪trEhh))).⊥}

whereaxEhmaps atoms inβhto sets of GCIs as follows:

P 7→ { ∃REh,P.> vP ropP, ∃(Eh\REh,P).> v ¬P ropP} P(x) 7→ { ∃REh,P(x).> vP, ∃(Eh\REh,P(x)).> v ¬P } P(x, x)7→ { ∃REh,P(x,x).> v ∃P.Self, ∃(Eh\REh,P(x,x)).> v ¬∃P.Self}

P(y) 7→ { ∃RE

h,P(y).> vP, ∃(Eh\RE

h,P(y)).> v ¬P } P(y, y)7→ { ∃RE

h,P(y,y).> v ∃P.Self, ∃(Eh\RE

h,P(y,y)).> v ¬∃P.Self}

P(x, y)7→ { ∃(REh,P(x,y)\P).> v ⊥, ∃(Eh∩P\REh,P(x,y)).> v ⊥ } P(y, x)7→ { ∃(RE

h,P(y,x)\P).> v ⊥, ∃(Eh∩P\RE

h,P(y,x)).> v ⊥ } andtrEh mapsGC2qef formulae to simple roles:

trEh(at) =REh,at trEh(¬ρ) =Eh\tr(ρ)

trEh(ρ∧ρ0) =trEh(ρ)∩trEh0) trEh(ρ∨ρ0) =trEh(ρ)∪trEh0)

(11)

– TFi,ni ={ > v 6ni(Fi\R).> u >ni(Fi\R).> }

Given aGC2databaseDB={Φ} ∪ A, we letKDBdenote theZIQKB(TNF(Φ),A).

Intuitively,Taxiomatises equality,Tpsintroduces synchronised concept namesP ropP

for every propositional symbolPfrom DB, sinceZIQdoes not allow for nullary pred- icates.Tα,TEhh, andTFi,niimplement the conjuncts fromNF(Φ).

We can now observe thatKDBcan be computed in polytime wrt. the size of DB and that it has the qfhcp (since it is aZIQKB). Moreover we can show that any model of KDBcan be extended to one ofNF(DB)and, conversely, any model ofNF(DB)can be extended to one ofKDB. With the help of these correspondences and Lemma 7, we can then argue that for anyP2RPQqover DB holds DB|=qiffKDB|=q. These findings allow us to apply Theorem 3 to establish the following result.

Theorem 4. CheckingP2RPQentailment fromGC2databases is2EXPTIME-complete.

It isEXPTIME-complete if the number of query atoms is bounded by a constant.

7 Conclusion

We have established tight complexity bounds for expressive querying in very expressive DLs under the assumption of succinct (i.e. binary) encoding of number restrictions. Ar- guing along the lines of Calvaneseet al.[6], we can leverage our findings to strengthen their results on query containment as well as theSRfamily as follows:10

Theorem 5 (query containment).Testing query entailmentK |=q0 ⊆qis in2EXP- TIME with respect to the total size ofq0,q, andK (with binary encoding of number restrictions) if (i)Kis inZOQorZOI andq0andqareP2RPQs overK, or (ii)Kis inZIQ,q0is a conjunctive query andqis aP2RPQoverK. The complexity drops to EXPTIMEif the number of atoms occurring inqis bounded by a constant.

Theorem 6 (SRfamily).DecidingK |= qandK |=q0 ⊆qis in3EXPTIMEin the total size ofq0,q,K(with binary encoding of numbers) – and in2EXPTIMEifK’s RBox is given by defining regular expressions for the non-simple roles – if (i)Kis inSROQ or SROI and q0,qare P2RPQs overK, or (ii)Kis inSRIQ,q0 is a conjunctive query andqis aP2RPQoverK. The complexities are in2EXPTIMEand inEXPTIME, respectively, if the number of atoms occurring inqis bounded by a constant.

There are plenty of open questions left for future work. First, the results established here hold for arbitrary models, however, so far, very little is known about finite (model) query answering in DLs from theZ family and the methods applied here do not seem to easily carry over to the finite model setting. Second, while the extension of our query formalism by nestings in the sense of Bienvenuet al.[1] seems straightforward without impacting complexities, our technique seems not readily extendable to capture more elaborate query languages [14,4,12].

10For space reasons, we assume the reader to be familiar with the notions used in these theorems.

(12)

References

1. Meghyn Bienvenu, Diego Calvanese, Magdalena Ortiz, and Mantas Simkus. Nested regular path queries in description logics. In Chitta Baral, Giuseppe De Giacomo, and Thomas Eiter, editors, Proc. 14th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR’14), pages 218–227. AAAI Press, 2014.

2. Piero A. Bonatti, Carsten Lutz, Aniello Murano, and Moshe Y. Vardi. The complexity of enriched mu-calculi.Logical Methods in Computer Science, 4(3:11):1–27, 2008.

3. Pierre Bourhis, Markus Krötzsch, and Sebastian Rudolph. How to best nest regular path queries. InProc. 27th Int. Workshop on Description Logics (DL’14), volume 1193 ofCEUR Workshop Proceedings, pages 404–415. CEUR-WS.org, 2014.

4. Pierre Bourhis, Markus Krötzsch, and Sebastian Rudolph. Reasonable highly expressive query languages. In Qiang Yang and Michael Wooldridge, editors, Proc. 24th Int. Joint Conf. on Artificial Intelligence (IJCAI’15), pages 2826–2832. AAAI Press, 2015.

5. Diego Calvanese, Thomas Eiter, and Magdalena Ortiz. Answering regular path queries in expressive description logics: An automata-theoretic approach. InProc. 22nd AAAI Conf. on Artificial Intelligence (AAAI’07), pages 391–396. AAAI Press, 2007.

6. Diego Calvanese, Thomas Eiter, and Magdalena Ortiz. Regular path queries in expressive description logics with nominals. In Craig Boutilier, editor,Proc. 21st Int. Joint Conf. on Artificial Intelligence (IJCAI’09), pages 714–720. IJCAI, 2009.

7. Carsten Lutz. The complexity of conjunctive query answering in expressive description logics. In Alessandro Armando, Peter Baumgartner, and Gilles Dowek, editors,Proc. 4th Int. Joint Conf. on Automated Reasoning (IJCAR’08), number 5195 in LNAI, pages 179–

193. Springer, 2008.

8. Magdalena Ortiz, Sebastian Rudolph, and Mantas Simkus. Query answering is undecidable in DLs with regular expressions, inverses, nominals, and counting. INFSYS Research Report 1843-10-03, TU Vienna, 2010.

9. Magdalena Ortiz and Mantas Simkus. Reasoning and query answering in description logics.

In Thomas Eiter and Thomas Krennwallner, editors,Proc. 8th Int. Reasoning Web Summer School (RW’12), volume 7487 ofLecture Notes in Computer Science, pages 1–53. Springer, 2012.

10. Ian Pratt-Hartmann. Complexity of the guarded two-variable fragment with counting quan- tifiers.J. Log. Comput., 17(1):133–155, 2007.

11. Ian Pratt-Hartmann. Data-complexity of the two-variable fragment with counting quantifiers.

Inf. Comput., 207(8):867–888, 2009.

12. Juan L. Reutter, Miguel Romero, and Moshe Y. Vardi. Regular queries on graph databases.

Theor. Comp. Sys., 61(1):31–83, July 2017.

13. Sebastian Rudolph. Undecidability Results for Database-Inspired Reasoning Problems in Very Expressive Description Logics. In Chitta Baral, James P. Delgrande, and Frank Wolter, editors,Proc. 15th Int. Conf. on Knowledge Representation and Reasoning (KR’16), pages 247–257. AAAI Press, 2016.

14. Sebastian Rudolph and Markus Krötzsch. Flag & check: Data access with monadically de- fined queries. In Richard Hull and Wenfei Fan, editors,Proc. 32nd Symposium on Principles of Database Systems (PODS’13), pages 151–162. ACM, 2013.

Références

Documents relatifs

In addition, Erickson and Siau [14] provided further support for the idea of practical complexity by conducting a Delphi study aimed at identifying a kernel of

: Weighted polynomial approximation in rearrangement in- variant Banach function spaces on the whole real line; To appear in Indian J.. PEETRE, J.: Nouvelles propri6ti6s

More generally, one says that a variety of dimension n, defined by a system of polynomial equations, is rational if its points (the solutions of the system) can be

Hence, it remains to check whether the query is analogously satisfied (i.e., same foldings must exist) by, roughly speaking, checking whether the variable mappings have been

But if statement Sh fails to make sense, it seems that statement Sf does not make sense either, according to Condition1: for the terms “brain” and “left arm” occupy the

Colorie le dessin en respectant l’orthographe des

L’accès aux archives de la revue « Rendiconti del Seminario Matematico della Università di Padova » ( http://rendiconti.math.unipd.it/ ) implique l’accord avec les

Does it make a difference whether the parcel fronts upon the body of water or if the body of water is contained within or passes through the