Tractable Extensions of the Description Logic EL with Numerical Datatypes

(1)

Tractable Extensions of the Description Logic E L with Numerical Datatypes

Despoina Magka, Yevgeny Kazakov, and Ian Horrocks Oxford University Computing Laboratory

Wolfson Building, Parks Road, OXFORD, OX1 3QD, UK {despoina.magka,yevgeny.kazakov,ian.horrocks}@comlab.ox.ac.uk

1 Introduction and Motivation

Description logics (DLs) [1] provide a logical foundation for modern ontology languages such as OWL¹ and OWL 2 [2]. EL⁺⁺ [3] is a lightweight DL for which reasoning is tractable (i.e., can be performed in time that is polynomial w.r.t. the size of the input), and that offers sufficient expressivity for a num- ber of life-sciences ontologies, such as SNOMED CT [4] or the Gene Ontology [5]. Among other constructors, EL⁺⁺ supports limited usage of datatypes. In DL, datatypes (also called concrete domains) can be used to define new concepts by referring to particular values, such as strings or integers. For example, the conceptHuman⊓ ∃hasAge.(<,18)⊓ ∃hasName.(=,“Alice”) describes hu- mans, named “Alice”, whose age is less than 18. Datatypes are described first by the domain their values can come from and also by the relations that can be used to constrain possible values. In our example, (<,18) refers to the domain of natural numbers and uses the relation “<” to constrain possible values to those less than 18, while (=,“Alice”) refers to the domain of strings and uses the relation “=” to constrain the value to “Alice”.

In order to ensure that reasoning remains polynomial,EL⁺⁺ allows only for datatypes which satisfy a condition calledp-admissibility [3]. In an nutshell, this condition ensures that the satisfiability of datatype constraints can be solved in polynomial time, and that concept disjunction cannot be expressed using datatype concepts. For example, if we were to allow both≤and≥for integers, then we could express A ⊑ B⊔C by formulating the axioms A⊑ ∃R.(≤,5),

∃R.(≤,2)⊑B and ∃R.(≥,2)⊑C. Similarly, we can show that p-admissibility does not allow for both<(or>) and =. For this reason, the EL Profile of OWL 2, which is based onEL⁺⁺, admits only equality (=) in datatype expressions.

In this paper, we demonstrate how these restrictions can be significantly relaxed without loosing tractability. As a motivating example, consider the following two axioms which might be used, e.g., in a pharmacy-related ontology:

Panadol⊑ ∃contains.(Paracetamol⊓ ∃mgPerTablet.(=,500)) (1) Patient⊓ ∃hasAge.(<,6)⊓

∃hasPrescription.∃contains.(Paracetamol⊓ ∃mgPerTablet.(>,250))⊑ ⊥ (2)

1 http://www.w3.org/2004/OWL

79

(2)

Axiom (1) states that the drug Panadol contains 500 mg of paracetamol per tablet, while axiom (2) states that a drug that contains more than 250 mg of paracetamol per tablet must not be prescribed to a patient younger than 6 years old. The ontology could be used, for example, to support clinical staff who want to check whether Panadol can be prescribed to a 3-year-old patient. This can easily be achieved by checking whether concept (3) is satisfiable w.r.t. the ontology:

Patient⊓ ∃hasAge.(=,3)⊓ ∃hasPrescription.Panadol (3) Unfortunately, this is not possible using EL⁺⁺, because axioms (1) and (2) involve both equality (=) and inequalities (<, >), and this violates the p- admissibility restriction. In this paper we demonstrate that it is, however, possible to express axioms (1) and (2) and concept (3) in a tractable extension of EL. A polynomial classification procedure can then be used to determine the satisfiability of (3) w.r.t. the ontology by checking if adding an axiom

X⊑Patient⊓ ∃hasAge.(=,3)⊓ ∃hasPrescription.Panadol for some new concept nameX would entailX ⊑ ⊥.

Our idea is based on the intuition that equality in (1) and (3) serves a different purpose than inequalities do in (2). Equality in (1) and (3) is used to state afact (the content of a drug and the age of a patient) whereas inequalities in (2) are used to trigger arule (what happens if a certain quantity of drug is prescribed to a patient of a certain age). In other words, equality is used positively and inequalities are used negatively. It seems reasonable to assume that positive usages of datatypes will typically involve equality since a fact can usually be precisely stated. On the other hand, it seems reasonable to assume that negative occurrences of datatypes will typically involve equality as well as inequalities since a rule usually applies to a range of situations. In this paper, we make a fine-grained study of datatypes inELby considering restrictions not only on the kinds of relations included in a datatype, but also on whether the relations can be used positively or negatively. The main contributions of this paper can be summarised as follows:

1. We introduce the notion of aNumerical Datatype with Restrictions (NDR) that specifies the domain of the datatype, the datatype relations that can be used positively and the datatype relations that can be used negatively.

2. We extend theELreasoning algorithm [3] to provide a polynomial reasoning procedure for an extension ofELwith NDRs, where the procedure is sound for any NDR.

3. We introduce the notion of asafe NDR, show that every extension ofELwith a safe NDR is tractable and prove that our reasoning procedure is complete for any safe NDR.

4. Finally, we provide a complete classification of safe NDRs for the cases of natural numbers, integers, rationals and reals. Notably, we demonstrate that the numerical datatype restrictions can be significantly relaxed by allowing arbitrary numerical relations to occur negatively—not only equality as cur- rently specified in the OWL 2 EL Profile. As argued earlier, this combination

(3)

Table 1.Concept descriptions inEL^⊥(D)

Name Syntax Semantics

Concept name C C^I

Top ⊤ ∆^I

Bottom ⊥ ∅

Conjunction C⊓D C^I∩D^I

Existential restriction ∃R.C {x∈∆Î| ∃y∈∆Î : (x, y)∈RÎ∧y∈CÎ} Datatype restriction ∃F.r {x∈∆Î| ∃v∈ D: (x, v)∈FÎ∧r(v)}

is of particular interest to ontology engineering, and is thus a strong candidate for the next extension of the EL Profile in OWL 2.

2 Preliminaries

In this section we introduce EL^⊥(D), an extension of EL^⊥ [3] with numerical datatypes. In the DL literature datatypes are better known as concrete domains [6]; we call them datatypes to be more consistent with OWL 2 [2]. The syntax ofEL^⊥(D) uses a set ofconcept names NC, a set ofrole names NRand a set of feature names NF.EL^⊥(D) is parametrised with anumerical domainD ⊆R(R is the set of real numbers).NC,NRandNF are countably infinite sets and, additionally, pairwise disjoint. We call (s, y), wheres∈ {<,≤, >,≥,=}andy∈ D, a D-datatype restriction (or simply adatatype restriction if the domainDis clear from the context). Given aD-datatype restrictionr= (s, y) and anx∈ D, we say that xsatisfies rand we writer(x) iff (x, y)∈s, wheres∈ {<,≤, >,≥,=}

andsis interpreted as the standard relation on real numbers. Table 1 recursively defines concepts in EL^⊥(D), where C and D are concepts, R ∈ NR, F ∈ NF

andr is aD-datatype restriction. An axiomα(in EL^⊥(D)) is an expression of the formC ⊑D, whereC and D are concepts. An (EL^⊥(D)−)ontology O is a set of axioms. A concept E is said to positively (negatively) occur in an axiom C⊑Diff it occurs inD(C). An interpretation ofEL^⊥(D) is a pairI = (∆Î,·Î), where∆Î is a non-empty set, thedomain of the interpretation, and·Î is thein- terpretation function. The interpretation function maps each A ∈NC to a set AÎ ⊆ ∆Î, each R ∈ NR to a relationRÎ ⊆ ∆Î×∆Î and eachF ∈NF to a relationFÎ⊆∆Î× D. Note that we do not require the interpretation of features to be functional. In this respect, they correspond to the data properties in OWL 2 [2]. The constructors of EL^⊥(D) are interpreted as indicated in Table 1. An interpretation I satisfies an axiomα=C⊑D iffCÎ⊆DÎ (writtenI |=α). If I |=αfor everyα∈ O, thenI is a model of O(writtenI |=O). If every model I ofO satisfies the axiomαthen we say thatO entails αand we writeO |=α.

We define thesignature of an ontologyOas the setsig(O) of concept, role and feature names that occur in O. We say that an axiom in EL^⊥(D) is in normal form if it has one of the forms:A⊑B^′ (NF1),A1⊓A2 ⊑B (NF2),A⊑ ∃R.B (NF3),∃R.B⊑A(NF4),A⊑ ∃F.r(NF5) or∃F.r⊑A(NF6), whereA,A1,A2, B ∈N_C^⊤, B^′ ∈N_C^⊤,⊥, R∈NR, F ∈NF and ris aD-datatype restriction. The

(4)

normalization procedure is the same as for the EL⁺⁺ [3]; more details can be found in the technical report [7]. (N_C^⊤=NC∪ {⊤},N_C^⊤,⊥=NC∪ {⊤,⊥}).

3 Numerical Datatypes with Restrictions

In this section we introduce the notion of a Numerical Datatype with Restric- tions (NDR) which specifies which datatype relations can be used positively and negatively. We then present a sound polynomial consequence-based classification procedure forEL^⊥ extended with NDRs. Finally we prove that the procedure is complete if the NDR satisfies special safety requirements.

Definition 1 (Numerical Datatype with Restrictions).Anumerical datatype with restrictions (NDR)is a triple(D, O+, O₋), where D ⊆Ris a numerical domain andO+, O₋⊆ {<,≤, >,≥,=}is the set of positiveand, respectively, negative relations. An axiom inEL^⊥(D)is an axiom inEL^⊥(D, O+, O₋)if for every positive (negative) occurrence of a concept ∃F.(s, y) in the axiom,s∈O+

(s∈O₋). An EL^⊥(D, O+, O₋)-ontology is a set of axioms inEL^⊥(D, O+, O₋).

We are going to describe a classification procedure for EL^⊥(D, O+, O₋), which is closely related to the procedure for EL⁺⁺ [3]. In order to formulate inference rules for datatypes we need to introduce notation for satisfiability of a datatype restriction and implication between datatype restrictions. For two D-datatype restrictionsr+ andr₋, we writer+ →_Dr₋ iffr+(x) impliesr₋(x),

∀x ∈ D. We write r+ →_D ⊥ iff there is no x ∈ D such that r+(x) holds. In the opposite cases, we writer+9D r₋ andr+9D ⊥. We assume that deciding whetherr+→_D r₋ andr+→_D ⊥can be done in polynomial time. It is easy to see that this is the case whenDis the set of natural numbers, integers, reals or rationals for the set of relations{<,≤, >,≥,=}.

The classification procedure for EL^⊥(D) takes as an input an ontology O whose axioms are inEL^⊥(D) and in normal form and applies the inference rules in Table 2 to derive new axioms of the formNF1, NF3andNF5. The rules are applied to already derived axioms and use existence of axioms in O, r+ →_D ⊥ orr+→_D r₋ as side-conditions. The procedure terminates when no new axiom can be derived. It is easily checked that the procedure runs in polynomial time (there are only polynomially many possible axioms of the form NF1, NF3 and NF5oversig(O)) and that the rules in Table 2 are sound (the conclusions of the rules are logical consequences of their premises).

The completeness proof is based on the canonical model construction similarly as forEL⁺⁺ [3]. In order to deal with datatypes in the canonical model we introduce a notion of a datatype constraint. Intuitively, a constraint specifies which datatype restrictions should hold in a given element of the model and which should not.

Definition 2 (Constraint). A constraint over (D, O+, O₋) is defined as a pair of sets (S+, S₋), such that S+ = {(s¹₊, y1), . . . ,(sⁿ₊, yn)} with sⁱ₊ ∈ O+, S₋ ={(s¹₋, z1), . . . ,(s^m₋, zm)} with s^j₋ ∈ O₋, yi, zj ∈ D, (sⁱ₊, yi) 9D (s^j₋, zj)

(5)

Table 2.Reasoning rules inEL^⊥(D) (A, B, C, E∈N_C^⊤,C^′∈N_C^⊤,⊥,R∈NR,F∈NF)

IR1 A⊑A ^IR2 A⊑ ⊤ ^CR1

A⊑B

A⊑C^′ B⊑C^′∈ O

CR2 A⊑B A⊑C

A⊑D B⊓C⊑D∈ O ^CR3 A⊑B

A⊑ ∃R.C B⊑ ∃R.C∈ O

CR4 A⊑ ∃R.B B⊑C

A⊑D ∃R.C⊑D∈ O ^CR5 A⊑ ∃R.B B⊑ ⊥

A⊑ ⊥

ID1 A⊑ ⊥ A⊑ ∃F.r+∈ O, r+→D⊥ ^CD1 A⊑B A⊑ ∃F.r+

B⊑ ∃F.r+∈ O

CD2 A⊑ ∃F.r+

A⊑B ∃F.r₋⊑B∈ O, r+→Dr₋

and (sⁱ₊, yi) 9D ⊥ for 1 ≤ i ≤ n, 1 ≤ j ≤ m and m, n ≥ 0. A constraint (S+, S₋) over (D, O+, O₋) is satisfiable iff there exists a solution of (S+, S₋) that is a set V ⊆ D such that every r+ ∈S+ is satisfied by at least onev ∈V but no r₋ ∈S₋ is satisfied by anyv∈V.

Our model construction procedure works only for the cases where we can ensure that every constraint over a numerical domain is satisfiable. This leads us to a notion of safety for an NDR.

Definition 3 (NDR Safety). Let (D, O+, O₋) be an NDR. (D, O+, O₋) is safeiff every constraint over(D, O+, O₋)is satisfiable.

Definition 4 (Strong and Weak Convexity). The NDR (D, O+, O₋) is strongly convex when for everyr₊ⁱ = (sⁱ₊, yi)andr^j₋= (s^j₋, zj), with sⁱ₊ ∈O+, s^j₋∈O₋ andyi,zj ∈ D(1≤i≤n,1≤j≤m), ifVn

i=1rⁱ₊→_D Wm

j=1 r^j₋, then there exists anr^j₋(1≤j≤m) such thatVn

i=1rⁱ₊→_D r^j₋.(D, O+, O₋)isweakly convexwhen the implication holds for n= 1.

For example the NDR (Z,{<, >},{=}) is weakly convex but not strongly convex. It is weakly convex since the implications ((<, y) →Z

Wm

j=1(=, zj)) and ((>, y) →Z Wm

j=1(=, zj)) never hold. However, it is not strongly convex:

it is (>,2)∧(<,5)→Z(=,3)∨(=,4), but also (>,2)∧(<,5) 9^Z (=,3) and (>,2)∧(<,5)9^Z(=,4).

Lemma 1. (D, O+, O₋)is safe iff it is weakly convex.

Proof. We assume that (D, O+, O₋) is not weakly convex and we prove that it is non-safe. Since it is not weakly convex we have that for somer+→_DWm

j=1 r^j₋

(6)

there exists no r^j₋ such that r+ →_D r^j₋. We define (S+, S₋), withS+ ={r+} and S₋ = {r^j₋}^m_j=1 and we prove that (S+, S₋) is not satisfiable. (S+, S₋) is indeed a constraint becauser+ 9D ⊥ (otherwise r+ →_D r^j₋ is true for every r^j₋) and for everyr₋^j, r+ 9D r^j₋ (otherwise r+ →_D r₋^j is true for at least one r^j₋). Additionally, it is not satisfiable, because fromr+ →_D Wm

j=1 r^j₋ there can be found noxsuch thatr+(x) andVm

j=1¬r^j₋(x).

We prove that if (D, O+, O₋) is not safe, then it is not weakly convex.

Since it is not safe then there exists a non-satisfiable constraint (S+, S₋), where S+={r₊ⁱ}ⁿ_i=1 and S₋ = {r^j₋}^m_j=1. We have S+, S₋ 6= ∅ because otherwise a solution for (S+, S₋) exists. Since (S+, S₋) is not satisfiable there exists no x for 1 ≤ i ≤ n such that rⁱ₊(x) and Vm

j=1 ¬r₋^j(x), or otherwise written, rⁱ₊ →_D Wm

j=1 r^j₋. From this and r₊ⁱ 9D r₋^j (from the constraint definition),

(D, O+, O₋) is not weakly convex. ⊓⊔

Theorem 1 (Completeness). Let (D, O+, O₋) be a safe NDR, let O be an EL^⊥(D, O+, O₋)-ontology containing axioms in normal form and let O^′ be the saturation of Ounder the rules of Table 2. For every A,B∈(N_C^⊤∩sig(O)), if O |=A⊑B, thenA⊑B∈ O^′ orA⊑ ⊥ ∈ O^′.

Proof. The proof is analogous to the completeness proof for theEL⁺⁺ language [3]; we build a canonical modelI forO usingO^′ and show that ifA6⊑B ∈ O^′ andA6⊑ ⊥ ∈ O^′ thenI2A⊑B.

For everyA∈NC,F ∈NF, defineS+(A, F) andS₋(A, F), as follows:

S+(A, F) ={r+|A⊑ ∃F.r+∈ O^′, A⊑ ⊥∈ O/ ^′} (3) S₋(A, F) ={r₋ | ∃F.r₋⊑B∈ O, A⊑B /∈ O^′} (4) We now show that (S+(A, F), S₋(A, F)) is a constraint over (D, O+, O₋). First we prove that r+ 9D ⊥, ∀r+∈S+(A, F), which is true because otherwise due to rule ^ID1 it would be A ⊑ ⊥ ∈ O^′, in contradiction to the definition of S+(A, F). Additionally, there is no r+∈S+(A, F) and r₋ ∈ S₋(A, F) such that r+ →_D r₋, otherwise from A ⊑ ∃F.r+ ∈ O^′, ∃F.r₋ ⊑ B ∈ O and ^CD2 it would be A⊑B ∈ O^′ which contradicts the definition of S₋(A, F). Since (S+(A, F), S₋(A, F)) is a constraint over (D, O+, O₋) and (D, O+, O₋) is safe, there exists a solutionV(A, F)⊆ Dof (S+(A, F), S₋(A, F)). We now construct the canonical modelI:

∆^I ={xA|A∈(N_C^⊤∩sig(O)), A⊑ ⊥∈ O/ ^′} (5)

B^I ={x_A|xA∈∆^I, A⊑B∈ O^′} (6)

R^I ={(xA, xB)|A⊑ ∃R.B∈ O^′, xA, xB ∈∆^I} (7)

F^I ={(x_A, v)|v∈V(A, F)} (8)

We prove thatI |=Oby showing thatI |=α, whenαtakes one of theNF1-NF6.

NF1A⊑B: We need to prove AÎ ⊆BÎ. Take anx∈AÎ. By (6), x=xC

such thatC ⊑A∈ O^′. From A⊑B ∈ O and since O^′ is closed under^CR1, we have C⊑B ∈ O^′. Hencex=xC∈B^I by (6).

(7)

If B = ⊥, then we need to show that AÎ = ∅. If there exists x ∈ AÎ, then by (6) x = xC such that C⊑A∈ O^′. Since O^′ is closed under ^CR1 and A⊑ ⊥ ∈ O^′, we haveC⊑ ⊥ ∈ O^′. Thus,x=xC∈/∆Îby (5), which contradicts our assumption thatx∈AÎ.

We examine separately the case whenA =⊤. We have that xA ∈∆Î and we need to show thatxA∈BÎ. From ruleÎR2, we have that A⊑ ⊤ ∈ O^′. From rule^CR1,A⊑B∈ O^′; sincexA∈∆Î andA⊑B∈ O^′ we getxA∈BÎ by (6).

NF2A1⊓A2⊑B: We prove (A1⊓A2)Î ⊆ BÎ. Take an x ∈ (A1⊓A2)Î; then, x∈AÎ₁,x∈ AÎ₂ and by (6) x=xA for some concept name A such that A ⊑ A1 ∈ O^′ and A ⊑ A2 ∈ O^′. Since A ⊑ A1 ∈ O^′, A ⊑ A2 ∈ O^′ and A1⊓A2⊑B∈ O, closure under rule ^CR2givesA⊑B∈ O^′ orx∈BÎ, by (6).

NF3A⊑ ∃R.B: We show AÎ ⊆(∃R.B)Î; take an x∈ AÎ. By (6), x=xC

where C ⊑A∈ O^′. Since A⊑ ∃R.B ∈ O andO^′ is closed under ^CR3, we have C⊑ ∃R.B∈ O^′. SincexC∈∆Î, we haveC⊑ ⊥∈ O/ ^′ and, hence,B ⊑ ⊥∈ O/ ^′ by^CR5. Thus, xB ∈∆Î and (xC, xB)∈RÎ by (7). SinceB ⊑B ∈ O^′ byÎR1, we have xB ∈BÎ by (6). Thus,x=xC∈(∃R.B)Î.

NF4∃R.B⊑A: We prove (∃R.B)Î⊆AÎ; take anx∈(∃R.B)Î. Then, there exists y ∈ ∆Î such that (x, y)∈RÎ and y ∈ BÎ. By (7) and (6) x = xC

and y =xD such that C ⊑ ∃R.D ∈ O^′ and D ⊑ B ∈ O^′ respectively. Since

∃R.B⊑A∈ O andO^′ is closed under^CR4,C⊑A∈ O^′. By (6),x=xC∈AÎ. NF5A⊑ ∃F.r+: We show that AÎ ⊆ (∃F.r+)Î; take an x ∈ AÎ. By (6), there exists a concept name C such that x = xC and C ⊑ A ∈ O^′. Since A⊑ ∃F.r+∈ O and O^′ is closed under ^CD1, we have C⊑ ∃F.r+∈ O^′. We use (3) and (4) to build (S+(C, F), S₋(C, F)); we have r+ ∈ S+(C, F). By (8) we have (xC, v) ∈ FÎ for every v ∈ V(C, F). Since r+ ∈ S+(C, F), there exists v∈V(C, F) such thatv satisfiesr+ and, hence,x=xC ∈(∃F.r+)Î.

NF6∃F.r₋ ⊑B: We prove that (∃F.r₋)Î ⊆BÎ; take an x∈(∃F.r₋)Î. By (5), there existsC∈(N_C^⊤∩sig(O)) such thatx=xC. By (3) and (4) we construct (S+(C, F), S₋(C, F)). Since xC ∈ (∃F.r₋)Î, by (8), there exists v∈V(C, F), such that r₋(v) and V(C, F) is a solution for (S+(C, F), S₋(C, F)). Hence, r₋∈/S₋(C, F), and so,C⊑B∈ O^′ by (4). ByC⊑B∈ O^′ and (6),xC ∈BÎ.

We now show that if A ⊑B /∈ O^′ and A ⊑ ⊥ ∈ O/ ^′, then O 2 A ⊑B by proving I 2 A ⊑ B (since I |= O). AÎ *BÎ holds, because xA ∈ ∆Î (from A⊑ ⊥∈ O/ ^′ and (5)),xA∈AÎ (fromA⊑A∈ O^′ using ruleÎR1and by (6)) and

xA∈/ B^I (fromA⊑B /∈ O^′ and (6)). ⊓⊔

4 Maximal Safe NDRs for N , Z , R and Q

In this section we present a full classification of safe NDRs for natural numbers (0∈N), integers, reals and rationals. Table 3 lists all maximal safe NDRs forN, Z, R and Q. Due to space constraints we present proofs only for the maximal NDRs of natural numbers, that isNDR1,NDR2,NDR9andNDR10. For these we show that: (i) they are safe (ii) extending any of them leads to non-safety and (iii) every safe NDR w.r.t.Nis contained in one of theNDR1,NDR2,NDR9orNDR10. Table 4 presents some basic transformations between (satisfiable) constraints.

(8)

Table 3.Maximal safe NDRs forN,Z,RandQwhereDis the domain andO+ ,O−

is the set of positive and, respectively, negative relations

NDR D O+ O₋

NDR¹ N,Z,R,Q {=} {<,≤, >,≥,=}

NDR² N,Z {>,≥,=} {<,≤,=}

NDR³ Z {<,≤,=} {>,≥,=}

NDR⁴ R,Q {<, >,≥,=} {<,≤,=}

NDR⁵ R,Q {<,≤, >,=} {>,≥,=}

NDR⁶ Z {<,≤, >,≥,=} {=}

NDR⁷ R,Q {<,≤, >,≥,=} {≤,=}

NDR⁸ R,Q {<,≤, >,≥,=} {≥,=}

NDR⁹ N,Z,R,Q {<,≤, >,≥,=} {<,≤}

NDR¹⁰ N,Z,R,Q {<,≤, >,≥,=} {>,≥}

Table 4.Transformations C1 ⇒C2 preserving constraints and their satisfiability for N, whereS₋,S+ andSare sets of datatype restrictions andy1≤y2,z1≤z2

C1= (S∪S+¹, S₋),C2= (S∪S+², S₋) C1= (S+, S∪S₋¹),C2= (S+, S∪S₋²)

S+¹ S+² S₋¹ S₋²

{(<, y)} {(≤, y−1)} {(<, z)} {(≤, z−1)}

{(>, y)} {(≥, y+ 1)} {(>, z)} {(≥, z+ 1)}

{(≤, y1),(≤, y2)} {(≤, y1)} {(≤, z1),(≤, z2)} {(≤, z2)}

{(≥, y1),(≥, y2)} {(≥, y2)} {(≥, z1),(≥, z2)} {(≥, z1)}

{(=, y¹),(≤, y2)} {(=, y¹)} {(=, z¹),(≤, z2)} {(≤, z2)}

{(≥, y1),(=, y2)} {(=, y2)} {(≥, z1),(=, z2)} {(≥, z1)}

{(<,0)} ∅

Lemma 2. Let C1 and C2 be as defined in Table 4 and (N, O+, O₋) be an NDR. Then (i) C1 is a constraint over (N, O+, O₋) iff C2 is a constraint over (N, O+, O₋) and (ii) if C1 and C2 are both constraints over (N, O+, O₋), then C1 is satisfiable iffC2 is satisfiable.

Corollary 1. For N, let NDRi with i = 1,2,9,10. For every C1= (S₊¹, S₋¹) over NDRi there exists a constraint C2 = (S₊², S₋²) over NDRi, y1, . . . , yn∈N andz1, . . . , zm∈Nwithm,n≥0 such that:

S₊² ⊆ {(≤, y1),(=, y2), . . . ,(=, yn−1),(≥, yn)}

S₋² ⊆ {(≤, z1),(=, z2), . . . ,(=, z_m−1),(≥, zm)}

where z1 < y1 < . . . < yn < zm, z1 < . . . < zm, yi 6= zj (2 ≤ i ≤ n−1, 2≤j≤m−1) and C1 overNDRi is satisfiable iffC2 overNDRi is satisfiable.

Lemma 3. NDR1,NDR2,NDR9 andNDR10 (all for N) are safe.

Proof. We prove safety by building a solution V for every (S+, S₋) over the NDRs; by Corollary 1 we can assume w.l.o.g. the following restrictions:

(9)

NDR1: For S+ we have that S+ ⊆ {(=, y1), . . . ,(=, yn)} and for S₋ that S₋ ⊆ {(≤, z1),(=, z2), . . . ,(=, zm−1),(≥, zm)} with z1 < y1 < . . . < yn < zm, z1< . . . < zmandyi6=zj (1≤i≤n, 2≤j≤m−1).V ={y1, . . . , yn}.

NDR2: ForS+we have thatS+⊆ {(=, y1), . . . ,(=, y_n−1),(≥, yn)}and forS₋ thatS₋⊆ {(≤, z1),(=, z2), . . . ,(=, zm)}withz1< y1< . . . < yn,z1< . . . < zm

and yi 6= zj (1 ≤ i ≤ n−1, 2 ≤ j ≤ m). V = {y1, . . . , y_n−1, y^′_n}, where y_n^′ = max(yn, zm) + 1.

NDR⁹: For S+ we have that S+⊆ {(≤, y1),(=, y2), . . . ,(=, y_n−1),(≥, yn)}

and forS₋ that S₋⊆ {(≤, z1)}withz1< y1< . . . < yn.V ={y1, . . . , yn}.

NDR10: For S+ we have that S+⊆ {(≤, y1),(=, y2), . . . ,(=, y_n−1),(≥, yn)}

and forS₋ that S₋⊆ {(≥, z1)}withy1< . . . < yn< z1.V ={y1, . . . , yn}. ⊓⊔ Lemma 4. Let NDR= (N, O+, O₋). If (a), (b) or (c), then NDRis non-safe.

(a) O+∩ {<,≤, >,≥} 6=∅,O₋∩ {<,≤} 6=∅ andO₋∩ {>,≥} 6=∅.

(b) O+∩ {>,≥} 6=∅,O₋∩ {>,≥} 6=∅ and{=} ⊆O₋. (c) O+∩ {<,≤} 6=∅ and{=} ⊆O₋.

Proof. For every of the cases (a)-(c) we provide a counterexample that violates the weak convexity condition and, thus by Lemma 1, safety:

(a): (<,3)→N(<,1)∨(≥,1) but (<,3) 9^N (<,1) and (<,3) 9^N (≥,1). The same counterexample applies when O+∩ {<,≤} 6=∅,{≤, >} ⊆O₋ and when O+∩ {<,≤} 6= ∅, {≤,≥} ⊆ O₋. For O+∩ {<,≤} 6= ∅, {<, >} ⊆ O₋ it is (<,3)→N(<,2)∨(>,1) but (<,3) 9^N (<,2) and (<,3) 9^N(>,1). A similar example can be given for the the cases whenO+∩ {>,≥} 6=∅.

(b): (>,1)→^N(=,2)∨(≥,3) but (>,1)9^N(=,2) and (>,1)9^N(≥,3) (>,1)→^N(=,2)∨(>,2) but (>,1)9^N(=,2) and (>,1)9^N(>,2) (≥,1)→^N(=,1)∨(≥,2) but (≥,1)9^N(=,1) and (≥,1)9^N(≥,2) (≥,1)→^N(=,1)∨(>,1) but (≥,1)9^N(=,1) and (≥,1)9^N(>,1) (c): (<,3)→^N(=,1)∨(=,2) but (<,3)9^N(=,1) and (<,3)9^N(= 2)

(≤,2)→^N(=,1)∨(=,2) but (≤,2)9^N(=,1) and (≤,2)9^N(= 2) ⊓⊔ Lemma 5. NDR1,NDR2,NDR9 andNDR10 (all for N) are maximal safe, that is if any relation is added toO+ orO₋ they become non-safe.

Proof. We examine all cases of adding a new relation:

NDR1: If any of the <,≤,>,≥is added to O+, thenNDR1becomes non-safe due to Lemma 4(a).

NDR2: If > or ≥is added to O₋, then non-safety is due to Lemma 4(b). For adding<or≤toO+, non-safety is due to Lemma 4(c).

NDR9: If>or≥is added toO₋, then non-safety is due to Lemma 4(a). When

= is added toO₋ thenNDR9becomes non-safe due to Lemma 4(c).

NDR10: If<or≤is added toO₋, then non-safety is due to Lemma 4(a). When

= is added toO₋ thenNDR10becomes non-safe due to Lemma 4(c). ⊓⊔ It remains to demonstrate that every safe NDR forNis contained in one of the NDR1,NDR2,NDR9orNDR10. In the following, we assume thatOⁱ₊andO₋ⁱ are defined such thatNDRi= (N, O+ⁱ, O₋ⁱ) withi= 1,2,9,10.

(10)

Lemma 6. If (N, O+, O₋) is a safe NDR, then O+ ⊆ Oⁱ₊ and O₋ ⊆ Oⁱ₋ for i= 1,2,9 or 10.

Proof. The proof is by case analysis of possible relations inO+ andO₋. Case 1: O+∩ {<,≤, >,≥}=∅. In this case,O+⊆O¹₊ andO₋⊆O₋¹.

Case 2:O+∩ {<,≤, >,≥} 6=∅. IfO₋∩ {<,≤} 6=∅ andO₋∩ {>,≥} 6=∅at the same time, then from Lemma 4(a), the NDR is non-safe. Therefore, we examine two cases: eitherO₋⊆ {>,≥,=} orO₋⊆ {<,≤,=}.

Case 2.1:O₋⊆ {>,≥,=}. We distinguish eitherO₋⊆ {>,≥}or{=} ⊆O₋. Case 2.1.1:O₋⊆ {>,≥}=O₋¹⁰andO+⊆O₊¹⁰.

Case 2.1.2: {=} ⊆ O₋. By Lemma 4(c) it should be O+ ⊆ {>,≥,=} = O²₊ otherwise the NDR is non-safe. IfO₋∩ {>,≥} 6=∅then the NDR is non-safe by Lemma 4(b); otherwise O₋={=} ⊆O²₋.

Case 2.2:O₋⊆ {<,≤,=}=O₋². IfO+⊆ {>,≥,=}, thenO+ ⊆O²₊. Otherwise, O+∩ {<,≤} 6=∅and we distinguish whetherO₋⊆ {<,≤}or{=} ∈O₋. Case 2.2.1:O₋⊆ {<,≤}=O₋⁹ andO+⊆O₊⁹.

Case 2.2.2:{=} ∈O₋. In this case, the NDR is non-safe by Lemma 4(c). ⊓⊔ For the cases of integers, reals and rationals the proofs are analogous to the case of natural numbers. The interested reader can find details in the technical report [7]. In the following, we provide a brief explanation for the results. We notice two new maximal safe NDRs w.r.t.Z, namelyNDR3andNDR6. The reason is that integers do not have a minimal element such as 0 in the case of naturals.

In particular positive occurrences of<(or≤) and negative occurrence of = are no longer dangerous (e.g. (≤,1) 9^Z (=,1)∨(=,0) does not hold anymore).

Reals and rationals are examples of dense domains: between every two different numbers there always exists a third one. This property is responsible for new safe NDRs. Specifically,O+ ofNDR2andNDR3can be extended with <and>

respectively because the weak convexity property which did not apply forZnow applies for R(e.g. (<,5) 9^R(=,4)∨(≤,3)). For the same reason, either≤ or

≥can be added toO₋ ofNDR⁶ (e.g. (≤,5)9^R(=,5)∨(≤,4)).

5 Related Work and Conclusions

Datatypes have been extensively studied in the context of DLs [3, 6, 8]. Exten- sions of expressive DLs with datatypes have been examined in depth [6] with the main focus on decidability. Baader, Brandt and Lutz [3] formulated tractable extensions ofELwith datatypes using ap-admissibility restriction for datatypes.

A datatypeDisp-admissible if (i) satisfiability and implication of conjunctions of datatype restrictions can be decided in polynomial time, and (ii)Dis convex:

if a conjunction of datatype restrictions implies a disjunction of datatype restrictions then it also implies one of its disjuncts [3]. In our case instead of condition (i) we require that implication and satisfiability of just datatype restrictions (not conjunctions) is decidable in polynomial time since we do not consider functional features. Condition (ii) is relaxed to the requirement of safety for NDRs since we take into account not only the domain of the datatypes and the types of

(11)

restrictions but also the polarity of their occurrences. The relaxed restrictions allow for more expressive usage of datatypes in tractable languages, as demon- strated by the example given in the introduction. Furthermore, Baader, Brandt and Lutz did not provide a classification of datatypes that arep-admissible; in our case we provide such a classification for natural numbers, integers, rationals and reals. The EL Profile of OWL 2 [2] is inspired by EL⁺⁺ and restricts all OWL 2 datatypes to satisfyp-admissibility in such a way that only equality can be used. Our result can allow for a significant extension of datatypes in the OWL 2 EL Profile, where in addition inequalities can be used negatively.

Our work is not the only one where the convexity property is relaxed without losing tractability. It has been shown [8] that the convexity requirement is not necessary provided that (i) the ontology contains only concept definitions of the form A ≡C, where A is a concept name, and (ii) every concept name occurs at most once in the left-hand side of the definition. In some applications this requirement can be too restrictive since it disallows the usage of general concept inclusion axioms (GCIs), such as the axiom (2) given in the introduction, which do not cause any problem in our case.

In this work we made a fine-grained analysis of extensions of EL with numerical datatypes, focusing not only on the types of relations but also on the polarities of their occurrences in axioms. We made a full classification of cases where these restrictions result in a tractable extension for natural numbers, integers, rationals and reals. One practically relevant case for these datatypes is when positive occurrences of datatype expressions can only use equality and negative occurrences can use any of the numerical relations considered. This case was motivated by an example of a pharmacy-related ontology and can be proposed as a candidate for a future extension of the OWL 2 EL Profile. For the cases where the extension is tractable, we provided a polynomial sound and complete consequence-based reasoning procedure, which can be seen as an extension of the completion-based procedure forEL. We think that the procedure can be straightforwardly extended to accommodate other constructors inEL⁺⁺

such as (complex) role inclusions, nominals, domain and range restrictions and assertions since these constructors do not interact with datatypes [9]. We hope to investigate these extensions in future works.

In future work we also plan to consider other OWL datatypes, such as strings, binary data or date and time, functional features, and to try to extend the consequence-based procedure for HornSHIQ [10] with our rules for datatypes.

For example, to extend the procedure with functional features, we probably need a notion of “functional safety” for an NDR that corresponds to the strong convexity property (see Definition 4). In order to achieve even higher expressivity for datatypes we shall study how to combine different restrictions on the datatypes occurring in an ontology so that tractability is preserved. For example, using two safe NDRs in a single ontology may result in intractability, as is the case for NDR1andNDR6for integers (see Table 3). One possible solution to this problem is to specify explicitly which features can be used with which NDRs in order to separate their usage in ontologies.

(12)

References

1. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F.:

The Description Logic Handbook: Theory, Implementation, and Applications. 2nd edn. Cambridge University Press (2007)

2. Grau, B.C., Horrocks, I., Motik, B., Parsia, B., Patel-Schneider, P.F., Sattler, U.:

OWL 2: The next step for OWL. J. Web Sem.6(4) (2008) 309–322

3. Baader, F., Brandt, S., Lutz, C.: Pushing theELEnvelope. In: IJCAI, Professional Book Center (2005) 364–369

4. Cote, R., Rothwell, D., Palotay, J., Beckett, R., , Brochu, L.: The systematized nomenclature of human and veterinary medicine. Technical report, Northfield, IL:

College of American Pathologists (1993)

5. The Gene Ontology Consortium: Gene Ontology: Tool for the unification of biology.

Nature Genetics25(2000) 25–29

6. Lutz, C.: Description Logics with Concrete Domains-A Survey. In: Advances in Modal Logic, King’s College Publications (2002) 265–296

7. Magka, D., Kazakov, Y., Horrocks, I.: Tractable extensions of the description logic EL with numerical datatypes. Technical report, Oxford University Com- puting Laboratory (2010) posted on http://web.comlab.ox.ac.uk/isg/people/

despoina.magka/publications/reports/NDRTechnicalReport.pdf.

8. Haase, C., Lutz, C.: Complexity of Subsumption in theELFamily of Description Logics: Acyclic and Cyclic tboxes. In: ECAI. Volume 178., IOS Press (2008) 25–29 9. Baader, F., Brandt, S., Lutz, C.: Pushing theEL envelope further. In: In Pro- ceedings of the OWLED 2008 DC Workshop on OWL: Experiences and Directions.

(2008)

10. Kazakov, Y.: Consequence-driven Reasoning for Horn SHIQ Ontologies. In:

IJCAI. (2009) 2040–2045