• Aucun résultat trouvé

Role-depth Bounded Least Common Subsumer in Prob-EL with Nominals

N/A
N/A
Protected

Academic year: 2022

Partager "Role-depth Bounded Least Common Subsumer in Prob-EL with Nominals"

Copied!
19
0
0

Texte intégral

(1)

Role-depth bounded Least Common Subsumer in Prob-EL with Nominals

Andreas Ecke1?, Rafael Pe˜naloza1,2??, and Anni-Yasmin Turhan1? ? ?

1 Institute for Theoretical Computer Science, Technische Universit¨at Dresden

2 Center for Advancing Electronics Dresden {ecke,penaloza,turhan}@tcs.inf.tu-dresden.de

Abstract. Completion-based algorithms can be employed for comput- ing the least common subsumer of two concepts up to a given role-depth, in extensions of the lightweight DLEL. This approach has been applied also to the probabilistic DL Prob-EL, which is variant ofELwith sub- jective probabilities. In this paper we extend the completion-based lcs- computation algorithm to nominals, yielding a procedure for the DL Prob-ELO01c .

1 Introduction

The least common subsumer (lcs) is a reasoning service that generalizes a set of input concept descriptions into a single concept description that subsumes all the input concepts and is the least one w.r.t. subsumption. This inference has turned out to be quite useful for a number of applications, like the definition of similarity measures for concept descriptions [5, 7], the bottom-up construction of knowledge bases [4], information retrieval, and more (see [6, 16]).

In particular, for large biomedical ontologies the lcs can be used effectively to aid construction and maintenance. Many of these biomedical ontologies, notably SNOMED CT [12]and the Gene Ontology [1], are written in the EL-family of lightweight description logics.

An interesting extension of EL that still admits tractable reasoning is the use of nominals. Nominals basically allow to characterize a concept in terms of specific individuals. Nominals are also admitted in the OWL 2 EL profile of OWL [14] and thus are interesting to practical applications. For the EL-family of DLs there exist completion-based classification algorithms [3] that compute all the subsumption relations between all named concepts and nominals in an ontology in polynomial time. Kazakov et al. showed in [11] that the original completion algorithm is indeed incomplete forELO and introduced a complete

?Supported by DFG Graduiertenkolleg 1763 (QuantLA).

?? Partially supported by DFG within the Cluster of Excellence ‘cfAED’

? ? ?

Partially supported by the German Research Foundation (DFG) in the Collaborative Research Center 912 “Highly Adaptive Energy-Efficient Computing”.

(2)

consequence-based classification algorithm for this DL. This algorithm is a vari- ant of the completion-based algorithm, with the additional benefit of exhibiting a pay-as-you-go behavior, and thus allows for efficient implementation.

Another extension ofEL is Prob-EL [13], which allows the modeling of un- certain knowledge by introducing probabilistic constructors. Prob-ELuses sub- jective (or Type-2 [10]) probabilities, which correspond to degrees of belief and are interpreted using a multiple-world semantics. For example, in Prob-ELone can express that obese people are likely to have high pressure, without requiring every obese person to be hypertense, using the GCI

ObesevP≥0.9∃hasCondition.HighPressure.

A completion algorithm for classifying TBoxes in the sublanguage Prob-EL01c of Prob-ELwas described in [13].

For generalEL-TBoxes with cyclic concept definitions, the lcs may not exist, as it may require an infinite nesting of existential restrictions to be expressed.

Therefore, an approximation has been introduced in [16], that limits the maxi- mal nesting of quantifiers of the resulting concept descriptions. These approxi- mations, called role-depth bounded lcs (k-lcs), can be computed forELwith role inclusions by using completion sets produced by the completion-based classifi- cation algorithm [8].

In this paper, we give a classification algorithm for the classic DLELOintro- duced in [11] in terms of the completion algorithm in order to extend thek-lcs to this DL. Furthermore, we extend the completion algorithm for (a moderate form of) subjective probabilities to nominals, which results in a classification algorithm for the DL Prob-ELO01c . In this direction, we correct a small error from the algorithm in [13] that caused it to be incomplete. From this algorithm, we develop an algorithm that computes the k-lcs for Prob-ELO01c . We show that for both lcs approximations, the resulting concept description subsumes all input concepts, and is the least one w.r.t. subsumption. Thus, if the exact lcs exists for some role-depth boundn, then the k-lcs is exact forkn. Recently, necessary and sufficient conditions for the existence of the lcs w.r.t. generalEL- TBoxes have been devised [17]. By the use of these conditions the k for which the role-depth bound lcs and the lcs coincide can be determined, if the lcs exists.

The paper is organized as follows. First, we introduce some basic notions.

In Section 3 we give a completion algorithm for EL with nominals and extend the k-lcs algorithm to this DL. The completion-based classification algorithm for Prob-ELO01c is devised in Section 4, along with the algorithm for thek-lcs for this probabilistic DL. We end with conclusions and remarks on future work.

The main proofs of Section 4 have been deferred to the appendix.

2 Preliminaries

ELO-concept descriptions are built from mutually disjoint sets NC of concept names, NRofrole names andNI ofindividual names using the syntax rule:

C, D::=> |A| {a} |CuD| ∃r.C,

(3)

Table 1.Concept constructors and TBox axioms forELO.

Syntax Semantics

named concept A (A∈NC) AII

top concept > I

nominal {a} (a∈NI) {a}I={aI}

conjunction CuD (CuD)I =CIDI

existential restriction ∃r.C (r∈NR) (∃r.C)I ={d∈I| ∃e.(d, e)∈rIeCI}

GCI CvD CIDI

whereANC,rNRandaNI. As usual the semantics ofELO-concepts are defined by means of interpretations. The semantics of named concepts and roles are extended to concept descriptions as shown in the upper part of Table 1.

An ELO-TBox consists of a finite set of general concept inclusion axioms (GCIs) of the form C vD. We use CD as an abbreviation forC vD and D vC. WithSig(T) we denote the signature of a TBoxT, i.e., the set of all concept, role, and individual names that occur inT. An interpretation is amodel for a TBox if it satisfies all its GCIs, as shown in the lower part of Table 1.

The central inference discussed in this paper is theleast common subsumer (lcs) of two concept descriptions, i.e., to compute a concept that subsumes both and is the least one w.r.t. subsumption. Since the lcs does not need to exist for generalEL-TBoxes [2], we follow the idea from [16] and compute an approxima- tion of the lcs that limits the role depth (rd(C)), i.e., the maximal nesting of quantifiers of the resulting conceptC:

Definition 1 (Role-depth bounded lcs). Let L be a DL, T be an L-TBox and k∈IN. The role-depth bounded least common subsumerof twoL-concept descriptions C1, C2 w.r.t.T (written:k-lcsT(C1, C2)) is theL-concept descrip- tionD s.t.:

1. rd(D)k,

2. C1vT D andC2vT D, and

3. for eachL-concept description E holds: C1vT E andC2vT E and rd(E)k, impliesDvT E.

For the DLs considered in this paper, thek-lcs is unique up to equivalence for a givenk; thus we speak ofthe k-lcs.

3 The k-lcs in ELO

The algorithms to compute the role-depth bounded lcs are based on completion- based classification algorithms for the corresponding DL. ForELO, a consequence- based classification algorithm is given by Kazakov et al. in [11], building upon the incomplete completion algorithm developed in [3]. The completion algorithm presented next adapts the ideas of the complete algorithm.

(4)

3.1 Completion Algorithm for Classification of ELO-TBoxes

The completion algorithms works on normalized TBoxes. We define forELOthe set ofbasic concepts for a TBoxT as

BCT = (Sig(T)∩(NCNI))∪ {>}.

AnELO-TBoxT is innormal form if every GCI contained inT is of one of the forms:

AvB, A1uA2vB, Av ∃r.B, or∃r.AvB,

withA, A1, A2, B∈BCT. AllELO-TBoxes can be transformed into normal form in linear time by applying a set of normalization rules similar to those given in [3].Before describing the completion algorithm in detail, we introduce the reachability relation R, which plays a fundamental role in the correct treatment of nominals [3, 11].

Definition 2 ( R).LetT be anELO-TBox in normal form,GNCa concept name, and D ∈BCT.G RD iff there exist roles r1, . . . , rn and basic concepts A0, . . . , An, B0, . . . , Bn ∈ BCT, n ≥ 0 such that Ai vT Bi for all 0 ≤in, Bi−1v ∃ri.Ai∈ T for all1≤in,A0 is eitherGor a nominal, andBn =D.

Informally, the concept name D is reachable from G if there is a chain of ex- istential restrictions leading to D that starts either with G or with a nominal.

This implies that, forG RD, if the interpretation ofGis not empty, then the interpretation ofDcannot be empty either. This, in turn can cause equivalence of concepts, e.g.|AI|>0 andAv {b}impliesA≡ {b}.

The basic idea of the completion algorithm forEL (without nominals) is to store all basic concepts that subsume a conceptA∈(Sig(T)∩NC)∪ {>}in its subsumer setS(A) and and all basic conceptsB for which∃r.B subsumesAin the subsumer setS(A, r). These completion sets are then extended using a set of completion rules. However, with nominals the algorithm needs to keep track of completion sets of the form SG(A) and SG(A, r) for every G ∈ (Sig(T)∩ NC)∪ {>}, since the non-emptiness of an interpretation of a concept G may imply additional subsumption relationships for A. The completion set SG(A) for A ∈ BCT therefore stores all basic concepts that subsume A under the assumption that G is not empty. Similarly SG(A, r) stores all concepts B for which ∃r.B subsumes Aunder the same assumption. For every G∈(Sig(T)∩ NC)∪ {>}, every basic concept A and every role name r, the completion sets are initialized as SG(A) = {A,>} and SG(A, r) = ∅. The completion sets are then extended by applying the completion rules adapted from [11] and shown in Figure 1 exhaustively.

It can be shown that the algorithm terminates in polynomial time, and is sound and complete for classifying the TBoxT [11]. In particular, if no rules are applicable the completion sets have the following properties.

Proposition 1 ([11]). Let T be an ELO-TBox in normal form to which the completion rules have been applied exhaustively, C, D∈BCT,r∈Sig(T)∩NR, and G = C if CNC and G = > otherwise. Then, the following properties hold:

(5)

OR1 IfA1SG(A),A1vB∈ T andB6∈SG(A), thenSG(A) :=SG(A)∪ {B}

OR2 IfA1, A2SG(A),A1uA2vB∈ T andB6∈SG(A), thenSG(A) :=SG(A)∪ {B}

OR3 IfA1SG(A),A1v ∃r.B∈ T andB6∈SG(A, r), thenSG(A, r) :=SG(A, r)∪ {B}

OR4 IfBSG(A, r),B1SG(B),∃r.B1 vC∈ T andC6∈SG(A), thenSG(A) :=SG(A)∪ {C}

OR5 If{a} ∈SG(A1)∩SG(A2),G RA2, andA2/SG(A1), thenSG(A1) :=SG(A1)∪ {A2}

Fig. 1.Completion rules forELO

CvT D iff DSG(C), and

CvT ∃r.D iff there existsE∈BCT such thatESG(C, r)andDSG(E).

We now show how to use these completion sets for computing the role-depth bounded lcs forELO-concept description w.r.t. a generalELO-TBox.

3.2 Computing the Role-depth Bounded ELO-lcs

In order to compute the role-depth bounded lcs of twoELO-conceptsC andD, we follow an idea very similar to the one presented forELR-concepts in [8], where we compute the cross product of the tree unravelings of the completion graph for C andD up to the role-depthk. Clearly, in the presence of nominals, the right completion sets need to be chosen such that they preserve the non-emptiness of the interpretation of concepts derived by R.

An algorithm that computes the role-depth boundedELO-lcs using comple- tion sets is shown in Figure 2. In the first step, the algorithm introduces two new concept namesAandB as abbreviations for the conceptsCandD, and the augmented TBox is normalized. The completion sets are then initialized and the completion rules from Figure 1 are applied exhaustively, yielding the saturated completion sets. In the recursive procedurek-lcs-rfor conceptsAandB, we first obtain all the basic concepts that subsume bothAand B from the setsSA(A) andSB(B). For every role namer, the algorithm then recursively computes the (k−1)-lcs of the conceptsA0andB0in the subsumer setsSA(A, r) andSB(B, r), i.e. for whichAvT ∃r.A0 andBvT ∃r.B0. The resulting concepts are conjoined as existential restrictions to the resultingk-lcs.

The algorithm only introduces concept and role names that occur in the original TBox T. Therefore those names introduced by the normalization are not used in the concept description for the k-lcs and an extra denormalization step as in [16, 8] is not necessary.

Notice, that for every pair (A0, B0) ofr-successors ofA andB it holds that A RA0andB RB0. Intuitively, we are assuming that the interpretation of both AandBis not empty. This in turn causes the interpretation of∃r.A0 and∃r.B0 to be not empty, either. Thus, it suffices to consider the completion setsSA(. . .)

(6)

Procedurek-lcs(C, D,T, k)

Input:C, D:ELO-concept descriptions;T:ELO-TBox;k∈IN Output:role-depth boundedELO-lcs ofC, Dw.r.t.T andk

1: T0:=normalize(T ∪ {A≡C, BD}) 2: ST :=apply-completion-rules(T0)

3: return k-lcs-r(A, B,ST, k, A, B,Sig(T)) Procedurek-lcs-r(X, Y,ST, k, A, B,Sig(T))

Input:A, B: concept names,X, Y: basic concepts withA RX, B RY;k∈IN;

ST: set of saturated completion sets;Sig(T): signature ofT Output:role-depth boundedELO-lcs ofX, Y w.r.t.T andk

1: CN :=SA(X)∩SB(Y)∩BCT

2: if k= 0then 3: return l

P∈CN

P 4: else

5: return l

P∈CN

Pu l

r∈Sig(T)∩NR

l

E∈SA(X,r), F∈SB(Y,r)

∃r.k-lcs-r E, F,ST, k−1, A, B,Sig(T)

Fig. 2.Computation algorithm for role-depth boundedELO-lcs.

andSB(. . .), without the need to additionally computeSA0(. . .) andSB0(. . .), or the completion setsSE(. . .) for any other basic conceptEencountered during the recursive computation of thek-lcs. This allows for a goal-oriented optimization in cases where there is no need to classify the full TBox.

4 The k-lcs in Prob-ELO

01c

So far, we have discussed only classic DLs that can be used to represent and reason with certain knowledge. However, it is not uncommon to encounter sit- uations where a degree of uncertainty is unavoidable. This is often the case in the medical and biological domains, where knowledge is obtained through clini- cal testing, and there might exist hidden, or not completely understood, factors affecting the outcome. For instance, we would like to be able to express that obese people arelikelyto have high pressure, without asserting that every obese personmust have high pressure.

The probabilistic logic Prob-ELwas introduced by [13] as an extension ofEL that allows to express uncertain knowledge through probabilistic concepts and roles. Here, we extend these ideas to cover also nominals. Formally, Prob-ELO concepts extend classicalELOconcepts with the constructors

P.qC and ∃P.qr.C,

where rNR, .∈ {>, <,≥,≤,=}, andq∈[0,1]. Intuitively, a concept of the formP.qCdenotes the class of all objects that belong toCwith a probability.q.

(7)

For the above example, we can use the conceptP≥0.9∃hasCondition.HighPressure to represent the class of all individuals that are likely to have high pressure.

The semantics of this logic generalizes the semantics of classicalELOby con- sidering a set of possible worlds, corresponding to a formalization of subjective (or Type 2) probabilities [10]. Formally, the semantics of Prob-ELOare based on probabilistic interpretations of the form I = (∆I, W,(Iw)w∈W, µ), where∆I is a (non-empty) domain,W is a non-empty set of (possible)worlds,µis a discrete probability distribution over W, and for every wW, Iw is a classical ELO interpretation with domain I. Additionally, for every aNI and every two worldsw, w0W, it holds thataIw=aIw0.

From a probabilistic interpretation, we can compute the probability that a given element of the domain dI belongs to the interpretation of a named conceptA, and respectively, the probability that a pair of individuals are related via a roleras follows:

pId(A) :=µ({wW |dAIw}), pId,e(r) :=µ({wW |(d, e)∈rIw}).

The functionsIwandpId are extended to general concepts through the following mutual recursion.

>Iw = I, (∃r.C)Iw= {d∈I| ∃e∈CIw.(d, e)rIw}, (CuD)Iw = CIwDIw, (P.qC)Iw= {d∈I|pId(C). q},

({o})Iw = {oIw}, (∃P.qr.C)Iw= {d∈I| ∃e∈CIw.pId,e(r). q}, pId(C) =µ({wW |dCIw}).

We say that the probabilistic interpretation I = (∆I, W,(Iw)w∈W, µ) satisfies the GCICvD if for every worldwW it holds thatCIwDIw.I is amodel of the TBoxT ifI satisfies all GCIs inT. A conceptCis subsumed by concept D w.r.t. the TBoxT (CvT D), if every modelI ofT satisfiesCvD.

Unfortunately, the probabilistic constructors increase the complexity of rea- soning, and deciding subsumption becomes intractable. In fact, as shown in [9], the problem isExpTime-complete, even if only one constructor of the formP.q

withq∈(0,1) is allowed. Moreover, the problem becomesPSpace-hard if prob- abilistic existential restrictions of the form∃P>0ror ∃P=1r are used. To regain tractability Prob-ELwas restricted in [13] to probabilistic concepts of the form P>0CorP=1Cand no probabilistic existential restrictions or roles, yielding the DL Prob-EL01c . The extension of this DL by nominals is Prob-ELO01c , which is the DL we consider for the remainder of the paper.

4.1 Completion Algorithm for Prob-ELO01c

The basic idea of the completion algorithm for Prob-ELO01c is the same as in the crisp case: to construct a canonical model of the given TBox. In order to intro-

(8)

duce it, we need to extend the notion of basic concepts to the new constructors BCT = (Sig(T)∩(NCNI))∪ {>} ∪ {P>0A, P=1A|A∈Sig(T)∩NC}.

Since probabilistic interpretations contain a set of worlds, the completion al- gorithm has to work on sets of completion sets: one for each world and, since Prob-ELO01c also contains nominals, for each basic concept. LetP0T denote the set of all probabilistic concepts of the form P>0A appearing in T. The com- pletion algorithm uses a set of worlds V := {0,1, ε} ∪ P0T; this way, for each GCI B vP>0A ∈ T the world v =P>0A serves as witness for this subsump- tion. Therefore, the Prob-ELO01c completion sets are now of the formSG(X, v) or SG(X, r, v) where X is a concept name, >, or a nominal, vV, G is a concept name or >, ∗ ∈ {0, ε} and ris a role name. The completion sets con- tainbasic concepts fromBCT. The normal form for Prob-ELO01c -TBoxes is the same as the normal form as for ELO-TBoxes, i.e. each axiom is of the form C v D, C1uC2 vD, C v ∃r.A, or ∃r.AvD, with C, C1, C2, D ∈BCT and A∈(Sig(T)∩NC)∪ {>}.

The reachability relation for Prob-ELO01c -concepts extends the one for ELO in distinguishing between concept names X and probabilistic concepts P>0X. For example, non-emptiness of Gdoes not imply non-emptiness ofP>0X, even ifG RX, e.g. in worlds with probability 0. Similarly, non-emptiness ofGdoes not imply non-emptiness ofX forG RP>0X. Therefore we introduce two kinds of reachability relation,G 0RX forG RX andG εRX forG RP>0X: Definition 3 (Reachability of Prob-ELO01c -concept descriptions). LetT be a Prob-ELO01c -TBox in normal form,GNC∪ {>}, andDNCNI.

G0RD iff there exist r1, . . . , rnNR and basic concepts A0, . . . , An where AiS0G(Ai−1, ri,0)for all1≤in, such thatA0is eitherGor a nominal andAn=D.

G εRD iff G 0RX, P>0YS0G(X,0) and there are r1, . . . , rnNR and A0, . . . , AnNC with AiSεG(Ai−1, ri, ε) for all 1 ≤ in such that A0=Y andAn=D.

When it is clear from the context which relation 0R or εR we are using, we will sometimes denote it simply as R.

Additionally, nominals interact with the set of possible worlds in a different way than normal concepts. In particular, the conceptsP>0{a} andP=1{a} are indeed equivalent to{a}, since{a}is interpreted as the singleton domain element aIin each world. This implies that alsoX vP>0{a},X vP=1{a}andXv {a}

are equivalent. In other words, whenever {a} ∈ SG(X, v), then{a} must be in SG(X, w) for allwV.

The completions setsSG(X, v) andSG(X, r, v) are initialized as follows:

S0G(X,0) ={>, X} andS0G(X, v) ={>}for allvV \ {0}, SεG(X, ε) ={>, X}andSεG(X, v) ={>}for allvV \ {ε}, S0G(X, r, v) =SεG(X, r, v) =∅for allvV.

(9)

PR1 IfC0SG(X, v) andC0vD∈ T, thenSG(X, v) :=SG(X, v)∪ {D}

PR2 IfC1, C2SG(X, v) andC1uC2vD∈ T, thenSG(X, v) :=SG(X, v)∪ {D}

PR3 IfC0SG(X, v) andC0v ∃r.D∈ T, thenSG(X, r, v) :=SG(X, r, v)∪ {D}

PR4 IfDSG(X, r, v),D0Sγ(v)G (D, γ(v)) and∃r.D0vE∈ T, thenSG(X, v) :=SG(X, v)∪ {E}

PR5 If{a} ∈SG1(X,∗1)∩SG2(D,∗2) andG2RD, thenSG1(X,∗1) :=SG1(X,∗1)∪ {P2D}

PR6 IfP>0ASG(X, v), thenSG(X, P>0A) :=SG(X, P>0A)∪ {A}

PR7 IfP=1ASG(X,0), thenSG(X,1) :=SG(X,1)∪ {A}

PR8 IfP=1ASG(X, v),v6= 0, thenSG(X, v) :=SG(X, v)∪ {A}

PR9 IfASG(X, v) andv6= 0,P>0A∈ P0T, thenSG(X, v0) :=SG(X, v0)∪ {P>0A}

PR10IfASG(X,1) andP=1Aoccurs inT, thenSG(X, v) :=SG(X, v)∪ {P=1A}

PR11If{a} ∈SG(X, v) thenSG(X, v0) :=SG(X, v0)∪ {{a}}

Fig. 3.Completion rules for Prob-ELO01c

These completion sets are then extended by applying the completion rules from Figure 3 exhaustively. The function γ : V → {0, ε} used in rule PR4 is defined byγ(0) = 0, andγ(v) =εfor allvV \ {0}. The completion rules can be divided into two groups. The rules PR1 to PR5 form the first group and are basically the same as the rules OR1 to OR5 forELO—they are used to compute all subsumption relationships between concepts inside each world. The next five rulesPR6 to PR11 handle probabilistic concepts and therefore propagate derived facts between the different worlds. For example, whenever we haveP>0Ain the subsumer set ofB, then rulePR6 will pushAinto the subsumer set of Bfor the world v = P>0A, i.e., the world v is a witness of the subsumption. Similarly, wheneverP=1Ais in the subsumer set ofB for some worldv, then rulePR7and

PR10will pushP=1Ainto the subsumer sets ofB for all other worldswand rule

PR8 will finally put A into the subsumer set of B for each world with non-zero probability (i.e. all worlds except world 0). Rule PR11 distributes nominals in subsumer sets between the worlds inV as explained earlier.

This set of completion rules extends the rules given in [13] for Prob-EL01c in two ways. First, by the rules for nominals (written as completion rules). Second, by rulePR7, which is actually necessary to achieve completeness of the completion algorithm for Prob-EL01c (without nominals). To see this, consider the following TBox:

Tex ={AvP=1B, BvC, P=1CvD}

Clearly, we haveAvTex D, however, without rulePR7, the completion algorithm is stuck withP=1BS0(A,0) and will never deriveBS0(A,1),CS0(A,1), P=1CS0(A,0) and finallyDS0(A,0).

Theorem 1. The completion algorithm for Prob-ELO01c is sound and complete.

(10)

A detailed proof is given in the appendix (see Lemmas 1 and 2). Also note that the completion algorithm for Prob-ELO01c still runs in polynomial time, since

|BCT|,|Sig(T)∩NC|, |Sig(T)∩NI|, |V|, are all linear in the size of|T |=n.

4.2 Computing the Role-depth Bounded Prob-ELO01c -lcs

The approach for computing the role-depth bounded Prob-ELO01c -lcs is similar to the classical case, where we first introduce new concept names for the input concepts, normalize the TBox, apply the completion rules, then we intersect the direct subsumers stored in the completion sets and add the cross-product of the existential restrictions of both concepts. However, in the presence of prob- abilistic concepts, we need to compute also the probabilistic direct subsumers and probabilistic existential restrictions. Therefore, this algorithm computes the cross-product of the existential restrictions three times: for the unconditional concepts, for those concepts with probability 1, and for concepts with non-zero probability. In contrast, we need to compute the intersection of the basic con- cepts only once, since wheneverXS0G(A, v) with v 6= 0, then by completion rule PR9 we have alsoP>0XS0G(A,0) and similarly, whenever XS0G(A,1) then we also have P=1XS0G(A,0) by completion rulePR10. The algorithm to compute the role-depth bounded lcs in Prob-ELO01c is described in Figure 4.

Theorem 2. LetT be a Prob-ELO01c -TBox,C andD be Prob-ELO01c -concepts andkbe a natural number. Thenk-lcs(C, D,T, k)is the role-depth bounded least common subsumer ofC andD w.r.t. T and the role-depthk.

Correctness of this algorithm for computing the role-depth bounded Prob-ELO01c - lcs follows from soundness and completeness of the completion rules which are used to generate the underlying completion sets. The full proof can be found in the appendix. As in the crisp case, the resultingk-lcs can have a size exponential inkif computed forninput concepts, but it is still polynomial in the size of the input TBoxT.

5 Conclusions

In this paper we have studied extensions of the light-weight description logic EL, that include nominals and are capable of handling uncertainty. For the DL Prob-ELO01c , we have introduced a completion algorithm that generalizes the previously known algorithm for Prob-EL01c , with correct rules for handling nom- inals, while still retaining the polynomial time complexity of classification.

Second, we described how the completion sets saturated by the completion algorithm can be combined to compute (approximations of) the lcs of two con- cepts in DLs with nominals—both for the crisp case inELOand the probabilistic case in Prob-ELO01c . In cases where the exact lcs exists, the algorithms compute the exact lcs for a sufficiently largek.

The extension ofEL+ by nominals, covers (most of) the OWL 2 EL profile, thus combining the algorithms for computing the role-depth bounded lcs inEL+

(11)

Procedurek-lcs(C, D,T, k)

Input:C, D: Prob-ELO01c -concept descriptions;T: Prob-ELO01c -TBox;k∈IN Output:k-lcs(C, D): role-depth bounded Prob-ELO01c -lcs ofC, Dw.r.t.T andk

1: T0:=normalize(T ∪ {A≡C, BD}) 2: ST :=apply-completion-rules(T0) 3: L:=k-lcs-r(A, B,ST, k, A, B) 4: return L

Procedurek-lcs-r(X, Y,S, k, A, B)

Input:A, B: concept names,X, Y: basic concepts withA RX, B RY; S: set of saturated completion sets;k: natural number

Output:k-lcs(A, B): role-depth bounded Prob-ELO01c -lcs ofX, Y w.r.t.T andk 1: CN :=d

E∈SA 0(X,0)∩SB

0(Y,0)∩BCT E 2: if k= 0then

3: return CN 4: else

5: return CNu l

r∈Sig(T)∩NR

l

(E,F)∈SA

0(X,r,0)×SB 0(Y,r,0)

∃r.k-lcs-r(E, F,S, k−1, A, B)u

l

(E,F)∈SA

0(X,r,1)×SB 0(Y,r,1)

P=1

∃r.k-lcs-r(E, F,S, k−1, A, B) u

l

(E,F)∈PRA(X,r)×PRB(Y,r)

P>0

∃r.k-lcs-r(E, F,S, k−1, A, B) ,

wherePRG(X, r) =S

v∈V\{0}S0G(X, r, v)

Fig. 4.Computation algorithm for role-depth bounded Prob-ELO01c -lcs.

[8] and inELOpresented here allows to compute generalizations in this profile.

Similarly, as in case of the lcs, an approximation of the most specific concept (msc) can be computed based on a completion algorithm, see [15]. For DLs with nominals, the completion algorithm given in this paper can be directly used for this, since an ABox can always be absorbed into the TBox in a preprocessing step using these nominals. However, since the msc in the presence of nominals is trivial (msc(a) =a), another target DL should be considered in order to yield an informative version of the msc.

Besides the msc, there exist several other non-standard inferences that have been studied for classical DLs and would be of interest in the context of subjec- tive probabilities. One of them isaxiom pinpointing, i.e. the task of discovering the precise axioms from a knowledge base that are responsible for a consequence to follow. The use of probabilities introduces a new challenge as seemingly in- nocuous axioms may interact to produce unexpected (and possibly unwanted) consequences. A further study of this problem will be a matter of future work.

(12)

References

1. M. Ashburner. Gene ontology: Tool for the unification of biology.Nature Genetics, 25:25–29, 2000.

2. F. Baader. Terminological cycles in a description logic with existential restrictions.

In G. Gottlob and T. Walsh, editors,Proc. of the 18th Int. Joint Conf. on Artificial Intelligence (IJCAI-03), pages 319–324. Morgan Kaufmann, 2003.

3. F. Baader, S. Brandt, and C. Lutz. Pushing the EL envelope. In Proc. of the 19th Int. Joint Conf. on Artificial Intelligence (IJCAI-05), Edinburgh, UK, 2005.

Morgan-Kaufmann Publishers.

4. F. Baader, R. K¨usters, and R. Molitor. Computing least common subsumers in description logics with existential restrictions. In T. Dean, editor,Proc. of the 16th Int. Joint Conf. on Artificial Intelligence (IJCAI-99), pages 96–101, Stockholm, Sweden, 1999. Morgan Kaufmann, Los Altos.

5. A. Borgida, T. Walsh, and H. Hirsh. Towards measuring similarity in description logics. InProc. of the 2005 Description Logic Workshop (DL 2005), volume 147 of CEUR Workshop Proceedings, 2005.

6. S. Brandt and A.-Y. Turhan. Using non-standard inferences in description logics

— what does it buy me? In G. G¨orz, V. Haarslev, C. Lutz, and R. M¨oller, editors, Proc. of the 2001 Applications of Description Logic Workshop (ADL 2001), num- ber 44 in CEUR Workshop, Vienna, Austria, September 2001. RWTH Aachen. See http://CEUR-WS.org/Vol-44/.

7. C. d’Amato, N. Fanizzi, and F. Esposito. A semantic similarity measure for expres- sive description logics. InProc. of Convegno Italiano di Logica Computazionale, CILC05, 2005.

8. A. Ecke and A.-Y. Turhan. Role-depth bounded least common subsumers forEL+ andELI. In Y. Kazakov, D. Lembo, and F. Wolter, editors,Proc. of the 2012 De- scription Logic Workshop (DL 2012), volume 846 ofCEUR Workshop Proceedings.

CEUR-WS.org, 2012.

9. V. Guti´errez-Basulto, J. C. Jung, C. Lutz, and L. Schr¨oder. A closer look at the probabilistic description logic prob-EL. InProceedings of Twenty-Fifth Conference on Artificial Intelligence (AAAI-11), 2011.

10. J. Y. Halpern. An analysis of first-order logics of probability.Artificial Intelligence, 46:311–350, 1990.

11. Y. Kazakov, M. Kr¨otzsch, and F. Simanˇc´ık. Practical reasoning with nominals in theELfamily of description logics. In G. Brewka, T. Eiter, and S. A. McIlraith, editors,Proc. of the 12th Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR-12), pages 264–274. AAAI Press, 2012.

12. K. M. Kudla and M. C. Rallins. Snomed: A controlled vocabulary for computer- based patient records. Journal of the American Health Information Management Association, 69:40–44, 1998.

13. C. Lutz and L. Schr¨oder. Probabilistic description logics for subjective uncertainty.

In F. Lin, U. Sattler, and M. Truszczynski, editors,Proc. of the 12th Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR-10). AAAI Press, 2010.

14. B. Motik, B. Cuenca Grau, I. Horrocks, Z. Wu, A. Fokoue, and C. Lutz. OWL 2 web ontology language profiles. W3C Recommendation, 27 October 2009. http:

//www.w3.org/TR/2009/REC-owl2-profiles-20091027/.

15. R. Pe˜naloza and A.-Y. Turhan. Towards approximative most specific concepts by completion forEL01with subjective probabilities. In T. Lukasiewicz, R. Pe˜naloza,

(13)

and A.-Y. Turhan, editors, Proceedings of the First International Workshop on Uncertainty in Description Logics (UniDL’10), 2010.

16. R. Pe˜naloza and A.-Y. Turhan. A practical approach for computing generalization inferences inEL. In M. Grobelnik and E. Simperl, editors,Proc. of the 8th European Semantic Web Conf. (ESWC’11), Lecture Notes in Computer Science. Springer, 2011.

17. B. Zarrieß and A.-Y. Turhan. Most specific generalizations w.r.t. general EL- TBoxes. In Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI’13), Beijing, China, 2013. AAAI Press. To appear.

(14)

A Omitted Proofs

A.1 Correctness of the Classification Algorithm for Prob-ELO01c Theorem 1. The completion algorithm for Prob-ELO01c is sound and complete.

We prove each of the claims in the following. Please note that the consequence- driven algorithms forELOfrom [11] and also the completion algorithm presented in Section 3 compute subsumption relationshipsunder the assumption that cer- tain concepts have a non-empty interpretation. To indicate the assumption that, sayGis not empty, when considering the subsumption relationship between C andD, we writeG:CvT D.

Lemma 1. The completion algorithm is sound, i.e.

CSG(X, v)impliesG:PXvT PvC (1) CSG(X, r, v)impliesG:PXvT Pv∃r.C (2) Proof. We show this by induction on the number of rule applications. It is easy to see that the initial subsumer sets satisfy (1) and (2). Also, after each rule application (1) and (2) will still be satisfied. We will only show (1) for the new completion rules for Prob-ELO01c , which are not in the original completion algorithm in [13]. For property (2), note that none of these new rules changes the subsumer setsSG(X, r, v).

PR5 If {a} ∈ SG1(X,∗1)∩SG2(D,∗2), then by induction hypothesis we have G:P1XvT {a} andG:P2DvT {a}. Additionally, by definition of R, G2RD implies that if G is not empty, then P2D must be not empty as well. Thus we haveG:P2DT {a} and thereforeG:P1XvP2D. This means, that the addition ofP2D toSG1(X,∗1) still satisfies (1).

PR7 If P=1ASG(X,0), then by induction hypothesis G : PX vT P=1A.

Thus the implication ASG(X,1) ⇒ G : PX vT P=1A is obviously correct, and we can addAto SG(x,1).

PR11 If{a} ∈ SG(X, v), then by induction hypothesisG:PX vT Pv{a}, i.e.

for all modelsI= (∆I, W,(Iw)w∈W, µ) ofT and all worldswW we have:

ifGis not empty, then (PX)I,w ⊆(Pv{a})I,w. Together with (P>0{a})I,w={d| ∃v∈W :µ(v)>0∧d∈ {a}I,v}

={d| ∃v∈W :µ(v)>0∧d=aI}={aI}={a}I,w and similarly (P=1{a})I,w = {a}I,w, this yields: if G is not empty, then (PX)I,w ⊆ {a}I,w ={aI} = {a}I,v0 for all v0W. Hence it holds that G:PX vT Pv0{a}. This means, that the addition of{a}to SG(X, v0) still satisfies (1).

Lemma 2. The completion algorithm is complete, i.e. for a normalized TBox T,GNC,B∈BCT, andrNR we have

GvT B implies BS0G(G,0)

GvT ∃r.Bimplies ∃A withASG0(G, r,0)andBS0G(A,0)

(15)

Proof. We assume thatB 6∈SG0(G,0) (resp. there is noAwithASG0(G, r,0) and BS0G(A,0)) and construct a modelIG of T which shows that G6vT B (resp. G6vT ∃r.B). To construct this model, we need the classes of equivalent nominals: [a] = {b ∈ Sig(T)∩NI | {a} ∈ S0G({b},0)}. The domain of the interpretation will contain all nominals (modulo equivalence) and for each world wV all concepts that are not subsumed by a nominal and can be reached fromGor a nominal using the relation R.

LetIG = (∆IG, W,(IG,w)w∈W, µ) be the following interpretation:

IG :={[a]|a∈Sig(T)∩NI} ∪

{(A, v)∈Sig(T)∩NC×V |Gγ(v)RA,{a} 6∈Sγ(v)G (A, γ(v))}

W :=V µ(0) := 0 µ(w) := 1

|W\ {0}| for allwW\ {0}

aIG= [a] for alla∈Sig(T)∩NI

To interpret concept and role names, we also need a bijectionπv(w) :WW for eachvW\ {0} withπv(v) =εandπv(0) = 0. Moreover,π0is the identity mapping onW. Then:

AIG,w={[a]|AS0G({a}, w)}

∪ {(B, v)∈IG|ASγ(v)G (B, πv(w))}

rIG,w={([a],[b])∈IG×∆IG| ∃A:AS0G({a}, r, w)∧ {b} ∈SGγ(w)(A, γ(w))}

∪ {([a],(A, w))∈IG×∆IG |AS0G({a}, r, w)}

∪ {((B, v),[b])∈IG×∆IG | ∃A:ASγ(v)G (B, r, πv(w))

∧ {b} ∈Sγ(w)G (A, γ(w))}

∪ {((B, v),(A, w))∈IG×∆IG|ASγ(v)G (B, r, πv(w))}

Before proving thatIG is indeed a model of T, we generalize the definition of AIG,w to probabilistic concepts:

XSγ(v)G (B, πv(w)) iff (B, v)∈XIG,w forX ∈BCT,(B, v)∈IG (3) XS0G({a},0) iff [a]∈XIG,w forX ∈BCT,[a]∈IG (4) To show this, we make a case distinction according to the kind of concept X is. First notice that in (3)X cannot be a nominal, since otherwiseB would be subsumed by one and hence not be in the domainIG as we assumed.

IfX =>, then (3) and (4) are true by definition of >IG,w =IG and the fact that>is in each subsumer set.

IfX =ANC, then (3) and (4) are true by definition ofAIG,w.

Références

Documents relatifs

Although the existence of a modeling FO-limit for FO-convergent sequences of graphs with bounded tree-depth follows easily from our study of FO-convergent sequence of rooted

There we show that the characterization can be used to verify whether a given common subsumer is indeed the least one and show that the size of the lcs, if it exists, is

Roughly speaking, these approaches rewrite the ontology and/or the query into another formalism, typically a union of conjunctive queries or a datalog program; the relevant answers

Although Algorithm 4 is formulated for the binary k-lcs, the implementation can handle input tuples of concept descriptions by applying the binary k-lcs to the first two concepts

The highly redundant ELH-concept descriptions obtained from the k-lcs algo- rithm, need to be simplified, in order to make the resulting concept description readable.. The general

The returned lcs-concept description should only contain concept names that appear in the initial TBox, thus we need to “de- normalize” the concept descriptions obtained from

After defining the model in which I am working, I will explore ‘hidden’ meaning construction processes occurring in one’s interiority and linked to the use of symbolic resources as

In this paper, we consider ontologies formulated as general TBoxes in the description logic EL allowing for additional role inclusion axioms of the form r v s (role hierarchy ), r ◦ r