• Aucun résultat trouvé

Learning Semantic Web rules within the framework of SHIQ+log

N/A
N/A
Protected

Academic year: 2022

Partager "Learning Semantic Web rules within the framework of SHIQ+log"

Copied!
2
0
0

Texte intégral

(1)

Learning Semantic Web rules within the framework of

SHIQ

+log

Francesca A. Lisi

Dipartimento di Informatica Università degli Studi di Bari Via E. Orabona 4, 70125 Bari, Italy

[email protected]

Floriana Esposito

Dipartimento di Informatica Università degli Studi di Bari Via E. Orabona 4, 70125 Bari, Italy

[email protected]

ABSTRACT

In this paper we face the problem of learning Semantic Web rules within a decidable instantiation of theDL+log frame- work which integrates the DLSHIQand positiveDatalog. To solve the problem, we resort to the methodological ap- paratus of Inductive Logic Programming.

Categories and Subject Descriptors

I.2 [Artificial Intelligence]: Learning

Keywords

Semantic Web rules, Inductive Logic Programming

1. INTRODUCTION

Among the many recent KR proposals for Semantic Web rules,DL+log [9] is a powerful framework for the tight in- tegration of Description Logics (DLs) [1] and disjunctive Datalog with negation (Datalog¬∨) [3]. More precisely, it extends a DL KB with weakly-safe Datalog¬∨ rules.

Note that the condition of weak safeness allows to overcome the main representational limits of the approaches based on the DL-safeness condition, e.g. the possibility of expressing conjunctive queries (CQ) and unions of conjunctive queries (UCQ), by keeping the integration scheme still decidable.

In particular, the decidability of reasoning inDL+log de- pends on the decidability of Boolean CQ/UCQ containment in theDLchosen to instantiate the framework. As far as we know, the most powerful decidable instantiation ofDL+log is currently obtained by choosingSHIQ[4] asDL.

Acquiring and maintaining Semantic Web rules is very de- manding and can be automated though partially by applying Machine Learning (ML) algorithms based on induction. In this paper we face the problem of learning Semantic Web rules by adopting SHIQ+log restricted to positive Dat- alog [2] as KR framework and Inductive Logic Program- ming (ILP) [8] as ML approach. Since ILP has been histor- ically concerned with rule induction from examples within the KR framework of Horn Clausal Logic (HCL), we refor- mulate core ILP ingredients to tackle with the hybrid DL-CL representation and reasoning ofSHIQ+log.

2. INDUCING

SHIQ

+LOG RULES

We assume that thedataare represented as aSHIQ+log KBB where the intensional partK(i.e., the TBoxT plus the set ΠRof rules) plays the role ofbackground theoryand the extensional part (i.e., the ABox A plus the set ΠF of

facts) contribute to the definition of observations. As an example, suppose we have aSHIQ+log KB (adapted from [9]) consisting of the following intensional knowledgeK:

[A1]RICHuUNMARRIEDv ∃WANTS-TO-MARRY.>

[R1]RICH(X)←famous(X),scientist(X,us)

and the following extensional knowledgeF:

UNMARRIED(Mary) UNMARRIED(Joe) famous(Mary) famous(Paul) famous(Joe) scientist(Mary,us) scientist(Paul,us) scientist(Joe,it)

that can be split intoFJoe={UNMARRIED(Joe),famous(Joe), scientist(Joe,it)},FMary={UNMARRIED(Mary),famous(Mary), scientist(Mary,us)}, andFPaul ={famous(Paul),

scientist(Paul,us)}.

ThelanguageLof hypothesesallows for the generation ofSHIQ+log rules of the form

p(X)~ ←r1(Y~1), . . . , rm(Y~m), s1(Z~1), . . . , sk(Z~k)

where m≥0,k≥0,p(X) is an atom built out of either a~ SHIQ-predicate or aDatalog-predicate, eachrj(Y~j) is an atom with aDatalog-predicate, and eachsl(Y~l) is an atom with aSHIQ-predicate. Note thatprepresents the target predicate, denoted ascifpis aDatalog-predicate and asC ifpis aSHIQ-predicate. The former case aims at inducing c(X)~ ←rules that will enrich theDatalogpart of the KB.

E.g., suppose that theDatalog-predicatehappyis the tar- get concept and the building blocks for the languageLhappy are in the set{famous/1,RICH/1,WANTS-TO-MARRY/2,LIKES/2}.

The following rules

H1happy happy(X)←RICH(X) H2happy happy(X)←famous(X)

H3happy happy(X)←famous(X),WANTS-TO-MARRY(Y,X)

(2)

belonging toLhappycan be considered hypotheses forhappy.

Note thatH3happyis weakly-safe. The latter case aims at in- ducing C(X~) ← rules that will extend the DL part (i.e., the input ontology). E.g., suppose that the target con- cept is the DL-predicate LONER. If LLONER is defined over {famous/1,scientist/2,UNMARRIED/1}, then the rules

H1LONER LONER(X)←scientist(X,Y)

H2LONER LONER(X)←scientist(X,Y),UNMARRIED(X) H3LONER LONER(X)←scientist(X,Y),famous(X)

belong toLLONERand represent hypotheses forLONER.

Anobservationoi∈Ois represented as a couple (p(a~i),Fi) whereFiis a set containing ground facts concerning the in- dividuala~i. We assumeK ∩O =∅. We say that H ∈ L covers oi ∈ O w.r.t. K iff K ∪ Fi ∪H |= p(a~i). Note that the coverage test can be reduced to query answering inSHIQ+log KBs which in its turn can be reformulated as a satisfiability problem of the KB. E.g., the hypothesis H3happy covers the observation oMary = (happy(Mary),FMary) because K ∪ FMary∪H3happy |=happy(Mary). Conversely it does not cover the observations oJoe = (happy(Joe),FJoe) and oPaul = (happy(Paul),FPaul). It can be proved that H1happycoversoMaryandoPaul, whileH2happyall the three obser- vations. With reference toLLONER, the hypothesesH1LONERand H3LONER cover the observations oMary= (LONER(Mary),FMary), oPaul= (LONER(Paul),FPaul) andoJoe= (LONER(Joe),FJoe).

Conversely,H2LONERcovers onlyoMary andoJoe.

Thegenerality orderforLis based on a relation of sub- sumption, named K-subsumption and denoted as K, be- tween SHIQ+log rules which is defined as follows. Let H1, H2∈ Lbe two hypotheses standardized apart,Ka back- ground theory, and σ a Skolem substitution1 for H2 with respect to{H1} ∪ K. We say thatH1KH2 iff there exists a ground substitution θ forH1 such that (i) head(H1)θ = head(H2)σand (ii)K ∪body(H2)σ|=body(H1)θ. Note that condition (ii) is a variant of the Boolean CQ/UCQ contain- ment problem becausebody(H2)σ and body(H1)θ are both Boolean CQs. The difference between (ii) and the origi- nal formulation of the problem is thatKencompasses not only a TBox but also a set of rules. Nonetheless this vari- ant can be reduced to the satisfiability problem for finite SHIQ+log KBs. Indeed the skolemization ofbody(H2) al- lows to reduce the Boolean CQ/UCQ containment problem to a CQ answering problem. Due to the aforementioned link between CQ answering and satisfiability, checking (ii) can be reformulated as proving that the KB (T,ΠR∪body(H2)σ∪ {← body(H1)θ}) is unsatisfiable. Once reformulated this way, (ii) can be solved by applying the algorithm NMSAT- DL+log. Thus, a procedure for testing K can be built on top of the reasoning mechanisms forSHIQ+log. E.g., it can be checked thatH1happy6KH2happyandH2happy6KH1happy, i.e. the two hypotheses are incomparable with respect toK- subsumption, and thatH1LONERKH2LONERbut not viceversa.

1Let B be a clausal theory and H be a clause. Let X1, . . . , Xn be all the variables appearing in H, and a1, . . . , an be distinct constants (individuals) not appearing inB or H. Then the substitution{X1/a1, . . . , Xn/an} is called aSkolem substitution forH w.r.t. B.

It can be proved thatK is a decidable quasi-order (i.e. it is a reflexive and transitive relation) forSHIQ+log rules.

3. CONCLUSIONS AND FUTURE WORK

Our proposal for learning Semantic Web rules adopts a de- cidable KR framework, SHIQ+log, that is the most pow- erful among the ones currently available for the tight inte- gration of DLs and CLs. We would like to emphasize that the results of this paper are valid for any other decidable in- stantiation ofDL+log with positiveDatalog. More details of our proposal can be found in [6]. Compared with related works on learning hybrid DL-CL rules [10, 5], it relies on a more expressive DL (i.e.,SHIQ), thus getting closer to the actual DLs underlying OWL. Also it can learn rules with a DL-predicate in the head, thus affecting the input ontology.

For the future we plan to define ILP algorithms starting from the ingredients identified in this paper. Also we would like to extend the proposal toSHIQ+log withDatalog¬∨. Finally, from the application side, we will consider some of the use cases for Semantic Web rules. A preliminary study of an application to ontology evolution can be found in [7].

4. REFERENCES

[1] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. Patel-Schneider, editors.The Description Logic Handbook: Theory, Implementation and Applications.

Cambridge University Press, 2003.

[2] S. Ceri, G. Gottlob, and L. Tanca.Logic Programming and Databases. Springer, 1990.

[3] T. Eiter, G. Gottlob, and H. Mannila. Disjunctive Datalog.ACM Transactions on Database Systems, 22(3):364–418, 1997.

[4] I. Horrocks, U. Sattler, and S. Tobies. Practical reasoning for very expressive description logics.Logic Journal of the IGPL, 8(3):239–263, 2000.

[5] F. Lisi. Building Rules on Top of Ontologies for the Semantic Web with Inductive Logic Programming.

Theory and Practice of Logic Programming, 8(03):271–300, 2008.

[6] F. Lisi and F. Esposito. Foundations of Onto-Relational Learning. In F. ˇZelezn´y and N. Lavraˇc, editors,Inductive Logic Programming, volume 5194 ofLecture Notes in Artificial Intelligence, pages 158–175. Springer, 2008.

[7] F. Lisi and F. Esposito. Supporting the Evolution of SHIQOntologies with Inductive Logic Programming - A Preliminary Study. InProc. Int. Workshop on Ontology Dynamics (IWOD-08), 2008.

[8] S. Nienhuys-Cheng and R. de Wolf.Foundations of Inductive Logic Programming, volume 1228 ofLecture Notes in Artificial Intelligence. Springer, 1997.

[9] R. Rosati.DL+log: Tight integration of description logics and disjunctive datalog. In P. Doherty, J. Mylopoulos, and C. Welty, editors,Proc. of 10th Int. Conf. on Principles of Knowledge Representation and Reasoning, pages 68–78. AAAI Press, 2006.

[10] C. Rouveirol and V. Ventos. Towards Learning in CARIN-ALN. In J. Cussens and A. Frisch, editors, Inductive Logic Programming, volume 1866 ofLecture Notes in Artificial Intelligence, pages 191–208.

Springer, 2000.

Références

Documents relatifs

This work-in-progress paper details how the patient and equipment transports can be optimized by learning semantic rules to avoid future delays in transport time.. Since these

We asked software engineers with little or no experience in Semantic Web software engineering (but yet sufficient programming skills) to solve a problem using RDF data sources

Following these guidelines, new ILP frameworks can be designed to deal with more or differently expressive hybrid DL-CL languages according to the DL chosen (e.g., learning Carin

SeaLife’s main objective is to develop the notion of context-based information integration by extending three existing semantic web browsers (SWB) to link the existing Web

Before we perform reasoning on the collected class hierar- chies from the Semantic Web, we must solve several funda- mental problems, e.g., which URIs identify classes and which

In particular, IRS-III following the WSMO frame- work [6], provides at the semantic level a distinction between goals (i.e. abstract definition of tasks to be accomplished) and

In particular, IRS-III following the WSMO frame- work [6], provides at the semantic level a distinction between goals (i.e. abstract definition of tasks to be accomplished) and

Machine Learning for Information Extraction from XML marked-up text on the Semantic Web..