• Aucun résultat trouvé

Towards Large scale reasoning on the Semantic Web

N/A
N/A
Protected

Academic year: 2022

Partager "Towards Large scale reasoning on the Semantic Web"

Copied!
14
0
0

Texte intégral

(1)

Web

BalázsKádár,GergelyLukásyandPéterSzeredi

BudapestUniversityofTehnologyandEonomis

DepartmentofComputerSieneandInformation Theory

1117Budapest,Magyartudósokkörútja2.,Hungary

balazskadar.biz,{lukasy,sze redi} s.b me.h u

Abstrat. Traditionalalgorithmsfordesriptionlogi(DL)instanere-

trievalareineientforlargeamountsofunderlyingdata.Asdesription

logiisbeomingpopular inareassuhastheSemantiWeb,itis very

importanttohavesystemsthatanreasoneientlyoverlargedatasets.

InthispaperwepresenttheDLogdesriptionlogi reasonerspeially

designedforsuhsenarios.

TheDLogapproahtransformsdesriptionlogiaxiomsusingthe

S HIQ

DLlanguageintoaPrologprogram.Thistransformationisdonewithout

anyknowledgeofthepartiular individuals:theyare aesseddynami-

allyduringthenormalPrologexeutionofthegeneratedprogram.This

allowsustostoretheindividualsinadatabaseinsteadofmemory,whih

resultsinbetter salability and helpsusingdesription logi ontologies

diretlyontopofexisting informationsoures.

InthispaperwefousonthedesriptionoftheDLogappliationitself.

Wepresent the arhitetureofDLog and desribe itsinterfaes. These

make it possible to use ABoxes stored indatabases and to ommuni-

ate withtheProtégé ontology editor,as aserverappliation. Wealso

evaluatetheperformaneoftheDLogdatabaseextension.

Keywords:largedatasets,desriptionlogi,reasoning, logiprogram-

ming,databases

1 Introdution

DesriptionLogis(DLs)allowustorepresentknowledgebases onsistingof

terminologialaxioms(theTBox)and assertionalknowledge(theABox).

DesriptionLogisarebeomingwidespreadasmoreandmoresystemsstart

using semantisforvariousreasons.Asan example,intheSemantiWebidea,

DLsareintendedtoprovidethemathematialbakgroundneededformoreintel-

ligentqueryanswering.Heretheknowledgeisapturedintheformofexpressive

ontologies,desribedintheWebOntologyLanguage(OWL)[1℄.This language

is mostly based on the

SHIQ

desription logi, and it is intended to be the

standardknowledgerepresentationformatoftheWeb.

However, we have tremendous amounts of information on the Web whih

alls for reasoners that are ableto eiently handle suh abundane of data.

(2)

for querying desription logi onepts in an environment where the ABox is

storedin adatabase.

Wefoundthatmostexisting desriptionlogireasonersarenotsuitablefor

this task, as these are not apableof handling ABoxesstored externally. This

is not a simple tehnial problem: most existing algorithms for querying DL

onepts need to examine the whole ABox to answera query. This resultsin

salability problems and undermines the point of using databases.Beause of

this, we startedto investigate tehniques whih allowthe separationofthein-

ferenealgorithm fromthedatastorage.

Wehavedevelopedasolution,wheretheinferenealgorithm isdividedinto

twophases. Firstwereate a query-plan in Prologfrom the atualDL knowl-

edgebase,withoutaessingtheunderlyingdataset. Subsequently,thisquery-

plan anberun onreal data, to obtainthe requiredresults.The implementa-

tion of these ideasis inorporatedin the DLog reasoningsystem, available at

http://dlog-reasoner.soureforge.net.

Inthis paper wefous on the arhiteture of the DLog system, as well as

onitsexternalinterfaes.Wedisusstheinterfaeusedforaessingdatabases,

whihallowsdesriptionlogi reasoningontopofexisting informationsoures.

WealsodesribetheProtégé[2℄interfaethatmakesitpossibletouseDLogas

thebak-endreasonerofthispopularontologyeditor.Detailsonthetheoretial

sideofDLoganbefoundin [3℄ andin[4℄.

This paper is strutured asfollows.Setion 2summarises related work.In

Setion 3wegiveageneralintrodutiontotheDLogapproahandpresentthe

arhitetureandimplementationdetailsofthesystem.ThedatabaseandProtégé

interfaesaredesribedinSetions4and5,respetively.Setion6evaluatesthe

performaneof the databaseextensionof DLogw.r.t. theversionwhih stores

theABoxasPrologfats.Finally,inSetion7,weonludewiththefuturework

andthesummaryofourresults.

2 Related work

Several tehniques have emerged for dealing with ABox-reasoning. Tradi-

tional ABox-reasoningis basedon thetableau inferene algorithm,whihtries

to build a model showing that a given onept is satisable. To infer that an

individual

i

isaninstaneofaonept

C

,anindiretassumption

¬C(i)

isadded

totheABox,andthetableau-algorithmisapplied.Ifthisreportsinonsisteny,

i

is provedto beaninstaneof

C

.Themain drawbakofthisapproahis that

it annotbe diretlyused forhigh volumeinstane retrieval, beause itwould

requirehekingallinstanesin theABox,onebyone.

To make tableau-based reasoning moreeient on large data sets, several

tehniqueshavebeendevelopedinreentyears[5℄.Theseareusedbythestate-

of-the-artDLreasoners,suh asRaerPro[6℄orPellet[7℄.

Extreme asesinvolve serious restritionson the knowledge base to ensure

eientexeutionwithlargeamountsofinstanes. Forexample,[8℄ suggestsa

(3)

aessed in a very eient way. The drawbak is that the ABox may ontain

onlyaxiomsofform

C(a)

,i.e.weannotmakeroleassertions.

Paper[9℄ disusses how a rst order theorem proversuh as Vampire an

bemodiedandoptimisedforreasoningoverdesriptionlogiknowledgebases.

Thiswork,however,mostlyfouses onTBoxreasoning.

In[10℄, aresolution-based inferenealgorithm is desribed, whih is not as

sensitiveto theinreaseof theABox size asthetableau-based methods. How-

ever, this approah still requires the input of the whole ontent of the ABox

before attemptingto answeranyqueries.TheKAON2 system[11℄ implements

thismethod andprovidesreasoningserviesoverthedesriptionlogilanguage

SHIQ

bytransformingtheknowledgebaseintoadisjuntivedatalogprogram.

Although the motivation and goals of KAON2 are similar to ours, unlike

KAON2 (1)weuse apure two-phase reasoningapproah(i.e. theABox is a-

essedonlyduring queryanswering)and(2)wetranslateintoPrologwhihhas

well-established,eientandrobustimplementations.

Artile [12℄ introdues theterm DesriptionLogi Programming.Thisidea

usesadirettransformationof

ALC

desriptionlogioneptsintodeniteHorn-

lauses, and posessomerestritions onthe form ofthe knowledge base,whih

disallowaxiomsrequiringdisjuntivereasoning.Asanextension,[13℄introdues

afragmentof the

SHIQ

languagethat an be transformedinto Horn-lauses.

Thiswork,however,still posesrestritionsontheuseofdisjuntions.

3 The DLog system

Themain ideaoftheDLogapproahis thatwetransforma

SHIQ

knowl-

edgebaseKBintorst-orderlauses

Ω(

KB

)

andfromthesewegenerateProlog

ode[3℄. Inontrastwith[11℄,alllausesontainingfuntionsymbolsareelim-

inatedduring the transformation:the resultinglauses anbe resolved further

only with ABox lauses. This forms the basis of a pure two phase reasoning

framework,whereeverypossibleABox-independentreasoningstepisperformed

beforeaessingtheABoxitself,allowingustostoretheontentoftheABoxin

anexternaldatabase.

Atually, in the general transformation, we use only ertain properties of

Ω(

KB

)

. Thesepropertiesaresatisedbyasubsetof rstorder lausesthat is,

in fat,largerthantheset oflausesthatanbegeneratedfroma

SHIQ

KB.

We allthese lauses DLlauses. Asa onsequeneof this, ourresultsan be

usedforDLknowledgebasesthataremoreexpressivethan

SHIQ

.Thisinludes

theuse ofertain roleonstrutors, suh asunion.Furthermore,someparts of

theknowledgebaseanbesuppliedbytheuserdiretlyintheformofrstorder

lauses.Moredetailsanbefoundin[3℄.

Asthelausesofa

SHIQ

knowledgebase KBarenormalrst-orderlauses

weanapplythePrologTehnologyTheoremProving(PTTP) tehnology[14℄

diretlyonthese. In[3℄wehavesimpliedthePTTPtehniques forthespeial

(4)

ompleteforDLlauses.

ThesimpliedPTTPtehniquesusedinDLoginludedeterministianestor

resolution and loop elimination. Both are appliableonly to unary prediates,

i.e.prediatesorrespondingtoDLonepts.

InthedesignoftheDLogsystemwefousonmodularity.Thisenablesusto

easilyimplementnewfeaturesandnewinterfaes.Thetoplevelarhitetureof

the systemis shown in Figure 1.In thisgure, asin subsequentguresof the

paper,retangleswithrounded ornersrepresentmodules oftheDLogsystem,

whiledataareshownasplainretangles.InFigure1theDLogreasonerisshown

within adashedretangle.

PSfragreplaements

Client

DLogreasoner

KnowledgeBaseManager

Firstphase:

translation

Seondphase:

exeution

Supportmodules

Fig.1.ThetoplevelarhitetureoftheDLogsystem.

Theuser(eitherloalorremote)aessesDLogthroughoneoftheexternal

interfaes. These interfaes range from a loal onsole to serverinterfaeslike

DIG used by theProtégé ontologyeditor. Theknowledge base manageris the

entral piee of thesystem.It oordinatesthetasks of theothermodules, and

performstheadministrationofmultipleonurrentknowledgebases.Itforwards

therequestarrivingfromtheinterfaesto thereasonermodules.

Thesupport modules onsistof severaltoolsthatare usedbymostparts of

the system. They inlude a onguration manager module, alogger, an XML

reader,arun-timesystemfortheseondphase,andseveralportabilitytoolsthat

allowDLogto rununder dierentPrologimplementations(urrentlySWI and

SICStus).

The rst phase, translation, shown in Figure 2, takes a set of desription

logi axiomsas input. These axioms are divided into two parts: the TBox or

terminology box stores onept and role inlusion axioms, while the ABox or

assertion box ontains the fatual data. The ABox may be stored (partly or

(5)

ABoxode(whihisaPrologmodule),andtheABoxsignature,whihisrequired

for translatingthe TBox. Thegeneration of ABoxodeinludes optimisations

suhasindexingonseondargumentforrolesstoredinmemory.

Next,the TBox is proessed in two steps.First the DL translator module

transformsthedesriptionlogiformulaetoaset ofDLlauses[15℄, whih are

passed onto the TBoxtranslator module that generates the exeutableTBox

ode.Thisgeneratedodeisequivalent,withrespettoinstaneretrieval,tothe

input DLknowledge base.TheTBoxtranslator module usesvarious optimisa-

tions[3℄toobtainmoreeientPrologprograms.TheABoxandTBoxodean

begenerateddiretlyintomemoryormaybesavedtodiskforlater(standalone)

use.

PSfragreplaements

TBox ABox

DL

translator

TBox

translator ABoxsignature ABox

translator

ABoxode

TBoxode

Fig.2.Therstphase:translation.

Theseond phase, exeution,shown in Figure 3,uses theABox and TBox

programs generated in the rst phase, to answer queries.There are two ways

to exeute queries: the generated TBox an be alled diretly from Prolog as

a low-level interfae, orthe Query module provides ahigh-level interfae that

providesbasisupport for omposite queriesand anaggregatethe results.In

normal operation the querymodule is alled bythe knowledge base manager,

whihforwardstheresultsto theuserinterfae.Asthequerymoduledoesnot

dependontherest ofthesystem,itmaybeusedin standaloneoperation.The

run-timesystem(shownasRTSinthegure)inludesahashtableimplemented

in Cusedto speedupthereasoning,andoptionalolletionofstatistis.

PSfragreplaements

ABoxode

TBoxode

Query

module RTS

Queries

Results

Fig.3.Theseondphase:exeution.

(6)

As the rst phase of reasoning (i.e. the generation of a query plan) only

dependsonthesignatureofthedataset,andbeauseofthetop-downinferene

ofProlog,DLoganeientlyusedatabasestostoretheABox.

There may be several advantages in using databases to store the ABox.

Firstly,thisallowsreasoningondatasetsthatannottintomemory.Seondly,

itmakesintegratingDLogwithexistingsystemseasier,asthereasoneranuse

the existing databases of other appliations. Thirdly, querying some onepts

(namely those orresponding to so-alled query prediates) may be performed

usingomplexdatabasequeries,ratherthanDLreasoning,whihisexpetedto

deliveramarkedinreasein performane.

Aprediateisaquery prediate [3℄,if itisnon-reursive,itdoesnotinvoke

itsnegation,andisnotinvokedfromwithinitsnegation.Here,aprediate

P 0

is

saidto invokeaprediate

P n

,

n ≥ 1

,ifthere are

n − 1

intermediateprediates

P 1 . . . P n − 1

, suh that

P i

is diretly invoked by

P i − 1

, i.e. it ours in a lause

bodytheheadofwhihis

P i − 1

,for

i = 1, . . . , n

.

Query prediates require neither loop elimination, nor anestor resolution

duringexeution.Thenamequeryprediate reetsthatfatthatsuhpredi-

atesanbetransformedtoomplexdatabasequeries(providedthatallonepts

and roles required are storedin asingle database). This an inreasethe per-

formane as the database enginean optimise the query using statistial and

struturalknowledgeofthedatabaseinquestion.

Wedesignedthedatabaseinterfaetobeassimpleaspossible.Thedatabases

are aessed viathe ODBC driver ofSWI-Prolog; asa onsequeneDLogan

interfae with most modern database systems. We wanted a way to speify

database aess using existing tools and interfaes suh as Protégé and the

DIGinterfaeitutilisesevenifthosedonot,atthemoment,provideawayto

speifydatabaseusage. Toaess adatabase,several piees ofinformation are

needed:thenameofthedatabase,ausername,apassword,adesriptionofwhih

tabletouseforgivenoneptsandroles,et.Beauseoftheaforementionedre-

quirements wedeided to use ABoxassertions to arrythis meta-information.

ABoxassertionsaredesriptionlogionstrutsthatarereadilyavailableinDL

systemsandinterfaes,suhasOWLandDIG.

Inorder tospeify thedatabaseaess foroneptsand roles weintrodue

new roles (objet properties), attributes (datatype properties) and individuals

dened inthenamespaehttp://www.s.bme.hu/dlogDB.

The ODBC interfae presribes that databaseonnetions are to be iden-

tied by a Data Soure Name (DSN). In DLog we introduean individual to

representagivendatabaseonnetion.Roles andoneptsarealsorepresented

byindividuals.An arbitrarynameanbeusedforsuhanindividual.

The meta data provided is used to onnet to the database, and, for eah

oneptand role,anadditionallauseisgenerated, whih, byexeutinganap-

propriatedatabasequery,listsappropriateindividuals(orpairsofindividuals).

Thisallowsoneptsandrolesto bestoredpartiallyin databasesand partially

in memory.Thismaybeveryusefulwhendevelopingontologies.

(7)

Databaseonnetionsarerepresentedbyindividualsthathavethestringat-

tributehasDSNdened.Thevalueofthisattributeisthenameofthedatasoure

(DSN).Asallothernamesinthissetion,thisnameisdenedinthenamespae

http://www.s.bme.hu/dlogDB.Additionalstringattributes,namelyhasUserName

and hasPassword,maybeusedto speifytheusernameand thepasswordfor

thegivenonnetion,ifrequired.

TheobjetpropertyhasConnetionlinks anindividual representingarole

or a onept with the database onnetion to be used for aessing it. This

makesit possible to use one data soure for one onept, and a dierent one

for another. The instane on the left hand side is the individual representing

therole oronept,while theinstane onthe righthand side isthe individual

representingtheonnetion.

Twomethodsareprovidedtospeifyhowtogetthedatafromthedatabase.

Oneisto speifyaquerythat isto bediretly exeuted onthedatabase.This

method,namedthesimpleinterfae,isprovidedbeauseofitssimpliity:itan

beappliedtodatabaseswithoutanymodiation.Howeverithastwodrawbaks:

it makes transforming query prediates to database queries very diult;

and

itperformsbadlyforinstanehekqueries.

Thelatterisalargesetbakasmostofthequeriesareinstaneheks,assuming

thetheprojetion optimisationof[3℄isused.

Thereforetheseond,preferred,wayistoprovidethenameofatableorof

aview andthe name of theolumn(s) ofthis table. This approah, alled the

omplex interfae may requirethe reation of new views in the database,but

providesmuhgreaterexibilityandbetterperformane.

TheSQL queryin thesimple interfaeisdened using the stringattribute

hasQuery.Theindividualrepresentstheroleoroneptandtheattributevalue

isthequerystring.Forindividualsrepresentingrolesthequerymustreturntwo

olumns,andforthoseusedforoneptsitmustreturnoneolumnthatontains

theindividualname.

If the omplex interfae is used, the name of the table or view to use is

speiedbythestringattribute hasTable.Thenameoftheolumn listingthe

individualsofaoneptisgivenusingthestringattributehasColumn.Forroles,

theattributeshasLHSandhasRHSareusedfortheleft andtherighthand side,

respetively.

Beause,inProtégé,individualsannotbespeiedasinstanesofanegated

onept,weprovidesomeadditionalattributes:hasNegQuery,hasNegTableand

hasNegColumn. These are used to speify the databaseaess of negated on-

epts,inawaysimilartotheirrespetivepositivepairs.Byprovidinganattribute

hasNegQueryforanamerepresentingtheonept

C

wespeifyaquerylisting

theindividualsof

¬C

.Obviously,bothhasQueryand hasNegQueryanappear asattributesof thesameindividual.

(8)

Tospeifythattheindividualoneptrepresentstheonept

C

,onesimply

hastomakeoneptaninstaneof

C

.TheDLogsystemwillhekeahonept

ourring in the ABox if it ontains an instane whih is in the namespae

http://www.s.bme.hu/dlogDB.If suh an instane is found,it is interpreted

as a handle to a databasewhih is to produe (additional) instanes for the

givenonept.

Similarly,tospeifythatanindividualrolerepresentstherole

R

,werequire

thattheuserinludesthetriple{role,

R

, indiv}intheABox.Hereindivis

anarbitraryindividual.AgainDLogwill lookforaninstaneinthenamespae

http://www.s.bme.hu/dlogDBwithin the domain (i.e. theleft hand side) of

eahrole,anduseittoonstrutadatabaseaessforthegivenrole.

Thedatabaseinterfaeisurrentlyin thealphatestphase.Webelievethat

our approah for this task, disussed above, is an intermediate solution. Ulti-

matelythestandardinterfaes,suhasDIG,shouldbeextendedtoallowstoring

(partsof)theABoxindatabases.However,wehopethatourworkontributes

toimplementingthisultimategoal.

4.2 Examples of Usingthe Database Interfae

We now present two examples for interfaing with databases, one for the

simple,andonefortheomplexinterfae.

TheexamplesontainABoxassertions,whihare displayed asRDF triples

in {subjet, prediate, objet} format. String values are shown between

quotes.Thenamespaehttp://www.s.bme.hu/dlogDB#is representedbythe

dlog:prex.

Figure4showstheuseofthesimpliedinterfaefortheABoxoftheIoaste

example. This lassialexample involvesthe onept desribinga person hav-

ingapatriidehild,who,inturn,hasanon-patriidehild.TheABoxaxioms,

whiharenowtobestoredinadatabase,desribethehasChildrelationbetween

pairsof individuals (traditionallyontaining(Ioaste, Oedipus),(Ioaste,

Polyneikes),(Oedipus, Polyneikes)and(Polyneikes, Thersandros)).The

ABoxalsospeies whihindividuals arepatriide andwhih arenon-patriide

(traditionallyOedipusisknowntobelongtotheformer,whileThersandrosto

thelatter).

Wehavehosenthenamespaerepresentedbytheio:prexforthenames

inthisontology.Thedatabaseonnetionisnamediodb,andtheorresponding

DSN is speied as "ioaste" (line 1). This onnetion is aessed without

speifyingausernameorapassword.Aordingly,iodbhasnoattributesother

thandlog:hasDSN.

Both the role hasChild and the onept Patriide are taken from this

database.TherolehasChildisrepresentedbytheinstanedlog:riohasChild.

We hose this name asa mnemoni for a role from the namespae io, alled

hasChild,butanyothernameouldhavebeenused.Line2tellsthesystemthat

this individual represents the role io:hasChild. Here, the right hand side of

theroleisofnointerest,sowehoseto havethesameindividual asontheleft

handside.Line6tellsthattheindividualdlog:ioPatriideisaninstaneof

(9)

2 {dlog:riohasChild, io:hasChild, dlog:riohasChild}

3 {dlog:riohasChild, dlog:hasConnetion, dlog:iodb}

4 {dlog:riohasChild, dlog:hasQuery,

5 "SELECT parent, hild FROM hasChild"}

6 {dlog:ioPatriide, rdf:type, io:Patriide}

7 {dlog:ioPatriide, dlog:hasConnetion, dlog:iodb}

8 {dlog:ioPatriide, dlog:hasQuery,

9 "SELECT name FROM people WHERE patriide"}

10 {dlog:ioPatriide, dlog:hasNegQuery,

11 "SELECT name FROM people WHERE NOT patriide"}

Fig.4.Anexampleofthesimplieddatabaseinterfae.

theoneptio:Patriide 1

.Thisindividual,whihthusrepresentstheonept

io:Patriide,hastwoqueriesassoiatedwithit:oneforio:Patriide(line8)

andoneforitsnegation(line10).

Thesimpliedinterfaeallowsomplexqueries,suhastheoneforPatriide

whihhasaWHERElause.Thiswaytheexistingtablepeopleanbeusedwithout

modiation. However,this approah makes it very diult to transform any

possiblequeryprediatesin the TBoxto diret databasequeries,and instane

hekqueriesrunwithapoorperformane.

Wenowpresenta seond example. The TBox of this example, takenfrom

[4℄,isshownbelow.

1

hasFriend

.

Aloholi

⊑ ¬

Aloholi

2

hasParent

. ¬

Aloholi

⊑ ¬

Aloholi

Line1desribesthatthosewhohaveafriendwhoisaloholiarenon-aloholi

(as they see a bad example), while line 2 states that those who have a non-

aloholiparentare non-aloholi(as they see agood example).Inthe lassi

form the ABox ontainsrole assertionsfor the hasParent and hasFriendre-

lations only, and no onept assertions about anyone being aloholi or non-

aloholi.Inspiteof this, in thepreseneofertain roleinstane patterns,one

aninfersomepeopleto benon-aloholi,using aseanalysis.

Forexample,onsiderthefollowingpattern:JakisJoe'sparentandalsohis

friend.Now,ifweassumethatJakisaloholi,thentheaxiominline1implies

that Joeisnotaloholi.On theotherhand,ifJakisnotaloholi,it follows

fromline2thatJoeisnotaloholi,either.Thusthesetworoleassertionsimply

that Joehasto benon-aloholi.Other patterns, whereJoeanbeinferred to

benon-aloholi,arethe following:Joeisafriendof himself; Joeis afriendof

ananestor;andJoe'stwoanestorsare inthehasFriendrelationship.

1

Notethattheprex rdf,usedintheprediatepositionofthetripleinline6,refers

totheRDFnamespae:http://www.w3.org/1999/02/22-rdf -syn tax-n s#.

(10)

usingtheomplexinterfae.Here,thedatabasealoholiisaessedwiththe

user name "drunkard" and the password "palinka" (lines 13). We assume

that anewview,alled"hasParentView",wasdenedin thedatabaseto hide

the omplex query for the role hasParent, f. lines 46. The olumns of this

view, hildand parent(lines 78), ontainthe data for the role hasParent.

FromthisinformationDLoganreateaqueryforinstaneretrieval("SELECT

hild, parent FROM hasParentView"),andthreeotherquerypatternsforthe

aseswhen at leastoneof theindividuals isknown (e.g. "SELECT hild FROM

hasParentView WHERE parent = ?").This approahallowsforthegeneration

ofomplexdatabasequeriesforthequeryprediates.

1 {dlog:aldb, dlog:hasDSN, "aloholi"}

2 {dlog:aldb, dlog:hasUserName, "drunkard"}

3 {dlog:aldb, dlog:hasPassword, "palinka"}

4 {dlog:ralhasParent, al:hasParent, dlog:ralhasParent}

5 {dlog:ralhasParent, dlog:hasConnetion, dlog:aldb}

6 {dlog:ralhasParent, dlog:hasTable, "hasParentView"}

7 {dlog:ralhasParent, dlog:hasLHS, "hild"}

8 {dlog:ralhasParent, dlog:hasRHS, "parent"}

9 {dlog:ralhasFriend, al:hasFriend, dlog:ralhasFriend}

10 {dlog:ralhasFriend, dlog:hasConnetion, dlog:aldb}

11 {dlog:ralhasFriend, dlog:hasTable, "friends"}

12 {dlog:ralhasFriend, dlog:hasLHS, "friend1"}

13 {dlog:ralhasFriend, dlog:hasRHS, "friend2"}

14 {dlog:alAloholi, rdf:type, al:Aloholi}

15 {dlog:alAloholi, dlog:hasConnetion, dlog:aldb}

16 {dlog:alAloholi, dlog:hasTable, "aloholiView"}

17 {dlog:alAloholi, dlog:hasColumn, "name"}

18 {dlog:alAloholi, dlog:hasNegTable, "nonaloholiView"}

19 {dlog:alAloholi, dlog:hasNegColumn, "name"}

Fig.5.Anexampleoftheomplexdatabaseinterfae.

InFigure5,lines1013speify thedatabaseaess fortherolehasFriend,

whilelines1419allowforaessingindividualsbelongingtotheoneptaloholi

anditsnegationthroughappropriatedatabaseviews.

5 Integrating DLog with Protégé

Protégé[2℄isanopensoureontologyeditorthatsupportstheWebOntology

Language (OWL) [1℄, and anonnet to reasoners viathe HTTP-basedDIG

interfae[16℄.TheDLogserverimplementstheDIGinterfaeandanbeusedto

exeuteinstaneretrievalqueriesissuedfromthegraphialinterfaeofProtégé.

(11)

format.Forthe implementation weusedthe HTTPserverprovidedwith SWI-

Prolog.Inimplementingtheinterfaewefaeddiulties aused bysomeam-

biguitiesoftheDIGspeiations,despitethere beingan(exat)XMLshema

denition.Anotherdiultywasthat Protégédoesnot stritlyfollowthedef-

inition of the interfae. For example it uses a learKB ommand that is not

even dened in version1.1 of DIG.In DIG 1.0, whih supportedonly asingle

database,thisommandwasdened,butProtégéusesthenewversionthatsup-

portsmultipleonurrentknowledgebases.Westroveforanimplementationas

generi and omplying to the interfaedenition as possible while, also being

ompatiblewithProtégé.

For parsingXML we use the SGML module of SWI-Prolog, whih an be

operated in an XML ompatibility mode, allowing namespaes. As this is not

a diret XML parser, it has some diulties when used in XML mode. For

example evenwith the stritestsettings and treatingall warningsaserrors, it

aeptsinputles that arenotevenwell-formedXML. Beause ofthis, andin

hopeofbetterperformane,weareplanningtoswithto ApaheXeres-C++.

With Xeres weplan to use SAX parsing, instead of DOM, with the hope of

lowermemoryusageandfasterparsing.

ThedataareextratedfromtheXMLDOMusingDeniteClauseGrammars

(DCG).

Figure 6shows the resultsof a queryissued from Protégé, as answeredby

theDLogserver.

Fig.6.SreenshotofqueryresultsinProtégéansweredbyDLog.

(12)

diultyisthatiftheresultsofaqueryontainindividualsthatarenotdened

in Protégé (i.e. individuals present only in databases) Protégé silently drops

theseindividualsfrom thelistofqueryresults.

6 Evaluation

Thissetion ontainsapreliminary performane test ofthe databaseinter-

fae.

We tried the database interfae on a large version of the Ioaste problem

whihontains5058pairsinthehasChildrelation,855instanesthatareknown

tobepatriide,and314thatareknownto benon-patriide.

Theexeutionresultsaresummarisedin Table1.The loadtimemeansthe

time it takes to load the le whih ontains the axioms, inluding the XML

parsing. The translation time is the time it takes to generate the TBox and

ABoxodefrom theaxioms,whileexeutiontimeistherun-timeofthequery.

Table 1.Comparingthein-memoryanddatabaseversionofalargeIoastetest.

(seonds) loadtranslateexeutetotal

in-memory0.88 0.53 0.02 1.43

database 0.05 0.02 0.36 0.43

WhentheABoxisstoredinmemory,thetranslationtakes1.41seonds,and

theexeutiontakesonly0.02seonds.Notethatthesegureswereobtainedwith

theindexing optimisationturnedo. Whenthis optimisationisturned on,the

number of generated ABox lauses is doubled, and translation time inreases

aordingly.

The database variant of the example enumerates all the instanes of the

queriedoneptin 0.36seonds.This, ompared totheoriginal 0.02seonds is

muh slower.However,the time wespent at ompile-time wasaltogether0.07

seonds,resultinginatotalexeutiontimeof0.43seonds.Tosumup,interms

of total query exeution time, more than a three-fold derease wasahieved,

usingthedatabaseinterfae.

FromtheabovedataitmayseemthatusingadatabaseforstoringtheABox,

whihts into memory, isbeneialonly beauseof thereduedompile-time.

However, we believe that in the ase of large data sets and omplex queries

(espeially if these ontain onepts givingrise to query prediates)exeution

timeanalsobebetterthanthat ofthein-memoryvariant.

DetailedevaluationoftheDLogSystemanbefoundin[3℄.

(13)

InthispaperwehaveshownthearhitetureoftheDLogsystem,disusseda

databaseinterfaeforrepresentinglargeABoxes,andreportedontheintegration

ofDLogwiththeProtégéontologyeditor.

Thedatabaseinterfaeisespeiallyusefulifthedatasetannottinmemory

orifitissharedwithothersystems.Usingdatabasesangreatlyredueompile

timeand,withadvanedoptimisations,itmayprovideeienysimilartothat

ofthein-memoryversion.

Futureimprovementsinludetheoptimisationofqueryprediates,bytrans-

forming them to database queries, and better integration of Protégé and the

databaseinterfae.Ourplansalso inludetheimplementationof aquerymod-

uletohandleompositequeries,andthesupportforadditionalinterfaeformats,

suhasOWL,ortheKRSSnotationusedbye.g.theRaerProengine.

Aknowledgements

Theauthorsaregrateful totheanonymousreviewersfortheirommentson

the earlier version of the paper, and espeially for reommending the Billion

TriplesChallenge forevaluation.

Referenes

1. Behhofer, S.: OWL web ontology language referene. W3C reommendation

(February2004)

2. Noy, N., Fergerson, R., Musen, M.: The knowledge model

of Protege-2000: Combining interoperability and exibility.

http://iteseer.nj.ne.om/noy01knowledge.html(2000)

3. Lukásy,G.,Szeredi,P.: EientdesriptionlogireasoninginProlog:theDLog

system.Tehnialreport,BudapestUniversityofTehnologyandEonomis(Jan-

uary2008)ConditionallyaeptedforpubliationinTheoryandPratieofLogi

Programming.

4. Lukásy, G., Szeredi, P., Kádár, B.: Prolog based desription logi reasoning.

(Deember2008)ToappearinICLP2008.

5. Haarslev,V.,Möller,R.:Optimizationtehniquesforretrievingresouresdesribed

inOWL/RDFdouments:Firstresults. In:NinthInternationalConfereneonthe

Priniples ofKnowledgeRepresentationand Reasoning,KR 2004,Whistler,BC,

Canada,June2-5.(2004)163173

6. Haarslev, V.,Möller, R., vander Straeten,R.,Wessel,M.: ExtendedQueryFa-

ilities for Raerand anAppliation toSoftware-EngineeringProblems. In:Pro-

eedings of the 2004 International Workshop on Desription Logis (DL-2004),

Whistler,BC,Canada,June6-8.(2004)148157

7. Sirin, E., Parsia, B., Grau, B.C., Kalyanpur, A., Katz, Y.: Pellet: A pratial

OWL-DLreasoner. WebSemant.5(2)(2007)5153

8. Horroks, I., Li, L., Turi, D., Behhofer, S.: The Instane Store: DL reasoning

withlargenumbersofindividuals. In:ProeedingsofDL2004, BritishColumbia,

Canada.(2004)

(14)

using a theoremprover. In:FoIKS. Volume 3861 ofLetureNotes inComputer

Siene.,Springer(2006)201218

10. Hustadt,U.,Motik,B.,Sattler,U.:ReasoningforDesriptionLogisaroundSHIQ

inaresolutionframework. Tehnialreport,FZI,Karlsruhe(2004)

11. Motik, B.: Reasoning in Desription Logis using Resolution and Dedutive

Databases. PhDthesis,UnivesitätKarlsruhe(TH),Karlsruhe,Germany(January

2006)

12. Grosof,B.N.,Horroks,I.,Volz,R.,Deker,S.: Desriptionlogiprograms:Com-

bininglogiprogramswithdesriptionlogi.In:Pro.oftheTwelfthInternational

WorldWideWebConferene(WWW2003),ACM(2003)4857

13. Hustadt,U.,Motik,B.,Sattler,U.:Dataomplexityofreasoninginveryexpressive

desriptionlogis.In:ProeedingsoftheNineteenthInternationalJointConferene

onArtiialIntelligene(IJCAI2005),InternationalJointConferenesonArtiial

Intelligene(2005)466471

14. Stikel,M.E.: A Prologtehnology theoremprover:anewexpositionandimple-

mentationinProlog. TheoretialComputerSiene104(1)(1992)109128

15. Zombori,Zs.: Eient two-phasedata reasoningfor desription logis. In: Pro-

eedingsoftheInternationalFederationforInformationProessingTehnialCom-

mitteeonArtiialIntelligene(TC12), Milan,Italy(September2008) Aepted

onferenepaper.

16. Behhofer,S.:TheDIGdesriptionlogiinterfae.http://dig.s.manhester.a.uk/

(2006)

Références

Documents relatifs

If there is another event in the knowledgebase, for example a broken water main on the same :ex_street where the traffic jam is happening, we can reason about whether these two

We discuss the application of in- complete reasoning for the resource discovery tasks and demonstrate a service- oriented realization for the query expansion and

The performance limitation can be considerably reduced by utilizing such large-scale e-Infrastructures as LarKC - the Large Knowledge Collider - an experimental platform

Since ELP does not only define the semantics of the language, but also a tractable reasoning algorithm that translates ELP rule bases into Datalog, a corresponding extension to

of a decentralised, self-organising system is presented, which allows autonomous, light-weight beasts to traverse RDF graphs and thereby instantiate pattern- based inference rules,

With the increase in the amount of data published in the Web an important subject of interest that has recently gained momentum in the Semantic Web community is how to provide

Access Matrix with explicit and implied permissions caused by user role hier- archy and object class hierarchy (conflicting solution from [2] given in parentheses).. and objects

between 2 and 100 compared to query answering with respect to the original crisp ontologies. The pattern which emerges is quite clear. For the reduced ontologies with 3 degrees,