Web
BalázsKádár,GergelyLukásyandPéterSzeredi
BudapestUniversityofTehnologyandEonomis
DepartmentofComputerSieneandInformation Theory
1117Budapest,Magyartudósokkörútja2.,Hungary
balazskadar.biz,{lukasy,sze redi} s.b me.h u
Abstrat. Traditionalalgorithmsfordesriptionlogi(DL)instanere-
trievalareineientforlargeamountsofunderlyingdata.Asdesription
logiisbeomingpopular inareassuhastheSemantiWeb,itis very
importanttohavesystemsthatanreasoneientlyoverlargedatasets.
InthispaperwepresenttheDLogdesriptionlogi reasonerspeially
designedforsuhsenarios.
TheDLogapproahtransformsdesriptionlogiaxiomsusingthe
S HIQ
DLlanguageintoaPrologprogram.Thistransformationisdonewithout
anyknowledgeofthepartiular individuals:theyare aesseddynami-
allyduringthenormalPrologexeutionofthegeneratedprogram.This
allowsustostoretheindividualsinadatabaseinsteadofmemory,whih
resultsinbetter salability and helpsusingdesription logi ontologies
diretlyontopofexisting informationsoures.
InthispaperwefousonthedesriptionoftheDLogappliationitself.
Wepresent the arhitetureofDLog and desribe itsinterfaes. These
make it possible to use ABoxes stored indatabases and to ommuni-
ate withtheProtégé ontology editor,as aserverappliation. Wealso
evaluatetheperformaneoftheDLogdatabaseextension.
Keywords:largedatasets,desriptionlogi,reasoning, logiprogram-
ming,databases
1 Introdution
DesriptionLogis(DLs)allowustorepresentknowledgebases onsistingof
terminologialaxioms(theTBox)and assertionalknowledge(theABox).
DesriptionLogisarebeomingwidespreadasmoreandmoresystemsstart
using semantisforvariousreasons.Asan example,intheSemantiWebidea,
DLsareintendedtoprovidethemathematialbakgroundneededformoreintel-
ligentqueryanswering.Heretheknowledgeisapturedintheformofexpressive
ontologies,desribedintheWebOntologyLanguage(OWL)[1℄.This language
is mostly based on the
SHIQ
desription logi, and it is intended to be thestandardknowledgerepresentationformatoftheWeb.
However, we have tremendous amounts of information on the Web whih
alls for reasoners that are ableto eiently handle suh abundane of data.
for querying desription logi onepts in an environment where the ABox is
storedin adatabase.
Wefoundthatmostexisting desriptionlogireasonersarenotsuitablefor
this task, as these are not apableof handling ABoxesstored externally. This
is not a simple tehnial problem: most existing algorithms for querying DL
onepts need to examine the whole ABox to answera query. This resultsin
salability problems and undermines the point of using databases.Beause of
this, we startedto investigate tehniques whih allowthe separationofthein-
ferenealgorithm fromthedatastorage.
Wehavedevelopedasolution,wheretheinferenealgorithm isdividedinto
twophases. Firstwereate a query-plan in Prologfrom the atualDL knowl-
edgebase,withoutaessingtheunderlyingdataset. Subsequently,thisquery-
plan anberun onreal data, to obtainthe requiredresults.The implementa-
tion of these ideasis inorporatedin the DLog reasoningsystem, available at
http://dlog-reasoner.soureforge.net.
Inthis paper wefous on the arhiteture of the DLog system, as well as
onitsexternalinterfaes.Wedisusstheinterfaeusedforaessingdatabases,
whihallowsdesriptionlogi reasoningontopofexisting informationsoures.
WealsodesribetheProtégé[2℄interfaethatmakesitpossibletouseDLogas
thebak-endreasonerofthispopularontologyeditor.Detailsonthetheoretial
sideofDLoganbefoundin [3℄ andin[4℄.
This paper is strutured asfollows.Setion 2summarises related work.In
Setion 3wegiveageneralintrodutiontotheDLogapproahandpresentthe
arhitetureandimplementationdetailsofthesystem.ThedatabaseandProtégé
interfaesaredesribedinSetions4and5,respetively.Setion6evaluatesthe
performaneof the databaseextensionof DLogw.r.t. theversionwhih stores
theABoxasPrologfats.Finally,inSetion7,weonludewiththefuturework
andthesummaryofourresults.
2 Related work
Several tehniques have emerged for dealing with ABox-reasoning. Tradi-
tional ABox-reasoningis basedon thetableau inferene algorithm,whihtries
to build a model showing that a given onept is satisable. To infer that an
individual
i
isaninstaneofaoneptC
,anindiretassumption¬C(i)
isaddedtotheABox,andthetableau-algorithmisapplied.Ifthisreportsinonsisteny,
i
is provedto beaninstaneofC
.Themain drawbakofthisapproahis thatit annotbe diretlyused forhigh volumeinstane retrieval, beause itwould
requirehekingallinstanesin theABox,onebyone.
To make tableau-based reasoning moreeient on large data sets, several
tehniqueshavebeendevelopedinreentyears[5℄.Theseareusedbythestate-
of-the-artDLreasoners,suh asRaerPro[6℄orPellet[7℄.
Extreme asesinvolve serious restritionson the knowledge base to ensure
eientexeutionwithlargeamountsofinstanes. Forexample,[8℄ suggestsa
aessed in a very eient way. The drawbak is that the ABox may ontain
onlyaxiomsofform
C(a)
,i.e.weannotmakeroleassertions.Paper[9℄ disusses how a rst order theorem proversuh as Vampire an
bemodiedandoptimisedforreasoningoverdesriptionlogiknowledgebases.
Thiswork,however,mostlyfouses onTBoxreasoning.
In[10℄, aresolution-based inferenealgorithm is desribed, whih is not as
sensitiveto theinreaseof theABox size asthetableau-based methods. How-
ever, this approah still requires the input of the whole ontent of the ABox
before attemptingto answeranyqueries.TheKAON2 system[11℄ implements
thismethod andprovidesreasoningserviesoverthedesriptionlogilanguage
SHIQ
bytransformingtheknowledgebaseintoadisjuntivedatalogprogram.Although the motivation and goals of KAON2 are similar to ours, unlike
KAON2 (1)weuse apure two-phase reasoningapproah(i.e. theABox is a-
essedonlyduring queryanswering)and(2)wetranslateintoPrologwhihhas
well-established,eientandrobustimplementations.
Artile [12℄ introdues theterm DesriptionLogi Programming.Thisidea
usesadirettransformationof
ALC
desriptionlogioneptsintodeniteHorn-lauses, and posessomerestritions onthe form ofthe knowledge base,whih
disallowaxiomsrequiringdisjuntivereasoning.Asanextension,[13℄introdues
afragmentof the
SHIQ
languagethat an be transformedinto Horn-lauses.Thiswork,however,still posesrestritionsontheuseofdisjuntions.
3 The DLog system
Themain ideaoftheDLogapproahis thatwetransforma
SHIQ
knowl-edgebaseKBintorst-orderlauses
Ω(
KB)
andfromthesewegeneratePrologode[3℄. Inontrastwith[11℄,alllausesontainingfuntionsymbolsareelim-
inatedduring the transformation:the resultinglauses anbe resolved further
only with ABox lauses. This forms the basis of a pure two phase reasoning
framework,whereeverypossibleABox-independentreasoningstepisperformed
beforeaessingtheABoxitself,allowingustostoretheontentoftheABoxin
anexternaldatabase.
Atually, in the general transformation, we use only ertain properties of
Ω(
KB)
. Thesepropertiesaresatisedbyasubsetof rstorder lausesthat is,in fat,largerthantheset oflausesthatanbegeneratedfroma
SHIQ
KB.We allthese lauses DLlauses. Asa onsequeneof this, ourresultsan be
usedforDLknowledgebasesthataremoreexpressivethan
SHIQ
.Thisinludestheuse ofertain roleonstrutors, suh asunion.Furthermore,someparts of
theknowledgebaseanbesuppliedbytheuserdiretlyintheformofrstorder
lauses.Moredetailsanbefoundin[3℄.
Asthelausesofa
SHIQ
knowledgebase KBarenormalrst-orderlausesweanapplythePrologTehnologyTheoremProving(PTTP) tehnology[14℄
diretlyonthese. In[3℄wehavesimpliedthePTTPtehniques forthespeial
ompleteforDLlauses.
ThesimpliedPTTPtehniquesusedinDLoginludedeterministianestor
resolution and loop elimination. Both are appliableonly to unary prediates,
i.e.prediatesorrespondingtoDLonepts.
InthedesignoftheDLogsystemwefousonmodularity.Thisenablesusto
easilyimplementnewfeaturesandnewinterfaes.Thetoplevelarhitetureof
the systemis shown in Figure 1.In thisgure, asin subsequentguresof the
paper,retangleswithrounded ornersrepresentmodules oftheDLogsystem,
whiledataareshownasplainretangles.InFigure1theDLogreasonerisshown
within adashedretangle.
PSfragreplaements
Client
DLogreasoner
KnowledgeBaseManager
Firstphase:
translation
Seondphase:
exeution
Supportmodules
Fig.1.ThetoplevelarhitetureoftheDLogsystem.
Theuser(eitherloalorremote)aessesDLogthroughoneoftheexternal
interfaes. These interfaes range from a loal onsole to serverinterfaeslike
DIG used by theProtégé ontologyeditor. Theknowledge base manageris the
entral piee of thesystem.It oordinatesthetasks of theothermodules, and
performstheadministrationofmultipleonurrentknowledgebases.Itforwards
therequestarrivingfromtheinterfaesto thereasonermodules.
Thesupport modules onsistof severaltoolsthatare usedbymostparts of
the system. They inlude a onguration manager module, alogger, an XML
reader,arun-timesystemfortheseondphase,andseveralportabilitytoolsthat
allowDLogto rununder dierentPrologimplementations(urrentlySWI and
SICStus).
The rst phase, translation, shown in Figure 2, takes a set of desription
logi axiomsas input. These axioms are divided into two parts: the TBox or
terminology box stores onept and role inlusion axioms, while the ABox or
assertion box ontains the fatual data. The ABox may be stored (partly or
ABoxode(whihisaPrologmodule),andtheABoxsignature,whihisrequired
for translatingthe TBox. Thegeneration of ABoxodeinludes optimisations
suhasindexingonseondargumentforrolesstoredinmemory.
Next,the TBox is proessed in two steps.First the DL translator module
transformsthedesriptionlogiformulaetoaset ofDLlauses[15℄, whih are
passed onto the TBoxtranslator module that generates the exeutableTBox
ode.Thisgeneratedodeisequivalent,withrespettoinstaneretrieval,tothe
input DLknowledge base.TheTBoxtranslator module usesvarious optimisa-
tions[3℄toobtainmoreeientPrologprograms.TheABoxandTBoxodean
begenerateddiretlyintomemoryormaybesavedtodiskforlater(standalone)
use.
PSfragreplaements
TBox ABox
DL
translator
TBox
translator ABoxsignature ABox
translator
ABoxode
TBoxode
Fig.2.Therstphase:translation.
Theseond phase, exeution,shown in Figure 3,uses theABox and TBox
programs generated in the rst phase, to answer queries.There are two ways
to exeute queries: the generated TBox an be alled diretly from Prolog as
a low-level interfae, orthe Query module provides ahigh-level interfae that
providesbasisupport for omposite queriesand anaggregatethe results.In
normal operation the querymodule is alled bythe knowledge base manager,
whihforwardstheresultsto theuserinterfae.Asthequerymoduledoesnot
dependontherest ofthesystem,itmaybeusedin standaloneoperation.The
run-timesystem(shownasRTSinthegure)inludesahashtableimplemented
in Cusedto speedupthereasoning,andoptionalolletionofstatistis.
PSfragreplaements
ABoxode
TBoxode
Query
module RTS
Queries
Results
Fig.3.Theseondphase:exeution.
As the rst phase of reasoning (i.e. the generation of a query plan) only
dependsonthesignatureofthedataset,andbeauseofthetop-downinferene
ofProlog,DLoganeientlyusedatabasestostoretheABox.
There may be several advantages in using databases to store the ABox.
Firstly,thisallowsreasoningondatasetsthatannottintomemory.Seondly,
itmakesintegratingDLogwithexistingsystemseasier,asthereasoneranuse
the existing databases of other appliations. Thirdly, querying some onepts
(namely those orresponding to so-alled query prediates) may be performed
usingomplexdatabasequeries,ratherthanDLreasoning,whihisexpetedto
deliveramarkedinreasein performane.
Aprediateisaquery prediate [3℄,if itisnon-reursive,itdoesnotinvoke
itsnegation,andisnotinvokedfromwithinitsnegation.Here,aprediate
P 0is
saidto invokeaprediate
P n,n ≥ 1
,ifthere aren − 1
intermediateprediates
P 1 . . . P n − 1, suh that P i is diretly invoked byP i − 1, i.e. it ours in a lause
P i − 1, i.e. it ours in a lause
bodytheheadofwhihis
P i − 1,fori = 1, . . . , n
.
Query prediates require neither loop elimination, nor anestor resolution
duringexeution.Thenamequeryprediate reetsthatfatthatsuhpredi-
atesanbetransformedtoomplexdatabasequeries(providedthatallonepts
and roles required are storedin asingle database). This an inreasethe per-
formane as the database enginean optimise the query using statistial and
struturalknowledgeofthedatabaseinquestion.
Wedesignedthedatabaseinterfaetobeassimpleaspossible.Thedatabases
are aessed viathe ODBC driver ofSWI-Prolog; asa onsequeneDLogan
interfae with most modern database systems. We wanted a way to speify
database aess using existing tools and interfaes suh as Protégé and the
DIGinterfaeitutilisesevenifthosedonot,atthemoment,provideawayto
speifydatabaseusage. Toaess adatabase,several piees ofinformation are
needed:thenameofthedatabase,ausername,apassword,adesriptionofwhih
tabletouseforgivenoneptsandroles,et.Beauseoftheaforementionedre-
quirements wedeided to use ABoxassertions to arrythis meta-information.
ABoxassertionsaredesriptionlogionstrutsthatarereadilyavailableinDL
systemsandinterfaes,suhasOWLandDIG.
Inorder tospeify thedatabaseaess foroneptsand roles weintrodue
new roles (objet properties), attributes (datatype properties) and individuals
dened inthenamespaehttp://www.s.bme.hu/dlogDB.
The ODBC interfae presribes that databaseonnetions are to be iden-
tied by a Data Soure Name (DSN). In DLog we introduean individual to
representagivendatabaseonnetion.Roles andoneptsarealsorepresented
byindividuals.An arbitrarynameanbeusedforsuhanindividual.
The meta data provided is used to onnet to the database, and, for eah
oneptand role,anadditionallauseisgenerated, whih, byexeutinganap-
propriatedatabasequery,listsappropriateindividuals(orpairsofindividuals).
Thisallowsoneptsandrolesto bestoredpartiallyin databasesand partially
in memory.Thismaybeveryusefulwhendevelopingontologies.
Databaseonnetionsarerepresentedbyindividualsthathavethestringat-
tributehasDSNdened.Thevalueofthisattributeisthenameofthedatasoure
(DSN).Asallothernamesinthissetion,thisnameisdenedinthenamespae
http://www.s.bme.hu/dlogDB.Additionalstringattributes,namelyhasUserName
and hasPassword,maybeusedto speifytheusernameand thepasswordfor
thegivenonnetion,ifrequired.
TheobjetpropertyhasConnetionlinks anindividual representingarole
or a onept with the database onnetion to be used for aessing it. This
makesit possible to use one data soure for one onept, and a dierent one
for another. The instane on the left hand side is the individual representing
therole oronept,while theinstane onthe righthand side isthe individual
representingtheonnetion.
Twomethodsareprovidedtospeifyhowtogetthedatafromthedatabase.
Oneisto speifyaquerythat isto bediretly exeuted onthedatabase.This
method,namedthesimpleinterfae,isprovidedbeauseofitssimpliity:itan
beappliedtodatabaseswithoutanymodiation.Howeverithastwodrawbaks:
it makes transforming query prediates to database queries very diult;
and
itperformsbadlyforinstanehekqueries.
Thelatterisalargesetbakasmostofthequeriesareinstaneheks,assuming
thetheprojetion optimisationof[3℄isused.
Thereforetheseond,preferred,wayistoprovidethenameofatableorof
aview andthe name of theolumn(s) ofthis table. This approah, alled the
omplex interfae may requirethe reation of new views in the database,but
providesmuhgreaterexibilityandbetterperformane.
TheSQL queryin thesimple interfaeisdened using the stringattribute
hasQuery.Theindividualrepresentstheroleoroneptandtheattributevalue
isthequerystring.Forindividualsrepresentingrolesthequerymustreturntwo
olumns,andforthoseusedforoneptsitmustreturnoneolumnthatontains
theindividualname.
If the omplex interfae is used, the name of the table or view to use is
speiedbythestringattribute hasTable.Thenameoftheolumn listingthe
individualsofaoneptisgivenusingthestringattributehasColumn.Forroles,
theattributeshasLHSandhasRHSareusedfortheleft andtherighthand side,
respetively.
Beause,inProtégé,individualsannotbespeiedasinstanesofanegated
onept,weprovidesomeadditionalattributes:hasNegQuery,hasNegTableand
hasNegColumn. These are used to speify the databaseaess of negated on-
epts,inawaysimilartotheirrespetivepositivepairs.Byprovidinganattribute
hasNegQueryforanamerepresentingtheonept
C
wespeifyaquerylistingtheindividualsof
¬C
.Obviously,bothhasQueryand hasNegQueryanappear asattributesof thesameindividual.Tospeifythattheindividualoneptrepresentstheonept
C
,onesimplyhastomakeoneptaninstaneof
C
.TheDLogsystemwillhekeahoneptourring in the ABox if it ontains an instane whih is in the namespae
http://www.s.bme.hu/dlogDB.If suh an instane is found,it is interpreted
as a handle to a databasewhih is to produe (additional) instanes for the
givenonept.
Similarly,tospeifythatanindividualrolerepresentstherole
R
,werequirethattheuserinludesthetriple{role,
R
, indiv}intheABox.Hereindivisanarbitraryindividual.AgainDLogwill lookforaninstaneinthenamespae
http://www.s.bme.hu/dlogDBwithin the domain (i.e. theleft hand side) of
eahrole,anduseittoonstrutadatabaseaessforthegivenrole.
Thedatabaseinterfaeisurrentlyin thealphatestphase.Webelievethat
our approah for this task, disussed above, is an intermediate solution. Ulti-
matelythestandardinterfaes,suhasDIG,shouldbeextendedtoallowstoring
(partsof)theABoxindatabases.However,wehopethatourworkontributes
toimplementingthisultimategoal.
4.2 Examples of Usingthe Database Interfae
We now present two examples for interfaing with databases, one for the
simple,andonefortheomplexinterfae.
TheexamplesontainABoxassertions,whihare displayed asRDF triples
in {subjet, prediate, objet} format. String values are shown between
quotes.Thenamespaehttp://www.s.bme.hu/dlogDB#is representedbythe
dlog:prex.
Figure4showstheuseofthesimpliedinterfaefortheABoxoftheIoaste
example. This lassialexample involvesthe onept desribinga person hav-
ingapatriidehild,who,inturn,hasanon-patriidehild.TheABoxaxioms,
whiharenowtobestoredinadatabase,desribethehasChildrelationbetween
pairsof individuals (traditionallyontaining(Ioaste, Oedipus),(Ioaste,
Polyneikes),(Oedipus, Polyneikes)and(Polyneikes, Thersandros)).The
ABoxalsospeies whihindividuals arepatriide andwhih arenon-patriide
(traditionallyOedipusisknowntobelongtotheformer,whileThersandrosto
thelatter).
Wehavehosenthenamespaerepresentedbytheio:prexforthenames
inthisontology.Thedatabaseonnetionisnamediodb,andtheorresponding
DSN is speied as "ioaste" (line 1). This onnetion is aessed without
speifyingausernameorapassword.Aordingly,iodbhasnoattributesother
thandlog:hasDSN.
Both the role hasChild and the onept Patriide are taken from this
database.TherolehasChildisrepresentedbytheinstanedlog:riohasChild.
We hose this name asa mnemoni for a role from the namespae io, alled
hasChild,butanyothernameouldhavebeenused.Line2tellsthesystemthat
this individual represents the role io:hasChild. Here, the right hand side of
theroleisofnointerest,sowehoseto havethesameindividual asontheleft
handside.Line6tellsthattheindividualdlog:ioPatriideisaninstaneof
2 {dlog:riohasChild, io:hasChild, dlog:riohasChild}
3 {dlog:riohasChild, dlog:hasConnetion, dlog:iodb}
4 {dlog:riohasChild, dlog:hasQuery,
5 "SELECT parent, hild FROM hasChild"}
6 {dlog:ioPatriide, rdf:type, io:Patriide}
7 {dlog:ioPatriide, dlog:hasConnetion, dlog:iodb}
8 {dlog:ioPatriide, dlog:hasQuery,
9 "SELECT name FROM people WHERE patriide"}
10 {dlog:ioPatriide, dlog:hasNegQuery,
11 "SELECT name FROM people WHERE NOT patriide"}
Fig.4.Anexampleofthesimplieddatabaseinterfae.
theoneptio:Patriide 1
.Thisindividual,whihthusrepresentstheonept
io:Patriide,hastwoqueriesassoiatedwithit:oneforio:Patriide(line8)
andoneforitsnegation(line10).
Thesimpliedinterfaeallowsomplexqueries,suhastheoneforPatriide
whihhasaWHERElause.Thiswaytheexistingtablepeopleanbeusedwithout
modiation. However,this approah makes it very diult to transform any
possiblequeryprediatesin the TBoxto diret databasequeries,and instane
hekqueriesrunwithapoorperformane.
Wenowpresenta seond example. The TBox of this example, takenfrom
[4℄,isshownbelow.
1
∃
hasFriend.
Aloholi⊑ ¬
Aloholi2
∃
hasParent. ¬
Aloholi⊑ ¬
AloholiLine1desribesthatthosewhohaveafriendwhoisaloholiarenon-aloholi
(as they see a bad example), while line 2 states that those who have a non-
aloholiparentare non-aloholi(as they see agood example).Inthe lassi
form the ABox ontainsrole assertionsfor the hasParent and hasFriendre-
lations only, and no onept assertions about anyone being aloholi or non-
aloholi.Inspiteof this, in thepreseneofertain roleinstane patterns,one
aninfersomepeopleto benon-aloholi,using aseanalysis.
Forexample,onsiderthefollowingpattern:JakisJoe'sparentandalsohis
friend.Now,ifweassumethatJakisaloholi,thentheaxiominline1implies
that Joeisnotaloholi.On theotherhand,ifJakisnotaloholi,it follows
fromline2thatJoeisnotaloholi,either.Thusthesetworoleassertionsimply
that Joehasto benon-aloholi.Other patterns, whereJoeanbeinferred to
benon-aloholi,arethe following:Joeisafriendof himself; Joeis afriendof
ananestor;andJoe'stwoanestorsare inthehasFriendrelationship.
1
Notethattheprex rdf,usedintheprediatepositionofthetripleinline6,refers
totheRDFnamespae:http://www.w3.org/1999/02/22-rdf -syn tax-n s#.
usingtheomplexinterfae.Here,thedatabasealoholiisaessedwiththe
user name "drunkard" and the password "palinka" (lines 13). We assume
that anewview,alled"hasParentView",wasdenedin thedatabaseto hide
the omplex query for the role hasParent, f. lines 46. The olumns of this
view, hildand parent(lines 78), ontainthe data for the role hasParent.
FromthisinformationDLoganreateaqueryforinstaneretrieval("SELECT
hild, parent FROM hasParentView"),andthreeotherquerypatternsforthe
aseswhen at leastoneof theindividuals isknown (e.g. "SELECT hild FROM
hasParentView WHERE parent = ?").This approahallowsforthegeneration
ofomplexdatabasequeriesforthequeryprediates.
1 {dlog:aldb, dlog:hasDSN, "aloholi"}
2 {dlog:aldb, dlog:hasUserName, "drunkard"}
3 {dlog:aldb, dlog:hasPassword, "palinka"}
4 {dlog:ralhasParent, al:hasParent, dlog:ralhasParent}
5 {dlog:ralhasParent, dlog:hasConnetion, dlog:aldb}
6 {dlog:ralhasParent, dlog:hasTable, "hasParentView"}
7 {dlog:ralhasParent, dlog:hasLHS, "hild"}
8 {dlog:ralhasParent, dlog:hasRHS, "parent"}
9 {dlog:ralhasFriend, al:hasFriend, dlog:ralhasFriend}
10 {dlog:ralhasFriend, dlog:hasConnetion, dlog:aldb}
11 {dlog:ralhasFriend, dlog:hasTable, "friends"}
12 {dlog:ralhasFriend, dlog:hasLHS, "friend1"}
13 {dlog:ralhasFriend, dlog:hasRHS, "friend2"}
14 {dlog:alAloholi, rdf:type, al:Aloholi}
15 {dlog:alAloholi, dlog:hasConnetion, dlog:aldb}
16 {dlog:alAloholi, dlog:hasTable, "aloholiView"}
17 {dlog:alAloholi, dlog:hasColumn, "name"}
18 {dlog:alAloholi, dlog:hasNegTable, "nonaloholiView"}
19 {dlog:alAloholi, dlog:hasNegColumn, "name"}
Fig.5.Anexampleoftheomplexdatabaseinterfae.
InFigure5,lines1013speify thedatabaseaess fortherolehasFriend,
whilelines1419allowforaessingindividualsbelongingtotheoneptaloholi
anditsnegationthroughappropriatedatabaseviews.
5 Integrating DLog with Protégé
Protégé[2℄isanopensoureontologyeditorthatsupportstheWebOntology
Language (OWL) [1℄, and anonnet to reasoners viathe HTTP-basedDIG
interfae[16℄.TheDLogserverimplementstheDIGinterfaeandanbeusedto
exeuteinstaneretrievalqueriesissuedfromthegraphialinterfaeofProtégé.
format.Forthe implementation weusedthe HTTPserverprovidedwith SWI-
Prolog.Inimplementingtheinterfaewefaeddiulties aused bysomeam-
biguitiesoftheDIGspeiations,despitethere beingan(exat)XMLshema
denition.Anotherdiultywasthat Protégédoesnot stritlyfollowthedef-
inition of the interfae. For example it uses a learKB ommand that is not
even dened in version1.1 of DIG.In DIG 1.0, whih supportedonly asingle
database,thisommandwasdened,butProtégéusesthenewversionthatsup-
portsmultipleonurrentknowledgebases.Westroveforanimplementationas
generi and omplying to the interfaedenition as possible while, also being
ompatiblewithProtégé.
For parsingXML we use the SGML module of SWI-Prolog, whih an be
operated in an XML ompatibility mode, allowing namespaes. As this is not
a diret XML parser, it has some diulties when used in XML mode. For
example evenwith the stritestsettings and treatingall warningsaserrors, it
aeptsinputles that arenotevenwell-formedXML. Beause ofthis, andin
hopeofbetterperformane,weareplanningtoswithto ApaheXeres-C++.
With Xeres weplan to use SAX parsing, instead of DOM, with the hope of
lowermemoryusageandfasterparsing.
ThedataareextratedfromtheXMLDOMusingDeniteClauseGrammars
(DCG).
Figure 6shows the resultsof a queryissued from Protégé, as answeredby
theDLogserver.
Fig.6.SreenshotofqueryresultsinProtégéansweredbyDLog.
diultyisthatiftheresultsofaqueryontainindividualsthatarenotdened
in Protégé (i.e. individuals present only in databases) Protégé silently drops
theseindividualsfrom thelistofqueryresults.
6 Evaluation
Thissetion ontainsapreliminary performane test ofthe databaseinter-
fae.
We tried the database interfae on a large version of the Ioaste problem
whihontains5058pairsinthehasChildrelation,855instanesthatareknown
tobepatriide,and314thatareknownto benon-patriide.
Theexeutionresultsaresummarisedin Table1.The loadtimemeansthe
time it takes to load the le whih ontains the axioms, inluding the XML
parsing. The translation time is the time it takes to generate the TBox and
ABoxodefrom theaxioms,whileexeutiontimeistherun-timeofthequery.
Table 1.Comparingthein-memoryanddatabaseversionofalargeIoastetest.
(seonds) loadtranslateexeutetotal
in-memory0.88 0.53 0.02 1.43
database 0.05 0.02 0.36 0.43
WhentheABoxisstoredinmemory,thetranslationtakes1.41seonds,and
theexeutiontakesonly0.02seonds.Notethatthesegureswereobtainedwith
theindexing optimisationturnedo. Whenthis optimisationisturned on,the
number of generated ABox lauses is doubled, and translation time inreases
aordingly.
The database variant of the example enumerates all the instanes of the
queriedoneptin 0.36seonds.This, ompared totheoriginal 0.02seonds is
muh slower.However,the time wespent at ompile-time wasaltogether0.07
seonds,resultinginatotalexeutiontimeof0.43seonds.Tosumup,interms
of total query exeution time, more than a three-fold derease wasahieved,
usingthedatabaseinterfae.
FromtheabovedataitmayseemthatusingadatabaseforstoringtheABox,
whihts into memory, isbeneialonly beauseof thereduedompile-time.
However, we believe that in the ase of large data sets and omplex queries
(espeially if these ontain onepts givingrise to query prediates)exeution
timeanalsobebetterthanthat ofthein-memoryvariant.
DetailedevaluationoftheDLogSystemanbefoundin[3℄.
InthispaperwehaveshownthearhitetureoftheDLogsystem,disusseda
databaseinterfaeforrepresentinglargeABoxes,andreportedontheintegration
ofDLogwiththeProtégéontologyeditor.
Thedatabaseinterfaeisespeiallyusefulifthedatasetannottinmemory
orifitissharedwithothersystems.Usingdatabasesangreatlyredueompile
timeand,withadvanedoptimisations,itmayprovideeienysimilartothat
ofthein-memoryversion.
Futureimprovementsinludetheoptimisationofqueryprediates,bytrans-
forming them to database queries, and better integration of Protégé and the
databaseinterfae.Ourplansalso inludetheimplementationof aquerymod-
uletohandleompositequeries,andthesupportforadditionalinterfaeformats,
suhasOWL,ortheKRSSnotationusedbye.g.theRaerProengine.
Aknowledgements
Theauthorsaregrateful totheanonymousreviewersfortheirommentson
the earlier version of the paper, and espeially for reommending the Billion
TriplesChallenge forevaluation.
Referenes
1. Behhofer, S.: OWL web ontology language referene. W3C reommendation
(February2004)
2. Noy, N., Fergerson, R., Musen, M.: The knowledge model
of Protege-2000: Combining interoperability and exibility.
http://iteseer.nj.ne.om/noy01knowledge.html(2000)
3. Lukásy,G.,Szeredi,P.: EientdesriptionlogireasoninginProlog:theDLog
system.Tehnialreport,BudapestUniversityofTehnologyandEonomis(Jan-
uary2008)ConditionallyaeptedforpubliationinTheoryandPratieofLogi
Programming.
4. Lukásy, G., Szeredi, P., Kádár, B.: Prolog based desription logi reasoning.
(Deember2008)ToappearinICLP2008.
5. Haarslev,V.,Möller,R.:Optimizationtehniquesforretrievingresouresdesribed
inOWL/RDFdouments:Firstresults. In:NinthInternationalConfereneonthe
Priniples ofKnowledgeRepresentationand Reasoning,KR 2004,Whistler,BC,
Canada,June2-5.(2004)163173
6. Haarslev, V.,Möller, R., vander Straeten,R.,Wessel,M.: ExtendedQueryFa-
ilities for Raerand anAppliation toSoftware-EngineeringProblems. In:Pro-
eedings of the 2004 International Workshop on Desription Logis (DL-2004),
Whistler,BC,Canada,June6-8.(2004)148157
7. Sirin, E., Parsia, B., Grau, B.C., Kalyanpur, A., Katz, Y.: Pellet: A pratial
OWL-DLreasoner. WebSemant.5(2)(2007)5153
8. Horroks, I., Li, L., Turi, D., Behhofer, S.: The Instane Store: DL reasoning
withlargenumbersofindividuals. In:ProeedingsofDL2004, BritishColumbia,
Canada.(2004)
using a theoremprover. In:FoIKS. Volume 3861 ofLetureNotes inComputer
Siene.,Springer(2006)201218
10. Hustadt,U.,Motik,B.,Sattler,U.:ReasoningforDesriptionLogisaroundSHIQ
inaresolutionframework. Tehnialreport,FZI,Karlsruhe(2004)
11. Motik, B.: Reasoning in Desription Logis using Resolution and Dedutive
Databases. PhDthesis,UnivesitätKarlsruhe(TH),Karlsruhe,Germany(January
2006)
12. Grosof,B.N.,Horroks,I.,Volz,R.,Deker,S.: Desriptionlogiprograms:Com-
bininglogiprogramswithdesriptionlogi.In:Pro.oftheTwelfthInternational
WorldWideWebConferene(WWW2003),ACM(2003)4857
13. Hustadt,U.,Motik,B.,Sattler,U.:Dataomplexityofreasoninginveryexpressive
desriptionlogis.In:ProeedingsoftheNineteenthInternationalJointConferene
onArtiialIntelligene(IJCAI2005),InternationalJointConferenesonArtiial
Intelligene(2005)466471
14. Stikel,M.E.: A Prologtehnology theoremprover:anewexpositionandimple-
mentationinProlog. TheoretialComputerSiene104(1)(1992)109128
15. Zombori,Zs.: Eient two-phasedata reasoningfor desription logis. In: Pro-
eedingsoftheInternationalFederationforInformationProessingTehnialCom-
mitteeonArtiialIntelligene(TC12), Milan,Italy(September2008) Aepted
onferenepaper.
16. Behhofer,S.:TheDIGdesriptionlogiinterfae.http://dig.s.manhester.a.uk/
(2006)