• Aucun résultat trouvé

Ontology Based Information Integration Using Logic Programming

N/A
N/A
Protected

Academic year: 2022

Partager "Ontology Based Information Integration Using Logic Programming"

Copied!
17
0
0

Texte intégral

(1)

Logi Programming

GergelyLukásyandPéterSzeredi

BudapestUniversityofTehnologyandEonomis

DepartmentofComputerSieneandInformation Theory

1117Budapest,Magyartudósokkörútja2.,Hungary

Phone:+36 1463-2585Fax:+361463-3157

{lukasy,szeredi}s.bme.hu

Keywords:desriptonlogi,CWA,informationintegration,logiprogramming

Abstrat. We present an information integration system alled SIN-

TAGMAwhihsupports the semanti integration of heterogeneous in-

formationsouresusingametadatadrivenapproah.Themainideaof

SINTAGMAistobuildasoalledModelWarehouse,ontainingseveral

layers of integrated models onneted by mappings. At the bottom of

thishierarhy therearethemodelsrepresentingtheatualinformation

soures.Higher levelmodels represent virtual databases,whihanbe

queried,asthemappingsprovideapreisedesriptionofhowtopopulate

thesevirtualsouresusingtheonreteones.

This paper fouses on a reent development in SINTAGMA allowing

theinformationexperttouseDesriptionLogi basedontologiesinthe

developmentofhighabstrationleveloneptualmodels.Queryingthese

modelsisperformedusingtheClosedWorldAssumptionaswearguethat

traditionalOpenWorldDLreasoningislessappropriateintheontext

ofdatabaseorientedinformationintegrationenvironments.

TheimplementationofSINTAGMAusesonstraintsandlogiprogram-

ming,forexample,theomplexqueriesaretranslatedintoProloggoals.

Thisallowsustoprovidefuntionalitiesnotsupportedbyotherintegra-

tionframeworks.

1 Introdution

ThispaperpresentstheDesriptionLogimodellingapabilitiesoftheSIN-

TAGMA Enterprise Information Integration system. SINTAGMA is based on

theSILKtool-set,developedwithintheEUFP5projetSILK(SystemIntegra-

tionviaLogi &Knowledge) [2℄. SILKis adataentered, monolithi informa-

tionintegrationsystemsupportingsemi-automatiintegrationonrelationaland

semi-struturedsoures.

The SINTAGMA system extends the original framework in several dire-

tions.As opposed to themonolithiSILKstruture, SINTAGMAisbuiltfrom

looselyoupleddistributedomponents.Thefuntionalityhasbeomeriheras,

among others,thesystemnowdealswithWebServiesasinformationsoures.

(2)

The present paper is about areent extension of thesystem whih allowsthe

integrationexpert touseDesriptionLogimodelsintheintegrationproess.

Thepaperisstruturedasfollows.In Setion2wegiveageneralintrodu-

tion to the SINTAGMA system, desribing the main omponents, the SILan

modelling language, and the query exeution mehanism. In the next setion

wedisussthe desriptionlogiextension ofSILan: weintroduethe syntati

onstruts andthe modelling methodology.In Setion 4we presentthe exeu-

tionmehanismusedwhen queryingDesriptionLogimodels.InSetion5we

examinerelatedwork.Finally,weonludewithasummaryofourresults.

Theexamplesweusein theupomingdisussionsarepartofanintegration

senario.Thissenariorepresentsaworldwhereweattempttointegratevarious

informationsouresaboutwriters,paintersandtheirwork(i.e.books,paintings,

et.)andpresentthisinformationin theformofabstratviews.

Meta

Server

Data

Server

Comparator

ModelVerier

Unier

Corr.generator

DataVerier

SpeAdvisor

ModelManager Model

Warehouse

Model

Im(Ex)port Agent

Congu-

rator

Mediator

Client

Programs

Browser

Shell

... User

Wrapper Wrapper

Wrapper

ModelingTool

(Protege,Rose)

Wrappers:

-Relational

-XML

-RDF

-HTML

-WebServie

Fig.1.ThearhitetureoftheSINTAGMAsystem

2 System Arhiteture

TheoverallarhitetureofthesystemanbeseeninFigure1.Themainidea

ofourapproahistoolletandmanagemeta-information onthesourestobe

integrated.ThesepieesofinformationarestoredintheModelWarehouseofthe

system,intheformofUML-likemodels[8℄,onstraintsandmappings.Thisway

weanrepresentstruturalaswell asnon-strutural information,suhaslass

invariants,impliations,et.TheModelWarehouseresidesinandishandledby

theModel Manager omponent.

(3)

models.Mediationdeomposesomplexintegratedqueriestosimplequeriesan-

swerablebyindividualinformationsouresand,havingobtaineddatafromthese,

omposes theresultsinto anintegratedform.Mediation isthetask oftheMe-

diator omponent.

Aesstoheterogeneousinformationsouresissupportedbywrappers.Wrap-

pershidethesyntatidierenesbetweenthesouresofdierentkinds,bypre-

sentingthem to upperlayersuniformly,asUML models. These models (alled

interfae models) are entered into the Model Warehouseautomatially. In the

followingwegiveabriefintrodutionto themain omponents.

2.1 The Model Manager

TheModelManagerisresponsibleformanagingtheModelWarehouse(MW)

and providing integration support, suh asmodel omparison and veriation

(not overedinthis paper).HerewefousontheroleoftheModelWarehouse.

TheontentoftheMWisgivenin thelanguagealledSILanwhihisbased

onUML[8℄andDesriptionLogis[12℄.ThesyntaxofSILANresemblesIDL,the

InterfaeDesriptionLanguageofCORBA[16℄.Wedemonstratetheknowledge

representationfailitiesofSINTAGMAbyasimpleSILanexampleshowingthe

relevantfeaturesofthemeta-datarepository(Figure2).

1 model Art {

2 lass Artist:BuiltIns::DLAny {

3 attribute String name;

4 attribute Integer birthDate;

5 onstraint self.reation.date > 1900;

6 };

7

8 lass Work:BuiltIns::DLAny {

9 attribute String title;

10 attribute String author;

11 attribute Integer date;

12 attribute String type;

13 primary key title;

14 };

15

16 assoiation hasWork {

17 onnetion Artist as reator;

18 onnetion Work as reation;

19 }; };

Fig.2.SILanrepresentationofthemodelArt

(4)

Work.It also ontainsanassoiationbetweenan artistand herworks. We will

explainthedetailsofthisexamplewithinthedisussionbelow.

SemantisofSILanmodels TheentralelementsofSILanmodelsarelasses

and assoiations, sinethese are the arriersof information. A lass denotesa

set of entities alled the instanes of the lass. Similarly, an

n

-ary assoiation denotesasetof

n

-arytuplesoflassinstanesalled links.

Classesanhaveattributes whiharedenedasfuntionsmappingthelass

to asubset of values allowed by thetype of the attribute. Classes an inherit

fromotherlasses.Theinstanesofthedesendantlassareallinstanesofthe

anestor lass.In ourexamplebothArtistand Workinherit fromthe built-in

lassDLAny(f.lines 2and8). SeeSetion3.3formoredetails.

Assoiationshaveonnetions,an

n

-aryassoiationhas

n

onnetions.Inan assoiationsomeoftheonnetionsanbenamed,providingintuitivenavigation.

Forexample, the onnetions of assoiation hasWork orresponding to lasses

ArtistandWorkarealledreatorandreationrespetively(lines1718).

Classes are allowed to have a primary key, omposed of one or more at-

tributes.Thisspeiesthatthegivensubsetoftheattributesuniquelyidenties

aninstaneofthelass.Inourexample,asagrosssimpliation,attributetitle

servesasakeyinlassWork,i.e.thereannotbetwoworks(books,forexample)

withthesametitle.

Finally,invariantsanbespeiedforlassesandassoiationsusingtheob-

jetonstraintextension ofUML, theOCLlanguage[7℄.Invariantsgivestate-

ments about instanes of lasses(and links of assoiations) that hold for eah

of them. The onstraint in the delaration of Artist(line 5) is an invariant

statingthatthepubliationdateofeahworkofanartistisgreaterthan

1900

1.

The identier selfrefers to an arbitrary instane of the ontext, in this ase

thelassArtist.Thentwonavigation stepsfollow.Intherststepwenavigate

through theassoiationhasWork to an arbitrary work of the artist and in the

seond stepfrom thework toits publiationdate andstatethat this isalways

greaterthan

1900

.

Inadditionto theobjetorientedmodelling paradigm,theSILan language

also supports onstrutsfrom the DesriptionLogi (DL) world[12℄. This, re-

entlyaddednewfeatureofSINTAGMAisdisussedinSetion 3.

Abstrations Formediation,weneedmappingsbetweenthedierentsoures

andtheintegratedmodel.Thesemappings arealled abstrations beausethey

often provide a more abstrat view of the notions present in the lower level

models. Anexampleabstrationalledw0anbeseeninFigure3.

1

Thismay be sobeause the given informationsoureis knownto be dealingwith

worksofartof20thenturyorlater.

(5)

lassesProdutandDesription,bothfromthemodelInterfae 2

.Thismeans

that theabstration speies how to reate a virtual instane of lass Work,

given that the other two lasses are already populated (e.g. they orrespond

to real information soures).In lines 13the identiers m0, m1and m2 are de-

lared,andthesewillbeusedthroughouttheabstrationspeiationtodenote

instanesoftheappropriatelasses.

1 abstration w0 (m0: Interfae::Produt,

2 m1: Interfae::Desription

3 -> m2: Art::Work) {

4

5 onstraint

6 m1.id = m0.id and

7 m1.ode = 84

8 implies

9 m2.title = m0.title and

10 m2.author = m0.author and

11 m2.date = m0.publiation_date and

12 m2.type = m1.desription and

13 m2.DL_ID = m0.title;

14 };

Fig.3.SILanrepresentationoftheabstrationpopulatinglassWork

TheabstrationdesribesthatgivenaninstaneoflassProdutalledm0and

aninstaneoflassDesriptionalledm1,forwhihtheonditionsinlines67

hold,thereexistsaninstanem2oflassWorkwithattributevaluesspeiedby

lines913 3

.Notethatline6speiesthattheidattributesofthetwoinstanes

have to be the same, and thus orresponds to a relational join operation. In

ourintegrationsenarioProdutandDesriptionatuallyorrespondtoreal-

worldOraletablesontainingbooksand paintings.Thetaskofabstrationw0

istoonvertthereordsofthese tablesintoinstanesoflass Work.

Wenote that other abstrations an also populate lass Work.In this ase

theset ofinstanes of Workwillbetheunionoftheinstanes produedbythe

appropriate abstrations. Note that if a new information soure is added, we

only haveto speifya newabstration orresponding to this soure,while the

existingabstrationsanbeleftunhanged.

Notie that the abstration in Figure 3 takes the form of an impliation

desribinghowthegivensouresanontributetopopulatingthehighlevellass

Art::Work.ThisisharateristioftheLoalasViewintegrationapproah[4℄.

2

InSILandoubleolons(::)separatethemodelnamefromthenameofitsonstituent

(lass,assoiation, et.).

3

AttributeDL_IDhasaspeialroleasexplainedinSetion3.3.

(6)

Wrappersprovideaommoninterfaeforaessingvariousinformationsoure

types,suhasrelationalandobjet-orienteddatabases,semi-struturedsoures

(e.g.XMLor RDF),aswellasWeb-servies.

PSfragreplaements

olumn

attribute

database

model

table

lass

Produt

... titleString

idInteger

authorString

publiation_dateString

Fig.4.ModellingrelationalsouresinSILan

Awrapperhastwomaintasks.First,itextratsmeta-datafromtheinforma-

tionsoureanddeliversthesetotheModelManagerintheformofSILanmodels.

Forexample, inase ofrelationalsoures,databasesorrespondto models, ta-

blestolasses,olumnstoattributes,asshownin Figure4(f.lassProdutin

abstrationw0presentedinFigure 3).

The other prinipal task of a wrapper is to transform queries, formulated

in terms of this interfae model, into the format required by the underlying

informationsoure,andthusallowforrunningqueriesonthesoures.

2.3 The Mediator

TheMediator[1℄supportsqueriesonhighlevelmodelelementsbydeompos-

ingthemintointerfaemodelspeiquestions.Thisisperformedbyreatinga

queryplansatisfyingthedataowrequirementsofthesoures.Duringtheexe-

utionofthisqueryplanthedatatransformationsdesribedintheabstrations

arearriedout.

Whenever wequery amodel element in SINTAGMA, the Model Manager

basiallyprovidesthefollowingtwokindsofinformationtotheMediator:

1. thequerygoalitself, i.e.aPrologtermrepresentingwhatto query;

2. setofmediatorrules,usingwhihthemediatorandeomposetheomplex

queryintoprimitiveones (i.e.queriesthatreferonlytointerfaemodels).

Forexample,letusonsider thequeryshownbelowinvolvinglassWork.

query ReentWork

selet *

from w: Art::Work

where w.date > 2000;

(7)

Art::Work that were reatedafter 2000 4

. In this ase, the query goal is sim-

ilartothefollowingsimplePrologexpression:

:- 'Work:lass:220'(DT, [A, B, C, D, E℄, DA), C > 2000. (1)

Here, the rst Prolog goal orresponds to an instane of Art::Work. The

variables in this term will be instantiated during query exeution. The pred-

iate name 'Work:lass:220' is omposed from three parts: the kind of the

model element (lass) and its unique internal identier (220), preeded by

theunqualiedand thus non-uniqueSILanname(Work),providedfor read-

ability. Model elements areoften referredto byhandles of form Kind(Id),e.g.

lass(220).Theaboveprediatenamein fatrepresentsthestatitype ofthe

instanesqueriedfor.

Therstargumentof thegoalis thedynami typeof theinstane, i.e. the

handleofthelasswhih,inaseofinheritane,anbedierentfromthestati

type. Theseond argumentontainsthevaluesof thestati attributes, in this

asewehavevesuhvariables(f.delarationoflassWorkinFigure2),e.g.C

denotesthevalueoftheattributedate.Thethirdandlastargumentofthequery

termarriesthevaluesofthedynamiattributes.Theserepresenttheadditional

attributes(not known atquerytime)oftheinstaneifithappensto belong to

a desendant lass of Art::Work. Note that in the urrent implementation of

SINTAGMA, asasimpliation, thethird queryargumentontainsthelist of

bothstatianddynamiattributes.

Theseond partof thequerygoalorrespondstoasimplearithmeti OCL

onstraint,whihusesvariableCrepresentingthedateattributeoftheworkin

question.

Themediator rulesrepresenting theabstration w0shownin Figure3 take

thefollowingform:

'Produt:lass:190'(_,[A,B,C,D℄,_),

'Desription:lass:191'(_,[84,E,B℄,_) --->

'Work:lass:220'(lass(220),[D,A,C,D,E℄,[D,A,C,D,E℄)

The spei rule above desribes how to reate an instane of the lass Work

wheneverwehavetwoappropriateinstanesoflassesProdutandDesription

available.Ifthereweremoreabstrations,theMediatorwouldgetmorerulesas

therewouldbemorethanonepossiblewaytopopulatethegivenlass.

Notethat the mediator rulesare also used to desribeinheritane between

model elements.Insuh aasethe dynami type ofthe model element onthe

righthandsideoftheruleisavariable(asopposedtotheonstantlass(220)

above).Thisvariableisthesameasthedynamitypeofthemodel elementon

thelefthandside.Thedynamiattributesarepropagatedsimilarly.

4

WeouldhavereatedalassnamedReentWorkandpopulateditbyanappropri-

ateabstration. Then,instead offormulating aSILanquery,we ouldhavesimply

diretlyaskedfortheinstanesofthislass.Thequestionwhethertouseaqueryor

anabstrationisamodellingdeision.

(8)

relation,eahargumentofwhihisaternarystrutureorrespondingto alass

instane, similartotheoneappearingin (1).Forexample,aquerygoalforthe

assoiationhasWork(f.Figure2) hasthefollowingform:

:- 'hasWork:assoiation:227'(

'Artist:lass:218'(DT1,[A,B,C℄,DA1), (2)

'Work:lass:220'(DT2,[D,E,F,G,H℄,DA2)

).

3 DL modelling in SINTAGMA

LetusnowintroduethenewDLmodellingapabilitiesoftheSINTAGMA

system.First wedisusswhyweneedDesriptionLogimodelsduringtheinte-

grationproess and weprovidean introdutory example.Then wepresentthe

DL onstruts supported by our system and disuss the restritions we plae

ontheirusage. Finally, wesummarise thetasksofthe integrationexpert when

usingDLelementsduringintegration.

3.1 Introdution

IntheModelWarehousewehandlemodelsofdierentkinds.Wedistinguish

between appliation and oneptual models. The appliation models represent

existingorvirtualinformationsouresandbeauseofthistheyarefairlyelabo-

rateand preise.Coneptualmodels,however,representmentalmodelsofuser

groups,thereforetheyarevaguerthantheappliationmodels.

Wearguethat toonstrutsuhmodelsitis moreappropriateto usesome

kind of ontologial formalism instead of the relatively rigid objet oriented

paradigm.Aordingly,weextendedourmodellinglanguagetoinorporatesev-

eral desription logi onstruts, in addition to the UML-like ones desribed

earlier.Intheenvisionedsenario,thehigh-levelmodelsoftheusersareformu-

lated in desriptionlogiandviaappropriatedenitions theyareonneted to

lower-levelmodels.Mediationforaoneptualmodelworksinthesamewayas

foranyothermodel: thequeryisdeomposed,followingtheabstrations,until

we reah the interfae models (in general, through some further intermediate

models)whihanbequerieddiretly.

Beforegoingintothedetails,weshowanexampletoillustrate thewayhow

DL desriptionsare representedin SILan (note that Writerand Painter are

bothdesendantsoflassArtist,but otherwisetheyarenormalUMLlasses.

model Coneptual {

lass WriterAndPainter {};

onstraint equivalent { (3)

WriterAndPainter,

Unified::Writer and Unified::Painter};

};

(9)

ThisonstraintanbeplaedanywhereintheModelWarehouse:intheexample

above we simply put it in the same model that delares the lass WriterAnd

Painter itself. Theonstraintatually orresponds to aDL onept denition

axiom:WriterAndPainter

Writer

Painter.Namely,itstatesthattheinstanesof lassWriterAndPainterarethose(andonlythose)whobelongtotheunnamed

lass ontainingthe individuals who are both writers and painters. Thus, DL

oneptsaredenedusingtheGlobalasViewapproah[4℄,asopposedtowhen

populatinghigh-levellassesusingabstrations(f.Setion 2.1).

NotethatthelassWriterAndPainterouldbereatedwithoutDLsupport.

However,in thatasetheintegrationexpert would haveto gothroughamuh

moreelaborate proess of (1) reatingthe high levellass WriterAndPainter,

speifyingallitsattributesand(2)populatingitwithanappropriateabstration

involvingajoin.Now,withDLsupport,shesimplyformulatesaveryshortand

intuitiveDLaxiom.Wearguethatthisiseasierfortheexperttodo,anditalso

makestheontentoftheModelWarehousemorereadabletoothers.

3.2 DL elementsin SILan

FromtheDLpointofview,SINTAGMAsupportsayliDesriptionLogi

TBoxesontainingoneptdenitionaxiomsformulatedintheextensionofthe

ALCN (D)

language(seemorebelow abouttheextension).Only singleatomi onepts,soallednamedsymbols anappearonthelefthandsideoftheaxioms,

suhasWriterAndPainterinexample(3).Theremainingatomionepts,not

appearingonthelefthandsidearealledbasesymbols.SuhaTBoxisdenito-

rial,i.e.themeaningofthebasesymbolsunambiguouslydenesthemeaningof

thenamed symbols.Thebase symbols,in ourase,orrespondto normalSIN-

TAGMAlassesandassoiations,e.g.WriterandPainterintheexample(3).

TheABoxisasetofoneptandroleassertions,asdeterminedbytheinstanes

ofthelasseswhihorrespondto thebase symbolspartiipatingin theTBox.

The DL onept onstrutors supported by SINTAGMA and their SILan

equivalentsaresummarisedinTable1.Notethatthistableatuallydesribesthe

possibleoneptformatsontherighthandsideofadenition axiom,assuming

that wehaveexpanded theTBox 5

.

Theonlynon-lassialDLelementinTable1istheonretedomainrestri-

tion(thelastlineinthetable).Suharestritionspeiesasubsetofinstanes

ofthebaseonept

A

forwhihthegivenOCLonstraintholds.Thisisagener- alisationoftheideaofonretedomainsintheDesriptionLogisworld.Below

weshowanexampleofaonreteSILanrestritiondesribingthoseworkswhose

type(i.e.thevalueoftheattributetype)ispainting.

{lass onstraint Art::Work satisfies self.type="painting"}

ThereasonweallowonlyoneptdenitionaxiomsisthatweaimtouseDL

oneptsto desribeexeutablehigh-levelviews ofinformationsoures. Inthis

5

TheexpandedversionofanayliTBoxisanotherTBoxwhereeverynamedsymbol

ontheright handsideoftheaxiomsissubstitutedbyitsdenition.

(10)

Baseonept

A

UMLlass

Atomirole

R

UMLassoiation

Top

DLAny

Bottom

DLEmpty

Negation

¬C

not C

Intersetion

C ⊓ D

C and D

Union

C ⊔ D

C or D

Valuerestritions

∀R.C

slot onstraint R all values C

Existentialrestritions

∃R.C

slot onstraint R some value C

Numberrestritions

⋊ ⋉ nR

slot onstraint R ardinality i..j

Conreterestrition lass onstraint A satisfies OCL

Table 1.(extended)DLoneptonstrutorssupportedinSILan

sense aDLoneptisatuallyasyntativariantofaSILanqueryoraSILan

lasspopulatedbyanabstration.

NotethatthisalsoimpliesthatweusetheClosedWorldAssumption(CWA)

inDLqueryexeution.Wearguethatthisisappropriatebeauseofthefollow-

ingthreereasons.First,CWAautomatiallyensuresthat ourDLonstrutsare

semantiallyompatiblewith otheronstrutsintheSINTAGMAsystem.Se-

ond, we arguethat the OpenWorldAssumption(OWA)is appliablewhen we

have only partial knowledge and would liketo determine the onsequenes of

thisknowledge,trueineveryuniverseinwhihtheaxiomsofthispartialknowl-

edge hold. Inontrastwith this, in theontextof informationintegration,our

userswould liketoonsider asingleuniverse, inwhihabase oneptorarole

denotes exatly those individuals (orpairs ofindividuals) whih are presentin

theorrespondingdatabase.Toillustratethisissue,letusonsiderthefollowing

example:the oneptofnoviepainter is dened toontain painters having at

most5paintings(forexample,beinganoviepaintermaybeapreonditionfor

agovernmentgrant).Tomodelthissituation,theintegrationexpertreatesthe

DLaxiomshownbelow.

NoviePainter

Painter

⊓ (6 5

hasPainting

)

However,queryingthis onept,using OWA, will provide noresultsin general

asanopenworldreasonerwouldreturnanindividualonlyifitisprovable that

it has no more than 5paintings. Pratially, this is notwhat the information

expert wants.

(11)

fatthatwehavehugeamountsofdataintheunderlyingdatabases.Traditional,

tableaubasedDLreasonersdonotopewellwithlargeABoxes[10℄.Resolution

basedDL provingtehniques[13℄ domuh better, but theyare either stillnot

fastornotexpressiveenough[15℄.ByusingCWAweanimplementDLqueries

usingthewellresearhed,eientdatabasetehnology.

3.3 Modeling methodology and tasksof the integration expert

Theintegrationexpert isresponsibleforreatingtheDLaxioms.Although

thesearerepresentedinSILanwithintheSINTAGMAsystem,theexpertanuse

anyavailableOWLeditortoreateOWLdesriptions.Thesedesriptionsthen

anbeloaded bytheOWLimporterof theSINTAGMA systemthat basially

realisesanOWL-SILantranslation(f.theModelIm(Ex)portboxinFigure1).

Onething theexpertshould takeareofisto maththenamesof thebase

symbols and the orresponding SINTAGMA lasses and assoiations. This is

often done in two steps: rst the integration expert reates onept denition

axiomsusingthewidelyaeptedterminologyofthedomain,notpayingatten-

tion to the names of the model elements in the Model Warehouse. Next, the

expert providesadditionaldenition axiomsforeahbasesymbolonnetingit

withthepropermodelelement.Forexample,weouldusenamesAandBinstead

of WriterandPainterin (3),providedthat wehavethefollowingaxioms:

A

Writer B

Painter

Anotherimportantissueistodeidehowtoidentifytheinstanesofthebase

onepts,e.gtheinstanesofthelassWriterandlassPainter.Withoutthis,

itisnotpossibletodeterminetheinstanesoflassWriterAndPainter.

InatraditionalDLABox,aninstanehasanamethatunambiguouslyiden-

ties it. Unambiguity is guaranteed beause DL reasoning systemsuse the so

alled unique name assumption, i.e. they assume that two dierent instane

namesdenotedierentelementsinthedomain.

InSINTAGMA,similarlytodatabases,aninstaneisidentiedbythesubset

ofitsattributevalues,e.g.twowritersouldbeonsideredtobethesameiftheir

namesmath.Inotherwords,thismeansthatnameisakeyinlassWriter.

Theproblem isthatsuhkeysarefairlyuselesswhenweompareinstanes

ofdierentsoures.Thisisbeause,in general,weannotdrawanydiret on-

lusionfromtherelationofthekeysbelongingtoinstanesfromdierentlasses.

Forexample,databasesontainingemployeesoftenusenumeriIDsaskeys.Hav-

ing two employees from dierent ompanieswith the sameID doesnot mean

thatwearetalkingaboutthesameperson.Similarly,iftheIDsoftheemployees

donotmath,theyarenotneessarilydierentpersons.

Whatweneedissomekindofsharedkeythatuniquelyidentiestheinstanes

ofthelassespartiipatinginDLoneptdenitions.Lukily,theobjet-oriented

paradigmweusein SINTAGMAprovidesaniewaytohavesuh identiers.

We have mentioned earlier that in SINTAGMA the notion of DL onept

is a syntati variantof SINTAGMAlass. This also means that theresult of

(12)

example,whenwearelookingfortheinstanesthatarein bothlassesWriter

andPainterweareatuallyinterestedinanartist instane belongingto these

lassessimultaneously.This istrueingeneral:whateverDLoneptonstruts

weuse todesribeaDLonepttheresultmustbelongto somelassthat isa

ommonanestorofthelassesinvolved.

Instead of asking the integration expert to dene suh ommon anestor

lassesin anadhoway,weintroduethebuilt-in lassDLAny.Thislass or-

respondsto theDLonepttop(

)and ithasonlyoneattributealled DL_ID whihisakey.WerequirethatallthelassespartiipatinginDLoneptdeni-

tionsarethedesendantsofDLAny 6

(f.lines2and8ofFigure2).Beauseofthe

propertiesofgeneralisationattributeDL_IDwillbeakeyinallofthedesendant

lasses,i.e.itwillexatlyserveastheglobalidentierwewerelooking for.

Now, the task of the integration expert is to assign appropriate values to

the DL_IDattributes: she needsto extend theexisting abstrations populating

thebasesymbols(lasses) toalsoonsidertheattributeDL_ID.Byappropriate

valueswemeanthattheDL_IDsoftwoinstanesshouldmathiftheseinstanes

are the same,and should dier otherwise. An examplefor this an be seenin

Figure5populatingthelassWriter.

1 abstration ap (m0: Interfae::Member ->

2 m1: Unified::Writer) {

3

4 onstraint let n = m0.fname.onat(" ").onat(m0.lname) in

5 m1.name = n and

6 m1.birthDate = m0.date and

7 m1.id = m0.iwa and

8 m1.style = m0.style and

9 m1.DL_ID = n;

10 };

Fig.5.PopulatingtheDL_IDattributeofabaseonept

This abstration populates the lass Writer from an interfae lass alled

Member (lines 12). Let us assume that the members of this assoiation have

somekindofauniqueidentier,suhasthemembershipnumberofanimaginary

InternationalWriterAssoiation (IWA),presentintheunderlyingdatabase.It

maybeworthbringingthiskeytothelassWriter(line7)asitmakespossible

to nd writers eiently if they happen to be IWA members. However, the

uniqueidentierfrom theDL pointofviewhastobedierent:in fat itisthe

onatenationoftherstandlast nameofthewriter(line4and9).

This isbeause thelass Writeranalso bepopulatedfrom other soures

wheretheIWAnumbermakesnosense.Furthermore,wemaywantlassWriter

to be a desendantof lass Artist,together with some other lasses, suh as

6

Notethatthisisaneessaryondition.Asforanyonept

C

,

C ⊑ ⊤

holds,anyDL

instanehastobelongtothelassorrespondingto

,i.e.toDLAny.

(13)

soures,suh asthenameoftheartist 7

.

Tosummarise,theintegrationexperthastoperformthefollowingtaskswhen

DLmodellingisusedduring theintegrationproess:

1. delareDLlassesandforeahprovideorrespondingdenition axioms;

2. ensurethat eahbase oneptappearinginthedenitionaxiomsis:

(a) inheritedfromlassDLAny,

(b) populatedproperly,i.e.itsDL_IDattributeislled appropriately.

4 Querying DL models in SINTAGMA

Now we turn our attention to querying DL onepts in SINTAGMA. As

desribedin Setion2.3ourtaskisto reateaquerygoal andasetofmediator

rules. WhenwequeryaDLlass, weonlygeneratemediatorrulesfor thebase

symbols.As theseare ordinarylassesand assoiations,this proess isexatly

thesameastheoneweuseforaseswithoutanyDLonstrutinvolved.

Reall that aSINTAGMA instane is haraterised by three properties, as

exemplied by (1)on page7:its dynamitype DT,itsstati attributes SAand

its dynami attributes DA. A DL lass has only a single stati attribute, the

DL_ID key. However, in ontrast with an objet oriented query, a DL query

mayreturnananswerthathasmultiple dynami types.Forexample,whenwe

enumerate the lass WriterAndPainterwe get instanes that belong to both

lasses Writerand Painter.Aordingly, an answerto aDL query takesthe

form of apair (ID, DTAs), where ID is the valueof the DL_ID, while DTAs =

[DT

1 -DA

1 ,DT

2 -DA

2

,...℄ is the list of the dynami types of the answer, eah

pairedwiththeorrespondingdynamiattributelist.

ThealgorithmspeifyingwhatgoalstoreatefromaDLoneptdesription

is summarised in Figure 6. Here we desribe a funtion

Φ C

whih, given an arbitraryonept

C

,returnstheorrespondingquerygoalwithtwoarguments, IDandDTAs.Wedenethisfuntion asebyase.

If we have a base lass, we simply reate a query term representing the

instanesof thelass,similarto theonein goal(1). Ifwehavetheintersetion

oftwoonepts

C

and

D

, wereursivelytransformonepts

C

and

D

andput theminaPrologonjuntion.Theunionoflassesissimilar:wereateaProlog

disjuntion. Negation

¬C

isimplementedbyenumeratingthe DLAnylass, and removingthoseinstaneswhihbelongto

C

.TheexpensiveDLAnyenumeration an be avoidedwhen thenegationappearsin aonjuntion(whih isnormally

thease).

Themoreinterestingasesinvolveassoiations.Here

R

denotestheassoia- tionitself,while

R D

and

R R

denotethelassesthatarethedomainandtherange ofassoiation

R

,respetively.Reallthatabinaryassoiationisrepresentedby abinaryrelationwithternarystruturesasargumentsasin(2).

7

Thisisalsoasimpliation.Morerealistially,thekey ouldbethenametogether

withthebirthdate.

(14)

Φ A

(ID, DTAs)

= A

(DT, [ID|_℄, DA), DTAs = [DT-DA℄

Φ C⊓D

(ID, DTAs)

= Φ C

(ID, DTAs1)

, Φ D

(ID, DTAs2)

,

DTAs = DTAs1

DTAs2

,

where

denotesthe(ompiletime)onatenationoflists

Φ C⊔D

(ID, DTAs)

=

(

Φ C

(ID, DTAs) ;

Φ D

(ID, DTAs))

Φ ¬C

(ID, DTAs)

=

DLAny(ID, DTAs),

\

+

Φ C

(ID, _)

Φ ∃R.C

(ID, DTAs)

= R

(

R D

(DT, [ID|_℄, DA),

R R

(_, [ID2|_℄, _)),

Φ C

(ID2, _), DTAs = [DT-DA℄

Φ ∀R.C

(ID, DTAs)

= R D

(DT, [ID|_℄, DA), DTAs = [DT-DA℄

,

\

+ (

R

(

R D

(DT, [ID|_℄, DA),

R R

(_, [ID2|_℄, _)),

\

+

Φ C

(ID2, _))

Φ ⋊ ⋉nR

(ID, DTAs)

=

aggregate([DT, ID, DA℄, [S=nt(0)℄,

R

(

R D

(DT, [ID|_℄, DA),

R R

(_, _, _))),

ondition

⋊ ⋉

(n, S), DTAs = [DT-DA℄

Fig.6.TransformingDLonstrutsintoquerygoals

The existential restrition

∃R.C

is simply transformed to a query of the assoiation

R

andtheonept

C

.Thegoalorrespondingtoavaluerestrition

∀R.C

rstenumeratesthedomainof

R

andthenusesdoublenegationtoensure

that the given instane has no

R

-valueswhih do not belong to

C

. Finally, a number restrition

( ⋊ ⋉ nR )

is transformed into a goalwhih uses abagof-like Prolog prediateaggregate/3to enumerate the instanes in the domain of

R

togetherwiththenumberof

R

-valuesonnetedtothem,andthensimplyapplies theappropriatearithmetiomparison.

Aonreterestritioninvolvingabaseonept

A

andanOCLonstraint

O

istransformedinastraightforwardwayintothequerygoalasshownbelow 8

:

Φ A

(ID, DTAs), DTAs = [DT-DA℄,

Ψ O

(ID, DA)

In all formulas so far, ID denotes the value of the attribute DL_IDontaining

theuniquenameoftheDLinstanes(seeSetion3.3).Herewemakeuseofthe

fatthattheseattributesarealwaysplaedrstinthestatiattributelistofan

instane. Toillustrate the generalalgorithm, twoexampletransformations are

shown in Figure 7. The seond example involves an assoiation hasPainting

andalassModern(representing,say,ontemporarypieesofart).

8

Here,

Ψ O

(ID, DA)isthePrologtranslationoftheOCLonstraint

O

.Thisisanold

feature,implementedbeforetheintrodutionofDLextensionintoSINTAGMA.

(15)

DLdenition: Writer

Painter

Querygoal: 'Writer:lass:234'(DT1,[ID|_℄, DA1),

'Painter:lass:236'(DT2,[ID|_℄,D A2),

DTAs = [DT1-DA1,DT2-DA2℄

Class toquery: ModernPainterWriter

DLdenition: Writer

⊓ ∃

hasPainting.Modern Querygoal: 'Writer:lass:234'(DT1,[ID|_℄, DA1),

'hasPainting:assoiation:142'(

'Artist:lass:218'(DT2,[ID|_ ℄,DA2 ),

'Work:lass:220(_,[ID2|_℄,_) ),

'Modern:lass:237'(_,[ID2|_℄, _),

DTAs = [DT1-DA1,DT2-DA2℄

Fig.7.Transformationexamples

5 Related work

Thetwomain approahesin information integrationare theLoal asView

(LAV) and the Global as View (GAV) [4℄. In the former, soures are dened

in termsof theglobalshema,whilein thelatter, theglobalshemaisdened

in terms of the soures (similarly to the lassialviews in databasesystems).

Information Manifold [14℄ is agood example for aLAV system. Examples for

theGAVapproahinludetheStanford-IBMintegrationsystemTSIMMIS[6℄.

InSINTAGMAweapplyahybridapproah,i.e.weusebothLAVandGAV.

Whenusingabstrationstopopulatehigh-levellassesweemploytheLAV,while

in aseofDLlassdenitionsweusetheGAVapproah.

Thereareseveralompletedandongoingresearhprojetsintheareaofusing

desriptionlogi-based approahesfor bothEnterprise Appliation Integration

(EAI) andEnterpriseInformationIntegration(EII)aswell.

The generi EAI researh stresses the importane of the Servie Oriented

Arhiteture,andtheprovisionofnewapabilitieswithintheframeworkofSe-

mantiWebServies. Examplesforsuhresearhprojetsinlude DIP[11℄and

INFRAWEBS [9℄. These projetsaim at the semanti integrationof Web Ser-

vies,inmostasesusingDesriptionLogibasedontologiesandSemantiWeb

tehnologies.Here,however,DLisusedmostlyforserviedisoveryanddesign-

timeworkowvalidation,but notduring queryexeution.

Ontheotherhand,severallogi-basedEII toolsuseDesriptionLogisand

take a similar approah aswe did in SINTAGMA. That is, they reate a DL

modelasaviewovertheinformationsourestobeintegrated.Thebasiframe-

workofthissolutionisdesribede.g.in[5,3℄.Thefundamentaldiereneom-

pared to our approah is that these appliations deal with the lassial Open

WorldAssumption,thereforetheirtask anbeviewedasanABoxinstanere-

trievaltask.However,asalreadydisussedinSetion3.2,oneproblemwiththis

(16)

whihthereforeanbeextremely big.Wearguethatexisting DLreasonersare

notusablewhenthis amountofdataandomplexDLqueriesareinvolved.

6 Conlusions

InthispaperwepresentedtheDLextension oftheinformation integration

system SINTAGMA. This extension allows the information expert to use De-

sription Logi based ontologies in the development of high abstration level

oneptualmodels.QueryingthesemodelsisperformedusingtheClosedWorld

Assumptionovertheunderlyinginformationsoures.

We have presented the main omponents of the SINTAGMA system: the

Model Manager whih is responsible for the Model Warehouse, the Wrapper,

whih provides auniform view overtheheterogenous information soures and

theMediator,whih deomposesomplexhigh-levelqueriesintoprimitiveones

answerablebytheindividualinformationsoures.

Next,we havedesribedthe DL modelling elements theintegrationexpert

anusewhenbuilding oneptualmodelsand wehavealsodisussedthemod-

ellingmethodologyshehastofollow.WehavepresentedthewayhowDLqueries

areexeutedwithintheSINTAGMAsystem.Finally,wehaveillustratedourap-

proahbyprovidingauseaseaboutartists.

WearguethatoursolutionforombiningDLandUMLmodellinginaunied

integration framework provides a viable alternative to existing systems. The

usageofDLonstrutsin buildinghigh-leveloneptualmodelshassubstantial

benets,bothintermsofmodellingeieny andmaintenane.

Aknowledgements

Theauthors aknowledgethe support of theHungarianNKFP programme

fortheSINTAGMAprojetunder grantno.2/052/2004.Wewouldalsoliketo

thankallthepeoplepartiipatingin thisprojet,andespeiallyTamásBenk®.

Referenes

1. Liviu BadeaandDoinaTilivea. QueryPlanningfor IntelligentInformationInte-

grationusingConstraintHandlingRules,2001.IJCAI-2001WorkshoponModeling

andSolvingProblemswithConstraints.

2. T.Benk®,P.Krauth,andP.Szeredi.Alogibasedsystemforappliationintegra-

tion. InProeedings ofthe18thInternationalConferene onLogiProgramming,

ICLP 2002.Springer,LNCS,2002.

3. A. Borgida, M. Lenzerini, and R. Rosati. Desription logis for databases. In

Desription LogiHandbook,pages462484,2003.

4. D.Calvanese,D.Lembo,andM.Lenerini. Surveyonmethodsforqueryrewriting

andqueryansweringusingviews. Teh.report,UniversityofRome,April2001.

5. Diego Calvanese, Giuseppe DeGiaomo,Maurizio Lenzerini, DanieleNardi, and

RiardoRosati. Desriptionlogiframeworkforinformationintegration.InPrin-

iplesof KnowledgeRepresentationandReasoning,pages213,1998.

(17)

J. Ullman, andJ. Widom. The TSIMMISprojet:Integration ofheterogeneous

information soures. In 16th Meeting of the Information Proessing Soiety of

Japan,pages718,Tokyo,Japan,1994.

7. T.Clark and J.Warmer,editors. ObjetModeling with theOCL: The Rationale

behindtheObjetConstraintLanguage,volume2263ofLNCS. Springer,2002.

8. Martin Fowler and KendallSott. UML Distilled: Applying the Standard Objet

ModelingLanguage. Addison-Wesley,1997.

9. Vladislava Grigorova. Semanti desription of web servies and possibilities of

BPEL4WS. InformationTheoriesandAppliations,13(2):183187,2006.

10. V. Haarslev andR. Möller. Optimizationtehniquesfor retrievingresoures de-

sribedinowl/rdfdouments:Firstresults. InNinthInternationalConfereneon

the Priniples of Knowledge Representation and Reasoning, KR 2004, Whistler,

BC,Canada,June2-5,pages163173,2004.

11. M.Hepp,F.Leymann,J.Domingue,A.Wahler,andD.Fensel. Semantibusiness

proess management: A vision towards using semanti webservies for business

proessmanagement,2005.

12. IanHorroks. Reasoningwithexpressivedesriptionlogis:Theoryandpratie.

In Pro. of the 18th Int. Conf. onAutomated Dedution (CADE 2002), number

2392 inLetureNotesinArtiialIntelligene,pages115.Springer,2002.

13. U. Hustadt, B. Motik, and U. Sattler. Reasoning for desription logis around

SHIQ

inaresolutionframework. TehnialReport3-8-04/04,June2004.

14. T. Kirk, A. Y. Levy, Y. Sagiv, and D. Srivastava. The Information Manifold.

In C. Knoblok and A. Levy,editors, AAAI Spring Symposium ob Information

GatheringfromHeterogeneous, DistributedEnvironments,1995.

15. Zsolt Nagy, Gergely Lukásy, and Péter Szeredi. Translating desription logi

queriestoProlog. InPro.ofPADL,SpringerLNCS3819,pages168182,2006.

16. InterfaeDenitionLanguage. ISOInternationalStandard,number14750.

Références

Documents relatifs

It is necessary to form an integrating data model based on the ontological representations that obtained after mapping the RDB structure of each of the integrated information

add new chunks to the existing ontology; combine disciplines according to their on- tologies; separate ontologies according to their semantic graphs on the subgraphs, for

The objective of this work is to collect the RTI queries and associated response (whether the institute has replied to the query, rejected the query or referred to third party)

We first present ChainTracker, a visualization and trace-analysis environment that uses traceability information about model-transformation compositions, to support

Both the Biological Collections Ontology (BCO) and the Ontologized Minimum Information About BIobank data Sharing (OMIABIS) are ontologies rooted in specimens and

(1) The configuration of source databases was completely independent, except for the Federation Ontology, which lists the URL of each endpoint and the concepts each imple- ments;

In our future work, we aim to convert the SPARQL query language for RDF graphs into queries using the operator of the temporal logic, in order to have a

Mapping SPARQL Query to temporal logic query based on NµSMV Model Checker to Query Semantic Graphs.. Mahdi Gueffaz, Sylvain Rampacek,