Supporting Ontology Alignment Tasks By Organizing Ontology Mappings Using Edge Bundling


Supporting Ontology Alignment Tasks By Organizing Ontology Mappings Using Edge Bundling

by

© Muhammad Nasir

A thesis submitted to the School of Graduate Studies
in partial fulfilment of the requirements for the degree of

Master of Science

Department of Computer Science
Memorial University of Newfoundland

May 2014

St. John’s Newfoundland

Abstract

Ontologies are commonly used for knowledge representation and to exchange information between multiple applications. When similar information is represented by different ontologies, information sharing requires a mapping between corresponding pairs of entities. While ontology alignment algorithms have been developed to support such tasks, they generally do not offer entirely complete and precise mappings. Consequently, an important interactive aspect of the alignment process is the validation of automatically generated mappings and the addition of new mappings, by a knowledge manager. While visual interfaces exist to support these tasks, showing a large number of mappings can result in a significant amount of visual clutter. To address this issue, an edge bundling approach has been adapted to the constraints of an existing alignment interface. A user study was designed and conducted to evaluate the value of edge bundling in this context, with positive results for both mapping validation and addition tasks.

Acknowledgements

First of all, I would like to sincerely thank my supervisors, Dr. Joerg Evermann and Dr. Orland Hoeber, for their continued guidance and support. They provided insightful advice, showed me how academic research is conducted, and helped me improve my critical thinking skills and attention to detail. Most importantly, they constantly challenged me to do better in conducting high quality research and in learning the skill of explaining that research. I am very grateful to them and will always appreciate their invaluable guidance in improving my overall skill set.

I would like to thank the Department of Computer Science and Memorial University of Newfoundland for their generous resources, without which this thesis would not have been possible. My research is financially supported by my supervisor’s grant (NSERC Discovery Grants) and scholarships from the University’s School of Graduate Studies and Department of Computer Science.

I would also like to thank the graduate students of the Department of Computer Science for providing their insightful comments and suggestions. I also thank all of my St. John’s friends, who made my transition here smooth, made my life fun, and always provided moral support throughout the course of my research.

Last but definitely not least, I owe my deepest gratitude to my loving parents, Muhammad Yakoob and Farida Yakoob, who are my ideals, best friends, and strongest supporters, and who always push me to strive for excellence and to become a better person each day.

Refereed publications from this thesis: Based upon the research work conducted in this thesis, a peer-reviewed publication has been made [58].

Contents

Abstract
Acknowledgements
List of Tables
List of Figures

1 Introduction
  1.1 Motivation
  1.2 Approach
  1.3 Research Questions and Methods
  1.4 Organization of Thesis

2 Related Work
  2.1 Ontologies
  2.2 Ontology Alignment
    2.2.1 Definition
    2.2.2 Algorithm Types
    2.2.3 Human Involvement Within the Ontology Alignment Process
  2.3 Ontology Alignment Interfaces
    2.3.1 Information Visualization
    2.3.2 Representation of Ontologies
    2.3.3 Representation of Mappings
  2.4 Ontology Mappings Representation Issues
  2.5 Baseline System
  2.6 Edge Bundling
    2.6.1 Theoretical Foundation
      2.6.1.1 Gestalt Laws
      2.6.1.2 Categorization Theory
      2.6.1.3 Cognitive Load Theory
    2.6.2 Edge Bundling Algorithm
      2.6.2.1 Compatibility Measures
    2.6.3 Benefits and Drawbacks of Edge Bundling

3 Edge Bundling within an Ontology Alignment Interface
  3.1 Motivation
  3.2 Approach
    3.2.1 Framework and Constraints of Mapping Representations
    3.2.2 Edge Bundling Process
    3.2.3 Bundle Ambiguity and Disambiguation
    3.2.4 Support for Ontology Alignment Tasks
      3.2.4.1 Mapping Validation
      3.2.4.2 Support for Mapping Addition
  3.3 Implementation Details
    3.3.1 Platform
    3.3.2 System Architecture
    3.3.3 Work Flow
  3.4 Example
  3.5 Discussion

4 Evaluation
  4.1 Purpose
  4.2 Hypotheses
  4.3 Methodology
    4.3.1 Tasks
    4.3.2 Selection of Datasets
    4.3.3 Procedures
    4.3.4 Analysis
  4.4 Evaluation Results
    4.4.1 Participant Demographics
    4.4.2 Time to Task Completion
    4.4.3 Accuracy
    4.4.4 Perceived Difficulty of Given Tasks
    4.4.5 Perceived Confidence in Results
    4.4.6 Perceived Usefulness and Ease of Use
    4.4.7 Preference
    4.4.8 Open-ended Feedback
    4.4.9 Think-aloud Protocol Findings
  4.5 Discussion

5 Conclusions and Future Work
  5.1 Research Contributions
    5.1.1 Bundling Algorithm Modifications and Interaction Support
    5.1.2 User Study Findings
  5.2 Future Work
    5.2.1 Further Improvement of the Proposed System
    5.2.2 Study of Different Factors Affecting the Bundling Quality
    5.2.3 Further Evaluations

Bibliography

A User Study

B Test Ontology Datasets

List of Tables

2.1 List of popular alignment systems along with the types of algorithms employed within them to calculate mappings.
3.1 Summary of colour encodings for the mapping depending upon the state and location.
4.1 Summary of characteristics of the test ontology datasets
4.2 Round-Robin rotation of tasks and interfaces. Let the CogZ system be C and CogZ-e as C_e and consider the used ontology datasets as O_A and O_B
4.3 Features of the participant demographics
4.4 Statistical analysis of the differences in the time to task completion of the validation tasks.
4.5 Statistical analysis of the differences in the time to task completion of the addition tasks.
4.6 Statistical analysis of the differences in the accuracy of the addition tasks
4.7 Statistical analysis of the responses for the perceived difficulty for mapping validation tasks
4.8 Statistical analysis of the responses for the perceived difficulty for mapping addition tasks
4.9 Statistical analysis of the responses for the perceived confidence on correctness for mapping validation tasks
4.10 Statistical analysis of the responses for the perceived confidence on correctness for mapping addition tasks
4.11 Statistical analysis of the differences in the perceived ease of use and usefulness of the two interfaces for the mapping validation and addition tasks
4.12 Summary of the validation results of the hypotheses for both types of ontology alignment tasks

List of Figures

2.1 An example ontology of a university domain [29] generated using OWLViz [46] plugin in Protégé [60]
2.2 Typical flow of an ontology alignment process
2.3 Screenshot of the interface of the AgreementMaker system [15, 16]
2.4 Screenshot of the interface of the COMA++ system [28]
2.5 A snapshot highlighting the cluttering problem in BizTalk Mapper system [67]
2.6 View of the main interface of the CogZ system [28]
2.7 The fish-eye zoom feature of the CogZ system [28].
2.8 Results of presenting a large number of mappings on CogZ’s interface [28]. The light red-color dotted edges represent the candidate mappings, edges turn darker when the knowledge manager hovers over them and the tooltip is also made visible
2.9 The snapshot of a system from [45] presenting the migration data between different parts of the world using (a) without edge bundling (b) with edge bundling
2.10 (a) Prior to the bundling process (b) After the bundling process
3.1 The main view of the prototype without bundling
3.2 View of the baseline system with no bundling. The red dashed curves represent the candidate mappings generated by the system
3.3 Application of edge bundling to the mapping edges. Note that mappings whose source and destination are not visible have been filtered.
3.4 The tooltip showing the information about the mappings that are present in the bundle
3.5 The pop-up window providing the mapping validation operations
3.6 As per the knowledge manager’s selection, the “mini-cornell:Astronomy” to “cornell:Astronomy” mapping’s colour has been changed to black.
3.7 The tooltip shows the confirmed mappings in red colour with bold font.
3.8 The confirmed mappings listed in the pop-up window are also presented using red colour with bold font following the scheme adopted in the tooltip, and when a confirmed mapping is selected the colour is changed to green from blue
3.9 The colour transition of the mapping edge depending upon the current state
3.10 System Architecture of the prototype CogZ-e. The information flow from different modules has been marked by black arrows. The exchange of information is represented using red arrows. The new modules that have been developed are marked in maroon
3.11 Extended tooltip listing all the mappings contained in the bundle
3.12 Pop-up window listing the mappings that were visible in the tooltip.
3.13 The confirmed mapping “Organisation to Organisation” is shown in blue colour with the text in the tooltip in red colour with bold font.
3.14 The confirmed mapping appearing at the bottom in red colour with bold text in the pop-up window along with the remaining candidate mappings
3.15 The snapshot of the prototype after the knowledge manager has confirmed all the mappings of the initial bundle
3.16 The newly created mapping between the source “Presentation” concept and the target “Individual Presentation” concept is shown by the top blue line as well as in the tooltip
4.1 Distribution of the aggregated responses about familiarity with the domain of the test ontology datasets
4.2 Average time taken to complete the validation tasks. Error bars are reflecting the standard error about the mean.
4.3 Average time taken to complete the addition tasks. Standard error about the mean is shown using the error bars
4.4 Average accuracy for addition tasks in percentage. Standard errors of mean of the data sample is also shown using the error bars
4.5 Distribution of responses for perceived difficulty for validation tasks (a) university domain (b) conference domain
4.6 Distribution of responses for perceived difficulty for addition tasks (a) university domain (b) conference domain
4.7 Distribution of responses for perceived confidence for validation tasks (a) university domain (b) conference domain
4.8 Distribution of responses for perceived confidence for addition tasks (a) university domain (b) conference domain
4.9 Aggregated responses for perceived ease of use and usefulness measures for the mapping validation tasks.
4.10 Aggregated responses for perceived ease of use and usefulness measures for the mapping addition tasks

Chapter 1

Introduction

1.1 Motivation

Effective communication and collaboration among people, organizations, and software systems is the key to success in today’s competitive world. However, such communication among software systems can be challenging due to their independent development. Software systems are built by people; due to disparate backgrounds and cultures, education, and languages, software developers may use different terminology to explain the same conceptual entity [80]. In addition, they may produce different hierarchical decompositions of difficult or complex domains within the software they create. This leads to different computer systems storing information in different formats, making it difficult for them to share and inter-communicate. Therefore, in order for such systems to share information, an effective mechanism is required that can provide translations between different data formats.

Ontologies provide a convenient mechanism to represent information about a domain by organizing its major concepts and forming hierarchical relationships between entities. They have become an important tool for sharing and re-using information in academia and industry [29], and their use to support the sharing of information between multiple software systems has become increasingly important in recent years [18, 41, 43].

While communication and sharing are straightforward when two systems use the same ontology, challenges exist when sharing is required between systems that use different ontologies to represent the same information. This may occur when systems are built in parallel, when one system is changed or updated, or when new sharing functionality is required. In these cases, information sharing requires ontology alignment (i.e., a mapping between the entities in the pair of ontologies) [42]. Such a mapping can then be used to automatically translate information from one ontology to the other. For example, in the Web service area, ontology alignments are used to translate one message format of an e-commerce application into another, completely automatically or semi-automatically [43]. This translation is needed to conduct business transactions between disparate databases, inventory control systems, and e-commerce systems. Thus, the challenge is to obtain accurate translations by resolving heterogeneity issues and discovering similarities among different ontologies representing comparable domains. The solutions researched and developed to meet this challenge are usually referred to as “ontology alignment approaches” in the literature [30].

Ontology alignment has been an active research area for over two decades [66]. It is still one of the most fundamental, time-consuming, labor-intensive, and crucial tasks [...] a few concepts to several hundreds, manually aligning two ontologies is not practical when ontologies are large. Therefore, automatic ontology alignment algorithms are required that can automatically match entities between the two ontologies. Numerous ontology alignment algorithms exist to perform this task [66], with the output being a set of candidate mappings between entities across the pair of ontologies.

Fundamentally, ontology alignment is a difficult problem due to the complexities of human language. Human beings perceive the same things differently by nature [70]; thus when different developers design ontologies for the same domain, each may label and structure similar concepts and properties differently. For example, one ontology developer may label the price of the ticket in an ontology representing the domain of a travel agency management system as “price”, but another developer may define the label as “cost” in a different ontology representing the same domain. Furthermore, in the hierarchical decomposition of the problem domain, one ontology may include the ticket price within a larger ticket entity, whereas another might include it within a travel itinerary. Usage of such different labellings and hierarchical decompositions of important elements of the concepts of an ontology may result in heterogeneity conflicts and conceptual discrepancies [66]. The alignment process needs to resolve these to find the matches between the given ontologies.

Even though significant advances have been made as a result of natural language processing [36] and graph matching approaches [62], the end results of alignment algorithms are seldom certain and are often incomplete [29]. As a result, fully automatic alignment systems are not feasible. Since the human mind is well-suited to making sense of ambiguous and incomplete knowledge about the world, it may be beneficial to take advantage of such human capabilities by including them in the ontology alignment process [2].

Recently, researchers have started to analyze and give importance to various aspects of user involvement within ontology alignment processes [30]. Initially, using a matching algorithm, such systems produce candidate mappings between the given ontologies and present them to a knowledge manager. A common approach to address the uncertainty about these candidate mappings is to allow the knowledge manager to interactively validate the mappings, as well as add new mappings that were not detected by the automatic alignment algorithm. While a naïve approach might be to simply provide a list of mappings from which a knowledge manager can delete or add new mappings, doing so makes it difficult to consider the ontological structure during this process. This in turn might lead to incorrectly validated or missed mappings. A more effective approach is to present the ontologies and their mappings in a visual manner that is specifically designed to support the validation and addition tasks, and to provide interactive tools that aid in the completion of these tasks [28, 30].

The visual component of ontology alignment systems normally represents the two ontologies as tree structures on either side of the interface, and graphically depicts the mappings as edges between the corresponding entities. The tree structures provide an efficient way of representing the hierarchical nature of the ontologies, and allow for the navigation among large or complex ontologies. The representation of the mappings using edges allows the knowledge manager to visually trace the connection between entities in the ontologies.

Ontologies that are built to represent complex domain knowledge can be very large. When a semi-automatic alignment process is carried out on a pair of such [...] candidate mappings that must be validated by the knowledge manager. The validation process includes tracing individual mappings, confirming the relationship, as well as identifying and adding new mappings. In order to complete the validation task, the knowledge manager has to process and manipulate each and every candidate mapping. Validating these mappings is the most fundamental but protracted task in the alignment process.

Since the visual space available to show the mapping edges is limited, problems may occur when representing a large number of mappings. When representing a relatively small set of mappings, the space can be used effectively, making it easy for knowledge managers to perceive and understand the mappings and compare the pair of ontologies. However, when real-life ontologies are aligned, the number of mappings produced is large, and presenting these mappings within the space available creates a visual mess.

These mappings are conventionally drawn as straight lines between the source and target ontology trees, and in all but the simplest cases will include many edge crossings. Even with as few as 20 mappings, the usability of such interfaces quickly deteriorates due to visual clutter created by the edge crossings within the mapping region of the interface. As a result, the knowledge manager may experience difficulty in observing general patterns or trends within the mapping set, as well as in choosing and manipulating a particular mapping. Such clutter may also cause problems for finding a specific mapping or set of mappings.
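To give a rough sense of how quickly crossings can accumulate, the following sketch counts crossing pairs among straight-line edges drawn between two vertically ordered lists. The row-index layout model and the function name `count_crossings` are illustrative assumptions and are not taken from the thesis or from any of the alignment systems it discusses.

```python
from itertools import combinations


def count_crossings(mappings):
    """Count pairs of straight-line mapping edges that cross.

    Each mapping is a (source_row, target_row) pair giving the vertical
    positions of the two mapped entities in the left and right tree views.
    This simple layout model is an assumption made for illustration.
    """
    crossings = 0
    for (s1, t1), (s2, t2) in combinations(mappings, 2):
        # Two edges cross when their endpoints appear in opposite
        # vertical order on the two sides of the interface.
        if (s1 - s2) * (t1 - t2) < 0:
            crossings += 1
    return crossings


# Hypothetical data: 20 mappings in the best case (all parallel)
# and the worst case (target order fully reversed).
parallel = [(row, row) for row in range(20)]
reversed_order = [(row, 19 - row) for row in range(20)]

print(count_crossings(parallel))
print(count_crossings(reversed_order))
```

Under this model, 20 mappings can produce anywhere from no crossings up to one crossing for every pair of edges, depending on how the mapped entities are ordered in the two trees.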

While some studies have explored the use of curved lines to represent the mappings [31] or interactive highlighting to allow the knowledge manager to focus on a specific mapping [31, 67], the fundamental problem of trying to make sense of a visually cluttered representation remains. The potential solution for reducing the amount of visual clutter and improving the visual organization of mappings that is proposed in this research is to use edge bundling [45] on the representation of the mappings.

1.2 Approach

Edge bundling is the process of distorting the shapes of the edges in a graph to provide paths that are easier for the human eye to follow [45]. It has been proposed for providing an organized view of large graphs, grouping the graph edges by bundling their common pathways together [45].

In the context of ontologies, the bundling process can be considered as a potential solution for the visual clutter problem caused by crossings of mapping edges. By bundling the edges within an ontology mapping interface, much of the visual clutter within the mapping representations will be eliminated due to the clustering of the edges and the crossing of a few bundles rather than many individual edges. The grouping of mapping edges helps in providing an overview of the mapping results, which may help a knowledge manager to perform the mapping validation tasks with ease and increased performance. Instead of working on individual mappings, with bundling, the knowledge manager can work on mapping sets (i.e., those that are grouped into bundles).

Another advantage provided by edge bundling is the easy identification of potentially interesting areas within the source and target ontologies from which a large number of mappings are originating or terminating. Identifying these areas can help the knowledge manager in focusing on important concepts within the source and target ontologies. Due to the structure of ontologies, these areas may contain potential candidate concepts for finding and adding new relationships that were missed by the automatic alignment algorithm.

While edge bundling adds structure to the visualization of the mappings, it does so at the cost of introducing ambiguity. The drawback of using bundling is that the mapping edges are now grouped together, making it difficult to trace a single mapping from source to destination. Since the knowledge manager is required to view each and every mapping before validating it, the same ability is also required while dealing with bundling. Therefore, in order to disambiguate the mappings contained within a bundle, some interaction mechanism is needed. As such, along with the implementation of edge bundling, an interactive focusing feature is also required, which can allow the knowledge manager to easily view and perform the relevant validation tasks for each mapping individually.

1.3 Research Questions and Methods

The primary goal of this thesis is to study the application of an edge bundling approach to ontology mapping representation. The following research questions will be addressed:

1) How can the edge bundling technique be adapted to the domain of ontology mapping representation?

2) What are the benefits and drawbacks of using edge bundling in the context of ontology mapping validation and addition tasks?

To answer these questions, the first step was to identify a state-of-the-art ontology alignment and visualization system, which could be used as a baseline system. The minimum requirements of this system were to handle the input of source and target ontologies, perform the automatic alignment of the ontologies, and finally present the candidate mappings within a visual interface. To address the first research question, after the identification of the baseline system, the next step was to study the constraints and requirements of the ontology mapping representation and modify the general bundling process to comply with these restrictions. In addition, to support the alignment related tasks (i.e., mapping verification and addition), interaction methods were developed for bundle disambiguation. The final step was to conduct an empirical evaluation to study the benefits of using edge bundling to support a knowledge manager’s ontology alignment tasks. The results of this evaluation address the second research question.

1.4 Organization of Thesis

The remainder of the thesis is organized as follows. The following chapter outlines related work pertaining to automatic ontology alignment, ontology alignment interfaces, and edge bundling, along with a theoretical discussion on the proposed benefits of edge bundling. Chapter 3 explains the process of performing edge bundling within the baseline ontology alignment framework, provides the details regarding the implementation, and discusses the value of the approach in the context of the primary tasks of mapping validation and mapping addition. Chapter 4 provides the methodology, results, and discussion of the empirical evaluation, which was conducted to study, evaluate, and compare the benefits and drawbacks of the proposed method in comparison to the state-of-the-art baseline system. In Chapter 5, this thesis concludes with a summary of the primary contributions of this work along with an outline of future work.

Chapter 2

Related Work

This chapter begins by describing the concept of ontologies in detail, after which several common types of algorithms that are used in alignment systems will be discussed. Following that, the importance of knowledge managers in the alignment process will be described, which will lead to the discussion about the interfaces of the alignment systems to support knowledge managers. This section will also include a discussion about the value of visualization in designing such interfaces, as well as the theories and principles from the field of information visualization that apply to the representation of ontology mappings. A discussion on existing mapping representation approaches and issues will constitute the next section of the chapter. Details will also be provided on the chosen baseline system for this research.

2.1 Ontologies

“Ontology” is a term that originates from the field of philosophy where it is used [...] and the nature of relationships that exists between these things. In the field of computer science, this term has evolved to represent the final product obtained when an abstract and simplified description of a domain is created [39]. The key concepts of the domain are represented using classes, and logical relationships are created among them by knowledge experts to reflect a common understanding of the overall domain.

Primarily, ontologies provide a standard vocabulary that can enable the sharing and re-use of domain knowledge [43], as well as allow automated computer reasoning and analysis [43]. Domain ontologies provide standard definitions of the domain concepts and the logical relationships that exist between them, which can be used by computers to extract common meaning automatically from the given information [81]. Using the capabilities of the domain ontology, large amounts of data can also be consistently classified and aggregated [81].

A traditional list-based graphical representation of a small ontology of a university domain is shown in Figure 2.1. The main concepts that define this domain are depicted as nodes and the relationships that exist between these concepts are shown in the form of lines connecting the relevant concepts. This is one example of a visual representation of an ontology; a more complete discussion on ontology visualization is presented in Section 2.3.2.

Ontologies have been used extensively in different application domains. In computer science, one important use is in the area of the Semantic Web [43], in which they facilitate the information exchange between different systems, support query-answer services, provide knowledge bases that are re-usable, and enable the interoperability among various heterogeneous information systems [39]. They primarily provide semantics for the elements of a web document, as well as enable those semantics to be used by intelligent information retrieval systems and applications [43].

Figure 2.1: An example ontology of a university domain [29] generated using OWLViz [46] plugin in Protégé [60].

There are many technologies and languages that have been introduced to specify ontologies. XML (eXtensible Markup Language) [6] is a data-describing language that uses tags to describe and define data on the Web in a structured fashion [18]. It is the basic underlying technology that can be used to write ontologies, and it also provides the foundation for more specialized languages to author ontologies such as RDF (Resource Description Framework) [44] and OWL (Web Ontology Language) [79].

The idea behind these languages is to allow automatic processing of the content rather than showing presentable information to humans [43]. RDF [44] was developed to add meaning to Web data elements by attaching machine-readable metadata. It was the first language that was designed to support the Semantic Web [18]. RDF can be considered as a layer on top of the existing technologies that use XML. RDF Schema (RDF-S) [7] extends RDF and provides the basic structures required to create the fundamentals of an ontology, such as classes, properties, instances, and relationships (e.g., is-a, child-of) [18].

Currently, the most popular ontology development language is OWL, which was developed by the Web Ontology working group [14] under the WWW Consortium [79]. The OWL language supports more expressibility than RDF-S, providing the ability to define constraints and restrictions on the classes, attributes, and instances. Therefore it is an improved version of RDF-S, providing additional vocabulary with formal semantics [43].

Due to the widespread usage of ontologies, many currently exist that model and represent similar domains. Furthermore, different notions and terms can be used to describe the same things in the ontologies used within software systems. Software developers have a specific set of requirements and factors to consider before building an ontology for a software system. In addition, there is a possibility of human bias because of different perceptions and understandings of the meaning of a concept. Typically, a word or a set of words is employed to describe a concept, and it is possible that different combinations of words, synonyms, homonyms, and other relevant features of a language are used to define the same concept, which can be confusing. This independence in usage of language brings versatility and broad knowledge to an ontology, but may also lead to an ambiguous or incomplete definition of a concept.

To enable communication and sharing of information between multiple systems, discrepancies between their ontologies must be resolved to provide translation. Thus, information sharing requires ontology alignment, producing a mapping between the entities in the pair of ontologies [42]. In addition, alignment of ontologies representing a similar domain is required to provide a more integrated view of the domain. The emerging challenge is to resolve heterogeneity issues and discover similarities among different ontologies representing comparable domains. The approaches and systems developed to perform this challenging task are referred to as “ontology alignment” [30].

2.2 Ontology Alignment

Ontology alignment is a requirement for many ontology-dependent software systems in order for them to communicate with each other. For example, Semantic Web applications require effective and standard ways of communicating between different service providers [43]. These service providers expose characteristics of their services using ontologies. In order to exchange information, these ontologies need to be aligned [23]. A number of different types of alignment algorithms have been proposed in the literature to find mappings between the concepts of the given ontologies. Generally, one or more types of alignment algorithms are used in each of the more than a hundred different proposed alignment systems [37]. Due to this widespread scope, an outline of the commonly used types of algorithms will be provided in this section. Furthermore, a brief discussion about systems that incorporate those types of algorithms will be presented after providing a formal definition of ontology alignment.

2.2.1 Definition

Considering the source ontology as O_s and the target ontology as O_t, these two ontologies are provided as input to the ontology alignment process, which produces the one-to-one or one-to-many mappings indicating the relationship that exists between the relevant concepts. With the help of these mapping relationships, it can be deduced that the corresponding ontology concepts represent the same conceptual entity.

The mappings are generally presented in the form of (C_s, C_t, i), where C_s and C_t represent the concepts from the source and target ontologies while i shows the confidence level of similarity computed by the alignment algorithm. The range of the confidence level values is normally between 0 and 1. In order to be considered a mapping, the confidence level of the computed similarity must meet a pre-defined threshold value. A pictorial representation of this process is shown in Figure 2.2.
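The mapping triple and the threshold test described above can be illustrated with a minimal Python sketch. The class name `Mapping`, the function `accept`, the concept labels, and the threshold of 0.5 are assumptions made for illustration only; they are not taken from any particular alignment system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source: str        # concept C_s from the source ontology O_s
    target: str        # concept C_t from the target ontology O_t
    confidence: float  # similarity value i, expected to lie in [0, 1]


def accept(candidates, threshold=0.5):
    """Keep only the candidates whose confidence meets the threshold."""
    return [m for m in candidates if m.confidence >= threshold]


# Hypothetical candidates; note that one source concept may map to many targets.
candidates = [
    Mapping("Author", "Writer", 0.82),
    Mapping("Author", "Publisher", 0.31),
    Mapping("Paper", "Article", 0.50),
]
accepted = accept(candidates, threshold=0.5)
```

Here a candidate whose confidence equals the threshold is kept, which is one reading of "must meet" the threshold; a system could equally require the value to exceed it.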

2.2.2 Algorithm Types

Figure 2.2: Typical flow of an ontology alignment process.

A number of different approaches exist for performing automatic ontology alignment. The most common approach used by ontology alignment systems is string-based matching, which employs a name-matching approach to the labels of the ontology concepts. In these string-based methods, the computation of the similarity value between the concepts is based on the amount of relatedness among their names or labels. Certain string normalization techniques are applied to the input strings before the analysis of similarity is carried out (e.g., case folding, encoding, removal of blanks [37]). The result of comparing the input texts with string-based techniques can be reported as exact, which implies that they are equal, or approximate up to a level of confidence, which is obtained by using different similarity metrics. Examples of string comparison techniques include edit distance, soundex index, and prefix/suffix comparison [37]. Although string-based techniques provide support for determining concept similarity, they cannot overcome the complexity in the usage of a language to describe those concepts. Alignment systems that use string-based techniques include COMA [20] and COMA++ [1], OLA [27], Anchor-Prompt [61], and S-Match [36].
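As an illustration of string-based matching, the following Python sketch normalizes two labels (case folding and removal of blanks) and converts their edit distance into a similarity value between 0 and 1. It is a generic sketch under stated assumptions, not the implementation used by any of the systems cited above; the function names and the choice to divide by the longer label length are assumptions.

```python
def normalize(label):
    """Case-fold the label and remove blanks, underscores, and hyphens."""
    return "".join(ch for ch in label.casefold() if ch not in " _-")


def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def string_similarity(label_a, label_b):
    """1.0 for an exact match after normalization, lower values otherwise."""
    a, b = normalize(label_a), normalize(label_b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest
```

With this sketch, `string_similarity("Post Code", "post_code")` reports an exact match, while labels such as "Author" and "Writer" receive a low value even though they are synonyms, which is the limitation of string-based techniques noted above.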

In order to overcome the difficulty faced by the string-based approaches, language-based approaches are added to complement the string-based approaches. They provide more advanced support for concept-label matching by incorporating methods to counter the complexity introduced by languages due to the use of synonyms, homonyms, and phrases. Some methods include tokenization, removal of stop words, and reduction of a phrase or sentence to its stem words [37]. Ontology alignment systems that use the string-based techniques as well as language-based analysis of text include COMA [20], COMA++ [1], OLA [27], Anchor-Prompt [61], and S-Match [36].

The string-based and language-based analysis techniques offer a syntactic level of matching. However, the quality of alignment can be greatly improved by performing semantic matching analysis on the concept labels. Inclusion of domain-specific thesauri and lexical databases (e.g., WordNet [33]) for the matching strategy enables semantic-based comparison, which is often referred to as linguistic-based matching [37]. Since ontology concepts are organized in the form of hierarchical relationships, the usage of synonyms, homonyms, etc., (provided by a linguistic resource) in the matching strategy can be advantageous. Doing so also helps in determining the nature of the relationships that might exist among the concept labels, such as equivalence or generalization [37]. Systems that perform matching using linguistic resources include OLA [27], Cupid [55], and COMA [20].
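To make the language-based steps concrete (tokenization, removal of stop words, and reduction to stem words), the following Python sketch compares two labels by the overlap of their processed terms. The stop word list is a tiny illustrative assumption, `crude_stem` is a deliberately rough stand-in for a real stemming algorithm, and the Jaccard overlap is only one possible way of scoring the result.

```python
import re

# Tiny illustrative stop word list (an assumption, not a standard resource).
STOP_WORDS = {"of", "the", "a", "an", "and", "for", "has", "is"}


def tokenize(label):
    """Split a label on non-letters and on lower-to-upper camelCase boundaries."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", label)
    return [t.lower() for t in re.split(r"[^A-Za-z]+", spaced) if t]


def crude_stem(token):
    """Very rough suffix stripping; a placeholder for a real stemmer."""
    if token.endswith("ing") and len(token) > 5:
        return token[:-3]
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def label_terms(label):
    """Set of stemmed tokens with stop words removed."""
    return {crude_stem(t) for t in tokenize(label) if t not in STOP_WORDS}


def token_similarity(label_a, label_b):
    """Jaccard overlap of the two term sets."""
    a, b = label_terms(label_a), label_terms(label_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

Under these assumptions the labels "hasAuthorName" and "Names of the authors" reduce to the same term set, although their raw strings differ considerably. Synonyms such as "Author" and "Writer" still share no terms, which is where a linguistic resource such as a thesaurus would be needed.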

The alignment strategy can be further improved by including type-based methods, which use information that defines the type of concept (e.g., integer, float, string, date, etc.) to find appropriate matches. In addition, the limits of the values on such concepts are used to calculate the similarity between two concept labels. For example, if a concept from an ontology is of type integer and has a range of 0 to 2500, it can be marked as relevant to a similar concept with a different label in a different ontology with the same type and range [37]. Examples of systems that also use this method are OLA [27] and COMA [20].
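The type-and-range example above can be sketched in Python as follows. The class `TypedConcept`, its field names, and the strict equality test are assumptions for illustration; an actual system may compare types and value limits more leniently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TypedConcept:
    label: str
    datatype: str                    # e.g. "integer", "float", "string", "date"
    minimum: Optional[float] = None  # lower limit of the values, if any
    maximum: Optional[float] = None  # upper limit of the values, if any


def type_compatible(a, b):
    """True when both concepts share a datatype and the same value limits."""
    return (a.datatype == b.datatype
            and a.minimum == b.minimum
            and a.maximum == b.maximum)


# Hypothetical concepts from two different ontologies.
weight = TypedConcept("weightInGrams", "integer", 0, 2500)
mass = TypedConcept("mass", "integer", 0, 2500)
title = TypedConcept("title", "string")
```

In this sketch `weight` and `mass` are flagged as related despite their different labels, whereas `weight` and `title` are not.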

Along with the above approaches, some ontology alignment systems also add techniques such as decision-theoretic approaches [72] that ask for the knowledge manager’s feedback (i.e., accept or reject alignments) to select from the available alignment strategies (i.e., semantic, lexical, structural) for the given pair of ontologies [21]. For example, if the knowledge manager accepts a mapping suggestion that was found using the lexical alignment strategy, the importance of that strategy relative to the other alignment approaches will increase for the next alignment task. In other words, the next time, the knowledge manager will be provided with the suggestions produced by the lexical strategy ahead of the others. Using the knowledge manager’s feedback in the alignment process is highly desirable, but it is difficult to achieve [9]. Systems that use decision-theoretic approaches are COMA [1] and QOM [23].
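The feedback loop described above can be illustrated with a simple weighting heuristic in Python. This is only a sketch of the general idea that accepted suggestions raise the priority of the strategy that produced them; it is not the decision-theoretic method of [72], nor the mechanism used in COMA or QOM. The class name, the starting weight of 1.0, and the step size are all assumptions.

```python
class StrategyRanker:
    """Tracks one weight per alignment strategy, adjusted by user feedback."""

    def __init__(self, strategies, step=0.1):
        self.weights = {name: 1.0 for name in strategies}
        self.step = step

    def feedback(self, strategy, accepted):
        """Raise the weight on acceptance, lower it (not below 0) on rejection."""
        delta = self.step if accepted else -self.step
        self.weights[strategy] = max(0.0, self.weights[strategy] + delta)

    def ranked(self):
        """Strategies ordered from most to least trusted."""
        return sorted(self.weights, key=self.weights.get, reverse=True)


ranker = StrategyRanker(["semantic", "lexical", "structural"])
ranker.feedback("lexical", accepted=True)      # a lexical suggestion was accepted
ranker.feedback("structural", accepted=False)  # a structural suggestion was rejected
```

After these two pieces of feedback, the lexical strategy is ranked first, so its suggestions would be presented ahead of the others in the next alignment task.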

The current state-of-the-art method for matching is structural-based matching, which uses the hierarchical structure around a concept, in addition to the information provided by the single concept itself and the other, simpler alignment strategies, in order to compute mappings [37]. The main idea behind this type of matching is to consider the ontologies as graphs, and compare the hierarchical relationships of different concepts as sub-graphs using graph matching algorithms. For example, if the child concept of a parent concept is the same in terms of label and other attributes as a child of a different parent concept (even if the parent concepts’ labels are not similar to each other), they should be matched to each other.

In this dimension, similarity flooding [56] is a successful and common technique used in the context of structure matching. In this method, the idea is that if the nodes are similar to each other, their neighbors should also be similar. This approach propagates this similarity definition along the deeper levels of inheritance of the con-

finding the related semantics in the ontologies using the structural similarity analysis. Initially, the given ontologies are transformed into a labeled graph structure. The knowledge manager then selects some anchors (i.e., main entities) within the graph, or the system can also provide those automatically by performing lexical matching. The algorithm then analyzes the subgraph limited by these anchors and determines which classes frequently appear in similar positions on similar paths. These classes are likely to represent semantically similar concepts. The results produced by this approach are promising; therefore it is now used in many alignment systems as the underlying alignment algorithm [3, 31] including PROMPT [59], COMA [1], OLA [27], QOM [23], and RiMOM [54].

Since a single type of alignment algorithm cannot provide accurate and complete alignment results, multiple alignment algorithm types are incorporated to compute the alignments. Starting from basic alignment approaches such as string-based and language-based tools, alignment systems include combinations of different types of alignment algorithms to compute more accurate results. Table 2.1 presents the list of popular alignment systems along with the types of algorithms that they employ.

2.2.3 Human Involvement Within the Ontology Alignment Process

Table 2.1: List of popular alignment systems along with the types of algorithms employed within them to calculate mappings.

Alignment System   String & Language   Semantic   Type   Decision-theoretic   Structural
COMA               X                   X          X      X                    X
COMA++             X                   -          -      -                    -
OLA                X                   X          X      -                    X
PROMPT             X                   -          -      -                    X
S-Match            X                   -          -      -                    -
Cupid              -                   X          -      -                    -
QOM                -                   -          -      X                    X
RiMOM              -                   -          -      -                    X
Anchor-Prompt      X                   -          -      -                    -

Integrating different ontologies and computing associations between their concepts is considered a difficult task. Despite being an active research area for many years [30], the full automation of the ontology alignment process that can produce quality mappings on its own is far from being possible [37]. Due to the complexity of human language and variations in how the same information may be encoded in different ontologies, the output of automatic ontology alignment algorithms may include errors or omissions [66]. Hence, there is a need to incorporate human decision-making within the ontology alignment process. The human mind is better suited to understanding the nuances of language than automatic algorithms, and is very effective at making decisions based on incomplete or conflicting information [74]. The main idea is to produce better quality mappings by taking advantage of a human’s versatile knowledge about the domain and ability to interpret language, combined with the efficient ability of machines for storing large amounts of information and performing repetitive tasks [74].

Humans are good at developing novel solutions, adapting to the current situation, changing the strategy if the original solution fails, and making decisions [74]. On the other hand, machines can operate with vast amounts of data, recall information accurately, perform repetitive tasks quickly, and maintain performance for a longer time even when carrying out multiple tasks simultaneously [74]. Therefore, it is an active area of research to combine these human capabilities with the immense abilities of machines, and find a good balance between automation and manual work in performing the tasks at hand [73].

Systems combining human and machine capabilities in the context of ontology alignment are referred to as semi-automatic approaches in the literature [28] and are gaining importance in the ontology alignment research domain. They generally comprise two main parts: the first is the underlying alignment algorithm that is used by the system to find the set of candidate mappings, and the second is the design of the interface that is provided to the knowledge manager in order to support ontology alignment tasks.

The semi-automatic approaches build upon the work of automatic methods, providing information about the candidate mappings to allow the knowledge manager to either confirm or reject each mapping. This mapping validation process is more efficient than manual alignment due to the focused decision making that is supported by the suggestion of candidate mappings produced by the automatic algorithms. Since it may be possible for the automatic algorithms to miss important mappings, these semi-automatic approaches must also allow the knowledge manager to manually add mappings.

Semi-automatic approaches for ontology alignment have become crucial since automatic processes cannot satisfy the quality requirements of mapping related tasks [37, 29]. Involving humans in this process can make the alignment process more efficient, feasible, and useful [37]. With their versatile general knowledge (which is difficult to reproduce automatically in machines) along with the sophisticated and ample processing power of their visual system, humans bring invaluable advantages in solving problems [29]. Therefore, in addition to the development of the state-of-the-art alignment algorithms, careful consideration is also given to the design of the interface of these systems that facilitates the interaction of knowledge managers with these systems [37].

2.3 Ontology Alignment Interfaces

A wide range of interfaces have been proposed to support the human element of mapping validation and mapping addition within semi-automatic ontology alignment systems [28, 37]. A common theme among these interfaces is the graphical representation of both the ontologies themselves as well as the mappings. Rather than simply providing a textual list of candidate mappings, these interfaces take advantage of information visualization techniques to convey both the structure of the ontologies and the mapping information to the knowledge managers in a graphical format.

2.3.1 Information Visualization

Information visualization [82] is an interdisciplinary field that combines human-computer interaction, user interface design, cognitive psychology, computer graphics, and information systems to bridge the gap between humans and machines. The core idea is to build interactive visual methods to represent abstract data in order to amplify cognition and help in uncovering different aspects about the data. Information visualization enhances human cognition by using computer supported, interactive, and visual representations of abstract, non-spatial, semi-structured, or hierarchical data [49, 53, 82]. The visual methods developed for representations of abstract data take advantage of the swift processing capability of the human visual apparatus [82].

The visual perception system of humans is very powerful and quick in automatically interpreting certain visual features. Information visualization takes advantage of this quick processing ability to enhance the understanding and discovery of new insights about investigational data [53]. The information visualization field has been utilized to address the difficult problem of ontology alignment, which has resulted in a number of different ontology alignment interfaces [1, 20, 27, 31, 37, 59, 66].

Considering the amount and complexity of data that is available for different purposes, effective visual representations for this data are required. These can be developed using relevant information visualization theories and principles to effectively view, evaluate, manipulate, and explore such data. Visual and interactive representations can be further enhanced by taking into consideration the user’s expectations from the available information. The implementation of the ideas and the chosen design methods for the presentation of information, along with the evaluation of the product, are also important aspects of information visualization.

In the information visualization domain, theories and principles that help in understanding the working of the human mind when perceiving objects, interpreting data, and inferring logical conclusions from visual representations of the presented information are also studied extensively. There are many well-defined theories and principles such as pre-attentive processing, Gestalt Laws, and colour theory, which can be effectively used to build useful visualizations.

One phenomenon that occurs constantly in the human visual system is referred to as pre-attentive processing. The visual system of the human brain operates naturally at a very fast rate, which allows fast and seemingly parallel processing of information as soon as it enters through the retina. By studying the visual elements and conditions that trigger pre-attentive processing, data attributes can be mapped to visual elements that highlight important aspects of the presented information to enable quick processing. The main notion is that the information that is required to be processed pre-attentively needs to be represented carefully so that it stands out from other information. Another important fact is that only a small number of items that are visually different from the rest can be pre-attentively processed. This number can be found by performing repeated experiments for the given information and by measuring the response time of finding the information among the rest. If these conditions are not met, pre-attentive processing cannot be achieved, as the human visual system will not be able to detect any difference quickly and focus on the important information, which will lead to the typical option of performing a serial search to look for the required information.

There are a number of different visual variables that can be pre-attentively processed (e.g., curvature, line orientation, line width, shape, size, number, etc.). A curved line representing a valuable element of information can be pre-attentively processed and quickly identified when all the other data near this curved line is shown as straight

can be increased to enable its quick processing while keeping the widths of the other lines constant. Similarly, a limited number of oval shapes can be processed swiftly among a large number of circle shapes on a given display. Furthermore, a huge square can be easily differentiated from a set of smaller squares. Similarly, among sets of a certain number of circles (e.g., four circles) placed closely to each other and distributed over a space, a set containing a smaller number of circles than the rest (e.g., only one) can be swiftly distinguished. Other visual variables activating pre-attentive processing are also available that can be used for presenting different kinds of information within information visualization systems [82].

In order to design effective information visualization systems for users, it is important to study and understand how humans perceive patterns and interpret information. In this direction, an important step was taken in 1912 by a group of psychologists from the Gestalt school of psychology, who designed a series of experiments and performed a number of evaluations to examine pattern perception [82]. The results of these evaluations are commonly known as the Gestalt Laws of Pattern Perception in the field of information visualization.

The Gestalt Laws [50] fundamentally deal with two types of concepts: perception of relationships, and perception of foreground from background. In general, all the Gestalt Laws pertain to understanding how the human mind perceives patterns. However, three Gestalt laws that focus on how humans automatically infer the existence of a relationship between things when they are near one another, visually similar to each other, and connected to one another, are relevant and important for this research.

The law of proximity suggests that objects that are positioned near each other are automatically perceived to be related. Due to the usefulness of this spatial proximity interpretation, it is commonly used as a design principle in information visualization systems [82]. Visual objects that represent data belonging to the same group can be placed near each other to facilitate the automatic perception of their relatedness. It is considered to be the simplest way to accentuate the relationships between different data elements and is commonly used to visually represent clusters [82].

As emphasized by the law of similarity, visual objects that are drawn similarly are perceived to be related [82]. This law is commonly used as a design principle when building user interfaces. For example, same-size circles filled with black will be perceptually grouped when placed in the same interface as same-size circles that are filled with red. Therefore, similar types of visual techniques are generally used to show elements that represent related data.

The law of connectedness [82] has resulted in a commonly used design principle to show a relationship between two objects. This law asserts that the simplest and most effective way of expressing that any kind of relationship exists between two graphical objects is by connecting them through a line [82]. This design principle has been used and extended into many graphical forms and structures that are used to show relationships between contained objects (e.g., graphs, trees, node-link diagrams).

The Opponent Process Theory of Colour [57] provides details about the interpretation of colour by the human vision apparatus. It describes the process through which the human eye can perceive the difference between colours and the ways in which this raw information is processed within the human brain. The fact that we can perceive different colours is a function of our cone sensitivity to different wavelengths of light. The ability for the mind to easily determine the difference between red and green,

of Colour.

According to this theory, the human brain combines the stimulus from the three available cones (long, medium, short) before processing them, which results in six elementary colours that are perceived as opponent pairs along three channels: red-green, yellow-blue, and black-white. We can see the differences between colours on these channels much more easily than with arbitrary colours. For example, a red object can be seen within a majority of green objects much better than if it is surrounded by yellow objects. Using the guidelines of this theory, the six different colours forming these channels are effective choices to encode the distinctness of different types of data attributes [82]. Objects that are coloured using these six primary colours are automatically perceived as different, and can be carefully used to represent different kinds of information on the same screen. This theory also provides guidelines for using colours to present information containing order and amount or intensity. For example, moving along the black-white channel, a darker colour can be used to fill a shape or area representing a bigger amount or higher rank, and a lighter shade of the same colour can be used to show a smaller amount or lower rank [76]. Similarly, a range of negative values to positive values can be shown using the red-green colour channel.
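As a small illustration of encoding a range of negative to positive values on the red-green channel, the following Python sketch maps a value to an RGB colour. The function name, the symmetric value range, and the use of white as the neutral midpoint are assumptions made for illustration; they are not prescribed by the theory or by the cited sources.

```python
def diverging_red_green(value, limit=1.0):
    """Map a value in [-limit, limit] to an (r, g, b) tuple.

    Negative values fade from white towards red, positive values fade
    from white towards green, and values beyond the limit are clamped.
    """
    v = max(-limit, min(limit, value)) / limit
    if v < 0:
        fade = int(round(255 * (1 + v)))  # 255 at v = 0, 0 at v = -1
        return (255, fade, fade)
    fade = int(round(255 * (1 - v)))      # 255 at v = 0, 0 at v = 1
    return (fade, 255, fade)
```

The same pattern, with a single hue fading from light to dark, could be used for the ordered black-white encoding of amount or rank mentioned above.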

Another important aspect of information visualization is the support of interaction. It is assumed that many information visualization systems today have basic interaction features such as zooming, filtering, and focusing. Interaction is typically integrated into the visual representations, which allows for a change in the presented information [85]. With the interaction support, users can administer and manipulate the presented information according to their understanding and requirements. Therefore, interaction techniques can improve the user’s cognition, allowing them to easily process and investigate the data at a deeper level, and then make the relevant decisions. Information visualization systems without interaction still exist; they are basically static images and a very limited number of tasks can be performed using such systems [19]. In the absence of interaction support, the systems can easily be classified as not user-friendly and their value can be adversely affected, specifically those systems that deal with large data sets [85].

2.3.2 Representation of Ontologies

The field of information visualization provides theories and principles to deal with complex and large amounts of data. Since ontologies contain large collections of data structures that represent constructs of domains, visualization has an enormous potential in dealing with ontologies [53]. Consequently, many ontology visualization systems incorporate different visualization mechanisms for presenting ontologies.

The purpose of providing ontologies in a visual format is to allow knowledge managers to take advantage of their visual processing capabilities. With an effective visual representation, knowledge managers are able to readily perceive, interpret, and make sense of the features, relationships, and structure among the data [10]. Therefore, showing ontologies in graphical format enhances the knowledge managers’ cognitive abilities to visually process the presented ontologies, which can help them in effectively performing the associated tasks.

Because the ontologies themselves are structured as a hierarchy of concepts, the logical method for representing them is in a tree structure [48]. This approach is used in virtually all of the prior research on ontology alignment interfaces [37, 53]. The structure of ontologies, along with their relationships, is shown as a hierarchy when tree-based representations are employed. Under this category, the systems generally present ontologies in a manner similar to the Windows Explorer view [48]. Many systems that are specifically designed for ontology development fall in this group, including Protégé [60], OntoEdit [77], and OntoRama [24]. Knowledge managers are often already familiar with these methods; therefore, these systems generally have a high acceptability rate and usefulness [48]. Since the ontologies themselves may be much larger than what can fit on a regular computer screen even using the tree structure, interactive features such as node collapsing/expansion, vertical scrolling, and zooming are often implemented.

Tree-based representations are highly structured and regular in their information layouts, and are simple to implement and display. This representation offers a clear view of the class names and hierarchy, and the labels representing the meaning of the nodes. Since every node is individually positioned and allocated a defined space, tree-based representations generally have no problems in terms of overlapping and occlusion of node labels [48].

However, researchers argue that tree-based representations typically make inefficient use of the available screen space [32]. The root side of the tree is left completely empty, leading to overcrowding in other parts. Even trees of a hundred nodes often need multiple screens to be completely displayed, or require scrolling since only part of the diagram is visible at a given time. In addition, certain interaction issues, such as having to drag the scroll bars to navigate, lead to negative opinions about their usefulness [48], as it is difficult to keep track of the nodes of interest with the scroll bar when the tree is large. OntoViz [75] addresses this space-limitation problem by giving an option of interactively collapsing nodes. The knowledge manager can select the nodes which they would like to see and the selected concept is then shown with its sub-hierarchies or related nodes. In this direction, SpaceTree [64] uses expansion and contraction options for its sub-hierarchies to avoid the problem of space limitation. The evaluations suggest that SpaceTree [64] performs better in comparison with others in specific ontology-related tasks such as finding a particular node.

Some researchers have explored the use of unstructured graph representations for the ontologies [48, 53]. The lack of structure of these representations implies that they can be very compact and provide advantages in terms of available space on the display. However, they are less effective for the purposes of ontology alignment due to the fact that the hierarchical structure of the ontology is difficult to perceive and understand. Furthermore, it can be rather difficult to compare subsets of the ontological structure. Because of these problems and the prevalence of tree-based representations, in this work the tree-based representations have been adopted.

2.3.3 Representation of Mappings

In alignment systems that use tree-based representations, source and target ontology trees are normally shown on opposite sides of the screen while the middle region is used to represent mappings between the entities. In this direction, the source and target ontology trees are placed on the opposite sides of the screen in PROMPT [59, 62]; however, instead of straight or curved lines, the mappings are represented by a list component shown in the middle, which displays all the produced mappings in a list.

The mapping between two concepts represents the association between two data elements. Since humans interpret two things as being related if they are connected by lines as suggested by the Gestalt Law of connectedness [50], the easiest method to make users infer and trace the connectedness between the two matched ontology concepts is by linking them via a line. The relatedness encoded using lines is automatically perceived and quickly processed through the pre-attentive capabilities of the human brain. This encoding can be used by systems to represent association among objects (i.e., mappings and matched concepts).

While early approaches used straight lines to connect associated entities between the ontologies, there are visual difficulties when following such connections due to the sharp corners that are created by edge crossings. For example, AgreementMaker [16] displays the hierarchical relationship between the ontology classes (i.e., the input ontologies) using the tree-based representation (see Figure 2.3). The source and target ontology trees are placed on the opposite sides of the screen, while straight lines are used to represent the mappings between the mapped class nodes [37]. One feature is that the amount of similarity among the mapped concepts is displayed as a percentage along the mapping lines, which can be noticed easily by the knowledge manager. However, visually encoding this information along the mapping line causes visual clutter. Another feature of the system is that mapping lines can be filtered according to the similarity threshold set by the knowledge manager. More information about an individual concept in the two given ontologies is made available separately in a detail window when the knowledge manager clicks on the node representing that concept. The knowledge manager can also add their own mappings by selecting any two nodes and then triggering the relevant operation in the menu.

Since the human eye can more readily follow curved lines [82], some ontology alignment interfaces now use curves to represent the mappings [1, 31]. COMA++ [1, 42] has a similar graphical user interface to AgreementMaker [16], showing the ontologies on the opposite sides of the screen (see Figure 2.4). However, the mappings found by the underlying mapping algorithm are drawn using curved lines rather than straight ones. The features in terms of relevant mapping operations (i.e., validation and rejection) are also similar, such as allowing knowledge managers to define their own mappings for matches that the mapping algorithm missed. A confidence measure is associated with each mapping, which shows the level of confidence the mapping algorithm has for the correctness of the particular mapping.

The status of the mapping can also be conveyed by visual parameters of the curve. In some systems, colour is used to represent the difference between candidate mappings and confirmed mappings [1, 31]. Line style (e.g., solid, dashed) may also be used to convey this information [31], allowing the knowledge manager to readily identify features of individual mappings.

An open issue in this area is whether to visually represent mappings when either end of the mapping is not visible in the ontology representation. This situation may occur when a particular portion of one ontology is collapsed, or when an entity is not visible due to the scrolling of a large ontology. Some systems continue to show these mappings, with the end pointing to a collapsed node or being directed off the bottom or top of the display [31, 67]. Another method is to dynamically filter these mappings, adding them back in as the collapsed node is expanded or as the knowledge

Figure 2.3: Screenshot of the interface of the AgreementMaker system [15, 16]

Figure 2.4: Screenshot of the interface of the COMA++ system [28]

2.4 Ontology Mappings Representation Issues

Improving the representation of ontology mappings has been given little attention in the literature compared to the development of alignment algorithms (Section 2.2.3) and ontology representations (Section 2.3.2). Knowledge managers of ontology alignment systems find two major shortcomings: a lack of basic visualization and navigation of the mappings that relate two elements of the schemas, and the interface easily becoming disorganized when the ontologies and the number of mappings are too large [67].

One of the fundamental problems with visually representing mappings with lines or curves is that edge crossings may occur when the order of the entities is not consistent between the ontologies. Although the use of curves to represent the mappings makes these edge crossings easier to follow, the problem is not solved with the use of curves. The candidate mappings produced by the alignment algorithms are generally very large in number. As such, when representing a realistic set of mappings, the edge crossings may cause a significant degree of visual clutter.
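To illustrate why inconsistent entity ordering produces edge crossings, the following Python sketch counts the crossings among mappings drawn as straight lines between two vertical lists. Each mapping is given as a pair of row positions (source row, target row); this representation and the function name are assumptions made for illustration, and the pairwise comparison is the simplest, not the fastest, way to count crossings.

```python
from itertools import combinations


def count_crossings(mappings):
    """Count crossings among mappings given as (source_row, target_row) pairs."""
    crossings = 0
    for (s1, t1), (s2, t2) in combinations(mappings, 2):
        # Two mapping lines cross when their endpoints appear in opposite
        # order on the source side and on the target side.
        if (s1 - s2) * (t1 - t2) < 0:
            crossings += 1
    return crossings


# Hypothetical row positions of mapped entities in the two tree views.
consistent = [(0, 0), (1, 1), (2, 2)]      # same order on both sides
reversed_order = [(0, 2), (1, 1), (2, 0)]  # opposite order on the target side
```

In this sketch the consistently ordered mappings produce no crossings, while reversing the target order makes every pair of mappings cross. Since the number of pairs grows quadratically with the number of mappings, a realistic set of candidate mappings can produce a very large number of crossings.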

Clutter is an important concept in daily life and is considered with great significance in the information visualization domain [69]. It is defined as a state in which surplus items, or their representation or presentation style, may lead to a degradation of performance at some task [69]. For example, clutter can interfere with normal tasks such as searching for an important item (e.g., a research paper on a disorganized desk that has lots of other irrelevant papers lying around). In the language of user interface design, it refers to a situation that has the potential to create obstacles in performing a simple task such as visual searching. When designing user interfaces, it is one of the basic problems faced by systems that present an extensive amount of information and have limited display space. Several attempts have been made to address this phenomenon and provide different mechanisms to measure visual clutter [69]. Many information visualization techniques also exist for addressing this problem [25].

In the context of ontology alignment interfaces, this clutter due to mapping edge crossings reduces the usability of the interface, making it difficult for the knowledge manager to visually trace a mapping from source to destination, or to make comparisons between mappings. While one approach might be to find an optimal ordering of entities in the ontology representations, there is no guarantee that such an ordering exists nor that it is an ordering that will make sense to the knowledge manager.
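The limits of reordering can be shown with a small brute-force sketch. It assumes flat entity lists and straight-line mappings, and it ignores the additional constraints that a tree hierarchy would impose on the ordering; the helper names and the entities `A`, `B`, `X`, `Y` are hypothetical.

```python
from itertools import combinations, permutations


def count_crossings(source_order, target_order, mappings):
    """Count crossing pairs of straight mapping lines between two lists."""
    src_pos = {entity: i for i, entity in enumerate(source_order)}
    tgt_pos = {entity: i for i, entity in enumerate(target_order)}
    return sum(
        1
        for (s1, t1), (s2, t2) in combinations(mappings, 2)
        if (src_pos[s1] - src_pos[s2]) * (tgt_pos[t1] - tgt_pos[t2]) < 0
    )


def min_crossings(source, target, mappings):
    """Try every ordering of both lists (feasible only for tiny inputs)."""
    return min(
        count_crossings(s, t, mappings)
        for s in permutations(source)
        for t in permutations(target)
    )


source = ["A", "B"]
target = ["X", "Y"]

# One-to-one mappings: some ordering removes all crossings.
one_to_one = [("A", "Y"), ("B", "X")]

# Each source entity is a candidate match for both target entities.
many_to_many = [("A", "X"), ("A", "Y"), ("B", "X"), ("B", "Y")]

best_one_to_one = min_crossings(source, target, one_to_one)
best_many_to_many = min_crossings(source, target, many_to_many)
print(best_one_to_one, best_many_to_many)
```

In this toy case the many-to-many candidate set keeps one crossing under every ordering, so reordering alone cannot remove it. The exhaustive search is exponential in the number of entities and is intended only as an illustration, not as a practical ordering method.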

Most of the ontology alignment systems, including the ones discussed in Sections 2.2.3 and 2.3.3, provide very basic visualization tools and limited interaction options for manipulating mappings and conducting alignment tasks. As the knowledge manager has to perform the exhausting task of validating candidate mappings produced by the system, there is a need for a better organization of ontology mappings combined with interaction support for ontology alignment tasks (i.e., mapping validation and addition).

In this direction, BizTalk Mapper [67] attempts to employ advanced visualization approaches to lessen the cluttering problem, as well as many visual techniques to support the knowledge manager in performing mapping tasks. It supports different kinds of auto-scrolling. For example, given the source and target ontology trees presented on opposite sides of the screen, when the knowledge manager selects an item, the relevant objects on the screen become prominent through highlight propagation. In addition, the interface is auto-scrolled to bring the highlighted mappings and the selected item into the middle of the screen. This way, all the relevant links to the selected item are shown as much as possible. Another interactive feature is multi-select, which allows the manager to see multiple elements and the relations that exist between them. The mapping links are also bendable so that managers can change the formation of the mapping lines according to their needs.

Even with these features, when there are a large number of mappings, it remains difficult to avoid clutter in the representation. The fundamental problem is that each individual mapping continues to be represented by a straight or curved line within the interface. This causes significant edge crossings, even with a small number of mapping edges, which introduces visual clutter into the interface. An example of this problem can be seen in Figure 2.5. The proposed approach addresses this fundamental issue by improving the visual representation of the edges.

Figure 2.5: A snapshot highlighting the cluttering problem in the BizTalk Mapper system [67].
