Towards Automatic Generation of NoSQL Document-Oriented Models


HAL Id: hal-02295340

https://hal.archives-ouvertes.fr/hal-02295340

Submitted on 24 Sep 2019


Fatma Abdelhedi, Amal Ait Brahim, Faten Atigui, Gilles Zurfluh

To cite this version:

Fatma Abdelhedi, Amal Ait Brahim, Faten Atigui, Gilles Zurfluh. Towards Automatic Generation of NoSQL Document-Oriented Models. 24th International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2018), Jul 2018, Las Vegas, Nevada, United States. pp.47-53. ⟨hal-02295340⟩


This is an author’s version published in:

http://oatao.univ-toulouse.fr/22433

Official URL

https://csce.ucmss.com/cr/books/2018/LFS/CSREA2018/PDP3263.pdf


To cite this version:

Abdelhedi, Fatma and Ait Brahim, Amal and Atigui, Faten and Zurfluh, Gilles. Towards Automatic Generation of NoSQL Document-Oriented Models. (2018) In: 24th International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2018), 30 July 2018 - 2 August 2018 (Las Vegas, Nevada, United States).

F. ABDELHEDI, A. AIT BRAHIM, F. ATIGUI and G. ZURFLUH

IRIT, Toulouse Capitole University, Toulouse, France

CBI² – TRIMANE, Paris, France

CEDRIC-CNAM, Paris, France

Abstract - Volume, Variety and Velocity are the three dimensions that have definitely impacted the tools required to store Big Data. Adapted data management tools have arisen, i.e. NoSQL systems. Compared to existing DBMS, NoSQL systems are commonly accepted to support larger volumes of data and to provide faster data access, better scalability and higher flexibility. While NoSQL solutions have proven their efficiency in handling Big Data, how to ensure the automatic storage of Big Data in these systems is still an unsolved problem. The aim of this paper is to propose a precise and automatic approach that guides and facilitates the Big Database implementation task within document-oriented systems, considered as the new generation of DBMS technology. Our approach assists developers in mapping a Big Database UML conceptual model into document-oriented physical models; it relies on a unified model designed for document-oriented systems. Instances of this model can then be generated to target a specific NoSQL platform.

Keywords: Big Data storage, NoSQL, UML conceptual model, document-oriented model, MDA

1 Introduction

In the Big Data era, there must be DBMS able to store large datasets effectively and with high performance. Not only is the amount of data on a completely different level than before, but the data also come in various types, differing in format, structure and source. Furthermore, the speed at which these data must be collected and analyzed is increasing. In such a scenario, Big Data applications need a database solution that easily integrates all possible data structures while offering lower latency, better scalability and higher flexibility.

Relational systems are a mature data management technology. However, with the rise of Big Data, these systems became unfit for large and distributed data management. The major problems of relational technologies are: (1) horizontal scaling: relational databases are mainly designed for single-server configurations; to scale, a database has to be distributed over multiple powerful servers, which is expensive, and handling tables across different servers is a complex task; (2) a strict data model that must be designed before data processing: in the context of Big Data it should be easy to add new data, but relational models are hard to change incrementally without impacting performance or taking the database offline.

Compared to existing DBMS, NoSQL systems are commonly accepted to support larger volumes of data and to provide flexible data models, low latency at scale and faster data access [ ]. NoSQL covers a wide variety of systems that can be classified into four basic types: key-value, column-oriented, document-oriented and graph-oriented. In this paper, we focus on document-oriented systems. This is justified by the fact that our case study, detailed in section "Motivation", requires processing operations that access hierarchically structured data, and that document-oriented systems have proven to be the most suitable solution for this kind of operation [ ].

Today, developers have to deal with the problem of storing Big Data in NoSQL systems. New approaches, guidelines and techniques are required to overcome the complexity of this task. In this context, this paper presents a new automatic approach that guides and facilitates the Big Database implementation task within document-oriented systems.

The remainder of the paper is structured as follows: Section 2 motivates our work using a case study in the healthcare field; Section 3 introduces our approach; Sections 4 and 5 detail our contributions; Section 6 presents our experiments; Section 7 reviews previous work. Finally, Section 8 concludes the paper and announces future work.

2 Motivation

2.1 Case study

To motivate and illustrate our work, we present a case study in the healthcare field. This case study concerns national or international scientific programs for monitoring patients with serious diseases. The main goals of such a program are (1) to collect data about disease development over time, (2) to study interactions between different diseases, and (3) to evaluate the short- and medium-term effects of their treatments. The medical program can last up to … years. Data collected from the establishments involved in such a program have the characteristics of Big Data (the 3 V):

Volume: the amount of data collected from different health …

Variety: the data created while monitoring patients come in different types and formats. Therefore, the DW used for this application will contain (1) structured data (respiratory rate, blood pressure, temperature, patient name, diagnosis codes, etc.), (2) unstructured data (patient histories, visit summaries, paper prescriptions, radiology reports…) and (3) semi-structured documents, such as the package leaflets of medicinal products, which provide a set of comprehensible information enabling the medicinal product to be used safely and appropriately.

Velocity: some data are produced in a continuous flow by sensors; they must be processed in near real time because they can be integrated into time-sensitive processes. For example, some measurements, like temperature, require emergency medical treatment if they cross a given threshold.

2.2 Necessity of conceptual model for Big Data applications

One of the key features of NoSQL is that databases can be schema-less: in a table, the attribute names and values are specified at the moment a row is inserted. Unlike relational systems, where the user first defines the schema and creates the tables and only then inserts data, the schema-less property offers undeniable flexibility that facilitates the evolution of the physical model. End-users are able to add information without the need for a database administrator. For instance, in the medical program that follows up patients suffering from a chronic pathology (the case study presented in the previous section), one of the benefits of using NoSQL databases is that the evolution of the data (and schema) is fluid. In order to follow the evolution of the pathology, information is entered regularly for a cohort of patients. But the situation of a patient can evolve rapidly, which requires new information to be recorded. Thus, a few months later, each patient will have his own information, and that is how data will evolve over time. Therefore, the data model (i) differs from one patient to another and (ii) evolves in an unpredictable way over time. We should highlight that this flexibility concerns the physical level, i.e. the stored database, exclusively [ ].

In information systems, the importance and the necessity of conceptual models are widely recognized. The conceptual model provides a high level of abstraction and a semantic knowledge element close to human comprehension, which guarantees efficient data management [ ]. Furthermore, this model is a document of interchange between end-users and designers, and between designers and developers. The conceptual model is also used for system maintenance and for evolutions that can affect business needs and/or the deployment platform. The Unified Modeling Language (UML) is widely accepted as the standard for information system modeling [ ].

2.3 Objective utility

On the one hand, NoSQL systems have proven their efficiency in handling Big Data. On the other hand, the need for a conceptual modeling and design approach remains current. Therefore, we are convinced that it is important to provide a precise approach that guides and facilitates the Big Database implementation task within NoSQL systems. This approach will assist developers in mapping a Big Database UML conceptual model into NoSQL physical models.

For this, we propose the Object2NoSQL MDA-based approach presented in the following section.

3 Object2NoSQL approach

3.1 Three-Level architecture of Object2NoSQL process

In this paper, we propose the Object2NoSQL approach, which starts from a UML class diagram and generates NoSQL physical models. We introduce a logical level between the conceptual (business description) and physical (technical description) levels, in which a document-oriented logical model is developed.

The need for this intermediate (logical) layer is justified by referring to the ANSI/SPARC architecture. This architecture shows conceptual (business description) and internal (technical description) levels. The latter may be decomposed into two levels: logical and physical [ ]. The logical level aims to provide an intermediate model that describes the structure of the data without itemizing the specific features of each system. This ensures data independence, meaning that the upper level is isolated from changes to the lower level.

In our scenario, the logical model corresponds to a NoSQL model that can be implemented on different document-oriented platforms. The advantage of using this unified model is to limit the impact of the technical aspects of NoSQL systems. Thus, technological changes to the NoSQL system (or even its replacement by another system) will appear in the physical model but will not affect the logical model. Only the transformation process "Logical Model → Physical Model" will be adapted and restarted; there will be no impact on the "Conceptual Model → Logical Model" transformation. This simplifies the transformations and saves developers effort and time.

3.2 Formalization tool

To formalize and automate the Object2NoSQL process, we use the Model Driven Architecture (MDA) [ ]. MDA is well known as a framework for automatic model transformations. One of its main aims is to separate the functional specification of a system from the details of its implementation on a specific platform. This architecture defines a hierarchy of models from three points of view: Computation Independent Model (CIM), Platform Independent Model (PIM) and Platform Specific Model (PSM) [ ]. Among these models, we use the PIM to describe data while hiding all aspects related to the implementation platforms, and the PSM to represent data using a specific technical platform.

3.3 Component of Object2NoSQL process

In our scenario, the UML class diagram and the document-oriented logical model belong to the PIM level. At the PSM level, we consider NoSQL physical models that correspond to document-oriented platforms. Switching between models is ensured by M2M (Model-To-Model) transformations; such transformations describe a mapping between the elements of the source and target models. Figure … shows the different components of the Object2NoSQL process.

Object2DOLM is the first step in the process. It transforms the input UML Class Diagram into the Document-Oriented Logical Model (DOLM) presented in Section 4. DOLM2DOPM is the second step; starting from the DOLM, it generates Document-Oriented Physical Models (DOPMs) as well as a set of guidelines that the developer needs in order to build the application interface. This interface will be used by doctors to perform tasks such as data entry.

Indeed, depending on how their physical data model is defined, NoSQL systems can be classified into three categories:

(1) Systems where the data model is fixed beforehand. In other words, as in relational systems, attribute names and types must be defined when creating each table. Data entry is only possible after the model has been fully defined. Example: Cassandra (column-oriented system) and Riak TS (key-value system).

(2) Systems where only part of the model is defined before the user inserts data; this usually concerns specifying table names. This is the case in MongoDB (document-oriented system) and HBase (column-oriented system).

(3) Systems where the data model is specified as and when a row is inserted. The user inserts each row specifying the table name as well as the attribute names. Example: Neo4j (graph-oriented system) and Redis (key-value system).

In the last two categories, systems do not require the full definition of the physical model before data entry. However, the developer must know which attributes to use and needs guidelines on how to implement relationships. Thus, in addition to generating the elements needed for creating the NoSQL physical model, our approach provides details and guidelines that allow the developer to build the application interface.

4 Object2DOLM transformation

In this section we present the Object2DOLM transformation, which is the first step in our approach, as shown in figure … We first define the source (UML Class Diagram) and the target (Document-Oriented Logical Model). After that, we focus on the transformation itself.

4.1 Source: UML class diagram

A Class Diagram (CD) is defined as a tuple $(N, C, L)$, where:

$N$ is the class diagram name,

$C$ is a set of classes. Classes are composed of structural and behavioral features. In this paper, we consider the structural features only; since the operations describe the behavior, we do not consider them. For each class $c \in C$, the schema is a tuple $(N, A, Ident^{c})$, where:

• $c.N$ is the class name,

• $c.A = \{a_1^{c}, \dots, a_q^{c}\}$ is a set of $q$ attributes. For each attribute $a^{c} \in A$, the schema is a pair $(N, C)$ where $a^{c}.N$ is the attribute name and $a^{c}.C$ the attribute type; $C$ can be a predefined class, i.e. a standard data type (String, Integer, Date...), or a business class (a class defined by the user),

• $c.Ident^{c}$ is a special attribute of $c$; it has a name $Ident^{c}.N$ and a type called "Oid". In this paper, an attribute whose type is "Oid" represents a unique object identifier, i.e. an attribute whose value distinguishes an object from all other objects of the same class [ ].

$L$ is a set of links. Each link $l$ between $n$ classes, with $n \geq 2$, is defined as a tuple $(N, Ty, Pr^{l})$, where:

• $l.N$ is the link name,

• $l.Ty$ is the link type. In this paper, we consider the three main types of links between classes: Association, Composition and Generalization,

• $l.Pr^{l} = \{pr_1^{l}, \dots, pr_n^{l}\}$ is a set of $n$ pairs. $\forall i \in \{1..n\}$, $pr_i^{l} = (c, cr^{c})$, where $pr_i^{l}.c$ is a linked class and $pr_i^{l}.cr^{c}$ is the multiplicity placed next to $c$. Note that $pr_i^{l}.cr^{c}$ can contain a null value if no multiplicity is indicated next to $c$ (as in a generalization link).
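The tuple $(N, C, L)$ above can be written down as plain data structures. The following is only a minimal sketch, assuming Python dataclasses; the class and field names (`Attribute`, `UmlClass`, `Link`, `ClassDiagram`) and the sample diagram `MedicalProgram` are illustrative choices, not part of the paper's metamodel.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# The three link types considered in the paper.
LINK_TYPES = {"Association", "Composition", "Generalization"}


@dataclass
class Attribute:
    name: str  # a.N
    type: str  # a.C: a standard data type or a business class name


@dataclass
class UmlClass:
    name: str   # c.N
    ident: str  # Ident^c.N, the attribute typed "Oid"
    attributes: List[Attribute] = field(default_factory=list)  # c.A


@dataclass
class Link:
    name: str                              # l.N
    type: str                              # l.Ty
    ends: List[Tuple[str, Optional[str]]]  # l.Pr: (class name, multiplicity or None)

    def __post_init__(self) -> None:
        if self.type not in LINK_TYPES:
            raise ValueError(f"unknown link type: {self.type}")
        if len(self.ends) < 2:
            raise ValueError("a link connects at least two classes")


@dataclass
class ClassDiagram:
    name: str                                              # N
    classes: List[UmlClass] = field(default_factory=list)  # C
    links: List[Link] = field(default_factory=list)        # L


# Hypothetical excerpt of a healthcare class diagram.
patient = UmlClass("Patient", "idPatient", [Attribute("name", "String")])
doctor = UmlClass("Doctor", "idDoctor", [Attribute("speciality", "String")])
follows = Link("Follows", "Association", [("Patient", "*"), ("Doctor", "1")])
cd = ClassDiagram("MedicalProgram", [patient, doctor], [follows])
```

A generalization link would simply carry `None` as the multiplicity of each end, matching the null value allowed by the definition.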

4.2 Target: Document-Oriented logical model

The Document-Oriented Logical Model (DOLM) mainly shows tables and their inter-relationships (binary relationships). In this section, we present this model.

In DOLM, DataBase (DB) is the top-level container that owns all the elements. It is defined as a tuple $(N, T, R)$, where:

$N$ is the database name,

$T$ is a set of tables. The schema of each table $t \in T$ is a tuple $(N, A, Ident^{t})$, where:

• $t.N$ is the table name,

• $t.A = \{a_1^{t}, \dots, a_q^{t}\}$ is a set of $q$ attributes that will be used to define the rows of $t$; each row can have a variable number of attributes. The schema of each attribute $a^{t} \in A$ is a pair $(N, Ty)$ where $a^{t}.N$ is the attribute name and $a^{t}.Ty$ the attribute type,

• $t.Ident^{t}$ is a special attribute of $t$; it has a name $Ident^{t}.N$ and a type called "Rid". In this paper, an attribute whose type is "Rid" represents a unique row identifier, i.e. an attribute whose value distinguishes a row from all other rows of the same table.

$R$ is a set of binary relationships. Each relationship $r \in R$ between $t_1$ and $t_2$ is defined as a tuple $(N, Pr^{r})$, where:

• $r.N$ is the relationship name,

• $r.Pr^{r} = \{pr_1^{r}, pr_2^{r}\}$ is a set of two pairs. $\forall i \in \{1, 2\}$, $pr_i^{r} = (t, cr^{t})$, where $pr_i^{r}.t$ is a related table and $pr_i^{r}.cr^{t}$ is the multiplicity placed next to $t$.
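To make the DOLM definition concrete, here is a small sketch of the $(N, T, R)$ tuple, assuming Python dataclasses. The names `Table`, `Relationship`, `DataBase` and the helper `accepts` are hypothetical; `accepts` only illustrates the remark that each row may use a variable number of the attributes in $t.A$ while always carrying its "Rid".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

End = Tuple[str, Optional[str]]  # (table name, multiplicity or None)


@dataclass
class Table:
    name: str   # t.N
    ident: str  # Ident^t.N, the attribute typed "Rid"
    attributes: Dict[str, str] = field(default_factory=dict)  # a.N -> a.Ty

    def accepts(self, row: dict) -> bool:
        """A row must carry the Rid and may use any subset of t.A."""
        unknown = set(row) - set(self.attributes) - {self.ident}
        return self.ident in row and not unknown


@dataclass
class Relationship:
    name: str              # r.N
    ends: Tuple[End, End]  # r.Pr: exactly two (table, multiplicity) pairs


@dataclass
class DataBase:
    name: str                                                      # N
    tables: List[Table] = field(default_factory=list)              # T
    relationships: List[Relationship] = field(default_factory=list)  # R


# Hypothetical logical model for the healthcare case study.
patient = Table("Patient", "idPatient", {"name": "String", "profession": "String"})
doctor = Table("Doctor", "idDoctor", {"speciality": "String"})
follows = Relationship("Follows", (("Patient", "*"), ("Doctor", "1")))
db = DataBase("MedicalProgram", [patient, doctor], [follows])
```

Keeping `Relationship.ends` as a pair (rather than a list) mirrors the fact that DOLM relationships are always binary.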

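4.3 Transformation Rules

R1: each class diagram $CD$ is transformed into a database $DB$, where $DB.N = CD.N$.

R2: each class $c \in C$ is transformed into a table $t \in DB$, where $t.N = c.N$ and $Ident^{t}.N = Ident^{c}.N$.

R3: each class attribute $a^{c} \in c.A$ is transformed into a table attribute $a^{t}$, where $a^{t}.N = a^{c}.N$ and $a^{t}.Ty = a^{c}.C$, and added to the attribute list of its transformed container $t$, such that $a^{t} \in t.A$.

R4: each binary link $l \in L$ (regardless of its type: Association, Composition or Generalization) between two classes $c_1$ and $c_2$ is transformed into a relationship $r \in R$ between the tables $t_1$ and $t_2$ representing $c_1$ and $c_2$, where $r.N = l.N$ and $r.Pr^{r} = \{(t_1, cr^{c_1}), (t_2, cr^{c_2})\}$; $cr^{c_1}$ and $cr^{c_2}$ are the multiplicities placed next to $c_1$ and $c_2$ respectively.

R5: each link $l \in L$ between $n$ classes $\{c_1, \dots, c_n\}$ ($n > 2$) is transformed into (1) a new table $t^{l}$, where $t^{l}.N = l.N$ and $t^{l}.A = \emptyset$, and (2) $n$ relationships $\{r_1, \dots, r_n\}$; $\forall i \in \{1..n\}$, $r_i$ links $t^{l}$ to another table $t_i$ representing a related class $c_i$, where $r_i.N = (t^{l}.N)\_(t_i.N)$ and $r_i.Pr^{r} = \{(t^{l}, null), (t_i, null)\}$.

R6: each association class $c_{asso}$ between $n$ classes $\{c_1, \dots, c_n\}$ ($n \geq 2$) is transformed like a link between multiple classes (R5), using (1) a new table $t^{ac}$, where $t^{ac}.N = l.N$, and (2) $n$ relationships $\{r_1, \dots, r_n\}$; $\forall i \in \{1..n\}$, $r_i$ links $t^{ac}$ to another table $t_i$ representing a related class $c_i$, where $r_i.N = (t^{ac}.N)\_(t_i.N)$ and $r_i.Pr^{r} = \{(t^{ac}, null), (t_i, null)\}$. Like any other table, $t^{ac}$ also contains a set of attributes $A$, where $t^{ac}.A = c_{asso}.A$.

We have formalized these transformation rules using QVT (Query/View/Transformation), which is the OMG standard for model transformation. An excerpt from the QVT rules is shown in figure …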
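Rules R1 to R6 can be read as a single pass over the class diagram. The sketch below is an illustration in Python over plain dictionaries, not the authors' QVT script: the dictionary keys (`"N"`, `"C"`, `"A"`, `"L"`, `"Pr"`, `"Ident"`) mirror the symbols of the formal definitions, and two conventions are assumptions made here: an association class is signalled by an `"A"` entry on the link, and the table created for a link has no identifier (`None`).

```python
def object2dolm(cd: dict) -> dict:
    """Map a class diagram (N, C, L) to a DOLM database (N, T, R)."""
    db = {"N": cd["N"], "T": [], "R": []}  # R1: DB.N = CD.N

    for c in cd["C"]:
        db["T"].append({
            "N": c["N"],          # R2: t.N = c.N
            "Ident": c["Ident"],  # R2: Ident^t.N = Ident^c.N
            "A": [{"N": a["N"], "Ty": a["C"]} for a in c["A"]],  # R3
        })

    for link in cd["L"]:
        ends = link["Pr"]            # list of (class name, multiplicity)
        assoc_attrs = link.get("A")  # assumption: only set for an association class

        if len(ends) == 2 and assoc_attrs is None:
            # R4: a binary link becomes a relationship keeping both multiplicities.
            db["R"].append({"N": link["N"], "Pr": [tuple(e) for e in ends]})
            continue

        # R5 (n > 2) and R6 (association class): a new table plus n relationships.
        db["T"].append({
            "N": link["N"],
            "Ident": None,  # assumption: the rules do not name an identifier
            "A": [{"N": a["N"], "Ty": a["C"]} for a in (assoc_attrs or [])],
        })
        for class_name, _multiplicity in ends:
            db["R"].append({
                "N": f'{link["N"]}_{class_name}',
                "Pr": [(link["N"], None), (class_name, None)],
            })
    return db


# Hypothetical input: three classes, one binary link and one ternary link.
cd = {
    "N": "MedicalProgram",
    "C": [
        {"N": "Patient", "Ident": "idPatient", "A": [{"N": "name", "C": "String"}]},
        {"N": "Doctor", "Ident": "idDoctor", "A": []},
        {"N": "Hospital", "Ident": "idHospital", "A": []},
    ],
    "L": [
        {"N": "Follows", "Ty": "Association",
         "Pr": [("Patient", "*"), ("Doctor", "1")]},
        {"N": "Consultation", "Ty": "Association",
         "Pr": [("Patient", None), ("Doctor", None), ("Hospital", None)]},
    ],
}
db = object2dolm(cd)
```

Since class names are reused as table names (R2), the relationships created by R5 and R6 can refer to the related tables directly by the class names found in the link ends.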

5 DOLM2DOPM transformation

In this section we present the DOLM2DOPM transformation, which generates (1) the physical model of the adopted document-oriented system and (2) the guidelines that assist developers in implementing the logical relationships and indicate the attributes that can be used.

5.1 Source: Document-Oriented Logical Model

The source of the DOLM2DOPM transformation is the target of the previous transformation, Object2DOLM. It is a document-oriented NoSQL logical model.

5.2 Target: Document-Oriented physical model

To illustrate our approach, we have chosen the MongoDB and CouchDB systems. For MongoDB, the corresponding mapping is available in [17].

CouchDB

A CouchDB database ($DB^{CH}$) does not have tables [16]. It contains a set of documents that are the containers of data. Thus, $DB^{CH}$ is defined as a tuple $(N, D)$, where:

$N$ is the database name,

$D$ is a set of documents. Each document has (1) an identifier that uniquely references it in the database and (2) a set of key-value pairs called properties. Each property is composed of a key that represents its name and a value that can be atomic or complex (composed of other properties). Formally, the schema of each document $d \in D$ is a tuple $(Id^{d}, PR)$, where:

• $d.Id^{d}$ is a unique identifier of $d$. It has a name $Id^{d}.N$ and a type $Id^{d}.Ty$,

• $d.PR = PR^{A} \cup PR^{CX}$ is a set of atomic and complex properties that will be used to fill $d$. The schema of an atomic property $pr^{a} \in PR^{A}$ is a pair $(Key, Ty)$ where $pr^{a}.Key$ is the property name and $pr^{a}.Ty$ is the property type. The schema of a complex property $pr^{cx} \in PR^{CX}$ is also a tuple $(Key, PR')$, where $pr^{cx}.Key$ is the property name and $pr^{cx}.PR'$ is a set of properties with $PR' \subset PR$.

CouchDB physical model

R1: the logical database $DB$ is transformed into a CouchDB database $DB^{CH}$, where $DB^{CH}.N = DB.N$.

Guidelines

R2: As mentioned before (Section 5.2), a CouchDB database contains a set of documents; the concept of collections, which would allow these documents to be classified, does not exist. Each row in a logical table will correspond to a document in CouchDB. This document needs to be explicitly associated with the corresponding table. For this, our process creates, for each document $d$ corresponding to a row in a logical table $t$, a complex property $pr_t^{cx}$, where:

• $pr_t^{cx}.Key$ is composed of (1) the table name and (2) a sequential number,

• $pr_t^{cx}.Value$ contains the property list of $d$.

For example, for a row in the logical table "Patient", we have the following CouchDB document:

{ Patient_1: {name: " ", profession: " ", …} }

Thus, all documents having the property key XXXX_i will be considered to belong to the logical table XXXX.

Formally, $\forall d \in DB^{CH}.D$, a document corresponding to a row in a logical table $t$, we create a complex property $pr_t^{cx}$ where:

• $pr_t^{cx}.Key = [t.N]\_i$; $i$ is a sequential number referring to a row in $t$ (example: Patient_1, Doctor_5 …),

• each attribute $a^{t} \in t.A$ is transformed into a property $pr^{a}$, where $pr^{a}.Key = a^{t}.N$ and $pr^{a}.Ty = a^{t}.Ty$, and then added to the property list of its container $pr_t^{cx}$, such that $pr^{a} \in pr_t^{cx}.Value$.
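Rule R2 amounts to wrapping the attributes of a row under a key built from the table name and a sequential number. The sketch below shows this in Python on plain dictionaries; the helper names `row_to_document` and `logical_table_of` are hypothetical, and the use of `_id` for the document identifier follows the usual CouchDB convention rather than anything stated in the rule itself.

```python
from typing import Optional


def row_to_document(table_name: str, seq: int, row: dict, doc_id: str) -> dict:
    """Build a CouchDB-style document for one row of a logical table (rule R2)."""
    return {
        "_id": doc_id,                        # document identifier
        f"{table_name}_{seq}": dict(row),     # complex property: Key = [t.N]_i
    }


def logical_table_of(document: dict) -> Optional[str]:
    """Recover the logical table from the XXXX_i property key, if any."""
    for key in document:
        if key.startswith("_"):
            continue  # skip identifier-like fields such as _id
        name, sep, seq = key.rpartition("_")
        if sep and seq.isdigit():
            return name
    return None


# Hypothetical row of the logical table "Patient".
doc = row_to_document("Patient", 1, {"name": " ", "profession": " "}, "p-001")
```

Because the table name is recovered from the last underscore only, table names that themselves contain underscores are still handled by `logical_table_of`.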

R3: In CouchDB, logical relationships can be converted using two forms: references and nested data. Thus, for each relationship $r$ between two tables $t_1$ and $t_2$, the following solutions may be considered:

Solution 1: $r$ is transformed into a property $pr^{ref}$ referencing one or more documents that correspond to rows in $t_2$, where $pr^{ref}.Key = (t_2.N)\_Ref$, and then added to the property list of the documents that contain the complex property $pr_{t_1}^{cx}$, where $pr_{t_1}^{cx}.Key = [t_1.N]\_i$, such that $pr^{ref} \in pr_{t_1}^{cx}.Value$.

Solution 2: $r$ is transformed into a property $pr^{ref}$ referencing one or more documents that correspond to rows in $t_1$, where $pr^{ref}.Key = (t_1.N)\_Ref$, and then added to the property list of the documents that contain the complex property $pr_{t_2}^{cx}$, where $pr_{t_2}^{cx}.Key = [t_2.N]\_i$, such that $pr^{ref} \in pr_{t_2}^{cx}.Value$.

Solution 3: $r$ is transformed by embedding one or more documents containing the property $pr_{t_2}^{cx}$, with $pr_{t_2}^{cx}.Key = [t_2.N]\_i$, in documents containing the property $pr_{t_1}^{cx}$, with $pr_{t_1}^{cx}.Key = [t_1.N]\_i$, where $pr_{t_2}^{cx} \in pr_{t_1}^{cx}.Value$.

Solution 4: $r$ is transformed by embedding one or more documents containing the property $pr_{t_1}^{cx}$, with $pr_{t_1}^{cx}.Key = [t_1.N]\_i$, in documents containing the property $pr_{t_2}^{cx}$, with $pr_{t_2}^{cx}.Key = [t_2.N]\_i$, where $pr_{t_1}^{cx} \in pr_{t_2}^{cx}.Value$.

The type of the reference property (monovalued or multivalued) used in solutions 1 and 2, as well as the number of documents (one or many) to be nested in solutions 3 and 4, depend on the relationship cardinalities.
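The four solutions differ only in the form used (reference or nesting) and in which side of the relationship holds the other. A minimal Python sketch on plain dictionaries follows; the function names `add_reference` and `embed`, the sample documents and the choice of storing document identifiers in the reference property are assumptions made for illustration.

```python
import copy


def add_reference(holder: dict, holder_key: str, target_table: str,
                  target_ids: list, multivalued: bool) -> dict:
    """Solutions 1 and 2: add a (t.N)_Ref property pointing at related documents."""
    doc = copy.deepcopy(holder)
    value = list(target_ids) if multivalued else target_ids[0]
    doc[holder_key][f"{target_table}_Ref"] = value
    return doc


def embed(holder: dict, holder_key: str, nested_docs: list) -> dict:
    """Solutions 3 and 4: nest the complex properties of related documents."""
    doc = copy.deepcopy(holder)
    for nested in nested_docs:
        for key, value in nested.items():
            if not key.startswith("_"):  # keep only the [t.N]_i properties
                doc[holder_key][key] = copy.deepcopy(value)
    return doc


# Hypothetical documents for t1 = Patient and t2 = Doctor.
patient = {"_id": "p1", "Patient_1": {"name": "A"}}
doctor = {"_id": "d5", "Doctor_5": {"speciality": "B"}}

s1 = add_reference(patient, "Patient_1", "Doctor", ["d5"], multivalued=False)
s2 = add_reference(doctor, "Doctor_5", "Patient", ["p1"], multivalued=True)
s3 = embed(patient, "Patient_1", [doctor])
s4 = embed(doctor, "Doctor_5", [patient])
```

The `multivalued` flag and the length of `nested_docs` are where the relationship cardinalities come in: a "many" side yields a list of references or several nested documents.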

In this section, we have proposed different solutions to transform the logical relationships under CouchDB. To choose the most suitable solution, the developer can rely on the performance measurements shown in Section "Experiments", where we have measured query response times using each of the proposed solutions. The developer will make his choice according to the features of the queries he needs to perform, the expected performance and how frequently the queries are used.

6 Experiments

In this section, we show how to transform a UML conceptual model into a document-oriented NoSQL physical model. As presented in the previous section, several solutions can ensure this transformation. We first detail how we implemented the Object2NoSQL process, and then we present the experiment we conducted to study the impact that the choice of solution used to map the logical relationships may have on the execution time of queries.

6.1 Implementation

We have implemented the Object2DOLM and DOLM2DOPM transformations using a set of tools provided by the Eclipse Modeling Framework (EMF). It is a model transformation environment that contains a set of plugins which can be used to create a model and generate other outputs based on this model. Each transformation is expressed as a sequence of elementary steps that builds the resulting model step by step from the source model:

Step 1: we create the source and the target metamodels using the metamodeling language Ecore.

Step 2: we build an instance of the source metamodel; for this, we use the standard-based XML Metadata Interchange (XMI) format.

Step 3: we implement the transformation rules by means of the Query / View / Transformation (QVT) language, the OMG standard language for specifying model transformations.

Step 4: we test the transformation by running the QVT script of step 3; this script takes as input the source model created in step 2 and returns the resulting model in the form of an XMI file.

Object2DOLM is the first step in the Object2NoSQL process. It transforms the input UML class diagram (figure …) into the proposed DOLM (figure …). The Object2DOLM transformation is performed by means of the mapping rules defined in Section 4.3. These rules have been formalized using the QVT language; an excerpt from the QVT script is shown in figure … The comments in the script indicate the rules used.

In the second step, the developer indicates the document-oriented system he wants to use (MongoDB for example) and chooses one of the relationship mapping solutions we propose; then the DOLM2DOPM transformation runs (figure …). Starting from the DOLM created by the previous transformation, Object2DOLM, DOLM2DOPM generates (1) the physical model of the selected system (figure …) and (2) the associated guidelines (figure …). We illustrate our experiment using MongoDB.

Note that, due to lack of space, we only present excerpts from the models and QVT scripts.

6.2 Evaluation

Depending on the functionalities of the document-oriented system selected by the developer, the logical relationships can be mapped into different forms. To assist the developer in choosing the most effective form, we performed an evaluation to study the impact of each mapping solution on query execution time.

We carried out the experimental assessment using a cluster made up of … machines. Each machine has the following specifications: Intel Core i…, … GB RAM and … TB disk.

We used data generator tools to generate a dataset of about … TB in JSON format. These files are loaded into the systems using shell commands. For the query set, we have written … queries that concern two tables and the relationship between them. These queries involve the two related tables and a gradual increase in the number of attributes to return. An excerpt from our experiment results is depicted in figure … For each query, we indicate the response time obtained according to (1) the relationship cardinalities and (2) the transformation solution used. Thus, the developer can make his choice using our experiment results. This choice will be based on the following criteria: (1) the features of the queries (number of filters, number of attributes to return) and (2) how frequently each query will be used. We note that, due to lack of space, only the results obtained under MongoDB are presented.
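The measurement idea, timing the same query against each relationship mapping, can be illustrated on a single machine. The following Python sketch is only an in-memory stand-in and assumes nothing about the cluster, dataset or queries used in the experiment: the synthetic data, the sizes and the helper `timed` are all hypothetical, and the timings it produces say nothing about MongoDB or CouchDB.

```python
import time


def timed(query, *args):
    """Run a query function and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = query(*args)
    return result, time.perf_counter() - start


# Synthetic stand-ins for two related tables (sizes chosen arbitrarily).
doctors = {f"d{i}": {"speciality": f"s{i % 7}"} for i in range(1000)}
referenced = [{"name": f"n{i}", "Doctor_Ref": f"d{i % 1000}"} for i in range(5000)]
embedded = [{"name": p["name"], "Doctor": doctors[p["Doctor_Ref"]]} for p in referenced]


def query_referenced(patients, doctor_index):
    # Reference form: each patient needs a second lookup to reach its doctor.
    return [(p["name"], doctor_index[p["Doctor_Ref"]]["speciality"]) for p in patients]


def query_embedded(patients):
    # Nested form: the doctor data is already inside the patient document.
    return [(p["name"], p["Doctor"]["speciality"]) for p in patients]


rows_ref, t_ref = timed(query_referenced, referenced, doctors)
rows_emb, t_emb = timed(query_embedded, embedded)
```

Both queries return the same rows, so any difference between `t_ref` and `t_emb` comes from the mapping form alone; in a real evaluation the two functions would be replaced by queries sent to the target system.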

7 Related work

Developers of Big Data applications have to deal with the question: how to store Big Data in NoSQL systems? To address this problem, existing solutions propose to model Big Data and then define mapping rules towards the physical level. In the specific context of a data warehouse, both [ ] and [ ] have defined a set of rules to transform a multidimensional model into a NoSQL model. Other studies, [ ] and [ ], have investigated the process of transforming relational databases into a NoSQL model. To the best of our knowledge, only a few works have presented approaches to implement a UML conceptual model in NoSQL systems. Li et al. [10] propose an MDA-based process to transform a UML class diagram into a column-oriented model specific to HBase. Starting from the UML class diagram and HBase metamodels, the authors have proposed mapping rules between the conceptual level and the physical one. Obviously, these rules are applicable to HBase only. Gwendal et al. [3] describe the mapping between a UML conceptual model and graph databases via an intermediate graph metamodel. In this work, the transformation rules are specific to graph databases used as a framework for managing complex data with many connections. Generally, this kind of NoSQL system is used in social networks, where data are highly connected. In [ ], data modeling in a MongoDB database is shown using a class diagram and the JSON format to represent the documents. Similarly, Banker [15] provides some data modeling tools, but they are limited to the MongoDB database and always refer to the JSON format as a modeling solution.

Regarding the state of the art, some of the existing works, [ ] and [ ], focus on the relational model which, unlike the UML class diagram, lacks semantic richness, especially regarding the several types of relationships that exist between classes. Other solutions, [ ] and [ ], have the advantage of starting from the conceptual level. But the proposed models are domain-specific (Data Warehouse systems), so they consider fact, dimension, and typically one type of link only. The approaches proposed in [10] and [3] are only applicable to column-oriented [10] and graph-oriented [3] data stores. [ ] and [ ] present a study of techniques and tools for data modeling using the MongoDB system; the proposed solutions are not generic, as they are restricted to the MongoDB document database. However, it makes more sense to generalize the transformation process in order to allow the user to choose the target document-oriented system that best suits the business rules and technical constraints.

8 Conclusion and future work

This paper provides an automatic MDA approach that generates NoSQL physical models starting from a UML conceptual model. Our approach relies on a pivot logical model that uses tables and binary relationships. This model exhibits a sufficient degree of platform independence, so that its mapping to one or more NoSQL document-oriented platforms is feasible. The advantage of using a unified logical model is that this model remains stable whenever the NoSQL systems evolve, or even if we decide to completely change the platform used. In these two cases, it is enough to evolve the physical (platform-dependent) models and, of course, adapt the transformation rules; this simplifies the transformation process and saves developers effort and time.

Our approach is based on two main steps. The first one automatically creates the logical model starting from a UML class diagram. In the second step, the developer chooses one of the relationship implementation solutions we propose, and the NoSQL physical models are generated starting from the logical model. Our approach assists the developer in choosing the implementation best suited to the project he is working on. We have measured query response times using each of the proposed transformation solutions. Thus, the developer can choose the most suitable solution according to (1) the features of the queries (number of filters, number of attributes to return) and (2) how frequently the queries are used.

We are currently working on automating the overall process. The choice of the relationship implementation solution will be made by the system itself, without requiring developer intervention. We also plan to complete and generalize our transformation process to consider the constraints defined in the conceptual model; once the NoSQL physical model is created, another process has to be performed to check these constraints.

References

[1] A. Abelló, "Big data design", in OLAP.

[2] V. Herrero, A. Abelló, O. Romero, "NOSQL design for analytical workloads: variability matters", in ER.

[3] D. Gwendal, S. Gerson, C. Jordi, "Mapping Conceptual Schemas to Graph Databases". In ER, 2016.

[4] J. Hutchinson, M. Rouncefield, and J. Whittle, "Model-driven engineering practices in industry", in ICSE.

[5] J. Bézivin and O. Gerbé, "Towards a Precise Definition of the OMG/MDA Framework", in ASE.

[6] "About the Unified Modeling Language Specification Version 2.5." [Online]. Available: http://www.omg.org/spec/UML/

[7] A. Angadi, Ak. Angadi, Karuna Gull, "Growth of New Databases & Analysis of NOSQL Datastores", in IJARCSSE.

[8] T. Vajk, P. Feher, K. Fekete, H. Charaf, "Denormalizing data into schema-free databases", in CogInfoCom.

[9] C. Li, "Transforming relational database into HBase: A case study", in ICSESS.

[10] Yan Li, Ping Gu, "Transforming UML Class Diagrams into HBase Based on Metamodel", in ISEEE.

[11] Dehdouh K., Bentayeb F., Boussaid O., Kabachi N., "Using the column oriented model for implementing big data warehouses", in PDPTA.

[12] M. Chevalier, M. El Malki, A. Kopliku, O. Teste, and R. Tournier, "How can we implement a Multidimensional Data Warehouse using NoSQL?", in ICAIS.

[13] R. Kumar, S. Charu, S. Bansal, "Effective Way to Handling Big Data Problems using NoSQL Database (MongoDB)". In JoADMS.

[14] R. Arora, R. Aggarwal, "Modeling and querying data in mongodb", in IJSER.

[15] K. Banker, "MongoDB in action", Manning Publications Co.

[16] "Apache CouchDB 2.1 Documentation" [Online]. Available: http://docs.couchdb.org/en/

[17] Abdelhedi F., Ait Brahim A., Atigui F., Zurfluh G., "MDA-Based Approach for NoSQL Databases Modelling", in DaWaK.
