This is an author’s version published in:
http://oatao.univ-toulouse.fr/24854
To cite this version: Ait Oubelli, Lynda and Aït Ameur, Yamine and Bedouet, Judicaël and Kervarc, Romain and Chausserie-Laprée, Benoît and Larzul, Béatrice. A scalable model based approach for data model evolution: Application to space missions data models. (2018) Computer Languages, Systems and Structures, 54. 358-385. ISSN 1477-8424
Official URL (DOI): https://doi.org/10.1016/j.cl.2018.08.001
A scalable model based approach for data model evolution: Application to space missions data models
Lynda Ait Oubelli a,b,∗, Yamine Aït Ameur a, Judicaël Bedouet b, Romain Kervarc b, Benoît Chausserie-Laprée c, Béatrice Larzul c
a IRIT/INP-ENSEEIHT, University of Toulouse, 2 Rue Charles Camichel, Toulouse, France
b ONERA/DTIS, University of Toulouse, Toulouse, France
c CNES-The French Space Agency, 18 Avenue Edouard Belin, Toulouse, France
Keywords: Data model comparison; Data model evolution; Data migration; Data conservation; Model driven engineering (MDE); Composite evolution operators
Abstract
During the development of a complex system, data models are the key to a successful engineering process, as they contain and organize all the information manipulated by the different functions involved in the design of the system. Moreover, these data models evolve throughout the design, as the development raises issues that have to be solved through a restructuring of the data organization. But any such data model evolution has a deep impact on the functions that have already been defined.
Recent research tries to deal with this issue by studying how complex industrial data models evolve from one version to another and how their data instances co-evolve. Complexity and scalability issues make this problem a major scientific challenge, whose resolution would lead to huge gains in development efficiency. This problem is of particular interest in the field of aeronautics and space systems. Indeed, the development of these systems produces many complex data models associated to the designed systems and/or to the systems under design; hence, on the one hand, data models are available. On the other hand, it is well known that these systems are developed in the context of collaborative projects that may last for decades. In such projects, specifications together with the associated data models are bound to evolve, and engineering processes shall take this evolution into account. Our work addresses the problem of data model evolution in a model-driven engineering setting. We focus on minimizing the impact of model evolution on the system development processes in the specific context of the space engineering area, where data models may involve thousands of concepts and relationships, and we investigate the performance of the model-based development (MBD) approach we propose for data model evolution over two space missions, namely PHARAO and MICROSCOPE.
∗ Corresponding author at: IRIT/INP-ENSEEIHT, University of Toulouse, 2 Rue Charles Camichel, Toulouse, France and ONERA/DTIS, University of Toulouse, Toulouse, France
E-mail addresses: lynda.aitoubelli@enseeiht.fr, lynda.ait-oubelli@onera.fr, aitoubellilynda@gmail.com (L.A. Oubelli), yamine@enseeiht.fr (Y. Aït Ameur), judicael.bedouet@onera.fr (J. Bedouet), romain.kervarc@onera.fr (R. Kervarc), benoit.chausserie-lapree@cnes.fr (B. Chausserie-Laprée), beatrice.larzul@cnes.fr (B. Larzul).
1. Introduction
The development of complex information systems requires the definition of models, on which developers and engineers can rely in order to ensure the quality of the designed systems. This is paramount when the system belongs to a critical field, such as aeronautics, space, health, power management, etc. The designed models are complex and they involve many concepts, modeling artifacts and views. Moreover, they may be subject to evolution, especially if they are developed in a long-term project involving a large collaboration, due to the changes that may occur during the process, either on the requirements and specifications or on the modeling techniques and tools. In a model-driven engineering (MDE)-based development process, models are at the centre of the design and development processes.
Our work focuses on MDE-based engineering processes of large-scale, long-term, critical, and largely documented systems. We focus our contribution on models produced in system engineering, in particular in space engineering and space missions.
1.1. Motivations
Research into how complex industrial data models evolve from one version to another and how their data migrates still stands as one of the most difficult scientific challenges to be addressed. This is due to existing solutions being unable to face the huge complexity of changes in terms of type and number. Interpreting the comparison results of two small data models may be easy, but interpreting the comparison results of two huge data models is still a hard task.
Moreover, the fact that space systems, representing the applicative context of our work, are designed through large and long projects makes such comparison capabilities particularly desirable, and actually necessary for efficient, cost-effective, and safe development.
1.2. Objectives
This paper extends the work presented in [1]. It addresses the following four objectives:
• being able to recognize many differences as a unique composite operator;
• ensuring evolution schema by finding structural and descriptive changes;
• ensuring full data conservation without any loss of information;
• proving the feasibility of the comparison between two complex industrial data models issued from the design of space systems.
1.3. Our approach
To fulfill these objectives, we rely on a stepwise process based on the layered MDE model of the Meta-Object Facility (MOF layers M0, M1, M2, M3). First, we investigate and improve several methods to compare data models (of M1 level) governed by structured data-oriented meta-models (of M2 level). Then, we process the identified differences. The obtained differences are transformed into evolution operators working at data model level (M1) and data level (M0) as follows:
• At the data model level, an evolution operator defines a data model transformation capturing a common evolution [2].
• At the data level, it defines a model transformation capturing the corresponding migration.
This paper presents the details of this stepwise approach and shows how it applies to real systems data models borrowed from space engineering.
1.4. Structure of the paper
The remainder of this paper is organized as follows. First, we present the applicative context of space engineering in Section 2. Then, Section 3 gives a global view of the proposed stepwise approach for handling data model evolution and data migration, and then presents in deeper detail the theoretical description of each of these steps in order to formalize data model evolution and data migration by introducing a set of composition operators. Section 4 then presents the deployment of the proposed approach in a model-driven engineering setting. The paper then proceeds to case studies: Section 5 presents a general overview of the problem of model evolution in the case of space missions in space engineering and describes the two addressed case studies. Sections 6 and 7 then present the deployment of the approach on two real case studies, namely the PHARAO and MICROSCOPE space missions. Finally, Section 8 summarizes previous related work for the different approaches being used for meta-model evolution and ensuing model co-evolution, while Section 9 provides conclusions with some future research directions identified in our work.
2. Context: space systems data models
Space agencies in Europe like CNES, the French space agency, have been involved for more than 20 years in data modeling and in the standardization of data modeling techniques, for which they have defined recommendations, referred to as the Consultative Committee for Space Data Systems (CCSDS) standard [3]. These recommendations are related to syntactic and semantic data description techniques, long-term data preservation, data producer, and archive interface. Furthermore, CNES, jointly with the European Space Agency (ESA), has also developed tools to support the recommended approaches and to support data engineering for various space projects. One of these tools is the BEST [4] workbench that is based upon EAST, the Enhanced Ada SubseT. It allows a non-ambiguous description of data formats including syntactic and semantic information. This tool is used in the framework of space projects by scientists and engineers. It allows them to easily describe their data formats and make them evolve, to quickly produce test data conforming to the specification format, and to access and interpret data without having to write specific code.
The formal description of space-related data is crucial and should be taken into account from the earliest stages of the mission, in particular when:
• the space system (one or more satellites, a ground control centre, a mission center, etc.) is designed;
• the satellite, once launched, starts to send telemetry and the ground segment starts to send commands;
• a large amount of data is produced, processed, transformed and sent to end-users during the life of the satellite.
All different stakeholders have to understand the data, be able to interpret it, and make use of it. Any misunderstanding might cause important delays in mission planning. Arguably, a complete and non-ambiguous definition of any kind of data produced is a key factor for meeting the deadlines of the project. This formal definition can be used to generate different pieces of code that will be used in the framework of the project (e.g. on-board software, simulation software, etc.). Code generation is highly valuable for a project, since some parts of the application can be updated in a fast and efficient way with a minimum development cost [5]. However, data (Data of version V1) generated with a particular release of the application (Data-Model of version V1) are not necessarily compatible with another release (Data-Model of version V2). Thus, starting the creation or the generation of data that conforms to (Data-Model of version V2) from scratch is a tedious operation.
To illustrate this situation, let us consider the following scenario issued from space data models provided by CNES. Depending on the final use of data, the same concept may be modeled in two ways. For example, inheritance is represented in the XIF (footnote 1) meta-model by composition, whereas the XTCE [3] meta-model defines conventional inheritance relationships. Such a difference in modeling induces a huge number of syntactic differences between an XIF data model and an XTCE data model, even if they represent the same concept.
This example illustrates the main purposes of our work:
1. We want to be able to recognize a semantic evolution (which is in fact the real difference between two models, and the cause of evolution) as a unique complex transformation, notwithstanding the fact that the associated difference will be hidden in a potentially very large number of syntactic differences (which are in fact apparent differences bearing no signification, and rather symptoms of the evolution).
2. Whilst we admit that ascendant compatibility may not always be kept, we want to see how data may be migrated without loss of information.
Let us go back to our example. For the first purpose, we want to recognize the differences due to inheritance as a unique operator, encompassing the associated syntactic differences, so that we can clearly identify other differences. This example is an extreme example of data model reconstruction, but we often encounter such cases when studying how data models evolve in the space industry. For the second purpose, modellers often need to reconstruct their data model to meet new requirements but do not want to completely break the former data model. Even without ascendant compatibility, which may not be possible if the original design does not allow the needed changes to be addressed properly, it would be possible to preserve the information contained in old data if the reconstruction is correctly identified.
Thus, the capability to handle the data model evolution and the data migration processes is a major issue in the area of space engineering. Providing evolution and migration techniques that ease these processes will considerably improve the quality and efficiency of system development.
3. Handling model evolution and data migration: our approach
3.1. Overview of our approach
In order to handle the model and data changes involved in the development and exploitation of complex systems in a critical application domain like space engineering, we need to design a rigorous protocol to control model evolution and data migrations. This protocol shall synthesize and capture the set of changes, which might be different from one version to another.
We offer here a novel stepwise model-based approach to automate the evolution of data models. This approach relies on different steps, each one manipulating models. Allowing the manipulation of models as first-class objects requires the capability to handle modeling languages and their semantics. For example, in an MDE setting, meta-models are set up to support such manipulation.
1 The underlying language used to formally define the data is XML, formatted according to a specific CNES standard named XIF. XIF is a joint implementation of two CCSDS standards (CCSDS 644.0-B-3 - Data Description Language EAST Specification and CCSDS 647.1-B-1 - Data Entity Dictionary Specification Language). When exchanging satellite databases including telemetry and telecommand definitions with satellite manufacturers or with other space partners, we may use another CCSDS standard: CCSDS 660.0-B-1 - XML Telemetric and Command Exchange (XTCE).
Fig. 1 gives an overview of the defined approach, in which the following three main steps have been identified:
• Step 1. Data models comparison. To begin the analysis process, the first step consists in identifying the structural and descriptive differences. Two input data models are compared and their differences are recorded. These differences (the comparison results on Fig. 1) correspond to a data model of differences.
• Step 2. Evolution operators flattening. According to the comparison results obtained by the previous step, a set of evolution operators is identified and modeled.
• Step 3. Data conservation. At this step, the application of the identified data evolution operators migrates the source data of the source data model to the target data corresponding to the target data model.
In order to control the evolution process and to guarantee the conservation of data values during the migration, all three steps of the previous approach, namely data models comparison, evolution operators flattening, and data conservation, are applied in a sequential process where the output of each step is used as input for the next one.
The approach described above can be applied to any data model and in any application domain, provided that the capability to manipulate, at structural and descriptive levels, these data models as first-class objects is offered. In particular, we have deployed this approach for the specific domain of space missions where data models are manipulated in an MDE setting through their meta-models.
The previous section presented an overview of the proposed approach while this section describes it in detail. Each step is defined and an illustrating example is given. The same shared example is fully detailed throughout the process so as to illustrate the approach on a simple case. A zoom on each step of Fig. 1 is realized and a detailed figure for each step is presented.
3.2. Step 1. Data models comparison
3.2.1. Description
The data model comparison step is depicted on Fig. 2. Two input data models conforming to the same modeling language are compared. As a result of this comparison, using specific comparison engines, comparison results are produced at the structural and descriptive levels.
According to Fig. 2, these results contain:
• matches, i.e. correspondences between source and target elements, as well as
• differences belonging to four classes:
- Add to assert that a data model concept has been added in the target data model,
- Delete to assert that a data model concept of the source data model has been removed in the target data model,
- Move to assert that a data model concept of the source data model has been moved to another concept in the target data model,
- Change to assert that a data model concept of the source data model has been modified in the target data model.
3.2.2. Example
Let us consider two object-oriented data models:
• a source model SM containing a class C with attributes a, b, c, d and
• a target model TM containing a class C with integer attributes a and b and a class C′ with attributes c and d, where C′ inherits from C.
Then, the comparison engines produce the following differences:
• adding: a class C′ is added to the target model, using the Add difference;
• moving: attributes c and d are moved to class C′, using the Move difference.
Observe that, depending on the difference engines used, other acceptable classes of differences could have been identified. For example, one could have identified Delete followed by Add differences for attributes c and d.
Fig. 2. Step 1. Data models comparison.
Fig. 3. Step 2. Evolution operators flattening.
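The comparison of SM and TM can be sketched as follows. This is a minimal illustration and not the comparison engine used in the paper: the dictionary encoding of a model, the `compare` function and the tuple format of a difference are all assumptions made for the example, inheritance is not represented, and attribute names are assumed to be unique within a model.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: class name -> set of attribute names.
Model = Dict[str, Set[str]]
Difference = Tuple[str, str, str]  # (kind, element, detail)


def compare(source: Model, target: Model) -> List[Difference]:
    """Return Add/Delete/Move differences between two toy class models."""
    diffs: List[Difference] = []
    # Classes that exist on one side only.
    for cls in sorted(target.keys() - source.keys()):
        diffs.append(("Add", cls, "class"))
    for cls in sorted(source.keys() - target.keys()):
        diffs.append(("Delete", cls, "class"))
    # Map each attribute to its owning class on both sides.
    src_owner = {a: c for c, attrs in source.items() for a in attrs}
    tgt_owner = {a: c for c, attrs in target.items() for a in attrs}
    for attr in sorted(src_owner.keys() | tgt_owner.keys()):
        if attr not in tgt_owner:
            diffs.append(("Delete", attr, src_owner[attr]))
        elif attr not in src_owner:
            diffs.append(("Add", attr, tgt_owner[attr]))
        elif src_owner[attr] != tgt_owner[attr]:
            # Same attribute, different owner: reported as a Move.
            diffs.append(("Move", attr, f"{src_owner[attr]} -> {tgt_owner[attr]}"))
    return diffs


SM: Model = {"C": {"a", "b", "c", "d"}}
TM: Model = {"C": {"a", "b"}, "C'": {"c", "d"}}
differences = compare(SM, TM)
```

An engine that does not track attribute identity across classes would report the same evolution as Delete followed by Add for c and d, which is the alternative mentioned in the text.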
3.3. Step 2. Evolution operators flattening
3.3.1. Description
As mentioned, the differences produced in Step 1 are structural and descriptive; they can be considered as low-level differences or syntactical differences. Therefore, their interpretation from a semantic point of view is not an easy task. In order to overcome this difficulty, we propose a set of operators allowing a designer to derive the target data model as the result of successive applications of evolution operators (see Fig. 3).
The results obtained at the previous comparison step are interpreted, using a transformation engine (see Fig. 3), as atomic evolution operators. These operators are independent of the comparison process: they are defined according to the modeling language level and are able to manipulate model concepts. These atomic operators are composed to define a set of composite ones allowing to move from a source data model to a target data model. In other words, when applied to the source data model, the sequence of composite operator applications allows deriving (i.e. producing) the target model. These operators are defined by the designer according to his expertise.
After applying the sequence of composite operators, the obtained target model is identical to the one obtained after evolution. Proceeding this way provides us with a constructive approach for model evolution.
Fig. 4. Step 3. Co-evolution process.
3.3.2. Example
Let us come back to the previous example. After analyzing the obtained differences, three atomic operators are identified:
• Add_Class to define a new class. It is applied to identify class C′;
• Add_Class_Is_a to define an inheritance relationship between two classes;
• Move_Att to move an attribute from class C to class C′.
Thus, it is possible to define the composite operator
Pushdown = Add_Class; Add_Class_Is_a; Move_Att; Move_Att
which can be generalized to
Pushdown = Add_Class; Move_Att∗
where the ∗ means iterative composition. At this level, it is possible to describe the schema evolution as:
Move_Att(Move_Att(Add_Is_a(Add_Class(C′), C), c), d)
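The composition of atomic operators into Pushdown can be sketched as below. This is only an illustration under stated assumptions: the dictionary encoding of a model, the function names `add_class`, `add_class_is_a`, `move_att`, `compose` and `pushdown`, and their signatures are hypothetical and do not come from the paper's tooling. The sketch follows the first form of the composite, with the inheritance step included.

```python
import copy
from typing import Callable, List

# Hypothetical encoding: class name -> {"attrs": [...], "parent": name or None}.
Operator = Callable[[dict], dict]


def add_class(name: str) -> Operator:
    def op(model: dict) -> dict:
        new = copy.deepcopy(model)
        new[name] = {"attrs": [], "parent": None}
        return new
    return op


def add_class_is_a(child: str, parent: str) -> Operator:
    def op(model: dict) -> dict:
        new = copy.deepcopy(model)
        new[child]["parent"] = parent
        return new
    return op


def move_att(attr: str, src: str, dst: str) -> Operator:
    def op(model: dict) -> dict:
        new = copy.deepcopy(model)
        new[src]["attrs"].remove(attr)
        new[dst]["attrs"].append(attr)
        return new
    return op


def compose(*ops: Operator) -> Operator:
    """Sequential composition: apply each operator to the previous result."""
    def op(model: dict) -> dict:
        for step in ops:
            model = step(model)
        return model
    return op


def pushdown(parent: str, child: str, attrs: List[str]) -> Operator:
    # Add_Class; Add_Class_Is_a; Move_Att repeated once per attribute.
    return compose(
        add_class(child),
        add_class_is_a(child, parent),
        *(move_att(a, parent, child) for a in attrs),
    )


SM = {"C": {"attrs": ["a", "b", "c", "d"], "parent": None}}
TM = pushdown("C", "C'", ["c", "d"])(SM)
```

Each operator returns a new model rather than mutating its input, so the source model remains available for the later checking step.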
3.4. Step 3. Data conservation
3.4.1. Description
After handling the data model evolution in the previous step, the last step of our approach proceeds with the data migration, i.e. the migration of the data characterized by the defined data models. Indeed, the data available as values (or instances) in the source model are migrated as data values (or instances) of the target model. At this level we define co-evolution (or migration) operators as depicted on Fig. 4.
3.4.2. Example
Again, with the same example, let us consider an instance of class C defined as Oid_int_C = (1, 2, 3, 4), where the attributes a, b, c and d evaluate to 1, 2, 3 and 4 respectively. Then, the Change_Instance_Type co-evolution operator is applied on this instance as Oid_int_C′ = Change_Instance_Type(C′, Oid_int_C), valued as Oid_int_C′ = (1, 2, 3, 4). In the same manner, other co-evolution operators can be defined. For example, another co-evolution operation may produce the following instances: Oid_int_C = (1, 2) and Oid_int_C′ = (Oid_int_C, 3, 4).
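The two migrations of this example can be sketched as follows. The instance encoding as a dictionary and the helper `split_instance` are assumptions introduced for illustration; only the name Change_Instance_Type comes from the text, and its signature here is a guess.

```python
from typing import Dict, Iterable, Tuple


def change_instance_type(target_class: str, instance: Dict) -> Dict:
    """Re-type an instance while keeping every attribute value."""
    return {"type": target_class, "values": dict(instance["values"])}


def split_instance(instance: Dict, kept: Iterable[str],
                   target_class: str) -> Tuple[Dict, Dict]:
    """Hypothetical alternative: keep some values in an instance of the
    original class and let the new instance reference it."""
    kept = set(kept)
    values = instance["values"]
    base = {
        "type": instance["type"],
        "values": {k: v for k, v in values.items() if k in kept},
    }
    derived = {
        "type": target_class,
        "base": base,
        "values": {k: v for k, v in values.items() if k not in kept},
    }
    return base, derived


oid_int_c = {"type": "C", "values": {"a": 1, "b": 2, "c": 3, "d": 4}}

# First migration: Oid_int_C' = Change_Instance_Type(C', Oid_int_C)
oid_int_c_prime = change_instance_type("C'", oid_int_c)

# Second migration: Oid_int_C = (1, 2) and Oid_int_C' = (Oid_int_C, 3, 4)
base, derived = split_instance(oid_int_c, ["a", "b"], "C'")
```

In both variants the four source values remain reachable from the migrated instances, which is what data conservation asks for in this example.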
3.5. Model evolution and data migration
3.5.1. Checking model evolution and data migration
Once the model evolution and the data migration operators are applied, they lead to the synthesis of a new target data model (named Synthesized data model on Fig. 1) and a set of target data (named target data on Fig. 1). At this point, it becomes possible to check whether the evolution is sound or not. This checking action is performed using a comparison (structural and descriptive comparisons) of both target and synthesized target data models.
At this stage, there is an alternative: either the checking succeeds and the evolution/co-evolution is accepted and validated, in which case e.g. a new version or release can be issued, or the checking fails, in which case the evolution process must be performed again with a new sequence of atomic/composite operator applications. The developer is in charge of defining a new operator sequence according to a chosen strategy. The choice of such a strategy is out of the scope of this paper, but some directions for future work are provided in Section 9.
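The checking action can be pictured with the sketch below. It is an illustration only: plain equality stands in for the structural and descriptive comparison, and iterating over a list of candidate strategies is an assumption of the example, since the paper leaves the choice of a new sequence to the developer. All names are hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical encoding: class name -> tuple of attribute names.
Model = Dict[str, Tuple[str, ...]]
Operator = Callable[[Model], Model]


def synthesize(source: Model, strategy: Sequence[Operator]) -> Model:
    """Apply a sequence of operators to obtain the synthesized model."""
    model = source
    for op in strategy:
        model = op(model)
    return model


def first_sound_strategy(source: Model, target: Model,
                         strategies: Sequence[Sequence[Operator]]) -> Optional[int]:
    """Return the index of the first strategy whose synthesized model
    equals the target model, or None if every check fails."""
    for index, strategy in enumerate(strategies):
        if synthesize(source, strategy) == target:
            return index
    return None


source: Model = {"C": ("a", "b", "c", "d")}
target: Model = {"C": ("a", "b"), "C'": ("c", "d")}


def drop_c_d(model: Model) -> Model:
    return {**model, "C": ("a", "b")}


def add_c_prime(model: Model) -> Model:
    return {**model, "C'": ("c", "d")}


# The first strategy is incomplete and fails the check; the second succeeds.
strategies = [[drop_c_d], [drop_c_d, add_c_prime]]
chosen = first_sound_strategy(source, target, strategies)
```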
3.5.2. About the model evolution and data migration process
The proposed approach can be deployed in different situations like database systems or model-driven engineering settings. However, the definition of the atomic and composite evolution and/or migration (co-evolution) operators needs particular attention.
• The way in which the changes are recorded by the comparison engines may have an impact on the choice of the atomic operators to be applied to define the composite evolution operators.
• The definition of the composite operators in the data model evolution may require additional semantics expressed through preconditions and post-conditions.
• The definition of a conservative or non-conservative data migration (co-evolution) at the third step: depending on the defined operator, the designer may define migration operators that preserve the source data completely or partially.
The next section shows how our approach is deployed in the case of model-driven engineering techniques supported by the Eclipse Modeling Framework (EMF).
4. General overview of the methodological framework for evolution and migration
The previous section presented a general framework based on three steps to handle evolution of data models and migration of data (instances). This framework is generic and each step has been exemplified using a simple toy example. We claim that the use of this framework in a broad variety of modeling situations is possible.
In order to validate this claim, this section discusses the deployment of the stepwise approach presented in Section 3 to the case where models are described within a data modeling language conforming to the model engineering vision based on the Meta-Object Facility frameworks (see Fig. 5). Thus, modeling languages are characterized by meta-models allowing a designer to manipulate data models and their instances as first-class objects. The three identified steps are defined in this setting.
4.1. Our approach in a MDE setting: architecture
The use of a model-driven engineering approach to handle the different steps of the developed method presented in Section 3 requires defining these steps at the different modeling levels of MOF. Fig. 5 describes the global architecture of our approach. It gives a clear positioning of the different steps on top of the MOF architecture. In the remainder, the considered data models conform to UML class diagrams.
4.1.1. Data preparation
Data preparation is a pre-processing phase. It consists in aligning the different source and target data model representations. In other words, it consists in format alignments. The data models together with their instances are transformed into models at the M0 and M1 levels of the EMF (Eclipse Modeling Framework) architecture. This data preparation phase corresponds to the dashed boxes of Fig. 5.
4.1.2. Three steps
Once all the manipulated data elements are described in the unique modeling setting offered by EMF, it becomes possible to position the three steps of the proposed approach in this framework architecture.
• Step 1. The first step, noted Compare on Fig. 5, consists in using an external tool to perform the comparison between the source and target data models. We have used the EMF Compare tool [6] for this purpose.
• Step 2. The next step consists in applying the atomic and composite evolution operators. Thanks to the capability offered by the MDE approach, these operators are modeled using a meta-model of operators. Each operator is characterized by a set of properties defining its application conditions and how it applies.
• Step 3. Similarly to the previous step, the third step consists in applying co-evolution or migration operators also defined using a meta-model of operators.
Fig. 5. Technical description of the proposed approach in a MDE setting based on the MOF architecture.
Fig. 6. Extract of class diagram meta model.
4.2. Our approach in a MDE setting: deployment with EMF
Let us give the details of each step. To illustrate how the approach works in a MDE setting, we use examples relying on different models and meta-models conforming to the MOF architecture using the EMF tool suite. These models are deliberately reduced to keep the paper understandable. Complex case studies, based on a MDE approach, corresponding to real applications in the area of space missions engineering are presented below in Section 5.
4.2.1. Step1
Fig. 6 describes an extract (simplified subset) of the meta-model for UML class diagrams. The models we manipulate (source and target) conform to this meta-model.
Fig. 7. A structural model evolution: from inheritance to composition UML relationship.
Table 1
Matches results obtained on the data models of Fig. 7.
| Source | Target | Comment |
|---|---|---|
| Left = Source Data Model | Right = Target Data Model | Compared models |
| Left A type = Class | Right A type = Class | Class A matching |
| Left B type = Class | Right B type = Class | Class B matching |
| Left Inheritance type = Relation | Right Composition type = Relation | Matching at the relation level |
| Left Firstname type = Attribute | Right Firstname type = Attribute | Attribute Firstname matching |
| Left Familyname type = Attribute | Right Familyname type = Attribute | Attribute Familyname matching |
This meta-model supports the definition of data models of classes with attributes. Two kinds of relationships between classes are available: simple inheritance and class composition. Naturally, this meta-model contains many other important concepts, which are not displayed here as they are not used in our example.
Two data models conforming to the previous meta-model (of Fig. 6) are defined on Fig. 7, both describing two classes A and B. In the source data-model (left-hand side of Fig. 7), B inherits from A. In the target data-model (right-hand side of Fig. 7), a composition relationship replaces this inheritance. This latter introduction is part of the side syntactic differences that are to be amalgamated into a unique operator describing the transformation.
To identify the differences between these two models, we invoke an external tool: EMF Compare [6]. This tool considers both data models as graphs. It applies a top-down matching algorithm starting from the ancestor classes to produce the matches. Each matching between concepts is composed of zero or many sub-matches. The tool compares elements that are instances of the same meta-class and uses a similarity-based algorithm to measure the weights of each feature and decide whether two elements match or not. The user may customize the tool by changing the matching policy, e.g. may manually enforce a matching between model concept identifiers.
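The idea of a weighted, similarity-based match restricted to elements of the same meta-class can be sketched as follows. This is not EMF Compare's algorithm or API: the feature weights, the threshold, the element encoding and the ownership of the attributes by class A are all hypothetical values chosen for the illustration, and the sketch does not prevent two source elements from matching the same target element.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

Element = Dict[str, str]

WEIGHTS = {"name": 0.7, "owner": 0.3}  # hypothetical feature weights
THRESHOLD = 0.6                        # hypothetical acceptance threshold


def similarity(left: Element, right: Element) -> float:
    """Weighted string similarity; elements of different meta-classes never match."""
    if left["metaclass"] != right["metaclass"]:
        return 0.0
    score = 0.0
    for feature, weight in WEIGHTS.items():
        score += weight * SequenceMatcher(None, left[feature], right[feature]).ratio()
    return score


def match(source: List[Element], target: List[Element]) -> List[Tuple[str, str]]:
    """Pair each source element with its most similar target element."""
    matches = []
    for left in source:
        best = max(target, key=lambda right: similarity(left, right), default=None)
        if best is not None and similarity(left, best) >= THRESHOLD:
            matches.append((left["name"], best["name"]))
    return matches


elements = [
    {"metaclass": "Class", "name": "A", "owner": ""},
    {"metaclass": "Class", "name": "B", "owner": ""},
    {"metaclass": "Attribute", "name": "Firstname", "owner": "A"},
    {"metaclass": "Attribute", "name": "Familyname", "owner": "A"},
]
matches = match(elements, [dict(e) for e in elements])
```

Manually enforcing a match, as the text mentions, would amount to bypassing `similarity` for a chosen pair of identifiers.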
When applied to the example of Fig. 7, the following matches and differences are produced. They are described within an internal format that can be processed.
Matches
The first category of results produced by the EMF Compare tool are matches. The identified correspondences between concepts of source and target data models are explicitly stated. Regarding the data models described on Fig. 7, the matches of Table 1 are obtained.
The tool identifies the matches relatively to the meta-model. Here, with the meta-model described on Fig. 6, three kinds of matchings are identified: classes (A and B), attributes (Familyname and Firstname) and relations (a relation between classes A and B on both sides). Observe that the identified relation matching holds at the superclass Relation of the meta-model, whereas a difference at the level of inheritance and composition is identified below.
Differences
The second category of results produced by the EMF Compare tool are differences. They are categorized on the basis of the four differences Add, Delete, Change and Move. On our example, the obtained differences are given in Table 2.
It appears that three changes are identified. They correspond to two changes for the inheritance relationship (a first one on the superclass and a second one on the subclass) and another for the composition relationship. These changes are followed by a deletion of the inheritance relationship followed by an addition of a composition relationship.
4.2.2. Step2
As shown on Fig. 3, the matches and/or differences give rise to a set of atomic operators. These operators are independent of the comparison process. Their definition shall conform to the used meta-model.
Table 2
Differences results obtained on the data models of Fig. 7.
| Category of difference | Source localization of the difference | Comment |
|---|---|---|
| referenceChangeSpec: Kind = CHANGE | reference = Inheritance.SimpleClass, value = Class B | Change in the inheritance relation of the subclass B |
| referenceChangeSpec: Kind = CHANGE | reference = Inheritance.SuperClass, value = Class A | Change in the inheritance relation of the superclass A |
| referenceChangeSpec: Kind = DELETE | reference = DataModel.relations, value = Inheritance | Deletion of the inheritance relation |
| referenceChangeSpec: Kind = CHANGE | reference = Composition.Source, value = Class A | Change in the relation of class A |
| referenceChangeSpec: Kind = CHANGE | reference = Composition.Target, value = Class B | Change in the relation of class B |
| referenceChangeSpec: Kind = ADD | reference = DataModel.relations, value = Composition.ComposedOf | Add of a composition relation in the source class A |
Table 3
Atomic operators obtained from the changes of Tables 1 and 2.
| Atomic operator | Changes used to define the operator |
|---|---|
| DELETE_Inheritance_Relationship | ReferenceChangeSpec: Kind = CHANGE, reference = Inheritance.SimpleClass, value = Class B |
| | ReferenceChangeSpec: Kind = CHANGE, reference = Inheritance.SuperClass, value = Class A |
| | ReferenceChangeSpec: Kind = DELETE, reference = DataModel.relations, value = Inheritance |
| ADD_Composition_Relationship | ReferenceChangeSpec: Kind = CHANGE, reference = Composition.Source, value = Class B |
| | ReferenceChangeSpec: Kind = CHANGE, reference = Composition.Target, value = Class A |
| | ReferenceChangeSpec: Kind = ADD, reference = DataModel.relations, value = Composition.ComposedOf |
In the general case, each identified difference leads to one atomic operator. For example, Add_Attribute_to_Class or Add_Class_to_Model are two atomic operators directly mined from the set of changes. Moreover, other atomic operators can be obtained by cascade, i.e. by combining atomic operators at the structural level. For example, the cascading difference Add_Attribute_to_Class followed by Add_Class_to_Model defines another atomic operator Add_Class_to_Model, since the addition of a class entails the addition of all the attributes of this class.
Pursuing with our example, the analysis of Tables 1 and 2 leads to the definition of the atomic operators of Table 3. Two atomic operators are identified by cascade: addition of a composition relationship and deletion of an inheritance relationship. Thus, seven differences at the syntactic level are encapsulated into two atomic operators.
The composite operators are defined by the designer as compositions of atomic operators and/or of other previously defined composite ones. In our example, one composite operator From_Inheritance_to_Composition_Relationship is defined as the composition of deletion of inheritance and addition of composition relationships. We obtain:
From_Inheritance_to_Composition_Relationship = DELETE_Inheritance_Relationship ; ADD_Composition_Relationship
The application of this operator produces the synthesized data model that is identical to the data model of the right-hand side of Fig. 7.
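This composite operator can be sketched as a sequential composition of the two atomic operators. The encoding of a model as a set of relation triples, the helper `sequence` and the orientation of the composition (A as source, B as target, following Table 2) are assumptions of this illustration.

```python
from typing import Callable, FrozenSet, Tuple

Relation = Tuple[str, str, str]   # (kind, source class, target class)
Model = FrozenSet[Relation]
Operator = Callable[[Model], Model]


def delete_inheritance_relationship(subclass: str, superclass: str) -> Operator:
    return lambda model: model - {("Inheritance", subclass, superclass)}


def add_composition_relationship(source: str, target: str) -> Operator:
    return lambda model: model | {("Composition", source, target)}


def sequence(*ops: Operator) -> Operator:
    """The ';' of the text: apply the operators from left to right."""
    def composite(model: Model) -> Model:
        for op in ops:
            model = op(model)
        return model
    return composite


from_inheritance_to_composition_relationship = sequence(
    delete_inheritance_relationship("B", "A"),
    add_composition_relationship("A", "B"),
)

source_model: Model = frozenset({("Inheritance", "B", "A")})
synthesized = from_inheritance_to_composition_relationship(source_model)
```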
4.2.3. Remarks
• As mentioned previously, a checking action is required to check whether the synthesized target data model is identical to the target data-model. If this checking fails, another strategy of composite/atomic operator applications shall be defined and applied.
• Different strategies for the definition of composite operators can be set up by the designer. These strategies correspond to the sequential composition order of the atomic operators.
• The atomic and composite operators do not apply in every situation. Pre-conditions and post-conditions are associated to each operator in order to define sound operations. The application of each operator is possible only if the associated pre-condition holds.
• The number of built sound composite operators impacts the efficiency of the definition of the data-model evolution. Indeed, the availability of several composite operators allows the designer to define more strategies of composition of composite/atomic operators. A library of composite operators and application strategies can be set up.
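The role of pre-conditions and post-conditions in the remarks above can be sketched as follows. The `Operator` dataclass, its field names and the choice of raising an error when a condition fails are assumptions of this example and are not taken from the paper's meta-model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set

# Hypothetical encoding: class name -> set of attribute names.
Model = Dict[str, Set[str]]


@dataclass(frozen=True)
class Operator:
    name: str
    precondition: Callable[[Model], bool]
    action: Callable[[Model], Model]
    postcondition: Callable[[Model], bool]

    def apply(self, model: Model) -> Model:
        # The operator is applicable only if its pre-condition holds.
        if not self.precondition(model):
            raise ValueError(f"{self.name}: pre-condition does not hold")
        result = self.action(model)
        if not self.postcondition(result):
            raise ValueError(f"{self.name}: post-condition does not hold")
        return result


def add_class(name: str) -> Operator:
    return Operator(
        name=f"Add_Class({name})",
        precondition=lambda m: name not in m,   # the class must not exist yet
        action=lambda m: {**m, name: set()},
        postcondition=lambda m: name in m,      # the class exists afterwards
    )


model = add_class("C'").apply({"C": {"a", "b"}})

# Applying the same operator twice violates its pre-condition.
try:
    add_class("C'").apply(model)
    rejected = False
except ValueError:
    rejected = True
```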
Fig. 8. Evolution meta-model based on atomic and composite operators.
4.2.4. Ameta-modelforoperators
Allidentifiedevolutionoperatorsconformtoameta-modelofoperatorsthatreferstothedatameta-model.Fig.8depicts anextractofthismeta-model,whichisalsosimplifiedforspacereasons.
Itdescribesanevolutionmeta-modelasasetofoperators.Operatorsmaybeeitheratomicorcompositeones.Each com-positeoperatoraredescribedbyasequencelist oforderedoperators(at least2operators)correspondingtotheapplication strategythat itcarries out.
It is worth observing that the Operator class is equipped with the relevant attributes:
• Precondition and Postcondition are Boolean expressions describing the required and effect conditions when the operator is executed. These expressions manipulate concepts like classes or attributes borrowed from the data meta-model of Fig. 7.
• WithImpact is a Boolean expression indicating whether or not the operator impacts the instances of the considered data model. Indeed, some evolutions may be about data-model elements for which no instance can exist.
• DataConservative is a Boolean expression asserting whether the evolution is without loss of information. Among the operators that have an impact on data, some may not be data-conservative, i.e. their instance migration requires additional pieces of information that are not contained in the original instances.
Two important operations are also available, which are fundamental in our approach:
• When executed, the Apply operation computes the evolution action associated to the Operator. It is called each time it occurs in a sequence of composition operators. It is only defined for operators where WithImpact is true.
• In the same manner, the Migrate operation proceeds with data migration associated to the Operator. It is discussed in the next Step 3.
4.2.5. Step 3
Once the target data model is obtained by application of the evolution operators, the last step consists in applying the migration operators in charge of the co-evolution, i.e. building instances of the target data model.
In our case, these instances conform to the meta-model of data, i.e. the meta-model of instances. An extract of this meta-model is presented on Fig. 9. In its upper part, this figure represents the concepts of the meta-model for data models of Fig. 8.
The data class of the meta-model of Fig. 9 describes the instances of a class it is linked to through the InstanceOf link. Each instance is composed of a set of attribute values.
Fig. 9. An extract of the meta-model of data and its connection with the data models meta-model.
Fig. 10. Example of data values conservation after data model structural evolution.
If we come back to the meta-model of operators depicted on Fig. 8, we observe that the Operator is equipped with a Migrate operation in charge of producing instances of type data. For each composite and/or atomic operator introduced in Step 2, a Migrate operation is defined by the designer. This operation describes the impact of the evolution on the instances and data. When defining this operation, two situations may arise.
• The migration is data-conservative. In this case, the data structure has changed with no loss of information.
• The migration is non data-conservative. In this case, the data structure may be changed by the evolution operators and loss of information may occur.
Back to our example, Fig. 10 illustrates the first case with a conservation of data. The instance value John is conserved unchanged while Alice and Bob are conserved with a change in the data structure, since the evolution moved from an inheritance relationship to a composition relationship.
At this level, the whole approach showing a deployment in a Model Engineering setting is completed. It is composed of the three steps mentioned above.
4.3. Our approach in an MDE setting: summary
The stepwise approach depicted in Fig. 1 summarizes the previous steps in the particular case where MDE techniques are set up. Fig. 11 shows a graphical representation of the obtained stepwise process for this case.
The process begins with source and target data models that conform to a given meta-model. Then Step 1 runs the comparison engine (here the EMF Compare tool) between the source and target data models. A set of so-called low-level differences is produced. These differences represent the structural and descriptive ones identified thanks to the availability of the meta-model.
Fig. 11. Theoretical description of the proposed approach.
Next, Step 2 deals with the definition of the model evolution operators. The low-level differences are transformed into a set of atomic operators (A1...An for our example) conforming to a meta-model of operators. This meta-model is enriched with specific characteristics like preconditions or post-conditions associated to the definitions of the operators. In the same manner, composite operators are defined as well. For example, the two composite operators C1 = A1;A2 and C2 = A1;A2;A3;C1 can be defined. Finally, the synthesized target data model is obtained after the application of a sequence of atomic and/or composite operators.
In the final Step 3, the migration operations associated to each operator are applied. Each operation application is followed by a checking activity which validates the results. If the check fails, the step needs to be performed again with the intervention of the developer, until the check succeeds or the designer aborts the process. The process is thus iterated, and another sequence of operator applications shall be defined.
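To make the composite definitions C1 = A1;A2 and C2 = A1;A2;A3;C1 concrete, the following Python sketch expands a composite operator into the sequence of atomic operators it stands for. The dictionary encoding and the flatten function are illustrative assumptions, not part of the operator meta-model. The sketch also assumes that definitions are acyclic.

```python
def flatten(name, definitions):
    """Expand an operator name into its ordered list of atomic operator names."""
    if name not in definitions:
        # Not defined as a composite: treated as an atomic operator.
        return [name]
    flat = []
    for step in definitions[name]:
        flat.extend(flatten(step, definitions))
    return flat

# Composite operators of the summary above, written as ordered sequences.
definitions = {
    "C1": ["A1", "A2"],
    "C2": ["A1", "A2", "A3", "C1"],
}

print(flatten("C1", definitions))  # ['A1', 'A2']
print(flatten("C2", definitions))  # ['A1', 'A2', 'A3', 'A1', 'A2']
```

The second result shows that a composite operator abstracts a longer sequence of atomic applications, which is the factorization discussed in the scalability argument.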
4.4. About scalability
The approach we have set up relies on the definition of atomic and composite operators mined from the differences produced by the comparison engine used. Then an iterative process is set up, which requires checking with respect to a given meta-model. This approach is scalable in the sense that it does not perform any exhaustive search once the comparison engine has run. Indeed, only constructive applications of atomic or composite evolution operators are applied. These operators apply on localized differences and do not need traversals of the meta-model (more precisely, of the graph associated to the meta-model). Moreover, the application of a composite operator abstracts several atomic and/or composite operator applications. Indeed, it factorizes them and thus reduces execution time. Finally, the next section shows how the proposed approach has been deployed on data models with thousands of elements.
5. Case studies
In this section, we show how our approach applies to particular evolutions of data models used in the PHARAO [7] and MICROSCOPE [8] space missions supplied by the French space agency (CNES) and the French Aerospace Lab (ONERA).
1. Mid-2018, the PHARAO instrument will become the first cold-atom clock ever to orbit earth, operating outside the International Space Station [7]. This space mission has been designed to take timekeeping to new levels of accuracy. On this use case, we deploy the first phase of the proposed approach, i.e. the comparison process, on some data models of this project in order to identify successive changes. We also aim in this study at showing that logical results may be obtained in a short period of time: indeed, the current tool used to compare these data models, Aladin [9], based on the X-Diff algorithm [10], is unable to produce results in a reasonable time (several hours of computation are needed for some revisions).
2. Mid-2016, the MICROSCOPE satellite was launched. This mission aims at testing the universality of free fall for the first time in space. The space environment allowed designing an experiment that is a hundred times more precise than any possible experiment on earth, thanks to two differential accelerometers supplied by the French National Aerospace Research Centre, ONERA, and embedded into a microsatellite from the CNES. CNES is responsible for the system and satellite development and participates in the performance evaluation, whereas ONERA is responsible for the scientific payload and the scientific mission centre, as shown on Fig. 12. A data model was developed to model the telecommands (TC) and the telemetries (TM) exchanged between earth and space. Other data models were developed to model the scientific processing of telemetries.
Fig. 12. Collaboration between CNES-ONERA using MICROSCOPE data models.
Fig. 13. Extract of the XIF meta-model.
For both case studies, the complexity of the project was handled through an intensive use of model-driven engineering thanks to two in-house meta-models: XIF [4] and GAMME [5]. In the following sections, we detail these meta-models and the comparison phase, and take a closer look at the matching, filtering and differencing processes. Additionally, for the second case study, we also discuss evolution and migration strategies that could be used to co-evolve parameters during the development of the telemetry processing.
6. First deployment: PHARAO space data models
6.1. Context
The XIF meta-model defines the contents and format of the frames exchanged between a ground segment and a satellite. As shown on Fig. 13, it defines fields, records and packets.
A field corresponds to a set of bits (e.g. value of the altitude). A record is a set of several fields (e.g. attitude). A packet (or root) is a set of records that constitutes a full message, like a telecommand (i.e. a message sent from ground to space) or a telemetry (i.e. a message sent by a satellite). The example in Fig. 14 represents a frame subdivided into several records, which are themselves subdivided into several fields.
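The field/record/packet hierarchy described above can be pictured with a small Python sketch. This is not the XIF meta-model itself: the class and attribute names (bits, total_bits) and the example field names and sizes are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Field:
    name: str
    bits: int  # hypothetical attribute: number of bits occupied by the field

@dataclass
class Record:
    name: str
    fields: List[Field] = field(default_factory=list)

    def total_bits(self) -> int:
        return sum(f.bits for f in self.fields)

@dataclass
class Packet:
    name: str
    records: List[Record] = field(default_factory=list)

    def total_bits(self) -> int:
        return sum(r.total_bits() for r in self.records)

# Hypothetical telemetry packet: one header record and one attitude record.
header = Record("HEADER", [Field("id", 8)])
attitude = Record("ATTITUDE", [Field("roll", 16), Field("pitch", 16), Field("yaw", 16)])
tm = Packet("TM_PACKET", [header, attitude])

print(tm.total_bits())  # 56
```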
The XIF meta-model also defines two kinds of functions: monitoring functions to indicate when an alarm must be raised and calibration functions to transform raw values into physical values. It should be noticed that the considered data models are rather big, as shown in Table 4, and representative of space missions.
In the context of data model evolution, we can identify from the previous description various examples of the two kinds of evolution distinguished above (namely, with or without impact on data).
• Changing the order of the fields inside a record is an example of evolution with impact on data: this change implies that previously recorded frames are obsolete and new frames must be regenerated.
• Changing a calibration law is an example of evolution without impact on data: previous frames are still valid but users must be notified of this change to recompute physical values if necessary.
Fig. 14. Binary frame construction [4].
Table 4
Data models size of PHARAO project.
Versions   Size (MB)   Number of records   Number of fields
16         3.37        367                 2383
17         3.37        367                 2383
18         3.37        367                 2419
19         3.41        367                 2418
20         3.37        367                 2417
21         3.42        367                 2421
Fig. 15. Differences before EMF compare customization.
6.2. Step 1: data model comparison
The PHARAO data models are expressed in the XIF meta-model, which is currently defined by two XML schemas. We transformed both into a unique Ecore meta-model following the process described in [11]. As the encoding of XML attributes is similar in XIF and in Ecore, we developed a simple XSLT (Extensible Stylesheet Language Transformations) stylesheet to transform the PHARAO XML files into XMI files understandable by the EMF framework from Eclipse [12].
Once converted, it becomes possible to compare two revisions using, for example, the EMF Compare framework [13]. Fig. 15 represents the changes based on the four kinds identified in the EMF Compare comparison meta-model (ADD, DELETE, CHANGE, MOVE).
The time needed to compute the differences on our reference machine is given by the green broken line. Compared to the X-Diff algorithm, which needs more than 3 h to compare revisions 18 and 19, we get the results in a very reasonable time. Nevertheless, we get a lot of false-positives, i.e. unexpected correspondences, false-negatives, i.e. undetected correspondences, and hidden changes.
For example, the XIF root elements, which describe the telecommand and telemetry packets, are not correctly matched. Indeed, the results only contain one DELETE difference and one ADD difference for each packet, hiding the real differences that may exist between each of them. The following paragraphs present how we customize the default behavior of the EMF Compare engine to eliminate them.
6.2.1. Step 2: matching customization
During the matching step, we identify elements within the comparison scope via “identifiers” that may not be recognized as such by the default mechanism. For example, the telecommand and telemetry packets are not correctly matched by EMF Compare because of the numerous elements that constitute each packet. Nevertheless, they are always named TC_PACKET and TM_PACKET respectively, which will help us correctly identify them.
Matching is a key phase in our approach. Obtaining correct differences facilitates the construction of evolution operators. It avoids the ambiguities that may result from detected equivalences. As a result, it eases the construction of evolution operators with a single evolution and/or co-evolution role. Then, the obtained non-ambiguous operators may be composed to define high-level operators that are themselves non-ambiguous.
As shown in Listing 1, we indicate that the name of a RootType corresponds to the identifier that may help to match two RootType elements together (line 5). For other model elements, finding an identifier may require concatenating the names of the different parent fields to ensure uniqueness of each model element (line 1). Then, to know whether two model elements may match together, identity-based matching is used when an identifier is available (line 12). In the general case, similarity-based matching is used (line 8) [13].
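The matching rule just described can be sketched in Python as follows. This is neither the content of Listing 1 nor the EMF Compare API. The Element class, its content attribute, the FieldType kind, the record names STATUS and CONFIG and the similarity threshold are all assumptions introduced for illustration. The similarity fallback simply uses the standard difflib module.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional

@dataclass
class Element:
    kind: str
    name: Optional[str] = None
    parent: Optional["Element"] = None
    content: str = ""  # hypothetical serialized content, used only by the fallback

def identifier(e: Element) -> Optional[str]:
    """Return an identifier for a named element, or None if no name is available."""
    if e.name is None:
        return None
    if e.kind == "RootType":
        return e.name  # the name of a root is its identifier
    parts = []
    node = e
    while node is not None and node.name is not None:
        parts.append(node.name)  # concatenate the names of the parents
        node = node.parent
    return "/".join(reversed(parts))

def matches(a: Element, b: Element, threshold: float = 0.8) -> bool:
    if a.kind != b.kind:
        return False
    id_a, id_b = identifier(a), identifier(b)
    if id_a is not None and id_b is not None:
        return id_a == id_b  # identity-based matching
    # General case: similarity-based matching on the serialized content.
    return SequenceMatcher(None, a.content, b.content).ratio() >= threshold

tm_old = Element("RootType", "TM_PACKET", content="old layout")
tm_new = Element("RootType", "TM_PACKET", content="entirely new layout")
status = Element("RecordType", "STATUS", parent=tm_new)
config = Element("RecordType", "CONFIG", parent=tm_new)
mode_a = Element("FieldType", "mode", parent=status)
mode_b = Element("FieldType", "mode", parent=config)

print(matches(tm_old, tm_new))  # True
print(identifier(mode_a))       # TM_PACKET/STATUS/mode
print(matches(mode_a, mode_b))  # False
```

In this sketch the two packets match although their contents differ widely, which is the behavior sought for TC_PACKET and TM_PACKET, while two homonymous fields located in different records remain distinct.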
6.2.2. Difference filtering customization
In the XIF meta-model, only the elements inside the RecordType elements are ordered, as they determine the order in which the different bits of a packet are transmitted between earth and space. In order to detect the ordering changes of these elements only, we redefine a filter (implemented as a FeatureFilter in EMF Compare) to ignore all MOVE differences of all elements (line 2 in Listing 2) but these ones (line 5).
Listing 1. Extract of match engine customization.
Listing 2. Filter engine customization.
Fig. 16. Evaluating the scalability aspect of model-based approach before and after the customization.
6.2.3. Step 3: differences given by the EMF Compare Diff engine
After having customized matching and filtering, the differences are obtained. Fig. 16 shows that the number of MOVE changes is drastically decreased, eliminating a lot of uninteresting MOVE changes.
We also observe that the time to compute the differences is reduced, especially in the comparison of the most complex evolution (18–19 on Fig. 16).
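The filtering rule of Section 6.2.2 can be pictured with the following Python sketch. It does not reproduce Listing 2 and does not use the EMF Compare FeatureFilter class. The Difference record and its container_kind attribute are assumptions made to keep the example self-contained.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Difference:
    kind: str            # "ADD", "DELETE", "CHANGE" or "MOVE"
    element: str         # name of the impacted element
    container_kind: str  # hypothetical: kind of the element containing it

def keep(diff: Difference) -> bool:
    """Ignore MOVE differences, except for elements ordered inside a RecordType."""
    if diff.kind != "MOVE":
        return True
    return diff.container_kind == "RecordType"

differences = [
    Difference("MOVE", "field_a", "RecordType"),   # order matters: kept
    Difference("MOVE", "record_x", "RootType"),    # order irrelevant: ignored
    Difference("ADD", "field_b", "RecordType"),
    Difference("CHANGE", "calibration", "RootType"),
]

filtered = [d for d in differences if keep(d)]
print([(d.kind, d.element) for d in filtered])
# [('MOVE', 'field_a'), ('ADD', 'field_b'), ('CHANGE', 'calibration')]
```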
6.3. Assessment
Compared to an XML-based approach where each node of the original model is compared to each node of the new model, a model-based approach focuses on the comparison of the elements of the same type. Such an approach clearly helps find matches between the original and evolved data models.
Fig. 17. Extract of the GAMME meta-model.
To improve the quality of the results, we have customized the two following processes:
• the matching process, to correctly match elements from the original data model to elements of the target data model. Finding unique identifiers to identify each model element is essential. It helps to match model elements together and see what the real differences are;
• the difference filtering process, to remove undesired differences like the changes in the order of some elements.
Taking into account the properties of the considered meta-model helps get the real differences: an ADD corresponds to a data model element actually added, a DELETE corresponds to an element actually deleted.
In the context of the XIF meta-model, considering the data models representative of space missions, EMF Compare is arguably able to detect all the differences, and it achieves this task efficiently. For the most complex examples (18–19 on Fig. 16), the computation time is reduced from several hours for the XML approach to less than ten minutes for the model-based approach.
It is worth noticing that the purpose of Figs. 15 and 16 is not only to illustrate the benefits of customization for speeding up the comparison process, but also to provide a semantic interpretation (change, add, delete and move) of each identified difference (diff) element.
7. Second deployment: MICROSCOPE space data models
7.1. Context
In this section, we study how the proposed approach applies to a particular evolution of a data model used in the MICROSCOPE project [8]. In the following, we detail the comparison, evolution and migration strategies that could be used to co-evolve (migrate) parameters data used during the telemetry processing, following our pipeline architecture.
The considered data model is expressed in the GAMME meta-model. For the purpose of this paper, we consider that a model (GModel) is a set of types (GClass and GBasicType) and quantities (GQuantity) (see Fig. 17). Each quantity is a set of physical units (GUnit). Each class contains a set of features (GStructuralFeature) of two different types: GAttribute and GReference. A GReference feature represents aggregation or composition relationships with other classes. A GAttribute feature represents an attribute typed by a basic type like string, integer, float, date, enumeration, etc. A physical quantity can be associated to a GAttribute feature through the type GSimpleType, which references a quantity and a preferred unit [5]. An extract of the GAMME meta-model is depicted on Fig. 17.
This case study focuses on combining two telemetries. This concept is represented by class Signals with two attributes signal1 and signal2. Both attributes were typed by an enumeration called Signal, identifying the different telemetries. Then, it was decided to combine signals other than a telemetry. As shown on Fig. 18, it was decided to reconstruct the data model by replacing the two attributes by two composition relationships towards a new class, called AbstractData. The original attributes signal1 and signal2 are factorized into a class SessionData, inheriting from AbstractData and owning a signal attribute of type Signal. This way, end-users can combine two telemetries, a telemetry with another signal or two signals. Thus, we can identify the following evolved items:
• a new abstract class named AbstractData is added;
• two new classes named SessionData and OtherData, inheriting from AbstractData, are added;
• a new attribute named signal of type Signal is added to the class SessionData;
• a new attribute named signalExt of type String is added to the class OtherData;
• the types of signal1 and signal2 are changed from Signal to AbstractData.
It should be noted that the data model presented above is an extract of a real model that contains about one hundred units and hundreds of classes, attributes and composition/aggregation relationships.
Fig. 18. Evolution from the old to new data model.
Table 5
Extract of the comparison results based on the EMF compare default engines.
7.2. Step 1. Data models comparison
Using the default engines of EMF Compare 3.0.1, we find many false-negatives (i.e. undetected correspondences) and false-positives (i.e. unexpected correspondences) in the matching results. Table 5 represents an extract of the comparison results obtained by the default engines on our example, which amount to 247 differences. Some cascading differences, which are the consequences of a previous difference, are also shown (e.g. the attribute signalExt for the new class OtherData).
Table 6
Extract of correct matching between MICROSCOPE data models.
Table 7
Structural differences after customization.
A lot of differences are due to the inability of the default matching engine to match a unit with another unit. Indeed, most units reference one or two other units and each unit has two attributes only: a name and a multiplication or exponentiation factor, which does not facilitate the matching process.
An interesting unexpected correspondence is the matching of the attribute signal1 from the original class Signals with the attribute signal of the new class SessionData (see difference -1- in Table 5). Indeed, attribute signal1 is more similar to this attribute than to the new attribute signal1 of the new class Signals. The new attribute signal1 is now a reference to a class, which implies a type change from the type Signal to the new type AbstractData. Nevertheless, it makes more sense to match signal1 from the old Signals with signal1 from the new Signals. Consequently, an interesting undetected correspondence is the deletion of the attribute signal2 from the old class Signals, whereas it could be matched to the attribute signal2 of the new class Signals (see difference -2- in Table 5).
7.2.1. Matching customization
In the considered meta-model, the names of the classes are unique but the ones of the features (attributes and aggregation/composition relationships) are not. We consider that a feature can be identified by its name appended to the name of the container class.
In our case, the name of a quantity or a unit is unique. After customizing the EMF Compare matching engine to use identity-based matching for features, classes, quantities and units, the number of differences decreases to 13 only. In particular, the old attributes signal1 and signal2 are correctly matched to the new references signal1 and signal2 (see Table 6), which induces CHANGE differences (see differences from 10 to 13 in Table 7) instead of the previous MOVE and DELETE differences.
Table 8
Interesting comparison results after customization and filtering.
Table 9
Atomic operator instances of the initial evolution strategy.
7.2.2. Difference filtering customization
We have customized the default filter engine to ignore all order changes.
7.2.3. Differences raised by the EMF Compare Diff engine
The difference engine is called to obtain all the differences once the different elements are correctly matched. We have chosen to ignore the cascading differences, as, by definition, they represent logical consequences of other differences. For example, the differences 3, 4 and 5 in Table 7 are logical consequences of adding the class SessionData (difference 2). Finally, at the end of Step 1, we get 7 differences to be considered for the next step (see Table 8).
7.3. Step 2. Evolution operators flattening
As mentioned above, after customization of EMF Compare, 7 differences (see Table 8) that are satisfactory in terms of correctness and precision are obtained. These differences have been memorized in order to be transformed into atomic operators (see Table 9).
In our evolution meta-model, the following atomic operators have been defined.
• ADDClass(class: GClass, to: GModel) to add a class to a model;
• ChangeTypeOfStructuralFeature(f: GStructuralFeature, type: GType) to change the type of a feature;
• ADDReferenceType(f: GStructuralFeature, referenceType: GReferenceType) to indicate whether a reference corresponds to a composition or an aggregation relationship.
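The three atomic operators listed above can be sketched in Python on a toy in-memory model. The classes below only borrow their names from the GAMME meta-model. Their attributes (g_type, reference_type), the use of plain strings for types, and the omission of inheritance are simplifying assumptions, so this should be read as an illustration rather than as the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class GStructuralFeature:
    name: str
    g_type: str                           # type name, kept as a plain string here
    reference_type: Optional[str] = None  # "Composition", "Aggregation" or None

@dataclass
class GClass:
    name: str
    features: Dict[str, GStructuralFeature] = field(default_factory=dict)

@dataclass
class GModel:
    classes: Dict[str, GClass] = field(default_factory=dict)

def add_class(cls: GClass, to: GModel) -> None:
    """ADDClass: add a class to a model."""
    if cls.name in to.classes:
        raise ValueError(f"class {cls.name} already exists")
    to.classes[cls.name] = cls

def change_type_of_structural_feature(f: GStructuralFeature, new_type: str) -> None:
    """ChangeTypeOfStructuralFeature: change the type of a feature."""
    f.g_type = new_type

def add_reference_type(f: GStructuralFeature, reference_type: str) -> None:
    """ADDReferenceType: mark a reference as a composition or an aggregation."""
    if reference_type not in ("Composition", "Aggregation"):
        raise ValueError(f"unknown reference type {reference_type}")
    f.reference_type = reference_type

# Source data model: class Signals with two attributes typed by Signal.
signals = GClass("Signals", {
    "signal1": GStructuralFeature("signal1", "Signal"),
    "signal2": GStructuralFeature("signal2", "Signal"),
})
model = GModel({"Signals": signals})

# Evolution described in Section 7.1, expressed with the atomic operators.
add_class(GClass("AbstractData"), model)
add_class(GClass("SessionData", {"signal": GStructuralFeature("signal", "Signal")}), model)
add_class(GClass("OtherData", {"signalExt": GStructuralFeature("signalExt", "String")}), model)
for name in ("signal1", "signal2"):
    add_reference_type(signals.features[name], "Composition")
    change_type_of_structural_feature(signals.features[name], "AbstractData")

print(sorted(model.classes))
print(signals.features["signal1"])
```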
We define a composite operator MoveFeatureFromCompositeToComponent able to preserve the information carried by the instances of the old class Signals. The formula associated to this operator is defined by the sequence ADDReferenceType followed by ChangeTypeOfStructuralFeature. The construction of this sequence with the right instances of these two atomic operators is controlled by the following precondition.
\[ \exists f' \in \mathit{new}(f).\mathit{gType}.\mathit{allFeatures}, \quad f'.\mathit{gType} = \mathit{old}(f).\mathit{gType} \]
where
• $f$ represents the source and target feature, in our case signal1 or signal2;
• $\mathit{new}(f).\mathit{gType}.\mathit{allFeatures}$ represents all the features of the type of the new target feature;
• $f'$ is one of these features, in our case the attribute signal in the class SessionData;
• $f'.\mathit{gType}$ is the type of the feature $f'$, in our case the enumeration Signal;
• $\mathit{old}(f).\mathit{gType}$ is the type of the source feature, in our case the enumeration Signal.
At the end,
• three instances, presented in Table 10, of the atomic operator ADDClass
• and two instances, presented in Table 11, of the composite operator MoveFeatureFromCompositeToComponent
are obtained. These five operators correspond to the final evolution strategy able to make the source data model evolve into the target model while preserving the information contained in the original data. When calling the method apply associated to each operator, it becomes possible to check that the proposed evolution strategy is correct and that the transformed source data model is equal to the target data model used during the comparison process.
Table 10
Remaining atomic operator instances in the final evolution strategy.
Table 11
Composite operator instances in the final evolution strategy.
Composite operator                     Sequence of atomic operators
MoveFeatureFromCompositeToComponent    ADDReferenceType(Signals.signal1, Composition)
                                       ChangeTypeOfStructuralFeature(Signals.signal1, AbstractData)
MoveFeatureFromCompositeToComponent    ADDReferenceType(Signals.signal2, Composition)
                                       ChangeTypeOfStructuralFeature(Signals.signal2, AbstractData)
Fig. 19. Data conservation during the migration process.
7.4. Step 3. Data conservation
As shown previously in Fig. 8, the migration strategy is specified dynamically in the meta-model of evolution operators. In our example, the migration is supported by the successive application of the migrate operation for the two composite operators. During the migration, the values of the old impacted instances of the class Signals are preserved, first through the creation of two new instances of the class SessionData, initialized with these values, and second, through the modification of the features of the original instances of the class Signals (see Fig. 19).
7.5. Assessment
The MICROSCOPE use case shows that model-based approaches and their related tools may be misled by the relative complexity of relations holding between classes of a meta-model. For example, units, which are mainly defined by making references to other units, can defeat such a tool, whereas no unit has evolved. Finding an identifier to correctly match model elements together is a crucial step to obtain the real differences.
Once correctly matched, the level of differences may be low. For example, removing an inheritance relationship may result in three differences. To improve readability and reusability, we compose these low-level differences to form atomic operators, which describe a logical evolution of the model. Atomic operators can be combined to build complex operators to describe a set of related changes that must be aggregated to preserve information inside original data.
In the initial sequence of application of atomic operators in our example (see Table 9), some operators impact data and are not able to preserve the information carried by the source instances. The final sequence (see Tables 10 and 11) contains atomic operators with no impact and composite operators with impact and able to preserve information. Proceeding this way has the advantage of reducing the sequence to a set of composite operators capturing the original intention of the modeler, i.e. moving the values of attributes from an instance to instances of other classes, while extending the model with new concepts represented by the atomic ADD operators. Finding composite operators that fulfill the intent of the modeler and preserve information when it is possible is a relevant property to determine a logical evolution of the original model.
8. Related work
Software evolution has been investigated by Lehman [14,15], Kajko-Mattsson et al. [16], Mens et al. [17], Jazayeri [18], Ciraci and Van Den Broek [19], Greevy et al. [20], and Hassan et al. [21]. These authors addressed the definition of the laws, challenges and styles of software evolution, with the aim of including a new view of maintenance activities such as system adaptation and reshaping. Then, as software engineering practices urgently needed rules that facilitate the cost-effective planning, design, construction and maintenance of effective programs, a new discipline was officially highlighted in [22]. This work mentions that “Software Evolution is all programming activity that is intended to generate a new software version from an earlier operational version”.
Capturing and formalizing evolution stages for complex systems is a challenging task, due to the nature and characteristics of such systems [23–26]. Furthermore, the rapid evolution, the density and the type of differences [27] change according to the end-user's requirements. These are the reasons why a broad variety of algorithms, methods and rules have been developed by this community.
The following subsections summarize the different existing approaches for addressing the two processes of evolution and co-evolution using a difference-based approach. Following the identified steps of our approach, we study the available approaches and contributions of the literature related to each step.
At the beginning, we overview the work related to data model comparison and data model matching. Then, we study operator-based approaches. We survey the processes of construction of composite operators based on the atomic ones and the dependency between these operators. Finally, we tackle the existing results for data migration.
8.1. Model comparison
The history of comparison algorithms is composed of three stages. At the beginning, character-by-character comparison was set up. For instance, diff programs are used to solve the longest common subsequence (LCS) problem [28]. They are based on finding the lines that do not change between files. Then, as an improvement, a focus on the attributes and nodes of a structured document was advocated. A number of advanced algorithms are available to capture differences for XML, such as X-Diff [10] or Aladin [9]. However, even if they give logical results, they lack the ability to recognize a huge number of complex changes in a reasonable time.
Finally, other approaches focused on the semantic meaning of nodes and attributes: for instance, comparing a class node with a class node, or an attribute node with an attribute node. The assistance of the well-accepted EMF Compare framework is useful to support this comparison process [13].
8.2. Model matching
In the following, we provide a set of existing approaches to model matching, which can be seen as the first step towards model comparison.
Matching techniques are of particular importance for data model differencing approaches. Many techniques implementing matching are available. They rely on universally unique identifiers (UUIDs) [29,30], identity-based matching [13], heuristics [10,31–33], signature-based matching [34], similarity-based matching [35,36] or custom language-specific matching algorithms [37,38].
8.2.1. Universally unique identifiers (UUIDs)
In this category of approaches [29,30], it is assumed that each model element has a unique identifier to realize a matching. A UUID is assigned to each newly created element and must not be modified until the deletion of the element. If two elements occurring in different versions have the same UUID, they are considered as equivalent. Obviously, this method works only if UUIDs are available for all elements. In the case studies we addressed in this paper, such identifiers were not available.
8.2.2. Identity-based matching
When UUIDs are not available, it is possible to determine artificial identifiers for the different data model elements. Such identifiers are useful to find which objects match together, even if the obtained matching is not fully guaranteed, contrary to UUID-based matching [13].
In the case studies tackled in this paper, it is possible to define such identifiers for named elements. Thus, the identifier of a named element is often its name appended to the name of the different elements containing it. In our work, identity-based matching proved that it can solve matching errors produced by other techniques.
8.2.3. Heuristics
Another technique to match elements of different versions, introduced in [10,31–33], relies on the use of heuristics. These matching algorithms are based on the distance between the serializations of the different data model elements. The closer the serializations of two elements are, the more these two elements match. There are several heuristic matching algorithms described in the literature. They often use metrics from information theory like the Hamming distance or the Levenshtein distance, which calculates the minimum number of operations (insertion, deletion, or substitution of a single character) needed to transform one string into another.
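As an illustration of the last metric, the Levenshtein distance can be computed with the classical dynamic programming scheme. The Python function below is a generic textbook sketch, not the code of any of the cited tools.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to transform string a into string b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]

print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("signal1", "signal"))  # 1
```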
8.2.4. Signature-based matching
In signature-based matching, the identity of each model element is not static. Its identity, referred to as its signature, is dynamically computed by combining the values of its features. The signature computation is performed by means of a user-defined function specified using a model querying language.
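A minimal Python sketch of this idea is given below. A plain Python function stands in for the user-defined function, whereas the cited approaches rely on a model querying language. The dictionary encoding of model elements and the choice of features in the signature are assumptions made for the example.

```python
def match_by_signature(old_elements, new_elements, signature):
    """Pair each old element with the new element having the same signature, if any."""
    index = {signature(e): e for e in new_elements}
    return [(e, index.get(signature(e))) for e in old_elements]

# Hypothetical user-defined signature: container name combined with feature name.
def feature_signature(element):
    return (element["container"], element["name"])

old = [{"container": "Signals", "name": "signal1", "type": "Signal"}]
new = [
    {"container": "Signals", "name": "signal1", "type": "AbstractData"},
    {"container": "SessionData", "name": "signal", "type": "Signal"},
]

for before, after in match_by_signature(old, new, feature_signature):
    print(before["name"], "->", after["container"], after["name"], after["type"])
# signal1 -> Signals signal1 AbstractData
```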
8.2.5. Similarity-based matching
Similarity-based matching approaches [35,36] process models as typed attribute graphs and attempt to identify matching elements based on the aggregated similarity of their features. It is worth noting that not all features of model elements are equally relevant for establishing a match (e.g. classes with matching names are more likely to be matched than classes specializing the same parent superclass). Therefore, similarity-based algorithms typically need to be configured such that they specify the relative weight of each feature and thus of the detected correspondences. According to Rose et al. [39], the built-in algorithm of EMF Compare falls within this category. However, tuning the weights of the features is a predominantly empirical, trial-and-error process. Therefore, finding the right values of weights that deliver the best results for a particular modeling language can be particularly challenging.
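The influence of the weights can be illustrated with a deliberately simple Python sketch. It is not the algorithm of EMF Compare: the scoring function, the dictionary encoding of features and the weight values are assumptions chosen only to show how the configuration changes the selected match.

```python
def similarity(a, b, weights):
    """Weighted share of the features on which elements a and b agree."""
    total = sum(weights.values())
    agreed = sum(w for feature, w in weights.items() if a.get(feature) == b.get(feature))
    return agreed / total

old_signal1 = {"name": "signal1", "container": "Signals", "type": "Signal"}
new_signal = {"name": "signal", "container": "SessionData", "type": "Signal"}
new_signal1 = {"name": "signal1", "container": "Signals", "type": "AbstractData"}

# Hypothetical weights favoring the type: the unexpected candidate wins.
type_first = {"type": 6, "name": 2, "container": 2}
print(similarity(old_signal1, new_signal, type_first))   # 0.6
print(similarity(old_signal1, new_signal1, type_first))  # 0.4

# Hypothetical weights favoring name and container: the expected candidate wins.
name_first = {"name": 5, "container": 3, "type": 2}
print(similarity(old_signal1, new_signal, name_first))   # 0.2
print(similarity(old_signal1, new_signal1, name_first))  # 0.8
```

With the first set of weights, the old attribute signal1 is judged closer to the new attribute signal than to its renamed-type counterpart. This mirrors the kind of unexpected correspondence that weight tuning has to avoid.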
8.2.6. Custom language-specific matching algorithms
Techniques based on custom language-specific matching algorithms involve matching algorithms tailored to a particular modeling language. Achievements in this category of techniques are UMLDiff [37] and the work of Nejati et al. [38], which specifically target UML models and statecharts respectively.
8.3. Evolution operators
Despite the outstanding advances in software engineering, existing difference-based approaches still have a number of limitations [40], as they lack the capability of providing efficient two-way comparison between the different revisions. Difference engines may make mistakes by displaying false-negative or false-positive correspondences, as we have previously seen in this paper and as confirmed in [41]. After the customization of difference results and for reusability goals, many works tackled the process of capturing and constructing evolution operators [42–46]. For instance, the authors in [2] have defined a catalog of 61 reusable evolution operators (30 are atomic and 31 are composite) with the aim of treating the coupled evolution of meta-models (M2) and models (M1). The authors have provided a catalog based on the EMOF meta-modeling formalism, in which they have outlined a set of migration rules specified at a model level. The detection mechanism of these operators is discussed in [47], where the authors propose a detection engine for complex changes. They have addressed the two challenges of variability and overlap between evolution operators. Furthermore, Kehrer et al. [48] introduced SiLift, a generic tool environment able to lift incomprehensible low-level differences derived from EMF Compare into representations of user-level edit operations. Most recently, a research prototype named COPE [49] was extended into a transformation tool named Edapt, tailored for the migration of models in response to meta-model evolution [50].
Other approaches put the end-user in the center of the evolution process. In [51], the authors present the MT-Scribe tool, which offers assistance to end-users to build evolution model operators. It can be applied to support automating different types of model evolution tasks in an end-user programming style. In the implementation of MT-Scribe, users can demonstrate and generate transformation patterns. However, the correctness of these generated patterns cannot be checked.
8.4. Data migration
Model migration in general has drawn the attention of several researchers. In [52], the authors define model migration as transformations of models expressed in a given modeling language to other models defined in the same modeling language. Theoretically, model migration raises the problem of semantic preservation [53], which represents a research challenge. In practice, model migration is implemented either using a dedicated migration language or using a model transformation language. In [54], a classification of model migration approaches into three categories is described. It is summarized below.
8.4.1. Manual specification approaches
These approaches allow a developer to define migration templates using transformation languages tailored to model migration. An example of such a template is the “copy model elements” template that duplicates model elements. Sprinkle's language [55], MCL [56] and Epsilon Flock [57] are examples of these transformation languages.