
Appendix C. Component Extraction from Object-Oriented Source Code: ROMANTIC Approach

ROMANTIC (Re-engineering of Object-oriented systeMs by Architecture extractioN and migraTIon to Component based ones) is an approach that automatically recovers a component-based architecture from the source code of existing object-oriented software [Kebir et al., 2012b][Chardigny et al., 2008]. ROMANTIC relies on two models: a mapping model and a quality model.

• A mapping model between object-oriented concepts (i.e. classes, interfaces, packages, etc.) and component-based software engineering ones (i.e. components, interfaces, sub-components, etc.).

• A measurement model of the semantic-correctness of a component. This model refines the characteristics of a component into measurable metrics. Based on these metrics, a fitness function is defined to measure the semantic-correctness of a component.

C.1 Mapping Model between Component and Object Concepts

In ROMANTIC, components are viewed as disjoint collections of classes. Each collection is named a shape and contains classes which can belong to different object-oriented packages (see Figure C.1). The classes of a shape are organized into two sets, which constitute the shape interface and the shape center respectively.

The shape interface classes have links with some classes outside the shape through a method call or an attribute use, while the remaining classes represent the shape center. Figure C.1 shows the shape (resp. component) structure. Figure C.2 shows the component-object mapping model that was proposed to handle the correspondence between object and component concepts.

Figure C.1 : Shape Structure [Kebir et al., 2012b][Chardigny et al., 2008].

Figure C.2 : Object-Component Mapping Model [Kebir et al., 2012b][Chardigny et al., 2008].
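The interface/center split described above can be illustrated with a minimal Python sketch. The function name `split_shape`, the dictionary-based dependency representation, and the choice to count links in both directions (outgoing and incoming) are assumptions made for illustration; the source only states that interface classes "have links" with outside classes.

```python
from typing import Dict, Set, Tuple


def split_shape(shape: Set[str], deps: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split a shape into (interface, center).

    `deps` maps a class name to the classes whose methods it calls or whose
    attributes it uses. Assumption: a class belongs to the interface as soon
    as it is linked, in either direction, to a class outside the shape.
    """
    interface: Set[str] = set()
    for src, targets in deps.items():
        for dst in targets:
            if src in shape and dst not in shape:
                interface.add(src)   # shape class uses an outside class
            elif src not in shape and dst in shape:
                interface.add(dst)   # outside class uses a shape class
    return interface, shape - interface


# Hypothetical example: Client and Logger are outside the shape.
shape = {"Parser", "Lexer", "Token"}
deps = {"Client": {"Parser"}, "Parser": {"Lexer"}, "Lexer": {"Token", "Logger"}}
interface, center = split_shape(shape, deps)
```

In this example `Parser` and `Lexer` end up in the interface, and `Token`, which is only used inside the shape, forms the center.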

C.2 Semantic-Correctness of Components

The semantic correctness of a component is based on the component characteristics. These characteristics are identified by studying the most commonly admitted definitions of a software component.

By combining and refining the common elements of different definitions of the component concept, three semantic characteristics of software components were identified: composability, autonomy and specificity.


Figure C.3 : MetaModel of Refinement Software Characteristics Norm ISO-9126 [Kebir et al., 2012b][Chardigny et al., 2008].

C.2.1 From Characteristics to Properties

The identified semantic correctness characteristics are measured using the refinement model given by the norm ISO-9126 (see Figure C.3) [ISO, 2001]. In this model, the semantic correctness of a component is a characteristic, and the component characteristics are its sub-characteristics. These sub-characteristics are in turn refined into measurable properties, based on the semantics associated with each sub-characteristic. This refinement is presented below.

Autonomy: A component is autonomous if it has no required interfaces, and hence the property number of required interfaces should give a good measure of component autonomy.

Composability: A component can be composed by means of its provided and required interfaces. A component is considered easier to compose with another if the services provided/required in each interface are cohesive. The property average of service cohesion of component interface was used to measure composability.

Specificity: The specificity characteristic of a component is refined into properties by evaluating the number of provided services, based on the following statements. Firstly, a component which provides many interfaces may provide various services, as an interface can offer different services. Thus, the higher the number of interfaces, the higher the number of provided services. Secondly, if interfaces (resp. services in each interface) are cohesive (i.e. share resources), they probably offer closely related functionalities. Thirdly, if the code of the component is tightly coupled (resp. cohesive), the different parts of the component code use each other (resp. common resources). Consequently, they probably work together in order to offer a few functionalities. From these statements, the specificity characteristic was refined into the following properties: number of provided interfaces, average of service cohesion of component interface, component interface cohesion, and component cohesion and coupling.

C.2.2 From Properties to Metrics

According to the refinement model given by the norm ISO-9126, a set of metrics is needed to measure the component properties mentioned above. In order to define these metrics, a link between component properties and shape properties is needed. This link is established as follows:


Figure C.4 : The Refinement Model for Semantic-Correctness [Kebir et al., 2012b][Chardigny et al., 2008].

• Firstly, according to the mapping model, the component interface set is linked to the shape interface set. Thus, the average of the interface-class cohesion gives a correct measure for the property of the average of service cohesion of a component interface.

• Secondly, the component interface cohesion, the internal component cohesion and the internal component coupling can respectively be measured by the properties of interface class cohesion, shape class cohesion and shape class coupling.

• Thirdly, to link the property of the number of provided interfaces to a shape property, a component provided interface is associated to each shape-interface class having public methods. Thus, the number of provided interfaces is measured using the number of shape interface classes having public methods.

• Finally, the number of required interfaces is evaluated by using the coupling between the component and the outside. This coupling is linked to the shape external coupling. Thus, this property is measured using the property shape external coupling.

The properties shape class coupling and shape external coupling require a coupling measurement. Thus, the metrics Coupl(E) and CouplExt(E) are defined to measure respectively the coupling of a shape E and the coupling of E with the outside classes. They measure two types of dependencies between objects: method calls and use of attributes of another class. Moreover, they are related through the equation below:

$$\mathit{CouplExt}(E) = 100 - \mathit{Coupl}(E) \tag{C.1}$$
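As an illustration of Equation C.1, the sketch below computes both metrics from a list of dependencies. The source does not give the exact formula for Coupl(E); this sketch assumes it is the percentage of the dependencies involving at least one class of E that stay entirely inside E. The function names and the pair-based dependency representation are hypothetical.

```python
from typing import Iterable, Set, Tuple


def coupl(shape: Set[str], deps: Iterable[Tuple[str, str]]) -> float:
    """Internal coupling of a shape, as a percentage in [0, 100].

    Each dependency is a (source class, target class) pair standing for a
    method call or an attribute use. Assumed definition (not from the source):
    internal dependencies divided by all dependencies touching the shape.
    """
    internal = 0
    total = 0
    for src, dst in deps:
        if src in shape or dst in shape:
            total += 1
            if src in shape and dst in shape:
                internal += 1
    # Assumption: a shape without any dependency scores 0.
    return 100.0 * internal / total if total else 0.0


def coupl_ext(shape: Set[str], deps: Iterable[Tuple[str, str]]) -> float:
    """External coupling, derived from Equation C.1."""
    return 100.0 - coupl(shape, deps)


# Hypothetical example: 4 dependencies touch the shape, 2 of them are internal.
deps = [("A", "B"), ("B", "A"), ("A", "X"), ("Y", "B"), ("X", "Y")]
internal_pct = coupl({"A", "B"}, deps)      # 50.0
external_pct = coupl_ext({"A", "B"}, deps)  # 50.0
```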

The properties average of interface-class cohesion, interface-class cohesion, and shape-class cohesion require a cohesion measurement. The metric Loose Class Cohesion (LCC), proposed by Bieman and Kang [Bieman and Kang, 1995], was used to measure the percentage of pairs of methods which are directly or indirectly connected. The refinement model of the semantic correctness of a component is summarized in Figure C.4.
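The LCC measurement can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes two methods are directly connected when they use at least one common attribute, and indirectly connected when a chain of such connections links them. The function name `lcc` and the input format are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set


def lcc(uses: Dict[str, Set[str]]) -> float:
    """Percentage of method pairs that are directly or indirectly connected.

    `uses` maps each method name to the set of attributes it uses.
    """
    methods = list(uses)
    n = len(methods)
    if n < 2:
        return 0.0  # assumption: cohesion is undefined for fewer than 2 methods

    # Union-find structure grouping methods into connected components.
    parent = {m: m for m in methods}

    def find(m: str) -> str:
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path halving
            m = parent[m]
        return m

    for a, b in combinations(methods, 2):
        if uses[a] & uses[b]:              # direct connection: shared attribute
            parent[find(a)] = find(b)

    sizes: Dict[str, int] = {}
    for m in methods:
        root = find(m)
        sizes[root] = sizes.get(root, 0) + 1

    # Every pair inside a component is connected, directly or indirectly.
    connected = sum(s * (s - 1) // 2 for s in sizes.values())
    return 100.0 * connected / (n * (n - 1) // 2)


# Hypothetical class: push/pop/len are linked through items and size, log is isolated.
uses = {"push": {"items"}, "pop": {"items", "size"}, "len": {"size"}, "log": {"out"}}
cohesion = lcc(uses)  # 3 connected pairs out of 6 -> 50.0
```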


C.2.3 Evaluation of the Semantic-Correctness

According to the established mapping between the sub-characteristics, properties and metrics, three evaluation functions Spe, A and C were proposed for specificity, autonomy and composability respectively, where nbPub(I) is the number of interface classes having a public method and I is the shape interface of the shape E:

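1. $\mathit{Spe}(E) = \frac{1}{5} \cdot \left( \frac{1}{|I|} \cdot \sum_{i \in I} \mathit{LCC}(i) + \mathit{LCC}(I) + \mathit{LCC}(E) + \mathit{Coupl}(E) + \mathit{nbPub}(I) \right)$

2. $A(E) = \mathit{CouplExt}(E) = 100 - \mathit{Coupl}(E)$

3. $C(E) = \frac{1}{|I|} \cdot \sum_{i \in I} \mathit{LCC}(i)$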

The function S(E) defined below represents the semantic-correctness function of a shape E. This function is based on the evaluation of each sub-characteristic. That is why it is defined as a linear combination of the fitness functions of the sub-characteristics (i.e. Spe, A and C):

4. $S(E) = \frac{1}{\sum_{i=1}^{3} \lambda_i} \cdot \left( \lambda_1 \cdot \mathit{Spe}(E) + \lambda_2 \cdot A(E) + \lambda_3 \cdot C(E) \right)$
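The four evaluation functions can be sketched in Python as below, assuming the metric values have already been measured for a shape. The `ShapeMetrics` container, its field names and the default weights of 1 are hypothetical. nbPub(I) is added as a raw count, exactly as the formula is written, since the source does not say whether it is normalised.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ShapeMetrics:
    lcc_per_interface_class: Sequence[float]  # LCC(i) for each class i in I
    lcc_interface: float                      # LCC(I)
    lcc_shape: float                          # LCC(E)
    coupl: float                              # Coupl(E), in [0, 100]
    nb_pub: int                               # nbPub(I)


def composability(m: ShapeMetrics) -> float:
    """C(E): average LCC of the interface classes."""
    values = m.lcc_per_interface_class
    # Assumption: an empty interface yields 0 instead of a division by zero.
    return sum(values) / len(values) if values else 0.0


def autonomy(m: ShapeMetrics) -> float:
    """A(E) = CouplExt(E) = 100 - Coupl(E)."""
    return 100.0 - m.coupl


def specificity(m: ShapeMetrics) -> float:
    """Spe(E): mean of the five specificity terms."""
    return (composability(m) + m.lcc_interface + m.lcc_shape + m.coupl + m.nb_pub) / 5.0


def semantic_correctness(m: ShapeMetrics, l1: float = 1.0, l2: float = 1.0, l3: float = 1.0) -> float:
    """S(E): weighted combination of Spe, A and C, normalised by the weight sum."""
    weighted = l1 * specificity(m) + l2 * autonomy(m) + l3 * composability(m)
    return weighted / (l1 + l2 + l3)


# Hypothetical measurements for one shape.
metrics = ShapeMetrics([80.0, 60.0], 50.0, 40.0, 70.0, 2)
score = semantic_correctness(metrics)
```

With these values, C(E) is 70, A(E) is 30 and Spe(E) is 46.4, so equal weights give a score of about 48.8.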

C.3 Naming Components

In ROMANTIC, component naming is based on the following observations. In many object-oriented languages, class names are a sequence of nouns concatenated using camel-case notation (e.g. StringBuffer, ElementFilter, etc.). The first word of a class name indicates the main purpose of the class, the second word indicates a complementary purpose of the class, and so on.

On the other hand, an interface name should be an adjective that qualifies its implementing class.

According to these observations, three steps were proposed for naming components: extracting and tokenizing class and interface names from identified components, weighting tokens and constructing the component name.

C.3.1 Extracting and Tokenizing Class and Interface Names

In this step, class and interface names are extracted and then split into tokens according to the camel-case syntax. For example, StringBuffer is split into String and Buffer.
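A minimal sketch of the camel-case split, using Python's standard `re` module. The function name `tokenize` is hypothetical, and the handling of acronyms and digits (e.g. splitting XMLParser into XML and Parser) is an assumption that goes beyond what the source specifies.

```python
import re
from typing import List

# Alternatives, tried in order:
#   1. a run of capitals followed by a capitalised word (acronym, e.g. "XML" in "XMLParser")
#   2. an optional capital followed by lowercase letters (regular word)
#   3. a trailing run of capitals
#   4. a run of digits
_TOKEN = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def tokenize(name: str) -> List[str]:
    """Split a camel-case identifier into its tokens."""
    return _TOKEN.findall(name)


tokens = tokenize("StringBuffer")  # ['String', 'Buffer']
```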

C.3.2 Weighting Tokens

In this step, a weight is assigned to each extracted token. A large weight is given to tokens that constitute the first word of class names. A medium weight is given to tokens that are the first word of interface names. Finally, a small weight is given to the other tokens.

As a component consists of two sets of classes (center and interface), two strategies were proposed to deal with these classes when naming a component. A large weight was given to tokens extracted from classes that belong to the provided interface of a component, because these classes constitute the functionalities and services that it offers to other components, and thus its main purpose. A small weight was given to tokens extracted from classes that belong to the center of a component, because these classes are less concerned with the main purpose of the component and are mainly utility classes that do not interact with the outside. For a given word (w), the weight is calculated as follows:

$$\mathit{weight}(w) = \frac{1}{\sum_{i=1}^{5} N_i} \cdot \left( 1.0 \times (N_1 + N_2) + 0.75 \times N_3 + 0.5 \times (N_4 + N_5) \right) \tag{C.2}$$

Where:

• N1: Number of appearances as the first word of a class name belonging to the provided interface.

• N2: Number of appearances in an entity name belonging to the provided interface of a component shape.

• N3: Number of appearances as the first word of an interface name.

• N4: Number of appearances other than as the first word in an entity name.

• N5: Number of appearances in an entity name belonging to the center of a component shape.
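Equation C.2 can be evaluated with a short function once the five counts are known. The sketch below takes the counts as inputs instead of deriving them from class names, because the source does not detail how overlapping cases are counted. The function name and the zero-count fallback are assumptions.

```python
def weight(n1: int, n2: int, n3: int, n4: int, n5: int) -> float:
    """Weight of a token according to Equation C.2.

    n1..n5 are the appearance counts N1..N5 defined in the list above.
    """
    total = n1 + n2 + n3 + n4 + n5
    if total == 0:
        return 0.0  # assumption: a token that never appears gets weight 0
    return (1.0 * (n1 + n2) + 0.75 * n3 + 0.5 * (n4 + n5)) / total


# Hypothetical counts: (1.0*5 + 0.75*1 + 0.5*2) / 8 = 0.84375
w = weight(2, 3, 1, 1, 1)
```

Because the result is normalised by the total number of appearances, the weight of any token that appears at least once lies between 0.5 and 1.0.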

C.3.3 Constructing the Component Name

In this step, a component name is constructed from the tokens with the highest weights. The token with the highest weight constitutes the first word of the component name, the token with the second highest weight constitutes the second word, and so on. The number of words used in a component name is chosen by the user. When several tokens have the same weight, all the possible combinations are presented to the user, who can choose the appropriate one.
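The construction step, including the handling of ties, could be sketched as follows. This is one possible reading of "all the possible combinations": tokens of equal weight are tried in every order within the slots they compete for. The function name `candidate_names` and the exact-equality test on weights are assumptions.

```python
from itertools import groupby, permutations, product
from typing import Dict, List


def candidate_names(weights: Dict[str, float], n_words: int) -> List[str]:
    """Return every candidate component name made of `n_words` tokens.

    Tokens are ranked by decreasing weight. Tokens sharing the same weight
    form a group, and each ordering of a group yields a separate candidate.
    """
    if n_words <= 0:
        return []
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    slots: List[List[str]] = []  # alternatives contributed by each weight group
    remaining = n_words
    for _, group in groupby(ranked, key=lambda kv: kv[1]):
        if remaining <= 0:
            break
        tokens = [tok for tok, _ in group]
        take = min(remaining, len(tokens))
        slots.append(["".join(p) for p in permutations(tokens, take)])
        remaining -= take
    return ["".join(parts) for parts in product(*slots)]


# Hypothetical weights: Buffer and Filter tie for the second word.
names = candidate_names({"String": 1.0, "Buffer": 0.75, "Filter": 0.75, "Util": 0.5}, 2)
```

Here the user would be offered `StringBuffer` and `StringFilter` and would pick one of them.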