Cyclomatic complexity metrics revisited : an empirical study of software development and maintenance

(1)

(2)

(3)

(4)

(5)

OEVMbY

HD28 .M414 no

IZTZ-9o

ms^^

Center

for

Information

Systems

Research

MassachusettsInstitute ofTechnology

Sloan SchoolofManagement

77MassachusettsAvenue

(6)

(7)

CYCLOMATIC

COMPLEXITY METRICS

REVISITED:

AN

EMPIRICAL

STUDY

OF

SOFTWARE

DEVELOPMENT

AND

MAINTENANCE

Geoffrey

K. Gill

Chris

F.

Kemerer

October 1990

CISR

WP

No.

218 Sloan

WP

No. 3222-90

•1990

G.K. Gill, C.F.

Kemerer

Center

for Information

Systems Research

Sloan

School

of

Management

(8)

(9)

Abstract

While the need forsoftware metrics to aid in the assessment of software complexity forboth development and maintenancehasbeenwidelyargued,littleagreementhasbeenreachedonthe

appropriatenessandvalue ofanysingle metric. McCabe'scyclomatic complexitymetric, ameasure of the

maximum number

of linearlyindependent circuits in a program control graph hasbeen widelyusedinresearch. However,despite thewidespreadinterestandpopularityofthismetric, ithas notbeenwithoutcriticism,bothanalytical(theMyers and

Hansen

variants)andempirical

(thehighcorrelationof cyclomatic complexity withsizemeasures).

The

current researchtested

both types ofcritiques on a newly collected dataset of real world software development and

maintenanceprojects.

The

analytical research questions weretested

on

a set of834software modules froma

number

ofexistingreal-time systems. Neither the Myersnor

Hansen

variants

were found to be significantly different from the original value as computed by McCabe. Therefore, these particular theoretical criticism

seem

to have little or no practical impact, as

representedbythe datacollected in this study. In regard tothe empirical research questions,

previousconcernswerevalidatedonthis

new

dataset. However, the currentresearchproposes

asimple transformationofthemetricwherebythecyclomaticcomplexityisdividedbythesizeof systeminsource statements, therebydetermininga "complexitydensity" ratio. This complexity

densityratioisdemonstratedtobea useful predictorof softwaremaintenanceproductivityona

smallpilotsampleofactualmaintenanceprojects.

CR

Categories and Subject Descriptors: D.2.7 [Software Engineering]: Distribution and Maintenance; D.2.8 [Software Engineering]:Metrics;D.2.9[SoftwareEngineering]:Management;

F.2-3[AnalysisofAlgorithmsand ProblemComplexity]: Tradeoffs

among

ComplexityMeasures;

IC6.0

[Management

of

Computing

and Information Systems]: General - Economics;

K.63 [Management

of

Computing

andInformation Systems]:Software

Management

General Terms: Management, Measurement,Performance.

Additional Key

Words

and Phrases: Software Productivity, Software Maintenance, Software Complexity,

McCabe

Metrics,Cyclomatic Complexity.

Researchsupportforthesecond authorfromtheCenterforInformation SystemsResearchand

the International Financial Services Research Center is gratefully acknowledged. Helpful

(10)

(11)

1.

INTRODUCTION

A

criticaldistinctionbetweensoftware engineeringandother,

more

well-establishedbranches of engineering is the shortage of well-accepted measures, or metrics ofsoftware. Without such

metrics, the tasksof planning and controlling softwaredevelopment and maintenancewillremain

stagnantina"craft''-typemode, whereingreaterskill isacquired onlythroughgreater experience,and such experience cannotbeeasilycommunicatedtothenextsystemfor study,adoption,andfurther

improvement. Withsuchmetrics,softwareprojectscanbequantitativelydescribed,andthemethods

and tools used

on

the projects to improve productivity and quality can be evaluated. These

evaluations will help the discipline grow and mature, as progress is

made

at adopting those

innovations that

work

well, anddiscardingorrevisingthosethatdonot.

Of

particularconcernis theneedtoimprovethe softwaremaintenanceprocess,given that

maintenanceisestimatedtoconsume 40-75%ofthesoftwareeffort[Vesseyand Weber,1986].

What

differentiatesmaintenance fromotheraspectsof software engineeringis,ofcourse, the constraintof

the existing system.

The

existing system constrains the maintenance

work

in two ways, the first

throughthe imposition ofageneral designstructure thatcircumscribesthe possible designs ofthe

systemmodifications, andsecond, andmorespecifically,throughthecomplexityofthe existingcode which mustbemodified.

A

measure, or metric ofthis latter,

more

specific type of system influence thathas been proposed is the McCabe's cyclomatic complexity metric, a measure of the

maximum

number

of

linearlyindependentcircuitsinaprogramcontrolgraph [McCabe,1976].

As

describedbyMcCabe,

a primary purpose ofthe metric is to "...identifysoftware modules thatwillbe difficult totest or

maintain" [McCabe, 1976, p.308], and therefore is of particular interest to researchers and

practitioners concernedwithmaintenance.

The

McCabe

metric hasbeenwidely used inresearch

(12)

or indirectly related to the

McCabe

metric [Shepperd, 1988]. However, despite thewidespread

interest and popularity of this metric, it has not been without criticism, with an early critique

appearingin 1982 [Evangelist], and most recently thecomprehensive aforementioned critique by

Shepperd [1988].

Shepperd's reviewhighlights twotypesofcriticisms.

The

first,whichhelabels "theoretical

criticisms,"havetodowithanalyticalcriticismsofthealgorithmbywhichthe cyclomatic complexity

metric iscomputed forsoftware. Mostprominent

among

these are the refinements proposed by

Myers [Myers, 1977] and

Hansen

[Hansen, 1978]. Shepperdalsonotesthat,

among

thenumerous

empirical validationsor usesofthe metric, that themostconsistent single resultisthe highdegree

ofcorrelationbetween McCabe'smetricandacount of sourcelinesofcode(SLOC).

As

evidence

ofthis,fourstudies citedbyShepperd hadPearsoncorrelation coefficientsof.9orgreater for these

two metrics.^ This "empiricalcriticism" suggests that the additional effortrequired forcomputing

andunderstanding the

McCabe

cyclomatic complexity metric

may

notbeworthwhileinpractice.

Therefore, giventheneedfor acomplexity metricsuchasMcCabe'singeneral, but also the

well-formedcriticismof the

McCabe

metricinparticular,theneedforadditionalresearchin thisarea

is clear.

The

purpose of the current research is to empirically test the two main criticisms

summarizedorraisedbyShepperd,andto suggestneededimprovements. Inregardto the theoretical

criticisms ofMyers and Hansen, Shepperddoes note that itis "arguablewhether [they] represent

much

ofan improvementover thatof

McCabe"

(p. 32),since "The majority ofmodifications to

McCabe'soriginalmetricremainuntested." (p.35)[Shepperd, 1988]

The

current researchdirectly

addressesthis callforanempiricaltestofthese variations.

The

second research questionrevolves aroundthe practicalusefulnessofthe metric, given

thehighcorrelationsnoted by

some

previousresearch.

As

notedby Shepperd, concerns aboutthe

'Sce[BasiliandPerricone,1984],[FeuerandFowlkes,1980], [Paige, 1980],and[Wcxxlward,HcnneU, andHedley,1979].

(13)

external validity of the data and analyses in

some

ofthe previous studies can be used raised to

mitigate

some

ofthe results, particularlythose using data onsmall programs, often usingstudent

subjects. Therefore, theseresults bearvalidation ondata fromactual systems, and,in particular,

fromdata

on

maintenanceprojects,sinceuse ofthe metric for testingand maintenance was oneof

its author's main stated purposes. In particular, this research seeks not to determine whether

Cyclomatic complexity capturesallaspectsof complexityinonefigureofmerit,butrather toanswer the question raised by Shepperd, as to whether cyclomatic complexity can serve as a "useful

engineering approximation" [Shepperd, 1988].

Brieflysummarized,the resultsofthisresearchwere as follows. Ina test

on

834software

modules from a

number

ofexisting real-timesystems,neithertheMyersnor

Hansen

variantswere

found tobesignificantly different fromthe originalvalue as computedby

McCabe.

However,in

regard tothesecondsetof researchquestions, theempiricalcriticismofShepperd andotherswas

validated, inthatthecorrelationbetweenCyclomatic complexityand

SLOC

was foundtobe.95for

thesereal world systems. However, a simple transformation involvingthe cyclomatic complexity

metric is proposed, whereby the cyclomatic complexityis divided by the size of systemin source

statements, thereby determining a "complexity density" ratio. This complexity density ratio is

demonstrated tobea usefulpredictorof softwaremaintenanceproductivity,

on

asmallsampleof

actualmaintenanceprojects.

The

restofthispaperisorganizedasfollows: Section2detailsthemainresearchquestions,

andprovides references toprevious research literaturewhere appropriate. Section3outlines the

datacollectionproceduresandsummarizesthedatasetusedinthe research. Resultsarepresented

(14)

2.

CYCLOMATIC COMPLEXITY METRICS

2.1 Analytic Critiques

McCabe

[1976]proposedthat ameasureofthe complexityisthe

number

ofpossible paths

throughwhichthe softwarecouldexecute. Sincethe

number

of pathsinaprogramwithabackward

branchisinfinite,he proposedthatareasonablemeasurewould bethe

number

of independentpaths.

Afterdefiningtheprogramgraphfora givenprogram, the complexitycalculationwouldbe:

V(G)

=

e-n +2.

where

V(G)

=

thecyclomaticcomplexity,

e

=

the

number

ofedges,

n

=

the

number

ofnodes.

Analytically,

V(G)

isthe

number

ofpredicates plus1foreach connectedsingle-entrysingle-exitgraph

or subgraph.

From

the beginning, a

number

ofresearchershave proposedvariations

on

theoriginalmetric.

The

two most prominentvariations arethose ofHansen(1978)and Myers(1977) [Shepperd,1988].

Hansen

[1978] gavefoursimple rulesforcalculating

McCabe

'sComplexity:

1. IncrementoneforeveryIF,

CASE

or otheralternateexecutionconstruct.

2. IncrementoneforeveryIterative

DO, DO-

WHILE

orotherrepetitive construct.

3.

Add

twolessthanthe

number

oflogicalalternativesina

CASE.

4.

Add

oneforeachlogicaloperator

(AND,

OR)

in anIF.

Myers[1977]demonstratedwhat hebelieved tobeaflawin

McCabe

*smetric. Myerspointed out thatan IF statement with logical operators was in

some

senseless complicated than two IF

(15)

statements withone imbeddedintheother because thereisthepossibilityofanextra

ELSE

clause

inthesecondcase. Forexample:

IF(

A

and

B

)

THEN

ELSE

islesscomplexthan:

EF(

A

)

THEN

IF(

B

)

THEN

ELSE

Myers[1977] notedthatcalculating

V(G)

byindividualconditions (assuggestedby

McCabe)

made V(G)

identical forthesetwocases.

To

remedythisproblem,Myerssuggestedusing a tupleof

(CYCMID,CYCMAX)

where

CYCMAX

isMcCabe'smeasure,V(G),(aU fourrulesareused)and

CYCMID

uses only the firstthree. Usingthiscomplexityinterval, Myers wasable to resolve the

anomaliesinthe complexity measure.

Hansen

[1978] raised a conceptual objection to Myers's improvement. Hansen felt that

because

CYCMID

and

CYCMAX

were so closely related, using a tuple to represent them was redundantand did not provide enough information towarrant the intrinsic difficulties ofusing a

tuple. Instead he proposedthat what wasreallybeing measured bythe second halfofthe tuple

(CYCMAX)

wasreallythe complexity ofthe expression,not the

number

of paths through thecode.

Furthermore,hefeltthat rule

#3

abovealsowasjustameasureofthecomplexity ofthe expressions.

To

provide abetter metric,hesuggestedthat atuple consistingof

(CYCMIN,

operation count) be

usedwhere

CYCMIN

iscalculated usingjustthefirst tworules forcountingMcCabe's complexity. All of the above tuple measures have a problem in that, while providing ordering of the

(16)

complexity of programs, they do not provide a single

number

that can be used for quantitative

comparisons. Itisalsoanopenresearchquestionwhetherchoosingoneortheother ofthese metrics

makes any practical difference. While a

number

of authors have attempted validation of the

CYCMAX

metric(e.g. LiandCheung, 1987;LindandVairavan, 1989) there does notappeartobe

anyempirical validationofthe practical significanceofthe Myersor

Hansen

variations [Shepperd

1988]. Therefore, one question investigated in this research is a validation of the

CYCMIN,

CYCMID,

and

CYCMAX

complexitymetrics.

2.2 Empirical Tests

InShepperd'scomprehensivereview of previous researchonthemetric,henotesthat,

among

thenumerousempirical validationsorusesofthe metric, that themostconsistent singleresultisthe

high degree ofcorrelation between McCabe'smetricand acount of sourcelinesofcode(SLOC).

As

evidence ofthis, four studies cited by Shepperd had Pearson correlation coefficients of .9or greater for these two metrics. This"empirical criticism"suggest thatthe additional effortrequired

forcomputing and understandingthe

McCabe

cyclomaticcomplexity metric

may

notbepractically

worthwhile. However, as noted by Shepperd, concerns about external vahdity of the data and

analyses in

some

oftheprevious studiescan be raised tomitigate

some

of the results, particularly

thoseusingdataonsmallprograms fromstudentsubjects. Therefore, theseresultsbearvalidation.on

datafrom actual systems. In particular, applied research in thisareadoes not seekto determine

whethercyclomaticcomplexity capturesall aspectsof complexityinone figureofmerit, but rather

looks toanswerthe overridingquestionraisedbyShepperd,as towhethercyclomatic complexity can

(17)

23

ProductivityApplications

As

suggestedbyMcCabe.alikelyuse ofacomplexity metricisto"...identifysoftwaremodules

thatwillbedifficulttotestor maintain"[1976,p.308].

The

emphasis

on

maintenanceisappropriate,

given thatsoftwaremaintenanceis an area ofincreasing practical signiScance. It isestimated that

from

40-75%

of information systems resources are spent on software maintenance [Vesseyand

Weber, 1986]. However, accordingtoLientzand Swanson [1980]relativelylittleresearch hasbeen doneinthisarea.

One

areathatisof considerablepracticalimportanceforresearchisthe effectof

existingsystem complexityon maintenanceproductivity. Itisoften assumedthat a)

more

complex

systemsareharderto maintain,andb) thatsystemssufferfrom"entropy",andtherefore

become more

chaoticand complexastimegoeson[BeladyandLehman, 1976]. Theseassumptionsraise a

number

of researchquestions,in particular,

when

does software

become

sufficientlycostly tomaintain that

itis

more

economictoreplace (rewrite) itthan torepair (maintain)it? Thisis aquestionof

some

practical importance, as one type oftool in the

Computer

Aided Software Engineering

(CASE)

marketplace is the so-called "restructurer", designed to reduce existing system complexity by

automaticallymodifying the existingsourcecode [Parikh, 1986;

US

GSA,

1987]. Knowledgethata

metricsuchas

CYCMAX

accuratelyreflectsthedifficulty inmaintaininga setofsourcecode would allow

management

to

make

rationalchoicesinthe repair/replace decision, as well as aidinevaluating

these

CASE

tools.

The

empiricalevidencelinkingsoftware complexitytosoftwaremaintenancecosts

hasbeen criticized as beingrelatively

weak

[Kearney, 1986]. However,studies have shownthat a

significant fractionofthestaffresourcesusedinmaintenancearespentinunderstandingthe existing

code[Fjeldstadand Hamlen. 1979]. Therefore,a metric that measures complexity should proveto

(18)

2.4

Summary

ofResearch Questions

Based upon theabove review ofthe previous literature,most especially that of Shepperd,

three research questionsarethe subjectofthisempirical study. Theyare:

1.

What

is the empirical relationship between the original

McCabe

formulation of the

cyclomatic complexitymetric andthatof the latervariantsproposed byMyers andHansen?

The

practical implicationsofthesequestionsarethat,ifthe variants are

shown

tobesignificantlydifferent

thanthe originalformulationinpractice,then they meritcollectionandanalysisbybothresearchers

andpractitioners.

2.

What

isthe empirical relationshipbetweenthecyclomaticcomplexity metricandsimpler

size metrics such as source lines ofcode?

The

practical implication of this question is whether

cyclomatic complexity should be used as the numeratorin ratios todetermine the productivityof

software development and maintenanceefforts,orwhethersimplerto computemeasures, suchas

SLOC

wouldsuffice.

3.

How

can cyclomatic complexity metrics be used to assess the impact of existing code complexityon maintenanceprojectproductivity? Ifarelationshipcanbeshown between

some

form ofthemetricandproductivity,thenthemetriccan be usedas acomponentin

management

planning

andcontrolof software maintenance, an activityof considerableeconomicsignificance.

3.

DATA

COLLECTION

3.1 Data OverviewandBackgroundof theDatasite

The

approachtakentoaddressingtheresearch questionsin thisstudywasto collect detailed

data about a

number

of completed software projects at one firm, hereafter referred to as Alpha

Company. Allthe projectsusedin thisstudyweresmall,customized,andprimarily real-timedefense

applications undertaken inthe period from 1984-1989, with mostofthe projectsin thelast three

(19)

years of that period.

The company

considers the data contained in this study proprietary and

thereforerequested that its

name

not be usedand anyidentifyingdatathatcoulddirectly connect

itwiththisstudybewithheld.

The

numericdata, however, have notbeenalteredinany way.

The

company

employed anaverage of about 30professionalsinthesoftware department.

One

beneflt of this approach was that because all the projects were undertaken by one

organization, neither inter-organizational differences norindustrydifferencesshouldconfound the

results.

A

possible limitation is that this concentration on one industry and one

company may

somewhatlimittheapplicabilityoftheresults. Ataminimum,however, the defenseindustryisavery importantindustry forsoftware-andtheresults willclearlybeofinteresttoresearchersinthat Geld.

More

generally,theseresults

may

providesuggestions forresearchatothersites.

3.2 DataSources

The

dataforthisstudywerederivedfromthreesources [Gill, 1989]:

1. Source

Code

--

A

_{copy of}thesourcecodedevelopedormodifiedbyeachprojectwas obtained.

2. Accounting Database--Thisdatabasecontainedthe

number

ofhours every software engineerworked oneachproject forevery

week

ofthestudyperiod.

3. ActivityReports

~

A

descriptionofthe

work

donebyeach engineerforeach

week

foreveryproject.

Of

thesourcecode developedforthe projects,74.4% waswrittenin Pascal,and the other

25.6% ina variety of other third generation languages, primarily

FORTRAN.

Following

Boehm

[1981],onlydeliverable codewasincludedin this study,withdeliverabledefinedasanycode which

was placed under a configuration

management

system. This definition eliminated any purely

fortune magazinereported

m

itsSeptember25, 1989issue thatthePentagonspends $30 biUionannually onsoftware.

(20)

temporarycode,butretainedcode whichmightnothavebeenpartofthe delivered system,butwas

consideredvaluableenoughtobe savedandcontrolled. (Suchcodeincludedprogramsforgenerating

testdata,procedurestorecompilethe entire system,utilityroutines,etc.)

The

sourcecode wasthe

maindata sourceforaddressingthe firsttwosetsof researchquestions.

Inordertoaddressthemaintenanceproductivity question,twoadditionaltypesof datawere

required.

The

first of thesewasthe total

number

ofhoursworked ontheproject andthe person

chargingthe project. These wereobtainedfromthefirm'saccountingdatabase. Becausethehourly

charges were used to bill the project customer and were subject to United States Government Department ofDefense

(DOD)

audit, the engineers and managers used extreme care to insure

properallocationofcharges. Forthisreason,thesedataarebelieved tobeaccuraterepresentations

of theactual effortexpended oneachproject.

A

thirddatasourcewasaweekly

summary

calledan activityreport ofthe work done by each software engineer every

week

as part of their standard

record-keepingfor

DOD

reporting.

An

exampleof anactivityreportappearsinAppendix

A

These

activityreportswereused to isolate codingand debugging activities

when

examining the effectof

complexitydensityon maintenanceproductivity.

3

J

DataDeflnitions

Forthepurposes ofthisresearch, asoftwaremodule wasdefinedasthesourcecodethatwas

placed in a single physical file (whichwas often compiled separately).

The

size of the software modules was measuredinnon-commentsourcelinesofcode

(NCSLOC).

The

definitionofaline

ofcodeadoptedfor thisstudyisthatproposed by Conte,

Dunsmore

andShen [1986, p. 35]:

A

line ofcodeis anylineofprogramtextthatisnot a

comment

or blankline,regardlessof

the

number

ofstatementsorfragments of statementsontheline. Thisspecificallyincludes

all lines containing program headers, declarations, and executable and non-executable

statements.

Thissetofrules has severalmajorbenefits. Itisverysimple to calculateand itisveryeasy

(21)

totranslate foranycxjmputerlanguage. Because ofitssimplicity, it has

become

themost popular

SLOC

metric [Conte, et al. 1986, p. 35]. Through consistent use ofthis metric, results

become

comparableacross different studies.

Inaddressingthefirst tworesearchquestions,thisstudy includedatotalof834 modulesof which 771 were written in Pascal and 63 in

FORTRAN.

They

averaged 176.2

NCSLOC

(Non-Comment

LinesofCode)with astandard deviationof257.9. Intotal,approximately 150,000lines ofcode

(NCSLOC)

from21 software systemswereanalyzed.

The

definition of software maintenance adopted by this research is that of Parikh and

Zvegintzov[1983], namely"workdone onasoftware system afterit becomesoperational." Lientz

andSwanson[1980, pp. 4-5] include three tasksintheirdefinitionofmaintenance:"...(1)corrective --dealingwithfailures inprocessing, performance, or implementation, (2)adaptive--respondingto

anticipated changein thedataor processing environments;(3) perfective -- enhancing processing

efficiency, performance, or systemmaintainability."

The

projectsdescribed below asmaintenance

work

pertain toallofthe threeLientzandSwanson types.

To

address the third research question, projectdatawere gatheredforseven maintenance

projects in order to test the hypothesis concerning the effect of complexity on maintenance

productivity.

The

averagesizeofamaintenanceprojectwas1006added (standarddeviation, 1158),

andrequiredan average of 1059

work

hours (standarddeviation, 982).

^ _As

_Boehm

[1987]hasdescribed,

SLOC

havemethodhasmanydeficiencies asametric.In particular,they

aredifficult todefineconsistently, asJones[1986] hasidentifiednofewer than elevendifferentalgorithmsfor

counting code.Also,

SLOC

donot necessarilymeasurethe "value"ofthecode(in terms offunctionalityand

quality). However,

Boehm

[1987] also points out that no other metric has aclear advantage over SLOC.

Furthermore,

SLOC

are easytomeasure, conceptuallyfamiliar tosoftware developers,andare usedinmost

productivitydatabasesandcostestimationmodels.Forthese reasons,itwasthemetricadoptedbythisresearch.

(22)

4

RESULTS

4.1 Empirical Tests of Analytic Critique

Forthe834modules,threecomplexitymetricswere computed:

CYCMAX

—

the original

McCabe

cyclomaticcomplexity.

CYCMID

—

theMyersvariation.

CYCMIN

—

theHansenvariation.

Table4.1.1givestheaverage valueforeachmetric. Table4.1.2givesthePearsoncorrelation

coefficients for thethreecomplexitymetricsusedin thisstudy.

(23)

the empiricaldata suggestthat there are unlikely tobe any practicallysignificant differentresults

using

CYCMIN

or

CYCMID

insteadof

CYCMAX.

4.2 Test of Empirical Criticism

As

suggested by previousresearch, the length metric,

NCSLOC,

andthecomplexity measure,

CYCMAX

also turned out to be highly correlated. Table 4.2.1 gives the Pearson correlation

coefficientsforthecomplexitymetricswiththe

SLOC

metrics. Similar totheresultsfoundin

some

(24)

donothavea great dealofcontrolover thesizeofaprogramasitisintimatelyconnectedtothesize

ofthe application. However,bymeasuringandadopting complexitystandards,and/or byusing

CASE

restructuringtools, theycan

manage

unnecessarycyclomatic complexity.

The

secondreasonisthat

there are valid intuitive reasons

why

the length of the code

may

not have a large effect on

maintenanceproductivity.

When

asoftwareengineer maintainsaprogram, eachchangewilltendto

belocalized to a fairlysmall

number

of modules.

He

will,therefore,onlyneeddetailedknowledge

ofa small fractionof thecode.

The

sizeofthisfractionisprobably unrelatedtothetotalsizeofthe

program. Thus thelength of theentirecode isless relevant.'* Therefore, a transformed metric,

complexitydensity, isdefinedastheratioofcyclomaticcomplexitytothousandlines of executable

code

(KNCSLOC).

These complexitydensity measures are similar to Gilb's logical decisions per statement [Gilb, 1977],but they have not oftenbeen published.

One

exception is aby-productof

Selby'sresearchonreusedcode[Selby, 1988]. Inhisstudyof25

NASA

projects,his

mean

was81.4,

withastandarddeviationof57.2.

CMAXDENS,

theaverage

McCabe

complexity per thousandlines ofcode was 121.3 foroursample (standarddeviation of62.7).

Since both Pascal and

FORTRAN

code was analyzed in this research, it is important to

determinewhetherthereexistsanyeffectofprogramminglanguageused.

The

average complexity

densityof

FORTRAN

modules was comparedwithPascalmodules(seeTable4.3.1).

*Infact,thecorrelation of productivitywithinitialcodelengthwasnotsignificant(Pearsoncoefficient,-0.21;

p

=

0.61).

(25)

(26)

(27)

This figure suggests that productivity declineswith increasingcomplexitydensity. While it

would bespeculative to ascribe too

much

importancetotheresults,whicharebasedononlyseven

datapoints, the resultsdo

seem

sufficientlystrongand the implications sufficientlyimportantthat

furtherstudyisindicated. Iftheseresultscontinuetohold

on

a largerindependent sample, then such

resultswouldprovide strong supportfortheuseofthecomplexitydensitymeasureasa quantitative

method

foranalyzing softwaretodetermineitsmaintainability. Itmightalsobeusedasanobjective

measure ofthisaspectof softwarequality.

5

SUMMARY

AND CONCLUDING REMARKS

This paper began by directly addressed the two research questions raised by Shepperd

regardingthe

McCabe

cyclomatic complexitymetrics.

The

firstquestionwas whetherthe analytical

criticismsof theoriginalmetrichadanypractical significance. Baseduponthedatausedin thisstudy,

neither the Myers nor

Hansen

variationsseemtoresult invalues that areequivalenttothe original

specification for all practical purposes to which they might be put. Thus, these data support

Shepperd's hypothesisthatthevariationsdonotrepresenta significantdifferenceoverthe original

formulation.

The

secondresearchquestionrevolvesaroundthe empirically-derivedvalueaddedby

themetric,given the highcorrelationsfound by previous researchbetweenthe metricandstandard

sizemeasures,mostparticularly

LOG

.

The

datafromthisstudyprovideadditionalevidenceofthis

highcorrelation. This is particularlynoteworthysincethese dataare fromactualsystems, and, in

particular, include dataon maintenance projects,which is a main intendeddomainofthe metric.

When

the outlying projectisremoved,the resultsimprovedramatically:

PRODUCTIVITY

= 31.3 - 0.142*

CMAXDENS

(6.67) (-4.98)

R* = 0.86 AdjustedR* = 0.82

F-Value

=

24.84 n

=

6

(28)

Going beyond this aDirelation, this paper has proposed the use ofa transformed version ofthe

metric, referred to as"complexity density", wherebythe ratioof thecyclomatic complexity ofthe

moduletoitslengthin

LOC

iscalculated. Thisratioismeanttorepresent the weighted complexity of a module,and hence its likely level ofmaintenance task difficulty. This proposed complexity

density ratio is tested on a small pilot sample of projects and

shown

to be a goodsingle value

predictorofmaintenancecosts. Thus,itisproposedthatthistransformed versionofthe cyclomatic

complexity metric canserve, inShepperd's words,asa "usefulengineeringapproximation" [Shepperd,

1988].

Of

course, further empirical researchwillbe requiredto testthesepilotresultsona larger

sample drawn from a differentenvironmentin order tovalidatethe usefulness ofthe complexity

densityratio.

The

complexitydensityratioproposed bythisresearchcould provetobea simple,but

practically usefulmeasureofcomplexity. Itincorporates the

McCabe

cyclomaticcomplexitymetric,

awell-knownandwell-understoodmeasureofcomplexity. In addition,inthe interveningyears since

itsintroduction, the collectionofthedata necessarytocomputethemetrichasbeen automatedby

a

number

oftools. Therefore, theearlydifficultiesincollectingthesedata havebeenresolved,and

thereforedatacollectionshould not provetobea practical barrier to theratio'suse.

The

continued

search for useful metrics of software product complexity is a necessary first step in theongoing processofmovingsoftwaredevelopment firmly into therealm of engineering. Withwell-founded

complexitymetrics,developerswillhaveobjectiveanduseful yardstickswithwhichtoevaluatetheir

initialproducts. Thesemetricswillalsobe used byeachmaintenanceteamasitseekstoenhance theinitial product without increasing the levelofunnecessarycomplexity. Bothof these aspects

should

make

a contributiontowards theimproved

management

ofthe softwaredevelopment and

maintenanceprocess.

e.g.,seeLanguage TechnologyIncorporated's

INSPECTOR

product,and

SET

Laboratories' product.

(29)

6

BroUOGRAPHY

[BasiliandPerricone, 19S4]

Basil!, V. R. and B. Perricone, "Software Errors and Complexity:

An

Empirical Investigation,"

Communications_ofthe

ACM,

27(1):42-52,(January 1984).

[Beladyand

Lehman,

1976]

Belady,L.A.andLehman,

M.

M.;

"A

Model

ofLargeProgram Development";

IBM

SystemsJournal,

V. 15, n. 1,pp.225-252,(1976).

[Boehm,1981]

Boehm,

Barry W.; Software EngineeringEconomics; Prentice-Hall,EnglewoodClifEs,NJ, 1981.

[Boehm,etal. 1984]

Boehm,

B.W.; Gray,T.E.;andSeewaldt,T.;

IEEE

TransactionsonSoftwareEngineering;

May

1984.

[Boehm,1987]

Boehm,

Barry W.; "Improving SoftwareProductivity";Computer,September1987; pp. 43-57.

[Conte,et al.1986]

Conte, S. D.; Dunsmore, H. E.; and Shen, V. Y.; Software Engineering Metrics and Models; Benjamin/CummingsPublishingQarapany,Inc., 1986.

[Evangelist,1982]

Evangelist,

W.

M., "Software complexity metricsensitivitytoprogramstructuringrules," Journal of Systems<kSoftware, 3231-243, (1982).

[Feuerand Fowlkes, 1980]

Feuer, A. R. and E. B. Fowlkes, "Some results from an empirical study ofcomputer software",

Proceedings of the Fourth

EEEE

International Conference

on

Software Engineering, Munich, Germany, 1980, pp.351-355.

[Fjeldstadand Hamlen, 1979]

Fjeldstad, R. K. and Hamlen,

W.

T.; "Application program maintenance study: Report to our

Respondents";Proc of

GUIDE

48;

The

GuideCorporation,Philadelphia, 1979.Alsoin pParikhand Zvegintzov, 1983].

[GUb,1977]

Gilb,Thomas;SoftwareMetrics;WinthropPress,Cambridge,

MA

1977.

[Gill,1989]

Gill. Geoffrey KL;

"A

Study of the Factors that Affect Software Development Productivity";

unpublished

MIT

Sloan MastersThesis; 1989; Cambridge,

MA.

(30)

[Hansen, 1978]

Hansen,

W.

J.; "MeasurementofProgram Complexity

By

the Pair(CyclomaticNumber, Operator Count)";

ACM

SIGPLAN

Notices; 13 (3):29-33,

March

1978.

[Jones,1986]

Jones,Capers;ProgrammingProductivity; McGraw-Hill,

New

York,1986.

[Kearney,etal, 1986]

Kearney, J.

K,

R. L. Sedlmeyer,

W.

B. Thompson,

M.

A. Gray and

M.

A. Adler, "Software ComplexityMeasurement", Communicationsofthe

ACM,

29(11): 1044-1050,

(November

1986).

[Liand Cheung,1987]

Li, H. F. and Cheung,

W. K;

"An Empirical Study of Software Metrics";

IEEE

Transactionson Software EngineeringSE-13 (6):697-708,June, 1987.

[Llentzand Swanson, 1980]

Lientz, B. P. and Swanson,E.B.; SoftwareMaintenanceManagement;Addison-Wesley Publishing

Company,1980.

[LindandVairavan, 1989]

Lind,

Randy

K. and Vairavan, K.; "An ExperimentalInvestigation of Software Metrics andTheir

Relationship to Software Development Effort";

IEEE

Transactions on Software Engineering; 15

(5):649-653, 1989.

[McCabe, 1976]

McCabe, Thomas

J.;

"A

Complexity Measure";

IEEE

Transactions onSoftwareEngineering; SE-2

(4):308-320, 1976.

[Myers, 1977]

Myers,G.J.;"AnextensiontotheCyclomaticMeasureofProgramComplexity".

SIGPLAN

Notices;

12(10):61-64, 1977.

[Paige,1980]

Paige,M.,

"A

metricforsoftware testplanning",Proceedings of

COMPSAC

80, Buffalo,

New

York,

1980, pp.499-504.

[Farildi,1986]

Parikh, Girish;"Restructuringyour

COBOL

Programs";ComputerworldFocus;20(7a):39-42;February 19, 1986.

[ParikliandZvegintzov, 1983]

Parikh,G.andZvegintzov,N.;eds.TutorialonSoftware Maintenance,

EEEE

Computer

SciencePress,

SUverSpring,

MD

(1983). [Selby,1988]

Selby,Richard W.;"EmpiricallyAnalyzing SoftwareReuseinaProduction Environment"; Software Reuse

-

EmergingTechnologies; Traz,

W.

eds.

IEEE

Computer

Society Press,

NY NY

1988.

(31)

[Shepperd, 1988]

Shepperd, M.,

"A

critiqueof cyciomatic complesdty asa software metric," Software Engineering

Journal,3(2):30-36,(March 1988).

[US

GSA,

1987]

Parallel TestandEvaluation of aCobolRestructuring Tool;Federal Software

Management

Support

Center; United States General Services Administration, Office of Software Development and Information Technology, FaUs Church,VirginiaSept 1987.

[Vesseyand Weber, 1983]

Vessey, I. and Weber, R.;

"Some

Factors Affecting Program Repair Maintenance:

An

Empirical

Study";Communicationsofthe

ACM;

26(2):128-134,February,1983.

[Woodward,et al, 1979]

Woodward,

M.

R.,

M.

A

Hennell and D.

A

Hedley,

"A

measureofcontrol flow complexityin

programtext,"

IEEE

TransactionsonSoftwareEngineering,5(1):45-50, (1979).

[Zuseand Bollmann,1989]

Zuse,H.andP.Bollmann, "Softwaremetrics:usingMeasurement TheorytoDescribethe Properties

andScalesofStaticSoftware ComplexityMetrics,"

ACM

SIGPLAN

Notices,24 (8):23-33, (1989).

(32)

AppendixA:

Example

ofanActivityReport

Junior Software Engineer's

Name

Weekly

Report 4-1-1987

Project

#1

35hrs. Spentttiewholetimeworkingwith [StaffScientist] and[PrincipalScientist]

tryingto findthecauseof thedifferencebetween[BaselineRuns]and

[New

Instnmient Runs]. Afteralotofheadscratchingitturnsoutthatbothwere

correct.

We

justhadtoplaywiththeformat of the dataandunits togetthe

twodatasetstomatchto within[xxx]

RMS.

Thisisstillabovethe [yyy]

RMS

limit,but the[baseline runs]and [newinstrumentruns] started offbeing[zzz]

RMS

apart.So,untilthe[newinstrument]canbeimproved,

my

softwarecan

do nobetterthanit isdoing now.

Project

#2

1 hrs. Ihadverylittletimetowork

on

this,butIdid

manage

tostart tofinish the

modifyadhoctarget routine.

Project

#3

9hrs

The

automaticheight input forthe calibrationroutine is up andrunning.I

also

made some more

modificationssothattheycanreturn totheoldmethod

ifthe

IEEE

boardfails.

Project

#4

2hrs [StaffScientist]and Ispent the twohourstryingto get the micro

VAX

up

after

we

installedthe[SpecialBoards].

We

endeduphavingto pull theboards back out to reboot the machine.

The

boardswere eventuallyinstalled, but

therestillseemstobesomething wrong.

Total

=

45.0hours

(33)

(34)

(35)

(36)

Date

Due

(37)

MIT LIBRARIES DUPL

(38)