Center for Information Systems Research
Massachusetts Institute of Technology
Sloan School of Management
77 Massachusetts Avenue

CYCLOMATIC COMPLEXITY METRICS REVISITED:
AN EMPIRICAL STUDY OF SOFTWARE DEVELOPMENT AND MAINTENANCE

Geoffrey K. Gill
Chris F. Kemerer

October 1990

CISR WP No. 218
Sloan WP No. 3222-90

© 1990 G.K. Gill, C.F. Kemerer

Center for Information Systems Research
Sloan School of Management

Abstract
While the need for software metrics to aid in the assessment of software complexity for both development and maintenance has been widely argued, little agreement has been reached on the appropriateness and value of any single metric. McCabe's cyclomatic complexity metric, a measure of the maximum number of linearly independent circuits in a program control graph, has been widely used in research. However, despite the widespread interest and popularity of this metric, it has not been without criticism, both analytical (the Myers and Hansen variants) and empirical (the high correlation of cyclomatic complexity with size measures).

The current research tested both types of critiques on a newly collected dataset of real world software development and maintenance projects. The analytical research questions were tested on a set of 834 software modules from a number of existing real-time systems. Neither the Myers nor Hansen variants were found to be significantly different from the original value as computed by McCabe. Therefore, these particular theoretical criticisms seem to have little or no practical impact, as represented by the data collected in this study. In regard to the empirical research questions, previous concerns were validated on this new dataset. However, the current research proposes a simple transformation of the metric whereby the cyclomatic complexity is divided by the size of the system in source statements, thereby determining a "complexity density" ratio. This complexity density ratio is demonstrated to be a useful predictor of software maintenance productivity on a small pilot sample of actual maintenance projects.

CR Categories and Subject Descriptors: D.2.7 [Software Engineering]: Distribution and Maintenance; D.2.8 [Software Engineering]: Metrics; D.2.9 [Software Engineering]: Management; F.2.3 [Analysis of Algorithms and Problem Complexity]: Tradeoffs among Complexity Measures; K.6.0 [Management of Computing and Information Systems]: General - Economics; K.6.3 [Management of Computing and Information Systems]: Software Management

General Terms: Management, Measurement, Performance.

Additional Key Words and Phrases: Software Productivity, Software Maintenance, Software Complexity, McCabe Metrics, Cyclomatic Complexity.

Research support for the second author from the Center for Information Systems Research and the International Financial Services Research Center is gratefully acknowledged. Helpful

1. INTRODUCTION

A critical distinction between software engineering and other, more well-established branches of engineering is the shortage of well-accepted measures, or metrics, of software. Without such metrics, the tasks of planning and controlling software development and maintenance will remain stagnant in a "craft"-type mode, wherein greater skill is acquired only through greater experience, and such experience cannot be easily communicated to the next system for study, adoption, and further improvement. With such metrics, software projects can be quantitatively described, and the methods and tools used on the projects to improve productivity and quality can be evaluated. These evaluations will help the discipline grow and mature, as progress is made at adopting those innovations that work well, and discarding or revising those that do not. Of particular concern is the need to improve the software maintenance process, given that maintenance is estimated to consume 40-75% of the software effort [Vessey and Weber, 1983].

What differentiates maintenance from other aspects of software engineering is, of course, the constraint of the existing system. The existing system constrains the maintenance work in two ways, the first through the imposition of a general design structure that circumscribes the possible designs of the system modifications, and second, and more specifically, through the complexity of the existing code which must be modified.

A measure, or metric, of this latter, more specific type of system influence that has been proposed is McCabe's cyclomatic complexity metric, a measure of the maximum number of linearly independent circuits in a program control graph [McCabe, 1976]. As described by McCabe, a primary purpose of the metric is to "...identify software modules that will be difficult to test or maintain" [McCabe, 1976, p. 308], and it is therefore of particular interest to researchers and practitioners concerned with maintenance.

The McCabe metric has been widely used in research, in work directly or indirectly related to the metric [Shepperd, 1988]. However, despite the widespread interest and popularity of this metric, it has not been without criticism, with an early critique appearing in 1982 [Evangelist], and most recently the comprehensive aforementioned critique by Shepperd [1988].

Shepperd's review highlights two types of criticisms. The first, which he labels "theoretical criticisms," have to do with analytical criticisms of the algorithm by which the cyclomatic complexity metric is computed for software. Most prominent among these are the refinements proposed by Myers [Myers, 1977] and Hansen [Hansen, 1978]. Shepperd also notes that, among the numerous empirical validations or uses of the metric, the most consistent single result is the high degree of correlation between McCabe's metric and a count of source lines of code (SLOC). As evidence of this, four studies cited by Shepperd had Pearson correlation coefficients of .9 or greater for these two metrics.¹ This "empirical criticism" suggests that the additional effort required for computing and understanding the McCabe cyclomatic complexity metric may not be worthwhile in practice. Therefore, given the need for a complexity metric such as McCabe's in general, but also the well-formed criticism of the McCabe metric in particular, the need for additional research in this area is clear.

The purpose of the current research is to empirically test the two main criticisms summarized or raised by Shepperd, and to suggest needed improvements. In regard to the theoretical criticisms of Myers and Hansen, Shepperd does note that it is "arguable whether [they] represent much of an improvement over that of McCabe" (p. 32), since "The majority of modifications to McCabe's original metric remain untested." (p. 35) [Shepperd, 1988]. The current research directly addresses this call for an empirical test of these variations.

The second research question revolves around the practical usefulness of the metric, given the high correlations noted by some previous research. As noted by Shepperd, concerns about the external validity of the data and analyses in some of the previous studies can be raised to mitigate some of the results, particularly those using data on small programs, often using student subjects. Therefore, these results bear validation on data from actual systems, and, in particular, from data on maintenance projects, since use of the metric for testing and maintenance was one of its author's main stated purposes. In particular, this research seeks not to determine whether cyclomatic complexity captures all aspects of complexity in one figure of merit, but rather to answer the question raised by Shepperd, as to whether cyclomatic complexity can serve as a "useful engineering approximation" [Shepperd, 1988].

¹ See [Basili and Perricone, 1984], [Feuer and Fowlkes, 1980], [Paige, 1980], and [Woodward, Hennell, and Hedley, 1979].

Briefly summarized, the results of this research were as follows. In a test on 834 software modules from a number of existing real-time systems, neither the Myers nor Hansen variants were found to be significantly different from the original value as computed by McCabe. However, in regard to the second set of research questions, the empirical criticism of Shepperd and others was validated, in that the correlation between cyclomatic complexity and SLOC was found to be .95 for these real world systems. However, a simple transformation involving the cyclomatic complexity metric is proposed, whereby the cyclomatic complexity is divided by the size of the system in source statements, thereby determining a "complexity density" ratio. This complexity density ratio is demonstrated to be a useful predictor of software maintenance productivity, on a small sample of actual maintenance projects.

The rest of this paper is organized as follows: Section 2 details the main research questions, and provides references to previous research literature where appropriate. Section 3 outlines the data collection procedures and summarizes the dataset used in the research. Results are presented

2. CYCLOMATIC COMPLEXITY METRICS

2.1 Analytic Critiques
McCabe [1976] proposed that a measure of the complexity is the number of possible paths through which the software could execute. Since the number of paths in a program with a backward branch is infinite, he proposed that a reasonable measure would be the number of independent paths. After defining the program graph for a given program, the complexity calculation would be:

    V(G) = e - n + 2,

where V(G) = the cyclomatic complexity, e = the number of edges, and n = the number of nodes. Analytically, V(G) is the number of predicates plus 1 for each connected single-entry single-exit graph or subgraph.

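The edge-and-node calculation can be illustrated with a minimal sketch. The function name, the edge-list representation, and the example graph are assumptions made for illustration only; they are not taken from McCabe or from the tooling used in this study.

```python
def cyclomatic_complexity(edges):
    """Return V(G) = e - n + 2 for a single connected control graph.

    `edges` is an iterable of (source, target) node pairs; the node set
    is inferred from the edge endpoints.
    """
    edge_list = list(edges)
    nodes = {node for edge in edge_list for node in edge}
    return len(edge_list) - len(nodes) + 2


# Hypothetical control graph of a program containing one IF/ELSE.
if_else_graph = [
    ("entry", "test"),
    ("test", "then"),
    ("test", "else"),
    ("then", "exit"),
    ("else", "exit"),
]

print(cyclomatic_complexity(if_else_graph))  # 5 edges - 5 nodes + 2 = 2
```

The result agrees with the "predicates plus 1" reading: the example contains one predicate, giving a complexity of 2.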
From the beginning, a number of researchers have proposed variations on the original metric. The two most prominent variations are those of Hansen (1978) and Myers (1977) [Shepperd, 1988]. Hansen [1978] gave four simple rules for calculating McCabe's Complexity:

1. Increment one for every IF, CASE or other alternate execution construct.
2. Increment one for every Iterative DO, DO-WHILE or other repetitive construct.
3. Add two less than the number of logical alternatives in a CASE.
4. Add one for each logical operator (AND, OR) in an IF.

Myers [1977] demonstrated what he believed to be a flaw in McCabe's metric. Myers pointed out that an IF statement with logical operators was in some sense less complicated than two IF statements with one imbedded in the other because there is the possibility of an extra ELSE clause in the second case. For example:

    IF (A and B) THEN
    ELSE

is less complex than:

    IF (A) THEN
        IF (B) THEN
        ELSE
    ELSE

Myers [1977] noted that calculating V(G) by individual conditions (as suggested by McCabe) made V(G) identical for these two cases. To remedy this problem, Myers suggested using a tuple of (CYCMID, CYCMAX) where CYCMAX is McCabe's measure, V(G), (all four rules are used) and CYCMID uses only the first three. Using this complexity interval, Myers was able to resolve the anomalies in the complexity measure.

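A short sketch shows how the four counting rules separate the two IF examples. The `ConstructCounts` record and the function names are hypothetical, and the starting value of 1 is an assumption chosen to match the "number of predicates plus 1" definition of V(G); the rules themselves only state increments.

```python
from dataclasses import dataclass


@dataclass
class ConstructCounts:
    """Counts of control constructs in one module (hypothetical record)."""
    alternates: int = 0           # rule 1: IF, CASE, other alternate constructs
    loops: int = 0                # rule 2: DO, DO-WHILE, other repetitive constructs
    extra_case_branches: int = 0  # rule 3: sum over CASEs of (alternatives - 2)
    logical_operators: int = 0    # rule 4: AND/OR operators inside IF conditions


def cycmid(counts):
    # Rules 1-3 only; the base value of 1 is an assumption (see lead-in).
    return 1 + counts.alternates + counts.loops + counts.extra_case_branches


def cycmax(counts):
    # All four rules: V(G) counted by individual conditions.
    return cycmid(counts) + counts.logical_operators


def myers_interval(counts):
    return (cycmid(counts), cycmax(counts))


compound_if = ConstructCounts(alternates=1, logical_operators=1)  # IF (A and B)
nested_ifs = ConstructCounts(alternates=2)                         # IF (A) ... IF (B)

print(myers_interval(compound_if))  # (2, 3)
print(myers_interval(nested_ifs))   # (3, 3)
```

Both examples share the same upper value, while the lower value distinguishes the compound condition from the nested form.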
Hansen [1978] raised a conceptual objection to Myers's improvement. Hansen felt that because CYCMID and CYCMAX were so closely related, using a tuple to represent them was redundant and did not provide enough information to warrant the intrinsic difficulties of using a tuple. Instead he proposed that what was really being measured by the second half of the tuple (CYCMAX) was really the complexity of the expression, not the number of paths through the code. Furthermore, he felt that rule #3 above also was just a measure of the complexity of the expressions. To provide a better metric, he suggested that a tuple consisting of (CYCMIN, operation count) be used where CYCMIN is calculated using just the first two rules for counting McCabe's complexity.

All of the above tuple measures have a problem in that, while providing ordering of the complexity of programs, they do not provide a single number that can be used for quantitative comparisons. It is also an open research question whether choosing one or the other of these metrics makes any practical difference. While a number of authors have attempted validation of the CYCMAX metric (e.g. Li and Cheung, 1987; Lind and Vairavan, 1989) there does not appear to be any empirical validation of the practical significance of the Myers or Hansen variations [Shepperd 1988]. Therefore, one question investigated in this research is a validation of the CYCMIN, CYCMID, and CYCMAX complexity metrics.

2.2 Empirical Tests

In Shepperd's comprehensive review of previous research on the metric, he notes that, among the numerous empirical validations or uses of the metric, the most consistent single result is the high degree of correlation between McCabe's metric and a count of source lines of code (SLOC). As evidence of this, four studies cited by Shepperd had Pearson correlation coefficients of .9 or greater for these two metrics. This "empirical criticism" suggests that the additional effort required for computing and understanding the McCabe cyclomatic complexity metric may not be practically worthwhile. However, as noted by Shepperd, concerns about external validity of the data and analyses in some of the previous studies can be raised to mitigate some of the results, particularly those using data on small programs from student subjects. Therefore, these results bear validation on data from actual systems. In particular, applied research in this area does not seek to determine whether cyclomatic complexity captures all aspects of complexity in one figure of merit, but rather looks to answer the overriding question raised by Shepperd, as to whether cyclomatic complexity can

2.3 Productivity Applications

As suggested by McCabe, a likely use of a complexity metric is to "...identify software modules that will be difficult to test or maintain" [1976, p. 308]. The emphasis on maintenance is appropriate, given that software maintenance is an area of increasing practical significance. It is estimated that from 40-75% of information systems resources are spent on software maintenance [Vessey and Weber, 1983]. However, according to Lientz and Swanson [1980] relatively little research has been done in this area.

One area that is of considerable practical importance for research is the effect of existing system complexity on maintenance productivity. It is often assumed that a) more complex systems are harder to maintain, and b) that systems suffer from "entropy", and therefore become more chaotic and complex as time goes on [Belady and Lehman, 1976]. These assumptions raise a number of research questions, in particular, when does software become sufficiently costly to maintain that it is more economic to replace (rewrite) it than to repair (maintain) it? This is a question of some practical importance, as one type of tool in the Computer Aided Software Engineering (CASE) marketplace is the so-called "restructurer", designed to reduce existing system complexity by automatically modifying the existing source code [Parikh, 1986; US GSA, 1987]. Knowledge that a metric such as CYCMAX accurately reflects the difficulty in maintaining a set of source code would allow management to make rational choices in the repair/replace decision, as well as aid in evaluating these CASE tools.

The empirical evidence linking software complexity to software maintenance costs has been criticized as being relatively weak [Kearney, et al, 1986]. However, studies have shown that a significant fraction of the staff resources used in maintenance are spent in understanding the existing code [Fjeldstad and Hamlen, 1979]. Therefore, a metric that measures complexity should prove to

2.4 Summary of Research Questions

Based upon the above review of the previous literature, most especially that of Shepperd, three research questions are the subject of this empirical study. They are:

1. What is the empirical relationship between the original McCabe formulation of the cyclomatic complexity metric and that of the later variants proposed by Myers and Hansen? The practical implications of these questions are that, if the variants are shown to be significantly different than the original formulation in practice, then they merit collection and analysis by both researchers and practitioners.

2. What is the empirical relationship between the cyclomatic complexity metric and simpler size metrics such as source lines of code? The practical implication of this question is whether cyclomatic complexity should be used as the numerator in ratios to determine the productivity of software development and maintenance efforts, or whether simpler to compute measures, such as SLOC, would suffice.

3. How can cyclomatic complexity metrics be used to assess the impact of existing code complexity on maintenance project productivity? If a relationship can be shown between some form of the metric and productivity, then the metric can be used as a component in management planning and control of software maintenance, an activity of considerable economic significance.

3. DATA COLLECTION

3.1 Data Overview and Background of the Datasite

The approach taken to addressing the research questions in this study was to collect detailed data about a number of completed software projects at one firm, hereafter referred to as Alpha Company. All the projects used in this study were small, customized, and primarily real-time defense applications undertaken in the period from 1984-1989, with most of the projects in the last three years of that period. The company considers the data contained in this study proprietary and therefore requested that its name not be used and any identifying data that could directly connect it with this study be withheld. The numeric data, however, have not been altered in any way. The company employed an average of about 30 professionals in the software department.

One benefit of this approach was that because all the projects were undertaken by one organization, neither inter-organizational differences nor industry differences should confound the results. A possible limitation is that this concentration on one industry and one company may somewhat limit the applicability of the results. At a minimum, however, the defense industry is a very important industry for software,² and the results will clearly be of interest to researchers in that field. More generally, these results may provide suggestions for research at other sites.

3.2 Data Sources

The data for this study were derived from three sources [Gill, 1989]:

1. Source Code -- A copy of the source code developed or modified by each project was obtained.
2. Accounting Database -- This database contained the number of hours every software engineer worked on each project for every week of the study period.
3. Activity Reports -- A description of the work done by each engineer for each week for every project.

Of the source code developed for the projects, 74.4% was written in Pascal, and the other 25.6% in a variety of other third generation languages, primarily FORTRAN. Following Boehm [1981], only deliverable code was included in this study, with deliverable defined as any code which was placed under a configuration management system. This definition eliminated any purely temporary code, but retained code which might not have been part of the delivered system, but was considered valuable enough to be saved and controlled. (Such code included programs for generating test data, procedures to recompile the entire system, utility routines, etc.) The source code was the main data source for addressing the first two sets of research questions.

² Fortune magazine reported in its September 25, 1989 issue that the Pentagon spends $30 billion annually on software.

In order to address the maintenance productivity question, two additional types of data were required. The first of these was the total number of hours worked on the project and the person charging the project. These were obtained from the firm's accounting database. Because the hourly charges were used to bill the project customer and were subject to United States Government Department of Defense (DOD) audit, the engineers and managers used extreme care to insure proper allocation of charges. For this reason, these data are believed to be accurate representations of the actual effort expended on each project.

A third data source was a weekly summary called an activity report of the work done by each software engineer every week as part of their standard record-keeping for DOD reporting. An example of an activity report appears in Appendix A. These activity reports were used to isolate coding and debugging activities when examining the effect of complexity density on maintenance productivity.

3.3 Data Definitions

For the purposes of this research, a software module was defined as the source code that was placed in a single physical file (which was often compiled separately). The size of the software modules was measured in non-comment source lines of code (NCSLOC). The definition of a line of code adopted for this study is that proposed by Conte, Dunsmore and Shen [1986, p. 35]:

    A line of code is any line of program text that is not a comment or blank line, regardless of the number of statements or fragments of statements on the line. This specifically includes all lines containing program headers, declarations, and executable and non-executable statements.

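The counting rule quoted above can be sketched in a few lines. This is a simplified illustration, not the counter used in the study: the function name and the `comment_prefixes` parameter are hypothetical, and only whole-line comments that begin with one of the given prefixes are recognized (comments spanning several lines are not handled).

```python
def count_ncsloc(source_text, comment_prefixes=("{", "(*")):
    """Count lines that are neither blank nor whole-line comments.

    A line holding code plus a trailing comment counts as one line of
    code, as does a line holding several statements.
    """
    count = 0
    for line in source_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank line
        if stripped.startswith(tuple(comment_prefixes)):
            continue  # whole-line comment (simplifying assumption)
        count += 1
    return count


# Hypothetical Pascal-like fragment used only to exercise the rule.
sample = """program Demo;
{ whole-line comment }

var x: integer;   { trailing comment still counts as code }
begin
  x := 1; x := x + 1;
end.
"""

print(count_ncsloc(sample))  # 5
```

The line with two statements is counted once, consistent with the rule that the number of statements on a line is irrelevant.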
This set of rules has several major benefits. It is very simple to calculate and it is very easy to translate for any computer language. Because of its simplicity, it has become the most popular SLOC metric [Conte, et al. 1986, p. 35]. Through consistent use of this metric, results become comparable across different studies.

In addressing the first two research questions, this study included a total of 834 modules of which 771 were written in Pascal and 63 in FORTRAN. They averaged 176.2 NCSLOC (Non-Comment Lines of Code) with a standard deviation of 257.9. In total, approximately 150,000 lines of code (NCSLOC) from 21 software systems were analyzed.

The definition of software maintenance adopted by this research is that of Parikh and Zvegintzov [1983], namely "work done on a software system after it becomes operational." Lientz and Swanson [1980, pp. 4-5] include three tasks in their definition of maintenance: "...(1) corrective -- dealing with failures in processing, performance, or implementation, (2) adaptive -- responding to anticipated change in the data or processing environments; (3) perfective -- enhancing processing efficiency, performance, or system maintainability." The projects described below as maintenance work pertain to all of the three Lientz and Swanson types.

To address the third research question, project data were gathered for seven maintenance projects in order to test the hypothesis concerning the effect of complexity on maintenance productivity. The average size of a maintenance project was 1006 NCSLOC added (standard deviation, 1158), and required an average of 1059 work hours (standard deviation, 982).³

³ As Boehm [1987] has described, SLOC have many deficiencies as a metric. In particular, they are difficult to define consistently, as Jones [1986] has identified no fewer than eleven different algorithms for counting code. Also, SLOC do not necessarily measure the "value" of the code (in terms of functionality and quality). However, Boehm [1987] also points out that no other metric has a clear advantage over SLOC. Furthermore, SLOC are easy to measure, conceptually familiar to software developers, and are used in most productivity databases and cost estimation models. For these reasons, it was the metric adopted by this research.

4. RESULTS

4.1 Empirical Tests of Analytic Critique

For the 834 modules, three complexity metrics were computed:

CYCMAX -- the original McCabe cyclomatic complexity.
CYCMID -- the Myers variation.
CYCMIN -- the Hansen variation.

Table 4.1.1 gives the average value for each metric. Table 4.1.2 gives the Pearson correlation coefficients for the three complexity metrics used in this study.

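For readers who wish to repeat this kind of comparison on their own modules, the Pearson coefficient between two metric columns can be computed as in the following sketch. The function name is hypothetical, and the per-module values are invented toy numbers used only to exercise the code; they are not data from this study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(spread_x * spread_y)


# Invented per-module values (NOT from the study's dataset).
toy_cycmin = [1, 3, 4, 8, 12]
toy_cycmax = [2, 4, 6, 11, 15]

print(round(pearson(toy_cycmin, toy_cycmax), 3))
```

A coefficient close to 1 between two variants would indicate that they rank modules almost identically, which is the sense in which variants are compared here.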
the empirical data suggest that there are unlikely to be any practically significant different results using CYCMIN or CYCMID instead of CYCMAX.

4.2 Test of Empirical Criticism
As suggested by previous research, the length metric, NCSLOC, and the complexity measure, CYCMAX, also turned out to be highly correlated. Table 4.2.1 gives the Pearson correlation coefficients for the complexity metrics with the SLOC metrics. Similar to the results found in some

do not have a great deal of control over the size of a program as it is intimately connected to the size of the application. However, by measuring and adopting complexity standards, and/or by using CASE restructuring tools, they can manage unnecessary cyclomatic complexity.

The second reason is that there are valid intuitive reasons why the length of the code may not have a large effect on maintenance productivity. When a software engineer maintains a program, each change will tend to be localized to a fairly small number of modules. He will, therefore, only need detailed knowledge of a small fraction of the code. The size of this fraction is probably unrelated to the total size of the program. Thus the length of the entire code is less relevant.⁴ Therefore, a transformed metric, complexity density, is defined as the ratio of cyclomatic complexity to thousand lines of executable code (KNCSLOC).

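The ratio just defined can be written as a one-line calculation. The function name and the example module are hypothetical; the figures are chosen only to make the arithmetic easy to follow.

```python
def complexity_density(cyclomatic_complexity, ncsloc):
    """Cyclomatic complexity per thousand non-comment source lines (KNCSLOC)."""
    if ncsloc <= 0:
        raise ValueError("module size must be a positive number of lines")
    return 1000.0 * cyclomatic_complexity / ncsloc


# Hypothetical module: complexity of 24 spread over 200 NCSLOC.
print(complexity_density(24, 200))  # 120.0
```

Two modules with the same cyclomatic complexity but different lengths therefore receive different densities, which is the property that separates this ratio from the raw metric.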
These complexity density measures are similar to Gilb's logical decisions per statement [Gilb, 1977], but they have not often been published. One exception is a by-product of Selby's research on reused code [Selby, 1988]. In his study of 25 NASA projects, his mean was 81.4, with a standard deviation of 57.2. CMAXDENS, the average McCabe complexity per thousand lines of code, was 121.3 for our sample (standard deviation of 62.7).

Since both Pascal and FORTRAN code was analyzed in this research, it is important to determine whether there exists any effect of programming language used. The average complexity density of FORTRAN modules was compared with Pascal modules (see Table 4.3.1).

⁴ In fact, the correlation of productivity with initial code length was not significant (Pearson coefficient, -0.21; p = 0.61).

This figure suggests that productivity declines with increasing complexity density. While it would be speculative to ascribe too much importance to the results, which are based on only seven data points, the results do seem sufficiently strong and the implications sufficiently important that further study is indicated. If these results continue to hold on a larger independent sample, then such results would provide strong support for the use of the complexity density measure as a quantitative method for analyzing software to determine its maintainability. It might also be used as an objective measure of this aspect of software quality.

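The regression reported in this paper for the six projects remaining after removal of the outlier, PRODUCTIVITY = 31.3 - 0.142*CMAXDENS, can be explored with a small sketch. The coefficients are those reported in the paper; the function names, the least-squares helper, and the example densities are assumptions added for illustration, and the units of productivity are left as in the source.

```python
INTERCEPT = 31.3   # reported intercept
SLOPE = -0.142     # reported coefficient on CMAXDENS


def predicted_productivity(cmaxdens):
    """Evaluate the reported fitted line at a given complexity density."""
    return INTERCEPT + SLOPE * cmaxdens


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Illustrative densities (not the study's project data).
densities = [50.0, 100.0, 150.0, 200.0]
for density in densities:
    print(density, round(predicted_productivity(density), 2))

# Sanity check: fitting points that lie on the line recovers its coefficients.
intercept, slope = fit_line(densities, [predicted_productivity(d) for d in densities])
```

With only six observations behind the reported coefficients, such predictions should be read as indicative of the direction of the effect rather than as precise estimates.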
5. SUMMARY AND CONCLUDING REMARKS

This paper began by directly addressing the two research questions raised by Shepperd regarding the McCabe cyclomatic complexity metrics. The first question was whether the analytical criticisms of the original metric had any practical significance. Based upon the data used in this study, both the Myers and Hansen variations seem to result in values that are equivalent to the original specification for all practical purposes to which they might be put. Thus, these data support Shepperd's hypothesis that the variations do not represent a significant difference over the original formulation.

The second research question revolves around the empirically-derived value added by the metric, given the high correlations found by previous research between the metric and standard size measures, most particularly LOC. The data from this study provide additional evidence of this high correlation. This is particularly noteworthy since these data are from actual systems, and, in particular, include data on maintenance projects, which is a main intended domain of the metric.

When the outlying project is removed, the results improve dramatically:

    PRODUCTIVITY = 31.3 - 0.142 * CMAXDENS
                   (6.67)  (-4.98)

    R² = 0.86    Adjusted R² = 0.82    F-Value = 24.84    n = 6

Going beyond this correlation, this paper has proposed the use of a transformed version of the metric, referred to as "complexity density", whereby the ratio of the cyclomatic complexity of the module to its length in LOC is calculated. This ratio is meant to represent the weighted complexity of a module, and hence its likely level of maintenance task difficulty. This proposed complexity density ratio is tested on a small pilot sample of projects and shown to be a good single value predictor of maintenance costs. Thus, it is proposed that this transformed version of the cyclomatic complexity metric can serve, in Shepperd's words, as a "useful engineering approximation" [Shepperd, 1988]. Of course, further empirical research will be required to test these pilot results on a larger sample drawn from a different environment in order to validate the usefulness of the complexity density ratio.

The complexity density ratio proposed by this research could prove to be a simple, but practically useful measure of complexity. It incorporates the McCabe cyclomatic complexity metric, a well-known and well-understood measure of complexity. In addition, in the intervening years since its introduction, the collection of the data necessary to compute the metric has been automated by a number of tools.⁵ Therefore, the early difficulties in collecting these data have been resolved, and data collection should not prove to be a practical barrier to the ratio's use.

The continued search for useful metrics of software product complexity is a necessary first step in the ongoing process of moving software development firmly into the realm of engineering. With well-founded complexity metrics, developers will have objective and useful yardsticks with which to evaluate their initial products. These metrics will also be used by each maintenance team as it seeks to enhance the initial product without increasing the level of unnecessary complexity. Both of these aspects should make a contribution towards the improved management of the software development and maintenance process.

⁵ e.g., see Language Technology Incorporated's INSPECTOR product, and SET Laboratories' product.

6. BIBLIOGRAPHY

[Basili and Perricone, 1984]
Basili, V. R. and B. Perricone, "Software Errors and Complexity: An Empirical Investigation," Communications of the ACM, 27(1):42-52, (January 1984).

[Belady and Lehman, 1976]
Belady, L. A. and Lehman, M. M.; "A Model of Large Program Development"; IBM Systems Journal, V. 15, n. 1, pp. 225-252, (1976).

[Boehm, 1981]
Boehm, Barry W.; Software Engineering Economics; Prentice-Hall, Englewood Cliffs, NJ, 1981.

[Boehm, et al. 1984]
Boehm, B. W.; Gray, T. E.; and Seewaldt, T.; IEEE Transactions on Software Engineering; May 1984.

[Boehm, 1987]
Boehm, Barry W.; "Improving Software Productivity"; Computer, September 1987; pp. 43-57.

[Conte, et al. 1986]
Conte, S. D.; Dunsmore, H. E.; and Shen, V. Y.; Software Engineering Metrics and Models; Benjamin/Cummings Publishing Company, Inc., 1986.

[Evangelist, 1982]
Evangelist, W. M., "Software complexity metric sensitivity to program structuring rules," Journal of Systems & Software, 3:231-243, (1982).

[Feuer and Fowlkes, 1980]
Feuer, A. R. and E. B. Fowlkes, "Some results from an empirical study of computer software", Proceedings of the Fourth IEEE International Conference on Software Engineering, Munich, Germany, 1980, pp. 351-355.

[Fjeldstad and Hamlen, 1979]
Fjeldstad, R. K. and Hamlen, W. T.; "Application program maintenance study: Report to our Respondents"; Proc of GUIDE 48; The Guide Corporation, Philadelphia, 1979. Also in [Parikh and Zvegintzov, 1983].

[Gilb, 1977]
Gilb, Thomas; Software Metrics; Winthrop Press, Cambridge, MA 1977.

[Gill, 1989]
Gill, Geoffrey K.; "A Study of the Factors that Affect Software Development Productivity"; unpublished MIT Sloan Masters Thesis; 1989; Cambridge, MA.

[Hansen, 1978]
Hansen, W. J.; "Measurement of Program Complexity By the Pair (Cyclomatic Number, Operator Count)"; ACM SIGPLAN Notices; 13(3):29-33, March 1978.

[Jones, 1986]
Jones, Capers; Programming Productivity; McGraw-Hill, New York, 1986.

[Kearney, et al, 1986]
Kearney, J. K., R. L. Sedlmeyer, W. B. Thompson, M. A. Gray and M. A. Adler, "Software Complexity Measurement", Communications of the ACM, 29(11):1044-1050, (November 1986).

[Li and Cheung, 1987]
Li, H. F. and Cheung, W. K.; "An Empirical Study of Software Metrics"; IEEE Transactions on Software Engineering SE-13(6):697-708, June, 1987.

[Lientz and Swanson, 1980]
Lientz, B. P. and Swanson, E. B.; Software Maintenance Management; Addison-Wesley Publishing Company, 1980.

[Lind and Vairavan, 1989]
Lind, Randy K. and Vairavan, K.; "An Experimental Investigation of Software Metrics and Their Relationship to Software Development Effort"; IEEE Transactions on Software Engineering; 15(5):649-653, 1989.

[McCabe, 1976]
McCabe, Thomas J.; "A Complexity Measure"; IEEE Transactions on Software Engineering; SE-2(4):308-320, 1976.

[Myers, 1977]
Myers, G. J.; "An extension to the Cyclomatic Measure of Program Complexity". SIGPLAN Notices; 12(10):61-64, 1977.

[Paige, 1980]
Paige, M., "A metric for software test planning", Proceedings of COMPSAC 80, Buffalo, New York, 1980, pp. 499-504.

[Parikh, 1986]
Parikh, Girish; "Restructuring your COBOL Programs"; Computerworld Focus; 20(7a):39-42; February 19, 1986.

[Parikh and Zvegintzov, 1983]
Parikh, G. and Zvegintzov, N.; eds. Tutorial on Software Maintenance, IEEE Computer Science Press, Silver Spring, MD (1983).

[Selby, 1988]
Selby, Richard W.; "Empirically Analyzing Software Reuse in a Production Environment"; Software Reuse - Emerging Technologies; Tracz, W. eds. IEEE Computer Society Press, NY NY 1988.

[Shepperd, 1988]
Shepperd, M., "A critique of cyclomatic complexity as a software metric," Software Engineering Journal, 3(2):30-36, (March 1988).

[US GSA, 1987]
Parallel Test and Evaluation of a Cobol Restructuring Tool; Federal Software Management Support Center; United States General Services Administration, Office of Software Development and Information Technology, Falls Church, Virginia Sept 1987.

[Vessey and Weber, 1983]
Vessey, I. and Weber, R.; "Some Factors Affecting Program Repair Maintenance: An Empirical Study"; Communications of the ACM; 26(2):128-134, February, 1983.

[Woodward, et al, 1979]
Woodward, M. R., M. A. Hennell and D. A. Hedley, "A measure of control flow complexity in program text," IEEE Transactions on Software Engineering, 5(1):45-50, (1979).

[Zuse and Bollmann, 1989]
Zuse, H. and P. Bollmann, "Software metrics: using Measurement Theory to Describe the Properties and Scales of Static Software Complexity Metrics," ACM SIGPLAN Notices, 24(8):23-33, (1989).

Appendix A: Example of an Activity Report

Junior Software Engineer's Name
Weekly Report 4-1-1987

Project #1 35 hrs.
Spent the whole time working with [Staff Scientist] and [Principal Scientist] trying to find the cause of the difference between [Baseline Runs] and [New Instrument Runs]. After a lot of head scratching it turns out that both were correct. We just had to play with the format of the data and units to get the two data sets to match to within [xxx] RMS. This is still above the [yyy] RMS limit, but the [baseline runs] and [new instrument runs] started off being [zzz] RMS apart. So, until the [new instrument] can be improved, my software can do no better than it is doing now.

Project #2 1 hrs.
I had very little time to work on this, but I did manage to start to finish the modify adhoc target routine.

Project #3 9 hrs
The automatic height input for the calibration routine is up and running. I also made some more modifications so that they can return to the old method if the IEEE board fails.

Project #4 2 hrs
[Staff Scientist] and I spent the two hours trying to get the microVAX up after we installed the [Special Boards]. We ended up having to pull the boards back out to reboot the machine. The boards were eventually installed, but there still seems to be something wrong.

Total = 45.0 hours