Comparison of atlas-based techniques for whole-body bone segmentation


ARABI, Hossein, ZAIDI, Habib

Abstract

We evaluate the accuracy of whole-body bone extraction from whole-body MR images using a number of atlas-based segmentation methods. The motivation behind this work is to find the most promising approach for the purpose of MRI-guided derivation of PET attenuation maps in whole-body PET/MRI. To this end, a variety of atlas-based segmentation strategies commonly used in medical image segmentation and pseudo-CT generation were implemented and evaluated in terms of whole-body bone segmentation accuracy. Bone segmentation was performed on 23 whole-body CT/MR image pairs via leave-one-out cross validation procedure. The evaluated segmentation techniques include: (i) intensity averaging (IA), (ii) majority voting (MV), (iii) global and (iv) local (voxel-wise) weighting atlas fusion frameworks implemented utilizing normalized mutual information (NMI), normalized cross-correlation (NCC) and mean square distance (MSD) as image similarity measures for calculating the weighting factors, along with other atlas-dependent algorithms, such as (v) shape-based averaging (SBA) and (vi) Hofmann's pseudo-CT generation method. The performance [...]

ARABI, Hossein, ZAIDI, Habib. Comparison of atlas-based techniques for whole-body bone segmentation. Medical Image Analysis, 2017, vol. 36, p. 98-112

PMID : 27871000

DOI : 10.1016/j.media.2016.11.003

Available at:

http://archive-ouverte.unige.ch/unige:90691

Disclaimer: layout of this document may differ from the published version.


Medical Image Analysis


Hossein Arabi^a, Habib Zaidi^a,b,c,d,∗

a Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, CH-1211 Geneva, Switzerland

b Geneva Neuroscience Center, Geneva University, CH-1205 Geneva, Switzerland

c Department of Nuclear Medicine and Molecular Imaging, University of Groningen, University Medical Center Groningen, 9700 RB Groningen, Netherlands

d Department of Nuclear Medicine, University of Southern Denmark, DK-500, Odense, Denmark

Article info

Article history: Received 14 June 2016; Revised 7 November 2016; Accepted 10 November 2016; Available online 12 November 2016

Keywords: MRI; PET/MRI; Bone segmentation; Atlas-based segmentation; Whole-body

Abstract

We evaluate the accuracy of whole-body bone extraction from whole-body MR images using a number of atlas-based segmentation methods. The motivation behind this work is to find the most promising approach for the purpose of MRI-guided derivation of PET attenuation maps in whole-body PET/MRI. To this end, a variety of atlas-based segmentation strategies commonly used in medical image segmentation and pseudo-CT generation were implemented and evaluated in terms of whole-body bone segmentation accuracy. Bone segmentation was performed on 23 whole-body CT/MR image pairs via a leave-one-out cross-validation procedure. The evaluated segmentation techniques include: (i) intensity averaging (IA), (ii) majority voting (MV), (iii) global and (iv) local (voxel-wise) weighting atlas fusion frameworks implemented utilizing normalized mutual information (NMI), normalized cross-correlation (NCC) and mean square distance (MSD) as image similarity measures for calculating the weighting factors, along with other atlas-dependent algorithms, such as (v) shape-based averaging (SBA) and (vi) Hofmann's pseudo-CT generation method. The performance evaluation of the different segmentation techniques was carried out in terms of estimating bone extraction accuracy from whole-body MRI using standard metrics, such as Dice similarity (DSC) and relative volume difference (RVD), considering bony structures obtained from intensity thresholding of the reference CT images as the ground truth. Considering the Dice criterion, global weighting atlas fusion methods provided moderate improvement of whole-body bone segmentation (DSC = 0.65 ± 0.05) compared to non-weighted IA (DSC = 0.60 ± 0.02). The local weighted atlas fusion approach using the MSD similarity measure outperformed the other strategies by achieving a DSC of 0.81 ± 0.03, while using the NCC and NMI measures resulted in a DSC of 0.78 ± 0.05 and 0.75 ± 0.04, respectively. Despite very long computation time, the extracted bone obtained from both SBA (DSC = 0.56 ± 0.05) and Hofmann's methods (DSC = 0.60 ± 0.02) exhibited no improvement compared to non-weighted IA. Finding the optimum parameters for implementation of the atlas fusion approach, such as weighting factors and image similarity patch size, has a great impact on the performance of atlas-based segmentation approaches. The voxel-wise atlas fusion approach exhibited excellent performance in terms of cancelling out the non-systematic registration errors, leading to accurate and reliable segmentation results. Denoising and normalization of MR images together with optimization of the involved parameters play a key role in improving bone extraction accuracy.

∗ Corresponding author. Fax: +41 22 372 7169. E-mail address: [email protected] (H. Zaidi).

1. Introduction

The emergence of hybrid imaging techniques, such as PET/CT and PET/MRI, in clinical practice engendered a number of new clinical and research opportunities and improved the quantitative accuracy and diagnostic confidence of PET findings (Judenhofer et al., 2008). A number of active research groups are focusing their efforts on addressing the challenges of combined PET/MRI, encompassing instrumentation developments, optimization of workflow and data acquisition protocols and the improvement of the quantitative performance of both imaging modalities (Zaidi and Del Guerra, 2011). Besides the precious anatomical information provided by CT or MRI, additional information that can be extracted from these images, such as attenuation properties of body tissues and motion information, can be exploited for correction of emission data and quantitative PET image reconstruction. However, MRI-guided attenuation correction in whole-body PET/MRI proved to be a challenging issue and has therefore remained an active and open research question during the last decade (Mehranian et al., 2016).

Commercially available PET/MR scanners employ tissue classification methods, which rely on segmentation of MR images into tissue classes and assigning uniform linear attenuation coefficients to each tissue class (Martinez-Moller et al., 2009; Arabi et al., 2015). The major drawback of such methods, particularly in the context of whole-body imaging, lies in ignoring bones as a separate tissue class. Since bone tissue generates a void signal when using common MR sequences, it is indistinguishable from air. As such, bony structures are commonly replaced by soft tissue in current methods, thus leading to significant underestimation of tracer uptake in the vicinity of bony structures (Bezrukov et al., 2013; Hofmann et al., 2011).

A number of techniques have been proposed to consider bone tissue during attenuation correction (AC) in whole-body PET/MRI. Basically, two categories have emerged: atlas-guided attenuation map generation approaches (Hofmann et al., 2011; Bezrukov et al., 2013; Arabi and Zaidi, 2016a; Marshall et al., 2013; Arabi and Zaidi, 2016b) and emission-based approaches (Rezaei et al., 2012; Mehranian and Zaidi, 2015). Atlas-guided methods primarily rely on prior information provided by registration of an atlas into target image coordinates to allow classification of bone tissues. Direct segmentation of bones from MR images, particularly in whole-body imaging, is a difficult task owing to anatomical complexity, low quality and high noise level of dedicated MR sequences used for the purpose of AC (Hofmann et al., 2008). Atlas-guided segmentation has been successfully applied in various image segmentation tasks using a wide variety of imaging modalities (Lorenzo-Valdés et al., 2004). In principle, each individual atlas image transformed to the coordinates of the target image is regarded as a potential candidate. It has, however, been shown that using the information from multiple atlas images leads to more accurate results (Svarer et al., 2005). The information obtained from several atlas images can be pooled into an average atlas or into a so-called probabilistic atlas (Rohlfing et al., 2001; Svarer et al., 2005). However, there is a trend to take full advantage of the multiple atlas images at hand by exploiting pattern recognition techniques to identify morphologically similar cases in the atlas dataset during the multi-atlas fusion process. This dramatically reduces non-systematic registration errors and improves the accuracy of the segmentation (Artaechevarria et al., 2009).

Various strategies were proposed to incorporate bone tissue in PET/MRI attenuation maps in whole-body imaging (Hofmann et al., 2011; Bezrukov et al., 2015; Ay et al., 2014; Arabi and Zaidi, 2016a; Bezrukov et al., 2013; Marshall et al., 2013; Paulus et al., 2015). In whole-body imaging, almost all proposed methods, except joint attenuation-activity reconstruction techniques, rely on prior knowledge present in atlas images to predict bone from MRI. Moreover, owing to long acquisition time, the application of ultra-short echo time (UTE) (Keereman et al., 2010) or zero echo time (ZTE) (Delso et al., 2015) sequences is still limited to brain imaging (single bed position). Atlas-guided segmentation has been successfully applied in various image segmentation tasks using different imaging modalities, particularly for cases with very low contrast to the surrounding tissues (Lorenzo-Valdés et al., 2004). Atlas-based methods are of special interest since they have so far exhibited superior performance in terms of bone identification (Burgos et al., 2014), particularly in whole-body imaging (Hofmann et al., 2011). Burgos et al. (2014) demonstrated superior performance of atlas-based methods in CT synthesis and PET quantitative accuracy compared to a segmentation method using a UTE MRI sequence in brain imaging. Likewise, Mehranian et al. (2016) demonstrated that atlas-based methods provide the most accurate attenuation maps compared to simultaneous activity-attenuation estimation and a state-of-the-art 3-class segmentation method. In whole-body imaging, Hofmann et al. (2011) proposed an atlas-based method combined with a pattern recognition technique, which resulted in less than 10% uptake error on average, thus outperforming standard segmentation methods. Marshall et al. (2013) evaluated a method for incorporating bony structures into attenuation maps based on a fast atlas-based approach. By including bone, the magnitude of the relative error was reduced to a range acceptable in the clinical setting.

Various atlas-based methods were independently developed and evaluated using different MRI sequences, different atlas datasets in terms of sample size, patient variability, field of view and body region, different MRI quality (noise level or acquisition time) and different evaluation procedures and metrics. Although there is substantial literature reporting promising results achieved by atlas-based methods, the performance of these techniques still requires further investigation based on a common ground. Therefore, a comparison of various atlas-based strategies provides a valuable insight into their application to attenuation correction in PET/MRI.

Since the delineation of bones is the most challenging task in whole-body MRI-guided attenuation map generation, we focused our comparison of the various pseudo-CT generation approaches and atlas-based segmentation methods on the accuracy of the extracted whole-body bone. To this end, we selected and implemented a number of conventional atlas-based segmentation methods, such as majority voting, intensity averaging, and global and local weighting atlas fusion strategies, together with Hofmann's algorithm (proposed for whole-body PET/MR attenuation map generation) and the shape-based averaging (SBA) technique. In addition to the comparison of the different segmentation techniques, our goal is to select the most promising algorithm for attenuation correction in whole-body PET/MRI. The very preliminary results of this work have been previously published (Arabi and Zaidi, 2014). The present article is a substantial extension of the previous work through the implementation and comparison of a larger number of algorithms using a larger database of clinical studies and reporting a more detailed quantitative analysis of the data.

2. Materials and methods

2.1. Atlas-based segmentation

The objective of atlas-based segmentation is to provide labeling of unknown tissue classes on the target image. Consider the segmentation of an image with potentially L different classes belonging to a label set Label = {1, 2, …, L}. In the case of bone segmentation, the number of classes is confined to L = 2, where label 1 stands for background and label 2 represents bony structures.

Here, a set of 3-D MR images $A^{mr}_n$ along with their corresponding aligned CT images $A^{ct}_n$ are considered as atlas images. An atlas-based classifier is defined by a set of atlas images $A^{mr}_n$, n = 1, …, N, and transformation matrices ($M_n$) which map coordinates from the target image T to the atlas images n: $M_n: \mathbb{R}^3 \rightarrow \mathbb{R}^3$. Since bone segmentation can be simply carried out by intensity thresholding of CT images, the $A^{ct}_n$ images act as candidates for tissue labeling of the target MR image T. Applying a given transformation matrix $M_n$ to an atlas image $A^{ct}_n$ yields an estimated segmentation of the target subject, $T_{A_n}$, where the set of segmentation candidates $T_{A_n}$, n = 1, …, N, must be combined to form the final estimated bone segmentation $T_S$. Atlas-based segmentation can be regarded as the classification of X unordered samples where the candidate n assigns x to class l ∈ Label. The outputs of N independent classifiers can be combined to generate a single response of the combination strategy, E(x). The aim of building an ensemble classifier is to achieve a higher probability of correctly classifying the voxels of the image than that obtained by using an individual classifier, by maximizing the probability given all classifier decisions $T_{A_n}$ and a classifier performance model C (Eq. 1) (Rohlfing et al., 2004b).

$$E(x) = \arg\max_{l} P\left(x = l \mid T_{A_1}, \ldots, T_{A_N}, C\right) \tag{1}$$

2.2. PET/CT and PET/MR data acquisition

The study population comprised N = 23 consecutive patients, 15 men and 8 women (mean age ± SD = 60 ± 8 y), referred to our department for MRI of the head and neck, whole-body ¹⁸F-FDG PET/MRI and whole-body ¹⁸F-FDG PET/CT for staging of head and neck malignancies. The study protocol was approved by the institutional ethics committee and all patients gave their informed consent to participate in the study. ¹⁸F-FDG PET/CT scans were performed on a Biograph 64 True Point scanner (Siemens Healthcare, Erlangen, Germany). The CT subsystem consists of a 40-row ceramic detector with 1344 channels per row using adaptive collimation and the z-sharp technique to acquire 64 slices per rotation. After a localization scout scan, an unenhanced CT scan (120 kVp, 180 mAs, 24 × 1.5 collimation, a pitch of 1.2, and 1 s per rotation) was performed for attenuation correction and localization. The typical acquisition time for whole-body CT was less than 10 s. PET/MRI examinations were performed on the Ingenuity TF PET/MR, a sequential system consisting of a whole-body time-of-flight (TOF) GEMINI TF PET and a 3T Achieva TX MRI separated by a distance of 3 m and sharing a common rotating table platform (Zaidi et al., 2011). The gradient strength and the slew rate are 40 mT/m and 100 mT/m/s, respectively. The coils used for MR imaging include a SENSE neurovascular 16-channel coil for head and neck and a quadrature body coil for total body scanning. Whole-body Dixon examinations were performed on the 3T Achieva TX MRI of the Ingenuity TF PET/MR scanner.

The whole-body Dixon 3D volumetric interpolated T1-weighted sequence (Dixon, 1984) was acquired using the following parameters: flip angle 10°, TE₁ 1.1 ms, TE₂ 2.0 ms, TR 3.2 ms, 450 × 354 mm² transverse FOV, 0.85 × 0.85 × 3 mm³ voxel size, and a total acquisition time of 2 min 17 s. Both MRI and CT acquisitions were performed in free shallow breathing. This sequence produced in-phase and opposed-phase images that are then added together to obtain water-only images, and subtracted to get fat-only images. In-phase images were used for the assessment of whole-body bone segmentation.

Due to the temporal separation between MRI and CT acquisitions, in-phase MR images were deformably registered to the corresponding CT images using the Elastix framework based on the ITK library (Klein et al., 2010) using a combination of rigid registration based on maximum mutual information and non-rigid registration as described previously (Akbarzadeh et al., 2013). MRI and CT acquisitions were performed with the same patient positioning to minimize non-rigid deformation. However, in case of alignment errors owing, for instance, to breathing motion, the registration parameters were adjusted to achieve acceptable results. In case of gross registration errors, the studies were excluded.

2.3. Data preprocessing

Clinical whole-body MR images contain a relatively high level of noise and are commonly corrupted by low-frequency bias field and inter-patient intensity inhomogeneity (Lötjönen et al., 2010). As will be described in the following section, bone segmentation procedures entail direct handling of MR image intensity. As such, the presence of the aforementioned sources of intensity variation in MR images might skew bone segmentation accuracy. To overcome these prospective sources of error, in-phase MR images of all patients underwent some pre-processing procedures in the following order:

• Gradient anisotropic diffusion filtering to suppress noise using the following parameters: conductance = 4, iterations = 10 and time step = 0.01. This algorithm smooths regions of an image where the gradient magnitude is relatively small (homogeneous regions) but diffuses little over areas of the image where the gradient magnitude is large (i.e., edges). Therefore, the central regions of objects are smoothed but their edges are blurred to a lower extent.

• N4 bias field correction (Tustison et al., 2010) to remove the magnetic field inhomogeneity effect: B-spline grid resolution = 400, number of iterations = 200 (at each grid resolution), convergence threshold = 0.001, B-spline order = 3, spline distance = 400, number of histograms = 256 and shrink factor = 3.

• Histogram matching (McAuliffe et al., 2001): histogram level = 1024 and match points = 128. In order to get the best result from histogram matching, it is recommended to exclude background air voxels of both reference and target images before processing.

The bone segmentation procedure requires the binary mask of segmented background air to save processing time. To this end, the external body contour was determined by applying a 3D active snake contour algorithm on in-phase MR images (Kass et al., 1988). The segmentation process begins by manual selection of the initial seeds in the background using the ITK-SNAP image processing software (Yushkevich et al., 2006).

2.4. Label fusion strategies

This study contains 23 pairs of co-registered in-phase Dixon MR and CT images. All MR images were processed according to the procedure described in Section 2.3. Using the leave-one-out cross-validation (LOOCV) method, for each subject, images of the remaining N-1 (i.e. 22) patients are non-rigidly warped to the coordinates of the target image. Image registration was carried out using the Elastix package (based on the ITK library) (Klein et al., 2010) through a combination of affine and non-rigid alignment based on the advanced Mattes mutual information as described in previous work (Akbarzadeh et al., 2013). The following parameters were adopted: interpolator: B-spline, optimizer: standard gradient descent, image pyramid schedule: (16 8 4 2 2), grid spacing schedule: (32.0 16.0 8.0 4.0 2.0), maximum number of iterations: (4096 4096 2048 1024 512), number of histogram bins: 32. The transformation matrices obtained from the registration between atlas and target MR images were applied to the corresponding atlas CT images. For each target image, 22 candidate CT images are available from which bone can be segmented by intensity thresholding using a threshold of 180 HU. This work focuses on how well the label fusion strategies can pool the information from 22 segmentation candidates to maximize the final bone extraction accuracy.

In the following sections, we describe in detail label fusion strategies commonly used in atlas-based segmentation.
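The leave-one-out setup and the CT thresholding step described above can be sketched as follows. This is a minimal illustration only: the function names `leave_one_out` and `bone_mask` are hypothetical, flat Python lists stand in for 3-D volumes, and the registration step is omitted entirely.

```python
BONE_THRESHOLD_HU = 180  # CT threshold quoted in the text


def leave_one_out(subject_ids):
    """Yield (target, atlases) pairs so that each subject is the target once."""
    for target in subject_ids:
        atlases = [s for s in subject_ids if s != target]
        yield target, atlases


def bone_mask(ct_voxels, threshold=BONE_THRESHOLD_HU):
    """Binary bone mask: 1 where the CT value exceeds the threshold, else 0."""
    return [1 if hu > threshold else 0 for hu in ct_voxels]


# 23 subjects give 23 folds, each with 22 atlas candidates.
splits = list(leave_one_out(range(23)))
example_mask = bone_mask([-1000, 40, 180, 181, 900])
```

Each fold would then register the 22 atlas MR images to the target and apply `bone_mask` to the warped CT images before label fusion.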

2.4.1. General averaging

A commonly used approach for pseudo-CT generation and segmentation of anatomical structures is to simply calculate the arithmetic average of the aligned atlas images (Rohlfing et al., 2001; Rohlfing et al., 2004a). In our case, general arithmetic averaging is performed by computing the intensity average of N = 22 aligned atlas CT images (Eq. 2). There is no selective or weighting strategy in this approach and all atlas images (regardless of their morphological similarity to the target subject) contribute equally to the bone extraction process.

$$T_{av} = \frac{1}{N}\sum_{n=1}^{N} T_{A_n}, \qquad B_n(x) = \begin{cases} 1, & \text{if } T_{av}(x) > 180 \\ 0, & \text{otherwise} \end{cases} \tag{2}$$


Here $T_{A_n}$ is the nth atlas CT image aligned to the target image T. As mentioned earlier, bone segmentation ($B_n$) can be performed by applying intensity thresholding to the average image, $T_{av}$. Hereafter, we call this approach intensity averaging (IA), meaning bone segmentation is performed after the averaging process.
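Intensity averaging (Eq. 2) can be illustrated with a short sketch. The helper names below are hypothetical, the toy CT values are invented for illustration, and each aligned atlas CT is represented as a flat list of Hounsfield values.

```python
def intensity_average(aligned_cts):
    """Voxel-wise arithmetic mean of the aligned atlas CT images."""
    n = len(aligned_cts)
    return [sum(values) / n for values in zip(*aligned_cts)]


def ia_bone_mask(aligned_cts, threshold=180.0):
    """Threshold the average CT image to obtain the bone mask."""
    return [1 if v > threshold else 0 for v in intensity_average(aligned_cts)]


# Three toy atlases with four voxels each (values in HU).
aligned_cts = [
    [-50, 300, 900, 100],
    [0, 100, 700, 250],
    [20, 200, 800, 150],
]
average_ct = intensity_average(aligned_cts)
ia_mask = ia_bone_mask(aligned_cts)
```

The last voxel averages to roughly 167 HU, so it is labeled as non-bone even though one atlas exceeds the threshold there.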

The same task can be achieved by the well-known majority voting framework where, instead of taking the average intensity of aligned atlas CT images, each CT image is converted to a binary bone mask ($T_{S_n}$) followed by the averaging process. The voxel that the majority of classifiers agree on is labeled as bone (Eq. 3) (Heckemann et al., 2006; Artaechevarria et al., 2009; Yushkevich et al., 2010; Artaechevarria et al., 2008).

$$T_{S_{av}} = \frac{1}{N}\sum_{n=1}^{N} T_{S_n}, \qquad B(x) = \begin{cases} 1, & \text{if } T_{S_{av}}(x) > 0.5 \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

$T_{S_{av}}$ is also called the bone probability map, where values of 1 and 0 indicate that all the atlases unanimously predict bony and non-bony tissues for that voxel, respectively.
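Majority voting (Eq. 3) differs from intensity averaging in that thresholding happens before fusion. The sketch below uses hypothetical helper names and invented toy values; it is meant only to show the order of operations.

```python
def bone_probability_map(aligned_cts, threshold=180.0):
    """Fraction of atlases that label each voxel as bone."""
    masks = [[1 if v > threshold else 0 for v in ct] for ct in aligned_cts]
    n = len(masks)
    return [sum(votes) / n for votes in zip(*masks)]


def majority_vote(aligned_cts, threshold=180.0):
    """Label a voxel as bone when more than half of the atlases agree."""
    return [1 if p > 0.5 else 0 for p in bone_probability_map(aligned_cts, threshold)]


# Three toy atlases with three voxels each (values in HU).
aligned_cts = [
    [1500, 300, -20],
    [100, 200, 10],
    [100, 100, 400],
]
probability = bone_probability_map(aligned_cts)
mv_mask = majority_vote(aligned_cts)
```

In the first voxel a single very dense atlas value would push the intensity average above 180 HU, whereas majority voting counts it as one vote out of three and labels the voxel as non-bone.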

It is hypothesized that the number of atlas images N has a major impact on the accuracy of the extracted bone (Heckemann et al., 2006). To evaluate this feature, bone segmentation was carried out for various numbers of atlases selected randomly among the 22 patient datasets.

Conventional multi-atlas segmentation approaches entail N online registrations between target and atlas images. To reduce the computation time, a number of studies utilized only a single atlas image or template (obtained by averaging the population) to delineate the anatomical structures in the target image after warping the atlas image to the target coordinates (Rohlfing et al., 2004a; Heckemann et al., 2006). Consequently, this approach requires only one online registration, which makes it computationally efficient. The performance evaluation of the single atlas approach is of special interest since it introduces a trade-off between computation time and the quality of the outcome compared to the conventional multi-atlas approach. The single atlas approach, referred to as "single atlas image" in Table 4, was compared to various multi-atlas approaches. To evaluate the accuracy of this approach, an iterative atlas generation framework was utilized via the LOOCV scheme (Rohlfing et al., 2001). In summary, an MR image belonging to the patient with the median body mass index of the population was selected as the initial atlas for atlas space alignment. The initial iteration consists of the registration of the other MR images to the selected atlas using the sequential affine and non-rigid registration procedure described in Section 2.4. At the end of each iteration, the new average atlas is generated and used in the subsequent iteration. Since the template obtained from the previous iteration serves better as a common reference spatial coordinate, after each iteration, the obtained template would be more representative for the target subject. In the present work, we used five iterations and the final transformation field was applied to the corresponding CT images to yield the average CT atlas. In the last step, the average MRI atlas is non-rigidly aligned to the target MRI and bone segmentation is carried out on the warped average CT image. As mentioned earlier, this approach requires only one online registration and the atlas creation is performed offline.

2.4.2. Global weighting

The methods described in Section 2.4.1 do not involve any strategy to detect and consequently discard misregistration errors. Registration errors occur due to local minima, inter-patient anatomical variability and the presence of noise, which might cause gross mismatch in the resulting images (Svarer et al., 2005). One strategy to overcome the misalignment error consists in assigning weights to the atlas images globally (as opposed to a local or voxel-wise approach) on the basis of morphological similarity between target and atlas images. With this approach, aligned atlas images presenting a higher degree of anatomy and pose similarity contribute more effectively to the resulting segmentation (Artaechevarria et al., 2009; Chandra et al., 2012; Ying et al., 2013; Artaechevarria et al., 2008). The first step toward weighted atlas-based segmentation consists in developing a similarity criterion between the target image and aligned atlas images. Normalized mutual information (NMI), normalized cross-correlation (NCC) and mean square distance (MSD) are the most common similarity measures used for implementation of weighted atlas-based segmentation (Yushkevich et al., 2010; Artaechevarria et al., 2008). These similarity measures are briefly described below. Normalized mutual information is defined as:

$$\mathrm{NMI} = \frac{H(T) + H\left(M(A^{mr}_n)\right)}{H\left(T, M(A^{mr}_n)\right)} \tag{4}$$

where H(T) is the entropy of image T and $H(T, M(A^{mr}_n))$ indicates the joint entropy of both images. The entropy of an image can be computed from its histogram h(x) as:

$$H(T) = -\sum_{i=1}^{F} h(c_i)\,\log_2 h(c_i)$$

where F is the number of histogram bins and $c_i$ corresponds to the centroid of the ith histogram bin (Wells et al., 1996).
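A minimal sketch of Eq. 4 is given below, assuming equal-width histogram bins over each image's own intensity range and normalized bin frequencies as h(c_i). The binning scheme and the function names are assumptions made for illustration; the paper does not specify them at this point.

```python
import math
from collections import Counter


def _bin_index(value, lo, hi, bins):
    """Map an intensity to an equal-width bin index in [0, bins - 1]."""
    if hi == lo:
        return 0
    return min(int((value - lo) / (hi - lo) * bins), bins - 1)


def entropy(probabilities):
    """Shannon entropy in bits; empty bins contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def nmi(target, atlas, bins=32):
    """Normalized mutual information (H(T) + H(A)) / H(T, A)."""
    t_lo, t_hi = min(target), max(target)
    a_lo, a_hi = min(atlas), max(atlas)
    t_bins = [_bin_index(v, t_lo, t_hi, bins) for v in target]
    a_bins = [_bin_index(v, a_lo, a_hi, bins) for v in atlas]
    n = len(target)
    h_t = entropy(c / n for c in Counter(t_bins).values())
    h_a = entropy(c / n for c in Counter(a_bins).values())
    h_joint = entropy(c / n for c in Counter(zip(t_bins, a_bins)).values())
    if h_joint == 0:
        raise ValueError("joint entropy is zero (constant images)")
    return (h_t + h_a) / h_joint


image = [0, 10, 20, 30, 40, 50, 60, 70]
nmi_identical = nmi(image, image, bins=4)
nmi_independent = nmi([0, 0, 1, 1], [0, 1, 0, 1], bins=2)
```

With this definition the measure ranges from 1 for statistically independent images to 2 for identical ones.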

The normalized cross-correlation between the two images is defined as:

$$\mathrm{NCC} = \frac{\mathrm{Cov}\left(T, M(A^{mr}_n)\right)}{\sqrt{\mathrm{Var}(T) \cdot \mathrm{Var}\left(M(A^{mr}_n)\right)}} \tag{5}$$

where $\mathrm{Cov}(T, M(A^{mr}_n))$ is the covariance of the images and Var(T) indicates the variance of the image T.

The mean square distance simply measures the squared intensity difference between two images, averaged over the image voxels. Here, we used the following formulation to measure the intensity similarity between the target MR image (T) and the co-registered atlas MR images ($M(A^{mr}_n)$). X denotes the total number of image voxels.

$$\mathrm{MSD} = \frac{1}{X}\sum_{x=0}^{X} \left| T(x) - M\left(A^{mr}_n(x)\right) \right|^2 \tag{6}$$
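The two intensity-based measures can be sketched in a few lines. The NCC below uses the standard form in which the covariance is divided by the square root of the product of the variances; this normalization and the function names are assumptions of the sketch, and images are again flat lists of voxel intensities.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def ncc(target, atlas):
    """Normalized cross-correlation: covariance over the product of standard deviations."""
    mt, ma = _mean(target), _mean(atlas)
    n = len(target)
    cov = sum((t - mt) * (a - ma) for t, a in zip(target, atlas)) / n
    var_t = sum((t - mt) ** 2 for t in target) / n
    var_a = sum((a - ma) ** 2 for a in atlas) / n
    return cov / math.sqrt(var_t * var_a)


def msd(target, atlas):
    """Mean square distance: average squared intensity difference."""
    return sum((t - a) ** 2 for t, a in zip(target, atlas)) / len(target)


target = [1.0, 2.0, 3.0, 4.0]
ncc_scaled = ncc(target, [2.0, 4.0, 6.0, 8.0])
ncc_reversed = ncc(target, [4.0, 3.0, 2.0, 1.0])
msd_scaled = msd(target, [2.0, 4.0, 6.0, 8.0])
```

Note that NCC is insensitive to a linear intensity rescaling of the atlas, whereas MSD is not, and that a lower MSD indicates a better match while higher NCC and NMI values do.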

Previously published works in the realm of multi-atlas based segmentation employed various ways of incorporating weighting factors in either majority voting (MV) or intensity averaging (IA) label fusion schemes. In this work, we examined the three most commonly used schemes for whole-body bone segmentation through global weighting. Each of these schemes can be performed using any of the three similarity criteria introduced above. Ying et al. (2013) exploited the NMI similarity measure to identify similar atlas images via the following equation for the purpose of segmenting the bone elements of hip and femur from MR images.

$$w_n = \frac{SM\left(T, M(A^{mr}_n)\right) - \min_m\left[SM\left(T, M(A^{mr}_m)\right)\right]}{\max_m\left[SM\left(T, M(A^{mr}_m)\right)\right] - \min_m\left[SM\left(T, M(A^{mr}_m)\right)\right]}, \qquad \text{subject to } w_n \geq \text{threshold} \tag{7}$$

Here SM can be any similarity measure criterion (NMI, NCC and MSD) between the target MR image and the transformed atlas images $A^{mr}_n$. The min and max of the SM are calculated among all atlases to normalize the weighting factor w. After obtaining the weighting factor w, the next step is to select the atlases which satisfy the condition w ≥ threshold (0 ≤ threshold ≤ 1), where the threshold is used to discard poorly performing atlases. Therefore, the weighted average of atlases $T_{av}$ can be calculated using the following formulation:

$$T_{av} = \frac{1}{N_r}\sum_{n=1}^{N} w_n \cdot T_{A_n}, \qquad B_n(x) = \begin{cases} 1, & \text{if } T_{av}(x) > 180 \\ 0, & \text{otherwise} \end{cases} \tag{8}$$

$N_r$ is the normalization factor obtained by $N_r = \sum_{n=1}^{N} w_n$. The majority voting scheme (Artaechevarria et al., 2009) can be adapted for this purpose as:

$$T_{S_{av}} = \frac{1}{N_r}\sum_{n=1}^{N} w_n \cdot T_{S_n}, \qquad B_n(x) = \begin{cases} 1, & \text{if } T_{S_{av}}(x) > 0.5 \\ 0, & \text{otherwise} \end{cases} \tag{9}$$

Ying et al. (2013) utilized only the NMI similarity measure along with a fixed threshold of 0.9, while in our work all three similarity criteria and a variable threshold were examined for both the MV and IA schemes in order to determine the optimal threshold value and the most efficient similarity measure.
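The thresholded global weighting of Eqs. 7 to 9 might be sketched as follows. The function names, the toy similarity values and the cut-off of 0.4 are hypothetical; the sketch also assumes a similarity measure for which larger values mean a better match (for MSD the sign would have to be flipped first, which is an assumption not stated in the text).

```python
def minmax_weights(similarities):
    """Rescale similarity values to [0, 1] across the atlas set (Eq. 7)."""
    lo, hi = min(similarities), max(similarities)
    if hi == lo:
        return [1.0] * len(similarities)
    return [(s - lo) / (hi - lo) for s in similarities]


def weighted_average(images, weights, min_weight):
    """Weighted voxel-wise mean over atlases whose weight passes the cut-off."""
    kept = [(w, img) for w, img in zip(weights, images) if w >= min_weight]
    norm = sum(w for w, _ in kept)  # N_r, computed over the selected atlases
    return [sum(w * img[i] for w, img in kept) / norm
            for i in range(len(images[0]))]


similarities = [1.0, 1.5, 1.25]                    # toy values, one per atlas
aligned_cts = [[0, 1000], [300, 100], [0, 100]]    # toy CT values in HU
weights = minmax_weights(similarities)

# Weighted intensity averaging (Eq. 8): fuse CT values, then threshold.
fused_ct = weighted_average(aligned_cts, weights, min_weight=0.4)
ia_mask = [1 if v > 180 else 0 for v in fused_ct]

# Weighted majority voting (Eq. 9): threshold each CT, then fuse the masks.
bone_masks = [[1 if v > 180 else 0 for v in ct] for ct in aligned_cts]
fused_prob = weighted_average(bone_masks, weights, min_weight=0.4)
mv_mask = [1 if p > 0.5 else 0 for p in fused_prob]
```

The least similar atlas receives a weight of zero and is discarded, so its 1000 HU value in the second voxel no longer influences the result. The cut-off must not exceed 1, otherwise no atlas is selected.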

The second approach to incorporate the similarity weights in the atlas fusion process is through a gain exponent. In this case, the weighting factor is defined as $w = SM(T, M(A^{mr}_n))^P$, where the gain exponent P might be increased if the similarity measure is not sensitive enough to provide appropriate differences between weights (Artaechevarria et al., 2009). The weighting factor w can be incorporated either in Eq. (8) or Eq. (9). Presently, our aim is to find the optimum value of the gain exponent P for the three similarity criteria via the IA and MV schemes.
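The effect of the gain exponent can be seen on a toy example. The similarity values and the helper names below are invented for illustration; the point is only that raising P spreads out weights that are otherwise nearly equal.

```python
def gain_weights(similarities, gain):
    """Raise each similarity value to the power `gain` (w = SM ** P)."""
    return [s ** gain for s in similarities]


def share_of_best(weights):
    """Fraction of the total weight carried by the best-matching atlas."""
    return max(weights) / sum(weights)


similarities = [0.90, 0.95, 1.00]  # toy similarity values, one per atlas
share_p1 = share_of_best(gain_weights(similarities, 1))
share_p20 = share_of_best(gain_weights(similarities, 20))
```

With P = 1 the best atlas carries about a third of the total weight, which is close to plain averaging; with P = 20 it carries roughly two thirds.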

The third atlas weighting scheme is based on the work of Yushkevich et al. (2010), which assumed that the range of similarity measures can vary quite dramatically between subjects and locations. The same scheme was used by Burgos et al. (2013) for pseudo-CT generation in the head region. As such, a ranking scheme is proposed whereby the similarity measure value for each transformed atlas is ranked across all atlases. Let us suppose that ranked atlases are denoted as $R_n$. The conversion to the weight is performed by applying an exponential decay function.

$$w_n = e^{-a R_n} \tag{10}$$

where $R_n$ denotes the ranked atlas indices (e.g. 1, 2, 3, …) and a is a weighting parameter to be optimized. By adopting the ranking scheme, the training subject that best matches the target subject is given a weight of 1. The training subject with the second best match is assigned a weight $e^{-a}$ and so on. Thus, the segmentation can be performed by applying the weighting factor $w_n$ to Eqs. (8) and (9). Here, the ranking process was repeated three times using the NMI, NCC and MSD similarity measures and for each one the optimum parameter a, which maximized the accuracy of the segmented bone, was determined.
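A rank-based weighting of this kind can be sketched as below. The function name is hypothetical, and the rank is counted from zero here so that the best match receives a weight of exactly 1 and the second best a weight of e^(-a), as stated in the text; that offset is an assumption of the sketch. Higher similarity is taken to mean a better match.

```python
import math


def rank_weights(similarities, a):
    """Exponentially decaying weights based on the similarity rank of each atlas."""
    order = sorted(range(len(similarities)),
                   key=lambda i: similarities[i], reverse=True)
    weights = [0.0] * len(similarities)
    for rank, atlas_index in enumerate(order):  # rank 0 is the best match
        weights[atlas_index] = math.exp(-a * rank)
    return weights


rank_w = rank_weights([0.7, 0.9, 0.8], a=0.5)
```

Because only the ordering matters, the resulting weights are unaffected by how widely the raw similarity values happen to be spread for a given subject.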

In some studies, the most similar subject is selected for either MRI segmentation or attenuation map generation in PET/MRI to reduce the computation time (Rohlfing et al., 2004a; Marshall et al., 2013). The most similar atlas can be determined before the registration process on the basis of meta-data and image processing features (Marshall et al., 2013). In our work, the most similar subject to the target image was determined after the registration process using the three aforementioned similarity measures and the extracted bone was validated for each one.

2.4.3. STAPLE

A well-established approach aiming at maximizing multi-atlas based segmentation accuracy is Simultaneous Truth and Performance Level Estimation (STAPLE) (Warfield et al., 2004). A number of studies using multi-atlas based segmentation employed the STAPLE algorithm to find the optimal combination of segmentations suggested by the different classifiers (Artaechevarria et al., 2009; Artaechevarria et al., 2008). STAPLE is an expectation-maximization algorithm that considers a collection of segmentations and computes a probabilistic estimate of the true segmentation and a measure of the performance level represented by each segmentation. The source of each segmentation in the collection may be an appropriately trained human rater (or raters), or an automated segmentation algorithm, such as registered atlas classifiers. The probabilistic estimate of the true segmentation is formed by estimating an optimal combination of the segmentations, weighting each segmentation depending upon the estimated performance level, and incorporating a prior model of the spatial distribution of structures being segmented as well as spatial homogeneity constraints (Warfield et al., 2004). The STAPLE algorithm estimates a ground truth bone map from given bone atlas binary maps ($T_{S_n}$). Let $\theta_n$ be a matrix where each element describes the probability that atlas n labels a voxel as bone (b) when the true label is s ($\theta_n(b,s)$). The perfect atlas will have a probability matrix ($\theta_n$) equal to the identity matrix. Let $\theta = [\theta_1 \ldots \theta_N]$ be the unknown set of all probability matrices characterizing all atlas images (N), $B = [B_1 \ldots B_N]$ be a vector representing the unknown ground truth bone label map and D be a V × N matrix (V is the number of image voxels) whose columns indicate the N unknown segmentations. STAPLE estimates the ground truth bone segmentation (B) as well as the parameter matrix ($\theta$) by maximizing the log likelihood $f(D, B \mid \theta)$ using the expectation maximization algorithm (Warfield et al., 2004).
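To make the expectation-maximization idea concrete, the sketch below implements a simplified binary STAPLE. It is not the implementation used in the study: it assumes two labels only, a single global prior taken from the average bone fraction, a fixed number of iterations, and no spatial homogeneity constraint. Each atlas is summarized by a sensitivity and a specificity, which are the diagonal entries of its 2 × 2 matrix θ_n. The function name and toy data are hypothetical.

```python
def staple_binary(decisions, n_iter=50, init=0.9):
    """Simplified binary STAPLE.

    decisions: list of N binary masks (lists of 0/1), each with V voxels.
    Returns (bone probability per voxel, sensitivities, specificities).
    """
    n, v = len(decisions), len(decisions[0])
    prior = sum(map(sum, decisions)) / (n * v)  # global prior for bone
    sens = [init] * n
    spec = [init] * n
    prob = [0.0] * v
    for _ in range(n_iter):
        # E-step: posterior probability that each voxel is truly bone.
        for i in range(v):
            p_bone, p_bg = prior, 1.0 - prior
            for j in range(n):
                if decisions[j][i]:
                    p_bone *= sens[j]
                    p_bg *= 1.0 - spec[j]
                else:
                    p_bone *= 1.0 - sens[j]
                    p_bg *= spec[j]
            total = p_bone + p_bg
            prob[i] = p_bone / total if total > 0 else 0.0
        # M-step: re-estimate the performance of each atlas.
        sum_bone = sum(prob)
        sum_bg = v - sum_bone
        for j in range(n):
            if sum_bone > 0:
                sens[j] = sum(p for p, d in zip(prob, decisions[j]) if d) / sum_bone
            if sum_bg > 0:
                spec[j] = sum(1.0 - p for p, d in zip(prob, decisions[j]) if not d) / sum_bg
    return prob, sens, spec


truth = [1, 1, 1, 0, 0, 0, 0, 0]
noisy = [1, 0, 1, 0, 1, 0, 0, 1]          # toy atlas that disagrees on three voxels
prob, sens, spec = staple_binary([truth, truth, noisy])
staple_mask = [1 if p > 0.5 else 0 for p in prob]
```

In this toy case the two consistent atlases are assigned near-perfect performance while the noisy atlas is down-weighted, so the fused mask follows the consistent pair.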

Since the implementation of the STAPLE algorithm is not very straightforward and is computationally demanding, Martin-Fernandez et al. (2005) proposed Williams' index, whereby the classifiers are assigned weights based on mutual similarity with other classifiers and the general consensus agreed on by all classifiers. Williams' index is defined as:

$$I_n = \frac{(N-2)\sum_{i \neq n}^{N} a\left(T_{A_n}, T_{A_i}\right)}{2\sum_{i \neq n}^{N}\sum_{k \neq n}^{i} a\left(T_{A_i}, T_{A_k}\right)} \tag{11}$$

where N is the number of classifiers or atlases, $T_{A_n}$ denotes the segmented bone provided by the nth atlas and $a(T_{A_n}, T_{A_i})$ is the agreement between the classifiers $T_{A_n}$ and $T_{A_i}$ over all image voxels. Various agreement measures can be used; a few of them will be defined in Section 2.5. We used the Dice similarity coefficient (Dice, 1945) for this purpose. In case the atlas n generates an index ($I_n$) greater than one, it can be concluded that the performance of this atlas coincides with the majority of the other atlases. Therefore, this index can be used to select effective atlases (Williams, 1976). The evaluation performed in Martin-Fernandez et al. (2005) demonstrates that the outputs of STAPLE analysis and Williams' index are similar. In this work, we implemented both algorithms to compare their performance in terms of segmentation accuracy.
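Williams' index with Dice agreement can be sketched as follows. The function names and toy masks are hypothetical, and the double sum in the denominator is read here as running over each unordered pair of the remaining atlases once, which is an interpretation of the index rather than something the text spells out. At least three atlases are needed.

```python
from itertools import combinations


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    overlap = sum(a * b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 2.0 * overlap / total if total else 1.0


def williams_index(masks, n, agreement=dice):
    """Agreement of atlas n with the others, relative to their mutual agreement."""
    others = [i for i in range(len(masks)) if i != n]
    numerator = (len(masks) - 2) * sum(agreement(masks[n], masks[i]) for i in others)
    denominator = 2.0 * sum(agreement(masks[i], masks[k])
                            for i, k in combinations(others, 2))
    return numerator / denominator


consistent = [1, 1, 0, 0]
outlier = [0, 0, 1, 1]
identical_masks = [consistent] * 4
mixed_masks = [consistent, consistent, consistent, outlier]

index_identical = williams_index(identical_masks, 0)
index_consistent = williams_index(mixed_masks, 0)
index_outlier = williams_index(mixed_masks, 3)
```

When all atlases agree the index equals 1; an atlas that agrees with the majority better than the remaining atlases agree among themselves scores above 1, and an outlier scores below 1.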

2.4.4. Local weighting

The voxel-wise weighting procedure is carried out similarly to the global weighting scheme, except that the similarity measure between the target image and the transformed atlas is obtained independently for each voxel within its surrounding image patch ($D$). The same image similarity criteria (NMI, NCC and MSD) used in global weighting are utilized here, except that the searching window parameter $D$ (patch size) introduced above needs to be optimized. As such, the NMI similarity measure between the target MR image $T$ and the $n$th transformed atlas image $M(A^{mr}_n)$ for voxel $x$ considering its $D$ neighborhood is defined as:

$$\mathrm{NMI}_D(x) = \frac{H_D(T) + H_D(M(A^{mr}_n))}{H_D(T, M(A^{mr}_n))} \qquad (12)$$
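Eq. (12) can be evaluated for one voxel from the intensities inside its patch. The sketch below estimates the marginal and joint entropies from a histogram; the number of bins, the intensity range and the value returned for a constant patch are assumptions made for illustration, and `local_nmi` is a hypothetical name.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (natural log) of a histogram given as counts."""
    return -sum(c / total * math.log(c / total) for c in counts if c > 0)


def local_nmi(t_patch, m_patch, n_bins=8, lo=0.0, hi=256.0):
    """NMI between two patches given as equal-length intensity lists."""
    def bin_of(v):
        b = int((v - lo) / (hi - lo) * n_bins)
        return min(max(b, 0), n_bins - 1)

    tb = [bin_of(v) for v in t_patch]
    mb = [bin_of(v) for v in m_patch]
    n = len(tb)
    h_t = entropy(Counter(tb).values(), n)
    h_m = entropy(Counter(mb).values(), n)
    h_tm = entropy(Counter(zip(tb, mb)).values(), n)
    # Convention (assumption): two constant patches carry no information.
    return (h_t + h_m) / h_tm if h_tm > 0 else 1.0
```

With this definition the value ranges from 1 (statistically independent patches) to 2 (one patch fully determines the other).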

The fast convolution-based approach proposed by Cachier et al. (2003) is used to compute the local normalized cross-correlation (LNCC):

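$$\mathrm{LNCC}_D(x) = \frac{\langle T, M(A^{mr}_n)\rangle_x}{\sigma(T)_x \cdot \sigma(M(A^{mr}_n))_x}$$

where

$$\sigma(T)_x = \sqrt{\overline{T^2}_x - \overline{T}_x^{\,2}}, \qquad \overline{T}_x = K_G * T_x, \qquad \langle T, M(A^{mr}_n)\rangle_x = \overline{T \cdot M(A^{mr}_n)}_x - \overline{T}_x \cdot \overline{M(A^{mr}_n)}_x \qquad (13)$$

where $K_G$ and $*$ denote the Gaussian kernel and the convolution operator, respectively. A Gaussian kernel with standard deviation equal to 3 voxels (4 mm) was adopted in this study. The MSD image similarity over the image patch $D$ is defined as: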

$$\mathrm{MSD}_D(x) = \frac{1}{|D|}\sum_{x \in D} \left| T(x) - M(A^{mr}_n(x)) \right|^2 \qquad (14)$$
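Eqs. (13) and (14) can be illustrated on one-dimensional signals, where the Gaussian-weighted local means are obtained by convolution. This is a simplified sketch rather than the three-dimensional implementation used in this work: edge replication at the boundaries, the kernel radius of three standard deviations and the small constant `eps` guarding against division by zero are all assumptions, and the function names are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at 3 sigma (assumption)."""
    radius = int(3 * sigma)
    k = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]


def smooth(signal, kernel):
    """Convolve with edge replication (boundary handling is an assumption)."""
    r = len(kernel) // 2
    n = len(signal)
    return [sum(kernel[j + r] * signal[min(max(i + j, 0), n - 1)]
                for j in range(-r, r + 1)) for i in range(n)]


def lncc(t, m, sigma=3.0, eps=1e-12):
    """Local normalized cross-correlation at every sample, as in Eq. (13)."""
    k = gaussian_kernel(sigma)
    t_bar, m_bar = smooth(t, k), smooth(m, k)
    tt = smooth([a * a for a in t], k)
    mm = smooth([b * b for b in m], k)
    tm = smooth([a * b for a, b in zip(t, m)], k)
    out = []
    for i in range(len(t)):
        cov = tm[i] - t_bar[i] * m_bar[i]
        var_t = max(tt[i] - t_bar[i] ** 2, 0.0)
        var_m = max(mm[i] - m_bar[i] ** 2, 0.0)
        out.append(cov / (math.sqrt(var_t * var_m) + eps))
    return out


def local_msd(t, m, radius):
    """Mean square distance over a window of the given radius, as in Eq. (14)."""
    n = len(t)
    out = []
    for i in range(n):
        idx = range(max(i - radius, 0), min(i + radius, n - 1) + 1)
        out.append(sum((t[j] - m[j]) ** 2 for j in idx) / len(idx))
    return out
```

LNCC is insensitive to a linear intensity rescaling of one image, whereas MSD compares absolute intensities, which is worth keeping in mind when both are applied to MR images.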

Voxel-wise weighting atlas fusion using the gain exponent was used in Artaechevarria et al. (2009) for brain MR image segmentation. The gain exponent is used to boost the sensitivity of the similarity measure across the atlas dataset. The weighting factor has the form $w_n(x)_D = SM_D(T, M(A^{mr}_n))_x^{P}$, where $SM_D$ could be any of the image similarity criteria (NMI, LNCC and MSD) calculated over the block $D$ centered at voxel $x$. The obtained weighting factor could be incorporated in the IA or MV schemes as:

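$$T_{av}(x)_D = \frac{1}{N_r}\sum_{n=1}^{N} w_n(x)_D \cdot T_{A_n}(x), \qquad B_n(x) = \begin{cases} 1, & \text{if } T_{av}(x)_D > 180 \\ 0, & \text{otherwise} \end{cases} \qquad (15)$$

$$T_{S_{av}}(x)_D = \frac{1}{N_r}\sum_{n=1}^{N} w_n(x)_D \cdot T_{S_n}(x), \qquad B_n(x) = \begin{cases} 1, & \text{if } T_{S_{av}}(x)_D > 0.5 \\ 0, & \text{otherwise} \end{cases}, \qquad N_r = \sum_{n=1}^{N} w_n(x)_D \qquad (16)$$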
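The effect of the gain exponent in Eqs. (15) and (16) can be seen on a single voxel. The sketch below assumes non-negative similarity values for which larger means more similar (a distance such as MSD would first need to be converted, which is not specified here); the function names are hypothetical and the thresholds are those appearing in the equations.

```python
def gain_weights(similarities, P):
    """Voxel-wise weights w_n = SM^P for one voxel, one value per atlas."""
    return [s ** P for s in similarities]


def fuse_intensity(weights, atlas_values, threshold=180.0):
    """Weighted intensity averaging followed by thresholding, as in Eq. (15)."""
    n_r = sum(weights)
    t_av = sum(w * v for w, v in zip(weights, atlas_values)) / n_r
    return t_av, int(t_av > threshold)


def fuse_labels(weights, atlas_labels, threshold=0.5):
    """Weighted majority voting on binary bone labels, as in Eq. (16)."""
    n_r = sum(weights)
    ts_av = sum(w * l for w, l in zip(weights, atlas_labels)) / n_r
    return ts_av, int(ts_av > threshold)
```

For example, with similarities 0.9, 0.5 and 0.5 and labels 1, 0 and 0, the vote is lost at P = 1 but won at P = 2, because the exponent amplifies the lead of the most similar atlas.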

The second approach for calculating the weighting factors is similar to the ranking scheme described in Section 2.4.2. The only difference is that the ranking step must be performed for each image voxel (considering the surrounding voxels in the window $D$) rather than the entire atlas image across the whole dataset. After calculating the voxel-wise ranking vector $R(x)_D$ on the basis of image similarity criteria, the weighting factor is obtained via:

$$w_n(x)_D = e^{-a R(x)_D} \qquad (17)$$

Again, this weighting factor can be substituted into either Eq. (15) or (16) to perform the final segmentation step. The same local weighting atlas fusion strategy was exploited by Burgos et al. for attenuation map synthesis in brain PET/MRI (Burgos et al., 2013, 2014).

Another strategy for utilizing voxel-wise similarity considers only the information of the most similar voxel. To this end, after computing the voxel-wise ranking vector $R(x)_D$, only the intensity information (or segmentation label) of the foremost voxel is assigned to the final segmented image. From now on, this is referred to as the most similar voxel (MSV).
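The ranking-based weights of Eq. (17) and the MSV rule can be sketched for a single voxel as follows. The sketch assumes that a larger similarity value means a better match and that the best atlas receives rank 0; starting the ranks at 1 instead would only rescale all weights by the same factor, which cancels in the normalization by $N_r$. The function names are hypothetical.

```python
import math


def rank_vector(similarities):
    """Rank of each atlas at one voxel; rank 0 is the most similar atlas."""
    order = sorted(range(len(similarities)), key=lambda n: -similarities[n])
    ranks = [0] * len(similarities)
    for r, n in enumerate(order):
        ranks[n] = r
    return ranks


def ranking_weights(similarities, a):
    """Weights w_n = exp(-a * R_n), as in Eq. (17)."""
    return [math.exp(-a * r) for r in rank_vector(similarities)]


def most_similar_voxel(similarities, atlas_values):
    """MSV: keep only the value (or label) of the top-ranked atlas."""
    return atlas_values[rank_vector(similarities).index(0)]
```

MSV can be viewed as the limit of the ranking scheme for large $a$, where the weight of every atlas but the best one vanishes.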

In this section, three voxel-wise atlas fusion schemes were introduced with the aim to seek the optimal value of their free parameters, namely $P$, $a$ and $D$. To fulfill this endeavor, we first calculated the image similarity measure between the target and any of the atlas images using the NMI, LNCC and MSD formulae for a searching window $D = 10$ mm (in each direction $x$, $y$ and $z$). Then, at a fixed value of $D$, the optimal values of the parameters $P$ and $a$ were determined. In the next step, the obtained optimal values of the parameters $P$ and $a$ were kept fixed to find the optimal value of $D$.
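The two-stage parameter search described above amounts to a coordinate-wise optimization. The sketch below is only schematic: `score` is a hypothetical callable returning a quality measure (for instance the mean DSC over the leave-one-out experiments) for a given weighting parameter ($P$ or $a$) and window size $D$, and the grids are placeholders.

```python
def two_stage_search(score, param_grid, d_grid, d_init=10):
    """Optimize the weighting parameter at a fixed D, then D at that parameter."""
    # Stage 1: best weighting parameter (P or a) at the initial window size.
    best_param = max(param_grid, key=lambda v: score(v, d_init))
    # Stage 2: best window size with the weighting parameter kept fixed.
    best_d = max(d_grid, key=lambda d: score(best_param, d))
    return best_param, best_d
```

Such a search is far cheaper than a full grid search, at the cost of possibly missing the joint optimum when the two parameters interact strongly.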

2.4.5. Hofmann's approach

Hofmann et al. (2011) proposed an approach for generating whole-body pseudo-CT images from MRI. This method relies on a combination of atlas registration and pattern recognition via Gaussian process regression (GPR) (Hofmann et al., 2008). The atlas registration process might fail to match the target patient perfectly because of local minima of the non-rigid deformation energy function. To alleviate the adverse effect of the local signal mismatch, the nearby texture information of a given voxel was fed into a GPR via the patch of surrounding voxels to predict more accurate pseudo-CT values. To this end, a set of MRI/CT pairs are non-rigidly aligned to the target MR image and then the GPR kernel is formed using the local image patches on the target and atlas images. In addition, 5-class segmentation (background air, lung, fat, fat & non-fat mixture and non-fat tissue) is performed on in-phase MR images (Bezrukov et al., 2013; Hofmann et al., 2011) and the corresponding patch information is fed into the GPR kernel through Eq. (18).

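$$k(d_i, d_j) = \exp\left(-\frac{\left\|W(P_{MR,i}) - W(P_{MR,j})\right\|^2}{2\sigma_{MR,patch}^2}\right) \times \exp\left(-\frac{\left\|X_i - X_j\right\|^2}{2\sigma_{pos}^2}\right) \times \exp\left(-\frac{\left\|W(P_{Seg,i}) - W(P_{Seg,j})\right\|^2}{2\sigma_{Seg,patch}^2}\right) \qquad (18)$$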

where $d = (P_{MR}, P_{Seg}, X)$, while $P_{MR}$ and $P_{Seg}$ are sub-volume patches from the in-phase MR image and the 5-class segmented MR image, respectively. $W$ is a weighting vector with higher values for central voxels in the patch relative to surrounding voxels, and $X$ is the training and test patch center position. The parameters $\sigma_{pos}$, $\sigma_{MR,patch}$ and $\sigma_{Seg,patch}$ determine how the overall kernel value is influenced by similarity in position and patch intensity value in the MRI and 5-class segmentation images. The training is performed on the samples $d$ drawn from random locations in the atlas database. Finally, Eq. (19) is used to calculate the pseudo-CT value of a given voxel.

$$c_l = k_l^{T} C^{-1} y \qquad (19)$$

where $c_l$ denotes the calculated pseudo-CT value of the voxel of interest $l$. $k_l = k(d_i, d_l)$ stands for an ($n \times 1$) vector, where $d_i = (P_{MR,i}, P_{Seg,i}, X_i)$ is the information extracted from the patches of the MRI atlases and $d_l = (P_{MR,l}, P_{Seg,l}, X_l)$ indicates the information obtained from the patches of the target MRI. $C = k(d_i, d_j)$ represents the covariance matrix ($n \times n$) obtained from Eq. (18) using the $d_i$ and $d_j$ patches on the MRI atlases. $y$ is an ($n \times 1$) vector containing the CT values corresponding to the central voxel of the training patches $d_i$.
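Eqs. (18) and (19) can be put together in a small sketch. It is an illustration under several assumptions and not Hofmann's implementation: the weighting vector $W$ is applied element-wise to the patch differences, a small `jitter` is added to the diagonal of $C$ to keep the linear system well conditioned, and all function names are hypothetical. Each sample is a tuple `(p_mr, p_seg, x)` of flattened patches and a position.

```python
import math


def sq_dist(u, v, w=None):
    """Squared Euclidean distance with optional element-wise weights."""
    w = w or [1.0] * len(u)
    return sum((wi * (a - b)) ** 2 for wi, a, b in zip(w, u, v))


def kernel(di, dj, s_mr, s_pos, s_seg, w=None):
    """Product of three Gaussian terms, following Eq. (18)."""
    (p_mr_i, p_seg_i, x_i), (p_mr_j, p_seg_j, x_j) = di, dj
    return (math.exp(-sq_dist(p_mr_i, p_mr_j, w) / (2 * s_mr ** 2))
            * math.exp(-sq_dist(x_i, x_j) / (2 * s_pos ** 2))
            * math.exp(-sq_dist(p_seg_i, p_seg_j, w) / (2 * s_seg ** 2)))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z


def predict_ct(train, y, target, s_mr, s_pos, s_seg, jitter=1e-6, w=None):
    """Pseudo-CT value c_l = k_l^T C^{-1} y for one target sample, Eq. (19)."""
    n = len(train)
    C = [[kernel(train[i], train[j], s_mr, s_pos, s_seg, w)
          + (jitter if i == j else 0.0) for j in range(n)] for i in range(n)]
    k_l = [kernel(d, target, s_mr, s_pos, s_seg, w) for d in train]
    alpha = solve(C, y)  # alpha = C^{-1} y
    return sum(k * a for k, a in zip(k_l, alpha))
```

Evaluating the predictor at a training sample returns approximately the CT value of that sample, which is a quick sanity check of the kernel and the solver.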

2.4.6. Shape-based averaging

Shape-based averaging (SBA), categorized as an atlas-based segmentation technique, is a voting scheme where each vote is weighted by the signed Euclidean distance computed for each input label. SBA voting is the only method incorporating spatial information in the label fusion process (Rohlfing and Maurer, 2007). Let $d_n(x)$ denote the signed Euclidean distance of voxel $x$ from the nearest surface voxel with bone label in the $n$th atlas segmentation. A negative value of $d_n(x)$ corresponds to the inside of the bony structure of the $n$th atlas while a positive value implies that $x$ is located outside. A value equal to zero is obtained if and only if voxel $x$ is on the surface of the bony structure. In effect, the signed Euclidean distance provides a probability map for the presence of bone based on every single atlas segmentation. By computing the distance maps of bony structures in all aligned atlas images, the average distance of a given voxel $x$ from the bone surface is obtained from:

$$AD(x) = \frac{1}{N}\sum_{n=1}^{N} d_n(x) \qquad (20)$$

Interested readers are referred to Rohlfing and Maurer Jr (2005) for more details on the implementation of the SBA algorithm.

Table 1

Comparison of validation measures (mean ± SD), including Dice similarity (DSC), relative volume difference (RVD), Jaccard similarity (JC), sensitivity (S) and mean absolute surface distance (MASD), between the bone extracted from different methods of global weighting atlas fusion using the intensity averaging (IA) and majority voting (MV) approaches, together with the optimum weighting parameters (threshold, P and a) for the NMI, NCC and MSD image similarity measures. (∗) indicates P-value < 0.05 according to the paired t-test analysis.

| Similarity measure | Framework | Weighting parameter | DSC | RVD (%) | JC | S | MASD (mm) |
|---|---|---|---|---|---|---|---|
| NMI | IA | threshold = 0.50 | 0.64 ± 0.06 | −36.5 ± 5.6 | 0.47 ± 0.05 | 0.53 ± 0.06 | 6.4 ± 1.5 |
| NMI | MV | threshold = 0.55 | 0.64 ± 0.05 | −40.1 ± 4.8 | 0.47 ± 0.06 | 0.51 ± 0.07 | 6.8 ± 1.7 |
| NMI | IA | P = 5 | 0.63 ± 0.06 | −41.5 ± 5.8 | 0.46 ± 0.05 | 0.50 ± 0.05 | 6.9 ± 1.5 |
| NMI | MV | P = 4 | 0.63 ± 0.07 | −43.1 ± 6.0 | 0.46 ± 0.06 | 0.49 ± 0.06 | 6.9 ± 1.8 |
| NMI | IA | a = 1 | 0.63 ± 0.05 | −41.6 ± 5.7 | 0.46 ± 0.05 | 0.50 ± 0.04 | 7.1 ± 1.6 |
| NMI | MV | a = 1 | 0.63 ± 0.06 | −41.7 ± 5.9 | 0.45 ± 0.06 | 0.49 ± 0.05 | 7.2 ± 1.7 |
| NCC | IA | threshold = 0.75 | 0.64 ± 0.06 | −39.9 ± 5.6 | 0.47 ± 0.06 | 0.51 ± 0.06 | 6.7 ± 1.5 |
| NCC | MV | threshold = 0.8 | 0.64 ± 0.06 | −39.2 ± 5.9 | 0.47 ± 0.07 | 0.51 ± 0.07 | 6.9 ± 1.7 |
| NCC | IA | P = 6 | 0.63 ± 0.05 | −42.0 ± 6.0 | 0.46 ± 0.05 | 0.50 ± 0.05 | 6.9 ± 1.6 |
| NCC | MV | P = 5 | 0.63 ± 0.06 | −43.9 ± 6.0 | 0.45 ± 0.06 | 0.49 ± 0.06 | 7.0 ± 1.7 |
| NCC | IA | a = 1 | 0.62 ± 0.05 | −43.0 ± 6.1 | 0.45 ± 0.05 | 0.49 ± 0.06 | 7.1 ± 1.6 |
| NCC | MV | a = 2 | 0.62 ± 0.05 | −43.0 ± 6.3 | 0.45 ± 0.06 | 0.49 ± 0.07 | 7.1 ± 1.7 |
| MSD | IA | threshold = 0.9 | 0.65 ± 0.05 | −34.0 ± 4.8 | 0.49 ± 0.04 | 0.55 ± 0.04 | 5.7 ± 1.2 |
| MSD | MV | threshold = 0.9 | 0.64 ± 0.05 | −36.9 ± 5.2 | 0.47 ± 0.06 | 0.53 ± 0.06 | 5.9 ± 1.2 |
| MSD | IA | P = 10 | 0.64 ± 0.05 | −37.5 ± 4.0 | 0.47 ± 0.05 | 0.52 ± 0.04 | 5.9 ± 1.3 |
| MSD | MV | P = 10 | 0.64 ± 0.06 | −39.5 ± 4.9 | 0.47 ± 0.06 | 0.52 ± 0.05 | 6.1 ± 1.4 |
| MSD | IA | a = 2 | 0.63 ± 0.05 | −41.0 ± 6.1 | 0.46 ± 0.06 | 0.50 ± 0.06 | 6.9 ± 1.6 |
| MSD | MV | a = 2 | 0.63 ± 0.06 | −41.3 ± 6.2 | 0.46 ± 0.06 | 0.50 ± 0.07 | 7.0 ± 1.7 |

In addition to the spatial weight that is assigned to each voxel using the SBA algorithm on the basis of the Euclidean distance, the local weight corresponding to the image similarity measure can also be incorporated in Eq. (20). Sabuncu et al. (2010) included voxel-wise similarity weighting factors in the SBA algorithm to enhance its performance in the context of brain image segmentation. As an extension to this work, we used identical weighting factors defined in Section 2.4.4 and included them in Eq. (20):

$$AD(x) = \frac{1}{N}\sum_{n=1}^{N} w_n(x)_D \, d_n(x) \qquad (21)$$

Applying the image similarity measure weighting factor to the SBA method introduces the same optimization parameters, namely $P$, $a$ and $D$, for each image similarity criterion (NMI, LNCC and MSD). Since the SBA technique is computationally intensive and time-consuming (Rohlfing and Maurer Jr, 2005), the optimal value of $D$ obtained from the experiments described in Section 2.4.4 was used to optimize the rest of the contributing parameters.
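The averaging of signed distance maps in Eqs. (20) and (21) can be illustrated on one-dimensional label arrays. The sketch makes two assumptions that are not stated in this section: a voxel is labelled as bone when the (weighted) average signed distance is zero or negative, and bone voxels touching the array border are treated as surface voxels. The brute-force distance computation and the function names are for illustration only.

```python
def signed_distance_1d(labels):
    """Signed distance to the bone surface: negative inside, positive outside."""
    n = len(labels)
    surface = [i for i in range(n) if labels[i] and
               (i == 0 or i == n - 1 or not labels[i - 1] or not labels[i + 1])]
    if not surface:
        raise ValueError("segmentation contains no bone voxel")
    out = []
    for i in range(n):
        d = min(abs(i - s) for s in surface)
        out.append(-d if labels[i] else d)
    return out


def sba_fuse(atlas_labels, weights=None):
    """Average the signed distance maps and label non-positive voxels as bone.

    weights, if given, holds one list of voxel-wise weights per atlas and
    turns Eq. (20) into Eq. (21).
    """
    N = len(atlas_labels)
    dist = [signed_distance_1d(l) for l in atlas_labels]
    V = len(dist[0])
    if weights is None:
        weights = [[1.0] * V for _ in range(N)]
    ad = [sum(weights[n][x] * dist[n][x] for n in range(N)) / N
          for x in range(V)]
    return ad, [int(v <= 0) for v in ad]
```

Unlike majority voting, a voxel lying deep inside the bone of one atlas can outweigh several atlases that place it just outside the surface.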

2.5. Evaluation metrics

The evaluation of the accuracy of extracted bone using the various atlas-based segmentation strategies described in Section 2.4 was carried out by comparing the segmentation output to the bone segmented on the corresponding reference CT images using five volume/distance-based measures: Dice similarity (DSC) (Dice, 1945), relative volume difference (RVD) (Uh et al., 2014), Jaccard similarity (JC) (Uh et al., 2014), sensitivity (S) (Ying et al., 2013) and mean absolute surface distance (MASD) (Heckemann et al., 2006).

$$\mathrm{DSC}(A, M) = \frac{2\,|A \cap M|}{|A| + |M|}, \qquad \mathrm{RVD}(A, M) = 100 \times \frac{|A| - |M|}{|M|},$$

$$\mathrm{JC}(A, M) = \frac{|A \cap M|}{|A \cup M|}, \qquad S(A, M) = \frac{|A \cap M|}{|M|},$$

$$\mathrm{MASD}(A, M) = \frac{d_{ave}(S_A, S_M) + d_{ave}(S_M, S_A)}{2}$$

where $A$ is the segmented bone from the reference CT image and $M$ denotes the bone extracted by the atlas-based segmentation technique. $d_{ave}(S_A, S_M)$ is the average direct surface distance from all points on the reference bone surface $S_A$ to the segmented bone surface $S_M$.
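The five measures can be written down directly from the formulas above. In this sketch the bone masks `A` and `M` are sets of voxel coordinate tuples and the surfaces are sets of surface point coordinates, assumed to be expressed in millimetres already; the function names are hypothetical and the brute-force nearest-point search is only suitable for small examples.

```python
import math


def dsc(A, M):
    """Dice similarity of two voxel sets."""
    return 2.0 * len(A & M) / (len(A) + len(M))


def rvd(A, M):
    """Relative volume difference in percent, as written in the text."""
    return 100.0 * (len(A) - len(M)) / len(M)


def jc(A, M):
    """Jaccard similarity of two voxel sets."""
    return len(A & M) / len(A | M)


def sensitivity(A, M):
    """Sensitivity, as written in the text."""
    return len(A & M) / len(M)


def d_ave(S_from, S_to):
    """Mean distance from each point of S_from to its nearest point of S_to."""
    return sum(min(math.dist(p, q) for q in S_to) for p in S_from) / len(S_from)


def masd(S_A, S_M):
    """Mean absolute surface distance between two surfaces."""
    return 0.5 * (d_ave(S_A, S_M) + d_ave(S_M, S_A))
```

For two 2 × 2 squares that overlap by half, the overlap measures are DSC = 0.5 and JC = 1/3, while RVD is zero because both volumes are equal, which shows why volume and overlap measures are reported together.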

The Shapiro-Wilk test was used to examine the null hypothesis that the calculated evaluation metrics follow a normally distributed population and the calculated p-values were reported for each individual segmentation scheme. The differences were considered statistically significant if the p-value was less than 0.05.

3. Results

Whole-body bone segmentation through non-weighting averaging was performed for a varying number of atlases selected randomly from the entire dataset. Fig. 1 illustrates the accuracy of extracted bone in terms of the DSC and RVD validation measures using both IA and MV. The bars show the standard deviation at each measured point.

Fig. 2 depicts the accuracy of extracted bone using the weights defined in Eq. (7) for varying threshold levels. The top and bottom rows depict the results obtained using the IA and MV frameworks, respectively, for the NMI, NCC and MSD image similarity measures. A similar analysis was repeated to obtain the optimal values of the parameters $P$ and $a$ (Table 1). The comparison was made using the five validation measures described in Section 2.5.

Fig. 3 depicts the accuracy of extracted bone based on the DSC and RVD validation measures calculated at different values of $P$ and $a$ for the NMI, LNCC and MSD similarity criteria using a searching window of $D = 10$ mm. The results illustrated in Fig. 3 are obtained using the IA framework.

The best result at $D = 10$ mm is achieved by the MSD similarity measure with $P = 3.5$ using the IA framework, yielding a DSC of 0.75, thus demonstrating significant improvement compared to the global weighting strategy (DSC = 0.65). After determining the optimal values of $P$ and $a$, these parameters were kept fixed and the optimum size of the searching window $D$ was calculated. Fig. 4 depicts the impact of varying the size of the searching window $D$ on the accuracy of extracted bone for different image similarity criteria. The top row corresponds to the ranking scheme obtained from Eq. (17) at $a = 1$ whereas the bottom row corresponds to the MSV approach using the IA framework. Table 2 summarizes voxel-wise atlas fusion results together with optimal parameter values. The best results were achieved when applying the voxel-wise weighting ranking scheme (using $a = 1$) and the MSV approach (using $D = 5$ and the MSD similarity measure) with a DSC = 0.81 (Table 2).

Fig. 1. DSC (top) and RVD (bottom) similarity measures vs. the number of subjects using the intensity averaging and majority voting frameworks.

Although the SBA method was the most time-consuming approach among those studied in this work, this technique exhibited poor performance without local weighting (Table 3). However, incorporating voxel-wise weighting improved the DSC from 0.56 to 0.76. Fig. 5 illustrates the performance of SBA at varying values of $a$ obtained using different image similarity criteria.

A comparison of the performance of the various segmentation techniques is provided in Table 4. The techniques incorporating optimization parameters are reported at their optimal values. Figs. 6–8 illustrate a representative slice of segmented bone from a whole-body MR image together with the corresponding error distance map using a combination of methods presented in Table 4.

4. Discussion

Bone segmentation from whole-body MR images proved to be a challenging task. We investigated the accuracy of a number of atlas-guided segmentation approaches. Our primary motivation for conducting this work is to identify the most promising algorithms for atlas-guided attenuation correction in PET/MRI. Since the identification and segmentation of bony structures is a key step in MRI-guided attenuation map generation, particularly in whole-body imaging, we focused our evaluation on metrics reflecting the accuracy of bone extraction among the various approaches.

Fig. 2. Plots of DSC and RVD vs. global atlas weighting parameter measured using NCC, MSD and NMI similarity criteria for intensity averaging (top row) and majority voting (bottom row) frameworks.

Fig. 3. The effect of varying voxel-wise label weighting parameters (P and a) on DSC and RVD validation measures obtained from the IA segmentation framework using LNCC, MSD and NMI similarity criteria for D = 10 mm.

A commonly used approach to combine the information provided by deformed atlas images is through IA or MV label fusion schemes (Chakravarty et al., 2013). In theory, in multiple atlas segmentation, increasing the number of input atlases would improve the outcome. As such, the quality of segmentation is expected to improve monotonically by adding more atlases. However, in practice at a certain number of input atlases, the improvement reaches a peak (at a number of 14 in Fig. 1). The rising part of the DSC plot (from 1 to 14 subjects) can be justified by the nonsystematic misalignment cancelation due to uncorrelated error between atlases (Artaechevarria et al., 2009; Heckemann et al., 2006). By increasing the number of atlases beyond the peak, the resulting segmentations tend to approach the population mean and the segmentation accuracy will reach an asymptotic value. An overly increased number of atlases would degrade the segmentation accuracy because of the high level of smoothness and the lack of patient-specific details. Assuming that input atlases are of similar quality and are selected randomly, adding more atlases after reaching the peak (here