HAL Id: hal-00203719
https://hal.archives-ouvertes.fr/hal-00203719
Preprint submitted on 21 Jan 2008
Vittorio Ferrari, Loic Fevrier, Cordelia Schmid, Frédéric Jurie
To cite this version:
Vittorio Ferrari, Loic Fevrier, Cordelia Schmid, Frédéric Jurie. Groups of adjacent contour segments
for object detection. 2008. ⟨hal-00203719⟩
ISRN INRIA/RR--5980--FR+ENG
Thème COG
Groups of Adjacent Contour Segments for Object Detection
Vittorio Ferrari — Loic Fevrier — Frederic Jurie — Cordelia Schmid
N° 5980
September 2006
Thème COG, Systèmes cognitifs (Cognitive Systems)
Projet LEAR
Research Report No. 5980, September 2006, 31 pages
Abstract: We present a family of scale-invariant local shape features formed by chains
of k connected, roughly straight contour segments (kAS), and their use for object class
detection. kAS are able to cleanly encode pure fragments of an object boundary, without
including nearby clutter. Moreover, they offer an attractive compromise between information
content and repeatability, and encompass a wide variety of local shape structures. We also
define a translation and scale invariant descriptor encoding the geometric configuration of
the segments within a kAS, making kAS easy to reuse in other frameworks, for example as a
replacement or addition to interest points.
We demonstrate the high performance of kAS within a simple but powerful sliding-window
object detection scheme. Through extensive evaluations, involving eight diverse
object classes and more than 1400 images, we 1) study the evolution of performance as
the degree of feature complexity k varies and determine the best degree; 2) show that kAS
substantially outperform interest points for detecting shape-based classes; 3) compare our
object detector to the recent, state-of-the-art system by Dalal and Triggs [4].
Key-words: local features, shape descriptors, object detection
A software implementation is available at lear.inrialpes.fr/software
∗ This research was supported by the EADS foundation, INRIA and CNRS. Vittorio Ferrari was funded
by a postdoctoral fellowship of the EADS foundation.
1 Introduction
In the last few years, the problem of recognizing object classes has received growing attention,
in both variants of whole image classification [3, 5, 10, 14, 15], and object localization [2,
4, 16, 31]. The majority of existing methods use local image patches as basic features.
While these work well for some object classes, such as motorbikes and cars, other classes are
defined by their shape, and are therefore better represented by contour features (e.g. horses,
or mugs). In spite of their substantial scope, only comparatively few works [2, 13, 22, 29] have
tackled the class-level localization problem using contour features.
In this paper we present a family of local contour features, and their application for
detecting and localizing objects. These features are small groups of connected, approximately
straight contour segments, called k adjacent segments, or kAS. The segments in a kAS form
a path of length k through a network of contour segments covering the image [9]. Essentially,
two segments are connected in the network if they are adjacent on the same edgel-chain, or
if one is at the end of an edgel-chain directed towards the other segment (section 3). The
larger the number k of segments in a kAS, the more complex the local shape structures
it can capture. 1AS are just individual segments, while 2AS include L shapes, and 3AS
can form C, F and Z shapes (figures 2, 3). Along with the kAS features, we propose a
low dimensional, translation and scale invariant descriptor designed to encode the geometric
properties of the segments composing a kAS, and their relative locations.
kAS have several attractive properties. First, as both kAS and their descriptors cover
solely short chains of connected segments, they have the ability to cover pure portions of
an object boundary, without including clutter edges which very often lie in the vicinity.
Second, for a sensible range of k, kAS have intermediate complexity, which makes them
repeatably detectable while being informative at the same time. Third, connectedness is a
natural grouping criterion to form kAS. It avoids the need for defining a 'grouping scale'
or a 'grouping neighborhood' for a segment, and effectively constrains the features to be
chains of segments, which are more likely to lie entirely on a boundary. Finally, kAS are
complete local invariant features: each has a well defined location and scale, an invariant
descriptor, and is detected based only on local properties of a single image. Hence, they
can be reused effortlessly in a variety of recognition and image matching frameworks as a
replacement or addition to interest points (such as [2, 5, 10, 16, 30]).
We demonstrate the power and flexibility of kAS within an object detection framework
which brings together several successful ideas presented before. Following the 'bag
of features' paradigm [3, 14, 34], we construct a codebook of kAS types, each capturing a
different kind of local shape structure (figures 2 and 3). An image window is subdivided
into tiles [4, 15] and each is described by a separate bag of kAS. In this fashion the window
representation is composed of several bags of kAS spatially localized within the window.
Adding this layer of spatial organization improves the discriminative power compared to a
standard orderless bag of features over the entire window. We first train a classifier from
example object and background windows, and then localize previously unseen instances in test
images via a multi-scale sliding-window mechanism [4, 31] coupled with the classifier. Our
[...] histogram [24], which is a recently developed data structure supporting the rapid
computation of multidimensional histograms.
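The tiled window representation described above can be sketched as follows. This is only an illustration under stated assumptions, not the report's implementation: each feature is assumed to be already hard-assigned to one codebook type and given as a hypothetical `(x, y, type_index)` tuple, and the function name, window format and grid parameter are invented for the example.

```python
def tiled_bag_of_features(features, window, grid, codebook_size):
    """Concatenate one codebook histogram per tile of a window.

    features: iterable of (x, y, type_index) tuples (hypothetical format).
    window:   (x0, y0, width, height) of the image window.
    grid:     (columns, rows) of the tiling.
    """
    x0, y0, width, height = window
    cols, rows = grid
    hist = [0] * (cols * rows * codebook_size)
    for x, y, type_index in features:
        # Ignore features whose location falls outside the window.
        if not (x0 <= x < x0 + width and y0 <= y < y0 + height):
            continue
        col = int((x - x0) * cols / width)
        row = int((y - y0) * rows / height)
        tile = row * cols + col
        # Each tile owns a contiguous block of codebook_size bins.
        hist[tile * codebook_size + type_index] += 1
    return hist
```

With a 1 x 1 grid this reduces to a single orderless bag over the whole window; finer grids add localization at the cost of tolerance to spatial variation.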
During an extensive evaluation, involving eight diverse object classes and over 1400
images (section 6), we study several aspects of kAS. First, we analyze the object detection
performance while varying k, thereby shedding light on the relation between repeatability
and informativeness as k increases. Second, for each k, we vary the resolution of the window
tiling, allowing us to observe the trade-off between adding localization information and reducing
tolerance to spatial variations within the class. Interestingly, we find the optimal window
tiling to relate to the complexity of the features (k), with simpler features preferring finer
tiling. Moreover, we thoroughly compare the performance of kAS against interest points, and
against the state-of-the-art object detection technique by Dalal and Triggs [4]. Their work
is particularly relevant because it follows a similar detection framework (sliding-windows,
tiles), but it applies different descriptors to the window tiles (simpler histograms of gradient
orientations). Finally, we experiment with the application of kAS with different k at the
same time, and with the combination of interest points and kAS.
2 Related works
In the following we first review object detection techniques based on contour features, for
which kAS offer an alternative, and then present works on the perceptual grouping of
contours, upon which kAS build.
Contour features for object class detection. Selinger and Nelson [27] detect key
curves: long segments of an edgel-chain comprised between two high curvature points. A
key curve's size and orientation define a square image patch, which is then described using
all edgels falling within it. These edge patches attempt to strike a winning trade-off: be
local, and hence bring robustness to occlusion and clutter, while also complex enough to
be distinctive to some degree, making it possible to match individual features, and opening the door
to computationally efficient indexing schemes. However, these patches are likely to include
clutter edgels lying near the object boundary, which corrupt their descriptors and make
them difficult to put in correspondence.
Selinger and Nelson's recognition system was demonstrated in controlled laboratory
conditions, with clean images containing modest amounts of clutter, and mostly on the task
of recognizing specific objects. Jurie and Schmid [13] were among the first to propose local
contour features for the detection of object classes, and to test their system on real, cluttered
images. Their scale-invariant feature detector responds to circular arcs of edgels, which are
described by the spatial distribution of points in a thin annular neighborhood of the circle.
This attempts to exclude clutter from the descriptor by avoiding encoding points inside the
circle. As one limitation, circular arcs only cover a fairly restricted class of shapes.
In their very recent works, Shotton et al. [29] and Opelt et al. [22] independently propose
to construct contour fragments tailored to a specific class. The idea is to explicitly construct
[...] ones. Both works employ boosting to select fragments from a large pool of candidates,
but differ in the way these candidates are constructed (random rectangles sampled from
training segmentation masks in [29], whereas [22] grows fragments starting from random
contour points, and optimizes their length so as to maximize Chamfer matching score and
accuracy of object centroid prediction in validation images). Although they can be more
discriminative for the learned class, fragments of this kind are harder to reuse within other
recognition or image matching frameworks than generic features, which depend only on
local properties of individual images. Moreover, the fragments of [29] are not scale invariant
and need segmented training images to be produced, which further limits their applicability.
Berg et al. [2] offer an alternative view on contour-based object recognition, casting the
problem as deformable shape matching. Instead of counting on sophisticated local features,
they simply take individual edgels (with a Geometric Blur neighborhood descriptor), and put
them in correspondence between pairs of images with a powerful non-rigid point matching
algorithm based on Integer Quadratic Programming. The method obtains impressive results
on the challenging Caltech 101 database. One disadvantage is that it reduces recognition to
matching pairs of training and test images, and doesn't infer from the training images a single
model summarizing common properties shared by different instances of the class. Besides,
it would be interesting to inject kAS in their framework, as replacement for individual
edgels, and observe whether this would lead to improved performance.
Dalal and Triggs [4] considerably advanced the state-of-the-art in human detection, by
designing the Histogram of Oriented Gradients (HoG) descriptor, and carefully optimizing
it over a large dataset containing thousands of humans in unconstrained poses. In their
recognition framework image windows are subdivided in tiles and each one is described by a
HoG. A simple sliding-window mechanism then localizes objects. Photometric
normalization within multiple overlapping blocks of tiles makes the method particularly robust
to lighting variations. Notice that HoG descriptors are only defined within a given
subwindow; they don't have a concept of location and scale. Hence, they need to be associated
to some external feature detector before being applicable within frameworks not based on
sliding-windows.
Perceptual grouping. Perceptual grouping of contours has a long history in computer
vision [6, 12, 17, 18, 25, 26, 28, 32]. The crucial idea behind these works is that pieces
of contour related by some perceptually salient property are more likely to belong to the
same object. The perceptual properties exploited include convexity [12], co-circularity [32],
connectedness [26, 28], parallelism [18], and proximity [18].
One major area of application for perceptual grouping is image segmentation, in which
the task is to group together all elements belonging to individual, unspecified objects [6, 12,
32]. Moreover, perceptual grouping played an important role in the recognition of specific
objects under varying viewpoint, particularly in the 80s and 90s. The focus was mainly on
planar objects [26] and polyhedra [11, 18].
The kAS features are motivated by the same general intuitions as earlier perceptual
grouping works, and are most related to the ideas of Rothwell [25, 26], who advocated for the
importance of connectedness and topological relations. We believe that connectedness is
a fundamental, powerful driving force which is currently still underexploited in computer
vision. In this paper, connectedness is brought to the domain of object class detection, and
is exploited to define modern local invariant features: image elements with a well defined
location, a scale and an invariant descriptor, ready to be used in many recent matching and
recognition schemes.
3 k adjacent segments (kAS)
3.1 Contour Segment Network
We summarize here the technique of [9] to build the contour segment network (CSN) of the
image, on which we will detect our kAS features.
Edgels are detected by the excellent Berkeley natural boundary detector [20], and then
chained. The resulting edgel-chains are linked at their discontinuities, i.e. two edgel-chains
c1 and c2 are linked if c2 passes near an endpoint of c1, and if the ending of c1 is directed
towards c2 (figure 1a). Informally, if c1 were extended a bit, it would meet c2. These
links are useful in two ways: they record that a contour might continue over the gap between
two edgel-chains, and they capture junctions (L-junctions, T-junctions, and higher order
junctions involving several edgel-chains).
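One way to make the linking condition concrete is the geometric test sketched below. It is an illustrative assumption, not the procedure of [9]: the function name, the gap and angle thresholds, and the way the end direction of c1 is estimated from its last few edgels are all hypothetical choices. The sketch only examines the last endpoint of c1; the other endpoint can be tested by passing the reversed chain.

```python
import math

def chains_linked(c1, c2, max_gap=5.0, max_angle_deg=30.0, tail=3):
    """Return True if the end of edgel-chain c1 points at a nearby edgel of c2.

    c1, c2: lists of (x, y) edgels in chain order. The thresholds are
    illustrative assumptions, not values taken from the report.
    """
    end = c1[-1]
    start = c1[max(0, len(c1) - 1 - tail)]
    # Direction in which c1 is heading where it ends.
    dx, dy = end[0] - start[0], end[1] - start[1]
    norm = math.hypot(dx, dy)
    if norm == 0:
        return False
    for px, py in c2:
        gx, gy = px - end[0], py - end[1]
        gap = math.hypot(gx, gy)
        if gap == 0:
            return True  # c2 passes through the endpoint itself
        if gap > max_gap:
            continue  # this edgel of c2 is not near the endpoint
        cos_angle = (dx * gx + dy * gy) / (norm * gap)
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
        if angle <= max_angle_deg:
            return True  # extending c1 a little would meet c2 here
    return False
```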
The edgel-chains are partitioned into roughly straight contour segments. The idea is
to organize these segments in a network, by connecting them along the edgel-chains, and
across their links (figure 1a). Since every edgel-chain can be linked to several others, the
CSN is a complex branching structure. Intuitively, two segments are connected if the edgels
provide evidence that they might be adjacent along some object contour, even when they
are physically separated by a (small) gap, or when forming a junction. The key property
of the CSN is to include paths going along the contours of the imaged objects [9], which
motivates kAS features.
3.2 Detecting kAS
The principal contribution of this paper is to propose a family of local features: paths
of length k through the CSN. More formally, a group of k segments is a kAS iff they
can be ordered so that the i-th segment is connected in the CSN to the (i+1)-th one,
for i ∈ {1, ..., k−1}. Hence we call them k adjacent segments, and refer to their length k as
degree. As k grows, kAS can form more and more complex local shape structures: individual
segments for k = 1; L shapes and 2-segment T shapes for k = 2; C, Y, F, Z shapes, 3-segment
T shapes, and triangles for k = 3 (figures 2, 3). The dimensionality of kAS descriptors also
grows with k (next section), and we treat kAS of different degrees as different feature types,
all united in one family by a shared crucial property: to be sequences of connected segments.
Connectedness provides a natural criterion for grouping segments into kAS. It avoids
arbitrary definitions of the neighborhood of a segment, and constrains kAS to be chains
of segments. Compared to the broader class of groups of 'nearby' segments, they have
higher chances to lie entirely on a portion of the object boundary. The features of [13]
instead, include disconnected sets of edgels which happen to be located along part of a
circle. Besides, the key curves of [27] are based on individual edgel-chains, and hence are
less robustly detected in real images than kAS, which bridge gaps between edgel-chains.
Figure 1: a) Three edgel-chains, with five segments and their inter-connections (arrows) in the
network. b) Two detected 2AS (B, C) and (D, E). The order of each segment in the descriptor is
marked next to it. Notice that (A, B), (A, C), (C, E) are also detected, though not displayed because
they overlap with (B, C) and (D, E). c) A 3AS (C, A, E). d) A 4AS (E, B, C, D). e) $r_i$ vectors
involved in the descriptor for the 4AS in d).
kAS can be detected by a depth-first search started from every segment, followed by the
elimination of equivalent paths (two different paths involving the same segments constitute
the same kAS). This is computationally cheap for the small values of k corresponding to local
features (about k ≤ 5). We disregard higher values of k because they result in large scale
structures, too specific to a particular image or object instance, and in an excessive number
of detected features (several thousands already for k = 5). More precisely, the number of
kAS in an image containing n segments grows quickly with k, as can be understood by the
following observations. On average, each segment is connected to two to three others, because
T and higher order junctions occur less frequently than simple 1-to-1 connections. As a
consequence, as k grows, the number of paths of length k passing through a given branching
point increases quickly. In practice, while the average number of 2AS is only about 1.5n,
the number of 3AS is 4n, that of 4AS is 10n, and there are more than 20n 5AS!
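The detection procedure can be sketched as a depth-first enumeration of simple paths, with equivalent paths merged by storing each kAS as the set of its segments. This is a minimal sketch, assuming the network is given as a symmetric adjacency dictionary; the function name and data layout are hypothetical and not taken from the report's software.

```python
def detect_kas(adjacency, k):
    """Enumerate the kAS of a contour segment network.

    adjacency: dict mapping each segment id to the ids it is connected to
    (assumed symmetric). Returns a set of frozensets, one per kAS.
    """
    found = set()

    def extend(path, visited):
        if len(path) == k:
            # Paths over the same segments are equivalent: keep one per set.
            found.add(frozenset(path))
            return
        for nxt in adjacency.get(path[-1], ()):
            if nxt not in visited:
                extend(path + [nxt], visited | {nxt})

    # Start a depth-first search from every segment.
    for start in adjacency:
        extend([start], {start})
    return found
```

On a toy network with connections A-B, A-C, B-C, C-E and D-E, calling `detect_kas(network, 2)` returns one 2AS per connection, and larger k follows longer chains through the same structure.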
As k increases, features increase in complexity. On the one hand, they become more
and more informative, while on the other they gradually get less and less repeatable across
different images and object instances. Additionally, the number of non-boundary features (or
mixed features covering partly boundary and partly clutter) also grows with k, actually faster
than pure boundary ones, leaving a lower signal-to-noise ratio. Hence, for rather low values
of k, kAS have an attractive intermediate complexity, offering a convenient compromise:
simple enough to be detected repeatably, yet complex enough to capture informative local
object structures. In section 6, we confirm these intuitions experimentally, and determine
that 2AS perform best.
3.3 Describing kAS
In order to compare different kAS, we need a numerical descriptor. As a first step, it is
important to order the kAS segments $\{s_i\}_{i=1..k}$ in a repeatable manner, so that similar
kAS have the same order. We select as first segment the one with midpoint closest to the
centroid of all midpoints $\{m_i = (x_i, y_i)\}_{i=1..k}$ (when several segments have similar distances
to the centroid, we pick the first one according to the order defined below). As we will see
in the descriptor below, this centermost segment is the natural choice as reference point
for measuring the relative location of the other segments. The remaining segments take up
positions 2 through k, and are ordered from left to right, according to their midpoint. If two
segments $s_i, s_j$ have similar x coordinate, i.e. $|x_i - x_j| \le 0.2\,\sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$, then
they are ordered from top to bottom. Note that this order is stable, as no two segments can
have similar location in both x and y. Example orderings can be seen in figure 1b-d.
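The ordering rule can be sketched as below. Several details are assumptions made for the example: the function names are hypothetical, image coordinates with y growing downward are assumed (so 'top to bottom' means increasing y), and only exact ties in distance to the centroid are resolved, whereas the text speaks of 'similar' distances without giving a tolerance.

```python
import math
from functools import cmp_to_key

def compare_midpoints(a, b, ratio=0.2):
    """Left-to-right order; top-to-bottom when the x coordinates are similar."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    if abs(dx) <= ratio * math.hypot(dx, dy):
        # Similar x: order by y (assumes image coordinates, y grows downward).
        return (dy > 0) - (dy < 0)
    return (dx > 0) - (dx < 0)

def order_segments(midpoints):
    """Return segment indices: centermost segment first, then the rest in order."""
    k = len(midpoints)
    cx = sum(x for x, _ in midpoints) / k
    cy = sum(y for _, y in midpoints) / k
    by_position = sorted(range(k), key=cmp_to_key(
        lambda i, j: compare_midpoints(midpoints[i], midpoints[j])))
    # min() keeps the earliest candidate on exact ties, i.e. the first in the order.
    first = min(by_position,
                key=lambda i: math.hypot(midpoints[i][0] - cx, midpoints[i][1] - cy))
    return [first] + [i for i in by_position if i != first]
```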
Once the order is established, a kAS is a list $P = (s_1, s_2, \ldots, s_k)$ of segments. Let
$r_i = (r_i^x, r_i^y)$ be the vector going from the midpoint of $s_1$ to the midpoint of $s_i$.
Furthermore, let $\theta_i$, $l_i = \|s_i\|$ be the orientation and length of $s_i$. The descriptor of P is
composed of 4k−2 values¹ (figure 1e):
\[
\left( \frac{r_2^x}{N_d}, \frac{r_2^y}{N_d}, \ldots, \frac{r_k^x}{N_d}, \frac{r_k^y}{N_d},\;
\theta_1, \ldots, \theta_k,\; \frac{l_1}{N_d}, \ldots, \frac{l_k}{N_d} \right) \tag{1}
\]
The distance $N_d$ between the two farthest midpoints is used as normalization factor, making
the descriptor scale-invariant (hence, both the kAS features and their descriptors are
scale-invariant). While segment lengths are known to be often inaccurate, and each is based only
on part of the kAS, the distance between the farthest midpoints makes a better choice for
a reliable estimate of the kAS scale. In addition to a kAS scale, we also define its location
to be the geometric center of the midpoints of its segments. Exact definitions of scale and
location are useful when using kAS in higher level algorithms, such as in our sliding-window
object detection scheme (next sections).
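As a worked illustration of equation (1), the sketch below computes the 4k−2 descriptor values for an already ordered kAS with k ≥ 2. The segment representation (a pair of endpoints), the function name, and the choice to take orientations modulo π (treating segments as undirected) are assumptions made for this example; the text does not specify them.

```python
import math

def kas_descriptor(segments):
    """Descriptor of an ordered kAS (k >= 2).

    segments: list of ((x1, y1), (x2, y2)) endpoint pairs, already ordered
    with the centermost segment first (hypothetical representation).
    """
    k = len(segments)
    if k < 2:
        raise ValueError("the k = 1 case is defined separately")
    mids = [((a[0] + b[0]) / 2, (a[1] + b[1]) / 2) for a, b in segments]
    lengths = [math.dist(a, b) for a, b in segments]
    # Orientation taken modulo pi, since a segment has no direction (assumption).
    thetas = [math.atan2(b[1] - a[1], b[0] - a[0]) % math.pi for a, b in segments]
    # N_d: distance between the two farthest midpoints.
    n_d = max(math.dist(p, q) for p in mids for q in mids)
    if n_d == 0:
        raise ValueError("degenerate kAS: all midpoints coincide")
    desc = []
    for mx, my in mids[1:]:
        # r_i / N_d: midpoint of s_i relative to the midpoint of s_1.
        desc += [(mx - mids[0][0]) / n_d, (my - mids[0][1]) / n_d]
    desc += thetas
    desc += [length / n_d for length in lengths]
    return desc
```

Because every distance is divided by $N_d$ and only relative midpoint positions are used, translating or uniformly scaling the input leaves the output unchanged up to floating-point error.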
The proposed descriptor considers the segments as completely straight, so as to capture
only the relevant information of the geometric configuration they form, and not the varying
¹ The case k = 1 is an exception. The descriptor is composed only of $\theta_1$, and the scale of a 1AS is defined
as $l_1$.