
Groups of adjacent contour segments for object detection


HAL Id: hal-00203719

https://hal.archives-ouvertes.fr/hal-00203719

Preprint submitted on 21 Jan 2008


Vittorio Ferrari, Loic Fevrier, Cordelia Schmid, Frédéric Jurie

To cite this version:

Vittorio Ferrari, Loic Fevrier, Cordelia Schmid, Frédéric Jurie. Groups of adjacent contour segments for object detection. 2008. ⟨hal-00203719⟩


ISRNINRIA/RR--5980--FR+ENG


Theme COG: Cognitive systems

Project LEAR

Research report no. 5980, September 2006, 31 pages

Abstract: We present a family of scale-invariant local shape features formed by chains of k connected, roughly straight contour segments (kAS), and their use for object class detection. kAS are able to cleanly encode pure fragments of an object boundary, without including nearby clutter. Moreover, they offer an attractive compromise between information content and repeatability, and encompass a wide variety of local shape structures. We also define a translation and scale invariant descriptor encoding the geometric configuration of the segments within a kAS, making kAS easy to reuse in other frameworks, for example as a replacement or addition to interest points.

We demonstrate the high performance of kAS within a simple but powerful sliding-window object detection scheme. Through extensive evaluations, involving eight diverse object classes and more than 1400 images, we 1) study the evolution of performance as the degree of feature complexity k varies and determine the best degree; 2) show that kAS substantially outperform interest points for detecting shape-based classes; 3) compare our object detector to the recent, state-of-the-art system by Dalal and Triggs [4].

Key-words: local features, shape descriptors, object detection

A software implementation is available at lear.inrialpes.fr/software

This research was supported by the EADS foundation, INRIA and CNRS. Vittorio Ferrari was funded by a postdoctoral fellowship of the EADS foundation.


1 Introduction

In the last few years, the problem of recognizing object classes has received growing attention, in both variants of whole image classification [3, 5, 10, 14, 15] and object localization [2, 4, 16, 31]. The majority of existing methods use local image patches as basic features. While these work well for some object classes, such as motorbikes and cars, other classes are defined by their shape, and are therefore better represented by contour features (e.g. horses or mugs). In spite of their substantial scope, only comparatively few works [2, 13, 22, 29] have tackled the class-level localization problem using contour features.

In this paper we present a family of local contour features, and their application for detecting and localizing objects. These features are small groups of connected, approximately straight contour segments, called k adjacent segments, or kAS. The segments in a kAS form a path of length k through a network of contour segments covering the image [9]. Essentially, two segments are connected in the network if they are adjacent on the same edgel-chain, or if one is at the end of an edgel-chain directed towards the other segment (section 3). The larger the number k of segments in a kAS, the more complex the local shape structures it can capture. 1AS are just individual segments, while 2AS include L shapes, and 3AS can form C, F and Z shapes (figures 2, 3). Along with the kAS features, we propose a low dimensional, translation+scale invariant descriptor designed to encode the geometric properties of the segments composing a kAS, and their relative locations.

kAS have several attractive properties. First, as both kAS and their descriptors cover solely short chains of connected segments, they have the ability to cover pure portions of an object boundary, without including clutter edges which very often lie in the vicinity. Second, for a sensible range of k, kAS have intermediate complexity, which makes them repeatably detectable while being informative at the same time. Third, connectedness is a natural grouping criterion to form kAS. It avoids the need for defining a 'grouping scale' or a 'grouping neighborhood' for a segment, and effectively constrains the features to be chains of segments, which are more likely to lie entirely on a boundary. Finally, kAS are complete local invariant features: each has a well defined location and scale, an invariant descriptor, and is detected based only on local properties of a single image. Hence, they can be reused effortlessly in a variety of recognition and image matching frameworks as a replacement or addition to interest points (such as [2, 5, 10, 16, 30]).

We demonstrate the power and flexibility of kAS within an object detection framework which brings together several successful ideas presented before. Following the 'bag of features' paradigm [3, 14, 34], we construct a codebook of kAS types, each capturing a different kind of local shape structure (figures 2 and 3). An image window is subdivided into tiles [4, 15] and each is described by a separate bag of kAS. In this fashion the window representation is composed of several bags of kAS spatially localized within the window. Adding this layer of spatial organization improves the discriminative power compared to a standard orderless bag of features over the entire window. We first train a classifier from example object and background windows, and then localize previously unseen instances in test images via a multi-scale sliding-window mechanism [4, 31] coupled with the classifier. Our [...] histogram [24], which is a recently developed data structure supporting the rapid computation of multidimensional histograms.
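The tiled window representation described above can be sketched as follows. This is a minimal illustration, not the report's implementation: the function name `tiled_bags`, the feature tuples `(x, y, type_index)`, and the hard assignment of each feature to a single codebook type and a single tile are all assumptions made for the example.

```python
def tiled_bags(features, window, grid, n_types):
    """Concatenate one bag-of-features histogram per tile of a window.

    features: iterable of (x, y, type_index), where type_index is the
              (assumed, hard-assigned) codebook entry of the feature.
    window:   (x0, y0, width, height) in image coordinates.
    grid:     (rows, cols) tiling of the window.
    """
    x0, y0, w, h = window
    rows, cols = grid
    hist = [[0] * n_types for _ in range(rows * cols)]
    for x, y, t in features:
        # Ignore features whose location falls outside the window.
        if not (x0 <= x < x0 + w and y0 <= y < y0 + h):
            continue
        col = min(int((x - x0) * cols / w), cols - 1)
        row = min(int((y - y0) * rows / h), rows - 1)
        hist[row * cols + col][t] += 1
    # The window descriptor is the concatenation of the per-tile bags.
    return [count for tile in hist for count in tile]


feats = [(10, 10, 0), (60, 10, 1), (60, 70, 2), (60, 80, 2), (150, 10, 0)]
descriptor = tiled_bags(feats, window=(0, 0, 100, 100), grid=(2, 2), n_types=3)
```

With a 1 x 1 grid the function reduces to a standard orderless bag of features over the whole window, which is the baseline the text compares against.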

During an extensive evaluation, involving eight diverse object classes and over 1400 images (section 6), we study several aspects of kAS. First, we analyze the object detection performance while varying k, thereby shedding light on the relation between repeatability and informativeness as k increases. Second, for each k, we vary the resolution of the window tiling, allowing us to observe the trade-off between adding localization information and reducing tolerance to spatial variations within the class. Interestingly, we find the optimal window tiling to relate to the complexity of the features (k), with simpler features preferring finer tiling. Moreover, we thoroughly compare the performance of kAS against interest points, and against the state-of-the-art object detection technique by Dalal and Triggs [4]. Their work is particularly relevant because it follows a similar detection framework (sliding-windows, tiles), but it applies different descriptors to the window tiles (simpler histograms of gradient orientations). Finally, we experiment with the application of kAS with different k at the same time, and with the combination of interest points and kAS.

2 Related works

In the following we first review object detection techniques based on contour features, for which kAS offer an alternative, and then present works on the perceptual grouping of contours, upon which kAS build.

Contour features for object class detection. Selinger and Nelson [27] detect keycurves: long segments of an edgel-chain comprised between two high curvature points. A keycurve's size and orientation define a square image patch, which is then described using all edgels falling within it. These edge patches attempt to strike a winning trade-off: be local, and hence bring robustness to occlusion and clutter, while also being complex enough to be distinctive to some degree, enabling the matching of individual features and opening the door to computationally efficient indexing schemes. However, these patches are likely to include clutter edgels lying near the object boundary, which corrupt their descriptors and make them difficult to put in correspondence.

Selinger and Nelson's recognition system was demonstrated in controlled laboratory conditions, with clean images containing modest amounts of clutter, and mostly on the task of recognizing specific objects. Jurie and Schmid [13] were among the first to propose local contour features for the detection of object classes, and to test their system on real, cluttered images. Their scale-invariant feature detector responds to circular arcs of edgels, which are described by the spatial distribution of points in a thin annular neighborhood of the circle. This attempts to exclude clutter from the descriptor by avoiding encoding points inside the circle. As one limitation, circular arcs only cover a fairly restricted class of shapes.

In their very recent works, Shotton et al. [29] and Opelt et al. [22] independently propose to construct contour fragments tailored to a specific class. The idea is to explicitly construct [...] ones. Both works employ boosting to select fragments from a large pool of candidates, but differ in the way these candidates are constructed (random rectangles sampled from training segmentation masks in [29], whereas [22] grows fragments starting from random contour points, and optimizes their length so as to maximize Chamfer matching score and accuracy of object centroid prediction in validation images). Although they can be more discriminative for the learned class, fragments of this kind are harder to reuse within other recognition or image matching frameworks than generic features, which depend only on local properties of individual images. Moreover, the fragments of [29] are not scale invariant and need segmented training images to be produced, which further limits their applicability.

Berg et al. [2] offer an alternative view on contour-based object recognition, casting the problem as deformable shape matching. Instead of counting on sophisticated local features, they simply take individual edgels (with a Geometric Blur neighborhood descriptor), and put them in correspondence between pairs of images with a powerful non-rigid point matching algorithm based on Integer Quadratic Programming. The method obtains impressive results on the challenging Caltech 101 database. One disadvantage is that it reduces recognition to matching pairs of training and test images, and doesn't infer from the training images a single model summarizing common properties shared by different instances of the class. Besides, it would be interesting to inject kAS in their framework, as a replacement for individual edgels, and observe whether this would lead to improved performance.

Dalal and Triggs [4] considerably advanced the state-of-the-art in human detection, by designing the Histogram of Oriented Gradients (HoG) descriptor, and carefully optimizing it over a large dataset containing thousands of humans in unconstrained poses. In their recognition framework image windows are subdivided into tiles and each one is described by a HoG. A simple sliding-window mechanism then allows objects to be localized. Photometric normalization within multiple overlapping blocks of tiles makes the method particularly robust to lighting variations. Notice that HoG descriptors are only defined within a given subwindow; they don't have a concept of location and scale. Hence, they need to be associated to some external feature detector before being applicable within frameworks not based on sliding-windows.

Perceptual grouping. Perceptual grouping of contours has a long history in computer vision [6, 12, 17, 18, 25, 26, 28, 32]. The crucial idea behind these works is that pieces of contour related by some perceptually salient property are more likely to belong to the same object. The perceptual properties exploited include convexity [12], co-circularity [32], connectedness [26, 28], parallelism [18], and proximity [18].

One major area of application for perceptual grouping is image segmentation, in which the task is to group together all elements belonging to individual, unspecified objects [6, 12, 32]. Moreover, perceptual grouping played an important role in the recognition of specific objects under varying viewpoint, particularly in the 80s and 90s. The focus was mainly on planar objects [26] and polyhedra [11, 18].

The kAS features are motivated by the same general intuitions as earlier perceptual grouping works, and are most related to the ideas of Rothwell [25, 26], who advocated for the importance of connectedness and topological relations. We believe that connectedness is a fundamental, powerful driving force which is currently still underexploited in computer vision. In this paper, connectedness is brought to the domain of object class detection, and is exploited to define modern local invariant features: image elements with a well defined location, a scale and an invariant descriptor, ready to be used in many recent matching and recognition schemes.

3 k adjacent segments (kAS)

3.1 Contour Segment Network

We summarize here the technique of [9] to build the contour segment network (CSN) of the image, on which we will detect our kAS features.

Edgels are detected by the excellent Berkeley natural boundary detector [20], and then chained. The resulting edgel-chains are linked at their discontinuities, i.e. two edgel-chains $c_1$ and $c_2$ are linked if $c_2$ passes near an endpoint of $c_1$, and if the ending of $c_1$ is directed towards $c_2$ (figure 1a). Informally, if $c_1$ were extended a bit, it would meet $c_2$. These links are useful in two ways: they record that a contour might continue over the gap between two edgel-chains, and they allow junctions to be captured (L-junctions, T-junctions, and higher order junctions involving several edgel-chains).
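The informal linking criterion ("if $c_1$ were extended a bit, it would meet $c_2$") can be illustrated with a simple geometric test. This is only a sketch under stated assumptions: the exact rule is the one of [9], which is not reproduced here, and the function name `extension_meets`, the polyline representation of a chain, and the `reach` parameter (how far the ending is extended) are hypothetical.

```python
import math


def extension_meets(end_pt, direction, chain, reach):
    """Return True if extending an edgel-chain ending by `reach` hits `chain`.

    end_pt:    (x, y) endpoint of the first chain (c1).
    direction: (dx, dy) direction in which c1 ends (need not be unit length).
    chain:     list of (x, y) points of the other chain (c2), as a polyline.
    reach:     assumed maximum extension length, in pixels.
    """
    px, py = end_pt
    norm = math.hypot(direction[0], direction[1])
    dx, dy = direction[0] / norm * reach, direction[1] / norm * reach
    for (ax, ay), (bx, by) in zip(chain, chain[1:]):
        ex, ey = bx - ax, by - ay
        denom = dx * ey - dy * ex
        if abs(denom) < 1e-12:
            continue  # extension is parallel to this piece of c2
        # Solve end_pt + t * (dx, dy) == (ax, ay) + u * (ex, ey).
        t = ((ax - px) * ey - (ay - py) * ex) / denom
        u = ((ax - px) * dy - (ay - py) * dx) / denom
        if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
            return True
    return False


c2 = [(3, -5), (3, 5)]
linked = extension_meets((0, 0), (1, 0), c2, reach=5)
```

A chain ending that points away from `c2`, or that is too far from it for the chosen `reach`, is not linked under this test.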

The edgel-chains are partitioned into roughly straight contour segments. The idea is to organize these segments in a network, by connecting them along the edgel-chains, and across their links (figure 1a). Since every edgel-chain can be linked to several others, the CSN is a complex branching structure. Intuitively, two segments are connected if the edgels provide evidence that they might be adjacent along some object contour, even when they are physically separated by a (small) gap, or when forming a junction. The key property of the CSN is to include paths going along the contours of the imaged objects [9], which motivates kAS features.

3.2 Detecting kAS

The principal contribution of this paper is to propose a family of local features: paths of length k through the CSN. More formally, a group of k segments is a kAS iff they can be ordered so that the i-th segment is connected in the CSN to the (i+1)-th one, for $i \in \{1, \ldots, k-1\}$. Hence we call them k adjacent segments, and refer to their length k as degree. As k grows, kAS can form more and more complex local shape structures: individual segments for k = 1; L shapes and 2-segment T shapes for k = 2; C, Y, F, Z shapes, 3-segment T shapes, and triangles for k = 3 (figures 2, 3). The dimensionality of kAS descriptors also grows with k (next section), and we treat kAS of different degrees as different feature types, all united in one family by a shared crucial property: to be sequences of connected segments.

Connectedness provides a natural criterion for grouping segments into kAS. It avoids arbitrary definitions of the neighborhood of a segment, and constrains kAS to be chains of segments. Compared to the broader class of groups of 'nearby' segments, they have higher chances to lie entirely on a portion of the object boundary. The features of [13] instead include disconnected sets of edgels which happen to be located along part of a circle. Besides, the keycurves of [27] are based on individual edgel-chains, and hence are less robustly detected in real images than kAS, which bridge gaps between edgel-chains.

Figure 1: a) Three edgel-chains, with five segments and their inter-connections (arrows) in the network. b) Two detected 2AS (B, C) and (D, E). The order of each segment in the descriptor is marked next to it. Notice that (A, B), (A, C), (C, E) are also detected, though not displayed because they overlap with (B, C) and (D, E). c) A 3AS (C, A, E). d) A 4AS (E, B, C, D). e) $r_i$ vectors involved in the descriptor for the 4AS in d).

kAS can be detected by a depth-first search started from every segment, followed by the elimination of equivalent paths (two different paths involving the same segments constitute the same kAS). This is computationally cheap for the small values of k corresponding to local features (about k ≤ 5). We disregard higher values of k because they result in large scale structures, too specific to a particular image or object instance, and in an excessive number of detected features (several thousands already for k = 5). More precisely, the number of kAS in an image containing n segments grows quickly with k, as can be understood by the following observations. On average, each segment is connected to two to three others, because T and higher order junctions occur less frequently than simple 1-to-1 connections. As a consequence, as k grows, the number of paths of length k passing through a given branching point increases quickly. In practice, while the average number of 2AS is only about 1.5n, the number of 3AS is 4n, that of 4AS is 10n, and there are more than 20n 5AS!
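The detection procedure just described (depth-first search from every segment, then merging of paths that involve the same segments) can be sketched as below. The function name `detect_kas` and the dictionary-of-sets representation of the CSN are assumptions for illustration. The example network is also an assumption: its connections are read off the five 2AS listed in the caption of figure 1, namely (A, B), (A, C), (B, C), (C, E) and (D, E).

```python
def detect_kas(adjacency, k):
    """Enumerate kAS as sets of k segments forming a path in the network.

    adjacency: dict mapping each segment id to the set of segments it is
               connected to in the contour segment network.
    Returns a set of frozensets; paths over the same segments are merged.
    """
    found = set()

    def extend(path):
        if len(path) == k:
            found.add(frozenset(path))  # equivalent paths collapse here
            return
        for nxt in adjacency[path[-1]]:
            if nxt not in path:
                extend(path + [nxt])

    for seg in adjacency:  # depth-first search started from every segment
        extend([seg])
    return found


# Assumed toy network, taken from the 2AS named in the figure 1 caption.
csn = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "E"},
    "D": {"E"},
    "E": {"C", "D"},
}
counts = {k: len(detect_kas(csn, k)) for k in range(1, 6)}
```

On this small network the 3AS {A, C, E} and the 4AS {B, C, D, E} shown in figure 1 are among the detected groups. The search visits every simple path of k segments, which is why it stays cheap only for small k.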


As k increases, features increase in complexity. On the one hand, they become more and more informative, while on the other they gradually get less and less repeatable across different images and object instances. Additionally, the number of non-boundary features (or mixed features covering partly boundary and partly clutter) also grows with k, actually faster than pure boundary ones, leaving a lower signal-to-noise ratio. Hence, for rather low values of k, kAS have an attractive intermediate complexity, offering a convenient compromise: simple enough to be detected repeatably, yet complex enough to capture informative local object structures. In section 6, we confirm these intuitions experimentally, and determine that 2AS perform best.

3.3 Describing kAS

In order to compare different kAS, we need a numerical descriptor. As a first step, it is important to order the kAS segments $\{s_i\}_{i=1..k}$ in a repeatable manner, so that similar kAS have the same order. We select as first segment the one with midpoint closest to the centroid of all midpoints $\{m_i = (x_i, y_i)\}_{i=1..k}$ (when several segments have similar distances to the centroid, we pick the first one according to the order defined below). As we will see in the descriptor below, this centermost segment is the natural choice as reference point for measuring the relative location of the other segments. The remaining segments take up positions 2 through k, and are ordered from left to right, according to their midpoint. If two segments $s_i$, $s_j$ have similar x coordinate, i.e. $|x_i - x_j| \le 0.2 \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$, then they are ordered from top to bottom. Note that this order is stable, as no two segments can have similar location in both x and y. Example orderings can be seen in figure 1b-d.

Once the order is established, a kAS is a list $P = (s_1, s_2, \ldots, s_k)$ of segments. Let $r_i = (r_i^x, r_i^y)$ be the vector going from the midpoint of $s_1$ to the midpoint of $s_i$. Furthermore, let $\theta_i$ and $l_i = \|s_i\|$ be the orientation and length of $s_i$. The descriptor of $P$ is composed of $4k - 2$ values¹ (figure 1e):

$$\left( \frac{r_2^x}{N_d}, \frac{r_2^y}{N_d}, \ldots, \frac{r_k^x}{N_d}, \frac{r_k^y}{N_d}, \theta_1, \ldots, \theta_k, \frac{l_1}{N_d}, \ldots, \frac{l_k}{N_d} \right) \qquad (1)$$

The distance $N_d$ between the two farthest midpoints is used as normalization factor, making the descriptor scale-invariant (hence, both the kAS features and their descriptors are scale-invariant). Since segment lengths are known to be often inaccurate, and each is based only on part of the kAS, the distance between the farthest midpoints is a better choice for a reliable estimate of the kAS scale. In addition to a kAS scale, we also define its location to be the geometric center of the midpoints of its segments. Exact definitions of scale and location are useful when using kAS in higher level algorithms, such as in our sliding-window object detection scheme (next sections).
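The ordering rule and descriptor (1) can be put together in a short sketch. Several choices here are assumptions rather than details taken from the report: segments are given as pairs of endpoints, image coordinates have y growing downwards (so "top" means smaller y), orientations are taken modulo π, and `tie_tol` is a hypothetical tolerance standing in for the unspecified notion of "similar distances" to the centroid.

```python
import math
from functools import cmp_to_key
from itertools import combinations


def midpoint(seg):
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def order_segments(segs, tie_tol=1e-6):
    """Centermost segment first, the rest left to right (top to bottom on ties)."""
    mids = [midpoint(s) for s in segs]

    def compare(i, j):
        dx, dy = mids[i][0] - mids[j][0], mids[i][1] - mids[j][1]
        if abs(dx) <= 0.2 * math.hypot(dx, dy):  # similar x coordinate
            return (dy > 0) - (dy < 0)           # top to bottom (y grows down)
        return -1 if dx < 0 else 1               # left to right

    idx = sorted(range(len(segs)), key=cmp_to_key(compare))
    cx = sum(m[0] for m in mids) / len(mids)
    cy = sum(m[1] for m in mids) / len(mids)
    dist = [math.hypot(m[0] - cx, m[1] - cy) for m in mids]
    dmin = min(dist)
    # Among (nearly) equally central segments, take the first in the order.
    first = next(i for i in idx
                 if math.isclose(dist[i], dmin, rel_tol=tie_tol, abs_tol=1e-12))
    return [segs[first]] + [segs[i] for i in idx if i != first]


def kas_descriptor(segs):
    """Return the 4k-2 descriptor values (only theta_1 when k = 1)."""
    ordered = order_segments(segs)
    mids = [midpoint(s) for s in ordered]
    thetas = [math.atan2(y2 - y1, x2 - x1) % math.pi
              for (x1, y1), (x2, y2) in ordered]
    if len(ordered) == 1:
        return [thetas[0]]
    lengths = [math.hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in ordered]
    # N_d: distance between the two farthest midpoints.
    nd = max(math.hypot(a[0] - b[0], a[1] - b[1])
             for a, b in combinations(mids, 2))
    rel = []
    for mx, my in mids[1:]:
        rel += [(mx - mids[0][0]) / nd, (my - mids[0][1]) / nd]
    return rel + thetas + [l / nd for l in lengths]


l_shape = [((0, 0), (10, 0)), ((10, 0), (10, 10))]
d = kas_descriptor(l_shape)
```

Translating the L shape or scaling it uniformly leaves `d` unchanged up to rounding, which is the invariance the normalization by $N_d$ is meant to provide.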

The proposed descriptor considers the segments as completely straight, so as to capture only the relevant information of the geometric configuration they form, and not the varying [...]

¹ The case $k = 1$ is an exception: the descriptor is composed only of $\theta_1$, and the scale of a 1AS is defined as $l_1$.
