Data-Driven Animation of Crowds


HAL Id: hal-00494248

https://hal.archives-ouvertes.fr/hal-00494248

Submitted on 22 Jun 2010

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.



Nicolas Courty, Thomas Corpetti

To cite this version:

Nicolas Courty, Thomas Corpetti. Data-Driven Animation of Crowds. Proc. of Mirage 2007 - Computer Vision / Computer Graphics Collaboration Techniques and Applications, May 2007, Paris, France. pp.377–388. ⟨hal-00494248⟩


Nicolas Courty¹ and Thomas Corpetti²

¹ Université de Bretagne-Sud, Laboratoire VALORIA, 56000 Vannes Cedex, France
nicolas.courty@univ-ubs.fr

² Université de Haute-Bretagne, Laboratoire COSTEL, 35000 Rennes Cedex, France
thomas.corpetti@uhb.fr

Abstract. In this paper we propose an original method to animate a crowd of virtual beings in a virtual environment. Instead of relying on models to describe the motions of people over time, we suggest using a priori knowledge on the dynamics of the crowd, acquired from videos of real crowd situations. In our method this information is expressed as a time-varying motion field which accounts for a continuous flow of people over time. This motion descriptor is obtained through optical flow estimation with a specific second-order regularization. The obtained motion fields are then used in a classical fixed step size integration scheme that makes it possible to animate a virtual crowd in real time. The power of our technique is demonstrated through various examples, and possible follow-ups to this work are also described.

1 Introduction

Crowds of people exhibit particular and subtle behaviors whose complexity reflects the complex nature of human beings. While computer simulations of such phenomena have made it possible to reproduce particular and singular crowd configurations, none of them has managed to reproduce, within a generic framework, the typical emergent behaviors observed within a crowd with sufficient detail and at a satisfying level. In the context of animation of human-like figures, huge progress has been made with the use of motion capture. It is now possible to use motions acquired from real performers through a variety of editing and warping operations, with substantial benefits in terms of realism in the produced animation. The aim of our technique is to provide such a tool in the context of crowd animation. While other approaches try to track individual pedestrians within the flow of people, our framework is based on the hypothesis that the motions of individuals within the crowd are the expression of a continuous flow that drives the crowd motion. This assumes that the crowd is dense enough for pedestrians to be considered as markers of an underlying flow. In this sense, our method is more closely related to macroscopic simulation models (that try to define an overall structure for the crowd's motions) than to microscopic models (that define the crowd's motions as an emergent behavior of the sum of


in Figure 1. First, images are extracted from a video of a real crowd. From each pair of successive images, a vector field is computed through a motion estimation process. The concatenation of all these vector fields forms a time series which accounts for the displacement of the whole crowd over time. This completes the analysis part. The synthesis of a new crowd animation is done by advecting particles (the pedestrians) along this time-varying flow.

[Figure 1: diagram of the pipeline. Analysis: Crowd Video (frames 0 to n) → Motion Estimation → Vector field time series (fields 0 to n-1). Synthesis: vector field time series and Crowd description (density, start position) → Integration → Crowd Animation.]

Fig. 1. Overview of the whole process

This paper is organized as follows: section 2 reviews the state of the art of the different existing approaches in crowd simulation as well as in motion estimation. Section 3 deals with the estimator used in our methodology, and section 4 presents the integration of the motion descriptor in a crowd animation controller. The last two sections present results obtained with our method, along with a conclusion and perspectives for our work.

2 State of the art

The idea of using videos as an input to an animation system is not new, and has already been successfully used in the context of, for instance, facial animation [9], character animation from cartoons [8] or animal gaits [14]. Recent works showed examples of re-synthesis of fluid flows from real video examples [3]. Simulating crowds from real videos falls into this challenging category of methods. First, it is interesting to understand the limitations of crowd simulation models (first part of this section). We then introduce some general issues about the motion estimation problem.

2.1 Crowd simulation

Crowd behavior and motion of virtual people have been studied and modeled


of buildings and open spaces. The state of the art in human crowd behavioral modelling is large and can be classified into two main approaches: microscopic and macroscopic models. The models belonging to the first category describe the time-space behavior of individual pedestrians, whereas those of the second category describe the emergent properties of the crowd.

Microscopic simulation. The simplest models of microscopic simulation are based on cellular automata [6, 5]. The social force model was first introduced by Helbing [15]. It consists in expressing the motion of each pedestrian as the result of a combination of social forces that repel pedestrians from, or attract them toward, each other. It has been shown that this model generates realistic phenomena such as arc formations at exits or evacuation times that increase with the desired velocities. It has been extended to account for individualities [7] or the presence of toxic gases in the environment [12]. More complex models consider each member of the crowd as an autonomous pedestrian endowed with perceptive and cognitive abilities [22, 24, 23]. Those models exhibit a variety of results depending on the quality of the behavior design.

Macroscopic models. Modelling a crowd composed of discrete individuals may lead to incorrect emergent global behaviors. These difficulties may be avoided by using a continuum formulation [18, 26]. Equations using the concepts of fluid mechanics have been derived in order to model human crowds with such an approach. Those approaches rely on the assumption that the characteristic distance scale between individuals is much smaller than the characteristic distance scale of the region in which the individuals move [18]. Hence the density of the crowd has to be taken into account for those models to be pertinent. Finally, several hypotheses on the behavior of each member of the crowd lead to partial differential equations governing the flow of people.

Although crowds are made up of independent individuals with their own objectives and behaviour patterns, the behavior of crowds is widely understood to have collective characteristics which can be described in general terms. However, macroscopic models may lack subtlety and often rely on strong hypotheses (notably on density). Our framework proposes to capture this global dynamics from real crowd video sequences. This requires the use of motion estimation techniques.

2.2 Motion estimation

When a crowd is dense enough, the usual tracking systems such as Kalman filters or stochastic filtering [13] generate a large state space, which yields a computationally too expensive problem. It is then necessary to use alternative methods to obtain the information on the dynamics of the crowd in order to charac-


able to measure motion information from image sequences. One can cite for instance the parametric methods, the correlation techniques or the optical flow approaches (see [21] for a survey). The latter are known to be the most accurate for the generic problem of estimating the apparent motion from image sequences (see for instance [27] for some presentations and [2] for comprehensive comparisons with completely different approaches). The idea of using optical flow to estimate crowd motions has recently drawn attention in the context of human activity recognition [1]. The original optical flow is based on the seminal work of Horn & Schunck [16] and is briefly described in the next paragraph.

Optical Flow. The optical flow of Horn & Schunck consists in the minimization of a global cost function $H$ composed of two terms. The first one, named the observation term, is derived from a brightness constancy assumption and assumes that a given point keeps the same intensity along its trajectory. It is expressed through the well-known optical flow constraint equation (ofce):

$$H_{obs}(E, \mathbf{v}) = \iint_{\Omega} f_1\left[\nabla E(\mathbf{x}, t) \cdot \mathbf{v}(\mathbf{x}, t) + \frac{\partial E(\mathbf{x}, t)}{\partial t}\right] d\mathbf{x}, \tag{1}$$

where $\mathbf{v}(\mathbf{x}, t) = (u, v)^T$ is the unknown velocity field at time $t$ and location $\mathbf{x} = (x, y)$ in the image plane $\Omega$, and $E(\mathbf{x}, t)$ is the image brightness, considered for the moment as a continuous function.

This first term relies on the assumption that the visible points roughly conserve their intensity in the course of a displacement:

$$\frac{dE}{dt} = \nabla E \cdot \mathbf{v} + \frac{\partial E}{\partial t} \approx 0. \tag{2}$$
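As a brief reminder of where the constraint (2) comes from (a standard textbook derivation, not specific to the estimator used in this paper): if a point moving with velocity $\mathbf{v}$ keeps its brightness over a small time interval $\delta t$, a first-order Taylor expansion of $E$ gives the following, where $\delta t$ is a symbol introduced here only for illustration.

```latex
\begin{align*}
E(\mathbf{x} + \mathbf{v}\,\delta t,\; t + \delta t) &= E(\mathbf{x}, t) \\
E(\mathbf{x}, t) + \delta t \left( \nabla E \cdot \mathbf{v} + \frac{\partial E}{\partial t} \right) + O(\delta t^2) &= E(\mathbf{x}, t) \\
\nabla E \cdot \mathbf{v} + \frac{\partial E}{\partial t} &= O(\delta t)
\end{align*}
```

Letting $\delta t \to 0$ yields the constraint, which holds only approximately on real images where the frame interval is finite and brightness is not perfectly conserved.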

The associated penalty function $f_1$ is often the $L^2$ norm. However, better estimates are usually obtained by choosing a softer penalty function [4]. Such functions, arising from robust statistics [17], limit the impact of the many locations where the brightness constancy assumption does not hold, such as on occlusion boundaries.

This single (scalar) observation term is not sufficient to estimate the two components $u$ and $v$ of the velocity. In order to solve this ill-posed problem, it is common to employ an additional smoothness constraint $H_{reg}$. Usually, this second term relies on a contextual assumption which enforces the spatial smoothness of the flow field. This term usually reads:

$$H_{reg}(\mathbf{v}) = \iint_{\Omega} f_2\left[|\nabla u(\mathbf{x}, t)| + |\nabla v(\mathbf{x}, t)|\right] d\mathbf{x}, \tag{3}$$

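As with the penalty function in the data term, the penalty function $f_2$ was taken

20]. Based on (1) and (3), the estimation of motion can be done by minimizing:

$$\begin{aligned}
H(E, \mathbf{v}) &= H_{obs}(E, \mathbf{v}) + \alpha H_{reg}(\mathbf{v}) \\
&= \iint_{\Omega} f_1\left[\nabla E(\mathbf{x}, t) \cdot \mathbf{v}(\mathbf{x}, t) + \frac{\partial E(\mathbf{x}, t)}{\partial t}\right] d\mathbf{x} + \alpha \iint_{\Omega} f_2\left[|\nabla u(\mathbf{x}, t)| + |\nabla v(\mathbf{x}, t)|\right] d\mathbf{x},
\end{aligned} \tag{4}$$

where $\alpha > 0$ is a parameter controlling the balance between the smoothness constraint and the global adequacy to the observation assumption.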

The minimization of this overall cost function makes it possible to extract the apparent motion field between a pair of images $E(\mathbf{x}, t_1)$ and $E(\mathbf{x}, t_2)$.
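As a rough illustration of how a cost of the form (4) can be minimized, the sketch below implements the classical quadratic Horn & Schunck iteration. It assumes $f_1(z) = z^2$ and a squared-gradient smoothness term, not the robust penalties discussed above and not the estimator used later in this paper. The function names, the 4-neighbour averaging, the default `alpha` (which absorbs the constant of the discrete Laplacian) and the iteration count are illustrative choices.

```python
import numpy as np

def _neighbor_average(f):
    """Mean of the 4 nearest neighbours, replicating values at the image border."""
    p = np.pad(f, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def horn_schunck(E1, E2, alpha=1.0, n_iter=200):
    """Quadratic Horn & Schunck flow (u, v) between two grayscale frames.

    E1, E2: 2-D arrays indexed [row (y), column (x)].
    Returns u (x-component) and v (y-component), in pixels per frame.
    """
    E1 = np.asarray(E1, dtype=float)
    E2 = np.asarray(E2, dtype=float)
    # Spatial derivatives on the mean frame; np.gradient returns d/dy then d/dx.
    Ey, Ex = np.gradient(0.5 * (E1 + E2))
    Et = E2 - E1  # temporal derivative by frame differencing
    u = np.zeros_like(E1)
    v = np.zeros_like(E1)
    for _ in range(n_iter):
        u_bar = _neighbor_average(u)
        v_bar = _neighbor_average(v)
        # Residual of the optical flow constraint at the local mean flow,
        # normalized as in the closed-form per-pixel update.
        r = (Ex * u_bar + Ey * v_bar + Et) / (alpha + Ex**2 + Ey**2)
        u = u_bar - Ex * r
        v = v_bar - Ey * r
    return u, v
```

Calling `horn_schunck(frame_k, frame_k_plus_1)` on each pair of successive frames would give one motion field per pair. A larger `alpha` gives smoother fields at the price of blurred motion boundaries.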

Discussion. It has been shown that in many image sequences, and especially in fluid-like imagery, these classical assumptions are violated at a number of locations in the image plane. Although in most rigid-motion situations the use of a robust penalty function makes it possible to recover the motion properly in pathological situations (occluding contours, ...), the usual assumptions are, unfortunately, even less appropriate in fluid imagery.

Some studies have shown that a sufficiently dense crowd sometimes has a behavior that can be explained by some laws of fluid mechanics [18]. It is then of primary interest to integrate such prior knowledge into the optical flow (in the observation term or in the regularization constraint, depending on the nature of the physical law to integrate) to obtain a technique devoted to crowd motion. In this paper, we propose to use a smoothing constraint dedicated to the capture of the significant properties of the flow from a fluid mechanics point of view. These properties are the divergence (linked to the dispersion of a crowd) and the vorticity (also named curl), linked to a rotation.

3 Crowd motion estimation and representation

In this section, we present the regularization used in the motion estimator to extract reliable crowd motion information. For more details on the approach, the reader can refer to [11, 10]. Under the assumption that a dense enough crowd has a behaviour that can be modeled with some laws of fluid mechanics, one can demonstrate that the usual first-order regularization functional in (3) is not suited to fluid situations.

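By using Euler-Lagrange conditions of optimality, it is indeed readily demonstrated [10] that the standard first-order regularization functional:

$$H_{reg}(\mathbf{v}) = \iint_{\Omega} \left( |\nabla u(\mathbf{x})|^2 + |\nabla v(\mathbf{x})|^2 \right) d\mathbf{x} \tag{5}$$

ularization functional [25]:

$$H_{reg}(\mathbf{v}) = \iint_{\Omega} \left( \operatorname{div}^2 \mathbf{v}(\mathbf{x}) + \operatorname{curl}^2 \mathbf{v}(\mathbf{x}) \right) d\mathbf{x}, \tag{6}$$

where $\operatorname{div} \mathbf{v} = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}$ and $\operatorname{curl} \mathbf{v} = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}$ are respectively the divergence and the vorticity of the motion field $\mathbf{v} = (u, v)$.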

A first-order regularization therefore penalizes the amplitude of both the divergence and the vorticity of the vector field. For dense crowd motion estimation, this does not seem appropriate since the apparent velocity field normally exhibits compact areas with high values of vorticity and/or divergence. It then seems more appropriate to rely on a second-order div-curl regularization [25]:

$$H_{reg}(\mathbf{v}) = \iint_{\Omega} \left( |\nabla \operatorname{div} \mathbf{v}(\mathbf{x})|^2 + |\nabla \operatorname{curl} \mathbf{v}(\mathbf{x})|^2 \right) d\mathbf{x}. \tag{7}$$

This regularization tends to preserve the divergence and the vorticity of the motion field $\mathbf{v}$ to be estimated. Interested readers may refer to [11] for precise descriptions of the optimization strategy and of the associated numerical implementation issues.
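To make the difference between the first-order penalty (5) and the second-order penalty (7) concrete, the following sketch evaluates simple finite-difference versions of both on a rigid rotation, a field with constant non-zero vorticity. It is only a numerical illustration on a pixel grid with unit spacing. The function names and the use of `np.gradient` are assumptions made here, not the discretization of [11].

```python
import numpy as np

def div_curl(u, v):
    """Finite-difference divergence and curl of a field sampled as u[y, x], v[y, x]."""
    du_dy, du_dx = np.gradient(u)  # np.gradient returns d/dy (rows) then d/dx (columns)
    dv_dy, dv_dx = np.gradient(v)
    return du_dx + dv_dy, dv_dx - du_dy

def first_order_penalty(u, v):
    """Discrete version of (5): sum over the grid of |grad u|^2 + |grad v|^2."""
    du_dy, du_dx = np.gradient(u)
    dv_dy, dv_dx = np.gradient(v)
    return float(np.sum(du_dx**2 + du_dy**2 + dv_dx**2 + dv_dy**2))

def second_order_penalty(u, v):
    """Discrete version of (7): sum over the grid of |grad div v|^2 + |grad curl v|^2."""
    div, curl = div_curl(u, v)
    d_dy, d_dx = np.gradient(div)
    c_dy, c_dx = np.gradient(curl)
    return float(np.sum(d_dx**2 + d_dy**2 + c_dx**2 + c_dy**2))

# Rigid rotation around the grid centre: u = -y, v = x.
y, x = np.mgrid[-10:11, -10:11].astype(float)
u, v = -y, x
div, curl = div_curl(u, v)
print(div.mean(), curl.mean())      # zero divergence, constant vorticity
print(first_order_penalty(u, v))    # strictly positive: the rotation is penalized
print(second_order_penalty(u, v))   # zero: constant vorticity is left untouched
```

The first-order term charges the rotation everywhere, whereas the second-order term only charges spatial variations of the divergence and vorticity.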

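The motion field $\mathbf{v}$ is then the minimizer of the following cost function (with $\bullet = (\mathbf{x}, t)$):

$$\mathbf{v}(\bullet) = \arg\min_{\mathbf{v}} \iint_{\Omega} \left( f_1\left[\nabla E(\bullet) \cdot \mathbf{v}(\bullet) + \frac{\partial E(\bullet)}{\partial t}\right] + \alpha \|\nabla \operatorname{div} \mathbf{v}(\bullet)\|^2 + \alpha \|\nabla \operatorname{curl} \mathbf{v}(\bullet)\|^2 \right) d\mathbf{x}, \tag{8}$$

and the global crowd motion is represented as a time series of such motion fields.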

4 Data-driven animation of crowds

Once the time series of motion fields has been computed, it can be used as input data for an animation system. Let us first recall that the computed velocities are expressed in the image space, whereas our goal is to animate individuals in the virtual world space. Given the position of such a person in the virtual world, the corresponding position in the image frame can be obtained with a camera projection model. Parameters for this projection can be obtained exactly through camera calibration. In the experiments presented in the results section, we have approximated this model by a simple orthographic projection. This assumption holds whenever the camera is sufficiently far away from the crowd scene. Once this projection has been defined, animating the individuals which constitute the crowd amounts to solving the following classical differential equation (with $\mathbf{x}(t)$ the position of a person in the image frame at time $t$):

$$\frac{\partial \mathbf{x}}{\partial t} = \mathbf{v}(\mathbf{x}(t), t) \tag{9}$$

equipped with an appropriate initial condition $\mathbf{x}(0) = \mathbf{x}_0$, which stands for the initial position of the individual in the flow field. In our framework we have used the classical fourth-order Runge-Kutta integration scheme, which computes a new position $\mathbf{x}(t + 1)$ with a fixed time step and an acceptable accuracy. This new position is then projected back into the virtual world frame. This process is depicted in Figure 2.

Fig. 2. Motion synthesis from the flow field. The position of the crowd's member is projected onto the flow (step 1), the integration is performed in the image frame (step 2) and then the new position is projected back in the virtual world frame (step 3).
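The three steps of Figure 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the motion fields are stored in an array `flows` of shape (T, H, W, 2), that velocities are sampled by bilinear interpolation in space and linear interpolation in time, and that the orthographic projection reduces to a hypothetical `scale` and `origin` between ground-plane world coordinates and pixels.

```python
import numpy as np

def sample_flow(flows, p, t):
    """Velocity (u, v) at image position p = (x, y) and time t (in frames).

    flows[k, row, col] holds the velocity of field k, in pixels per frame.
    Positions and times outside the data are clamped to the nearest valid value.
    """
    T, H, W, _ = flows.shape
    x = float(np.clip(p[0], 0, W - 1))
    y = float(np.clip(p[1], 0, H - 1))
    t = float(np.clip(t, 0, T - 1))
    x0, y0, t0 = int(x), int(y), int(t)
    x1, y1, t1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1), min(t0 + 1, T - 1)
    ax, ay, at = x - x0, y - y0, t - t0

    def bilinear(field):
        top = (1 - ax) * field[y0, x0] + ax * field[y0, x1]
        bottom = (1 - ax) * field[y1, x0] + ax * field[y1, x1]
        return (1 - ay) * top + ay * bottom

    return (1 - at) * bilinear(flows[t0]) + at * bilinear(flows[t1])

def rk4_step(flows, p, t, dt=1.0):
    """One classical fourth-order Runge-Kutta step of equation (9)."""
    p = np.asarray(p, dtype=float)
    k1 = sample_flow(flows, p, t)
    k2 = sample_flow(flows, p + 0.5 * dt * k1, t + 0.5 * dt)
    k3 = sample_flow(flows, p + 0.5 * dt * k2, t + 0.5 * dt)
    k4 = sample_flow(flows, p + dt * k3, t + dt)
    return p + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def advect_pedestrian(flows, q_world, t, dt=1.0, scale=1.0, origin=(0.0, 0.0)):
    """Move one pedestrian given its 2-D ground-plane position q_world."""
    origin = np.asarray(origin, dtype=float)
    p = origin + scale * np.asarray(q_world, dtype=float)  # step 1: world -> image
    p = rk4_step(flows, p, t, dt)                          # step 2: integrate in the image frame
    return (p - origin) / scale                            # step 3: image -> world
```

Calling `advect_pedestrian` once per pedestrian and per animation frame, with `t` advanced by `dt` each time, moves the whole crowd along the time-varying flow.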

Let us finally note that the quality of the generated animation is closely linked to the initial positions of the crowd members and to their density. In the subsequent results, we have used curve sources that create random pedestrians along a hand-designed curve situated in the flow.
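One simple way to realize such a curve source is sketched below, under the assumption that the hand-designed curve is given as a polyline and that pedestrians are drawn uniformly by arc length along it. The paper does not specify the sampling law, so the function name and this choice are hypothetical.

```python
import numpy as np

def sample_on_polyline(points, n, rng=None):
    """Draw n start positions uniformly by arc length along a polyline.

    points: sequence of (x, y) vertices, with consecutive vertices distinct.
    Returns an array of shape (n, 2).
    """
    rng = np.random.default_rng() if rng is None else rng
    pts = np.asarray(points, dtype=float)
    seg = np.diff(pts, axis=0)                 # segment vectors
    seg_len = np.linalg.norm(seg, axis=1)      # segment lengths
    cum = np.concatenate(([0.0], np.cumsum(seg_len)))
    s = rng.uniform(0.0, cum[-1], size=n)      # random arc-length positions
    # Index of the segment containing each arc-length position.
    idx = np.clip(np.searchsorted(cum, s, side="right") - 1, 0, len(seg) - 1)
    frac = (s - cum[idx]) / seg_len[idx]
    return pts[idx] + frac[:, None] * seg[idx]
```

Calling this function at regular intervals with a chosen `n` gives a crude handle on the density of the generated crowd.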

5 Results

Our approach was first tested on synthetic crowd sequences to validate the theoretical part of our work. We have also used real crowd sequences to handle real cases. Those results are presented in this section.

5.1 Synthetic example

The synthetic sequence represents a continuous flow of human beings with an obstacle (a cylinder named $C$) in the middle of the image. It has been generated using the classical Helbing simulation model [15]. In this situation, the true motion field inside the cylinder $C$ is known (no motion, i.e. $\mathbf{v}(\mathbf{x} \in C) = 0$). Since the cost function (8) is defined on the whole image plane, a particular process is needed to deal with this specific no-data area. Indeed, since any motion inside the area $C$ is a reliable candidate (the ofce (1) is null everywhere), the motion estimation using relation (8) is likely to yield some incoherent results inside and outside the cylinder (due to the regularization term which spreads


$C$. Thanks to the robust estimator $f_1$ used in (1), this area is not taken into account by the observation term of the estimation process. Hence, the motion fields estimated outside the cylinder are not disturbed by the ones inside $C$. This is illustrated in Figure 3. We present an image of the sequence in Figure 3(a), the estimated motion field in Figure 3(b), and a zoom on the cylinder area with and without the specific treatment proposed for this particular situation (Figure 3(c) and 3(d) respectively). Some images of the crowd animation synthesis are shown in Figure 4. The animation was generated with a Maya plugin which defines a crowd as a set of particles and performs the synthesis described in section 4. As expected, the virtual crowd is in accordance with the underlying motion and the obstacle is correctly managed. This first example demonstrates the ability of the proposed approach to synthesize a coherent motion from an estimated motion field. Let us now apply this technique to real data.

Fig. 3. Estimation of the motion field on the synthetic example; (a) an image from the original sequence; (b) the estimated motion field; (c) the motion near the cylinder estimated with special care for this no-data area and (d) same as (c) but without a specific treatment for the cylinder. One can see that the motion near the cylinder in (d) is not totally coherent.

5.2 Real data

We present the results obtained on two real sequences. Both sequences have been acquired with a simple video camera with an MPEG encoder. The resulting images are hence very poor in terms of brightness: the latter is indeed sometimes constant over a square area. It is important to note that this point is likely to


Fig. 4. Some images of the synthetic crowd animation for 4 different times of the sequence.

Strike sequence. The first real sequence is a video representing a strike which took place in Vannes, France. All pedestrians are walking in the same direction. Two images of the sequence can be seen in Figure 5 (a) and (b). In Figure 5 (c) and (d), we present the synthetic crowd animation obtained, superimposed on the estimated motion field. One can observe that the resulting crowd animation is in accordance with the real pedestrian behaviors. Hence, on this example, our method has the advantage of correctly synthesizing the observed phenomena without resorting to usual motion capture techniques. Let us now see the results on a more complicated real sequence.

Shibuya sequence. The second real sequence is a video acquired at the Shibuya crossroads in Tokyo, Japan, which is famous for the density of people crossing the streets. Three images of the sequence can be seen in Figure 6 (a-c). This situation is complex since at least two main flows of people in opposite directions are crossing the road. It is important to observe that in this case, the underlying assumptions of our approach (a very dense crowd) are not totally respected. This example is therefore shown to evaluate the limits of our method. In Figure 6 (d-f), we present the synthetic crowd animation obtained, superimposed on the estimated motion field. One can see in these figures that the two main opposite flows are correctly extracted and synthesised, despite the fact that the initial sequence was very poor in terms of quality and that our initial assumptions were not respected. The generated sequence is relatively realistic. Nevertheless, the intersection of the two groups of people is not correctly managed: some pedestrians have incoherent trajectories. This issue has two main causes: the estimation process is locally incoherent when two people occlude each other, and there is no temporal continuity in the estimated flow. Two possibilities can be exploited to cope with such a situation: the first one consists in improving the motion estimation process through a temporal smoothing of the motion field, whereas the second one is to introduce a dynamical law in the trajectory reconstruction step. These two key points will be the scope of our future work.

6 Conclusion

In this paper, we have presented a new and original method which proposes to

Fig. 5. The strike sequence. (a, b): two images of the sequence; (c, d) the corresponding animation superimposed on the estimated motion field.

Fig. 6. The Shibuya sequence. (a-c): three images of the sequence; (d-f) the corresponding animation superimposed on the estimated motion field.