Camera Cooperation for Achieving Visual Attention


HAL Id: inria-00590202

https://hal.inria.fr/inria-00590202

Submitted on 3 May 2011



Radu Horaud, David Knossow, Markus Michaelis


INRIA Rhône-Alpes

655, avenue de l'Europe

38330 Montbonnot Saint-Martin, FRANCE

Corresponding author: Radu Horaud

Radu.Horaud@inrialpes.fr

fax: +33 476 615 454

April 27, 2011

Machine Vision and Applications, 16(6), pp. 331-342, February 2006

Abstract

In this paper we address the problem of establishing a computational model for visual attention using cooperation between two cameras. More specifically, we wish to maintain a visual event within the field of view of a rotating and zooming camera through the understanding and modelling of the geometric and kinematic coupling between a static camera and an active camera. The static camera has a wide field of view, thus allowing panoramic surveillance at low resolution. High-resolution details may be captured by a second camera, provided that it looks in the right direction. We derive an algebraic formulation for the coupling between the two cameras and we specify the practical conditions yielding a unique solution. We describe a method for separating a foreground event (such as a moving object) from its background while the camera rotates. A set of outdoor experiments shows the two-camera system in operation.

Keywords: video surveillance, visual attention, stereo vision, camera calibration

1 Introduction

In this paper we address the problem of establishing a computational model for visual attention using cooperation between two cameras. Attention mechanisms may generally be defined as processes that allocate significant computing power to one part or several parts of an image, where information relevant to the task at hand is likely to be found. Therefore, attention processes should encapsulate both top-down and bottom-up visual processes such as (i) the selection of a visual event of interest, (ii) the detection of image features which characterize the selected event, (iii) mechanisms for maintaining these features in the visual field of view, as well as (iv) further analysis such as recognition and interpretation. In particular, we address the problem of maintaining a visual event within the field of view of a camera, and the approach that we take consists of monitoring an active camera through the understanding and modelling of the coupling between an active camera and a static camera.

Consider for example the case of a pedestrian or a bicycle rider moving in an urban environment. They may be viewed as static objects in a single image. Nevertheless, in order to take into account the deformable/articulated nature of their shape and motion as well as their time evolution, it is crucial to observe them in videos and therefore consider them as dynamic objects.

Traditional visual attention systems use either an active camera, a binocular active system, or several static cameras. An active camera may rotate, translate, and zoom in and out in order to maintain the object of interest within its field of view and to compensate for changes in the object's appearance [15], [8], [6], [16]. Binocular devices use controlled camera movements for gaze holding: the two optical axes intersect and produce a zero-disparity surface [3], [2]. Other systems use several static cameras [12]. Static camera configurations have been thoroughly studied from a geometrical point of view [9].

Both single and multiple camera systems have advantages and disadvantages. A single camera is simpler to operate and its motion can be easily controlled with motors. However, it cannot acquire depth information that is useful for scene understanding. Another drawback is that it cannot provide low and high resolution simultaneously. Multiple camera systems have the advantage of being able to acquire potentially richer information, provided that the image registration (or correspondence) problem is solved. Active binocular heads try to combine the advantages of controlled motions and of multiple camera geometry.

In this paper we propose an innovative solution that combines the advantages of both static and active cameras and of both low- and high-resolution images. One camera is fixed and has a wide field of view, thus allowing surveillance of a wide area in terms of both width and depth of its field of view. Therefore, the image associated with this camera provides a panoramic view while it cannot capture scene details. These scene details are captured by another camera which is mounted onto a motor-driven pan and tilt device. Therefore, this camera is able to gaze in a specific direction with a specified focal length. To the best of our knowledge, the only previous attempt to combine static and active cameras for visual attention and surveillance is described in [18]. With respect to [18], which describes a general philosophy and a system architecture, we analyse and characterize in detail the


as a moving person is first detected and selected using the first (static) camera. Since this camera is static and its field of view covers the whole scene, an event will appear in its associated image sequence as a relatively small object. Well understood and widely developed methods (optical flow, image differentiation, background subtraction, etc.) may be used to detect an event occurring in such a region and track it over time. However, the resolution associated with this image is not sufficient to properly recognize and interpret the event. The second camera must be controlled in order to dynamically adjust its pan, tilt, and zoom such that the moving object remains in its field of view and such that the object projects onto the image plane at constant size and resolution. Ideally, the camera's degrees of freedom (pan, tilt, and zoom) should compensate for changes in appearance due to both viewpoint and depth variations. Once the object of interest has been properly captured by the second camera, the latter should be able to track the object using a visual servoing loop which controls the camera's rotations and zoom settings [5].

Such a camera system raises several interesting issues and questions from methodological, computational, and practical points of view. The traditional approach for coupling two or several static cameras, based on projective geometry and its associated algebraic and numerical tools, is not sufficient. Since one of the cameras is active, both the geometrical and the mechanical couplings must be considered. Another crucial issue that must be addressed is the stereo correspondence problem. With two static cameras the correspondence problem does not have, in general, a good practical solution because of the inherent ambiguity associated with image-to-image matching. With an active stereo system and under the assumption that a specific object must be selected and tracked, the correspondence problem becomes tractable from a computational point of view. Moreover, stereo correspondence is required only for bootstrapping the attention mechanism. Finally, cooperation between a low-resolution tracker performed with a static camera and a high-resolution tracker performed with an active camera must be properly defined and modelled.

This paper has the following original contributions. We derive a mathematical expression for the two-camera coupling, where one camera is static and the other camera rotates, in the form of a set of polynomial equations. We show that, in the general case, there may be several solutions for the pan and tilt angles and that these solutions are parameterized by a depth parameter (the depth from the static camera to the scene event). We consider the special case where the pan and tilt rotational axes are mutually orthogonal. We show that with a practical camera setup there is a unique solution for the pan and tilt values. We describe a practical solution for achieving gaze control with a rotating camera and for separating a moving object from its static background. Once an initial solution is found, gaze control is reduced to the tracking of an event in the static image and to the updating of the pan and tilt angle values.

The remainder of this paper is organized as follows.

Section 2 describes and analyses in detail the geometric and kinematic coupling between a static camera and a rotating camera. The coupling model allows the rotating camera to gaze onto an event selected in the static camera. We analyse both the general case and a


from its background by estimating the projective mapping associated with a camera undergoing rotational motions. We describe a method for robustly estimating this mapping by aligning the grey-levels/colors of image pixels which correspond to the background. This transformation is then used for warping the previous and next frames onto the current frame and for detecting event pixels, i.e., pixels with an apparent image motion that is different from the apparent background motion.

Section 4 provides an overview of the practical system that is implemented together with some implementation details: camera, stereo, and kinematic calibration, as well as depth estimation with a static-active camera pair. A complete set of experiments is described in detail as well.

Appendices A, B, and C provide a detailed description of the kinematic model being used to describe the pan and tilt device, as well as a method for calibrating the fixed parameters of this zero-reference kinematic model.

2 The coupling between a static and a rotating camera

In this section we consider the geometric and kinematic aspects of the coupling between fixed and rotating cameras. From a geometric point of view, the two cameras act as a stereoscopic device which can be described using the epipolar constraint within a projective geometry framework. From a mechanical point of view, the rotating camera is mounted on a pan and tilt mechanism which has an associated kinematic structure. In order to describe the latter we will adopt a zero-reference kinematic model.

In this section we establish the formal link between a static camera and a rotating camera based on the epipolar geometry (which holds at each time instant) and the kinematic model associated with a pan and tilt mechanism. First, we introduce the point reconstruction equations. Second, we consider a pan and tilt kinematic model in its most general form. Third, we analyse the case of a simplified pan and tilt model, i.e., the pan and tilt rotation axes are mutually orthogonal.

2.1 Two-camera geometry

Let us denote by $\mathbf{P}_1$ and $\mathbf{P}_2$ the projection matrices associated with the two cameras. A 3-D point $M$, represented in projective space by a 4-vector $\mathbf{M} = (X\ Y\ Z\ 1)^\top$, is related to its image projections $\mathbf{m}_1$ and $\mathbf{m}_2$ by:

$$\lambda_1 \mathbf{m}_1 = \mathbf{P}_1 \mathbf{M} \tag{1}$$

$$\lambda_2 \mathbf{m}_2 = \mathbf{P}_2 \mathbf{M} \tag{2}$$

The non-null scalars $\lambda_1$ and $\lambda_2$ indicate that the projective equality is defined up to a scale factor. They may be interpreted as the projective depths along the lines of


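coordinates $\mathbf{m}_1$ and $\mathbf{m}_2$. For pinhole cameras, the $3 \times 4$ projection matrices have the following parameterization:

$$\mathbf{P}_1 = \mathbf{K}_1 \begin{pmatrix} \mathbf{I} & \mathbf{0} \end{pmatrix} \tag{3}$$

$$\mathbf{P}_2 = \mathbf{K}_2 \begin{pmatrix} \mathbf{R} & \mathbf{t} \end{pmatrix} \tag{4}$$

The $3 \times 3$ matrices $\mathbf{K}_1$ and $\mathbf{K}_2$ have the intrinsic camera parameters as entries (see below the expression of $\mathbf{K}_2$). The rotation $\mathbf{R}$ and the translation $\mathbf{t}$ describe the orientation and position of the second camera with respect to the first camera. Without loss of generality we will assume that the first camera is calibrated, therefore matrix $\mathbf{K}_1$ is known. The second camera is calibrated as well, up to its focal length $f$, which may or may not be known and which is allowed to vary. The expression of $\mathbf{K}_2$ is:

$$\mathbf{K}_2 = \begin{pmatrix} kf & 0 & u_c \\ 0 & f & v_c \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} k & 0 & u_c \\ 0 & 1 & v_c \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{pmatrix} = \mathbf{K}'_2 \mathbf{D}_f$$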

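In order to eliminate the known camera parameters from the equations we use the substitutions $\mathbf{m}_1 = \mathbf{K}_1 \mathbf{n}_1$ and $\mathbf{m}_2 = \mathbf{K}'_2 \mathbf{n}_2$. By combining eq. (1) with eq. (3) we obtain a simple expression for the Euclidean coordinates of $M$:

$$\mathbf{M} = \lambda_1 \mathbf{n}_1$$

By combining eq. (2) with eq. (4) and by substitution of $\mathbf{M}$ we obtain:

$$\lambda_2 \mathbf{n}_2 = \mathbf{D}_f\,(\lambda_1 \mathbf{R}\mathbf{n}_1 + \mathbf{t}) \tag{5}$$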

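This is the projective epipolar relationship between the camera coordinates $\mathbf{n}_1$ and $\mathbf{n}_2$ (of $\mathbf{m}_1$ and $\mathbf{m}_2$), the focal length $f$ of the active camera, and the relative position and orientation of the active camera with respect to the static camera, $\mathbf{t}$ and $\mathbf{R}$. With the notation $\mathbf{n}_2 = (x_2\ y_2\ 1)^\top$ we further eliminate $\lambda_2$ by dividing the first and second vector components, $(\cdot)_1$ and $(\cdot)_2$, by the third vector component, $(\cdot)_3$:

$$x_2 = f\,\frac{(\lambda_1 \mathbf{R}\mathbf{n}_1 + \mathbf{t})_1}{(\lambda_1 \mathbf{R}\mathbf{n}_1 + \mathbf{t})_3}, \qquad y_2 = f\,\frac{(\lambda_1 \mathbf{R}\mathbf{n}_1 + \mathbf{t})_2}{(\lambda_1 \mathbf{R}\mathbf{n}_1 + \mathbf{t})_3} \tag{6}$$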
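As an illustration of eq. (6), the following minimal Python sketch projects a point seen in the static camera into the active camera for a given depth. The function names, the list-based matrix representation, and the numerical setup (identity rotation, 1 m baseline, $f = 2$) are assumptions made for this example only.

```python
def mat_vec(R, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


def project_to_active(n1, depth, R, t, f):
    """Eq. (6): coordinates (x2, y2) in the active camera of the point seen
    at camera coordinates n1 in the static camera, at depth lambda_1."""
    q = [a + b for a, b in zip(mat_vec(R, [depth * c for c in n1]), t)]
    if q[2] <= 0:
        raise ValueError("point is behind the active camera")
    return f * q[0] / q[2], f * q[1] / q[2]


# Hypothetical setup: active camera 1 m to the right of the static camera,
# same orientation, so that X2 = R X1 + t with R = I and t = (-1, 0, 0).
R_id = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [-1.0, 0.0, 0.0]

# A point seen at n1 = (0.1, 0, 1) at depth 10 lies on the optical axis
# of the active camera, hence projects onto its image center.
x2, y2 = project_to_active([0.1, 0.0, 1.0], 10.0, R_id, t, f=2.0)
```

With this setup the point lands at the image center of the active camera, which is exactly the configuration sought in eq. (7).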

Without loss of generality we seek a solution which configures the stereo system such that the scene point $M$ is viewed in the center of the image associated with the active camera: $\mathbf{n}_2 = (0\ 0\ 1)^\top$. The equations above become:

$$(\lambda_1 \mathbf{R}\mathbf{n}_1 + \mathbf{t})_1 = 0, \qquad (\lambda_1 \mathbf{R}\mathbf{n}_1 + \mathbf{t})_2 = 0 \tag{7}$$

Problem formulation. Given a 3-D point $M$ which is observed in the static camera's image at $\mathbf{m}_1$ with camera coordinates $\mathbf{n}_1$, we want to find the position and orientation of the active camera such that $M$ projects onto the active camera's image center.

In order to solve this problem we must parameterize the rotations and translations of the active camera as a function of (i) the relative position of the active camera with respect to the static camera and of (ii) the kinematic model associated with the active camera's pan and tilt mechanism. Therefore we must establish the link between the epipolar geometry constraint and the kinematic model constraints. We will adopt the zero-reference kinematic model for the pan-tilt device. This model allows the user to select a zero-reference or a docking reference for the kinematic chain. We solve for a general pan-tilt kinematic model and we develop a closed-form solution for a simplified pan-tilt model. The existence of a unique solution allows numerical methods to be safely applied to the general case.

We denote by $\mathbf{T}$ the $4 \times 4$ homogeneous matrix:

$$\mathbf{T} = \begin{pmatrix} \mathbf{R} & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{pmatrix} \tag{8}$$

We also denote by $\mathbf{T}_0$ the docking or reference position of the active camera. From a practical point of view and for stereo calibration purposes, this reference position is chosen such that the two cameras have a common field of view. Let $\mathbf{Q}$ describe the rigid and constrained motion undergone by the active camera from its docking position to a current position. From Figure 1 one can notice that the following relationship holds:

$$\mathbf{T} = \mathbf{Q}\,\mathbf{T}_0 \tag{9}$$

Insert Figure 1 approximately here

2.2 General pan-tilt model

Matrices $\mathbf{Q}$ and $\mathbf{T}_0$ have the same mathematical structure although the former describes a kinematically constrained motion while the latter describes a static relationship between two Cartesian frames. Matrix $\mathbf{Q}$ describes the motion undergone by a pan and tilt mechanism. In order to describe such a mechanism we will adopt the well known zero-reference kinematic model. The latter is described in many textbooks such as [13, 17, 14]. In its most general form this motion can be decomposed as follows (see appendix A):

$$\mathbf{Q} = \mathbf{Q}_2(\alpha, \alpha_0)\,\mathbf{Q}_1(\beta, \beta_0, \alpha_0) \tag{10}$$

where $\mathbf{Q}_1$ and $\mathbf{Q}_2$ are one-dimensional Lie groups, each one describing a rotation: $\alpha$ and $\beta$ are the pan and tilt angles parameterizing these motions, with $\alpha_0$ and $\beta_0$ being the pan and tilt values associated with the zero-reference position. Each one of these transformations


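Matrix $\hat{\mathbf{Q}}_1$ describes the tangent operator associated with the rigid motion; it is composed of a skew-symmetric matrix $\hat{\mathbf{R}}_1$ and a translational velocity vector $\hat{\mathbf{t}}_1$ and writes as:

$$\hat{\mathbf{Q}}_1 = \begin{pmatrix} \hat{\mathbf{R}}_1 & \hat{\mathbf{t}}_1 \\ \mathbf{0}^\top & 0 \end{pmatrix} \tag{12}$$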

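It is worthwhile to notice that $\mathbf{Q}_1^{-1}(\beta - \beta_0) = \mathbf{Q}_1(-(\beta - \beta_0))$, and from equation (11) we obtain that the tangent operator may be estimated from a single motion:

$$\operatorname{trace}(\mathbf{Q}_1) = 2\,(1 + \cos(\beta - \beta_0)) \tag{13}$$

and:

$$\hat{\mathbf{Q}}_1 = \frac{1}{2\sin(\beta - \beta_0)}\left(\mathbf{Q}_1 - \mathbf{Q}_1^{-1}\right) \tag{14}$$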
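Equations (13) and (14) can be checked numerically. The sketch below builds a motion $\mathbf{Q}_1$ from a tangent operator using the closed forms of eqs. (15) and (16), then recovers the angle and the tangent operator from that single motion. The helper names and the sample axis and velocity are assumptions of this example; it also assumes a unit rotation axis and recovers the angle only up to sign, since eq. (13) gives its cosine.

```python
import math


def skew(a):
    """Skew-symmetric matrix of a 3-vector (assumed to be a unit axis)."""
    return [[0.0, -a[2], a[1]], [a[2], 0.0, -a[0]], [-a[1], a[0], 0.0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def motion_from_tangent(axis, t_hat, theta):
    """4x4 motion built from eqs. (15)-(16), with theta = beta - beta_0."""
    Rh = skew(axis)
    Rh2 = matmul(Rh, Rh)
    s, c = math.sin(theta), math.cos(theta)
    R = [[(1.0 if i == j else 0.0) + s * Rh[i][j] + (1.0 - c) * Rh2[i][j]
          for j in range(3)] for i in range(3)]
    Rt = [sum(Rh[i][k] * t_hat[k] for k in range(3)) for i in range(3)]
    t = [s * t_hat[i] + (1.0 - c) * Rt[i] for i in range(3)]
    return [R[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def rigid_inverse(Q):
    """Inverse of a rigid motion: (R, t) -> (R^T, -R^T t)."""
    Rt = [[Q[j][i] for j in range(3)] for i in range(3)]
    ti = [-sum(Rt[i][k] * Q[k][3] for k in range(3)) for i in range(3)]
    return [Rt[i] + [ti[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def tangent_from_motion(Q):
    """Eqs. (13)-(14): angle (in [0, pi]) and tangent operator from Q."""
    cos_theta = sum(Q[i][i] for i in range(4)) / 2.0 - 1.0
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    Qi = rigid_inverse(Q)
    s = math.sin(theta)
    Q_hat = [[(Q[i][j] - Qi[i][j]) / (2.0 * s) for j in range(4)]
             for i in range(4)]
    return theta, Q_hat


# Hypothetical tilt axis (the y-axis) and translational velocity.
Q1 = motion_from_tangent([0.0, 1.0, 0.0], [0.2, 0.0, -0.1], 0.3)
theta, Q1_hat = tangent_from_motion(Q1)
```

The division by $\sin(\beta - \beta_0)$ means the motion used for this estimation should be well away from $0$ and $\pi$.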

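By substituting eq. (12) into eq. (11) we obtain:

$$\mathbf{R}_1 = \mathbf{I}_{3\times 3} + \sin(\beta - \beta_0)\,\hat{\mathbf{R}}_1 + (1 - \cos(\beta - \beta_0))\,\hat{\mathbf{R}}_1^2 \tag{15}$$

$$\mathbf{t}_1 = \sin(\beta - \beta_0)\,\hat{\mathbf{t}}_1 + (1 - \cos(\beta - \beta_0))\,\hat{\mathbf{R}}_1\hat{\mathbf{t}}_1 \tag{16}$$

There is a similar expression for $\mathbf{Q}_2$. Equation (9) may be written as:

$$\mathbf{R} = \mathbf{R}_2\mathbf{R}_1\mathbf{R}_0 \tag{17}$$

$$\mathbf{t} = \mathbf{R}_2\mathbf{R}_1\mathbf{t}_0 + \mathbf{R}_2\mathbf{t}_1 + \mathbf{t}_2 \tag{18}$$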
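Equations (17) and (18) are simply the rotation and translation blocks of the product $\mathbf{Q}_2\mathbf{Q}_1\mathbf{T}_0$ of eqs. (9) and (10). A short Python sketch of this composition is given below; the function names and the sample rotations and offsets are hypothetical and only serve to compare the block formulas with the full $4 \times 4$ product.

```python
import math


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(b):
    c, s = math.cos(b), math.sin(b)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def homogeneous(R, t):
    """4x4 homogeneous matrix of eq. (8)."""
    return [list(R[i]) + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def active_pose(R2, t2, R1, t1, R0, t0):
    """Eqs. (17)-(18): pose (R, t) of the active camera."""
    R21 = matmul(R2, R1)
    R = matmul(R21, R0)
    t = [x + y + z
         for x, y, z in zip(mat_vec(R21, t0), mat_vec(R2, t1), t2)]
    return R, t


# Hypothetical pan (about x), tilt (about y) and docking pose.
R2, t2 = rot_x(0.2), [0.01, 0.02, 0.03]
R1, t1 = rot_y(-0.1), [0.05, 0.0, -0.02]
R0, t0 = rot_y(0.3), [-1.0, 0.1, 0.0]
R, t = active_pose(R2, t2, R1, t1, R0, t0)

# Same pose obtained from the full product T = Q2 Q1 T0.
T = matmul(matmul(homogeneous(R2, t2), homogeneous(R1, t1)),
           homogeneous(R0, t0))
```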

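Eq. (7) becomes (the subscripts $(\cdot)_1$ and $(\cdot)_2$ denote the first and second vector components):

$$\begin{aligned}
(\lambda_1 \mathbf{R}_2\mathbf{R}_1\mathbf{R}_0\mathbf{n}_1 + \mathbf{R}_2\mathbf{R}_1\mathbf{t}_0 + \mathbf{R}_2\mathbf{t}_1 + \mathbf{t}_2)_1 &= 0 \\
(\lambda_1 \mathbf{R}_2\mathbf{R}_1\mathbf{R}_0\mathbf{n}_1 + \mathbf{R}_2\mathbf{R}_1\mathbf{t}_0 + \mathbf{R}_2\mathbf{t}_1 + \mathbf{t}_2)_2 &= 0
\end{aligned} \tag{19}$$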

This is a set of two equations with three unknowns: $\alpha$, $\beta$, and $\lambda_1$. We recall that we want to determine the pan and tilt angles such that the event detected at position $\mathbf{m}_1$ in the first image (with camera coordinates $\mathbf{n}_1$) appears at position $\mathbf{m}_2$ (with camera coordinates $(0\ 0)$) in the second image. The unknown $\lambda_1$ is the depth of the observed scene point with respect to the fixed camera. In order to be able to find a solution for the pan and tilt angles we must specify this depth. The practical method for estimating the latter is described in detail in section 4.2. From now on we will assume that $\lambda_1$ is known.

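In practice it will be more convenient to consider the initial set of three equations, i.e., eq. (5). By substituting equations (17) and (18) into this equation, with $\mathbf{n}_2 = (0\ 0\ 1)^\top$ and $\mathbf{p} = \lambda_1 \mathbf{R}_0\mathbf{n}_1 + \mathbf{t}_0$, we obtain:

$$\mathbf{R}_1\mathbf{p} + \mathbf{t}_1 + \mathbf{R}_2^\top\mathbf{t}_2 = \mathbf{R}_2^\top \begin{pmatrix} 0 \\ 0 \\ \lambda_2 \end{pmatrix} \tag{20}$$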


be observed that $\mathbf{R}_1$, $\mathbf{t}_1$, $\mathbf{R}_2$, and $\mathbf{t}_2$ are parameterized by the known tangent operators (see appendix C) and by the three unknowns: the pan and tilt angles and the depth parameter $\lambda_2$. Therefore we obtain three equations in $\cos(\beta - \beta_0)$, $\sin(\beta - \beta_0)$, $\cos(\alpha - \alpha_0)$, $\sin(\alpha - \alpha_0)$, and $\lambda = \lambda_2$. With the following standard substitutions:

$$\sin(\alpha - \alpha_0) = \frac{2\tan\frac{\alpha - \alpha_0}{2}}{1 + \tan^2\frac{\alpha - \alpha_0}{2}} = \frac{2t_\alpha}{1 + t_\alpha^2}$$

$$\cos(\alpha - \alpha_0) = \frac{1 - \tan^2\frac{\alpha - \alpha_0}{2}}{1 + \tan^2\frac{\alpha - \alpha_0}{2}} = \frac{1 - t_\alpha^2}{1 + t_\alpha^2}$$

(and likewise for $\beta$, with $t_\beta = \tan\frac{\beta - \beta_0}{2}$), we obtain three polynomial equations in three unknowns: $t_\alpha$, $t_\beta$, and $\lambda$. It is possible to eliminate $\lambda$ as an unknown between the second and third equations, at the price of increasing the degree of the resulting polynomials. In the general case it will be difficult to analyse the number of admissible solutions of such a set of polynomials [4]. Although in practice these polynomials will be solved using numerical methods, such as the Newton method for finding roots of sets of polynomials, it is crucial to be able to state in advance the exact number of practical solutions.

We denote these sets of solutions by $(\alpha^{(i)}, \beta^{(i)}, \lambda^{(i)})$. They are in the intervals $[\alpha_0 - \pi, \alpha_0 + \pi]$, $[\beta_0 - \pi, \beta_0 + \pi]$, and we must have $\lambda > 0$. So far we considered the most general case. We analyse in detail a simplified pan-tilt device and we show that in this case there is a unique solution. We conclude that the general case also admits a unique solution.

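2.3 Simplified pan and tilt model

In the case where the pan and tilt axes are mutually orthogonal the kinematic model of the device is simplified, as described in appendix B. This simpler kinematic model allows an algebraic analysis of the number of solutions associated with the inverse kinematics of the pan and tilt camera. Moreover, and for the sake of this analysis, one may choose $\alpha_0 = \beta_0 = 0$. The matrices become:

$$\mathbf{Q}_1 = \begin{pmatrix} \cos\beta & 0 & \sin\beta & t^1_1 \\ 0 & 1 & 0 & t^1_2 \\ -\sin\beta & 0 & \cos\beta & t^1_3 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{21}$$

$$\mathbf{Q}_2 = \begin{pmatrix} 1 & 0 & 0 & t^2_1 \\ 0 & \cos\alpha & -\sin\alpha & t^2_2 \\ 0 & \sin\alpha & \cos\alpha & t^2_3 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{22}$$

where $t^i_j$ denotes the $j$-th component of the translation vector $\mathbf{t}_i$.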

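It follows that eq. (20) becomes, component by component:

$$\begin{aligned}
p_1\cos\beta + p_3\sin\beta + t^1_1 + t^2_1 &= 0 \\
p_2 + t^1_2 + t^2_2\cos\alpha + t^2_3\sin\alpha &= \lambda_2\sin\alpha \\
-p_1\sin\beta + p_3\cos\beta + t^1_3 - t^2_2\sin\alpha + t^2_3\cos\alpha &= \lambda_2\cos\alpha
\end{aligned}$$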

which yields the following equations in $\tan\frac{\beta}{2} = t_\beta$, $\tan\frac{\alpha}{2} = t_\alpha$, and $\lambda = \lambda_2$:

$$\begin{aligned}
(t^1_1 + t^2_1 - p_1)\,t_\beta^2 + 2p_3\,t_\beta + (t^1_1 + t^2_1 + p_1) &= 0 \\
(t^1_2 - t^2_2 + p_2)\,t_\alpha^2 + 2(t^2_3 - \lambda)\,t_\alpha + (t^1_2 + t^2_2 + p_2) &= 0 \\
(1 + t_\alpha^2)\big((t^1_3 - p_3)\,t_\beta^2 - 2p_1 t_\beta + p_3 + t^1_3\big) + (1 + t_\beta^2)\big((\lambda - t^2_3)\,t_\alpha^2 - 2t^2_2\,t_\alpha - (\lambda - t^2_3)\big) &= 0
\end{aligned}$$

The first equation has two real solutions for $t_\beta$. Indeed, its discriminant is:

$$\Delta = (p_3)^2 + (p_1)^2 - (t^1_1 + t^2_1)^2$$

In practical configurations the coordinates of vector $\mathbf{p}$ are much larger than the mechanical offset $t^1_1 + t^2_1$: we recall that vector $\mathbf{p}$ represents the coordinates of the observed point $M$ in the zero-reference camera frame. Therefore $\Delta > 0$ and there are two solutions for $\beta$ in the interval $[-\pi, \pi]$. Only one of these solutions can be achieved in practice, i.e., when the observed point lies in front of the camera. To conclude, the first equation admits two solutions and only one solution is achievable in practice.

The second equation has two real solutions for $t_\alpha$ as well. Indeed, its discriminant is:

$$\Delta = (t^2_3 - \lambda)^2 - (p_2 + t^1_2)^2 + (t^2_2)^2$$

Recall that $\lambda$ represents the depth from the camera to the observed point, and in practical configurations $\lambda \gg t^2_3$ and $\lambda \gg p_2$. Therefore $\Delta > 0$ and this equation admits two solutions as well; with the same reasoning as above we conclude that only one solution is achievable in practice.
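To make the analysis of the first equation concrete, the following Python sketch solves the quadratic in $t_\beta$ for the simplified model and keeps the candidate for which the point lies in front of the camera. The function names, the sample values, and the "in front" test (positive third component of $\mathbf{R}_1\mathbf{p} + \mathbf{t}_1$) are assumptions of this example rather than the procedure used in the paper.

```python
import math


def tilt_candidates(p, t1_x, t2_x):
    """Solve (T - p1) t^2 + 2 p3 t + (T + p1) = 0 for t = tan(beta / 2),
    with T = t1_x + t2_x, and return the corresponding tilt angles."""
    p1, p3 = p[0], p[2]
    T = t1_x + t2_x
    disc = p3 * p3 + p1 * p1 - T * T  # reduced discriminant
    if disc < 0.0:
        return []
    a = T - p1
    if abs(a) < 1e-12:
        # Degenerate case: the equation is linear in t (p3 assumed non-zero).
        return [2.0 * math.atan(-(T + p1) / (2.0 * p3))]
    roots = [(-p3 + sign * math.sqrt(disc)) / a for sign in (1.0, -1.0)]
    return [2.0 * math.atan(r) for r in roots]


def in_front(beta, p, t1_z):
    """Third component of R1 p + t1 for the tilt matrix of eq. (21)."""
    return -math.sin(beta) * p[0] + math.cos(beta) * p[2] + t1_z > 0.0


# Hypothetical point about 10 m away, with centimetre-scale offsets.
p = [1.0, 0.5, 10.0]
candidates = tilt_candidates(p, 0.05, 0.02)
achievable = [b for b in candidates if in_front(b, p, 0.0)]
```

For these values the two candidates are roughly $-0.107$ rad and $3.05$ rad; only the first one keeps the point in front of the camera.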

3 Event/background separation

In the previous section we described the geometric and mechanical coupling allowing the active camera to rotate such that an event detected and tracked with the static camera may be visualized at a higher resolution. In order to be able to analyse this event in more detail, one must properly isolate it from the background.

In the past, event/background separation has been mainly addressed with static cameras. When a camera moves, the problem is more difficult because one has to distinguish between camera motion (egomotion) and event motion. Nevertheless, whenever the camera undergoes a pure rotational motion, i.e., when the center of projection lies on the axis of rotation, it is possible to separate egomotion from event motion by assuming that the background pixels are transformed from one image to another by a 2-D projective mapping [10].

The motion of the pan and tilt camera is described by eq. (10). In general, this does not guarantee that the camera undergoes a pure rotation around its center of projection because of the mechanical offsets. In practice the latter are small compared to the distance from the camera to the background and therefore the background may well be viewed as a plane at infinity [10].

Let $\mathbf{m}_2^{t-1}$ and $\mathbf{m}_2^{t}$ describe the homogeneous coordinates of an image point at times $t - 1$ and $t$. The subscript 2 indicates that we deal with the active camera. One can apply equations (3) and (4) to the active camera and assume that the translational part of the motion is null. We obtain the following well known formula for cameras undergoing pure


where $\mathbf{R}_2^{t,t-1}\mathbf{R}_1^{t,t-1}$ models the rotation of the active camera. We denote this mapping by:

$$\mathbf{H}^{t,t-1} = \mathbf{K}_2\,\mathbf{R}_2^{t,t-1}\,\mathbf{R}_1^{t,t-1}\,\mathbf{K}_2^{-1} \tag{24}$$
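The mapping of eq. (24) can be sketched in a few lines of Python. The example below builds the homography induced by a small rotation about the vertical axis and applies it to the principal point; the intrinsic values, the rotation angle, and the helper names are hypothetical, and the rotation passed in is taken to be the relative rotation $\mathbf{R}_2^{t,t-1}\mathbf{R}_1^{t,t-1}$.

```python
import math


def matmul3(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def intrinsics(f, k, uc, vc):
    """Matrix K_2 with the parameterization of section 2.1."""
    return [[k * f, 0.0, uc], [0.0, f, vc], [0.0, 0.0, 1.0]]


def intrinsics_inv(f, k, uc, vc):
    """Closed-form inverse of K_2."""
    return [[1.0 / (k * f), 0.0, -uc / (k * f)],
            [0.0, 1.0 / f, -vc / f],
            [0.0, 0.0, 1.0]]


def rotation_homography(R, f, k, uc, vc):
    """Eq. (24): H = K_2 R K_2^{-1} for a relative rotation R."""
    return matmul3(matmul3(intrinsics(f, k, uc, vc), R),
                   intrinsics_inv(f, k, uc, vc))


def warp(H, u, v):
    """Apply H to pixel (u, v) and return Euclidean coordinates."""
    m = [H[i][0] * u + H[i][1] * v + H[i][2] for i in range(3)]
    return m[0] / m[2], m[1] / m[2]


# Hypothetical intrinsics and a 0.01 rad rotation about the y-axis.
angle = 0.01
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
H = rotation_homography(R, f=1000.0, k=1.0, uc=320.0, vc=240.0)
u_new, v_new = warp(H, 320.0, 240.0)
```

With these values the principal point shifts horizontally by $kf\tan(0.01) \approx 10$ pixels and keeps its vertical coordinate, as expected for a rotation about the vertical axis.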

and the problem is to estimate the $3 \times 3$ matrices $\mathbf{H}$ as the camera rotates. The relationship between $\mathbf{m}_2^{t-1}$ and $\mathbf{m}_2^{t}$ above is valid for static scene points. In the past this was used in combination with an outlier rejection technique in order to segment the image into two layers: a static layer corresponding to a static background and a dynamic layer corresponding to moving objects (a foreground). However, such a strategy is generally based on robust statistical methods applied to a single rotating camera.

With the two-camera configuration being used here, the segmentation algorithm is greatly simplified. Indeed, moving objects are detected as events in the image associated with the static camera. The camera coupling makes it possible to predict the main event under investigation, to place the second camera, and to adjust its settings, such that this event is centered with respect to the active camera coordinate frame. Therefore, a major advantage associated with this two-camera configuration is that a robust statistical method is not required. This is best shown in Figure 2.

The separation between an event and its background is therefore based on (i) aligning the images based on the static background and (ii) comparing them, pixel by pixel. The event detection, performed with the low-resolution static image, bootstraps this process.

From now on we consider the images associated with the active camera and we assume that these images are segmented into two regions: foreground $F$ and background $B$. In order to find the homography which aligns the backgrounds between times $t$ and $t - 1$, the following error must be minimized (for the sake of simplicity we drop the subscript 2):

$$E_{\min} = \min_{h_i} \sum_{\mathbf{m} \in B} \left\| I^{t-1}(\Psi(\mathbf{m}^{t-1})) - I^{t}(\Psi(\mathbf{H}^{t,t-1}\mathbf{m}^{t-1})) \right\|^2 \tag{25}$$

The function $\Psi()$ denotes the non-linear mapping from homogeneous to Euclidean coordinates of $\mathbf{m}$, $\Psi(m_1, m_2, m_3) = (m_1/m_3,\ m_2/m_3)^\top$. Various methods were developed in the past for solving this non-linear minimization problem [11], [21], [1].

Once such a homography is estimated, it optimally aligns the backgrounds. The statistics associated with the actual minimization result ($E_{\min}$) allow one to associate a probability of background with each pixel. These statistics can be improved if a background image is incrementally built as is done in [1]. Eventually one may use a decision rule in order to decide whether a pixel belongs to the background or to the foreground [7]. In practice such an approach will not perform as well as expected simply because background and foreground image regions may have similar grey-level or color values.

Therefore, to further refine the foreground area we proceed by pixel-to-pixel comparison between three images at times $t - 2$, $t - 1$, and $t$. The difference between two pixels corresponding to two aligned images writes:

$$D^{t,t-1}(\Psi(\mathbf{m}^{t-1})) = I^{t-1}(\Psi(\mathbf{m}^{t-1})) - I^{t}(\Psi(\mathbf{H}^{t,t-1}\mathbf{m}^{t-1}))$$


There is a similar expression for $D^{t-1,t-2}(\Psi(\mathbf{m}^{t-2}))$, where the mapping $\mathbf{m}^{t-1} = \mathbf{H}^{t-1,t-2}\mathbf{m}^{t-2}$ holds for the background. As already mentioned, the statistics associated with the minimization of eq. (25) allow the estimation of a threshold $s$ such that the following simple decision rule is used: a pixel $\mathbf{m}^{t}$ belongs to the foreground if:

$$D^{t,t-1}(\Psi(\mathbf{m}^{t-1})) \geq s \quad \text{and} \quad D^{t-1,t-2}(\Psi(\mathbf{m}^{t-2})) \geq s$$
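The decision rule above amounts to a double image difference. The Python sketch below applies it to three grey-level frames that are assumed to have already been warped into a common frame with the estimated homographies. Taking the absolute value of the differences is an assumption of this example (the rule is stated on $D$ itself), and the function name and toy frames are hypothetical.

```python
def foreground_mask(frame_t2, frame_t1, frame_t, s):
    """Mark a pixel as foreground when both aligned differences,
    (t-1 vs t) and (t-2 vs t-1), reach the threshold s.
    Frames are lists of rows of grey-levels, already aligned."""
    rows, cols = len(frame_t), len(frame_t[0])
    return [[abs(frame_t1[r][c] - frame_t[r][c]) >= s
             and abs(frame_t2[r][c] - frame_t1[r][c]) >= s
             for c in range(cols)]
            for r in range(rows)]


# Toy 1x4 frames: a bright object moves one pixel to the right per frame
# over a uniform background of grey-level 10.
frame_t2 = [[80, 10, 10, 10]]
frame_t1 = [[10, 80, 10, 10]]
frame_t = [[10, 10, 80, 10]]
mask = foreground_mask(frame_t2, frame_t1, frame_t, s=30)
```

Only the pixel occupied by the object in the middle frame satisfies both conditions, which is why three frames are needed: a single difference would also flag the positions the object has just left or is about to occupy.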

4 Methodology, implementation, and experiments

High-quality pan-tilt cameras available today can achieve a precision of about $0.05^\circ$ in pan and tilt. The precision to be reached in practice, using a calibrated camera setup such as the one described in this section, is of the order of $0.1^\circ$. Consider for example a field of view with an aperture angle of about $2^\circ$. At 100 meters the width of the field of view is 3.5 meters and therefore the precision is of the order of 0.2 meters. This is sufficient to gaze and zoom onto a football player, onto a bicycle, onto a pedestrian, or onto a car in a typical traffic scenario. This overall precision of $0.1^\circ$ can be achieved only if the system is properly calibrated.
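The figures quoted above follow from elementary trigonometry. A minimal Python check is given below; the function names are illustrative only.

```python
import math


def fov_width(aperture_deg, distance_m):
    """Width of the field of view at a given distance."""
    return 2.0 * distance_m * math.tan(math.radians(aperture_deg) / 2.0)


def lateral_error(angle_error_deg, distance_m):
    """Lateral displacement caused by an angular error at a given distance."""
    return distance_m * math.tan(math.radians(angle_error_deg))


width = fov_width(2.0, 100.0)          # about 3.49 m
error = lateral_error(0.1, 100.0)      # about 0.17 m
```

An angular precision of $0.1^\circ$ therefore corresponds to about one twentieth of the field of view at this zoom setting.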

Another important ingredient of such a two-camera device is the control of the active camera such that it continuously looks towards the object of interest and maintains its gaze such that this object appears near its image center, even if the object's appearance changes, if its depth changes, and/or if the object is partially occluded. This process requires three steps: off-line calibration, initialization, and gaze control.

The two-camera visual attention system proceeds as follows:

• Off-line calibration: see section 4.1.

• Initialization:

  - A scene object is detected and tracked over time (automatically, semi-automatically, or manually) using the static camera;

  - The active camera rotates such that this scene object falls within its field of view and the depth of this scene object is estimated, see section 4.2;

  - Pan and tilt values are estimated from scratch by solving a set of three polynomials associated with the inverse kinematics of the pan-and-tilt device and the active camera is rotated accordingly;

• Gaze control:

  - The pan and tilt angle values estimated at time $t - 1$ are used as initial guesses to find their values at time $t$. Notice that the depth information is maintained constant and the consequences of this choice are explained below.

  - Images at times $t - 1$, $t$, and $t + 1$ are used to separate the moving object from the background.

Notice that during the gaze-control stage of the algorithm the depth associated with the scene object is not updated: instead, its previously estimated value (during the initialization stage) is used. As a consequence, the object will not appear at the image center of


The camera cooperation method described in this paper works effectively in practice only if the geometric and kinematic parameters of the two-camera device are properly estimated. This is performed by the following steps:

1. Intrinsic camera calibration. The intrinsic parameters of both cameras, i.e., $\mathbf{K}_1$ and $\mathbf{K}_2$ in eqs. (3), (4), are calibrated using a classical camera calibration process as described in detail in [19].

2. Stereo calibration. When the active camera is in its docking or zero-reference position, the two cameras may be viewed as a standard stereoscopic pair characterized either by the rotation and translation between the two camera frames (stereo calibration) or by the epipolar geometry (weak stereo calibration). The method described in [20] allows for an accurate stereo calibration by moving a 3-D pattern in front of the cameras. Eventually, the matrix $\mathbf{R}_0$ and the vector $\mathbf{t}_0$ characterizing the camera setup in its docking position are evaluated.

3. Kinematic calibration. The active camera is mounted onto a pan and tilt device (two coupled rotational motions). Kinematic calibration consists in estimating the tangent operators associated with these constrained motions, i.e., $\hat{\mathbf{Q}}_1$ in eq. (11). The pan-tilt kinematic model is formally described in appendix A. The kinematic calibration procedure is described in detail in appendix C.

4.2 Depth estimation

The method described in sections 2.2 and 2.3 returns a unique set of values for the pan and tilt angles provided that an estimation of the depth from the static camera to a scene object, $\lambda_1$, is available. In this section we describe a practical technique for estimating the depth to a scene object. This involves the following steps:

1. Detect this object in the static image and locate its center, say $\mathbf{m}_1$;

2. Control the active camera such that it looks in the right direction and therefore the epipolar line associated with $\mathbf{m}_1$ is visible in its image, and

3. Search along this epipolar line in order to find the best match of $\mathbf{m}_1$, say $\mathbf{m}_2$, and estimate the depth to the scene object.

Let us suppose that this object is detected and located in the fixed image and let $\mathbf{m}_1$ with camera coordinates $\mathbf{n}_1$ be the image of its center. The scene object lies somewhere along the line of sight associated with this image point, see Figure 3.


Let $\lambda_{\min}$ and $\lambda_{\max}$ be the minimum and maximum expected depth values along this line of sight such that $\lambda_{\min} \leq \lambda_1 \leq \lambda_{\max}$. We associate two points with these depth values, $M_{\min}$ and $M_{\max}$. They project onto the active camera's image plane at $\mathbf{m}_{\min}$ and $\mathbf{m}_{\max}$. These image-plane points lie on the epipolar line associated with $\mathbf{m}_1$. We seek a position, an orientation, and a focal length for the active camera such that the epipolar line-segment between $\mathbf{m}_{\min}$ and $\mathbf{m}_{\max}$ is actually visible in the image.

We onstrain this epipolarline-segment to be a horizontal image linepassing through

the image enter, i.e., the oordinates of

m

min

and

m

max

verify:

n

min

= (c, 0, 1)

⊤

and

n

max

= (−c, 0, 1)

⊤

, where

2c

orresponds to the image width. The image oordinates of these points verify eq.(6):

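$$
\begin{aligned}
c &= f\,\frac{(\lambda_{\min} R\, n_1 + t)_1}{(\lambda_{\min} R\, n_1 + t)_3}, &
0 &= f\,\frac{(\lambda_{\min} R\, n_1 + t)_2}{(\lambda_{\min} R\, n_1 + t)_3}, \\
-c &= f\,\frac{(\lambda_{\max} R\, n_1 + t)_1}{(\lambda_{\max} R\, n_1 + t)_3}, &
0 &= f\,\frac{(\lambda_{\max} R\, n_1 + t)_2}{(\lambda_{\max} R\, n_1 + t)_3}.
\end{aligned}
$$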
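The four constraints can be evaluated numerically for any candidate pose of the active camera. The following is a minimal sketch, not the authors' implementation: the function names `project` and `alignment_residuals` are hypothetical, and the pose $(R, t)$ and the focal length $f$ are passed in directly instead of being derived from the pan-tilt parameterization.

```python
import numpy as np

def project(R, t, f, n1, lam):
    """Project the point at depth lam along the line of sight n1
    (static-camera coordinates) into the active camera's image."""
    M = lam * (R @ n1) + t          # point expressed in the active-camera frame
    return f * M[0] / M[2], f * M[1] / M[2]

def alignment_residuals(R, t, f, n1, lam_min, lam_max, c):
    """Residuals of the four constraints: the segment endpoints must
    land at (c, 0) and (-c, 0) in the active image."""
    x_min, y_min = project(R, t, f, n1, lam_min)
    x_max, y_max = project(R, t, f, n1, lam_max)
    return np.array([x_min - c, y_min, x_max + c, y_max])
```

All four residuals vanish exactly when the epipolar line-segment between $m_{\min}$ and $m_{\max}$ is horizontal, centered, and spans the image width.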

In order to solve these equations and estimate $R$, $t$, and $f$, we recall that the rotation matrix and the translation vector can be parameterized by the pan and tilt angles $\alpha$, $\beta$ and by the stereo calibration parameters $R_0$ and $t_0$, e.g., eqs. (17) and (18). Nevertheless, this parameterization does not allow proper alignment because the active camera cannot rotate around its optical axis. For this reason we introduce a third rotation allowing a virtual rotation of the active camera around its z-axis, $R_3(\gamma)$.

Therefore, there are four equations in four unknowns, $f$, $\alpha$, $\beta$, and $\gamma$. A solution can be found using Newton's method for solving a set of polynomial equations. Notice that for each point-to-point correspondence and for a given depth value, there is a unique solution in $\alpha$ and $\beta$. Hence, one can use the known triplets $(n_1, n_{\min}, \lambda_{\min})$ and $(n_1, n_{\max}, \lambda_{\max})$ to find initial values for the pan and tilt angles and guarantee that the active camera gazes in the right direction.
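For illustration, a square system such as the one above can be solved with a basic Newton iteration. The sketch below is generic and rests on assumptions: `newton_solve` is a hypothetical helper, the Jacobian is approximated by forward differences, and the toy system in the usage note merely stands in for the four alignment equations, whose residual depends on the parameterization of eqs. (17) and (18).

```python
import numpy as np

def newton_solve(residual, x0, tol=1e-10, max_iter=50, eps=1e-7):
    """Newton iteration for a square system residual(x) = 0,
    using a forward-difference approximation of the Jacobian."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = residual(x)
        if np.linalg.norm(r) < tol:
            break
        J = np.empty((r.size, x.size))
        for j in range(x.size):          # one Jacobian column per unknown
            dx = np.zeros_like(x)
            dx[j] = eps
            J[:, j] = (residual(x + dx) - r) / eps
        x = x - np.linalg.solve(J, r)    # Newton update
    return x

# Toy polynomial system (an assumption, used only to exercise the solver):
# x^2 + y^2 = 4 and x*y = 1, started close to one of its roots.
toy = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0] * v[1] - 1.0])
root = newton_solve(toy, [2.0, 0.5])
```

As the text points out, convergence depends on the starting point, which is why the initial pan and tilt values are taken from the known triplets.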

The active camera is controlled to zoom and rotate in order to reach the solution found above, up to a rotation $\gamma$ around its optical axis. Finally, standard stereo techniques are applied in order to find the best match along the epipolar line and to estimate the depth to the scene object.
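Once a match $m_2$ has been found, the depth follows from the relation $\lambda_2 n_2 = \lambda_1 R\, n_1 + t$ between the two lines of sight. The sketch below solves it in the least-squares sense; this is one standard triangulation choice, assumed here for illustration and not necessarily the stereo technique used by the authors. The name `triangulate_depths` is hypothetical, and $n_2$ denotes the camera coordinates of $m_2$.

```python
import numpy as np

def triangulate_depths(R, t, n1, n2):
    """Solve lam2 * n2 = lam1 * (R @ n1) + t for (lam1, lam2)
    in the least-squares sense (three equations, two unknowns)."""
    A = np.column_stack((R @ n1, -n2))
    lam, *_ = np.linalg.lstsq(A, -t, rcond=None)
    return lam[0], lam[1]
```

The first returned value is the depth $\lambda_1$ along the static camera's line of sight, which is the quantity needed to compute the pan and tilt angles.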

4.3 Experiments

A full set of experiments is summarized in Figures 4, 5, 6, 7, 8, and 9. The stereo baseline between the static and active cameras is of the order of 1 meter. The cameras observe an outdoor environment. The frames shown correspond to 8 samples out of a 550-frame image sequence.

In the first example (figures 4, 5, 6) the object of interest is a pedestrian. During the initialization phase, this object is first detected in the image associated with the static camera. Given minimum and maximum depth estimates (from the static camera to that person), the active camera rotates and zooms such that the person falls within its field of view. Since the camera couple is calibrated, it is possible to predict an epipolar line, to search for a match along this line, and to estimate the depth from the pedestrian to the

The pan and tilt values that place the person's center of gravity at the image center are estimated, and the active camera's mechanism is controlled to actually place the person at its center. A region of interest is defined around the moving object. Notice however that the pedestrian is not displayed at the image center. This is because there is an error in the depth estimate: the pan and tilt values are computed from a depth estimate that differs from the true depth value.
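The effect of a depth error on centering can be quantified with a few lines of geometry. The sketch below is illustrative only: `gaze_error_deg` is a hypothetical helper, and the numbers in the usage lines (identity rotation, 1 m baseline along the x-axis, true depth 20 m, estimated depth 25 m) are assumptions, not measurements from the experiments.

```python
import numpy as np

def gaze_error_deg(n1, lam_true, lam_est, R, t):
    """Angle in degrees, seen from the active camera, between the
    direction to the true point and the direction to the point
    obtained with an erroneous depth along the same line of sight."""
    d_true = lam_true * (R @ n1) + t
    d_est = lam_est * (R @ n1) + t
    cosang = d_true @ d_est / (np.linalg.norm(d_true) * np.linalg.norm(d_est))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

# Assumed setup: 1 m baseline, object straight ahead of the static camera.
err = gaze_error_deg(np.array([0.0, 0.0, 1.0]), 20.0, 25.0,
                     np.eye(3), np.array([-1.0, 0.0, 0.0]))
```

With these assumed values the offset is roughly half a degree, which is visible in the image when the field of view of the active camera is only a few degrees wide.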

It is worthwhile to notice the behaviour of the system in the presence of occlusions and of missing data. The pedestrian is first occluded by a car, then appears and then walks outside the field of view of the camera, turns, and comes back. In spite of these disturbances the gaze of the active camera is correctly controlled. Whenever the object disappears from the field of view of the static camera, the active camera tracks the moving object using the event/background separation method outlined above.

In the second example (figures 7, 8, 9) the object of interest is a bicycle rider. Notice that the object is properly tracked in spite of partial occlusions by surrounding objects. In order to assess the quality of homography estimation between consecutive images in the sequence, we removed the foreground pixels and built a foreground image sequence, as shown in Figure 9.

From a more practical point of view, the size of the images is 640 × 480. The focal length of the static camera is 500 pixels. Event/background separation operates on 320 × 240 images. The whole two-camera system runs at 10 frames per second on a 1.7 GHz processor.

5 Conclusion

In this paper we addressed the problem of coupling two cameras in order to achieve visual attention, i.e., controlling a camera to gaze in a selected direction. The first camera is static and it has a wide field of view. Therefore it is able to capture, at low resolution, such events as moving objects. The second camera is mounted onto a rotating device with two degrees of freedom. Moreover it has a narrow field of view of the order of 2 degrees. Therefore it is able to provide a high-resolution image of a scene object, provided that the latter falls within its field of view.

We analyzed in detail the geometric and kinematic coupling between a static camera and a rotating camera. We derived a solution for this coupling both for a general kinematic mechanism and for a simpler pan-tilt model. We showed that under the practical setup that we used, there is a unique solution allowing to rotate the camera such that it gazes

depth.

Once the object of interest lies along the active camera's optical center, a gaze-control loop is activated in order to estimate the camera's rotational degrees of freedom. Moreover, the system is able to use event detection (performed with the static camera) in order to facilitate event/background segmentation performed with a rotating camera.

The camera cooperation principle developed in this paper could easily be generalized to several rotating cameras. Therefore, multiple moving objects detected with the static camera could be handled separately by multiple rotating cameras.

The vast majority of visual surveillance and visual attention systems use a single camera. Cooperation between static and active cameras is an essential step forward, making it possible to rapidly analyse an event at low resolution and to switch to high resolution if further recognition and interpretation steps are necessary.

A The pan-tilt kinematic model

In this appendix we formally define the rotational mechanism associated with the active camera. First we consider the most general kinematic model. We adopt the zero-reference kinematic representation. The angle associated with the tilt rotation is denoted by $\beta$. The angle associated with the pan rotation is denoted by $\alpha$. The kinematic chain is composed of five Euclidean frames and four rigid transformations between these frames, see Figure 10:

• Frame #5 is attached to a fixture; it is equivalent to the base of the device;

• Frame #4 is a moving frame rotating around frame #5. This tilt rotation is denoted by $T_1$, which is a $4 \times 4$ homogeneous matrix denoting a rigid transformation;

• Frame #3 is rigidly attached to frame #4 through the fixed transformation $L_1$;

• Frame #2 is a moving frame rotating around frame #3. This pan rotation is denoted by $T_2$;

• Frame #1, or the camera frame, is rigidly attached to frame #2 through the fixed transformation $L_2$.

The coordinates of the physical point $M$ (observed by the camera) can be written in camera coordinates, $M^{(1)}$, as well as in fixture coordinates, $M^{(5)}$. Obviously we have:

$$
M^{(1)}(\alpha, \beta) = L_2\, T_2(\alpha)\, L_1\, T_1(\beta)\, M^{(5)} \tag{27}
$$

The same formula holds for a docking position which is referred to as the zero-reference and which is characterized by fixed values for the two angles:

$$
M^{(1)}(\alpha_0, \beta_0) = L_2\, T_2(\alpha_0)\, L_1\, T_1(\beta_0)\, M^{(5)}
$$

By eliminating $M^{(5)}$ between these two equations and by properly adding some dummy transformations, we obtain:

$$
\begin{aligned}
M^{(1)}(\alpha, \beta) &= L_2 T_2(\alpha) T_2^{-1}(\alpha_0) L_2^{-1}\; L_2 T_2(\alpha_0) L_1 T_1(\beta) T_1^{-1}(\beta_0) L_1^{-1} T_2^{-1}(\alpha_0) L_2^{-1}\; M^{(1)}(\alpha_0, \beta_0) \\
&= \underbrace{L_2 T_2(\alpha - \alpha_0) L_2^{-1}}_{Q_2}\; \underbrace{L_2 T_2(\alpha_0) L_1 T_1(\beta - \beta_0) L_1^{-1} T_2(-\alpha_0) L_2^{-1}}_{Q_1}\; M^{(1)}(\alpha_0, \beta_0)
\end{aligned}
$$

This is the zero-reference kinematic model of the active camera, i.e., eq. (10):

$$
M^{(1)}(\alpha, \beta) = Q_2(\alpha, \alpha_0)\, Q_1(\beta, \beta_0, \alpha_0)\, M^{(1)}(\alpha_0, \beta_0) \tag{29}
$$

The reference frames have been appropriately defined such that (without loss of generality) the transformations $T_1$ and $T_2$ can be written in a canonical form, i.e., rotation around the local z-axis:

$$
T_1(\beta) =
\begin{pmatrix}
\cos\beta & -\sin\beta & 0 & 0 \\
\sin\beta & \cos\beta & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} \tag{30}
$$

These matrices form a one-dimensional Lie group with $T_1^{-1}(\beta) = T_1(-\beta)$. Therefore, from the equations above we obtain the following expressions for $Q_2$ and $Q_1$:

$$
Q_2(\alpha, \alpha_0) = L_2\, T_2(\alpha - \alpha_0)\, L_2^{-1} \tag{31}
$$

$$
Q_1(\beta, \beta_0, \alpha_0) = \underbrace{L_2 T_2(\alpha_0) L_1}_{U_1}\; T_1(\beta - \beta_0)\; \underbrace{L_1^{-1} T_2^{-1}(\alpha_0) L_2^{-1}}_{U_1^{-1}} \tag{32}
$$

Since matrices $Q_i$ and $T_i$ are related by similarity transformations, it follows that both $Q_1$ and $Q_2$ form one-dimensional Lie groups as well. It is well known [13] that these groups can be parameterized using their Lie algebra and their angle of rotation, i.e., eq. (11).
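The zero-reference model can be checked numerically against the direct chain of eq. (27). The sketch below is a minimal illustration under stated assumptions: the helper names (`rot_z`, `rot_x`, `rigid`, `forward`, `Q2`, `Q1`) are hypothetical, and $L_1$, $L_2$ may be any rigid transformations, since the identity of eq. (29) holds for arbitrary fixed links.

```python
import numpy as np

def rot_z(theta):
    """4x4 homogeneous rotation around the local z-axis, as in eq. (30)."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T

def rot_x(theta):
    """4x4 homogeneous rotation around the x-axis (used to build test links)."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[1:3, 1:3] = [[c, -s], [s, c]]
    return T

def rigid(ax, az, t):
    """Arbitrary rigid transformation standing in for L1 or L2."""
    T = rot_x(ax) @ rot_z(az)
    T[:3, 3] = t
    return T

def forward(L1, L2, alpha, beta):
    """Fixture-to-camera transformation L2 T2(alpha) L1 T1(beta), eq. (27)."""
    return L2 @ rot_z(alpha) @ L1 @ rot_z(beta)

def Q2(L2, alpha, alpha0):
    """Pan motion relative to the zero-reference, eq. (31)."""
    return L2 @ rot_z(alpha - alpha0) @ np.linalg.inv(L2)

def Q1(L1, L2, beta, beta0, alpha0):
    """Tilt motion relative to the zero-reference, eq. (32)."""
    U1 = L2 @ rot_z(alpha0) @ L1
    return U1 @ rot_z(beta - beta0) @ np.linalg.inv(U1)
```

For any fixture point $M^{(5)}$, `forward(L1, L2, alpha, beta) @ M5` should agree with `Q2(...) @ Q1(...) @ forward(L1, L2, alpha0, beta0) @ M5`, which is eq. (29).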

B Simple pan-tilt model

In the case of a simplified model it is assumed that the pan and tilt axes are mutually

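We obtain:

$$
Q_2 =
\begin{pmatrix}
1 & 0 & 0 & t^2_1 \\
0 & \cos(\alpha - \alpha_0) & -\sin(\alpha - \alpha_0) & t^2_2 \\
0 & \sin(\alpha - \alpha_0) & \cos(\alpha - \alpha_0) & t^2_3 \\
0 & 0 & 0 & 1
\end{pmatrix} \tag{35}
$$

$$
Q_1 = U_1
\begin{pmatrix}
\cos(\beta - \beta_0) & -\sin(\beta - \beta_0) & 0 & 0 \\
\sin(\beta - \beta_0) & \cos(\beta - \beta_0) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
U_1^{-1} \tag{36}
$$

with:

$$
U_1 =
\begin{pmatrix}
0 & 1 & 0 & l^1_3 + l^2_1 \\
-\sin\alpha_0 & 0 & \cos\alpha_0 & l^1_1 \cos\alpha_0 - l^1_2 \sin\alpha_0 + l^2_2 \\
\cos\alpha_0 & 0 & \sin\alpha_0 & l^1_1 \sin\alpha_0 + l^1_2 \cos\alpha_0 + l^2_3 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$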
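The closed form of $U_1$ can be cross-checked against its definition $U_1 = L_2 T_2(\alpha_0) L_1$ from eq. (32). The sketch below rests on an assumption: both $L_1$ and $L_2$ are taken to have the axis permutation `P` as rotation part and translations $l^1$ and $l^2$. This choice is made here only because it reproduces the printed matrix; the helper names are hypothetical.

```python
import numpy as np

# Assumed rotation part of L1 and L2: maps the local z-axis onto the x-axis.
P = np.array([[0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

def rigid(Rm, t):
    """4x4 homogeneous matrix from a rotation and a translation."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = Rm, t
    return T

def rot_z(theta):
    """4x4 homogeneous rotation around the local z-axis, as in eq. (30)."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T

def U1_product(alpha0, l1, l2):
    """U1 = L2 T2(alpha0) L1 under the assumed links."""
    return rigid(P, l2) @ rot_z(alpha0) @ rigid(P, l1)

def U1_closed_form(alpha0, l1, l2):
    """U1 written out entry by entry, following the matrix above."""
    c, s = np.cos(alpha0), np.sin(alpha0)
    return np.array([
        [0.0, 1.0, 0.0, l1[2] + l2[0]],
        [-s, 0.0, c, l1[0] * c - l1[1] * s + l2[1]],
        [c, 0.0, s, l1[0] * s + l1[1] * c + l2[2]],
        [0.0, 0.0, 0.0, 1.0]])
```

Under the same assumption, $L_2 T_2(\alpha - \alpha_0) L_2^{-1}$ has the rotation part of eq. (35), i.e., a rotation around the x-axis.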





C Kinematic calibration

Kinematic calibration consists in estimating the Lie algebras associated with the matrices $Q_1$ and $Q_2$ formally defined in appendix A. Each one of these matrices forms a one-parameter Lie group such that $Q_1(\beta_1)\, Q_1(\beta_2) = Q_1(\beta_1 + \beta_2)$. Moreover, once a reference frame is chosen, the tangent operator (or the Lie algebra) remains fixed. Therefore, the kinematic calibration process consists in finding a numerical estimate of $\hat{Q}_1$ and of $\hat{Q}_2$, i.e., eq. (14). For that purpose we consider again eq. (29). Notice that the transformation from position $\alpha_1, \beta_1$ to position $\alpha_2, \beta_2$ writes:

$$
Q_{\alpha_1 \to \alpha_2,\, \beta_1 \to \beta_2} = Q_2(\alpha_2)\, Q_1(\beta_2 - \beta_1)\, Q_2(-\alpha_1)
$$

Let the pan-tilt device perform two one-parameter motions: a motion from $\alpha_1$ to $\alpha_2$ and another motion from $\beta_1$ to $\beta_2$. From the equation above we obtain:

$$
Q_2(\alpha_2 - \alpha_1) = Q_{\alpha_1 \to \alpha_2,\, \beta_1} \tag{37}
$$

$$
Q_1(\beta_2 - \beta_1) = Q_2(-\alpha_1)\, Q_{\alpha_1,\, \beta_1 \to \beta_2}\, Q_2(\alpha_1) \tag{38}
$$

In practice the kinematic calibration proceeds as follows:

Step 1: Move the device to the $\alpha_1, \beta_1$ position;

Step 2: Using camera calibration tools, estimate the external camera parameters, i.e., the position and orientation of the camera frame with respect to a calibration fixture expressed as a rigid transformation $T(\alpha_1, \beta_1)$;

Step 3: Move the device to the $\alpha_2, \beta_1$ position;

Step 6: Repeat Step 2 for this position and estimate $T(\alpha_1, \beta_2)$;

Step 7: Compute $Q_{\alpha_1 \to \alpha_2,\, \beta_1} = T(\alpha_2, \beta_1)\, T(\alpha_1, \beta_1)^{-1}$;

Step 8: Compute $\hat{Q}_2$ from $Q_2(\alpha_2 - \alpha_1)$ using eq. (14);

Step 9: Compute $Q_{\alpha_1,\, \beta_1 \to \beta_2} = T(\alpha_1, \beta_2)\, T(\alpha_1, \beta_1)^{-1}$;

Step 10: Compute $Q_2(\alpha_1)$, $Q_2(-\alpha_1)$, and $Q_1(\beta_2 - \beta_1)$, and

Step 11: Compute $\hat{Q}_1$ from $Q_1(\beta_2 - \beta_1)$ using eq. (14).
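The computational steps (7 to 11) can be sketched as follows. This is an illustration under assumptions, not the authors' code: eq. (14) is assumed to amount to the one-parameter form $Q(\theta) = \exp(\theta \hat{Q})$, so that $\hat{Q}$ is recovered with a matrix logarithm divided by the known angle increment. The helpers `se3_exp`, `se3_log`, and `calibrate` are hypothetical names, and the three poses $T(\alpha_1, \beta_1)$, $T(\alpha_2, \beta_1)$, $T(\alpha_1, \beta_2)$ are taken as given inputs.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def _coeffs(phi):
    # Closed-form series coefficients; valid for phi != 0.
    return (np.sin(phi) / phi, (1 - np.cos(phi)) / phi**2,
            (phi - np.sin(phi)) / phi**3)

def se3_exp(X):
    """Exponential of a 4x4 twist matrix [[W, v], [0, 0]]."""
    W, v = X[:3, :3], X[:3, 3]
    phi = np.sqrt(0.5 * np.sum(W * W))      # rotation angle encoded in W
    Q = np.eye(4)
    if phi < 1e-12:                          # pure translation
        Q[:3, 3] = v
        return Q
    a, b, c = _coeffs(phi)
    Q[:3, :3] = np.eye(3) + a * W + b * (W @ W)
    Q[:3, 3] = (np.eye(3) + b * W + c * (W @ W)) @ v
    return Q

def se3_log(Q):
    """Logarithm of a rigid transformation, rotation angle in (0, pi)."""
    R, t = Q[:3, :3], Q[:3, 3]
    phi = np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0))
    W = phi / (2 * np.sin(phi)) * (R - R.T)
    _, b, c = _coeffs(phi)
    X = np.zeros((4, 4))
    X[:3, :3] = W
    X[:3, 3] = np.linalg.solve(np.eye(3) + b * W + c * (W @ W), t)
    return X

def calibrate(T11, T21, T12, a1, a2, b1, b2):
    """Estimate (Q1_hat, Q2_hat) from the poses T(a1,b1), T(a2,b1), T(a1,b2)."""
    Q_pan = T21 @ np.linalg.inv(T11)                 # Step 7, eq. (37)
    Q2_hat = se3_log(Q_pan) / (a2 - a1)              # Step 8 (assumed eq. (14))
    Q_tilt = T12 @ np.linalg.inv(T11)                # Step 9
    Q2_a1 = se3_exp(a1 * Q2_hat)                     # Step 10
    Q1_db = np.linalg.inv(Q2_a1) @ Q_tilt @ Q2_a1    # eq. (38)
    Q1_hat = se3_log(Q1_db) / (b2 - b1)              # Step 11
    return Q1_hat, Q2_hat
```

The closed-form logarithm used here requires angle increments strictly between 0 and $\pi$ in magnitude, which is a mild restriction for a pan-tilt calibration motion.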

References

[1] A. Bartoli, N. Dalal, and R. Horaud. Motion panoramas. Computer Animation and Virtual Worlds, 15(6):501–517, November 2004.

[2] J. Batista, P. Peixoto, and H. Araujo. Real-time active visual surveillance by integrating peripheral motion detection with foveated tracking. In IEEE Workshop on Visual Surveillance, Mumbai, India, 1998.

[3] D. Coombs and C. Brown. Real-time binocular smooth pursuit. International Journal of Computer Vision, 11(2):147–164, October 1993.

[4] D. Cox, J. Little, and D. O'Shea. Using Algebraic Geometry. Springer, 1998.

[5] A. Crétual and F. Chaumette. Application of motion-based visual servoing to target tracking. Int. Journal of Robotics Research, 20(11):878–890, November 2001.

[6] K. Daniilidis, C. Krauss, M. Hansen, and G. Sommer. Real time tracking of moving objects with an active camera. Real-Time Imaging, 4(1):3–20, February 1998.

[7] F. Dufaux and F. Moscheni. Background mosaicking for low bit rate video coding. In Proceedings IEEE International Conference on Image Processing, volume 1, pages 673–676, Lausanne, Switzerland, September 1996.

[8] J. A. Fayman, O. Sudarsky, E. Rivlin, and M. Rudzsky. Zoom tracking and its applications. Machine Vision and Applications, 13(1):25–37, August 2001.

[9] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, Cambridge, UK, 2000.

[10] R. I. Hartley. Self-calibration from multiple views with a rotating camera. In Proc. Third European Conference on Computer Vision, pages 471–478, Stockholm, Sweden, May 1994.

[11] M. Irani and P. Anandan. About direct methods. In B. Triggs, A. Zisserman, and R. Szeliski, editors, Vision Algorithms: Theory and Practice, number 1883 in LNCS,

[Figure 1 diagram: the fixed camera (1) and the active camera (2), shown in its general and in its zero-reference pan-tilt positions, with the scene point $M$, the image points $m_1$ and $m_2$, and the depths $\lambda_1$ and $\lambda_2$.]

Figure 1: The active camera has a docking or a zero-reference position. Both the stereo

[Figure 2 diagram: fixed camera, active camera, event, foreground and background regions.]

Figure 2: The coupling between the cameras allows one to associate foreground and background regions with the active camera's image. The event, which is predicted in the static camera at low resolution, must lie in the foreground region associated with the active

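[Figure 3 diagram: static camera and active camera, with the image point $m_1$, the scene point $M$, the depth range from $\lambda_{\min}$ to $\lambda_{\max}$, and the pan, tilt, and yaw rotations.]

Figure 3: In order to estimate the depth to the point $M$, the active camera must see this point. The degrees of freedom of the active camera (pan, tilt, yaw, and focal length) are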

the static camera. The second frame shows the trajectory of the moving person.

Figure 5: The output of the active camera after gaze control.

Figure 8: The result of foreground detection applied to the second example.

Figure 9: The foreground pixels were removed from the image sequence and replaced by

[Figure 10 diagram: frames #1 to #5 linked by $L_2$ (coordinate change), $T_2$ (pan rotation, angle $\alpha$), $L_1$ (coordinate change), and $T_1$ (tilt rotation, angle $\beta$), with the scene point $M$, its image $m_2$, and the x, y, z axes.]

Figure 10: This figure shows a general pan-tilt mechanical model which attaches a camera (frame #1) to a rigid fixture (frame #5). Estimating the pan and tilt angles such that a

Journal of Robotics Research, 21(2):97–113, February 2002.

[13] J. M. McCarthy. Introduction to Theoretical Kinematics. MIT Press, Cambridge, 1990.

[14] B. W. Mooring, Z. S. Roth, and M. R. Driels. Fundamentals of Manipulator Calibration. John Wiley & Sons, 1991.

[15] D. Murray and A. Basu. Motion tracking with an active camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(5):449–459, May 1994.

[16] D. W. Murray, K. J. Bradshaw, P. F. McLauchlan, I. D. Reid, and P. M. Sharkey. Driving saccade to pursuit using image motion. International Journal of Computer Vision, 16(3):205–228, November 1995.

[17] R. M. Murray, Z. Li, and S. S. Sastry. A Mathematical Introduction to Robotic Manipulation. CRC Press, Ann Arbor, 1994.

[18] P. Peixoto, J. Batista, and H. Araujo. Integration of information from several vision systems for a common task of surveillance. In 6th Int. Workshop on Intelligent Robotics Systems, Edinburgh, UK, 1998.

[19] Matthieu Personnaz and Radu Horaud. Camera calibration: estimation, validation and software. Technical Report RT-0258, INRIA Rhône-Alpes, Grenoble, March 2002.

[20] Matthieu Personnaz and Peter Sturm. Calibration of a stereo-vision system by the non-linear optimization of the motion of a calibration object. Technical Report RT-0269, INRIA, September 2002.

[21] H.-Y. Shum and R. Szeliski. Systems and experiment paper: Construction of panoramic mosaics with global and local alignment. International Journal of