Deeper understanding of the homography decomposition for vision-based control


HAL Id: inria-00174036

https://hal.inria.fr/inria-00174036v3

Submitted on 25 Sep 2007


Ezio Malis, Manuel Vargas

To cite this version:

Ezio Malis, Manuel Vargas. Deeper understanding of the homography decomposition for vision-based control. [Research Report] RR-6303, INRIA. 2007, pp.90. ⟨inria-00174036v3⟩


Ezio Malis∗ and Manuel Vargas†

Thème NUM — Systèmes numériques Projet AROBAS

Rapport de recherche n°6303 — Septembre 2007 — 90 pages

Abstract: The displacement of a calibrated camera between two images of a planar object can be estimated by decomposing a homography matrix. The aim of this document is to propose a new method for solving the homography decomposition problem. This new method provides analytical expressions for the solutions of the problem, instead of the traditional numerical procedures. As a result, the translation vector, rotation matrix and object-plane normal are expressed explicitly as functions of the entries of the homography matrix. The main advantage of this method is that it provides a deeper understanding of the homography decomposition problem. For instance, it allows us to obtain the relations among the possible solutions of the problem. Thus, new vision-based robot control laws can be designed. For example, the control schemes proposed in this report combine the two final solutions of the problem (only one of them being the true one), assuming that there is no a priori knowledge for discerning between them.

Key-words: Visual servoing, planar objects, homography, decomposition, camera calibration errors, structure from motion, Euclidean reconstruction.

∗ Ezio Malis is with INRIA Sophia Antipolis

† Manuel Vargas is with the Dpto. de Ingenieria de Sistemas y Automatica in the University of Seville,


1 Introduction

Several methods for vision-based robot control need an estimate of the camera displacement (i.e. rotation and translation) between two views of an object [3, 4, 6, 5]. When the object is a plane, the camera displacement can be extracted (assuming that the intrinsic camera parameters are known) from the homography matrix that can be measured from two views. This process is called homography decomposition. The standard algorithms for homography decomposition obtain numerical solutions using the singular value decomposition of the matrix [1, 11]. It is shown that in the general case there are two possible solutions to the homography decomposition. This numerical decomposition has been sufficient for many computer and robot vision applications. However, when dealing with robot control applications, an analytical procedure to solve the decomposition problem would be preferable (i.e. analytical expressions for the computation of the camera displacement directly in terms of the components of the homography matrix). Indeed, the analytical decomposition allows us to study analytically how the estimated camera pose varies in the presence of camera calibration errors. Thus, we can gain insight into the robustness of vision-based control laws.

The aim of this document is to propose a new method for solving the homography decomposition problem. This new method provides analytical expressions for the solutions of the problem, instead of the traditional numerical procedures. The main advantage of this method is that it provides a deeper understanding of the homography decomposition problem. For instance, it allows us to obtain the relations among the possible solutions of the problem. Thus, new vision-based robot control laws can be designed. For example, the control schemes proposed in this report combine the two final solutions of the problem (only one of them being the true one), assuming that there is no a priori knowledge for discerning between them.

The document is organized as follows. Section 2 provides the theoretical background and introduces the notation that will be used in the report. In Section 3, we briefly recall the standard numerical method for homography decomposition. In Section 4, we describe the proposed analytical decomposition method. In Section 5, we find the relation between the two solutions of the decomposition. In Section 6, we propose a new position-based visual servoing scheme. Next, in Section 7, we propose a new hybrid visual servoing scheme. Finally, Section 8 gives the main conclusions of the report.

2 Theoretical background

2.1 Perspective projection

We consider two different camera frames: the current and desired camera frames, $F$ and $F^*$ in Figure 1, respectively. We assume that the absolute frame coincides with the reference camera frame $F^*$.

Figure 1: Desired and current camera frames and involved notation.

The observed object is a set of n 3D feature points, P, with Cartesian coordinates:

P = (X, Y, Z)

These points can be expressed either in the desired camera frame or in the current one, and are denoted ${}^{d}P$ and ${}^{c}P$, respectively. The homogeneous transformation matrix converting 3D point coordinates from the desired frame to the current frame is:

$$ {}^{c}T_{d} = \begin{bmatrix} {}^{c}R_{d} & {}^{c}t_{d} \\ 0 & 1 \end{bmatrix} $$

where ${}^{c}R_{d}$ and ${}^{c}t_{d}$ are the rotation matrix and translation vector, respectively.

In the figure, the distances from the object plane to the corresponding camera frames are denoted $d^*$ and $d$. The normal $n$ to the plane can also be expressed in the reference or current frame (${}^{d}n$ or ${}^{c}n$, respectively). The camera grabs an image of the object from both the desired and the current configurations. This image acquisition implies the projection of the 3D points onto a plane, so that they all have the same depth from the corresponding camera origin. The normalized projective coordinates of each point are denoted:

$$ m^* = (x^*, y^*, 1) = {}^{d}m ; \qquad m = (x, y, 1) = {}^{c}m $$

for the desired and current camera frames. Finally, we obtain the homogeneous image coordinates $p = (u, v, 1)$, in pixels, of each point using the following transformation:

$$ p = K m $$

where $K$ is the upper triangular matrix containing the camera intrinsic parameters. Throughout the document, we will use an abbreviated notation:

$$ R = {}^{c}R_{d} ; \qquad t = \frac{{}^{c}t_{d}}{d^*} ; \qquad n = {}^{d}n $$

Here the translation vector $t$ is normalized with respect to the plane depth $d^*$. Note also that $t$ and $n$ are not expressed in the same frame.

2.2 The homography matrix

Let $p^* = (u^*, v^*, 1)$ be the $(3 \times 1)$ vector containing the homogeneous coordinates of a point in the reference image and let $p = (u, v, 1)$ be the vector containing the homogeneous coordinates of a point in the current image. The projective homography matrix $G$ transforms one vector into the other, up to a scale factor:

$$ \alpha_g \, p = G \, p^* $$

The projective homography matrix can be measured from the image information by matching several coplanar points. At least 4 points are needed (at least three of them must be non-collinear). This matrix is related to the transformation elements $R$ and $t$ and to the normal of the plane $n$ according to:

$$ G = \gamma \, K \, (R + t \, n^\top) \, K^{-1} \tag{1} $$

where the matrix $K$ is the camera calibration matrix. The homography in the Euclidean space can be computed from the projective homography matrix, using an estimated camera calibration matrix $\hat{K}$:

$$ \hat{H} = \hat{K}^{-1} G \hat{K} \tag{2} $$

In this report, we suppose that there is no uncertainty in the intrinsic camera parameters. We therefore assume that $\hat{K} = K$, so that $\hat{H} = \gamma (R + t \, n^\top)$. In future work, we will study the influence of camera-calibration errors on the Euclidean reconstruction, taking advantage of the analytical decomposition method presented here. The equivalent of the homography matrix $G$ in the Euclidean space is the Euclidean homography matrix $H$. It transforms a 3D point in projective coordinates from one frame to the other, again up to a scale factor:

$$ \alpha_h \, m = H \, m^* $$

where $m^* = (x^*, y^*, 1)$ is the vector containing the normalized projective coordinates of a point viewed from the reference camera pose, and $m = (x, y, 1)$ is the vector containing these normalized projective coordinates when the point is viewed from the current camera pose. This homography matrix is:

$$ H = \frac{\hat{H}}{\gamma} = R + t \, n^\top \tag{3} $$
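The relation $\alpha_h m = H m^*$ can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the pose, the plane, the point and the helper name `rodrigues` are hypothetical values and names chosen only for illustration, not taken from the report.

```python
import numpy as np

def rodrigues(axis, angle):
    """Rotation matrix from an axis and an angle (Rodrigues' formula)."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

# Illustrative (made-up) configuration.
R = rodrigues([1.0, 2.0, 3.0], 0.4)      # rotation cR_d
n = np.array([0.2, -0.3, 1.0])
n = n / np.linalg.norm(n)                # plane normal in the desired frame
d_star = 2.0                             # distance from F* to the plane
c_t_d = np.array([0.6, -0.4, 0.2])       # translation ct_d (not normalized)
t = c_t_d / d_star                       # normalized translation

H = R + np.outer(t, n)                   # Euclidean homography, equation (3)

# A 3D point lying on the plane, in the desired frame: n . P = d*.
P_d = np.array([0.5, -0.3, 1.0])
P_d = P_d * d_star / (n @ P_d)
P_c = R @ P_d + c_t_d                    # the same point in the current frame

m_star = P_d / P_d[2]                    # normalized projective coordinates
m = P_c / P_c[2]
alpha_h = P_c[2] / P_d[2]                # the scale factor is the depth ratio
```

In this sketch the scale factor $\alpha_h$ turns out to be the ratio of the point depths in the two frames, which is why it depends on the point while $H$ does not.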

Notice that $\mathrm{med}(\mathrm{svd}(H)) = 1$. Thus, after obtaining $\hat{H}$, the scale factor $\gamma$ can be computed as follows:

$$ \gamma = \mathrm{med}(\mathrm{svd}(\hat{H})) $$

by solving a third-order equation (see Appendix B).
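As an illustration of this normalization, the sketch below recovers $\gamma$ as the median singular value. It is only a sketch under stated assumptions: it uses NumPy's numerical SVD rather than the closed-form third-order equation of Appendix B, it assumes $\gamma > 0$, and the function name `normalize_homography` and the numerical values are hypothetical.

```python
import numpy as np

def normalize_homography(H_hat):
    """Divide H_hat by its median singular value, so that med(svd(H)) = 1."""
    # np.linalg.svd returns the singular values in descending order.
    singular_values = np.linalg.svd(H_hat, compute_uv=False)
    gamma = singular_values[1]
    return H_hat / gamma, gamma

# Illustrative check with made-up values: R = I, so H = I + t n^T.
t = np.array([0.3, -0.2, 0.1])
n = np.array([0.0, 0.6, 0.8])            # unit normal
H_true = np.eye(3) + np.outer(t, n)
H, gamma = normalize_homography(2.5 * H_true)
```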

The problem of Euclidean homography decomposition, also called Euclidean reconstruction from homography, is that of retrieving the elements $R$, $t$ and $n$ from matrix $H$:

$$ H \Longrightarrow \{R, t, n\} $$

Notice that the translation is estimated up to a positive scalar factor (as $t$ has been normalized with respect to $d^*$).

3 The numerical method for homography decomposition

Before presenting the analytical decomposition method itself, it is convenient to briefly recall the traditional methods based on SVD [1, 11].

3.1 Faugeras SVD-based decomposition

If we perform the singular value decomposition of the homography matrix [1]:

$$ H = U \Lambda V^\top $$

we get the orthogonal matrices $U$ and $V$ and a diagonal matrix $\Lambda$, which contains the singular values of matrix $H$. We can consider this diagonal matrix as a homography matrix as well, and hence apply relation (3) to it:

$$ \Lambda = R_\Lambda + t_\Lambda n_\Lambda^\top \tag{4} $$

Computing the components of the rotation matrix, translation and normal vectors is simple when the matrix being decomposed is diagonal. First, $t_\Lambda$ can be easily eliminated from the three vector equations coming out of (4) (one for each column of this matrix equation). Then, imposing that $R_\Lambda$ is an orthogonal matrix, we can linearly solve for the components of $n_\Lambda$, from a new set of equations relating only these components with the three singular values (see [1] for the detailed development). As a result of the decomposition algorithm, we can get up to 8 different solutions for the triplets $\{R_\Lambda, t_\Lambda, n_\Lambda\}$. Then, once the decomposition of matrix $\Lambda$ is done, the final decomposition elements are computed using the following expressions:

$$ R = U R_\Lambda V^\top ; \qquad t = U t_\Lambda ; \qquad n = V n_\Lambda $$

It is clear that this algorithm does not allow us to obtain an analytical expression of the decomposition elements {R, t, n}, in terms of the components of matrix H. This is the aim of this report: to develop a method that gives us such analytical expressions. As already said, there are up to 8 solutions in general for this problem. These are 8 mathematical solutions, but not all of them are physically possible, as we will see. Several constraints can be applied in order to reduce this number of solutions.

3.2 Zhang SVD-based decomposition

Notice that a similar method to obtain this decomposition is proposed in [11]. The authors claim that closed-form expressions for the translation vector, normal vector and rotation matrix are obtained. However, the closed-form solutions are obtained numerically, again from the SVD of the homography matrix.

They propose to compute the eigenvalues and eigenvectors of matrix $H^\top H$:

$$ H^\top H = V \Lambda^2 V^\top $$

where the eigenvalues and corresponding eigenvectors are:

$$ \Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \lambda_3) ; \qquad V = [v_1 \; v_2 \; v_3] $$

with the unitary eigenvalue $\lambda_2$, and ordered as:

$$ \lambda_1 \ge \lambda_2 = 1 \ge \lambda_3 $$

Then, defining $t^*$ as the normalized translation vector in the desired camera frame, $t^* = R^\top t$, they propose to use the following relations:

$$ \|t^*\| = \lambda_1 - \lambda_3 ; \qquad n^\top t^* = \lambda_1 \lambda_3 - 1 $$

and

$$ v_1 \propto v'_1 = \zeta_1 t^* + n ; \qquad v_2 \propto v'_2 = t^* \times n ; \qquad v_3 \propto v'_3 = \zeta_3 t^* + n $$

where the $v_i$ are unit vectors, while the $v'_i$ are not, and $\zeta_{1,3}$ are scalar functions of the eigenvalues given below.

These relations are derived from the fact that $(t^* \times n)$ is an eigenvector associated to the unitary eigenvalue of matrix $H^\top H$ and that all the eigenvectors must be orthogonal. Then, the authors propose to use the following expressions to compute the first solution for the couple translation vector and normal vector:

$$ t^* = \pm \frac{v'_1 - v'_3}{\zeta_1 - \zeta_3} ; \qquad n = \pm \frac{\zeta_1 v'_3 - \zeta_3 v'_1}{\zeta_1 - \zeta_3} $$

and for the second solution:

$$ t^* = \pm \frac{v'_1 + v'_3}{\zeta_1 - \zeta_3} ; \qquad n = \pm \frac{\zeta_1 v'_3 + \zeta_3 v'_1}{\zeta_1 - \zeta_3} $$

In order to use these relations, after the SVD, $\zeta_{1,3}$ must be computed as:

$$ \zeta_{1,3} = \frac{1}{2 \lambda_1 \lambda_3} \left( -1 \pm \sqrt{1 + \frac{4 \lambda_1 \lambda_3}{(\lambda_1 - \lambda_3)^2}} \right) $$

Also, the norms of $v'_{1,3}$ can be computed from the eigenvalues:

$$ \|v'_i\|^2 = \zeta_i^2 (\lambda_1 - \lambda_3)^2 + 2 \zeta_i (\lambda_1 \lambda_3 - 1) + 1 ; \qquad i = 1, 3 $$

Then, $v'_{1,3}$ are obtained from the unit eigenvectors using:

$$ v'_i = \|v'_i\| \, v_i ; \qquad i = 1, 3 $$

Finally, the rotation matrix can be obtained:

$$ R = H \left( I + t^* n^\top \right)^{-1} $$
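The eigenvalue relations used by this method can be verified on a synthetic homography. The sketch below assumes NumPy; the pose and plane values and the helper name `rodrigues` are hypothetical. It builds $H$ from a known $\{R, t, n\}$, so it only checks the relations and does not implement the decomposition of [11].

```python
import numpy as np

def rodrigues(axis, angle):
    """Rotation matrix from an axis and an angle (Rodrigues' formula)."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

# Illustrative (made-up) ground truth.
R = rodrigues([1.0, 2.0, 3.0], 0.4)
t = np.array([0.3, -0.2, 0.1])
n = np.array([0.2, -0.3, 1.0])
n = n / np.linalg.norm(n)
H = R + np.outer(t, n)
t_star = R.T @ t                         # translation in the desired frame

# H^T H = V Lambda^2 V^T; eigh returns the eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(H.T @ H)
lam3, lam2, lam1 = np.sqrt(eigvals)
v2 = eigvecs[:, 1]                       # eigenvector of the unit eigenvalue

norm_t_star = lam1 - lam3                # should equal ||t*||
n_dot_t_star = lam1 * lam3 - 1.0         # should equal n^T t*

# Rotation recovered from a matching couple (t*, n).
R_rec = H @ np.linalg.inv(np.eye(3) + np.outer(t_star, n))
```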

As we see, this is not an analytical decomposition procedure, since we do not obtain $\{R, t, n\}$ as an explicit function of $H$. On the contrary, the computations fully rely on the singular value decomposition, as in Faugeras' method.

Moreover, in order to compute the rotation matrix, the right couples should be chosen, but there is a +/- ambiguity. This means that, a priori, there is no way to know whether the right couple for the choice of the plus sign in the expression of $t^*$ is the vector obtained using the plus or the minus sign in the expression of $n$. With the proposed analytical procedure, that ambiguity can be solved a priori.

3.3 Elimination of impossible solutions

We describe now how the set of solutions of the homography decomposition problem can be reduced from the 8 mathematical solutions to the only 2 verifying some physical constraints. Of course, this is valid not only for the numerical decomposition method, but in general, whatever the method used.

3.3.1 Reference-plane non-crossing constraint

This is the first physical constraint, which reduces the number of solutions from 8 to 4. It expresses that the camera cannot move in the direction of the plane normal further than the distance to the plane. Otherwise, the camera crosses the plane and the situation can be interpreted as the camera seeing a transparent object from both sides. In Figure 2, the translation vector from one frame to the other, ${}^{d}t_{c}$, gives the position of the origin of $F$ with respect to $F^*$. This is not the same as the vector $t$ used in our reduced notation, but they are related by:

$$ t = - \, {}^{c}R_{d} \, \frac{{}^{d}t_{c}}{d^*} $$

Figure 2: Reference-plane non-crossing constraint.

That way, the translation vector ${}^{d}t_{c}$, the normal vector $n = {}^{d}n$ and the distance to the plane $d^*$ are all expressed in the same frame, $F^*$. With this notation, it is clear that the reference-plane non-crossing constraint is satisfied when:

$$ {}^{d}t_{c}^\top \, {}^{d}n < d^* \tag{5} $$

That is, the projection of the translation vector ${}^{d}t_{c}$ on the normal direction must be less than $d^*$. Written in terms of our reduced notation:

$$ 1 + n^\top R^\top t > 0 \tag{6} $$

As we will see in Section 4, only 4 of the 8 solutions derived using the analytical method verify this condition. In fact, the four solutions verifying the reference-plane non-crossing constraint are, in general, two completely different solutions and their "opposites":

$$ \begin{aligned} Rtn_a &= \{R_a, t_a, n_a\} \\ Rtn_b &= \{R_b, t_b, n_b\} \\ Rtn_{a-} &= \{R_a, -t_a, -n_a\} \\ Rtn_{b-} &= \{R_b, -t_b, -n_b\} \end{aligned} $$

3.3.2 Reference-point visibility

This additional constraint reduces the number of feasible solutions from 4 to 2. The following additional information is required:

The set of reference image points: $p^*$

The matrix containing the camera intrinsic parameters: $K$

First, the projective coordinates of the reference points are retrieved:

$$ m^* = K^{-1} p^* $$

Then, each normal candidate is considered and the projection of each one of the points $m^*$ on the direction of that normal is computed. For the solution to be valid, this projection must be positive for all the points:

$$ m^{*\top} n^* > 0 $$

The same can be done with respect to the current frame:

$$ m^\top (R \, n) > 0 $$

The geometric interpretation of this constraint is illustrated in Figure 3.

Figure 3: Reference points visibility constraint.

Among the four solutions verifying the reference-plane non-crossing constraint, two have normals opposite to those of the other two, so at least two of them can be discarded with the new constraint. It may happen that even three of them are discarded, but this is not the usual situation.
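The two physical constraints of this section can be applied as a simple filter over candidate triplets. The following is a minimal sketch, assuming NumPy; the function name `feasible`, the candidate list and the reference points are hypothetical and only illustrate constraint (6) and the visibility test in the reference frame.

```python
import numpy as np

def feasible(solutions, m_star_points):
    """Keep the triplets (R, t, n) that satisfy both physical constraints."""
    kept = []
    for R, t, n in solutions:
        # Reference-plane non-crossing constraint, equation (6).
        non_crossing = 1.0 + n @ R.T @ t > 0.0
        # Reference-point visibility: m*^T n > 0 for every reference point.
        visible = all(m @ n > 0.0 for m in m_star_points)
        if non_crossing and visible:
            kept.append((R, t, n))
    return kept

# Illustrative (made-up) candidates: one solution and its "opposite".
R = np.eye(3)
t = np.array([0.1, 0.0, 0.2])
n = np.array([0.0, 0.0, 1.0])
candidates = [(R, t, n), (R, -t, -n)]

# Normalized coordinates of two reference points (third component is 1).
m_star_points = [np.array([0.1, 0.2, 1.0]), np.array([-0.3, 0.1, 1.0])]

kept = feasible(candidates, m_star_points)
```

Both candidates in this example pass the non-crossing test, since changing the sign of $t$ and $n$ together leaves $n^\top R^\top t$ unchanged; only the visibility test separates them.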

4 A new analytical method for homography decomposition

In this section, we introduce a new analytical method for solving the homography decomposition problem. Contrary to [1], where a numerical method based on SVD is used, we provide the expressions of $\{R, t, n\}$ as a function of matrix $H$.

The four solutions we will achieve following the procedure will be denoted as:

Rtna = {Ra, ta, na} (7)

Rtnb = {Rb, tb, nb} (8)

Rtna− = {Ra, −ta, −na} (9)

Rtnb− = {Rb, −tb, −nb} (10)

As said before, these solutions are, in general, two completely different solutions and their opposites.

First, we will summarize the complete set of formulas and after that we will give the details of the development.

4.1 Summary of the analytical decomposition

First method: computing the normal vector first

We can get closed forms of the analytical expressions by introducing a symmetric matrix, $S$, obtained from the homography matrix as:

$$ S = H^\top H - I = \begin{bmatrix} s_{11} & s_{12} & s_{13} \\ s_{12} & s_{22} & s_{23} \\ s_{13} & s_{23} & s_{33} \end{bmatrix} $$

We denote by $M_{S_{ij}}$, $i, j = 1..3$, the opposites of the minors of matrix $S$ (the minor corresponding to element $s_{ij}$). For instance:

$$ M_{S_{11}} = - \begin{vmatrix} s_{22} & s_{23} \\ s_{23} & s_{33} \end{vmatrix} = s_{23}^2 - s_{22} s_{33} \ge 0 $$

In general, there are three different alternatives for obtaining the expressions of the normal vectors $n_e$ (and from them, $t_e$ and $R_e$) of the homography decomposition from the components of matrix $S$. We will write:

$$ n_e(s_{ii}) = \frac{n'_e(s_{ii})}{\|n'_e(s_{ii})\|} ; \qquad e = \{a, b\} , \; i = \{1, 2, 3\} $$

where $n_e(s_{ii})$ stands for $n_e$ developed using $s_{ii}$. Then, the three possible cases are:

$$ n'_a(s_{11}) = \begin{bmatrix} s_{11} \\ s_{12} + \sqrt{M_{S_{33}}} \\ s_{13} + \epsilon_{23} \sqrt{M_{S_{22}}} \end{bmatrix} ; \qquad n'_b(s_{11}) = \begin{bmatrix} s_{11} \\ s_{12} - \sqrt{M_{S_{33}}} \\ s_{13} - \epsilon_{23} \sqrt{M_{S_{22}}} \end{bmatrix} \tag{11} $$

$$ n'_a(s_{22}) = \begin{bmatrix} s_{12} + \sqrt{M_{S_{33}}} \\ s_{22} \\ s_{23} - \epsilon_{13} \sqrt{M_{S_{11}}} \end{bmatrix} ; \qquad n'_b(s_{22}) = \begin{bmatrix} s_{12} - \sqrt{M_{S_{33}}} \\ s_{22} \\ s_{23} + \epsilon_{13} \sqrt{M_{S_{11}}} \end{bmatrix} \tag{12} $$

$$ n'_a(s_{33}) = \begin{bmatrix} s_{13} + \epsilon_{12} \sqrt{M_{S_{22}}} \\ s_{23} + \sqrt{M_{S_{11}}} \\ s_{33} \end{bmatrix} ; \qquad n'_b(s_{33}) = \begin{bmatrix} s_{13} - \epsilon_{12} \sqrt{M_{S_{22}}} \\ s_{23} - \sqrt{M_{S_{11}}} \\ s_{33} \end{bmatrix} \tag{13} $$

where $\epsilon_{ij} = \mathrm{sign}(M_{S_{ij}})$. In particular, the $\mathrm{sign}(\cdot)$ function should be implemented as:

$$ \mathrm{sign}(a) = \begin{cases} 1 & \text{if } a \ge 0 \\ -1 & \text{otherwise} \end{cases} $$

in order to avoid problems in the cases when $M_{S_{ii}} = 0$, as we will see later on.

These formulas all give the same result. However, not all of them can be applied in every case, as the computation of $n_e(s_{ii})$ implies a division by $s_{ii}$. That means that this formula cannot be applied in the particular case when $s_{ii} = 0$ (this happens, for instance, when the $i$-th component of $n$ is null).

The right procedure is to compute $n_a$ and $n_b$ using, among the three alternatives, the one corresponding to the $s_{ii}$ with the largest absolute value. That will be the best-conditioned option. The only singular case, then, is the pure rotation case, when $H$ is a rotation matrix. In this case, all the components of matrix $S$ become null. Nevertheless, this is a trivial case, and there is no need to apply any formulas. It must be taken into account that the four solutions obtained by these formulas (that is, $\{n_a, -n_a, n_b, -n_b\}$) are all the same, but are not always given in the same order. That means that, for instance, $n_a(s_{22})$ may correspond to $-n_a(s_{11})$, $n_b(s_{11})$ or $-n_b(s_{11})$, instead of corresponding to $n_a(s_{11})$.

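We can also write the expression of $n_e$ directly in terms of the columns of matrix $H$:

$$ H = [h_1 \; h_2 \; h_3] $$

We give, as an example, the result derived from $s_{22}$ (equivalent to (12)):

$$ n'_a(s_{22}) = \begin{bmatrix} h_1^\top h_2 + \sqrt{(h_1^\top h_2)^2 - (\|h_1\|^2 - 1)(\|h_2\|^2 - 1)} \\ \|h_2\|^2 - 1 \\ h_2^\top h_3 - \epsilon_{13} \sqrt{(h_2^\top h_3)^2 - (\|h_2\|^2 - 1)(\|h_3\|^2 - 1)} \end{bmatrix} \tag{14} $$

$$ n'_b(s_{22}) = \begin{bmatrix} h_1^\top h_2 - \sqrt{(h_1^\top h_2)^2 - (\|h_1\|^2 - 1)(\|h_2\|^2 - 1)} \\ \|h_2\|^2 - 1 \\ h_2^\top h_3 + \epsilon_{13} \sqrt{(h_2^\top h_3)^2 - (\|h_2\|^2 - 1)(\|h_3\|^2 - 1)} \end{bmatrix} \tag{15} $$

where $\epsilon_{ij} = \mathrm{sign}(M_{S_{ij}})$, the expression of which is, in this particular case:

$$ \epsilon_{13} = \mathrm{sign}\left( - h_1^\top \left( I + [h_2]_\times^2 \right) h_3 \right) $$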

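The expressions for the translation vector in the reference frame, $t^*_e = R_e^\top t_e$, can be obtained from the given expressions of the normal vector:

$$ t^*_e(s_{11}) = \frac{\|n'_e(s_{11})\|}{2 s_{11}} \begin{bmatrix} s_{11} \\ s_{12} \mp \sqrt{M_{S_{33}}} \\ s_{13} \mp \epsilon_{23} \sqrt{M_{S_{22}}} \end{bmatrix} - \frac{\|t_e\|^2}{2 \|n'_e(s_{11})\|} \begin{bmatrix} s_{11} \\ s_{12} \pm \sqrt{M_{S_{33}}} \\ s_{13} \pm \epsilon_{23} \sqrt{M_{S_{22}}} \end{bmatrix} $$

$$ t^*_e(s_{22}) = \frac{\|n'_e(s_{22})\|}{2 s_{22}} \begin{bmatrix} s_{12} \mp \sqrt{M_{S_{33}}} \\ s_{22} \\ s_{23} \pm \epsilon_{13} \sqrt{M_{S_{11}}} \end{bmatrix} - \frac{\|t_e\|^2}{2 \|n'_e(s_{22})\|} \begin{bmatrix} s_{12} \pm \sqrt{M_{S_{33}}} \\ s_{22} \\ s_{23} \mp \epsilon_{13} \sqrt{M_{S_{11}}} \end{bmatrix} $$

$$ t^*_e(s_{33}) = \frac{\|n'_e(s_{33})\|}{2 s_{33}} \begin{bmatrix} s_{13} \mp \epsilon_{12} \sqrt{M_{S_{22}}} \\ s_{23} \mp \sqrt{M_{S_{11}}} \\ s_{33} \end{bmatrix} - \frac{\|t_e\|^2}{2 \|n'_e(s_{33})\|} \begin{bmatrix} s_{13} \pm \epsilon_{12} \sqrt{M_{S_{22}}} \\ s_{23} \pm \sqrt{M_{S_{11}}} \\ s_{33} \end{bmatrix} $$

For $e = a$ the upper operator in the symbols $\pm$, $\mp$ must be chosen; for $e = b$, choose the lower operator. The vector $t^*_e$ can also be given as a compact expression of $n_a$ and $n_b$:

$$ t^*_a(s_{ii}) = \frac{\|t_e\|}{2} \left[ \epsilon_{s_{ii}} \, \rho \, n_b(s_{ii}) - \|t_e\| \, n_a(s_{ii}) \right] \tag{16} $$

$$ t^*_b(s_{ii}) = \frac{\|t_e\|}{2} \left[ \epsilon_{s_{ii}} \, \rho \, n_a(s_{ii}) - \|t_e\| \, n_b(s_{ii}) \right] \tag{17} $$

with

$$ \epsilon_{s_{ii}} = \mathrm{sign}(s_{ii}) $$

$$ \rho^2 = 2 + \mathrm{trace}(S) + \nu \tag{18} $$

$$ \|t_e\|^2 = 2 + \mathrm{trace}(S) - \nu \tag{19} $$

where $\nu$ can be obtained from:

$$ \nu = \sqrt{2 \left[ (1 + \mathrm{trace}(S))^2 + 1 - \mathrm{trace}(S^2) \right]} = 2 \sqrt{1 + \mathrm{trace}(S) - M_{S_{11}} - M_{S_{22}} - M_{S_{33}}} $$

The expression for the rotation matrix is:

$$ R_e = H \left( I - \frac{2}{\nu} \, t^*_e \, n_e^\top \right) \tag{20} $$

Finally, $t_e$ can be obtained:

$$ t_e = R_e \, t^*_e \tag{21} $$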
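The summary of the first method can be turned into a short numerical routine. The sketch below is one possible reading of (11)-(13) and (16)-(21), assuming NumPy; the function names `sign`, `normals_from_S`, `decompose` and `rodrigues` and the test values are hypothetical. The clamping of the principal minors at zero is an added safeguard against round-off, not part of the formulas.

```python
import numpy as np

def rodrigues(axis, angle):
    """Rotation matrix from an axis and an angle (Rodrigues' formula)."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def sign(a):
    """sign(.) with sign(0) = +1, as required for the minors."""
    return 1.0 if a >= 0 else -1.0

def normals_from_S(S):
    """Unit normals n_a, n_b from (11)-(13), using the largest |s_ii|."""
    s11, s12, s13 = S[0]
    s22, s23, s33 = S[1, 1], S[1, 2], S[2, 2]
    # Square roots of the opposites of the principal minors, clamped at 0.
    r11 = np.sqrt(max(s23**2 - s22 * s33, 0.0))
    r22 = np.sqrt(max(s13**2 - s11 * s33, 0.0))
    r33 = np.sqrt(max(s12**2 - s11 * s22, 0.0))
    e12 = sign(s23 * s13 - s12 * s33)
    e13 = sign(s22 * s13 - s12 * s23)
    e23 = sign(s12 * s13 - s11 * s23)
    i = int(np.argmax(np.abs(np.diag(S))))
    if i == 0:
        na = [s11, s12 + r33, s13 + e23 * r22]
        nb = [s11, s12 - r33, s13 - e23 * r22]
    elif i == 1:
        na = [s12 + r33, s22, s23 - e13 * r11]
        nb = [s12 - r33, s22, s23 + e13 * r11]
    else:
        na = [s13 + e12 * r22, s23 + r11, s33]
        nb = [s13 - e12 * r22, s23 - r11, s33]
    na, nb = np.array(na), np.array(nb)
    return na / np.linalg.norm(na), nb / np.linalg.norm(nb), sign(S[i, i])

def decompose(H):
    """Two solutions (R_e, t_e, n_e), e = a, b, following (16)-(21)."""
    S = H.T @ H - np.eye(3)
    na, nb, eps = normals_from_S(S)
    tr = np.trace(S)
    nu = np.sqrt(2.0 * ((1.0 + tr) ** 2 + 1.0 - np.trace(S @ S)))
    rho = np.sqrt(2.0 + tr + nu)
    t_norm = np.sqrt(2.0 + tr - nu)
    solutions = []
    for n_e, n_other in ((na, nb), (nb, na)):
        t_star = 0.5 * t_norm * (eps * rho * n_other - t_norm * n_e)
        R_e = H @ (np.eye(3) - (2.0 / nu) * np.outer(t_star, n_e))
        solutions.append((R_e, R_e @ t_star, n_e))
    return solutions

# Illustrative (made-up) ground truth and its homography.
R = rodrigues([1.0, 2.0, 3.0], 0.4)
t = np.array([0.3, -0.2, 0.1])
n = np.array([0.2, -0.3, 1.0])
n = n / np.linalg.norm(n)
H = R + np.outer(t, n)
solutions = decompose(H)
```

The routine returns only $Rtn_a$ and $Rtn_b$; their opposites follow by changing the sign of $t_e$ and $n_e$. It does not handle the pure rotation case ($S = 0$), which the text treats as trivial.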

Second method: computing the translation vector first

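A simpler set of expressions for $t_e$ can be obtained, starting from the following matrix, $S_r$, instead of the previous $S$:

$$ S_r = H H^\top - I = \begin{bmatrix} s_{r11} & s_{r12} & s_{r13} \\ s_{r12} & s_{r22} & s_{r23} \\ s_{r13} & s_{r23} & s_{r33} \end{bmatrix} $$

The new relations for vector $t_e$ are:

$$ t_e(s_{rii}) = \|t_e\| \, \frac{t'_e(s_{rii})}{\|t'_e(s_{rii})\|} ; \qquad e = \{a, b\} $$

where $t_e(s_{rii})$ stands for $t_e$ developed using $s_{rii}$. Then, the three possible cases are:

$$ t'_a(s_{r11}) = \begin{bmatrix} s_{r11} \\ s_{r12} + \sqrt{M_{S_{r33}}} \\ s_{r13} + \epsilon_{r23} \sqrt{M_{S_{r22}}} \end{bmatrix} ; \qquad t'_b(s_{r11}) = \begin{bmatrix} s_{r11} \\ s_{r12} - \sqrt{M_{S_{r33}}} \\ s_{r13} - \epsilon_{r23} \sqrt{M_{S_{r22}}} \end{bmatrix} \tag{22} $$

$$ t'_a(s_{r22}) = \begin{bmatrix} s_{r12} + \sqrt{M_{S_{r33}}} \\ s_{r22} \\ s_{r23} - \epsilon_{r13} \sqrt{M_{S_{r11}}} \end{bmatrix} ; \qquad t'_b(s_{r22}) = \begin{bmatrix} s_{r12} - \sqrt{M_{S_{r33}}} \\ s_{r22} \\ s_{r23} + \epsilon_{r13} \sqrt{M_{S_{r11}}} \end{bmatrix} \tag{23} $$

$$ t'_a(s_{r33}) = \begin{bmatrix} s_{r13} + \epsilon_{r12} \sqrt{M_{S_{r22}}} \\ s_{r23} + \sqrt{M_{S_{r11}}} \\ s_{r33} \end{bmatrix} ; \qquad t'_b(s_{r33}) = \begin{bmatrix} s_{r13} - \epsilon_{r12} \sqrt{M_{S_{r22}}} \\ s_{r23} - \sqrt{M_{S_{r11}}} \\ s_{r33} \end{bmatrix} \tag{24} $$

where $M_{S_{rii}}$ and $\epsilon_{rij}$ have the same meaning as before, but refer to matrix $S_r$ instead of $S$. Notice that the expression for $\|t_e\|$ is given in (19). In this case, $s_{rii}$ becomes zero, for instance, when the $i$-th component of $t$ is null.

We can also write the expression of $t_e$ directly in terms of the rows of matrix $H$:

$$ H^\top = [h_{r1} \; h_{r2} \; h_{r3}] $$

We give, as an example, the result derived from $s_{r22}$:

$$ t'_a(s_{r22}) = \begin{bmatrix} h_{r1}^\top h_{r2} + \sqrt{(h_{r1}^\top h_{r2})^2 - (\|h_{r1}\|^2 - 1)(\|h_{r2}\|^2 - 1)} \\ \|h_{r2}\|^2 - 1 \\ h_{r2}^\top h_{r3} - \epsilon_{r13} \sqrt{(h_{r2}^\top h_{r3})^2 - (\|h_{r2}\|^2 - 1)(\|h_{r3}\|^2 - 1)} \end{bmatrix} \tag{25} $$

$$ t'_b(s_{r22}) = \begin{bmatrix} h_{r1}^\top h_{r2} - \sqrt{(h_{r1}^\top h_{r2})^2 - (\|h_{r1}\|^2 - 1)(\|h_{r2}\|^2 - 1)} \\ \|h_{r2}\|^2 - 1 \\ h_{r2}^\top h_{r3} + \epsilon_{r13} \sqrt{(h_{r2}^\top h_{r3})^2 - (\|h_{r2}\|^2 - 1)(\|h_{r3}\|^2 - 1)} \end{bmatrix} \tag{26} $$

with $\epsilon_{rij} = \mathrm{sign}(M_{S_{rij}})$, which can be written in this case as:

$$ \epsilon_{r13} = \mathrm{sign}\left( - h_{r1}^\top \left( I + [h_{r2}]_\times^2 \right) h_{r3} \right) $$

In the same way that we previously obtained the expressions for $t^*_e = R_e^\top t_e$ from the expressions of $n_e$, we can now obtain the expressions for $n'_e = R_e n_e$ from the given expressions of $t_e$:

$$ n'_a(s_{rii}) = \frac{1}{2} \left( \epsilon_{s_{rii}} \, \frac{\rho}{\|t_e\|} \, t_b(s_{rii}) - t_a(s_{rii}) \right) \tag{27} $$

$$ n'_b(s_{rii}) = \frac{1}{2} \left( \epsilon_{s_{rii}} \, \frac{\rho}{\|t_e\|} \, t_a(s_{rii}) - t_b(s_{rii}) \right) \tag{28} $$

with

$$ \epsilon_{s_{rii}} = \mathrm{sign}(s_{rii}) $$

The expression for the rotation matrix, analogous to (20), is:

$$ R_e = \left( I - \frac{2}{\nu} \, t_e \, n'^\top_e \right) H $$

Finally, $n_e$ can be obtained:

$$ n_e = R_e^\top n'_e $$

Of course, if we directly have the couple $n_e$ and $t_e$ corresponding to the same solution, we can get the rotation matrix as:

$$ R_e = H - t_e \, n_e^\top $$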

It must be noticed that if we combine the expressions for $n_e$ and $t_e$, (14)-(15) with (25)-(26) (or equivalently (11)-(13) with (22)-(24)), to build the set of solutions, instead of deriving one from the other, we must be aware that, as expected, $n_a(s_{ii})$ will not necessarily couple with $t_a(s_{rii})$.

As can be seen, contrary to the numerical methods, in this case we have the analytical expressions of the decomposition elements $\{R, t, n\}$ in terms of the components of matrix $H$.

4.2 Detailed development of the analytical decomposition

In this section, we present the detailed development that gives rise to the set of analytical expressions summarized before. We will describe two alternative methods for the analytical decomposition. Using the first one, we will derive the set of formulas that allow us to compute the normal vector first, and after it, the translation vector and the rotation matrix. The second method, on the contrary, allows us to compute the translation vector first, and after it, the normal vector and the rotation matrix.

4.2.1 First method

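In order to simplify the computations, we start by defining the symmetric matrix, $S$, obtained from the homography matrix as follows:

$$ S = H^\top H - I = \begin{bmatrix} s_{11} & s_{12} & s_{13} \\ s_{12} & s_{22} & s_{23} \\ s_{13} & s_{23} & s_{33} \end{bmatrix} \tag{29} $$

The matrix $S$ is a singular matrix. That is:

$$ \det(S) = s_{11} s_{22} s_{33} - s_{11} s_{23}^2 - s_{22} s_{13}^2 - s_{33} s_{12}^2 + 2 s_{12} s_{13} s_{23} = 0 \tag{30} $$

This means that we could write, for instance, element $s_{33}$ as:

$$ s_{33} = \frac{s_{11} s_{23}^2 + s_{22} s_{13}^2 - 2 s_{12} s_{13} s_{23}}{s_{11} s_{22} - s_{12}^2} \tag{31} $$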

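We will denote the opposites of the two-dimensional minors of this matrix as $M_{S_{ij}}$ (the minor corresponding to element $s_{ij}$). The opposites of the principal minors are:

$$ M_{S_{11}} = - \begin{vmatrix} s_{22} & s_{23} \\ s_{23} & s_{33} \end{vmatrix} = s_{23}^2 - s_{22} s_{33} \ge 0 $$

$$ M_{S_{22}} = - \begin{vmatrix} s_{11} & s_{13} \\ s_{13} & s_{33} \end{vmatrix} = s_{13}^2 - s_{11} s_{33} \ge 0 $$

$$ M_{S_{33}} = - \begin{vmatrix} s_{11} & s_{12} \\ s_{12} & s_{22} \end{vmatrix} = s_{12}^2 - s_{11} s_{22} \ge 0 $$

all of them being non-negative (this property, which will be helpful afterwards, is proved in Appendix C.1). On the other hand, the opposites of the non-principal minors are:

$$ M_{S_{12}} = M_{S_{21}} = - \begin{vmatrix} s_{12} & s_{13} \\ s_{23} & s_{33} \end{vmatrix} = s_{23} s_{13} - s_{12} s_{33} $$

$$ M_{S_{13}} = M_{S_{31}} = - \begin{vmatrix} s_{12} & s_{13} \\ s_{22} & s_{23} \end{vmatrix} = s_{22} s_{13} - s_{12} s_{23} $$

$$ M_{S_{23}} = M_{S_{32}} = - \begin{vmatrix} s_{11} & s_{13} \\ s_{12} & s_{23} \end{vmatrix} = s_{12} s_{13} - s_{11} s_{23} $$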

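There are some interesting geometrical aspects related to these minors, which are described in Appendix C.2. It can also be verified that the following relations between the principal and non-principal minors hold:

$$ M_{S_{12}}^2 = M_{S_{11}} M_{S_{22}} \tag{32} $$

$$ M_{S_{13}}^2 = M_{S_{11}} M_{S_{33}} \tag{33} $$

$$ M_{S_{23}}^2 = M_{S_{22}} M_{S_{33}} \tag{34} $$

This can be easily proved using the property of null determinant of $S$ and writing some diagonal element as done with $s_{33}$ in (31) (alternatively, see Appendix C.1). These relations can also be written in another way:

$$ M_{S_{12}} = \epsilon_{12} \sqrt{M_{S_{11}}} \sqrt{M_{S_{22}}} \tag{35} $$

$$ M_{S_{13}} = \epsilon_{13} \sqrt{M_{S_{11}}} \sqrt{M_{S_{33}}} \tag{36} $$

$$ M_{S_{23}} = \epsilon_{23} \sqrt{M_{S_{22}}} \sqrt{M_{S_{33}}} \tag{37} $$

where $\epsilon_{ij} = \mathrm{sign}(M_{S_{ij}})$.

Condition (30) could also have been written using these determinants. With the definitions of the $M_{S_{ij}}$ given above, the expansion of (30) along the first row reads:

$$ \det(S) = - s_{11} M_{S_{11}} + s_{12} M_{S_{12}} - s_{13} M_{S_{13}} = 0 $$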
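The properties of $S$ and of the opposites of its minors can be checked numerically on a synthetic homography. This is a minimal sketch assuming NumPy; the values of $R$, $t$, $n$ and the helper name `rodrigues` are hypothetical. The last quantity is the cofactor expansion of $\det(S)$ along its first row, written with the $M_{S_{ij}}$ as defined in the text.

```python
import numpy as np

def rodrigues(axis, angle):
    """Rotation matrix from an axis and an angle (Rodrigues' formula)."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

# Illustrative (made-up) homography H = R + t n^T.
R = rodrigues([1.0, 2.0, 3.0], 0.4)
t = np.array([0.3, -0.2, 0.1])
n = np.array([0.2, -0.3, 1.0])
n = n / np.linalg.norm(n)
H = R + np.outer(t, n)

S = H.T @ H - np.eye(3)
s11, s12, s13 = S[0]
s22, s23, s33 = S[1, 1], S[1, 2], S[2, 2]

# Opposites of the principal minors (expected to be non-negative).
M11 = s23**2 - s22 * s33
M22 = s13**2 - s11 * s33
M33 = s12**2 - s11 * s22
# Opposites of the non-principal minors.
M12 = s23 * s13 - s12 * s33
M13 = s22 * s13 - s12 * s23
M23 = s12 * s13 - s11 * s23

det_S = np.linalg.det(S)
det_from_minors = -s11 * M11 + s12 * M12 - s13 * M13
```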

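If we denote by $h_i$, $i = 1..3$, each column of matrix $H$:

$$ H = [h_1 \; h_2 \; h_3] $$

matrix $S$ can be written as:

$$ S = \begin{bmatrix} \|h_1\|^2 - 1 & h_1^\top h_2 & h_1^\top h_3 \\ h_1^\top h_2 & \|h_2\|^2 - 1 & h_2^\top h_3 \\ h_1^\top h_3 & h_2^\top h_3 & \|h_3\|^2 - 1 \end{bmatrix} \tag{38} $$

In a similar way, the opposites of the minors can be written as:

$$ M_{S_{11}} = (h_2^\top h_3)^2 - (\|h_2\|^2 - 1)(\|h_3\|^2 - 1) \tag{39} $$

$$ M_{S_{22}} = (h_1^\top h_3)^2 - (\|h_1\|^2 - 1)(\|h_3\|^2 - 1) \tag{40} $$

$$ M_{S_{33}} = (h_1^\top h_2)^2 - (\|h_1\|^2 - 1)(\|h_2\|^2 - 1) \tag{41} $$

$$ M_{S_{12}} = h_1^\top \left( I + [h_3]_\times^2 \right) h_2 \tag{42} $$

$$ M_{S_{13}} = - h_1^\top \left( I + [h_2]_\times^2 \right) h_3 \tag{43} $$

$$ M_{S_{23}} = h_2^\top \left( I + [h_1]_\times^2 \right) h_3 \tag{44} $$

The minus sign in (43) follows from the definition $M_{S_{13}} = s_{22} s_{13} - s_{12} s_{23}$ and agrees with the expression of $\epsilon_{13}$ given in the summary.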

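Once the definition and properties of matrix $S$ have been stated, we start now the development that will allow us to extract the decomposition elements from this matrix. We will see that the interest of defining such a matrix is that it allows us to eliminate the rotation matrix from the equations. Using (3) we can write $S$ in terms of $\{R, t, n\}$ in the following way:

$$ S = \left( R^\top + n \, t^\top \right) \left( R + t \, n^\top \right) - I = R^\top t \, n^\top + n \, t^\top R + n \, t^\top t \, n^\top \tag{45} $$

If we introduce two new vectors, $x$ and $y$, defined as:

$$ x = \frac{R^\top t}{\|R^\top t\|} = \frac{R^\top t}{\|t\|} \tag{46} $$

$$ y = \|R^\top t\| \, n = \|t\| \, n \tag{47} $$

$S$ can be written as:

$$ S = x y^\top + y x^\top + y y^\top \tag{48} $$

It is clear that matrix $S$ is linear in $x$:

$$ \begin{bmatrix} y_1^2 + 2 y_1 x_1 & y_2 x_1 + y_1 x_2 + y_1 y_2 & y_3 x_1 + y_1 x_3 + y_1 y_3 \\ \cdot & y_2^2 + 2 y_2 x_2 & y_3 x_2 + y_2 x_3 + y_2 y_3 \\ \cdot & \cdot & y_3^2 + 2 y_3 x_3 \end{bmatrix} = \begin{bmatrix} s_{11} & s_{12} & s_{13} \\ s_{12} & s_{22} & s_{23} \\ s_{13} & s_{23} & s_{33} \end{bmatrix} \tag{49} $$

From this, we can set up two systems of equations:

$$ y_1^2 + 2 y_1 x_1 = s_{11} \tag{50} $$

$$ y_2^2 + 2 y_2 x_2 = s_{22} \tag{51} $$

$$ y_3^2 + 2 y_3 x_3 = s_{33} \tag{52} $$

$$ y_2 x_1 + y_1 x_2 + y_1 y_2 = s_{12} \tag{53} $$

$$ y_3 x_1 + y_1 x_3 + y_1 y_3 = s_{13} \tag{54} $$

$$ y_3 x_2 + y_2 x_3 + y_2 y_3 = s_{23} \tag{55} $$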

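Solving for $x$ from equations (50), (51) and (52):

$$ x_1 = \frac{s_{11} - y_1^2}{2 y_1} \tag{56} $$

$$ x_2 = \frac{s_{22} - y_2^2}{2 y_2} \tag{57} $$

$$ x_3 = \frac{s_{33} - y_3^2}{2 y_3} \tag{58} $$

Replacing this in equations (53), (54) and (55):

$$ s_{22} y_1^2 - 2 s_{12} y_1 y_2 + s_{11} y_2^2 = 0 \tag{59} $$

$$ s_{11} y_3^2 - 2 s_{13} y_1 y_3 + s_{33} y_1^2 = 0 \tag{60} $$

$$ s_{33} y_2^2 - 2 s_{23} y_2 y_3 + s_{22} y_3^2 = 0 \tag{61} $$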
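The change of variables to $x$ and $y$ can be verified on a synthetic example. The following sketch assumes NumPy; the ground-truth values and the helper name `rodrigues` are hypothetical. It checks (48), recovers $x$ from $y$ through (56)-(58), and evaluates the left-hand sides of (59)-(61), which should vanish.

```python
import numpy as np

def rodrigues(axis, angle):
    """Rotation matrix from an axis and an angle (Rodrigues' formula)."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

# Illustrative (made-up) ground truth; all components of n are non-zero.
R = rodrigues([1.0, 2.0, 3.0], 0.4)
t = np.array([0.3, -0.2, 0.1])
n = np.array([0.2, -0.3, 1.0])
n = n / np.linalg.norm(n)
H = R + np.outer(t, n)
S = H.T @ H - np.eye(3)

x = R.T @ t / np.linalg.norm(t)          # equation (46)
y = np.linalg.norm(t) * n                # equation (47)

# Equation (48): S = x y^T + y x^T + y y^T.
S_xy = np.outer(x, y) + np.outer(y, x) + np.outer(y, y)

# Equations (56)-(58): x recovered component-wise from y and diag(S).
x_rec = (np.diag(S) - y**2) / (2.0 * y)

# Left-hand sides of (59), (60) and (61).
res_59 = S[1, 1] * y[0]**2 - 2.0 * S[0, 1] * y[0] * y[1] + S[0, 0] * y[1]**2
res_60 = S[0, 0] * y[2]**2 - 2.0 * S[0, 2] * y[0] * y[2] + S[2, 2] * y[0]**2
res_61 = S[2, 2] * y[1]**2 - 2.0 * S[1, 2] * y[1] * y[2] + S[1, 1] * y[2]**2
```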

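Then, after setting

$$ z_1 = \frac{y_1}{y_2} \tag{62} $$

$$ z_2 = \frac{y_1}{y_3} \tag{63} $$

$$ z_3 = \frac{y_3}{y_2} \tag{64} $$

we get three independent second-order equations in $z_1$, $z_2$, $z_3$, respectively:

$$ s_{22} z_1^2 - 2 s_{12} z_1 + s_{11} = 0 \tag{65} $$

$$ s_{33} z_2^2 - 2 s_{13} z_2 + s_{11} = 0 \tag{66} $$

$$ s_{22} z_3^2 - 2 s_{23} z_3 + s_{33} = 0 \tag{67} $$

the solutions of which are:

$$ z_1 = \alpha_1 \pm \beta_1 ; \qquad \alpha_1 = \frac{s_{12}}{s_{22}} ; \qquad \beta_1 = \frac{\sqrt{s_{12}^2 - s_{11} s_{22}}}{s_{22}} = \frac{\sqrt{M_{S_{33}}}}{s_{22}} \tag{68} $$

$$ z_2 = \alpha_2 \pm \beta_2 ; \qquad \alpha_2 = \frac{s_{13}}{s_{33}} ; \qquad \beta_2 = \frac{\sqrt{s_{13}^2 - s_{11} s_{33}}}{s_{33}} = \frac{\sqrt{M_{S_{22}}}}{s_{33}} \tag{69} $$

$$ z_3 = \alpha_3 \pm \beta_3 ; \qquad \alpha_3 = \frac{s_{23}}{s_{22}} ; \qquad \beta_3 = \frac{\sqrt{s_{23}^2 - s_{22} s_{33}}}{s_{22}} = \frac{\sqrt{M_{S_{11}}}}{s_{22}} \tag{70} $$

where it has been assumed that $s_{22}$ and $s_{33}$ are different from 0 (we will see later that, in this case, the constraint on $s_{33}$ can be removed). Note that, thanks to the non-negativity of the $M_{S_{ii}}$ stated above, these solutions are always real.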

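We now impose the constraint $x_1^2 + x_2^2 + x_3^2 = 1$:

$$ x_1^2 + x_2^2 + x_3^2 = \frac{(s_{11} - y_1^2)^2}{4 y_1^2} + \frac{(s_{22} - y_2^2)^2}{4 y_2^2} + \frac{(s_{33} - y_3^2)^2}{4 y_3^2} = 1 $$

After setting $w = y_2^2$, and using $y_1^2 = z_1^2 y_2^2$, $y_3^2 = z_3^2 y_2^2$:

$$ (s_{22} - w)^2 + \frac{(s_{11} - w z_1^2)^2}{z_1^2} + \frac{(s_{33} - z_3^2 w)^2}{z_3^2} - 4 w = 0 $$

Now, we can solve for $w$ the following second-order equation:

$$ a \, w^2 - 2 \, b \, w + c = 0 $$

the coefficients being:

$$ a = 1 + z_1^2 + z_3^2 \tag{71} $$

$$ b = 2 + \mathrm{trace}(S) = 2 + s_{11} + s_{22} + s_{33} \tag{72} $$

$$ c = s_{22}^2 + \frac{s_{11}^2}{z_1^2} + \frac{s_{33}^2}{z_3^2} \tag{73} $$

Then, the two possible solutions for $w$ are:

$$ w = \frac{w_{num}}{a} = \frac{b \pm \sqrt{b^2 - a \, c}}{a} \tag{74} $$

After this, the $y$ vector can be computed:

$$ y = \begin{bmatrix} z_1 \\ 1 \\ z_3 \end{bmatrix} y_2 ; \qquad y_2 = \pm \sqrt{w} \tag{75} $$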

Now, from (56)-(58), the $x$ vector can be obtained. It can be checked that the possible solutions for $w$ are real and positive ($w$ must be the square of a real number), guaranteeing that the components of vectors $x$ and $y$ are always real. This is proved in Appendix C.3.

As said before, the homography decomposition problem has, in general, eight different mathematical solutions. The eight possible solutions come from two possible couples of $\{z_1, z_3\}$, two possible values of $w$ for each one of these couples and, finally, two possible values of $y_2$ from the plus/minus square root of $w$. Four of them correspond to the configuration of the reference object plane being in front of the camera, while the other four correspond to the non-realistic situation of the object plane being behind the camera. The latter set of solutions can simply be discarded when we are working with real images. In fact, we will now prove that, using the given formulation, the four valid solutions of the homography decomposition problem (verifying the reference-plane non-crossing constraint) are those corresponding to the choice of the minus sign for $w$ in (74). In Section 3.3.1, we stated that the reference-plane non-crossing constraint implies the following condition:

$$1 + n^\top R^\top t > 0 \tag{76}$$

Then, we can choose the right four solutions by writing this condition in terms of $x$ and $y$:

$$1 + n^\top R^\top t = 1 + y^\top x > 0$$

If we replace $x$ as a function of $y$ using (56)-(58),

$$\begin{bmatrix} y_1 & y_2 & y_3 \end{bmatrix} \begin{bmatrix} \dfrac{s_{11} - y_1^2}{2 y_1} \\ \dfrac{s_{22} - y_2^2}{2 y_2} \\ \dfrac{s_{33} - y_3^2}{2 y_3} \end{bmatrix} = \frac{s_{11} - y_1^2}{2} + \frac{s_{22} - y_2^2}{2} + \frac{s_{33} - y_3^2}{2} \geq -1$$

This can also be written as

$$\mathrm{trace}(S) + 2 \geq \|y\|^2 \tag{77}$$

Using (62)-(64) and (71), the squared norm of $y$ takes the form

$$\|y\|^2 = y_1^2 + y_2^2 + y_3^2 = w\,(1 + z_1^2 + z_3^2) = w\,a = w_{num} \tag{78}$$

Then, the condition (77) becomes

$$b \geq w\,a$$

Let us check which of the possible values of $w$ verifies this condition. These two values will be called:

$$w_+ = \frac{b + \sqrt{b^2 - a\,c}}{a} \tag{79}$$

$$w_- = \frac{b - \sqrt{b^2 - a\,c}}{a} \tag{80}$$

Since $w$ is real, as stated before, the square root is non-negative, so $w_+\,a = b + \sqrt{b^2 - a\,c} \geq b$. Hence only $w = w_-$ will verify the required condition

$$b \geq w_-\,a$$

Then, we can conclude that, to get the four physically feasible solutions, it is sufficient to choose:

$$w = w_-$$
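The selection of $w_-$ can be checked numerically. The following is a minimal sketch, not the authors' implementation: it builds $S = x y^\top + y x^\top + y y^\top$ from a hypothetical unit vector `x` and a hypothetical vector `y` satisfying $1 + x^\top y > 0$, solves the quadratic with the coefficients (71)-(73), and compares $w_-$ with $y_2^2$. The helper names `s_from_xy` and `solve_w` are illustrative assumptions.

```python
import math

def s_from_xy(x, y):
    """Build S = x y^T + y x^T + y y^T as a 3x3 nested list."""
    return [[x[i] * y[j] + y[i] * x[j] + y[i] * y[j] for j in range(3)]
            for i in range(3)]

def solve_w(S, z1, z3):
    """Return (w_minus, w_plus), the two roots of a w^2 - 2 b w + c = 0."""
    a = 1.0 + z1 ** 2 + z3 ** 2                                        # (71)
    b = 2.0 + S[0][0] + S[1][1] + S[2][2]                              # (72)
    c = S[1][1] ** 2 + S[0][0] ** 2 / z1 ** 2 + S[2][2] ** 2 / z3 ** 2  # (73)
    root = math.sqrt(max(b * b - a * c, 0.0))  # clamp tiny negative round-off
    return (b - root) / a, (b + root) / a                              # (80), (79)

# Hypothetical data: a unit vector x and a vector y with 1 + x.y > 0.
x = (0.0, 0.6, 0.8)
y = (0.3, 0.5, -0.2)
S = s_from_xy(x, y)
w_minus, w_plus = solve_w(S, y[0] / y[1], y[2] / y[1])
print(w_minus, y[1] ** 2)  # the two numbers should agree up to round-off
```

The sketch divides by $z_1^2$ and $z_3^2$, so it assumes that neither ratio is zero.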

On the other hand, it is worth noticing that only $z_1$ and $z_3$ are in fact needed for computing the values of $w$ and then the solutions of the problem. From the four possible couples we can set up for $\{z_1, z_3\}$:

$$\{z_{a1}, z_{a3}\}, \quad \{z_{a1}, z_{b3}\}, \quad \{z_{b1}, z_{a3}\}, \quad \{z_{b1}, z_{b3}\}$$

being

$$z_{a1} = \alpha_1 + \beta_1, \qquad z_{b1} = \alpha_1 - \beta_1$$

$$z_{a3} = \alpha_3 + \beta_3, \qquad z_{b3} = \alpha_3 - \beta_3$$

only two are valid, namely those verifying:

$$z_2 = \frac{z_1}{z_3}$$

as they are related through (62)-(64). In other words, the couples $\{z_1, z_3\}$ must verify equation (66) when $z_2$ is replaced by $z_1/z_3$:

$$s_{33}\frac{z_1^2}{z_3^2} - 2\,s_{13}\frac{z_1}{z_3} + s_{11} = 0 \tag{81}$$

Hence, equation (66) is only needed as a way of discerning the two valid couples $\{z_1, z_3\}$. We now show how to compute the two valid couples $\{z_1, z_3\}$ directly.

Choosing the valid pairs $\{z_1, z_3\}$

When computing the couples $\{z_1, z_3\}$ using (68) and (70), an inconvenience arises: the two right couples $\{z_1, z_3\}$ are not always the same, but may swap among the four possibilities. In fact, the two valid couples are always complementary. That is, the only possibilities are:

$$\{\{z_{a1}, z_{a3}\}, \{z_{b1}, z_{b3}\}\}$$

or

$$\{\{z_{a1}, z_{b3}\}, \{z_{b1}, z_{a3}\}\}$$

This means that we need to check, each time, whether the right pair complement of $z_{a1}$ is $z_{a3}$ or $z_{b3}$, evaluating (81) in both cases. The intention here is to avoid this swapping of the right pair complement of $z_{a1}$ between the two possibilities, forcing it to be always the same one. This will provide a direct analytical computation of the eight homography decomposition solutions.

Replacing $s_{33}$ in (70) according to (31) (or directly using (33)) gives a better insight into this swapping mechanism. In particular, we get a new expression for $\beta_3$:

$$\beta_3 = \frac{1}{s_{22}}\sqrt{\frac{(s_{23}s_{12} - s_{22}s_{13})^2}{s_{12}^2 - s_{11}s_{22}}} = \frac{|s_{23}s_{12} - s_{22}s_{13}|}{s_{22}\sqrt{M_{S_{33}}}}$$

The absolute value in the numerator of this expression is the cause of the swapping between $z_{a3}$ and $z_{b3}$. If we compute $z_3$ using the following expressions instead

$$z'_{a3} = \alpha_3 + \beta'_3 \tag{82}$$

$$z'_{b3} = \alpha_3 - \beta'_3 \tag{83}$$

being now

$$\beta'_3 = \frac{s_{23}s_{12} - s_{22}s_{13}}{s_{22}\sqrt{M_{S_{33}}}} = \frac{-M_{S_{13}}}{s_{22}\sqrt{M_{S_{33}}}}$$

where the absolute value has been removed, we can verify that the right pair complement of $z_{a1}$ is $z'_{a3}$. Therefore, the right couples are always the same:

$$\{\{z_{a1}, z'_{a3}\}, \{z_{b1}, z'_{b3}\}\}$$

This can be verified by simply replacing the expressions of $z_{a1}$ and $z'_{a3}$ (correspondingly, $z_{b1}$ and $z'_{b3}$) in (81) and checking the equality.

The procedure is now much simpler. We can completely ignore $z_2$, and forget about its computation and about the check (81). Just compute $z_1$ and $z_3$ according to:

$$z_1 = \alpha_1 \pm \beta_1$$

$$z_3 = \alpha_3 \pm \beta'_3$$

The only problem with this alternative way of computing $z_3$ is that we introduce a division by $\sqrt{M_{S_{33}}}$ and, as a consequence, it cannot be applied when this minor is null. We can obtain the same result, avoiding this inconvenience, by simply computing $\beta'_3$ as:

$$\beta'_3 = -\epsilon_{13}\frac{\sqrt{M_{S_{11}}}}{s_{22}}$$

where $\epsilon_{13}$ is:

$$\epsilon_{13} = \mathrm{sign}(M_{S_{13}})$$
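As an illustration of the swap-free pairing, the sketch below computes the couples $\{z_{a1}, z'_{a3}\}$ and $\{z_{b1}, z'_{b3}\}$ and evaluates the left-hand side of (81). It assumes, as read from the expressions in this section, that $M_{S_{11}} = s_{23}^2 - s_{22}s_{33}$, $M_{S_{33}} = s_{12}^2 - s_{11}s_{22}$ and $M_{S_{13}} = s_{22}s_{13} - s_{12}s_{23}$. The function names and the sample matrix are hypothetical.

```python
import math

def sign(v):
    """sign(.) with sign(0) = +1."""
    return 1.0 if v >= 0.0 else -1.0

def valid_couples(S):
    """Return the couples (z_a1, z'_a3) and (z_b1, z'_b3); requires s22 != 0."""
    s11, s12, s13 = S[0]
    s22, s23, s33 = S[1][1], S[1][2], S[2][2]
    m11 = max(s23 * s23 - s22 * s33, 0.0)   # M_S11 (assumed definition)
    m33 = max(s12 * s12 - s11 * s22, 0.0)   # M_S33 (assumed definition)
    eps13 = sign(s22 * s13 - s12 * s23)     # sign of M_S13
    za = ((s12 + math.sqrt(m33)) / s22, (s23 - eps13 * math.sqrt(m11)) / s22)
    zb = ((s12 - math.sqrt(m33)) / s22, (s23 + eps13 * math.sqrt(m11)) / s22)
    return za, zb

def residual_81(S, z1, z3):
    """Left-hand side of (81) with z2 = z1 / z3."""
    z2 = z1 / z3
    return S[2][2] * z2 * z2 - 2.0 * S[0][2] * z2 + S[0][0]

# Hypothetical S built from x = (0, 0.6, 0.8) and y = (0.3, 0.5, -0.2).
S = [[0.09, 0.33, 0.18],
     [0.33, 0.85, 0.18],
     [0.18, 0.18, -0.28]]
za, zb = valid_couples(S)
print(residual_81(S, *za), residual_81(S, *zb))  # both should be close to zero
```

For this sample matrix, the crossed couple $\{z_{a1}, z'_{b3}\}$ does not satisfy (81), which is the behaviour the pairing is meant to guarantee.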

The four solutions obtained by following this procedure will be denoted as:

$$Rtn_a = \{R_a, t_a, n_a\}$$

$$Rtn_b = \{R_b, t_b, n_b\}$$

$$Rtn_{a-} = \{R_a, -t_a, -n_a\}$$

$$Rtn_{b-} = \{R_b, -t_b, -n_b\}$$

As said before, these are, in general, two completely different solutions and their opposites.

Computation of the normal vector

Once we have seen how to avoid unrealistic solutions, we now give the formulas to obtain the elements of the homography decomposition $\{R, t, n\}$ directly as functions of the components of matrix $H$. We start with the normal vector. We have already determined the following expressions for the intermediary variables $z_1$ and $z_3$:

$$z_{a1} = \frac{s_{12} + \sqrt{M_{S_{33}}}}{s_{22}} ; \qquad z_{b1} = \frac{s_{12} - \sqrt{M_{S_{33}}}}{s_{22}}$$

$$z_{a3} = \frac{s_{23} - \epsilon_{13}\sqrt{M_{S_{11}}}}{s_{22}} ; \qquad z_{b3} = \frac{s_{23} + \epsilon_{13}\sqrt{M_{S_{11}}}}{s_{22}}$$

We also need to compute another intermediary variable, $w$ (see (80)):

$$w_a = \frac{b - \nu}{a_a} ; \qquad w_b = \frac{b - \nu}{a_b}$$

where the coefficients $a_e$, $b$ are:

$$a_e = 1 + z_{e1}^2 + z_{e3}^2 \tag{84}$$

$$b = 2 + \mathrm{trace}(S) \tag{85}$$

where the subscript can be $e = \{a, b\}$. After some manipulations of the expressions of these coefficients, $\nu$ can be written as a function of matrix $S$:

$$\nu = \sqrt{2\left[(1 + \mathrm{trace}(S))^2 + 1 - \mathrm{trace}(S^2)\right]} \tag{86}$$

or, alternatively:

$$\nu = 2\sqrt{1 + \mathrm{trace}(S) - M_{S_{11}} - M_{S_{22}} - M_{S_{33}}} \tag{87}$$

It can be proved (see Appendix C.4) that the coefficient $\nu$ introduced in (84) is:

$$\nu = 2\,(1 + n^\top R^\top t) \tag{88}$$

Now, we can compute the four possible $y$ vectors:

$$y_e = \pm\sqrt{w_e}\begin{bmatrix} z_{e1} \\ 1 \\ z_{e3} \end{bmatrix} = \pm\sqrt{w_e}\; n'_e$$

$$y_e = \pm\frac{\sqrt{b - \nu}}{\sqrt{a_e}}\; n'_e = \pm\sqrt{b - \nu}\;\frac{n'_e}{\|n'_e\|}$$

As $\|y_e\| = \|t_e\|$, from the previous expression we can deduce the translation vector norm, which is the same for all the solutions:

$$\|t_e\|^2 = w_{num} = 2 + \mathrm{trace}(S) - \nu \tag{89}$$

Dividing $y_e$ by this norm, we get the expression of the normal vector:

$$n_e = \frac{n'_e}{\|n'_e\|} ; \quad e = \{a, b\}$$

$$n'_a = \begin{bmatrix} s_{12} + \sqrt{M_{S_{33}}} \\ s_{22} \\ s_{23} - \epsilon_{13}\sqrt{M_{S_{11}}} \end{bmatrix} ; \qquad n'_b = \begin{bmatrix} s_{12} - \sqrt{M_{S_{33}}} \\ s_{22} \\ s_{23} + \epsilon_{13}\sqrt{M_{S_{11}}} \end{bmatrix} \tag{90}$$

being $\epsilon_{13} = \mathrm{sign}(M_{S_{13}})$.

In particular, the $\mathrm{sign}(\cdot)$ function in this case should be implemented as:

$$\mathrm{sign}(a) = \begin{cases} 1 & \text{if } a \geq 0 \\ -1 & \text{otherwise} \end{cases}$$

in order to avoid problems in the cases when $M_{S_{ii}} = 0$. To understand this, suppose that $M_{S_{33}} = 0$; according to relations (36)-(37), also $M_{S_{13}} = 0$ and $M_{S_{23}} = 0$. Then, with the typical $\mathrm{sign}(\cdot)$ function, $\epsilon_{13} = \epsilon_{23} = 0$, erroneously cancelling also the second addend of the third component of $n_{a,b}$ and providing a wrong result. Moreover, in order to avoid numerical problems, it is advisable to consider the argument of the $\mathrm{sign}(\cdot)$ function equal to zero if its magnitude is under some precision value.
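A minimal sketch of (90), together with the recommended sign convention, might look as follows. It covers only the branch that divides by $s_{22}$, assumes $H$ is already normalized so that $H = R + t\,n^\top$, and uses the minor definitions read from the formulas above ($M_{S_{11}} = s_{23}^2 - s_{22}s_{33}$, $M_{S_{33}} = s_{12}^2 - s_{11}s_{22}$, $M_{S_{13}} = s_{22}s_{13} - s_{12}s_{23}$). The function names, the tolerance `eps` and the test data are assumptions made for illustration.

```python
import math

def sign(v, eps=1e-12):
    """sign(0) = +1; magnitudes below eps are treated as zero."""
    return -1.0 if v < -eps else 1.0

def gram_minus_identity(H):
    """S = H^T H - I for a 3x3 nested list H."""
    return [[sum(H[k][i] * H[k][j] for k in range(3)) - (i == j)
             for j in range(3)] for i in range(3)]

def unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def normals_s22(H):
    """Candidate unit normals n_a, n_b from (90); requires s22 != 0."""
    S = gram_minus_identity(H)
    s11, s12, s13 = S[0]
    s22, s23, s33 = S[1][1], S[1][2], S[2][2]
    r11 = math.sqrt(max(s23 * s23 - s22 * s33, 0.0))  # sqrt(M_S11), clamped
    r33 = math.sqrt(max(s12 * s12 - s11 * s22, 0.0))  # sqrt(M_S33), clamped
    e13 = sign(s22 * s13 - s12 * s23)                 # epsilon_13
    na = unit([s12 + r33, s22, s23 - e13 * r11])
    nb = unit([s12 - r33, s22, s23 + e13 * r11])
    return na, nb

# Hypothetical motion: rotation of 0.3 rad about z, small translation.
c, s = math.cos(0.3), math.sin(0.3)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
t = [0.1, -0.2, 0.3]
n = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
H = [[R[i][j] + t[i] * n[j] for j in range(3)] for i in range(3)]
na, nb = normals_s22(H)
print(na, nb)  # one of them should equal n up to sign
```

The clamping with `max(..., 0.0)` is a numerical safeguard added here, since the minors are non-negative in theory but can become slightly negative through round-off.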

The complete set of formulas. The previous development started with the assumption that $s_{22} \neq 0$, as the expressions were developed dividing by $s_{22}$. In case $s_{22} = 0$ (for instance, when the second component of the object-plane normal is null), these formulas cannot be applied. In this situation, we can make a similar development, but dividing by $s_{11}$ or $s_{33}$ instead. Suppose $s_{22} = 0$ and we want to develop dividing by $s_{11}$. What we need to do is to define the variables $z_1$, $z_2$, $z_3$ in (62)-(64) in a different way. In particular, we will choose:

$$z_1 = \frac{y_2}{y_1} ; \qquad z_2 = \frac{y_3}{y_1} ; \qquad z_3 = \frac{y_2}{y_3}$$

The three new second-order equations in $z_1$, $z_2$, $z_3$ are:

$$s_{11} z_1^2 - 2\,s_{12} z_1 + s_{22} = 0$$

$$s_{11} z_2^2 - 2\,s_{13} z_2 + s_{33} = 0$$

$$s_{33} z_3^2 - 2\,s_{23} z_3 + s_{22} = 0$$

From this, we can follow a parallel development and we will find new expressions for $n_a$ and $n_b$:

$$n'_a(s_{11}) = \begin{bmatrix} s_{11} \\ s_{12} + \sqrt{M_{S_{33}}} \\ s_{13} + \epsilon_{23}\sqrt{M_{S_{22}}} \end{bmatrix} ; \qquad n'_b(s_{11}) = \begin{bmatrix} s_{11} \\ s_{12} - \sqrt{M_{S_{33}}} \\ s_{13} - \epsilon_{23}\sqrt{M_{S_{22}}} \end{bmatrix} \tag{91}$$

where the notation $n_e(s_{ii})$ means the expression of $n_e$ obtained using $s_{ii}$ (i.e. dividing by $s_{ii}$).

The third alternative is developing the formulas dividing by $s_{33}$. In this case we obtain:

$$n'_a(s_{33}) = \begin{bmatrix} s_{13} + \epsilon_{12}\sqrt{M_{S_{22}}} \\ s_{23} + \sqrt{M_{S_{11}}} \\ s_{33} \end{bmatrix} ; \qquad n'_b(s_{33}) = \begin{bmatrix} s_{13} - \epsilon_{12}\sqrt{M_{S_{22}}} \\ s_{23} - \sqrt{M_{S_{11}}} \\ s_{33} \end{bmatrix}$$

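On the other hand, we may prefer to write the expressions of $n_e$ directly in terms of the column vectors of the $H$ matrix, $h_i$:

$$H = \begin{bmatrix} h_1 & h_2 & h_3 \end{bmatrix}$$

We give, as an example, the result derived from $s_{22}$:

$$n'_a(s_{22}) = \begin{bmatrix} h_1^\top h_2 + \sqrt{(h_1^\top h_2)^2 - (\|h_1\|^2 - 1)(\|h_2\|^2 - 1)} \\ \|h_2\|^2 - 1 \\ h_2^\top h_3 - \epsilon_{13}\sqrt{(h_2^\top h_3)^2 - (\|h_2\|^2 - 1)(\|h_3\|^2 - 1)} \end{bmatrix} \tag{92}$$

$$n'_b(s_{22}) = \begin{bmatrix} h_1^\top h_2 - \sqrt{(h_1^\top h_2)^2 - (\|h_1\|^2 - 1)(\|h_2\|^2 - 1)} \\ \|h_2\|^2 - 1 \\ h_2^\top h_3 + \epsilon_{13}\sqrt{(h_2^\top h_3)^2 - (\|h_2\|^2 - 1)(\|h_3\|^2 - 1)} \end{bmatrix} \tag{93}$$

being $\epsilon_{13} = \mathrm{sign}(M_{S_{13}})$, which can be written as:

$$\epsilon_{13} = \mathrm{sign}\left(-h_1^\top \left(I + [h_2]_\times^2\right) h_3\right)$$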

Computation of the translation vector

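Next, we want to obtain the expression for the translation vector. From (56)-(58), the $x$ vector can be computed as:

$$x_e = \begin{bmatrix} x_{e1} \\ x_{e2} \\ x_{e3} \end{bmatrix} ; \qquad x_{ei} = \frac{s_{ii} - y_{ei}^2}{2\,y_{ei}} ; \quad i = 1..3$$

We can rewrite this expression in terms of the translation and normal vectors, using (46)-(47). In particular, we consider here the translation vector in the reference frame, $t^*_e = R_e^\top t_e$:

$$t^*_e = \frac{1}{2}\begin{bmatrix} s_{11}/n_{e1} \\ s_{22}/n_{e2} \\ s_{33}/n_{e3} \end{bmatrix} - \frac{\|t_e\|^2}{2}\, n_e ; \quad e = \{a, b\}$$

This formula cannot be applied as such when any of the components of the normal vector is null. In those cases, we get an indetermination, as $n_i = 0 \implies s_{ii} = 0$. In order to avoid this, we make a simple operation that allows us to cancel out $s_{ii}$ from $n_{ei}$. Consider, for instance, the ratio $s_{11}/n_{a1}$:

$$\frac{s_{11}}{n_{a1}} = \|n'_a\|\,\frac{s_{11}}{n'_{a1}} = \|n'_a\|\,\frac{s_{11}}{s_{12} + \sqrt{M_{S_{33}}}}$$

Multiplying and dividing by $(s_{12} - \sqrt{M_{S_{33}}})$, and using $s_{12}^2 - M_{S_{33}} = s_{11}s_{22}$, we obtain:

$$\frac{s_{11}}{n_{a1}} = \|n'_a\|\,\frac{s_{12} - \sqrt{M_{S_{33}}}}{s_{22}}$$

With a similar operation in the other components, we get:

$$t^*_a = \frac{\|n'_a\|}{2\,s_{22}}\begin{bmatrix} s_{12} - \sqrt{M_{S_{33}}} \\ s_{22} \\ s_{23} + \epsilon_{13}\sqrt{M_{S_{11}}} \end{bmatrix} - \frac{\|t_e\|^2}{2\,\|n'_a\|}\begin{bmatrix} s_{12} + \sqrt{M_{S_{33}}} \\ s_{22} \\ s_{23} - \epsilon_{13}\sqrt{M_{S_{11}}} \end{bmatrix} \tag{94}$$

$$t^*_b = \frac{\|n'_b\|}{2\,s_{22}}\begin{bmatrix} s_{12} + \sqrt{M_{S_{33}}} \\ s_{22} \\ s_{23} - \epsilon_{13}\sqrt{M_{S_{11}}} \end{bmatrix} - \frac{\|t_e\|^2}{2\,\|n'_b\|}\begin{bmatrix} s_{12} - \sqrt{M_{S_{33}}} \\ s_{22} \\ s_{23} + \epsilon_{13}\sqrt{M_{S_{11}}} \end{bmatrix} \tag{95}$$

Comparing with (90), it is clear that the translation vector can be obtained from the normals:

$$t^*_a = \frac{\|n'_a\|}{2\,s_{22}}\, n'_b - \frac{\|t_e\|^2}{2}\, n_a$$

$$t^*_b = \frac{\|n'_b\|}{2\,s_{22}}\, n'_a - \frac{\|t_e\|^2}{2}\, n_b$$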

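In order to avoid dependencies on the non-unitary vectors $n'_e$, we write:

$$t^*_a = \frac{\|n'_a\|\,\|n'_b\|}{2\,s_{22}}\, n_b - \frac{\|t_e\|^2}{2}\, n_a$$

$$t^*_b = \frac{\|n'_b\|\,\|n'_a\|}{2\,s_{22}}\, n_a - \frac{\|t_e\|^2}{2}\, n_b$$

It can be verified that the scalar quotient appearing in the first term of both equations has the same value for all the solutions. In fact, we can write it as:

$$\frac{\|n'_a\|\,\|n'_b\|}{|s_{22}|} = \rho\,\|t_e\|$$

being $\rho$:

$$\rho^2 = b + \nu \implies \rho^2 = 2 + \mathrm{trace}(S) + \nu = \|t_e\|^2 + 2\,\nu \tag{96}$$

where $\nu$ was given in (86). Finally, we can write compact expressions for $t^*_e$ from $n_e$:

$$t^*_a = \frac{\|t_e\|}{2}\left(\epsilon_{s_{22}}\,\rho\; n_b - \|t_e\|\, n_a\right) \tag{97}$$

$$t^*_b = \frac{\|t_e\|}{2}\left(\epsilon_{s_{22}}\,\rho\; n_a - \|t_e\|\, n_b\right) \tag{98}$$

being $\epsilon_{s_{22}} = \mathrm{sign}(s_{22})$.

In this case, for the $\mathrm{sign}(\cdot)$ function we do not have the same problem as for $\epsilon_{13}$ in (90), as we assumed $s_{22} \neq 0$. This means that we can use the typical $\mathrm{sign}(\cdot)$ function ($\mathrm{sign}(0) = 0$) for $\epsilon_{s_{22}}$.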
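The compact expressions (97)-(98) can be exercised with a short sketch. It is an illustration under stated assumptions rather than a reference implementation: it uses the $s_{22}$ branch only, takes $\nu$ from (87), the norm from (89) and $\rho$ from (96), and assumes the minor definitions $M_{S_{11}} = s_{23}^2 - s_{22}s_{33}$, $M_{S_{22}} = s_{13}^2 - s_{11}s_{33}$, $M_{S_{33}} = s_{12}^2 - s_{11}s_{22}$. The function name and the sample $S$ (built from $x = (0.6, 0, 0.8)$ and $y = (0, 1, 0)$) are hypothetical.

```python
import math

def sign(v):
    return -1.0 if v < 0.0 else 1.0

def unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def normals_and_translations(S):
    """Return (n_a, t*_a), (n_b, t*_b) using (90) and (97)-(98); needs s22 != 0."""
    s11, s12, s13 = S[0]
    s22, s23, s33 = S[1][1], S[1][2], S[2][2]
    m11 = max(s23 * s23 - s22 * s33, 0.0)
    m22 = max(s13 * s13 - s11 * s33, 0.0)
    m33 = max(s12 * s12 - s11 * s22, 0.0)
    e13 = sign(s22 * s13 - s12 * s23)
    na = unit([s12 + math.sqrt(m33), s22, s23 - e13 * math.sqrt(m11)])
    nb = unit([s12 - math.sqrt(m33), s22, s23 + e13 * math.sqrt(m11)])
    tr = s11 + s22 + s33
    nu = 2.0 * math.sqrt(1.0 + tr - m11 - m22 - m33)  # (87)
    t_norm = math.sqrt(2.0 + tr - nu)                 # (89)
    rho = math.sqrt(2.0 + tr + nu)                    # (96)
    k = sign(s22) * rho
    ta = [0.5 * t_norm * (k * nb[i] - t_norm * na[i]) for i in range(3)]  # (97)
    tb = [0.5 * t_norm * (k * na[i] - t_norm * nb[i]) for i in range(3)]  # (98)
    return (na, ta), (nb, tb)

# Hypothetical S obtained from x = (0.6, 0, 0.8), y = (0, 1, 0).
S = [[0.0, 0.6, 0.0],
     [0.6, 1.0, 0.8],
     [0.0, 0.8, 0.0]]
(na, ta), (nb, tb) = normals_and_translations(S)
print(nb, tb)  # close to (0, 1, 0) and (0.6, 0, 0.8) for this S
```

Each returned pair can be checked by rebuilding $S$ from $y = \|t_e\|\,n_e$ and $x = t^*_e/\|t_e\|$, which is how the pairing property stated later in this section can be confirmed numerically.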

In order to find the translation vector in the current frame, $t_e$, we need to compute the rotation matrix in advance.

The complete set of formulas. Again, we need a complete set of formulas that makes the computation of the translation vector possible even if some $s_{ii}$ are null. In particular, relation (91) was obtained by dividing by $s_{11}$. The translation vector derived from it is:

$$t^*_a(s_{11}) = \frac{\|n'_a(s_{11})\|}{2\,s_{11}}\begin{bmatrix} s_{11} \\ s_{12} - \sqrt{M_{S_{33}}} \\ s_{13} - \epsilon_{23}\sqrt{M_{S_{22}}} \end{bmatrix} - \frac{\|t_e\|^2}{2\,\|n'_a(s_{11})\|}\begin{bmatrix} s_{11} \\ s_{12} + \sqrt{M_{S_{33}}} \\ s_{13} + \epsilon_{23}\sqrt{M_{S_{22}}} \end{bmatrix}$$

$$t^*_b(s_{11}) = \frac{\|n'_b(s_{11})\|}{2\,s_{11}}\begin{bmatrix} s_{11} \\ s_{12} + \sqrt{M_{S_{33}}} \\ s_{13} + \epsilon_{23}\sqrt{M_{S_{22}}} \end{bmatrix} - \frac{\|t_e\|^2}{2\,\|n'_b(s_{11})\|}\begin{bmatrix} s_{11} \\ s_{12} - \sqrt{M_{S_{33}}} \\ s_{13} - \epsilon_{23}\sqrt{M_{S_{22}}} \end{bmatrix}$$

We can also write the expressions directly in terms of $n_a(s_{11})$ and $n_b(s_{11})$:

$$t^*_a(s_{11}) = \frac{\|t_e\|}{2}\left(\epsilon_{s_{11}}\,\rho\; n_b(s_{11}) - \|t_e\|\, n_a(s_{11})\right)$$

$$t^*_b(s_{11}) = \frac{\|t_e\|}{2}\left(\epsilon_{s_{11}}\,\rho\; n_a(s_{11}) - \|t_e\|\, n_b(s_{11})\right)$$

The third alternative is developing the formulas dividing by $s_{33}$. In this case we obtain:

$$t^*_a(s_{33}) = \frac{\|n'_a(s_{33})\|}{2\,s_{33}}\begin{bmatrix} s_{13} - \epsilon_{12}\sqrt{M_{S_{22}}} \\ s_{23} - \sqrt{M_{S_{11}}} \\ s_{33} \end{bmatrix} - \frac{\|t_e\|^2}{2\,\|n'_a(s_{33})\|}\begin{bmatrix} s_{13} + \epsilon_{12}\sqrt{M_{S_{22}}} \\ s_{23} + \sqrt{M_{S_{11}}} \\ s_{33} \end{bmatrix}$$

$$t^*_b(s_{33}) = \frac{\|n'_b(s_{33})\|}{2\,s_{33}}\begin{bmatrix} s_{13} + \epsilon_{12}\sqrt{M_{S_{22}}} \\ s_{23} + \sqrt{M_{S_{11}}} \\ s_{33} \end{bmatrix} - \frac{\|t_e\|^2}{2\,\|n'_b(s_{33})\|}\begin{bmatrix} s_{13} - \epsilon_{12}\sqrt{M_{S_{22}}} \\ s_{23} - \sqrt{M_{S_{11}}} \\ s_{33} \end{bmatrix}$$

The alternative expression of the translation vector directly from $n_a(s_{33})$ and $n_b(s_{33})$ is, as expected:

$$t^*_a(s_{33}) = \frac{\|t_e\|}{2}\left(\epsilon_{s_{33}}\,\rho\; n_b(s_{33}) - \|t_e\|\, n_a(s_{33})\right)$$

$$t^*_b(s_{33}) = \frac{\|t_e\|}{2}\left(\epsilon_{s_{33}}\,\rho\; n_a(s_{33}) - \|t_e\|\, n_b(s_{33})\right)$$

From the previous expressions, we see that the formulas we could write for $t^*_e$ in terms of the columns of matrix $H$ will not be as simple as those given in (92)-(93) for the normal vector. On top of this, we need to multiply by the rotation matrix in order to get $t_e$ from $t^*_e = R_e^\top t_e$. In the next subsection, we will propose a different starting point for the development, such that the expressions obtained for $t_e$ will be exactly as simple as those already obtained for $n_e$ in (90), (92) and (93). However, it must be noticed that, computing the normal and translation vectors using the previous set of formulas, we always get the right couples. That means that, for instance, $t_a$ (and not $-t_a$, $t_b$ or $-t_b$) is always the right couple for $n_a$, without requiring any additional checking.

Computation of the rotation matrix

The rotation matrix can be obtained from $x$ and $y$, using the definition of the homography matrix and the relations (46)-(47):

$$H = R + t\,n^\top = R\,(I + x\,y^\top)$$

as:

$$R = H\,(I + x\,y^\top)^{-1}$$

The required inverse matrix can be computed making use of the following relation:

$$(I + x\,y^\top)^{-1} = I - \frac{1}{1 + x^\top y}\, x\,y^\top$$

Then, the rotation matrix can be written as:

$$R_e = H\left(I - \frac{1}{1 + x_e^\top y_e}\, x_e\,y_e^\top\right)$$
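The closed-form inverse above avoids any numerical matrix inversion. A small sketch, with hypothetical test values and an illustrative function name, shows the recovery of a known rotation from $H = R\,(I + x\,y^\top)$:

```python
import math

def matmul(A, B):
    """Product of two 3x3 nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_from_xy(H, x, y):
    """R = H (I - x y^T / (1 + x^T y)); valid when 1 + x^T y != 0."""
    k = 1.0 / (1.0 + sum(p * q for p, q in zip(x, y)))
    M = [[(i == j) - k * x[i] * y[j] for j in range(3)] for i in range(3)]
    return matmul(H, M)

# Hypothetical data: rotation of 0.3 rad about z, unit x, arbitrary y.
c, s = math.cos(0.3), math.sin(0.3)
R_true = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
x = [0.6, 0.0, 0.8]
y = [0.1, 1.0, -0.2]
I_plus_xy = [[(i == j) + x[i] * y[j] for j in range(3)] for i in range(3)]
H = matmul(R_true, I_plus_xy)
R = rotation_from_xy(H, x, y)
print(R)  # should reproduce R_true up to round-off
```

The denominator $1 + x^\top y$ is strictly positive for the solutions that satisfy condition (76), so the division is safe for them.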

Alternatively, we can express the rotation matrix directly in terms of $n_e$ and $t^*_e$:

$$R_e = H\left(I - \frac{1}{1 + n_e^\top t^*_e}\, t^*_e\, n_e^\top\right) = H\left(I - \frac{2}{\nu}\, t^*_e\, n_e^\top\right) \tag{99}$$

4.2.2 Second method

Even if we consider that, for the analytical computation of the translation vector, the given formulas (94)-(95) or (97)-(98) are good enough, it may sometimes be convenient to have a more compact form for obtaining this vector. For instance, if we want to study the effects of camera-calibration errors on the translation vector derived from homography decomposition, having a pair of formulas for it similar to (92) and (93), directly in terms of the elements of matrix $H$, would greatly simplify the analysis. In this subsection, we will derive such a more compact expression for $t_e$.

Computation of the translation vector

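The symmetry of the problem suggests that, instead of (45), we could have started by defining the matrix:

$$S_r = H\,H^\top - I = (R + t\,n^\top)(R^\top + n\,t^\top) - I = R\,n\,t^\top + t\,n^\top R^\top + t\,t^\top$$

We also redefine vectors $x$ and $y$ as:

$$x = R\,n$$

$$y = t$$

Then, $S_r$ can be written as:

$$S_r = x\,y^\top + y\,x^\top + y\,y^\top = \begin{bmatrix} s_{r11} & s_{r12} & s_{r13} \\ s_{r12} & s_{r22} & s_{r23} \\ s_{r13} & s_{r23} & s_{r33} \end{bmatrix}$$

It can be easily verified that:

$$\mathrm{trace}(S_r) = \mathrm{trace}(S)$$

$$\mathrm{trace}(S_r^2) = \mathrm{trace}(S^2)$$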

Matrix $S_r$ has exactly the same form as $S$ in (48), but now with a different definition of vectors $x$ and $y$. This means that we can use the same results we had before, with the only difference that the first elements we derive now, with the expressions equivalent to those given in (90), are vectors parallel to $t_e$ instead of vectors parallel to $n_e$. Then, we can obtain the new relations for vector $t_e$:

$$t'_a = \begin{bmatrix} s_{r12} + \sqrt{M_{S_r 33}} \\ s_{r22} \\ s_{r23} - \epsilon_{r13}\sqrt{M_{S_r 11}} \end{bmatrix} ; \qquad t'_b = \begin{bmatrix} s_{r12} - \sqrt{M_{S_r 33}} \\ s_{r22} \\ s_{r23} + \epsilon_{r13}\sqrt{M_{S_r 11}} \end{bmatrix} \tag{100}$$

where $M_{S_r ii}$ and $\epsilon_{rij}$ have the same meaning as before, but referred to matrix $S_r$ instead of $S$. From (100), we compute the translation vector with the right norm:

$$t_e = \|t_e\|\,\frac{t'_e}{\|t'_e\|}$$

where we need the expression for $\|t_e\|$ which, according to (89), is:

$$\|t_e\|^2 = 2 + \mathrm{trace}(S_r) - \nu = 2 + \mathrm{trace}(S) - \nu$$

and $\nu$ can be computed using (86) or (87), with either $S$ or $S_r$.

On the other hand, notice that, in a similar way to (38), $S_r$ can be written as:

$$S_r = \begin{bmatrix} \|h_{r1}\|^2 - 1 & h_{r1}^\top h_{r2} & h_{r1}^\top h_{r3} \\ h_{r1}^\top h_{r2} & \|h_{r2}\|^2 - 1 & h_{r2}^\top h_{r3} \\ h_{r1}^\top h_{r3} & h_{r2}^\top h_{r3} & \|h_{r3}\|^2 - 1 \end{bmatrix} \tag{101}$$

where $h_{ri}^\top$, $i = 1..3$, denotes each row of matrix $H$:

$$H = \begin{bmatrix} h_{r1}^\top \\ h_{r2}^\top \\ h_{r3}^\top \end{bmatrix}$$

This is why the new $S_r$ matrix was named with subindex $r$: because it can be written in terms of the rows of $H$. Using (100) and (101), we can write $t_e$ in terms of the components of matrix $H$, as intended:

$$t'_a = \begin{bmatrix} h_{r1}^\top h_{r2} + \sqrt{(h_{r1}^\top h_{r2})^2 - (\|h_{r1}\|^2 - 1)(\|h_{r2}\|^2 - 1)} \\ \|h_{r2}\|^2 - 1 \\ h_{r2}^\top h_{r3} - \epsilon_{r13}\sqrt{(h_{r2}^\top h_{r3})^2 - (\|h_{r2}\|^2 - 1)(\|h_{r3}\|^2 - 1)} \end{bmatrix}$$

$$t'_b = \begin{bmatrix} h_{r1}^\top h_{r2} - \sqrt{(h_{r1}^\top h_{r2})^2 - (\|h_{r1}\|^2 - 1)(\|h_{r2}\|^2 - 1)} \\ \|h_{r2}\|^2 - 1 \\ h_{r2}^\top h_{r3} + \epsilon_{r13}\sqrt{(h_{r2}^\top h_{r3})^2 - (\|h_{r2}\|^2 - 1)(\|h_{r3}\|^2 - 1)} \end{bmatrix}$$

being $\epsilon_{r13} = \mathrm{sign}(M_{S_r 13})$, which can be written as:

$$\epsilon_{r13} = \mathrm{sign}\left(-h_{r1}^\top\left(I + [h_{r2}]_\times^2\right) h_{r3}\right)$$

The vectors $t_e$ (with $e = a, b$) computed in this way are actually $t_e(s_{22})$. The complete set of formulas, including $t_e(s_{11})$ and $t_e(s_{33})$, is given in the summary.
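Since $S_r = H H^\top - I$ is the matrix $S$ of the transposed homography, the second method can be sketched by reusing the routine of the first method on $H^\top$. The code below is an illustration under assumptions: it covers only the branch dividing by $s_{r22}$, takes $\nu$ from (86), and uses hypothetical names and test data.

```python
import math

def sign(v):
    return -1.0 if v < 0.0 else 1.0

def s_matrix(H):
    """S = H^T H - I for a 3x3 nested list H."""
    return [[sum(H[k][i] * H[k][j] for k in range(3)) - (i == j)
             for j in range(3)] for i in range(3)]

def directions_s22(S):
    """Unit vectors built as in (90) / (100) from a matrix of the form S."""
    s11, s12, s13 = S[0]
    s22, s23, s33 = S[1][1], S[1][2], S[2][2]
    r11 = math.sqrt(max(s23 * s23 - s22 * s33, 0.0))
    r33 = math.sqrt(max(s12 * s12 - s11 * s22, 0.0))
    e13 = sign(s22 * s13 - s12 * s23)
    out = []
    for v in ([s12 + r33, s22, s23 - e13 * r11], [s12 - r33, s22, s23 + e13 * r11]):
        norm = math.sqrt(sum(c * c for c in v))
        out.append([c / norm for c in v])
    return out

def translations_second_method(H):
    """Candidate t_a, t_b: directions from S_r = H H^T - I, norm from (86), (89)."""
    Ht = [list(col) for col in zip(*H)]
    Sr = s_matrix(Ht)                       # equals H H^T - I
    tr = Sr[0][0] + Sr[1][1] + Sr[2][2]
    tr2 = sum(Sr[i][j] * Sr[j][i] for i in range(3) for j in range(3))
    nu = math.sqrt(2.0 * ((1.0 + tr) ** 2 + 1.0 - tr2))   # (86)
    t_norm = math.sqrt(2.0 + tr - nu)                      # from (89)
    return [[t_norm * c for c in d] for d in directions_s22(Sr)]

# Hypothetical motion, as in a normalized homography H = R + t n^T.
c, s = math.cos(0.3), math.sin(0.3)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
t = [0.1, -0.2, 0.3]
n = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
H = [[R[i][j] + t[i] * n[j] for j in range(3)] for i in range(3)]
ta, tb = translations_second_method(H)
print(ta, tb)  # one of them should equal t up to sign
```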

Computation of the normal vector

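In Section 4.2.1, we computed $t^*_e = R_e^\top t_e$ from the expressions of $n_e$. Following a symmetric development, we can now obtain the expressions for $n'_e = R_e\,n_e$ from the given expressions of $t_e$. These are:

$$n'_a(s_{rii}) = \frac{1}{2}\left(\epsilon_{s_{rii}}\,\frac{\rho}{\|t_e\|}\; t_b(s_{rii}) - t_a(s_{rii})\right)$$

$$n'_b(s_{rii}) = \frac{1}{2}\left(\epsilon_{s_{rii}}\,\frac{\rho}{\|t_e\|}\; t_a(s_{rii}) - t_b(s_{rii})\right)$$

being $\epsilon_{s_{rii}} = \mathrm{sign}(s_{rii})$.

Computation of the rotation matrix

An expression for the rotation matrix, analogous to (99), can also be obtained:

$$R_e^\top = H^\top\left(I - \frac{2}{\nu}\, n'_e\, t_e^\top\right)$$

or, what is the same:

$$R_e = \left(I - \frac{2}{\nu}\, t_e\, {n'_e}^\top\right) H$$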

5 Relations among the possible solutions

The purpose of this section can be easily understood by posing the following question: suppose we have one of the four solutions referred to in (7)-(10) of the homography decomposition for a given homography matrix. Is it possible to find expressions that allow us to compute the other three possible solutions? The answer is yes, and these expressions will be given in this section. In particular, we need to find the expressions of one of the solutions in terms of the other one:

$$Rtn_b = f(Rtn_a)$$

$$Rtn_a = f(Rtn_b)$$

We will take advantage of the described analytical decomposition procedure in order to get these relations. Again, we first introduce the expressions relating the solutions, and then proceed with the detailed description.

5.1 Summary of the relations among the possible solutions

We summarize here the final expressions obtained. Suppose we have one solution and we call it $Rtn_a = \{R_a, t_a, n_a\}$. Then, the other solutions can be denoted by:

$$Rtn_b = \{R_b, t_b, n_b\}$$

$$Rtn_{a-} = \{R_a, -t_a, -n_a\}$$

$$Rtn_{b-} = \{R_b, -t_b, -n_b\}$$

where the elements $R_b$, $t_b$ and $n_b$ can be obtained as:

$$t_b = \frac{\|t_a\|}{\rho}\, R_a\left(2\,n_a + R_a^\top t_a\right) \tag{102}$$

$$n_b = \frac{1}{\rho}\left(\|t_a\|\, n_a + \frac{2}{\|t_a\|}\, R_a^\top t_a\right) \tag{103}$$

$$R_b = R_a + \frac{2}{\rho^2}\left[\nu\, t_a n_a^\top - t_a t_a^\top R_a - \|t_a\|^2 R_a n_a n_a^\top - 2\, R_a n_a t_a^\top R_a\right] \tag{104}$$

In these relations the subindexes $a$ and $b$ can be exchanged. The coefficients $\rho$ and $\nu$ are:

$$\rho = \|2\,n_e + R_e^\top t_e\| > 1 ; \quad e = \{a, b\} \tag{105}$$

$$\nu = 2\,(n_e^\top R_e^\top t_e + 1) > 0 \tag{106}$$

For the rotation axes and angles:

$$r_b = \frac{\left[(2 - n_a^\top t_a)\, I + t_a n_a^\top + n_a t_a^\top\right] r_a + (n_a \times t_a)}{2 + n_a^\top t_a + r_a^\top (n_a \times t_a)} \tag{108}$$

being $r_e$ the chosen parametrization for a rotation of an angle $\theta_e$ about an axis $u_e$:

$$r_e = \tan\frac{\theta_e}{2}\; u_e ; \quad e = \{a, b\}$$

It must be pointed out that, in the usual case when two solutions verify the visibility constraint according to the given set of image points, if we suppose that one of them is $Rtn_a$, the other one can be either $Rtn_b$ or $Rtn_{b-}$, according to the formulation given.
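Relations (102)-(106) can be checked numerically with a short sketch. The function names and the sample motion are hypothetical; the check relies only on the fact that both solutions must reproduce the same homography, $R_a + t_a n_a^\top = R_b + t_b n_b^\top$.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def matvec(A, v):
    return [dot(row, v) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def other_solution(Ra, ta, na):
    """(R_b, t_b, n_b) from (R_a, t_a, n_a) via (102)-(106); needs ||t_a|| > 0."""
    ts = matvec(transpose(Ra), ta)                 # R_a^T t_a
    tn = math.sqrt(dot(ta, ta))                    # ||t_a||
    v = [2.0 * na[i] + ts[i] for i in range(3)]    # 2 n_a + R_a^T t_a
    rho = math.sqrt(dot(v, v))                     # (105)
    nu = 2.0 * (dot(na, ts) + 1.0)                 # (106)
    tb = [tn / rho * c for c in matvec(Ra, v)]     # (102)
    nb = [(tn * na[i] + 2.0 / tn * ts[i]) / rho for i in range(3)]  # (103)
    Rn = matvec(Ra, na)                            # R_a n_a
    g = 2.0 / rho ** 2
    Rb = [[Ra[i][j] + g * (nu * ta[i] * na[j] - ta[i] * ts[j]
                           - tn * tn * Rn[i] * na[j] - 2.0 * Rn[i] * ts[j])
           for j in range(3)] for i in range(3)]   # (104)
    return Rb, tb, nb

def homography(R, t, n):
    return [[R[i][j] + t[i] * n[j] for j in range(3)] for i in range(3)]

# Hypothetical first solution.
c, s = math.cos(0.3), math.sin(0.3)
Ra = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
ta = [0.1, -0.2, 0.3]
na = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
Rb, tb, nb = other_solution(Ra, ta, na)
Ha, Hb = homography(Ra, ta, na), homography(Rb, tb, nb)
print(max(abs(Ha[i][j] - Hb[i][j]) for i in range(3) for j in range(3)))
```

The printed value should be at round-off level, and `Rb` should be orthogonal with unit-norm `nb`.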

Special cases

It is worth noticing that these expressions have two singular situations in which they cannot be used. One of them occurs when $\rho = 0$, which we already know is physically impossible. Anyway, we give a geometric interpretation of this situation, in which:

$$t_a = -R_a n_a \|t_a\| \quad \text{and} \quad \|t_a\| = 2 \implies \rho = 0$$

This means that the required motion for the camera going from $\{F^*\}$ to $\{F\}$ implies a displacement towards the reference plane following the direction of the normal and, after crossing this plane, ending at the same distance from it as at the beginning. This is clearly understood if we write the relation:

$$t^*_a = R_a^\top t_a = -2\, n_a$$

where $t^*_a$ is the translation vector expressed in the reference frame. Also, as the translation is normalized with the depth to the plane $d^*$, we will write instead:

$$t^*_a = 2\, d^*\, n_a ; \qquad t^*_a = -R_a^\top t_a$$

As $d^*$ is measured from the reference frame, we have also changed the displacement vector to see it as the displacement from the reference frame to the current frame.

The second singular situation for the formulas relating both solutions is the pure rotation case. This corresponds to the degenerate case of all the singular values of the homography matrix being equal to 1. In this case, no formulas are needed to obtain one solution from the other, as both solutions are the same:

$$t_b = t_a = 0 ; \qquad R_a = R_b$$

5.2 Preliminary relations

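First, we look for the relation between vectors $y$ and $x$ in both solutions, $Rtn_a$ and $Rtn_b$. From (75), we had:

$$y_a = z_a\, y_{a2} ; \qquad y_b = z_b\, y_{b2}$$

being:

$$z_e = \begin{bmatrix} z_{e1} \\ 1 \\ z_{e3} \end{bmatrix} ; \quad e = \{a, b\}$$

A matrix can be introduced as a multiplicative transformation from one to the other:

$$y_b = T_y\, y_a ; \qquad T_y = r_{y2}\begin{bmatrix} r_{z1} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & r_{z3} \end{bmatrix} \tag{109}$$

where $r_{z1}$, $r_{z3}$ and $r_{y2}$ are the corresponding ratios:

$$r_{z1} = \frac{z_{b1}}{z_{a1}} ; \qquad r_{z3} = \frac{z_{b3}}{z_{a3}} ; \qquad r_{y2} = \frac{y_{b2}}{y_{a2}}$$

We start from the expression of the $S$ matrix as a function of the first solution:

$$S = x_a y_a^\top + y_a x_a^\top + y_a y_a^\top$$

which has the form shown in (49), reproduced here for convenience:

$$S = \begin{bmatrix} y_1^2 + 2 y_1 x_1 & y_2 x_1 + y_1 x_2 + y_1 y_2 & y_3 x_1 + y_1 x_3 + y_1 y_3 \\ \cdot & y_2^2 + 2 y_2 x_2 & y_3 x_2 + y_2 x_3 + y_2 y_3 \\ \cdot & \cdot & y_3^2 + 2 y_3 x_3 \end{bmatrix} \tag{110}$$

where, for notational simplicity, $x_i$ and $y_i$ are used instead of $x_{ai}$ and $y_{ai}$. If we compute $z_1$ and $z_3$ using this form of the components of matrix $S$, we get:

$$z_{a1} = \frac{(1 + \epsilon)\, x_1 y_2 + (1 - \epsilon)\, x_2 y_1 + y_1 y_2}{y_2\,(2 x_2 + y_2)} ; \qquad z_{b1} = \frac{(1 - \epsilon)\, x_1 y_2 + (1 + \epsilon)\, x_2 y_1 + y_1 y_2}{y_2\,(2 x_2 + y_2)}$$

$$z_{a3} = \frac{(1 + \epsilon)\, x_3 y_2 + (1 - \epsilon)\, x_2 y_3 + y_3 y_2}{y_2\,(2 x_2 + y_2)} ; \qquad z_{b3} = \frac{(1 - \epsilon)\, x_3 y_2 + (1 + \epsilon)\, x_2 y_3 + y_3 y_2}{y_2\,(2 x_2 + y_2)}$$

where

$$\epsilon = \mathrm{sign}(x_1 y_2 - x_2 y_1)$$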

Then, if $\epsilon = -1$, we get:

$$z_{a1} = \frac{y_1}{y_2}, \quad z_{a3} = \frac{y_3}{y_2} ; \qquad z_{b1} = \frac{2 x_1 + y_1}{2 x_2 + y_2}, \quad z_{b3} = \frac{2 x_3 + y_3}{2 x_2 + y_2}$$

and, in case $\epsilon = +1$, the solutions are swapped:

$$z_{a1} = \frac{2 x_1 + y_1}{2 x_2 + y_2}, \quad z_{a3} = \frac{2 x_3 + y_3}{2 x_2 + y_2} ; \qquad z_{b1} = \frac{y_1}{y_2}, \quad z_{b3} = \frac{y_3}{y_2}$$

We will assume the first case, so that the first solution is the trivial one, that is, the one from which we wrote $S$. The value $\epsilon = +1$ leads to the same result for obtaining one solution from the other; the only difference is that the two solutions are swapped, which does not need to be considered. Then, we assume $\epsilon = -1$ and proceed to compute the ratios:

$$r_{z1} = \frac{z_{b1}}{z_{a1}} = \frac{y_2\,(2 x_1 + y_1)}{y_1\,(2 x_2 + y_2)} ; \qquad r_{z3} = \frac{z_{b3}}{z_{a3}} = \frac{y_2\,(2 x_3 + y_3)}{y_3\,(2 x_2 + y_2)}$$

In order to get the ratio $r_{y2}$, we first compute

$$\frac{w_b}{w_a} = \frac{(b - \nu)/a_b}{(b - \nu)/a_a} = \frac{a_a}{a_b} = \frac{(2 x_2 + y_2)^2\, \|y\|^2}{y_2^2\,(\|y\|^2 + 4\, x^\top y + 4)}$$

From this, the third ratio is

$$r_{y2} = \frac{\sqrt{w_b}}{\sqrt{w_a}} = \frac{(2 x_2 + y_2)\, \|y\|}{y_2 \sqrt{\|y\|^2 + 4\, x^\top y + 4}} \tag{111}$$

Then, matrix $T_y$ can be constructed:

$$T_y = \frac{\|y\|}{\sqrt{\|y\|^2 + 4\, x^\top y + 4}}\begin{bmatrix} \frac{2 x_1 + y_1}{y_1} & 0 & 0 \\ 0 & \frac{2 x_2 + y_2}{y_2} & 0 \\ 0 & 0 & \frac{2 x_3 + y_3}{y_3} \end{bmatrix}$$

Replacing this in (109) gives a very simple expression for computing $y_b$ from $y_a$ and $x_a$:

$$y_b = \pm\frac{\|y_a\|}{\rho}\,(y_a + 2\, x_a) \tag{112}$$

The plus/minus ambiguity is due to the two possible values of the square root in (111). The coefficient $\rho$ is

$$\rho = \sqrt{\|y_e\|^2 + 4\, x_e^\top y_e + 4} = \sqrt{\|x_e \times y_e\|^2 + (x_e^\top y_e + 2)^2} ; \quad e = \{a, b\}$$

No subscript is added to this coefficient, as it is equal for every solution. Analogously, the first solution can be computed from the second one:

$$y_a = \pm\frac{\|y_b\|}{\rho}\,(y_b + 2\, x_b)$$

Now, the relation between the $x$ vectors is obtained using (56)-(58) and the form of $s_{11}$, $s_{22}$ and $s_{33}$ from (110), providing

$$x_b = \frac{1}{\rho}\left(\frac{\nu}{\|y_a\|}\, y_a - \|y_a\|\, x_a\right) \tag{113}$$

$$x_a = \frac{1}{\rho}\left(\frac{\nu}{\|y_b\|}\, y_b - \|y_b\|\, x_b\right)$$

being the coefficient $\nu$

$$\nu = 2\,(x_e^\top y_e + 1) ; \quad e = \{a, b\} \tag{114}$$
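A quick numerical check of (112)-(114) is sketched below, taking the plus sign in (112). It verifies that the pair $(x_b, y_b)$ generates the same matrix $S$ as $(x_a, y_a)$. The vectors and function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def s_from_xy(x, y):
    """S = x y^T + y x^T + y y^T."""
    return [[x[i] * y[j] + y[i] * x[j] + y[i] * y[j] for j in range(3)]
            for i in range(3)]

def second_pair(xa, ya):
    """(x_b, y_b) from (x_a, y_a) using (112) with the plus sign and (113)."""
    c = dot(xa, ya)
    ny = math.sqrt(dot(ya, ya))
    nu = 2.0 * (c + 1.0)                          # (114)
    rho = math.sqrt(ny * ny + 4.0 * c + 4.0)
    yb = [ny / rho * (ya[i] + 2.0 * xa[i]) for i in range(3)]     # (112)
    xb = [(nu / ny * ya[i] - ny * xa[i]) / rho for i in range(3)]  # (113)
    return xb, yb

# Hypothetical first pair: x_a must be a unit vector.
xa = [0.6, 0.0, 0.8]
ya = [0.1, 1.0, -0.2]
xb, yb = second_pair(xa, ya)
Sa, Sb = s_from_xy(xa, ya), s_from_xy(xb, yb)
print(max(abs(Sa[i][j] - Sb[i][j]) for i in range(3) for j in range(3)))
```

The same function applied to $(x_b, y_b)$ returns the original pair, in line with the symmetric expressions given above.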

5.3 Relations for the rotation matrices

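Next, we try to find the relations between $R_a$ and $R_b$. We can start this development from the following expression:

$$H = R_a + t_a n_a^\top = R_b + t_b n_b^\top = R_b\left(I + R_b^\top t_b n_b^\top\right)$$

Then, $R_b$ is

$$R_b = \left(R_a + t_a n_a^\top\right)\left(I + R_b^\top t_b n_b^\top\right)^{-1}$$

We can write $R_b$ as the product of $R_a$ times another rotation matrix:

$$R_b = R_a\, R_{ab} \tag{115}$$

This rotation matrix is:

$$R_{ab} = \left(I + R_a^\top t_a n_a^\top\right)\left(I + R_b^\top t_b n_b^\top\right)^{-1}$$

which can be written in a more compact form as a function of $x_a$, $y_a$ and $x_b$, $y_b$:

$$R_{ab} = \left(I + x_a y_a^\top\right)\left(I + x_b y_b^\top\right)^{-1}$$

The required matrix inversion can be avoided, making use of the following relation:

$$\left(I + x_b y_b^\top\right)^{-1} = I - \frac{1}{1 + x_b^\top y_b}\, x_b y_b^\top$$

As already said, $x_b^\top y_b = x_a^\top y_a$, giving:

$$R_{ab} = \left(I + x_a y_a^\top\right)\left(I - \frac{2}{\nu}\, x_b y_b^\top\right)$$

where the expression of $\nu$ in (114) has been used. Replacing $y_b$ and $x_b$ by their expressions (112) and (113), respectively, and after some manipulations, another form of $R_{ab}$ is obtained:

$$R_{ab} = I + \frac{2}{\rho^2}\left[\nu\, x_a y_a^\top - \|y_a\|^2 x_a x_a^\top - y_a y_a^\top - 2\, y_a x_a^\top\right] \tag{116}$$

Finally, $R_b$ can be written as a function of the elements in $Rtn_a$:

$$R_b = R_a + \frac{2}{\rho^2}\left[\nu\, t_a n_a^\top - t_a t_a^\top R_a - \|t_a\|^2 R_a n_a n_a^\top - 2\, R_a n_a t_a^\top R_a\right]$$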

5.3.1 Relations for the rotation axes and angles

Now we want to find the expression of the rotation axis and angle of $R_b$ as a direct function of the axis and angle corresponding to $R_a$. The following well-known relations will be useful in the developments. Being $R$ a general rotation matrix, its corresponding rotation axis $u$ and rotation angle $\theta$ can be retrieved from:

$$\cos(\theta) = \frac{\mathrm{trace}(R) - 1}{2} ; \qquad [u]_\times = \frac{R - R^\top}{2 \sin(\theta)} ; \qquad u = \frac{k'}{\|k'\|} ; \quad k' = \begin{bmatrix} r_{32} - r_{23} \\ r_{13} - r_{31} \\ r_{21} - r_{12} \end{bmatrix} \tag{117}$$

where $r_{ij}$ are components of the rotation matrix. In order to avoid the ambiguity due to the double solution, $\{u, \theta\}$ and $\{-u, -\theta\}$, we have assumed that the positive angle is always chosen. In the case of the rotation matrix $R_{ab}$ of the form (116), the corresponding angle $\theta_{ab}$ is:

$$\cos(\theta_{ab}) = \frac{(x_a^\top y_a + 2)^2 - \|x_a \times y_a\|^2}{(x_a^\top y_a + 2)^2 + \|x_a \times y_a\|^2}$$

As $(x_a^\top y_a + 2)$ is always strictly positive, we can divide by its square in order to get a more compact form for the trigonometric functions of this angle:

$$\cos(\theta_{ab}) = \frac{1 - \eta^2}{1 + \eta^2} ; \qquad \sin(\theta_{ab}) = \frac{2\,\eta}{1 + \eta^2} ; \qquad \tan(\theta_{ab}) = \frac{2\,\eta}{1 - \eta^2} ; \qquad \tan\frac{\theta_{ab}}{2} = \eta$$

being $\eta$:

$$\eta = \frac{\|x_a \times y_a\|}{x_a^\top y_a + 2} \tag{118}$$

The rotation axis $u_{ab}$ is obtained from the vector $k'$ in (117) which, for the case at hand, can be written as:

$$k' = \begin{bmatrix} r_{32} - r_{23} \\ r_{13} - r_{31} \\ r_{21} - r_{12} \end{bmatrix} = \frac{4}{\rho^2}\left(y_a^\top x_a + 2\right)(y_a \times x_a)$$

As the norm of this vector is:

$$\|k'\| = \frac{4}{\rho^2}\left(y_a^\top x_a + 2\right)\|y_a \times x_a\|$$

the unitary vector $u_{ab}$ will be:

$$u_{ab} = \frac{y_a \times x_a}{\|y_a \times x_a\|}$$

This means that we already have the axis and angle corresponding to the "incremental" rotation $R_{ab}$ as a function of $Rtn_a$:

$$\tan\frac{\theta_{ab}}{2} = \frac{\|n_a \times R_a^\top t_a\|}{n_a^\top R_a^\top t_a + 2} \tag{119}$$

$$u_{ab} = \frac{n_a \times R_a^\top t_a}{\|n_a \times R_a^\top t_a\|} \tag{120}$$

As $R_{ba} = R_{ab}^\top$, and as we always select the positive rotation angle from any rotation matrix, it is clear that:

$$\theta_{ba} = \theta_{ab}$$

$$u_{ba} = -u_{ab}$$
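The angle and axis of the incremental rotation can be cross-checked against the matrix form (116). The sketch below, with hypothetical vectors and function names, builds $R_{ab}$ from (116), computes $\eta$ from (118), and compares the trace and the vector $k'$ of (117) with the closed-form expressions.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def incremental_rotation(x, y):
    """R_ab from (116); x must be a unit vector."""
    c, ny2 = dot(x, y), dot(y, y)
    nu = 2.0 * (c + 1.0)
    g = 2.0 / (ny2 + 4.0 * c + 4.0)   # 2 / rho^2
    return [[(i == j) + g * (nu * x[i] * y[j] - ny2 * x[i] * x[j]
                             - y[i] * y[j] - 2.0 * y[i] * x[j])
             for j in range(3)] for i in range(3)]

def half_angle_tangent(x, y):
    """eta from (118)."""
    k = cross(x, y)
    return math.sqrt(dot(k, k)) / (dot(x, y) + 2.0)

# Hypothetical pair (x_a, y_a).
x = [0.6, 0.0, 0.8]
y = [0.1, 1.0, -0.2]
Rab = incremental_rotation(x, y)
eta = half_angle_tangent(x, y)
theta_ab = 2.0 * math.atan(eta)
k_prime = [Rab[2][1] - Rab[1][2], Rab[0][2] - Rab[2][0], Rab[1][0] - Rab[0][1]]
axis = cross(y, x)
print(Rab[0][0] + Rab[1][1] + Rab[2][2], 1.0 + 2.0 * math.cos(theta_ab))
```

The two printed numbers should coincide, and `k_prime` should be a positive multiple of `axis`.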

In order to get a suitable relation between the axes and angles of rotation, we will now use the relations for the composition of two rotations from the quaternion product.

Making use of the definition of quaternion product

A generic quaternion has a real part and a pure part:

$$\overset{\circ}{q} = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}$$

Any given rotation $R$ can be expressed by the parameters of a unitary quaternion:

$$\alpha = \cos\frac{\theta}{2} ; \qquad \beta = \sin\frac{\theta}{2}\; u$$

being $\theta$ and $u$, as usual, the angle and axis corresponding to the rotation matrix $R$. Consider again our rotation matrix $R_b$, which is the composition of the rotations $R_a$ and $R_{ab}$:

$$R_b = R_a\, R_{ab}$$

The angles and axes of a composition of two rotations are related as follows, according to the quaternion composition rule:

$$\overset{\circ}{q}_b = \overset{\circ}{q}_a \circ \overset{\circ}{q}_{ab}$$

$$\alpha_b = \alpha_a \alpha_{ab} - \beta_a^\top \beta_{ab}$$

$$\beta_b = \beta_a \times \beta_{ab} + \alpha_a \beta_{ab} + \alpha_{ab} \beta_a$$

Hence, the following relations hold:

$$\cos\frac{\theta_b}{2} = \cos\frac{\theta_a}{2}\cos\frac{\theta_{ab}}{2} - \sin\frac{\theta_a}{2}\sin\frac{\theta_{ab}}{2}\; u_a^\top u_{ab}$$

$$\sin\frac{\theta_b}{2}\; u_b = \cos\frac{\theta_{ab}}{2}\sin\frac{\theta_a}{2}\; u_a + \cos\frac{\theta_a}{2}\sin\frac{\theta_{ab}}{2}\; u_{ab} + \sin\frac{\theta_a}{2}\sin\frac{\theta_{ab}}{2}\,(u_a \times u_{ab})$$

From these relations, we can derive two particular cases that may be helpful for future works:

When the axis of rotation, $u_a$, is normal to the plane defined by $t_a$ and $n_a$. In this case, $R_a^\top t_a$ is also on this same plane and, since $u_{ab}$ was defined (see (120)) as

$$u_{ab} = \frac{k_a}{\|k_a\|} ; \qquad k_a = y_a \times x_a = n_a \times R_a^\top t_a$$

the following relations are verified:

$$u_a^\top u_{ab} = 1 ; \qquad u_a \times u_{ab} = 0$$

which means that $u_{ab} = u_a$.

Another particular case is when $u_a = \pm n_a$. In this case (among others), the conditions verified are:

$$u_a^\top u_{ab} = 0 ; \qquad \|u_a \times u_{ab}\| = 1$$

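Resuming our analysis after these comments on particular cases, simpler relations can be obtained if we transform the relations into tangent-dependent relations, dividing both by $\cos\frac{\theta_a}{2}\cos\frac{\theta_{ab}}{2}$:

$$\frac{\cos\frac{\theta_b}{2}}{\cos\frac{\theta_a}{2}\cos\frac{\theta_{ab}}{2}} = 1 - \tan\frac{\theta_a}{2}\tan\frac{\theta_{ab}}{2}\; u_a^\top u_{ab}$$

$$\frac{\sin\frac{\theta_b}{2}}{\cos\frac{\theta_a}{2}\cos\frac{\theta_{ab}}{2}}\; u_b = \tan\frac{\theta_a}{2}\; u_a + \tan\frac{\theta_{ab}}{2}\; u_{ab} + \tan\frac{\theta_a}{2}\tan\frac{\theta_{ab}}{2}\,(u_a \times u_{ab})$$

Dividing again the second one by the first one:

$$r_b = \frac{r_a + r_{ab} + r_a \times r_{ab}}{1 - r_a^\top r_{ab}} \tag{121}$$

where we are combining axis and angle of rotation in the following parametrization:

$$r_e = \tan\frac{\theta_e}{2}\; u_e ; \quad e = \{a, b, ab\} \tag{122}$$

From (119) and (120), we know that:

$$r_{ab} = \frac{n_a \times R_a^\top t_a}{n_a^\top R_a^\top t_a + 2} \tag{123}$$

We could also write the expression of $r_a$ in terms of $r_b$ and $r_{ab}$:

$$r_a = \frac{r_b - r_{ab} - r_b \times r_{ab}}{1 + r_b^\top r_{ab}}$$

which comes directly from (121), using the fact that $r_{ba} = -r_{ab}$. Finally, it is interesting to write $R_e$ in terms of $r_e$. This is easily done starting from Rodrigues' expression of the rotation matrix:

$$R_e = I + \sin(\theta_e)\,[u_e]_\times + (1 - \cos(\theta_e))\,[u_e]_\times^2$$

Expanding the sine and cosine as functions of the half angle, and factoring out $\cos^2(\theta_e/2)$,

$$R_e = I + 2\cos^2\frac{\theta_e}{2}\left([r_e]_\times^2 + [r_e]_\times\right)$$

As $\cos^2(\theta_e/2)$ can be written as a function of the tangent:

$$\cos^2\frac{\theta_e}{2} = \frac{1}{1 + \tan^2\frac{\theta_e}{2}} = \frac{1}{1 + \|r_e\|^2}$$

The final expression is then:

$$R_e = I + \frac{2}{1 + \|r_e\|^2}\left([r_e]_\times^2 + [r_e]_\times\right)$$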
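The composition rule (121) and the final expression of $R_e$ in terms of $r_e$ can be tested together. In the sketch below the two input vectors are arbitrary hypothetical values of the parametrization (122), and the function names are illustrative.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def skew(r):
    """Cross-product matrix [r]_x."""
    return [[0.0, -r[2], r[1]],
            [r[2], 0.0, -r[0]],
            [-r[1], r[0], 0.0]]

def rotation_from_r(r):
    """R = I + 2 / (1 + ||r||^2) ([r]_x^2 + [r]_x), with r = tan(theta/2) u."""
    K = skew(r)
    K2 = matmul(K, K)
    g = 2.0 / (1.0 + dot(r, r))
    return [[(i == j) + g * (K2[i][j] + K[i][j]) for j in range(3)]
            for i in range(3)]

def compose(ra, rab):
    """r_b for R_b = R_a R_ab, following (121); needs 1 - ra.rab != 0."""
    cr = cross(ra, rab)
    d = 1.0 - dot(ra, rab)
    return [(ra[i] + rab[i] + cr[i]) / d for i in range(3)]

# Hypothetical rotation parameters.
ra = [0.1, -0.2, 0.3]
rab = [0.05, 0.4, -0.1]
rb = compose(ra, rab)
Rb_direct = matmul(rotation_from_r(ra), rotation_from_r(rab))
Rb_from_rb = rotation_from_r(rb)
print(max(abs(Rb_direct[i][j] - Rb_from_rb[i][j]) for i in range(3) for j in range(3)))
```

The parametrization is singular for rotations of $\pi$ radians, where $\tan(\theta_e/2)$ is unbounded, so the sketch is meant for angles away from that value.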
