Vision 3D artificielle - Final exam (duration: 2h30)

P. Monasse and R. Marlet November 18th, 2014

You can choose to answer in French or English, at your convenience.

1 Relative Pose from Line Correspondences

This section explores a method for estimating the rotation and translation between two view positions from observed lines in orthogonal directions. The matrix of internal parameters $K$ is assumed to be known.

You may use the well-known formula relating to the cross product:

$$(a \times b) \times c = (a^T c)\, b - (b^T c)\, a. \tag{1}$$

1. Show that lines in 3D are viewed as lines in 2D by a pinhole camera.

2. Show that parallel lines in 3D along direction vector $d$, expressed in the coordinate system linked to the camera, are projected as concurrent lines at point $v = Kd$, called the vanishing point (“point de fuite” in French).

See Figure 1 for examples of vanishing points.

3. Under what geometric conditions is the vanishing point “at infinity”?

4. Show that if we change the world coordinate system to an arbitrary one (still orthonormal), we can write $v = KRd$; explain the origin of the rotation matrix $R$.

Figure 1: Original image and three sets of parallel 3D lines in orthogonal directions.
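As a numerical illustration of the claim $v = Kd$, the following Python/NumPy sketch projects two parallel 3D lines and intersects their images. The intrinsics `K`, the direction `d` and the points `A1`, `A2` are made-up values chosen only for this check; they do not come from the exam.

```python
import numpy as np

# Hypothetical intrinsics, chosen only for illustration.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

def project(X):
    """Pinhole projection of a 3D point in camera coordinates (homogeneous pixel)."""
    return K @ X

def image_line(A, d):
    """Homogeneous image line through the projections of A and A + d."""
    return np.cross(project(A), project(A + d))

def dehomogenize(p):
    """Convert a homogeneous 2D point with nonzero last coordinate to pixels."""
    return p[:2] / p[2]

d = np.array([1.0, 0.5, 2.0])    # common 3D direction (camera frame)
A1 = np.array([0.0, 0.0, 5.0])   # a point on the first 3D line
A2 = np.array([1.0, -1.0, 4.0])  # a point on the second 3D line

l1 = image_line(A1, d)
l2 = image_line(A2, d)

v = np.cross(l1, l2)             # intersection of the two image lines
v_expected = K @ d               # vanishing point predicted by v = Kd

print(dehomogenize(v), dehomogenize(v_expected))
```

Both printed points should coincide, since homogeneous coordinates are only defined up to scale. When the third coordinate of $Kd$ is zero, `dehomogenize` is not applicable.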

5. Show that vanishing points corresponding to orthogonal directions satisfy

$$v_1^T (K^{-T} K^{-1})\, v_2 = 0. \tag{2}$$

6. Suppose we take the world coordinate system so that $d_1 = (1\ 0\ 0)^T$ and $d_2 = (0\ 1\ 0)^T$, directions corresponding to vanishing points $v_1$ and $v_2$. Show that the rotation matrix $R$ can be written

$$R = \begin{pmatrix} \dfrac{K^{-1}v_1}{\|K^{-1}v_1\|} & \dfrac{K^{-1}v_2}{\|K^{-1}v_2\|} & \dfrac{K^{-1}v_1}{\|K^{-1}v_1\|} \times \dfrac{K^{-1}v_2}{\|K^{-1}v_2\|} \end{pmatrix}. \tag{3}$$

7. Let $l_1$ be a projected line in direction $d_1$. Show that

$$v_1 = l_1 \times (K^{-T} K^{-1} v_2). \tag{4}$$

8. Suppose we have 3 lines, $l_1$ in direction $d_1$ and $l_2$, $l_3$ in the orthogonal direction $d_2$. Summarize an algorithm to compute $R$.
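One possible sketch of such a procedure, chaining equations (4) and (3), is given below in Python/NumPy. It is an illustration under stated assumptions, not a reference solution: the function name, the intrinsics, the test rotation and the image points are all made up, and the sign of each recovered column is ambiguous because vanishing points are homogeneous.

```python
import numpy as np

def rotation_from_three_lines(K, l1, l2, l3):
    """Rotation from l1 (direction d1) and l2, l3 (orthogonal direction d2)."""
    K_inv = np.linalg.inv(K)
    omega = K_inv.T @ K_inv              # K^{-T} K^{-1}
    v2 = np.cross(l2, l3)                # l2 and l3 meet at the vanishing point v2
    v1 = np.cross(l1, omega @ v2)        # equation (4)
    r1 = K_inv @ v1
    r1 = r1 / np.linalg.norm(r1)
    r2 = K_inv @ v2
    r2 = r2 / np.linalg.norm(r2)
    r3 = np.cross(r1, r2)
    return np.column_stack([r1, r2, r3]) # equation (3)

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

# Synthetic data (hypothetical values, for checking only).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R_true = rot_z(0.3) @ rot_x(0.4)
v1_true = K @ R_true[:, 0]               # vanishing point of d1 = (1 0 0)^T
v2_true = K @ R_true[:, 1]               # vanishing point of d2 = (0 1 0)^T

# Image lines through the vanishing points and arbitrary other pixels.
l1 = np.cross(v1_true, np.array([100.0, 50.0, 1.0]))
l2 = np.cross(v2_true, np.array([400.0, 300.0, 1.0]))
l3 = np.cross(v2_true, np.array([50.0, 400.0, 1.0]))

R_est = rotation_from_three_lines(K, l1, l2, l3)
print(R_est)
```

On this noise-free data the first two columns of `R_est` agree with those of `R_true` up to sign, and `R_est` is a proper rotation by construction since its third column is the cross product of the first two.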

9. Suppose directions $d_i$, $i = 1, 2$, map to vanishing points $v_i$ and $v'_i$ in two views, linked by relative motion $(R_{rel}, T_{rel})$. Show that we have

$$K^{-1} v_i = R_{rel}^T K^{-1} v'_i. \tag{5}$$

10. From an estimation of $R_{rel}$, we want to evaluate how well a triplet of lines $(l_1, l_2, l_3)$ and $(l'_1, l'_2, l'_3)$ fits this rotation with the criterion:

$$\sum_{j=1}^{2} \cos^{-1}\!\left( \frac{v_j^T K^{-T} R_{rel}^T K^{-1} v'_j}{\|K^{-1} v_j\|\, \|K^{-1} v'_j\|} \right). \tag{6}$$

Justify this criterion.
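For concreteness, criterion (6) can be evaluated as in the following Python/NumPy sketch. The function name and the test values are hypothetical, the clipping of the cosine is a numerical safeguard that is not part of the formula, and the vanishing points are assumed to be given with consistent signs in both views.

```python
import numpy as np

def angular_criterion(K, R_rel, vs, vs_prime):
    """Sum over j of the angle between K^{-1} v_j and R_rel^T K^{-1} v'_j (radians)."""
    K_inv = np.linalg.inv(K)
    total = 0.0
    for v, v_prime in zip(vs, vs_prime):
        a = K_inv @ v
        b = K_inv @ v_prime
        cosine = a @ (R_rel.T @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
        # Clip to [-1, 1] to guard against rounding errors (not in equation (6)).
        total += np.arccos(np.clip(cosine, -1.0, 1.0))
    return total

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical test data: two orthogonal directions seen in two views.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R_rel = rot_z(0.2)
directions = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
vs = [K @ d for d in directions]                 # first view
vs_prime = [K @ R_rel @ d for d in directions]   # second view, consistent with (5)

err_true = angular_criterion(K, R_rel, vs, vs_prime)       # close to 0
err_identity = angular_criterion(K, np.eye(3), vs, vs_prime)  # close to 0.4
print(err_true, err_identity)
```

With the correct rotation the criterion is close to zero; with the identity instead, each of the two directions is off by the rotation angle of 0.2 rad.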

11. Propose a RANSAC-type algorithm estimating $R_{rel}$ from detected lines in both images.

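12. We write $v_{ij}$ and $v'_{ij}$ for the vanishing points estimated by configuration $i$ of three lines and direction $j$. We estimate the least squares solution:

$$R_{rel} = \arg\min_R \sum_{i=1}^{N} \sum_{j=1}^{2} \left\| \frac{K^{-1} v_{ij}}{\|K^{-1} v_{ij}\|} - R^T \frac{K^{-1} v'_{ij}}{\|K^{-1} v'_{ij}\|} \right\|^2. \tag{7}$$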

Write this with $3 \times 2N$ matrices $D$ and $D'$ and Frobenius norm:

$$R_{rel} = \arg\min_R \left\| D - R^T D' \right\|_F^2. \tag{8}$$

(Reminder: $\|A\|_F^2 = \mathrm{Tr}(A^T A)$)

13. Show that

$$R_{rel} = \arg\max_R \mathrm{Tr}(R\, D\, D'^T). \tag{9}$$

14. Writing the SVD decomposition of $D' D^T$ as $U S V^T$, show that

$$R_{rel} = U V^T. \tag{10}$$

15. Suppose that two pairs of lines (assumed to be projections of coplanar 3D lines) intersect at $p_1$, $p_2$ in the left image and $p'_1$, $p'_2$ in the right image. Using the epipolar constraint, show that $T_{rel}$ can be estimated up to scale once $R_{rel}$ is known.
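The closed form (10) for the least squares problem (8) can be checked numerically with the Python/NumPy sketch below. The function name and the synthetic data are assumptions made for this illustration. The determinant guard is a common addition for noisy data and is not part of equation (10); it is inactive on the noise-free example.

```python
import numpy as np

def rotation_from_directions(D, D_prime):
    """Minimizer of ||D - R^T D'||_F over rotations, via the SVD of D' D^T."""
    U, S, Vt = np.linalg.svd(D_prime @ D.T)   # D' D^T = U diag(S) V^T
    R = U @ Vt                                # equation (10)
    if np.linalg.det(R) < 0:
        # Assumption beyond the exam formula: enforce det(R) = +1 by flipping
        # the singular direction with the smallest singular value.
        U = U.copy()
        U[:, -1] *= -1.0
        R = U @ Vt
    return R

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

# Synthetic, noise-free data: 2N = 10 unit directions in the second view.
rng = np.random.default_rng(0)
R_true = rot_z(0.3) @ rot_x(0.4)
D_prime = rng.normal(size=(3, 10))
D_prime = D_prime / np.linalg.norm(D_prime, axis=0)
D = R_true.T @ D_prime                        # columns satisfy equation (5)

R_est = rotation_from_directions(D, D_prime)
print(np.round(R_est - R_true, 12))
```

On exact data the estimate matches `R_true` up to rounding; with noisy vanishing points it would only be the least squares fit.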

2 Feature detection and description

The goal of the following questions is to check your understanding of the course.

You can answer them with just one or a few sentences.

2.1 Repeated elements

Consider the case where you have several images of a scene containing repeated elements, e.g., several similar windows on a facade.

1. What is the impact on feature detection?

2. What is the impact on feature description?

3. What is the impact on feature matching in general?

4. What is the impact on feature matching with the SIFT matching strategy?

5. What is the impact on camera calibration?

6. What is the impact on 3D reconstruction?

2.2 Similarity measures

Consider the way Harris features are detected, using $E_{AC\ SSD}$ as in the original formulation of Harris and Stephens (1988).

1. What is the impact if we now replace $E_{AC\ SSD}$ with $E_{AC\ ZSSD}$?

2. What is the impact if we now replace $E_{AC\ SSD}$ with $E_{AC\ ZNSSD}$?

3. What is the impact if we now replace $E_{AC\ SSD}$ with $E_{AC\ ZNCC}$?
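As a reminder of what the four measures compute, the Python/NumPy sketch below uses common textbook definitions on two patches. The exact normalization conventions of the course slides may differ, so the function bodies are assumptions, and the random patch is made-up data. Constant patches (zero standard deviation) are not handled.

```python
import numpy as np

def ssd(p, q):
    """Sum of squared differences."""
    return float(np.sum((p - q) ** 2))

def zssd(p, q):
    """Zero-mean SSD: each patch is centered on its own mean first."""
    return float(np.sum(((p - p.mean()) - (q - q.mean())) ** 2))

def znssd(p, q):
    """Zero-mean normalized SSD: patches are centered and divided by their std."""
    pn = (p - p.mean()) / p.std()
    qn = (q - q.mean()) / q.std()
    return float(np.sum((pn - qn) ** 2))

def zncc(p, q):
    """Zero-mean normalized cross-correlation, in [-1, 1]; 1 means best match."""
    pn = (p - p.mean()) / p.std()
    qn = (q - q.mean()) / q.std()
    return float(np.mean(pn * qn))

rng = np.random.default_rng(0)
patch = rng.uniform(0.0, 255.0, size=(5, 5))   # made-up 5x5 patch
brighter = patch + 10.0                        # additive intensity offset
rescaled = 2.0 * patch + 10.0                  # gain and offset

print(ssd(patch, brighter), zssd(patch, brighter))
print(znssd(patch, rescaled), zncc(patch, rescaled))
```

Under these definitions, SSD reacts to the offset, ZSSD ignores it, and ZNSSD and ZNCC ignore both gain and offset. Note that ZNCC is a similarity (higher is better) whereas the other three are dissimilarities.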

3 Graph cut for disparity map estimation

Consider the setting used for the assignment (lab work) on disparity map estimation using a graph cut, with exact multi-label optimization (cf. slides, p. 112).

3.1 Dependence on the neighborhood system

In this setting, instead of a 4-pixel neighborhood system, consider the case of 8-pixel neighborhoods, i.e., including diagonals.

1. What bias regarding disparity estimation do we get if we still define the smoothness term as $V_{p,q}(d_p, d_q) = \lambda |d_p - d_q|$ for any two neighboring pixels $p$, $q$?

2. What definition of $V_{p,q}(d_p, d_q)$ should we use instead to prevent this bias?

3.2 Implementation

The question is now how to implement this new definition for $V_{p,q}$ in the framework of a linear multi-label graph construction (cf. slides, p. 74+):

1. What should be the weight of edge $(p_j, q_j)$ for two neighboring pixels $p$, $q$?

2. How is the weight of edge $t^p_j$ affected?
