
4.5 Singular Value Decomposition

4.5.2 Construction of the SVD

We will next discuss why the SVD exists and show how to compute it in detail. The SVD of a general matrix shares some similarities with the eigendecomposition of a square matrix.

Remark. Compare the eigendecomposition of an SPD matrix

S = S^⊤ = PDP^⊤ (4.68)

with the corresponding SVD

S = UΣV^⊤. (4.69)

If we set

U = P = V, D = Σ, (4.70)

we see that the SVD of SPD matrices is their eigendecomposition. ♦

In the following, we will explore why Theorem 4.22 holds and how the SVD is constructed. Computing the SVD of A ∈ R^{m×n} is equivalent to finding two sets of orthonormal bases U = (u_1, . . . , u_m) and V = (v_1, . . . , v_n) of the codomain R^m and the domain R^n, respectively. From these ordered bases, we will construct the matrices U and V.
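Before starting the construction, here is a quick numerical illustration of the preceding remark (a sketch, not part of the original text; NumPy is assumed and the SPD matrix S below is an arbitrary example): for an SPD matrix, the SVD returns the same orthonormal basis on both sides, and the singular values are the eigenvalues.

    import numpy as np

    # An arbitrary symmetric, positive definite matrix S.
    S = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

    # Eigendecomposition S = P D P^T and SVD S = U Sigma V^T.
    eigvals, P = np.linalg.eigh(S)       # eigenvalues in ascending order
    U, sigma, Vt = np.linalg.svd(S)      # singular values in descending order

    # For an SPD matrix, the singular values are the eigenvalues and U = V.
    print(np.allclose(np.sort(sigma), eigvals))   # True
    print(np.allclose(U, Vt.T))                   # True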

Our plan is to start with constructing the orthonormal set of right-singular vectors v_1, . . . , v_n ∈ R^n. We then construct the orthonormal set of left-singular vectors u_1, . . . , u_m ∈ R^m. Thereafter, we will link the two and require that the orthogonality of the v_i is preserved under the transformation of A. This is important because we know that the images Av_i form a set of orthogonal vectors. We will then normalize these images by scalar factors, which will turn out to be the singular values.

Let us begin with constructing the right-singular vectors. The spectral theorem (Theorem 4.15) tells us that the eigenvectors of a symmetric matrix form an ONB, which also means it can be diagonalized. Moreover, from Theorem 4.14 we can always construct a symmetric, positive semidefinite matrix A^⊤A ∈ R^{n×n} from any rectangular matrix A ∈ R^{m×n}. Thus, we can always diagonalize A^⊤A and obtain

A^⊤A = PDP^⊤ = P diag(λ_1, . . . , λ_n) P^⊤, (4.71)

where P is an orthogonal matrix, which is composed of the orthonormal eigenbasis. The λ_i ≥ 0 are the eigenvalues of A^⊤A. Let us assume the SVD of A exists and inject (4.64) into (4.71). This yields

A^⊤A = (UΣV^⊤)^⊤(UΣV^⊤) = VΣ^⊤U^⊤UΣV^⊤, (4.72)

where U, V are orthogonal matrices. Therefore, with U^⊤U = I we obtain

A^⊤A = VΣ^⊤ΣV^⊤ = V diag(σ_1^2, . . . , σ_n^2) V^⊤. (4.73)

Comparing now (4.71) and (4.73), we identify

V^⊤ = P^⊤, (4.74)

σ_i^2 = λ_i. (4.75)

Therefore, the eigenvectors of A^⊤A that compose P are the right-singular vectors V of A (see (4.74)). The eigenvalues of A^⊤A are the squared singular values of Σ (see (4.75)).
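These two identities are easy to check numerically (a sketch, not part of the original text; NumPy is assumed and A below is an arbitrary rectangular example):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((5, 3))                   # arbitrary A in R^{m x n}

    # Eigendecomposition of the symmetric PSD matrix A^T A = P D P^T.
    lam, P = np.linalg.eigh(A.T @ A)                  # eigenvalues in ascending order

    # Singular values and right-singular vectors from the SVD of A.
    U, sigma, Vt = np.linalg.svd(A)
    V = Vt.T

    # (4.75): the eigenvalues of A^T A are the squared singular values of A.
    print(np.allclose(np.sort(sigma**2), lam))        # True

    # (4.74): the columns of V are eigenvectors of A^T A.
    print(np.allclose(A.T @ A @ V, V * sigma**2))     # True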

To obtain the left-singular vectors U, we follow a similar procedure.

We start by computing the SVD of the symmetric matrix AA^⊤ ∈ R^{m×m} (instead of the previous A^⊤A ∈ R^{n×n}). The SVD of A yields

AA^⊤ = (UΣV^⊤)(UΣV^⊤)^⊤ = UΣV^⊤VΣ^⊤U^⊤ (4.76a)

= U diag(σ_1^2, . . . , σ_m^2) U^⊤. (4.76b)

The spectral theorem tells us that AA^⊤ = SDS^⊤ can be diagonalized and we can find an ONB of eigenvectors of AA^⊤, which are collected in S. The orthonormal eigenvectors of AA^⊤ are the left-singular vectors U and form an orthonormal basis in the codomain of the SVD.

This leaves the question of the structure of the matrix Σ. Since AA^⊤ and A^⊤A have the same nonzero eigenvalues (see page 106), the nonzero entries of the Σ matrices in the SVD for both cases have to be the same.
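The same kind of check works for AA^⊤ (again a sketch with an arbitrary A; not part of the original text): its orthonormal eigenvectors play the role of U, and its nonzero eigenvalues agree with those of A^⊤A.

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((3, 5))                   # m = 3, n = 5

    lam_left = np.linalg.eigvalsh(A @ A.T)            # m eigenvalues of A A^T
    lam_right = np.linalg.eigvalsh(A.T @ A)           # n eigenvalues of A^T A

    # The nonzero eigenvalues coincide; A^T A merely has n - m = 2 extra zeros.
    print(np.allclose(np.sort(lam_left), np.sort(lam_right)[-3:]))   # True

    # The columns of U are eigenvectors of A A^T.
    U, sigma, Vt = np.linalg.svd(A)
    print(np.allclose(A @ A.T @ U, U * sigma**2))     # True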

The last step is to link up all the parts we touched upon so far. We have an orthonormal set of right-singular vectors in V. To finish the construction of the SVD, we connect them with the orthonormal vectors U. To reach this goal, we use the fact that the images of the v_i under A have to be orthogonal, too. We can show this by using the results from Section 3.4.

We require that the inner product between Av_i and Av_j must be 0 for i ≠ j. For any two orthogonal eigenvectors v_i, v_j, i ≠ j, it holds that

(Av_i)^⊤(Av_j) = v_i^⊤(A^⊤A)v_j = v_i^⊤(λ_j v_j) = λ_j v_i^⊤ v_j = 0. (4.77)

For the case m > r, it holds that {Av_1, . . . , Av_r} is a basis of an r-dimensional subspace of R^m.
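Equation (4.77) can be verified numerically in a couple of lines (a sketch, not part of the original text; A is an arbitrary example):

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.standard_normal((6, 4))

    # Right-singular vectors v_i as orthonormal eigenvectors of A^T A.
    _, V = np.linalg.eigh(A.T @ A)

    # Gram matrix of the images A v_i; its off-diagonal entries are
    # exactly the inner products (A v_i)^T (A v_j) for i != j.
    G = (A @ V).T @ (A @ V)
    print(np.allclose(G - np.diag(np.diag(G)), 0))    # True: the images are orthogonal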

To complete the SVD construction, we need left-singular vectors that are orthonormal: We normalize the images of the right-singular vectors Av_i and obtain

u_i := Av_i / ‖Av_i‖ = (1/√λ_i) Av_i = (1/σ_i) Av_i, (4.78)

where the last equality was obtained from (4.75) and (4.76b), showing us that the eigenvalues of AA^⊤ are such that σ_i^2 = λ_i.
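The construction up to this point can be replayed in a few lines (a sketch, not part of the original text; A is an arbitrary full-rank example, so every σ_i > 0 and the normalization in (4.78) is well defined):

    import numpy as np

    rng = np.random.default_rng(4)
    m, n = 5, 3
    A = rng.standard_normal((m, n))                   # full rank with probability 1

    # Right-singular vectors and singular values from A^T A, as in (4.71)-(4.75).
    lam, P = np.linalg.eigh(A.T @ A)
    order = np.argsort(lam)[::-1]                     # descending eigenvalue order
    V, sigma = P[:, order], np.sqrt(lam[order])

    # Left-singular vectors as normalized images, u_i = A v_i / sigma_i  (4.78).
    U = (A @ V) / sigma

    # The u_i are orthonormal, and A v_i = sigma_i u_i holds, cf. (4.79).
    print(np.allclose(U.T @ U, np.eye(n)))            # True
    print(np.allclose(A @ V, U * sigma))              # True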

Therefore, the eigenvectors of A^⊤A, which we know are the right-singular vectors v_i, and their normalized images under A, the left-singular vectors u_i, form two self-consistent ONBs that are connected through the singular value matrix Σ.

Let us rearrange (4.78) to obtain the singular value equation

Av_i = σ_i u_i,   i = 1, . . . , r. (4.79)

This equation closely resembles the eigenvalue equation (4.25), but the vectors on the left- and the right-hand sides are not the same.

For n < m, (4.79) holds only for i ≤ n, but (4.79) says nothing about the u_i for i > n. However, we know by construction that they are orthonormal. Conversely, for m < n, (4.79) holds only for i ≤ m. For i > m, we have Av_i = 0 and we still know that the v_i form an orthonormal set.

This means that the SVD also supplies an orthonormal basis of the kernel (null space) of A, the set of vectors x with Ax = 0 (see Section 2.7.3).

Concatenating the v_i as the columns of V and the u_i as the columns of U yields

AV = UΣ, (4.80)

where Σ has the same dimensions as A and a diagonal structure for rows 1, . . . , r. Hence, right-multiplying with V^⊤ yields A = UΣV^⊤, which is the SVD of A.
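To see the rectangular Σ and the kernel basis concretely (a sketch, not part of the original text; A below is a deliberately rank-deficient example so that r < n):

    import numpy as np

    rng = np.random.default_rng(5)
    A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))   # m = 4, n = 5, rank r = 2

    U, sigma, Vt = np.linalg.svd(A)                   # full SVD: U is 4x4, Vt is 5x5
    V = Vt.T

    # Sigma has the same shape as A, with the singular values on its diagonal.
    Sigma = np.zeros_like(A)
    np.fill_diagonal(Sigma, sigma)

    print(np.allclose(A @ V, U @ Sigma))              # A V = U Sigma   (4.80)
    print(np.allclose(A, U @ Sigma @ Vt))             # A = U Sigma V^T

    # The columns of V beyond the rank span the kernel of A: A v_i = 0 for i > r.
    r = 2
    print(np.allclose(A @ V[:, r:], 0))               # True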

Example 4.13 (Computing the SVD)

Let us find the singular value decomposition of

A = [  1  0  1 ]
    [ −2  1  0 ] .   (4.81)

The SVD requires us to compute the right-singular vectors v_j, the singular values σ_k, and the left-singular vectors u_i.

Step 1: Right-singular vectors as the eigenbasis of A^⊤A. We start by computing

A^⊤A = [  5  −2  1 ]
       [ −2   1  0 ]
       [  1   0  1 ] .

We compute the singular values and right-singular vectors v_j through the eigenvalue decomposition of A^⊤A, which is given as

A^⊤A = PDP^⊤,   P = [  5/√30    0     −1/√6 ]
                    [ −2/√30   1/√5   −2/√6 ]
                    [  1/√30   2/√5    1/√6 ] ,   D = diag(6, 1, 0),

and we obtain the right-singular vectors as the columns of P so that

V = P = [  5/√30    0     −1/√6 ]
        [ −2/√30   1/√5   −2/√6 ]
        [  1/√30   2/√5    1/√6 ] .
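This hand computation can be cross-checked numerically (a sketch, not part of the original text; NumPy is assumed):

    import numpy as np

    A = np.array([[1.0, 0.0, 1.0],
                  [-2.0, 1.0, 0.0]])

    # Eigenvalues of A^T A are 6, 1, 0, i.e. the squared singular values are 6 and 1.
    lam, P = np.linalg.eigh(A.T @ A)
    sigma = np.linalg.svd(A, compute_uv=False)
    print(np.allclose(np.sort(lam)[::-1][:2], sigma**2))       # True

    # Up to sign, the first right-singular vector is (5, -2, 1)/sqrt(30).
    _, _, Vt = np.linalg.svd(A)
    print(np.round(np.abs(Vt[0]) * np.sqrt(30), 6))            # [5. 2. 1.]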
