Gram-Schmidt Vector Orthogonalization - Mathematical and Computational Concepts


2.2 Mathematical and Computational Concepts

2.2.9 Gram-Schmidt Vector Orthogonalization

Important operations in scientific computing are vector orthogonalization and normalization.

The Gram-Schmidt process starts with n linearly independent vectors x_i and ends with n orthonormal vectors q_i, i.e., vectors which are orthogonal to each other and have unit L₂-norm. Let us consider the vectors x_i, i = 0, . . . , n−1, each of length m. We want to produce vectors q_i, i = 0, . . . , n−1, where the first vector q₀ is the normalized x₀, the vector q₁ is orthogonal to q₀ and normalized, the vector q₂ is orthogonal to q₀ and q₁ and normalized, and so on. The idea is to produce a vector

$$y_i = x_i - (q_{i-1}^T x_i)\, q_{i-1} - \dots - (q_0^T x_i)\, q_0,$$

which subtracts from x_i its projection onto each vector q_j for j = 0, . . . , i−1. Having obtained y_i, we can then normalize it to obtain the corresponding orthonormal vector, i.e.,

$$q_i = \frac{y_i}{\|y_i\|_2}.$$

We can summarize this algorithm as follows:

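• Initialize: Compute r₀₀ = ||x₀||₂. If r₀₀ = 0 STOP, else q₀ = x₀/r₀₀.

• Begin Loop: For j = 1, . . . , n−1 Do:

1. Compute r_ij = q^T_i x_j, i = 0, . . . , j−1

2. y_j = x_j − (r₀_j q₀ + . . . + r_j₋₁,_j q_j₋₁)

3. r_jj = ||y_j||₂

4. If r_jj = 0 STOP, else q_j = y_j/r_jj

• End Loop.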
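The algorithm summarized above can also be sketched without any class library. The following is a minimal, self-contained sketch that uses std::vector<double> in place of a vector class; the names Vec, dot, norm2, and classicalGramSchmidt are illustrative assumptions and are not part of the software suite.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;  // illustrative stand-in for a vector class

// Inner product a^T b of two vectors of equal length.
double dot(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) s += a[k] * b[k];
    return s;
}

// L2 norm of a vector.
double norm2(const Vec& a) { return std::sqrt(dot(a, a)); }

// Classical Gram-Schmidt: fills q with orthonormal vectors built from x.
// Returns false if some y_j has zero norm (linearly dependent input).
bool classicalGramSchmidt(const std::vector<Vec>& x, std::vector<Vec>& q) {
    q.clear();
    for (std::size_t j = 0; j < x.size(); ++j) {
        // Subtract the projections of the ORIGINAL x_j onto q_0, ..., q_{j-1}.
        Vec y = x[j];
        for (std::size_t i = 0; i < j; ++i) {
            const double rij = dot(q[i], x[j]);
            for (std::size_t k = 0; k < y.size(); ++k) y[k] -= rij * q[i][k];
        }
        // Normalize y_j, or stop if its norm vanishes.
        const double rjj = norm2(y);
        if (rjj == 0.0) return false;
        for (double& yk : y) yk /= rjj;
        q.push_back(y);
    }
    return true;
}
```

For j = 0 the inner loop is empty, so the first pass reduces to the initialization step q₀ = x₀/||x₀||₂. A caller would fill a std::vector<Vec> with the x_i, call classicalGramSchmidt(x, q), and check the returned flag before using q.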

Example: Let us assume that we want to orthonormalize the vectors

x₀ =

thus, following the above algorithm, we obtain

• r₀₀ = (1² + 0² + 2²)^1/2 = 2.2361 and q₀ = (1/2.2361) x₀

Notice that we can write



The algorithm above is presented using common mathematical abstractions, such as vectors and matrices. The beauty of C++ is that these mathematical abstractions can be implemented as “user-defined” data types, in particular for this case, as classes. We now present the implementation of the above algorithm in C++, utilizing some user-defined classes that we have already created. An explanation of the class syntax of this function will be given later in section 3.1.8, along with details on how to create your own user-defined data types.

Software Suite

Gram-Schmidt Code

In the coding example below, we are using the SCVector class that we previously defined (section 2.1.1). Because we have defined the SCVector class to have the mathematical properties we would expect, we can translate the algorithm given above directly into code. Admittedly, some of the C++ syntax in the function below goes beyond what you have been taught thus far; the details of the class implementation of this code will be given later in this book (see section 3.1.8). What you should notice, however, is that classes such as SCVector allow you to model more closely the mathematical definitions used in the algorithmic description of the solution of the problem.

SCstatus GramSchmidt(SCVector * x, SCVector * q){
  int i,j;
  int dim = x[0].Dimension();
  SCVector y(dim);
  SCMatrix r(dim);

  r(0,0) = x[0].Norm_l2();
  if(r(0,0)==0.0)
    return(FAIL);
  else
    q[0] = x[0]/r(0,0);

  for(j=1;j<dim;j++){            // corresponds to Begin Loop
    for(i=0;i<=j-1;i++)
      r(i,j) = dot(q[i],x[j]);   // corresponds to 1

    y = x[j];
    for(i=0;i<=j-1;i++)
      y = y - r(i,j)*q[i];       // corresponds to 2

    r(j,j) = y.Norm_l2();        // corresponds to 3
    if(r(j,j) == 0.0)
      return(FAIL);
    else
      q[j] = y/r(j,j);           // corresponds to 4
  }

  return(SUCCESS);
}

Observe in the code above that we allocate within this function an SCMatrix r, which we use throughout the function and which is discarded when the function returns to its calling function. We may want to retain r, however. In this case, we can create a function which has the same name as the previous function but contains an additional variable in the argument list. The name and the argument list together are used to distinguish which function we are referring to when we call the function (this concept will be discussed further in section 4.1.4).
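The idea of two functions sharing one name can be seen in isolation in the following minimal sketch. It is not taken from the software suite: the function normalize and its argument lists are hypothetical and chosen only to mirror the pattern described above, where the longer argument list lets the caller keep an intermediate quantity.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Three-argument version: writes the unit vector to q and the norm to r.
// Returns false (leaving q untouched) if x has zero norm.
bool normalize(const std::vector<double>& x, std::vector<double>& q, double& r) {
    double sum = 0.0;
    for (double xi : x) sum += xi * xi;
    r = std::sqrt(sum);
    if (r == 0.0) return false;
    q.resize(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) q[i] = x[i] / r;
    return true;
}

// Two-argument overload: same name, shorter argument list. The norm lives in
// a local variable and is discarded when the function returns.
bool normalize(const std::vector<double>& x, std::vector<double>& q) {
    double r = 0.0;
    return normalize(x, q, r);
}
```

The compiler selects the version to call from the number and types of the arguments, so normalize(x, q) and normalize(x, q, r) can coexist. Writing the shorter version as a thin wrapper around the longer one keeps the numerical work in a single place.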

Software Suite

In the function below, we pass into the function GramSchmidt an SCMatrix r, which it populates over the course of the computation.

SCstatus GramSchmidt(SCVector * x, SCVector * q, SCMatrix &r){
  int i,j;
  int dim = x[0].Dimension();
  SCVector y(dim);

  r(0,0) = x[0].Norm_l2();
  if(r(0,0)==0.0)
    return(FAIL);
  else
    q[0] = x[0]/r(0,0);

  for(j=1;j<dim;j++){            // corresponds to Begin Loop
    for(i=0;i<=j-1;i++)
      r(i,j) = dot(q[i],x[j]);   // corresponds to 1

    y = x[j];
    for(i=0;i<=j-1;i++)
      y = y - r(i,j)*q[i];       // corresponds to 2

    r(j,j) = y.Norm_l2();        // corresponds to 3
    if(r(j,j) == 0.0)
      return(FAIL);
    else
      q[j] = y/r(j,j);           // corresponds to 4
  }

  return(SUCCESS);
}

Key Concept

• Classes can help you more closely mimic the natural data structures of the problem. We are not confined to working with only the low-level concepts of integers, floats, and characters.
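To make this concept concrete, the sketch below shows how a small class with overloaded operators lets a projection step be written almost exactly as it appears in the mathematics. The class MiniVector is hypothetical and deliberately minimal; it is not the SCVector class of the software suite, although it borrows the method names Dimension and Norm_l2 used in the listings above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal vector class (an assumption for illustration only).
class MiniVector {
public:
    explicit MiniVector(std::size_t dim) : data_(dim, 0.0) {}
    std::size_t Dimension() const { return data_.size(); }
    double& operator()(std::size_t i) { return data_[i]; }
    double operator()(std::size_t i) const { return data_[i]; }
    double Norm_l2() const {
        double s = 0.0;
        for (double v : data_) s += v * v;
        return std::sqrt(s);
    }
private:
    std::vector<double> data_;
};

// Elementwise difference a - b.
MiniVector operator-(const MiniVector& a, const MiniVector& b) {
    assert(a.Dimension() == b.Dimension());
    MiniVector c(a.Dimension());
    for (std::size_t i = 0; i < a.Dimension(); ++i) c(i) = a(i) - b(i);
    return c;
}

// Scalar times vector.
MiniVector operator*(double s, const MiniVector& a) {
    MiniVector c(a.Dimension());
    for (std::size_t i = 0; i < a.Dimension(); ++i) c(i) = s * a(i);
    return c;
}

// Vector divided by scalar.
MiniVector operator/(const MiniVector& a, double s) {
    MiniVector c(a.Dimension());
    for (std::size_t i = 0; i < a.Dimension(); ++i) c(i) = a(i) / s;
    return c;
}

// Inner product a^T b.
double dot(const MiniVector& a, const MiniVector& b) {
    assert(a.Dimension() == b.Dimension());
    double s = 0.0;
    for (std::size_t i = 0; i < a.Dimension(); ++i) s += a(i) * b(i);
    return s;
}

// Removes from x its projection onto the unit vector q: x - (q^T x) q.
MiniVector removeProjection(const MiniVector& x, const MiniVector& q) {
    return x - dot(q, x) * q;
}
```

The body of removeProjection reads like the formula it implements, which is the point of the concept above: the loops over components are written once, inside the operators, rather than at every place a vector expression is needed.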

QR Factorization and Code

Another important point which we will often use in this book is a special matrix factorization. In particular, if the vectors x_i, i = 0, . . . , n−1 form the columns of a matrix X of size m×n, the vectors q_i, i = 0, . . . , n−1 form the columns of a matrix Q, and r_ij are the entries of a square n×n matrix R (which turns out to be upper triangular), then the following equation is valid:

X = QR

which is known as the QR decomposition (or factorization) of the matrix X, and it has important implications in obtaining eigenvalues and solutions of linear systems.
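The relation X = QR can be checked numerically. The following is a minimal sketch, assuming that a matrix is stored as a list of columns in std::vector; the names Columns, QRFactors, qrByGramSchmidt, and reconstructionError are hypothetical and serve only this illustration. It applies the classical Gram-Schmidt steps to the columns of X, stores the coefficients r_ij, and measures the largest entry of |X − QR|.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

using Column = std::vector<double>;
using Columns = std::vector<Column>;  // a matrix stored column by column

struct QRFactors {
    Columns Q;                           // Q[i] is the column q_i
    std::vector<std::vector<double>> R;  // R[i][j] is the entry r_ij
    bool ok = true;                      // false if a zero norm was hit
};

// Classical Gram-Schmidt on the columns of X, keeping the coefficients r_ij.
QRFactors qrByGramSchmidt(const Columns& X) {
    const std::size_t n = X.size();
    QRFactors f;
    f.R.assign(n, std::vector<double>(n, 0.0));
    for (std::size_t j = 0; j < n; ++j) {
        Column y = X[j];
        for (std::size_t i = 0; i < j; ++i) {
            double rij = 0.0;
            for (std::size_t k = 0; k < y.size(); ++k) rij += f.Q[i][k] * X[j][k];
            f.R[i][j] = rij;  // only entries with i <= j are ever written
            for (std::size_t k = 0; k < y.size(); ++k) y[k] -= rij * f.Q[i][k];
        }
        double sum = 0.0;
        for (double yk : y) sum += yk * yk;
        f.R[j][j] = std::sqrt(sum);
        if (f.R[j][j] == 0.0) { f.ok = false; return f; }
        for (double& yk : y) yk /= f.R[j][j];
        f.Q.push_back(y);
    }
    return f;
}

// Largest entry of |X - Q R|; assumes f.ok is true.
double reconstructionError(const Columns& X, const QRFactors& f) {
    double err = 0.0;
    for (std::size_t j = 0; j < X.size(); ++j)
        for (std::size_t k = 0; k < X[j].size(); ++k) {
            double s = 0.0;  // entry k of column j of Q R
            for (std::size_t i = 0; i <= j; ++i) s += f.Q[i][k] * f.R[i][j];
            err = std::max(err, std::fabs(X[j][k] - s));
        }
    return err;
}
```

Because r_ij is only assigned for i ≤ j, the entries below the diagonal of R keep their initial value of zero, which is the upper triangular structure mentioned above.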

Software Suite

We now present a C++ function which accomplishes the QR decomposition of a matrix.

Just as was stated above, we input a matrix X to be decomposed into the matrices Q and R. We begin by creating two arrays of vectors q and v, which will serve as input to our original Gram-Schmidt routine. As you will see, this routine contains only two basic components:

1. A data management component, which is going from matrices to a collection of vectors and back, and

2. A call to the Gram-Schmidt routine that we wrote previously (and now you understand why we may have wanted to be able to retrieve the value of the SCMatrix r).

This routine demonstrates one important issue in scientific computing, i.e., the compromise between computational time and programmer’s time. In this case, one may argue that if we were to write a routine specifically for QR decomposition, then we could reduce some of the cost of the data management section, and thus have a “more optimal code”. However, this consideration must be balanced by considering how much computational time is used for data manipulation versus the time to properly write and debug an entirely new function. In this particular case, in theory, we have already written and tested our GramSchmidt(v,q,R) function, and hence we are confident that if we give the Gram-Schmidt function proper inputs, it will return the correct solution. Hence, we can focus our programming and debugging on the extension of the concept, rather than on the details of optimization. Optimization is certainly important if we were to be calling this routine many times in a particular simulation; however, optimization-savvy individuals, as the old saying goes, often miss the forest for the trees!

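#include <vector>

SCstatus QRDecomposition(SCMatrix X, SCMatrix &Q, SCMatrix &R){
  int i,j;
  // The vectors to orthonormalize are the columns of X.
  int num_vecs = X.Columns();
  int dim = X.Rows();
  SCstatus scflag;

  // One SCVector of length dim per column of X. This assumes SCVector is
  // copyable; std::vector releases the storage when the function returns.
  std::vector<SCVector> q(num_vecs, SCVector(dim)),
                        v(num_vecs, SCVector(dim));

  // Data management: copy the columns of X into the vectors v[i].
  for(i=0;i<num_vecs;i++){
    for(j=0;j<dim;j++)
      v[i](j) = X(j,i);
  }

  // GramSchmidt as written above processes x[0].Dimension() vectors,
  // so this call assumes num_vecs == dim (a square X).
  scflag = GramSchmidt(v.data(),q.data(),R);

  // Data management: copy the orthonormal vectors into the columns of Q.
  for(i=0;i<num_vecs;i++)
    for(j=0;j<dim;j++)
      Q(j,i) = q[i](j);

  return scflag;
}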

Modified Gram-Schmidt Algorithm and Code

Notice that the Gram-Schmidt method breaks down at the k-th stage if x_k is linearly dependent on the previous vectors x_j, j = 0, . . . , k−1, because then y_k = 0 and hence r_kk = ||y_k||₂ = 0. It has also been observed that, in practice, even if there are no actual linear dependencies, orthogonality may be lost because of finite-precision arithmetic and round-off problems, as discussed earlier. To this end, a modified Gram-Schmidt process has been proposed, which is almost always used in computations. Specifically, an intermediate result is obtained,

$$y_j^0 = x_j - (q_0^T x_j)\, q_0,$$

which we project onto q₁ (instead of the original x_j), as follows:

$$y_j^1 = y_j^0 - (q_1^T y_j^0)\, q_1,$$

and so on. This process then involves successive one-dimensional projections. In the following, we present a row-oriented version of the modified Gram-Schmidt algorithm.

• Initialize: Set q_i = x_i, i = 0, . . . , n−1.

• Begin Loop: For i = 0, . . . , n−1 Do:

    r_ii = ||q_i||₂

    q_i = q_i/r_ii

    For j = i+1, . . . , n−1 Do:

        r_ij = q^T_i q_j

        q_j = q_j − r_ij q_i

    End Loop

• End Loop.
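The row-oriented algorithm above can be sketched directly with std::vector. This is a minimal illustration under stated assumptions, not the implementation from the software suite: the names dotProduct and modifiedGramSchmidtRows are hypothetical, the vectors q_i are passed in holding the x_i and are overwritten in place, and a zero norm is reported by returning false.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Inner product of two vectors assumed to have equal length.
double dotProduct(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) s += a[k] * b[k];
    return s;
}

// Row-oriented modified Gram-Schmidt.
// On entry q[i] holds x_i; on successful exit q[i] holds the orthonormal q_i
// and r[i][j] holds r_ij (row i of R is completed during pass i).
bool modifiedGramSchmidtRows(std::vector<std::vector<double>>& q,
                             std::vector<std::vector<double>>& r) {
    const std::size_t n = q.size();
    r.assign(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        // Normalize the current vector.
        r[i][i] = std::sqrt(dotProduct(q[i], q[i]));
        if (r[i][i] == 0.0) return false;
        for (double& v : q[i]) v /= r[i][i];
        // Immediately remove the q_i component from every remaining vector.
        for (std::size_t j = i + 1; j < n; ++j) {
            r[i][j] = dotProduct(q[i], q[j]);
            for (std::size_t k = 0; k < q[j].size(); ++k)
                q[j][k] -= r[i][j] * q[i][k];
        }
    }
    return true;
}
```

Each coefficient r_ij is computed from the current, partially orthogonalized q_j rather than from the original x_j, which is what distinguishes this variant from the classical process.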

Software Suite

We present a C++ implementation of the modified Gram-Schmidt algorithm below. With the exception of the commented block of code, the remaining code is identical to the original code provided above.

SCstatus ModifiedGramSchmidt(SCVector * x, SCVector * q, SCMatrix &r){
  int i,j;
  int dim = x[0].Dimension();
  SCVector y(dim);

  r(0,0) = x[0].Norm_l2();
  if(r(0,0)==0)
    return(FAIL);
  else
    q[0] = x[0]/r(0,0);

  for(j=1;j<dim;j++){
    /*******************************************************/
    /* We replace the following block of lines from the    */
    /* original Gram-Schmidt algorithm presented above,    */
    /*                                                     */
    /*   for(i=0;i<=j-1;i++)                               */
    /*     r(i,j) = dot(q[i],x[j]);                        */
    /*                                                     */
    /*   y = x[j];                                         */
    /*   for(i=0;i<=j-1;i++)                               */
    /*     y = y - r(i,j)*q[i];                            */
    /*                                                     */
    /* with the modification described above. The          */
    /* following lines implement that modification.        */
    /*******************************************************/
    y = x[j];
    for(i=0;i<=j-1;i++){
      r(i,j) = dot(q[i],y);
      y = y - r(i,j)*q[i];
    }
    /*******************************************************/
    /* End of Modification                                 */
    /*******************************************************/

    r(j,j) = y.Norm_l2();
    if(r(j,j) == 0)
      return(FAIL);
    else
      q[j] = y/r(j,j);
  }

  return(SUCCESS);
}

Remark 1: The computational complexity of the Gram-Schmidt process is O(mn²) irrespective of which version is used. This is evident from the comment block inserted into the modified Gram-Schmidt code: if you carefully compare the deleted code with the newly inserted code, you will see that the number of operations performed is identical. It is often the case in scientific computing that although two algorithms may be identical mathematically (i.e., in infinite precision), one algorithm is inherently better than the other when implemented numerically. Furthermore, in this case, we see that we achieve an additional benefit from the modified algorithm at no additional cost.
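The difference between the two versions in finite precision can be observed with a small experiment. The sketch below is an illustration under stated assumptions, not code from the software suite: the names gramSchmidt, orthogonalityLoss, and nearlyDependentVectors are hypothetical, IEEE double-precision arithmetic is assumed, and the test vectors are a nearly dependent set of the kind often attributed to Läuchli. The two variants differ in a single expression, namely which vector is used to compute the coefficient r_ij.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Inner product of two vectors assumed to have equal length.
double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) s += a[k] * b[k];
    return s;
}

// Gram-Schmidt on x. If modified is true, r_ij is computed from the partially
// orthogonalized y; otherwise from the original x[j] (classical version).
// Assumes the vectors are linearly independent (no zero-norm check).
std::vector<Vec> gramSchmidt(const std::vector<Vec>& x, bool modified) {
    std::vector<Vec> q;
    for (std::size_t j = 0; j < x.size(); ++j) {
        Vec y = x[j];
        for (std::size_t i = 0; i < j; ++i) {
            const double rij = dot(q[i], modified ? y : x[j]);
            for (std::size_t k = 0; k < y.size(); ++k) y[k] -= rij * q[i][k];
        }
        const double rjj = std::sqrt(dot(y, y));
        for (double& yk : y) yk /= rjj;
        q.push_back(y);
    }
    return q;
}

// Largest |q_i^T q_j| over i != j; zero for perfectly orthogonal vectors.
double orthogonalityLoss(const std::vector<Vec>& q) {
    double loss = 0.0;
    for (std::size_t i = 0; i < q.size(); ++i)
        for (std::size_t j = i + 1; j < q.size(); ++j)
            loss = std::max(loss, std::fabs(dot(q[i], q[j])));
    return loss;
}

// Three nearly dependent vectors of length four, controlled by eps.
std::vector<Vec> nearlyDependentVectors(double eps) {
    return {{1.0, eps, 0.0, 0.0}, {1.0, 0.0, eps, 0.0}, {1.0, 0.0, 0.0, eps}};
}
```

With eps = 1e-8, so that eps² falls below the double-precision round-off level relative to 1, we would expect orthogonalityLoss(gramSchmidt(x, false)) to be about 0.5, i.e., two of the computed vectors are far from orthogonal, while orthogonalityLoss(gramSchmidt(x, true)) stays of the order of eps. Both calls perform the same number of floating-point operations.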

Remark 2: The loss of orthogonality of Q in the modified Gram-Schmidt method depends on the condition number κ(A) of the matrix A obtained by using the specified vectors as columns [8]. In general, the orthogonality of Q can be completely lost with the classical Gram-Schmidt method. With the modified Gram-Schmidt method the orthogonality property may not be lost entirely, but the result may still not be acceptable when the matrix A is ill-conditioned. A better approach is to employ the Householder method discussed in section 9.3, which is more accurate and also computationally less expensive. For example, the cost for Gram-Schmidt is O(mn²), while for the Householder method it is O(mn²−n³/3).
