Given a basis $B = \{b_1, b_2, \ldots, b_n\}$ that spans a subspace $V$, we want to find an orthogonal basis $U = \{u_1, u_2, \ldots, u_n\}$ that also spans $V$.
Base case: We let $u_1 = b_1$. If $V$ is a one-dimensional subspace, then we are done.
Iterative case: We take each vector from $B$ one by one and project it onto the existing orthogonal basis vectors. For $b_2$, since we only have $u_1 = b_1$ in our orthogonal basis, we only do one projection:
$$\mathrm{proj}_U b_2 = \frac{b_2 \cdot u_1}{u_1 \cdot u_1} u_1$$
We know that the error vector will be orthogonal to our existing orthogonal basis, so we subtract the projection from the $b$ vector:
$$u_2 = b_2 - \mathrm{proj}_U b_2 = b_2 - \frac{b_2 \cdot u_1}{u_1 \cdot u_1} u_1$$
For each successive $b$ vector, we repeat this algorithm. Therefore,
$$u_3 = b_3 - \mathrm{proj}_U b_3 = b_3 - \frac{b_3 \cdot u_1}{u_1 \cdot u_1} u_1 - \frac{b_3 \cdot u_2}{u_2 \cdot u_2} u_2$$
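To make the iteration concrete, here is a minimal sketch of the procedure in Python/NumPy. The function name gram_schmidt and the column-matrix conventions are ours, not part of the notes; it assumes the input vectors are linearly independent (see Note 1 below).

```python
import numpy as np

def gram_schmidt(B):
    """Columns of B are the input vectors b_1, ..., b_n.
    Returns a matrix whose columns u_1, ..., u_n are orthogonal
    and span the same subspace. Assumes linearly independent input
    (otherwise some u_j is the zero vector and the division fails)."""
    B = np.asarray(B, dtype=float)
    U = np.zeros_like(B)
    for i in range(B.shape[1]):
        u = B[:, i].copy()
        # Subtract the projection of b_i onto each earlier u_j.
        for j in range(i):
            uj = U[:, j]
            u -= (B[:, i] @ uj) / (uj @ uj) * uj
        U[:, i] = u
    return U
```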
Notes:
1. Gram-Schmidt turns a set of vectors into an orthogonal basis. Since basis vectors must be linearly independent, Gram-Schmidt will produce the zero vector for any input vector that is linearly dependent on the vectors before it. Intuitively, since a linearly dependent vector already lies in the subspace spanned so far, it equals its own projection, so the error vector is $0$.
2. Order matters. You will get a different basis depending on the order in which you feed the $b$ vectors to Gram-Schmidt.
Example
Find an orthonormal basis for $B = \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \end{bmatrix} \right\}$.
Let $U = \{u_1, u_2\}$ be the orthogonal basis. Then, $u_1 = b_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. Next,
$$u_2 = b_2 - \frac{b_2 \cdot u_1}{u_1 \cdot u_1} u_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix} - \frac{3}{2} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} \\ -\tfrac{1}{2} \end{bmatrix}$$
However, the question asks for an orthonormal basis, so we just normalize each of these vectors to get our orthonormal basis:
$$\left\{ \begin{bmatrix} \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{bmatrix}, \begin{bmatrix} \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} \end{bmatrix} \right\}$$
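As a sanity check, running the gram_schmidt sketch from above on this example and normalizing the columns reproduces the orthonormal basis (the variable names are ours):

```python
import numpy as np

B = np.array([[1.0, 2.0],
              [1.0, 1.0]])          # columns are b_1 and b_2
U = gram_schmidt(B)                 # columns [1, 1] and [1/2, -1/2]
Q = U / np.linalg.norm(U, axis=0)   # normalize each column
print(Q)                            # approx [[0.7071,  0.7071],
                                    #         [0.7071, -0.7071]]
```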
System of Equations in Orthogonal Bases
Assume that $U = \{u_1, u_2, \ldots, u_n\}$ is an orthogonal basis for the vector space $V$. For another vector $y \in V$, we want to express $y$ as a linear combination of the orthogonal basis vectors and solve $y = c_1 u_1 + c_2 u_2 + \cdots + c_n u_n$.
If we dot both sides with $u_i$, $1 \le i \le n$, then we get
$$y \cdot u_i = c_1 u_1 \cdot u_i + c_2 u_2 \cdot u_i + \cdots + c_n u_n \cdot u_i.$$
However, remember that since $U$ is an orthogonal basis, $u_j \cdot u_k = 0$ if $j \neq k$. Therefore, the above equation becomes
$$y \cdot u_i = c_i u_i \cdot u_i$$
Solving for the coefficient $c_i$,
$$c_i = \frac{y \cdot u_i}{u_i \cdot u_i}$$
Therefore, in the case of an orthogonal basis, we only need to find the projection of the vector $y$ onto each basis vector; there is no system of equations to solve.
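A short NumPy sketch of this shortcut (the matrix U and vector y are made-up example data; the columns of U are assumed to form an orthogonal, not necessarily orthonormal, basis):

```python
import numpy as np

# Columns of U form an orthogonal (not necessarily orthonormal) basis.
U = np.array([[1.0,  0.5],
              [1.0, -0.5]])
y = np.array([3.0, 1.0])

# c_i = (y . u_i) / (u_i . u_i) for each column u_i -- no linear solve needed.
c = (y @ U) / np.sum(U * U, axis=0)
print(c)       # coefficients of y in the basis: [2. 2.]
print(U @ c)   # reconstructs y: [3. 1.]
```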
QR Factorization
What and why
QR Factorization is essentially Gram-Schmidt in matrix form: we use Gram-Schmidt to find an orthonormal basis spanning the column space of a matrix $A$ and re-write the process as a matrix product. QR Factorization produces an orthogonal matrix $Q$ (remember that an orthogonal matrix has orthonormal columns) and an upper triangular matrix $R$, where $A = QR$.
Calculation
The most straightforward way to calculate $Q$ and $R$ is to first calculate $Q$ using Gram-Schmidt and then normalize the result. To find $R$, we can use the property that $Q$ is orthogonal, i.e. $Q^{-1} = Q^T$:
$$A = QR$$
$$Q^T A = Q^T Q R$$
$$R = Q^T A$$
The other way (which will not be discussed here) is to take the Gram-Schmidt equations and re-write them in matrix form. When we then normalize the columns of $Q$ (i.e. divide each by its length), we multiply the corresponding row of $R$ by the length of that $q$ vector.
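A minimal sketch of the straightforward method, reusing the gram_schmidt function from earlier (again assuming the columns of A are linearly independent):

```python
import numpy as np

def qr_via_gram_schmidt(A):
    """Sketch: compute Q by Gram-Schmidt plus normalization,
    then recover R as Q^T A. Assumes linearly independent columns."""
    U = gram_schmidt(A)                  # orthogonal columns
    Q = U / np.linalg.norm(U, axis=0)    # orthonormal columns
    R = Q.T @ A                          # upper triangular (up to round-off)
    return Q, R
```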
Example
Using the example from above, where $A = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix}$, we know that $Q = \begin{bmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \end{bmatrix}$. Then,
$$R = Q^T A = \begin{bmatrix} \sqrt{2} & \frac{3\sqrt{2}}{2} \\ 0 & \frac{\sqrt{2}}{2} \end{bmatrix}$$
Note that $R$ is upper triangular. Check for yourself that $A = QR$.
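Numerically, the sketch above reproduces this factorization:

```python
A = np.array([[1.0, 2.0],
              [1.0, 1.0]])
Q, R = qr_via_gram_schmidt(A)
print(np.allclose(A, Q @ R))   # True
print(R)                       # approx [[1.4142, 2.1213],
                               #         [0.    , 0.7071]]
```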
Least Squares using QR Factorization
QR Factorization is helpful when solving the least squares equation $A^T A \hat{x} = A^T b$ because if we can factorize $A = QR$, we can take advantage of the property that $Q$ has orthonormal columns ($Q^T Q = I$) and that $R$ is invertible (and easy to invert because it is upper triangular). Substituting $A = QR$ into the normal equations gives $R^T Q^T Q R \hat{x} = R^T Q^T b$, i.e. $R^T R \hat{x} = R^T Q^T b$; since $R^T$ is invertible, this reduces to
$$R \hat{x} = Q^T b,$$
which can be solved quickly by back substitution.
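A sketch of this in NumPy/SciPy, using NumPy's built-in QR (computed by Householder reflections rather than Gram-Schmidt, but the identity $R\hat{x} = Q^T b$ holds either way) and SciPy's triangular solver for the back substitution; the function name and example data are ours:

```python
import numpy as np
from scipy.linalg import solve_triangular

def least_squares_qr(A, b):
    """Sketch: solve the least squares problem min ||Ax - b||
    via A = QR, reducing the normal equations to R x = Q^T b.
    Assumes A has linearly independent columns."""
    Q, R = np.linalg.qr(A)               # reduced QR: Q^T Q = I, R upper triangular
    return solve_triangular(R, Q.T @ b)  # back substitution on R

# Example: overdetermined system, best-fit solution.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])
x_hat = least_squares_qr(A, b)
print(x_hat)                             # approx [1.1667, 0.5]
```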