Projection methods in linear algebra numerics

Linear algebra classes often jump straight to the definition of a projector (as a matrix) when discussing orthogonal projections in linear spaces. As often happens, it is not made clear how that definition arises. That is what this post covers.

Orthogonal projection: how to build a projector

Case 1 – 2D projection over (1,0)

It is quite straightforward to see that orthogonal projection over (1,0) can be practically achieved by zeroing out the second component of any 2D vector, at least if the vector is expressed with respect to the canonical basis \{ e_1, e_2 \}. Idiotic as it may sound, it is worth restating: the orthogonal projection of a 2D vector over (1,0) amounts to its first component alone.

How can this be put math-wise? Since we know that the dot product evaluates the similarity between two vectors, we can use it to extract the first component of a vector v. Once we have the magnitude of the first component, we only need to multiply it by e_1 itself, to know how far in the direction of e_1 we need to go. For example, starting from v = (5,6), first we get the first component as v \cdot e_1 = (5,6) \cdot (1,0) = 5; then we multiply this value by e_1 itself: 5e_1 = (5,0). This is in fact the orthogonal projection of the original vector. Writing down the operations we did in sequence, treating vectors as rows (so that transposing turns them into columns), we get

    \[e_1^T (e_1 v^T) = \begin{bmatrix} 1 \\ 0 \end{bmatrix} ([1, 0] \begin{bmatrix} 5 \\ 6 \end{bmatrix}) .\]
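In NumPy this amounts to a couple of lines; a minimal sketch of the computation above, with variable names of my own choosing:

    import numpy as np

    e1 = np.array([1.0, 0.0])
    v = np.array([5.0, 6.0])

    # Similarity of v with e_1 (its first component), then scale e_1 by it.
    component = e1 @ v            # 5.0
    projection = component * e1   # array([5., 0.])
    print(projection)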

One simple and yet useful fact is that when we project a vector, its norm must not increase. This should be intuitive: the projection process either takes information away from a vector (as in the case above), or rephrases what is already there. In any case, it certainly does not add any. We may rephrase our opening fact as the following proposition:

PROP 1: ||v|| \geq ||Projection(v)||.

This can easily be seen through the Pythagorean theorem (and in fact it only holds for orthogonal projections, not oblique ones):

    \[||v||^2 = ||proj_u(v)||^2 + ||v - proj_u(v)||^2 \geq ||proj_u(v)||^2\]
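A quick numerical sanity check of this decomposition, reusing the vectors from the example above:

    import numpy as np

    u = np.array([1.0, 0.0])      # unit vector we project onto
    v = np.array([5.0, 6.0])
    proj = (u @ v) * u            # orthogonal projection of v onto u

    # ||v||^2 splits into the projection and the orthogonal remainder,
    # so the projection can never be longer than v itself.
    lhs = np.linalg.norm(v) ** 2
    rhs = np.linalg.norm(proj) ** 2 + np.linalg.norm(v - proj) ** 2
    print(np.isclose(lhs, rhs))                        # True
    print(np.linalg.norm(proj) <= np.linalg.norm(v))   # True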

Case 2 – 2D projection over (1,1)

Attempting to apply the same technique to an arbitrary projection target, however, does not seem to work. Suppose we want to project over (1,1). Repeating what we did above for a test vector [3,0], we would get

    \[\begin{bmatrix} 1 \\ 1 \end{bmatrix} ([3, 0] \begin{bmatrix} 1 \\ 1 \end{bmatrix}) =  [3,3].\]

This violates the previously established fact that the norm of the projection should be \leq the original norm, so it must be wrong. In fact, visual inspection reveals that the correct orthogonal projection of [3,0] is [\frac{3}{2}, \frac{3}{2}].

The caveat here is that the vector onto which we project must have norm 1. This is vital whenever we care about the direction of something but not its magnitude, as in this case. Normalizing [1,1] yields [\frac{1}{\sqrt 2}, \frac{1}{\sqrt 2}]. Projecting [3,0] over [\frac{1}{\sqrt 2}, \frac{1}{\sqrt 2}] is then obtained through

    \[\begin{bmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{bmatrix} ([3, 0] \begin{bmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{bmatrix}) =  [\frac{3}{2}, \frac{3}{2}],\]

which now is indeed correct!

PROP 2: The vector on which we project must be a unit vector (i.e. a norm 1 vector).
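Both attempts, side by side, as a small NumPy sketch (again, the names are mine):

    import numpy as np

    v = np.array([3.0, 0.0])
    target = np.array([1.0, 1.0])

    # Naive projection without normalizing: the norm grows, so it cannot be right.
    wrong = (target @ v) * target              # array([3., 3.])

    # Normalize the target first, then project.
    u = target / np.linalg.norm(target)        # [1/sqrt(2), 1/sqrt(2)]
    right = (u @ v) * u                        # array([1.5, 1.5])

    print(wrong, np.linalg.norm(wrong) > np.linalg.norm(v))   # [3. 3.] True
    print(right)                                              # [1.5 1.5]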

Case 3 – 3D projection on a plane

A good thing to think about is what happens when we want to project on more than one vector. For example, what happens if we project a point in 3D space onto a plane? The idea is pretty much the same, and the technicalities amount to stacking into a matrix the vectors that span the plane onto which we project.

Suppose we want to project the vector v = [5,7,9] onto the plane spanned by \{ [1,0,0], [0,1,0] \}. The steps are the same: we still need to know how similar v is to each of the two spanning vectors, and then to magnify those similarities in the respective directions.

    \[\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 5 \\ 7 \\ 9 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 5 \\ 7 \end{bmatrix} = 5 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 7 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 5 \\ 7 \\ 0 \end{bmatrix}\]

The only difference from the previous cases is that the vectors onto which we project are put together in matrix form, arranged so that the operations we end up performing are the same as in the single-vector cases.
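In code, the stacking is literally a matter of putting the spanning vectors into the rows of a matrix; a minimal sketch of the example above:

    import numpy as np

    # Rows of Z are the (orthonormal) vectors spanning the target plane.
    Z = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
    v = np.array([5.0, 7.0, 9.0])

    # Same two steps as before: similarities first (Z @ v), directions second (Z.T @ ...).
    projection = Z.T @ (Z @ v)
    print(projection)   # [5. 7. 0.]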

The rise of the projector

As we have seen, the projection of a vector v over the span of a set of orthonormal vectors, stacked as the rows of a matrix Z, is obtained as

    \[Projection_Z(v) = Z^T Z v^T .\]

Up to now, we have always performed the rightmost product Z v^T first, taking advantage of associativity. It should come as no surprise that we can also do it the other way around: first compute Z^T Z, and only afterwards multiply the result by v^T. This Z^T Z makes up the projection matrix. However, the idea is much more understandable when written in the expanded form, as it shows the process which leads to the projector.

THEOREM 1: The projection of v over an orthonormal basis stacked in a matrix Z is

    \[Projection_Z(v) = Z^T Z v^T = \underbrace{P}_{Projector} v^T .\]

So here it is: take any basis of whatever linear subspace, make it orthonormal, stack it into a matrix, multiply that matrix by its own transpose, and you get a matrix whose action is to drop any vector of the surrounding higher-dimensional space onto that subspace. Neat.
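For instance, a small sketch that builds the projector once and reuses it (same plane as in Case 3, plus the normalized direction from Case 2):

    import numpy as np

    # Plane from Case 3: P is 3x3 and zeroes out the third component.
    Z = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
    P = Z.T @ Z
    print(P @ np.array([5.0, 7.0, 9.0]))     # [5. 7. 0.]

    # Line from Case 2: the normalized (1,1) direction gives P = [[0.5, 0.5], [0.5, 0.5]].
    z = np.array([[1.0, 1.0]]) / np.sqrt(2.0)
    print((z.T @ z) @ np.array([3.0, 0.0]))  # [1.5 1.5]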

Projector matrix properties

  • The norm of the projected vector is less than or equal to the norm of the original vector.
  • A projection matrix is idempotent: once a vector has been projected, further projections do not change anything. This, in fact, is the only requirement that defines a projector, and it is the definition you find in textbooks: P^2 = P. The other fundamental property we asked for in the previous examples, i.e. that the projection basis is orthonormal, is what guarantees it: if the rows of Z are orthonormal then Z Z^T = I, so P^2 = Z^T (Z Z^T) Z = Z^T Z = P. Moreover, if the projection is orthogonal, as we have assumed up to now, then we must also have P = P^T; idempotence alone also allows oblique projections. These properties are checked numerically in the sketch after this list.
  • The eigenvalues of a projector can only be 1 and 0. For an eigenvalue \lambda with eigenvector v,

        \[\lambda v = Pv = P^2v = \lambda Pv = \lambda^2 v \Rightarrow \lambda = \lambda^2 \Rightarrow \lambda \in \{0,1\}\]

  • There exists a basis X of \mathbb{R}^N with respect to which P takes the block form \begin{bmatrix} I_k & 0 \\ 0 & 0_{N-k} \end{bmatrix}, with k being the rank of P. If we further decompose X = [X_1, X_2], with X_1 being N \times k and X_2 being N \times (N-k), the existence of such a basis shows that P sends \mathbb{R}^N onto Im(X_1) = Im(P), while the complementary directions spanned by X_2 are sent to zero, i.e. they make up Ker(P). It also shows that \mathbb{R}^N = Im(P) \oplus Ker(P).
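The idempotence, symmetry and eigenvalue properties above can be verified numerically; here is a small sketch using the projector onto the normalized (1,1) direction:

    import numpy as np

    # Projector onto the line spanned by (1, 1)/sqrt(2).
    z = np.array([[1.0, 1.0]]) / np.sqrt(2.0)   # one orthonormal row
    P = z.T @ z

    print(np.allclose(P @ P, P))                # True: idempotent, P^2 = P
    print(np.allclose(P, P.T))                  # True: symmetric, since the projection is orthogonal
    print(np.round(np.linalg.eigvals(P), 10))   # eigenvalues are 0 and 1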

Model Order Reduction

Is there any application of projection matrices to applied math? Indeed.

It is often the case (or, at least, the hope) that the solution to a differential problem lies in a low-dimensional subspace of the full solution space. If some \textbf{w}(t) \in \mathbb{R}^N is the solution to the Ordinary Differential Equation

    \begin{equation*} \frac{d\textbf{w}(t)}{dt} = \textbf{f}(\textbf{w}(t), t) \end{equation*}

then there is hope that there exists some subspace \mathcal{S} \subset \mathbb{R}^N, with dim(\mathcal{S}) < N, in which the solution lives. If that is the case, we may rewrite the solution as

    \[\textbf{w}(t) = \textbf{V}_\mathcal{S}\textbf{q}(t)\]

for some appropriate coefficients (q_i(t)), which are the components of \textbf{w}(t) over the basis \textbf{V}_\mathcal{S}.

Assuming that the basis \textbf{V} itself is time-invariant, and that in general \textbf{V}\textbf{q}(t) will be a good but not perfect approximation of the real solution, the original differential problem can be rewritten as:

    \begin{equation*} \begin{split} \frac{d}{dt}\big(\textbf{V}\textbf{q}(t)\big) &= \textbf{f}(\textbf{V}\textbf{q}(t), t) + \textbf{r}(t) \\ \textbf{V}\frac{d\textbf{q}(t)}{dt} &= \textbf{f}(\textbf{V}\textbf{q}(t), t) + \textbf{r}(t) \end{split} \end{equation*}

where \textbf{r}(t) is the residual, i.e. the error introduced by the approximation.
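To make the idea concrete, here is a toy sketch of the projection step alone, not of a full reduced-order solver; the sizes, the random construction of the basis, and the assumption that \textbf{V} has orthonormal columns are all made up for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    N, k = 1000, 3                          # full and reduced dimensions (made-up sizes)

    # Hypothetical orthonormal basis V of the low-dimensional subspace, one column per basis vector.
    V, _ = np.linalg.qr(rng.standard_normal((N, k)))

    # A full-order state that lies (almost) in span(V).
    q_true = np.array([2.0, -1.0, 0.5])
    w = V @ q_true + 1e-6 * rng.standard_normal(N)

    q = V.T @ w              # reduced coordinates: components of w over the basis V
    w_approx = V @ q         # = (V V^T) w, the orthogonal projection of w onto span(V)
    print(np.linalg.norm(w - w_approx))     # small: this gap plays the role of r(t)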
