Projection methods in linear algebra numerics

Linear algebra classes often jump straight to the definition of a projector (as a matrix) when talking about orthogonal projections in linear spaces. Unfortunately, it is rarely made clear how that definition arises. This is what is covered in this post.

Orthogonal projection: how to build a projector

Case 1 – 2D projection over (1,0)

It is quite straightforward to see that orthogonal projection onto (1,0) can be practically achieved by zeroing out the second component of any 2D vector, at least if the vector is expressed with respect to the canonical basis \{ e_1, e_2 \}. Obvious as it sounds, it is worth restating: the orthogonal projection of a 2D vector onto e_1 amounts to its first component alone.

How can this be written down mathematically? Since the dot product measures the similarity between two vectors, we can use it to extract the first component of a vector v. Once we have the magnitude of the first component, we only need to multiply it by e_1 itself, to know how far in the direction of e_1 we need to go. For example, starting from v = (5,6), first we get the first component as v \cdot e_1 = (5,6) \cdot (1,0) = 5; then we multiply this value by e_1 itself: 5e_1 = (5,0). This is in fact the orthogonal projection of the original vector. Writing down the operations we did in sequence, with proper transposing, we get

    \[e_1 (e_1^T v) = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \left( [1, 0] \begin{bmatrix} 5 \\ 6 \end{bmatrix} \right) = \begin{bmatrix} 5 \\ 0 \end{bmatrix} .\]
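The same computation is easy to reproduce in NumPy (a quick sketch of my own, not code from the post):

```python
# Project v = (5, 6) orthogonally onto e1 = (1, 0).
import numpy as np

e1 = np.array([1.0, 0.0])
v = np.array([5.0, 6.0])
proj = e1 * (e1 @ v)     # e1 (e1^T v): scale e1 by the dot product
print(proj)              # [5. 0.]

# Equivalently, bundle the two steps into the projector matrix P = e1 e1^T:
P = np.outer(e1, e1)
print(P @ v)             # [5. 0.]
```

Grouping the operations into `P = np.outer(e1, e1)` is exactly the "projector as a matrix" definition that classes usually start from.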

One simple and yet useful fact is that when we project a vector, its norm must not increase. This should be intuitive: the projection process either takes information away from a vector (as in the case above) or rephrases what is already there. Either way, it certainly does not add any. We may restate our opening fact as the following proposition:

PROP 1: ||v|| \geq ||Projection(v)||.

This can easily be seen through the Pythagorean theorem (and in fact it only holds for orthogonal projections, not oblique ones):

    \[||v||^2 = ||proj_u(v)||^2 + ||v - proj_u(v)||^2 \geq ||proj_u(v)||^2\]
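PROP 1 is also easy to sanity-check numerically. The following sketch (my own, using an arbitrary random unit direction u) verifies both the inequality and the Pythagorean identity behind it:

```python
# Check ||P v|| <= ||v|| for the orthogonal projector P = u u^T onto a unit u.
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(3)
u /= np.linalg.norm(u)            # normalize so that u u^T is a projector
P = np.outer(u, u)

for _ in range(100):
    v = rng.standard_normal(3)
    # PROP 1: projecting never increases the norm
    assert np.linalg.norm(P @ v) <= np.linalg.norm(v) + 1e-12
    # Pythagoras: ||v||^2 = ||proj(v)||^2 + ||v - proj(v)||^2
    assert np.isclose(np.linalg.norm(v)**2,
                      np.linalg.norm(P @ v)**2 + np.linalg.norm(v - P @ v)**2)
```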


Reproducing a transport instability in convection-diffusion equation

Drawing from the Larson-Bengzon FEM book, I wanted to experiment with transport instabilities. It looks like there might be an instability in my ocean-ice model, but before being able to address that, I wanted to wrap my head around the simplest 1D example one could find. And that is the convection-diffusion equation:

(1)   \begin{equation*} \begin{split} - \epsilon \Delta u + b \cdot \nabla u &= f  \ \ in \ \Omega \\ u &= 0 \ \ on \ \partial \Omega \end{split} \end{equation*}

The first term -\epsilon \Delta u is responsible for the smearing of the velocity field u, proportionally to \epsilon, and is thus called the diffusion term. Intuitively, it controls how much the neighbors of a given point x are influenced by the behavior of x; how much velocities (or temperatures: think of whatever you want!) diffuse in the domain.

The second term b \cdot \nabla u controls how much the velocities u are transported in the direction of the vector field b, and is thus called the convective term. A requirement for this problem to be well-posed is that \nabla \cdot b = 0 — otherwise it would mean that we allow velocities to vanish or to come into existence.

The instability is particularly likely to happen close to a boundary where a Dirichlet boundary condition is enforced. The problem is not localized only close to the boundary though, as fluctuations can travel throughout the whole domain and lead the whole solution astray.

Transport instability in 1D

The simplest convection-diffusion equation in 1D has the following form:

(2)   \begin{equation*} -\epsilon u_{xx} + u_x = 1 \ \ in \ (0,1), \ \ u(0) = u(1) = 0 \end{equation*}

whose solution, for small \epsilon, is approximately just u = x. This goes well with the boundary condition at 0, but not with the condition at 1, where the solution needs to drop to 0 quite abruptly to satisfy the boundary condition.
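For reference, this toy problem -\epsilon u_{xx} + u_x = 1, u(0) = u(1) = 0 can actually be solved in closed form (a quick derivation of my own, via the usual constant-coefficient ODE technique: particular solution x plus homogeneous solutions 1 and e^{x/\epsilon}):

    \[u(x) = x - \frac{e^{x/\epsilon} - 1}{e^{1/\epsilon} - 1} ,\]

which is indistinguishable from u = x until x gets within O(\epsilon) of 1, where the exponential term switches on and pulls the solution back down to 0: the boundary layer.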

It’s easy to simulate the scenario with FEniCS and get this result (with \epsilon = 0.01 and the unit interval divided into 10 nodes):


in which we can make out two different trends: one followed by the odd points and one by the even points! In fact, if we discretize the 1D equation with finite elements we obtain:

(3)   \begin{equation*} -\epsilon \frac{u_{i+1}-2u_i+u_{i-1}}{h^2} + \frac{u_{i+1}-u_{i-1}}{2h} = 1 \end{equation*}
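The oscillation can be reproduced without FEniCS at all. Here is a hypothetical sketch of my own (not the post's code) that assembles the centred scheme for eq. (2) with \epsilon = 0.01 and h = 0.1 and solves it directly:

```python
# Centred discretization of -eps*u'' + u' = 1 on (0,1), u(0) = u(1) = 0.
import numpy as np

eps, n = 0.01, 10
h = 1.0 / n
# Interior unknowns u_1 .. u_{n-1}; boundary values are zero.
A = np.zeros((n - 1, n - 1))
b = np.ones(n - 1)
for i in range(n - 1):
    A[i, i] = 2 * eps / h**2                       # diagonal entry
    if i > 0:
        A[i, i - 1] = -eps / h**2 - 1 / (2 * h)    # lower band
    if i < n - 2:
        A[i, i + 1] = -eps / h**2 + 1 / (2 * h)    # upper band
u = np.concatenate(([0.0], np.linalg.solve(A, b), [0.0]))
print(u)   # non-monotone: odd and even nodes follow two different trends
```

With these numbers the mesh Peclet number h/(2\epsilon) = 5 is well above 1, which is exactly the regime where the centred scheme loses stability.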


How do Dirichlet and Neumann boundary conditions affect Finite Element Methods variational formulations?

To solve a classical second-order differential problem

    \begin{equation*} -(au')' + bu' + cu = f \ in \ \Omega \end{equation*}

with FEM, we first need to derive its weak formulation. This is achieved by multiplying the equation by a test function \phi and then integrating by parts to get rid of second order derivatives:

(1)   \begin{equation*} \begin{split} 0 &= \int_\Omega (-(a u')' + b u' + c u - f) \phi dx \\ &= \underbrace{\int_\Omega (a u' \phi' + b u' \phi + c u \phi) dx}_{a(u, \phi)} \underbrace{- \int_\Omega f \phi dx - (a u' \phi)|_{\partial\Omega}}_{L(\phi)} \end{split} \end{equation*}

A typical FEM problem then reads like:

    \begin{equation*} \begin{split} \text{Find } u \in H_0^1(\Omega) \ s.t. \ a(u, \phi) + L(\phi) = 0 \ \ \forall \phi \in H_0^1(\Omega), \\ \text{where } H_0^1(\Omega) = \{ v: \Omega \rightarrow \mathbb{R} : \int_\Omega v^2(x) + v'(x)^2 dx < \infty, \ v = 0 \ on \ \partial\Omega \}. \end{split} \end{equation*}

What is the difference, from a math perspective, between imposing Dirichlet boundary conditions (e.g. u(\partial \Omega) = k) and Neumann ones (u'(\partial \Omega) = k(x))? Dirichlet conditions go into the definition of the space H_0^1, while Neumann conditions do not touch the space: they enter the variational formulation directly, through the boundary term (a u' \phi)|_{\partial\Omega}.
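The distinction shows up very concretely in code. Below is a minimal 1D sketch of my own (not from the post), for the hypothetical toy problem -u'' = 1 on (0,1) with u(0) = 0 (Dirichlet) and u'(1) = 0 (Neumann), assembled with P1 hat functions:

```python
# 1D P1 finite elements for -u'' = 1, u(0) = 0 (Dirichlet), u'(1) = 0 (Neumann).
import numpy as np

n = 8                                   # number of elements (arbitrary choice)
h = 1.0 / n
A = np.zeros((n + 1, n + 1))
b = np.zeros(n + 1)
for e in range(n):                      # assemble stiffness and load, f = 1
    A[e:e+2, e:e+2] += (1 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    b[e:e+2] += h / 2

# Neumann u'(1) = 0: nothing to do! The boundary term (a u' phi)|_1 simply
# vanishes; a nonzero flux g would just add g to b[-1].
# Dirichlet u(0) = 0: modifies the system, i.e. restricts the space:
A[0, :] = 0.0
A[0, 0] = 1.0
b[0] = 0.0

u = np.linalg.solve(A, b)
x = np.linspace(0.0, 1.0, n + 1)
print(np.abs(u - (x - x**2 / 2)).max())   # nodal values vs exact u = x - x^2/2
```

Note how the Neumann condition never touches the matrix: it lives entirely in L(\phi), while the Dirichlet condition is enforced on the rows of the system, mirroring its role in the definition of the space.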


What is the difference between Finite Differences and Finite Element Methods?

With Finite Differences, we discretize space (i.e. we put a grid on it) and we seek the values of the solution function at the mesh points. We still solve a discretized version of the differential problem.

With Finite Elements, we approximate the solution as a (finite) sum of functions defined on the discretized space. These functions make up a basis of the space, and the most commonly used are the hat functions. We end up with a linear system whose unknowns are the weights associated with each of the basis functions: i.e., how much does each basis function count for our particular solution to our particular problem?

Put bluntly, it is finding the values of the solution function at the grid points (finite differences) vs finding the weights of the linear combination of hat functions (finite elements).
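The "weights times hat functions" view is easy to make concrete. A small sketch of my own (hypothetical names, not from the post) that rebuilds a P1 finite element function u_h(x) = \sum_i w_i \phi_i(x) from its nodal weights:

```python
# Evaluate a P1 finite element function as a weighted sum of hat functions.
import numpy as np

nodes = np.linspace(0.0, 1.0, 5)   # uniform grid on [0, 1]
w = nodes * (1 - nodes)            # example weights (any nodal values would do)

def hat(x, i, nodes):
    """Piecewise-linear hat function phi_i: 1 at nodes[i], 0 at all others."""
    h = nodes[1] - nodes[0]
    return np.clip(1 - np.abs(x - nodes[i]) / h, 0.0, None)

def u_h(x):
    """The finite element solution: linear combination of the hat basis."""
    return sum(w[i] * hat(x, i, nodes) for i in range(len(nodes)))

print(u_h(nodes))   # at the nodes, u_h reproduces the weights exactly
```

Because \phi_i is 1 at its own node and 0 at every other node, the weights w_i are also the nodal values of u_h, which is why the two viewpoints above are so easy to conflate.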