Why does the Finite Element Method (FEM) tile domains with triangles?

Why is it that the Finite Element Method (FEM) tiles domains with triangles? With so many geometrical shapes available, is there anything special triangles have to offer?

For starters, a surprisingly small number of regular polygons can even be used to tile a surface. If we want to avoid leaving any gaps in the space, the condition that needs to be fulfilled is that the internal angle of the polygon has to evenly divide 360. Evenly is the keyword here. The internal angle of a regular polygon with $n$ sides is $(n-2) \cdot \frac{180}{n}$. Here below is the list of regular polygons with up to 100 sides whose internal angle is a whole number of degrees:

(3, 60.0)
(4, 90.0)
(5, 108.0)
(6, 120.0)
(8, 135.0)
(9, 140.0)
(10, 144.0)
(12, 150.0)
(15, 156.0)
(18, 160.0)
(20, 162.0)
(24, 165.0)
(30, 168.0)
(36, 170.0)
(40, 171.0)
(45, 172.0)
(60, 174.0)
(72, 175.0)
(90, 176.0)

Of these, only the triangle (60), the square (90) and the hexagon (120) divide 360 evenly. All the others simply don’t. This first evaluation is enough to shift the initial question from “why the triangle” to “why not the square or the hexagon“? All three are in fact theoretically equally capable of tiling any unbounded surface, and it is no coincidence that beehives have a hexagonal structure.
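
For the curious, here is a minimal sketch (plain Python, no dependencies) of how the list above can be generated, flagging the angles that evenly divide 360:

# Regular polygons with up to 100 sides whose internal angle is a whole
# number of degrees; flag those whose angle evenly divides 360.
for n in range(3, 101):
    angle = (n - 2) * 180 / n
    if angle == int(angle):
        print((n, angle), "tiles the plane" if 360 % angle == 0 else "")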

Now the practical constraints come in:

  1. we don’t deal with boundless surfaces – we always have boundaries, and the meshing needs to follow them as accurately as possible. How much can you bend a square to make it fit a curvy boundary? Not much, and with a hexagon the problem is pretty much the same. Triangles, although still not perfect, have pointy vertices that can be arranged to follow a curvy outline better.
  2. in FEM, every vertex of every triangle is a degree of freedom of the linear system that is then solved. Put in twice as many triangles and you’ll get almost twice as many degrees of freedom (not exactly twice as many, but hey, you get the idea). What happens if we use hexagons? Now every tile will yield 6 degrees of freedom! The problem is that these 6 vertices are not well distributed, since they are all concentrated on the perimeter of the small hexagon – no degree of freedom covers the interior of the hexagon, which leaves quite some space uncovered by any computation while still doubling the computational cost of the simulation.

All in all, the triangle is a much less constrained shape. Think of being given a random cloud of 2D points, with instructions to link them using a regular polygon of your choice. Obviously no shape will work on the cloud of points as it is, so you are allowed to move the points to fit your tiling. However, the metric that defines your success is how much, overall, you had to move individual points to complete the tiling with the polygon of your choice. Triangles will inevitably require less adjustment than hexagons, as the sketch below suggests.
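
In fact, once we drop the regularity requirement — as practical FEM meshes do — triangles can link any point cloud with no adjustment at all. A minimal sketch, assuming SciPy is available:

import numpy as np
from scipy.spatial import Delaunay

# A random cloud of 2D points can always be linked with triangles
# without moving any point.
points = np.random.rand(30, 2)
tri = Delaunay(points)
print(tri.simplices[:5])  # each row holds the indices of one triangle's vertices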

All this discussion takes for granted that we only allow regular polygons. If you want to allow irregular polygons, go on and see what happens, but good luck building a theory where, from the very first piece, “everything can happen”.

Projection methods in linear algebra numerics

Linear algebra classes often jump straight to the definition of a projector (as a matrix) when talking about orthogonal projections in linear spaces. As often happens, it is not clear how that definition arises. This is what is covered in this post.

Orthogonal projection: how to build a projector

Case 1 – 2D projection over (1,0)

It is quite straightforward to understand that orthogonal projection over (1,0) can be practically achieved by zeroing out the second component of any 2D vector, at least if the vector is expressed with respect to the canonical basis \{ e_1, e_2 \}. Albeit an idiotic statement, it is worth restating: the orthogonal projection over (1,0) of a 2D vector amounts to its first component alone.

How can this be put math-wise? Since we know that the dot product evaluates the similarity between two vectors, we can use that to extract the first component of a vector v. Once we have the magnitude of the first component, we only need to multiply that by e_1 itself, to know how much in the direction of e_1 we need to go. For example, starting from v = (5,6), first we get the first component as v \cdot e_1 = (5,6) \cdot (1,0) = 5; then we multiply this value by e_1 itself: 5e_1 = (5,0). This is in fact the orthogonal projection of the original vector. Writing down the operations we did in sequence, with proper transposing, we get

    \[e_1^T (e_1 v^T) = \begin{bmatrix} 1 \\ 0 \end{bmatrix} ([1, 0] \begin{bmatrix} 5 \\ 6 \end{bmatrix}) .\]

One simple and yet useful fact is that when we project a vector, its norm must not increase. This should be intuitive: the projection process either takes information away from a vector (as in the case above), or rephrases what is already there. In any case, it certainly does not add any. We may rephrase our opening fact with the following proposition:

PROP 1: ||v|| \geq ||Projection(v)||.

This can easily be seen through the Pythagorean theorem (and in fact it only holds for orthogonal projections, not oblique ones):

    \[||v||^2 = ||proj_u(v)||^2 + ||v - proj_u(v)||^2 \geq ||proj_u(v)||^2\]

Case 2 – 2D projection over (1,1)

Attempting to apply the same technique to an arbitrary projection target, however, does not seem to work. Suppose we want to project over (1,1). Repeating what we did above for a test vector [3,0], we would get

    \[\begin{bmatrix} 1 \\ 1 \end{bmatrix} ([3, 0] \begin{bmatrix} 1 \\ 1 \end{bmatrix}) =  [3,3].\]

This violates the previously discovered fact that the norm of the projection should be \leq than the original norm, so it must be wrong. In fact, visual inspection reveals that the correct orthogonal projection of [3,0] is [\frac{3}{2}, \frac{3}{2}].

The caveat here is that the vector onto which we project must have norm 1. This is vital every time we care about the direction of something, but not its magnitude, such as in this case. Normalizing [1,1] yields [\frac{1}{\sqrt 2}, \frac{1}{\sqrt 2}]. Projecting [3,0] over [\frac{1}{\sqrt 2}, \frac{1}{\sqrt 2}] is obtained through

    \[\begin{bmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{bmatrix} ([3, 0] \begin{bmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{bmatrix}) =  [\frac{3}{2}, \frac{3}{2}],\]

which now is indeed correct!

PROP 2: The vector on which we project must be a unit vector (i.e. a norm 1 vector).
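
A quick numerical check of this, as a NumPy sketch (not part of the original derivation):

import numpy as np

v = np.array([3.0, 0.0])
u = np.array([1.0, 1.0])

print(u * (u @ v))             # [3. 3.]  -- wrong: u is not a unit vector
u = u / np.linalg.norm(u)      # normalize first (PROP 2)
print(u * (u @ v))             # [1.5 1.5] -- the correct projection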

Case 3 – 3D projection onto a plane

A good thing to think about is what happens when we want to project on more than one vector. For example, what happens if we project a point in 3D space onto a plane? The idea is pretty much the same, and the technicalities amount to stacking in a matrix the vectors that span the plane onto which to project.

Suppose we want to project the vector v = [5,7,9] onto the plane spanned by \{ [1,0,0], [0,1,0] \}. The steps are the same: we still need to know how similar v is to each of the two individual vectors, and then to magnify those similarities in the respective directions.

    \[\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 5 \\ 7 \\ 9 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 5 \\ 7 \end{bmatrix} = 5 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 7 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 5 \\ 7 \\ 0 \end{bmatrix}\]

The only difference from the previous cases is that the vectors onto which we project are stacked together in matrix form, in such a shape that the operations we end up performing are the same as in the single-vector case.
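
The same computation as a NumPy sketch, with the spanning vectors stacked as the rows of Z:

import numpy as np

Z = np.array([[1.0, 0.0, 0.0],    # rows: orthonormal vectors spanning the plane
              [0.0, 1.0, 0.0]])
v = np.array([5.0, 7.0, 9.0])
print(Z.T @ (Z @ v))              # [5. 7. 0.]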

The rise of the projector

As we have seen, the projection of a vector v over a set of orthonormal vectors Z is obtained as

    \[Projection_Z(v) = Z^T Z v^T .\]

Up to now, we have always performed the last product Z v^T first, taking advantage of associativity. It should come as no surprise that we can also do it the other way around: first Z^T Z, and only afterwards multiply the result by v^T. This Z^T Z makes up the projection matrix. However, the idea is much more understandable when written in the expanded form, as it shows the process that leads to the projector.

THEOREM 1: The projection of v over an orthonormal basis Z is

    \[Projection_Z(v) = Z^T Z v^T = \underbrace{P}_{Projector} v^T .\]

So here it is: take any basis of whatever linear space, make it orthonormal, stack it in a matrix, multiply it by itself transposed, and you get a matrix whose action is to drop any vector of the higher-dimensional space onto the subspace spanned by that basis. Neat.
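
Here is that full recipe as a NumPy sketch. Note one convention flip: with the basis vectors stacked as the columns of Q (rather than the rows of Z), the projector reads Q Q^T.

import numpy as np

# Three random vectors in R^5, stacked as columns: (generically) a basis
# of a 3-dimensional subspace.
X = np.random.rand(5, 3)
Q, _ = np.linalg.qr(X)        # make the basis orthonormal
P = Q @ Q.T                   # the projector onto span(X)

v = np.random.rand(5)
w = P @ v                     # w lives in span(X)
print(np.allclose(P @ w, w))  # True: projecting again changes nothing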

Projector matrix properties

  • The norm of the projected vector is less than or equal to the norm of the original vector.
  • A projection matrix is idempotent: once projected, further projections don’t do anything else. This, in fact, is the only requirement that defines a projector, and it is the definition you find in textbooks: P^2 = P. The orthonormality of the projection basis, which we required in the previous examples, is precisely what makes Z^T Z idempotent. Moreover, if the projection is orthogonal, as we have assumed up to now, then we must also have P = P^T (see the numerical check after this list).
  • The eigenvalues of a projector are only 1 and 0. For an eigenvalue \lambda,

        \[\lambda v = Pv = P^2v = \lambda Pv = \lambda^2 v \Rightarrow \lambda = \lambda^2 \Rightarrow \lambda \in \{0,1\}\]

  • There exists a basis X of \mathbb{R}^N in which P can be written as the block matrix P = \begin{bmatrix} I_k & 0 \\ 0 & 0_{N-k} \end{bmatrix}, with k being the rank of P. If we further decompose X = [X_1, X_2], with X_1 being N \times k and X_2 being N \times (N-k), the existence of the basis X shows that P really sends points from \mathbb{R}^N into Im(X_1) = Im(P), and points from \mathbb{R}^N \setminus P(\mathbb{R}^N) into Ker(P). It also shows that \mathbb{R}^N = Im(P) \oplus Ker(P).
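
A minimal numerical check of these properties (a NumPy sketch):

import numpy as np

Z = np.array([[1.0, 0.0, 0.0],    # orthonormal rows spanning a plane
              [0.0, 1.0, 0.0]])
P = Z.T @ Z

print(np.allclose(P @ P, P))               # idempotent: P^2 = P
print(np.allclose(P, P.T))                 # symmetric: orthogonal projection
print(np.round(np.linalg.eigvals(P), 10))  # eigenvalues are only 0s and 1s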

Model Order Reduction

Is there any application of projection matrices to applied math? Indeed.

It is often the case (or, at least, the hope) that the solution to a differential problem lies in a low-dimensional subspace of the full solution space. If some \textbf{w}(t) \in \mathbb{R}^N is the solution to the Ordinary Differential Equation

    \begin{equation*} \frac{d\textbf{w}(t)}{dt} = \textbf{f}(\textbf{w}(t), t) \end{equation*}

then there is hope that there exists some subspace \mathcal{S} \subset \mathbb{R}^N, with dim(\mathcal{S}) < N, in which the solution lives. If that is the case, we may rewrite it as

    \[\textbf{w}(t) = \textbf{V}_\mathcal{S}\textbf{q}(t)\]

for some appropriate coefficients q_i(t), which are the components of \textbf{w}(t) over the basis \textbf{V}_\mathcal{S}, the matrix whose columns form a basis of \mathcal{S}.

Assuming that the basis \textbf{V} itself is time-invariant, and that in general \textbf{Vq}(t) will be a good but not perfect approximation of the real solution, the original differential problem can be rewritten as:

    \begin{equation*} \begin{split} \frac{d}{dt}\textbf{Vq(t)} =  \textbf{f}(Vq(t), t) + \textbf{r}(t) \\ \textbf{V}\frac{d}{dt}\textbf{q(t)} =  \textbf{f}(Vq(t), t) + \textbf{r}(t) \\ \end{split} \end{equation*}

where \textbf{r}(t) is the residual error introduced by the approximation.
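
A toy sketch of where this leads (names and dynamics are made up for illustration): multiplying on the left by \textbf{V}^T, with orthonormal columns and the residual dropped — a Galerkin projection, which the post stops just short of — closes the system into a small ODE for \textbf{q}(t).

import numpy as np

N, k = 100, 5
V, _ = np.linalg.qr(np.random.rand(N, k))  # orthonormal basis of the subspace S

A = -np.eye(N)                             # toy linear dynamics: f(w, t) = A w
w0 = np.random.rand(N)

q = V.T @ w0                               # reduced initial coefficients
dt = 0.01
for _ in range(1000):
    q = q + dt * (V.T @ (A @ (V @ q)))     # explicit Euler on dq/dt = V^T f(Vq)
w_approx = V @ q                           # lift the reduced solution back to R^N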

Reproducing a transport instability in convection-diffusion equation

Drawing from the Larson–Bengzon FEM book, I wanted to experiment with transport instabilities. It looks like there might be an instability in my ocean-ice model, but before being able to address that, I wanted to wrap my head around the simplest 1D example one could find. And that is the convection-diffusion equation:

(1)   \begin{equation*} \begin{split} - \epsilon \Delta u + b \cdot \nabla u &= f  \ \ in \ \Omega \\ u &= 0 \ \ on \ \partial \Omega \end{split} \end{equation*}

The first term -\epsilon \Delta u is responsible for the smearing of the velocity field u proportionally to \epsilon, and is thus called the diffusion term. Intuitively, it controls how much the neighbors of a given point x are influenced by the behavior of x; how much velocities (or temperatures: think of whatever you want!) diffuse in the domain.

The second term b \cdot \nabla u controls how much the velocities u are transported in the direction of the vector field b, and is thus called the convective term. A requirement for this problem to be well-posed is that \nabla \cdot b = 0 — otherwise it would mean that we allow velocities to vanish or to come into existence.

The instability is particularly likely to happen close to a boundary where a Dirichlet boundary condition is enforced. The problem is not localized only close to the boundary though, as fluctuations can travel throughout the whole domain and lead the whole solution astray.

Transport instability in 1D

The simplest convection-diffusion equation in 1D has the following form:

(2)   \begin{equation*} -\epsilon u_{xx} + u_x = 1 \ \ in \ (0,1), \ \ u(0) = u(1) = 0 \end{equation*}

whose solution, for small \epsilon, is approximately just u = x. This goes well with the boundary condition at 0, but not with the condition at 1, where the solution needs to drop to 0 quite abruptly.

It’s easy to simulate the scenario with FEniCS and get this result (with \epsilon = 0.01 and the unit interval divided into 10 nodes):

[Figure: transport instability — the computed solution oscillates, with even and odd nodes following two separate trends]

in which we can see two different trends: one followed by the odd points and one by the even points! In fact, if we discretize the 1D equation with finite elements we obtain:

(3)   \begin{equation*} -\epsilon \frac{u_{i+1}-2u_i+u_{i-1}}{h^2} + \frac{u_{i+1}-u_{i-1}}{2h} = 1 \end{equation*}

aha! From this we see that, if \epsilon is not of the same order of magnitude as h (i.e. if \epsilon \ll h), then the first term becomes negligible. The problem then is that the second term contains only u_{i-1} and u_{i+1}, but not u_i. This makes it so that each node only talks to its second-closest neighbors, explaining the behavior we saw in the plot before. It’s like the even nodes make up one solution and the odd nodes a separate one!

If \frac{\epsilon}{h} \approx 1, the solution that comes out is quite different:

[Figure: stabilized result — a linear solution rapidly decaying towards the right boundary]

As we expected: a linear solution rapidly decaying towards the right. This is because in the above plot we had \epsilon = 0.01 and h = 0.01, i.e. the unit interval divided into 100 nodes.
Also notice how the problem does not pop up if the boundary conditions agree with the ideal solution (i.e. if the BC on the right is 1 instead of 0).

Solving the transport instability with a stabilization coefficient

The easiest dynamic way to fix the issue is to add an artificial component to \epsilon, to make sure that the first term of the transport equation is never neglected, regardless of the relationship between the mesh size h and \epsilon. This is a stabilization parameter:

(4)   \begin{equation*} -(\epsilon + \beta h_{min}) \ u_{xx} + u_x = 1 \ \ in \ (0,1), \ \ u(0) = u(1) = 0 \end{equation*}

where h_{min} is the smallest mesh diameter. There is no single correct value for \beta: it quite depends on the other values (although it must satisfy 0 \leq \beta < 1). Anyway, a good starting point is \beta = 0.5, which can then be tweaked according to results. This way feels a bit hacky though: “if we can’t solve it for \epsilon, let’s bump it up a bit” is pretty much the idea behind it.

With this formulation it’s also possible to derive what mesh size is needed to actually use a particular value of \epsilon. For example, if we’d like the second-derivative term to have a 10^{-4} coefficient, then we need a mesh size h_{min} = \frac{10^{-4}}{\beta} \approx 10^{-3}, achieved with a 300×300 mesh, for example (which you can find out with m=300; mesh=fenics.UnitSquareMesh(m,m); mesh.hmin()). A uniformly fine mesh might not be needed though: it is often enough to have a coarse mesh in regions where not much is happening, and a very fine one at problematic regions (such as boundaries, in this example).

Code — Convection-diffusion equation 1D

from fenics import *
import matplotlib.pyplot as plt

mesh = UnitIntervalMesh(100)
V = FunctionSpace(mesh, 'P', 1)

# Homogeneous Dirichlet conditions on both endpoints
bcu = [
    DirichletBC(V, Constant(0), 'near(x[0], 0)'),
    DirichletBC(V, Constant(0), 'near(x[0], 1)'),
]

u = TrialFunction(V)
v = TestFunction(V)
u_ = Function(V)
f = Constant(1)
epsilon = Constant(0.01)
beta = Constant(0.5)   # stabilization coefficient, 0 <= beta < 1
hmin = mesh.hmin()     # smallest mesh diameter

# Weak form of -(epsilon + beta*hmin) u'' + u' = f
a = (epsilon + beta*hmin)*dot(u.dx(0), v.dx(0))*dx + u.dx(0)*v*dx
L = f*v*dx

solve(a == L, u_, bcs=bcu)

print("||u|| = %s, ||u||_inf = %s" % ( \
round(norm(u_, 'L2'), 2), round(norm(u_.vector(), 'linf'), 3)
))

fig2 = plt.scatter(mesh.coordinates(), u_.compute_vertex_values())
plt.savefig('velxy.png', dpi=300)
plt.close()

How do Dirichlet and Neumann boundary conditions affect Finite Element Methods variational formulations?

To solve a classical second-order differential problem

    \begin{equation*} -(au')' + bu' + cu = f \ in \ \Omega \end{equation*}

with FEM, we first need to derive its weak formulation. This is achieved by multiplying the equation by a test function \phi and then integrating by parts to get rid of second order derivatives:

(1)   \begin{equation*} \begin{split} 0 &= \int_\Omega (-(a u')' + b u' + c u - f) \phi \, dx \\ &= \underbrace{\int_\Omega (a u' \phi' + b u' \phi + c u \phi) \, dx}_{a(u, \phi)} \underbrace{- \int_\Omega f \phi \, dx - (a u' \phi)|_{\partial\Omega}}_{L(\phi)} \end{split} \end{equation*}

A typical FEM problem then reads like:

    \begin{equation*} \begin{split} \text{Find } u \in H_0'(\Omega) \ s.t. \ a(u, \phi) + L(\phi) = 0 \ \ \forall \phi \in H_0'(\Omega), \\ \text{where } H_0'(\Omega) = \{ v: \Omega \rightarrow \mathbb{R} : \int_0^1 v^2(x) + v'(x)^2 \, dx < \infty \}. \end{split} \end{equation*}

What is the difference between imposing Dirichlet boundary conditions (e.g. u(\partial \Omega) = k) and Neumann ones (u'(\partial \Omega) = k(x)) from a math perspective? Dirichlet conditions go into the definition of the space H_0', while Neumann conditions do not: Neumann conditions only affect the variational problem formulation directly.

For example, in one dimension, adding the Dirichlet condition v(0) = v(1) = 0 results in the function space change H_0'(\Omega) = \{ v \in H'(\Omega) : v(0)=v(1)=0 \}. With this condition, the boundary term (a u' \phi)|_{\partial\Omega} also zeroes out in the variational problem, because the test function \phi belongs to H_0'.

On the other hand, by adding the Neumann condition u'(0) = u'(1) = 0, the space H_0' does not change, even though the boundary term vanishes from the variational problem in the same way as for the Dirichlet condition. However, that term now goes to zero not because of the test function, but because of the value of the derivative u'. If the Neumann condition had specified a different value, such as u'(0) = u'(1) = 5, then the boundary term would not zero out!

In other words, Dirichlet conditions have the effect of further constraining the solution function space, while Neumann conditions only affect the equations.
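
A minimal FEniCS sketch of the distinction (illustrative only, for -u'' = f on (0,1) with u(0) = 0 and u'(1) = 5): the Dirichlet condition is imposed through a DirichletBC object that constrains the space, while the Neumann value only contributes a boundary term to the functional L.

from fenics import *

mesh = UnitIntervalMesh(50)
V = FunctionSpace(mesh, 'P', 1)
u, v = TrialFunction(V), TestFunction(V)
f, g = Constant(1), Constant(5)

bc = DirichletBC(V, Constant(0), 'near(x[0], 0)')  # constrains the space
a = u.dx(0)*v.dx(0)*dx
L = f*v*dx + g*v*ds  # the Neumann value enters the functional only
# (ds runs over the whole boundary; the x[0]=0 row is then
# overwritten by the Dirichlet condition)

u_ = Function(V)
solve(a == L, u_, bc)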

What is the difference between Finite Differences and Finite Element Methods?

With Finite Differences, we discretize space (i.e. we put a grid on it) and we seek the values of the solution function at the mesh points. We still solve a discretized differential problem.

With Finite Elements, we approximate the solution as a (finite) sum of functions defined on the discretized space. These functions make up a basis of the space, and the most commonly used are the hat functions. We end up with a linear system whose unknowns are the weights associated with each of the basis functions: i.e., how much does each basis function count for our particular solution to our particular problem?

Brutally put, it is finding the values of the solution function at the grid points (finite differences) vs finding the weights of the linear combination of hat functions (finite elements).
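
A minimal sketch of the contrast for -u'' = 1 on (0,1) with homogeneous Dirichlet conditions (for P1 elements on a uniform grid the two linear systems happen to coincide, which makes the comparison easy to check):

import numpy as np

n, h = 9, 1/10                                   # 9 interior nodes, spacing h
T = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Finite differences: unknowns are the VALUES of u at the grid points.
u_fd = np.linalg.solve(T / h**2, np.ones(n))

# P1 finite elements: unknowns are the WEIGHTS of the hat functions;
# stiffness matrix T/h, load vector of entries h (same solution here).
u_fem = np.linalg.solve(T / h, h * np.ones(n))
print(np.allclose(u_fd, u_fem))                  # True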
