What is an equation, and why do we solve them that way?

In words, an equation is a technique to figure out the value of a quantity when the only thing we have is a convoluted and self-referential definition. Enough talking, now to the action.

Imagine somebody comes with this riddle-sounding problem:

A brick weighs as much as a kilo plus half a brick. How much does a brick weigh?

How can we tell how much a brick weighs if its weight is defined in terms of bricks? This seems hopelessly circular.

Let’s look at the problem statement and try to clean it up, to reduce it to its bare bones.

\overbrace{\text{A}}^{1} \ \text{brick} \ \overbrace{\text{weighs as much as}}^{=} \ \overbrace{\text{a}}^{1} \ \overbrace{\text{kilo}}^{\text{kg}} \ \overbrace{\text{plus}}^{+} \ \overbrace{\text{half a brick}}^{0.5 \cdot \text{brick}}

So that in the end the problem statement reads like:

1 \text{brick} = 1 \text{kg} + 0.5 \cdot \text{brick}

It definitely reads mathy, although not properly equationy yet. Not intimidating enough. But from here on, the path is the same: keep simplifying and shortening, and load each symbol with more and more meaning.

We know we're talking bricks here, and we only need to be reminded of it in passing, so we can make one further shortening:

1 b = 1 \text{kg} + 0.5 \cdot b

We also know we’re talking weight, and we’ll leave the = sign to carry its full meaning of “weighs as much as”, so that we can drop the kg reference:

1 b = 1 + 0.5 \cdot b

We can also agree that every time there’s 1 beside the letter b, we’ll omit it:

b = 1 + 0.5 \cdot b

And an equation is born. Oh wait, we've missed a step. What's an equation without an x? Just swap every b for an x (only because it's a convention to use the last few letters of the alphabet for unknown quantities):

x = 1 + 0.5 \cdot x

Now this, this is an equation your math teacher could be proud of, where x is the unknown. In truth, we’ve been doing equations since the very first statement: what authority is telling us we can’t use \text{brick} in an equation, in place of the x? Who said we can’t have \text{kg} in it? Or words like \text{half}? Why do we want to make our life so complicated?

Well, look what happens if we take the following equation and put in all the words:

x + \frac{7}{5} + \frac{x}{2} = \frac{x + 5}{2}

A brick plus seven fifths of a kg plus half a brick weighs as much as half a brick plus five kilos.

Sooo long, and clunky. Plus, it's ambiguous: "half a brick plus five kilos" could be both a) first adding together a brick and 5 kilos and then taking half of the whole thing, or b) taking half a brick first and then adding 5 kilos to that. And this is one of the simplest equations you could come up with: the amount of ambiguity grows and grows the more information you add. We cannot afford any ambiguity: even the slightest difference changes the final result. Language is poetic and flexible, but it is not rigorous and unambiguous: for that, there's math with its symbols and equations. Math is the language humanity has come up with to unambiguously convey information, to take one very precise idea from one mind and transfer it to another mind without altering its shape one bit. In the recent history of natural language we've also tried to address the problem by adding punctuation, but that still falls short of the high precision standards that technical subjects require. The size of a nail can cause an Apollo mission to fail.

Going back and forth between a problem's statement in natural language and in math symbols is extremely valuable when learning math, as it teaches math as a language, rather than as a set of abstract symbols. During my first year of high school my teacher would have us all describe math statements in natural language, or dictate to him an equation from the textbook, spelling it out in natural language. It was tedious, and valuable, and an exercise I recommend in education.

How to solve an equation

Alright, now we know what an equation is and how one is born. Let’s get back to what we’ve built so far:

x = 1 + 0.5 \cdot x

It seems hopelessly circular still, with x on both sides of the equal. It is true: so far we haven’t done anything other than shortening and dropping words, so the problem is unaltered! Not much creativity or smartness has happened so far. Now we’ll draw from the standard math mindset that will lead us oh so far in life:

  1. think about where we want to get, what our goal is;
  2. figure out how we can get there.

Print those two points out on a giant poster and stick it in your child's bedroom. Every theorem in math first describes the starting point, then sets out the ending goal, and finally explains how to get there. The where requires clarity of mind, the how takes creativity and ideas. Math exploration is not as tidy, because you don't always know if you can get where you'd like to get, so you often wander around exploring where you can get from where you stand, figuring out your goals as you walk and charting the map of your territory.

But I digress. Always think where you want to get, and how you can get there. To get where you want, you often need to leap into a different realm.

Our goal is obtaining a statement that would read like “A brick weighs … kg”, or, in math terms, x = ... . When we get to this formulation, we can easily read out the weight of a brick. How do we get there?

Ideas in math are often hidden in plain sight — or rather, they are part of the scenery, so you have to spot them in the smallest of details. We’re talking weight. How about scales?

Our statement suggests that if we load a scale with one brick on one side and half a brick and a kilo on the other, it will be in balance.

[Image: a scale balancing one brick against half a brick plus a one-kilo weight]

What do we know about scales? I'm gonna pretend word meaning is meaningful, and say they scale: 100 grams weigh as much as 100 grams, 80 grams weigh as much as 80 grams. It's an obvious and useless remark, but a rephrasing suddenly makes it much more useful: if we alter the amount of weight on both sides at the same time, the scale is still in balance. If we add, subtract, multiply, or divide the same quantity on both sides, the scale stays in balance.

Now think again where we want to get and how that translates into scales: we want one side of the scale to only contain bricks, and the other to only contain kilos. From here onward, the math dopamine kick wears off and it's all downhill. If we take away half a brick from both sides, we get where we want (almost).

[Image: the scale after removing half a brick from both sides: half a brick balances a one-kilo weight]

There's only a minor annoyance here: we've now discovered that half a brick weighs as much as a kilo, but we were actually interested in full bricks. Well, we can scale the quantities, doubling them both. And hopefully it doesn't take a leap of faith to accept that two half bricks make a whole brick.

[Image: the final scale: one brick balances two one-kilo weights]
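
These scale moves are exactly what a computer algebra system automates. Here is a minimal sketch with SymPy (the use of SymPy and the variable names are our choice, not part of the original riddle):

from sympy import Eq, Rational, solve, symbols

x = symbols('x')  # the weight of a brick, in kg

# "A brick weighs as much as a kilo plus half a brick"
brick_riddle = Eq(x, 1 + Rational(1, 2) * x)

print(solve(brick_riddle, x))  # [2] -> a brick weighs 2 kg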

Why does the Finite Element Method (FEM) tile domains with triangles?

Why is it the case that the Finite Element Method (FEM) tiles domains with triangles? With so many geometrical shapes available, is there anything special triangles have to offer?

For starters, a surprisingly small number of regular polygons can even be used to tile a surface. If we want to avoid leaving any gaps in the space, the condition that needs to be fulfilled is that the internal angle of the polygon has to evenly divide 360. Evenly is the keyword here. The internal angle of a regular polygon is (n-2) \cdot \frac{180}{n}, where n is the number of sides. Here below are the regular polygons of up to 100 sides whose internal angle is a whole number of degrees:

(3, 60.0)
(4, 90.0)
(5, 108.0)
(6, 120.0)
(8, 135.0)
(9, 140.0)
(10, 144.0)
(12, 150.0)
(15, 156.0)
(18, 160.0)
(20, 162.0)
(24, 165.0)
(30, 168.0)
(36, 170.0)
(40, 171.0)
(45, 172.0)
(60, 174.0)
(72, 175.0)
(90, 176.0)
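
For reference, the list above can be reproduced with a few lines of Python (a minimal sketch):

# Regular polygons (up to 100 sides) whose internal angle
# (n - 2) * 180 / n is a whole number of degrees -- equivalently,
# those whose number of sides n divides 360.
for n in range(3, 101):
    angle = (n - 2) * 180 / n
    if angle == int(angle):
        print((n, angle))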

Of these, only the triangle (60), the square (90) and the hexagon (120) divide 360 evenly. All the others simply don't. This first evaluation is enough to shift the initial question from "why the triangle" to "why not the square or the hexagon"? All three of them are in fact theoretically equally capable of tiling an infinite surface, and it is no coincidence that beehives have a hexagonal structure.

Now the practical constraints come in:

  1. we don't deal with boundless surfaces – we always have boundaries, and the meshing needs to follow them as accurately as possible. How much can you bend a square to make it fit a curvy boundary? Not much. With a hexagon the problem is pretty much the same. Triangles, although still not perfect, have one pointy end that can be arranged to follow a curvy outline better.
  2. in FEM, every vertex of every triangle is a degree of freedom of the linear system that is then solved. Put in twice as many triangles and you'll get almost twice as many degrees of freedom (not exactly twice as many, but hey, you get the idea). What happens if we use hexagons? Now every tile will yield 6 degrees of freedom! The problem is that these 6 vertices are not well distributed, since they are all concentrated on the perimeter of the hexagon – no degree of freedom covers its interior, which leaves quite some space uncovered by the computation while still raising the computational cost of the simulation.

All in all, the triangle is a much less constrained shape. Think of being given a random cloud of 2D points, with the instruction to link them with a regular polygon of your choice. Obviously no shape will work on the cloud of points as it is, so you are allowed to move the points to fit your tiling. However, the metric that defines your success is how much, overall, you have had to move individual points to manage the tiling with the polygon of your choice. Triangles will inevitably require less adjustment, hexagons more.

All this discussion takes for granted that we only allow regular polygons. If you want to allow irregular polygons, go on and see what happens, but good luck building a theory where, from the very first piece, "everything can happen".

Overdetermined and underdetermined systems of equations put simply

Systems of equations are an evolution of equations, but they are often misunderstood. This article aims at providing real world examples and intuitions for systems of equations, and in particular for overdetermined and underdetermined systems.

Intuition for systems of equations

Intuitively, we can think of a system of equations as a set of requests. Let's imagine having a group of people in front of us, each of whom must be given a task. An informal example system could be the following:

  •  Anna, solve a system of linear equations;
  • George, go to the beach and have fun;
  • Luke, prevent Anna from ringing social services.

In this form, a solution to the system consists of a list of person-task pairings that satisfies the demands detailed above. In other words, giving a solution to the system amounts to saying what Anna should do, what George should do, and what Luke should do, so that the demands are satisfied. In the example above, Anna should solve a system of equations, George should go to the beach, and Luke should prevent Anna from ringing social services.

It seems pretty obvious, but this intuition will be useful when covering over/underdetermined systems.

Overdetermined systems of equations

Let's think of having to give orders to a large number of people. It might happen that, by the time we get to the last person, we have forgotten to whom we have already given an order and to whom we have not, and we end up repeating some orders:

  • Anna, do the laundry;
  • George, go to the beach;
  • Luke, get Anna’s laundry dirty;
  • Sophie, prevent Luke from dirtying the laundry;
  • George, go to the beach.

Here George has received his order twice. In these cases, we say that the system is overdetermined, because it has more orders than people. The example above is innocuous, because George is simply told to do the same thing twice. The simplest mathematical example of such a system is one where two equations are proportional to each other:

\begin{cases} x = 1 \\ 2x = 2 \end{cases}

This is an overdetermined system with a solution: x=1. The second equation is just redundant, like a game in which the second rule states to follow the first.

How about the following instead:

  • Anna, do the laundry;
  • George, go to the beach;
  • Luke, get Anna’s laundry dirty;
  • Sophie, prevent Luke from dirtying the laundry;
  • George, bake a cake.

Here George gets two clashing orders, and is rightfully confused: he cannot go to the beach and bake a cake at the same time. He is going to disappoint us no matter what. Indeed, this system is not only overdetermined, because there are more orders than people, but also has no solution. In fact, we are unable to come up with a list of person-task pairings as before. If George went to the beach, he would be ignoring the baking order; if he baked a cake, he would be ignoring the beach order. There is no way out: there is no solution! It is a bit like a game where the second rule says not to follow the first: it is impossible to play a game like that!

The simplest mathematical example of such a system is:

\begin{cases} x = 1 \\ x = 2 \end{cases}

which does not have a solution because we ask x to be 1 and 2 at the same time — a bit like asking your neighbor to be male and female at the same time (but not queer).

So once again: when a system of equations has more equations than unknowns, we say it is overdetermined. It means that too many rules are being imposed at once, and some of them may be conflicting. However, it is false to state that an overdetermined system does not have any solution: it may or it may not. If the surplus commands are just reformulations of other orders, then it is not a problem: the system does have a solution.

Underdetermined systems of equations

If we give fewer orders than there are people, we say the system is underdetermined. When this happens, at least one person has not received any command. This time, the idea is that people who do not receive any command are free to do whatever they want.

For example, let's imagine again having Anna, George, and Luke lined up in front of us. If our commands are:

  • Anna, do the laundry;
  • George, go to the beach.

then Luke has not received any order. Maybe he will go to the park, maybe he will prevent Anna from ringing social services… he is free to do whatever he wants: the options are infinite! In these cases, we say that Luke is a free variable. As long as Anna and George stick to what they are told, each of Luke’s options makes for a solution: that is why the system has an infinite number of solutions.

As a mathematical example, think of being asked to find values for x,y,z satisfying the following system:

\begin{cases} x = 1 \\ y = 2 \end{cases}

Great, but what about z? Here z is a free variable and there are infinitely many solutions.

However, there can also be underdetermined systems with no solution. That is the case when we give too few orders, and some of them conflict with each other. Again with our favorite trio:

  • Anna, do the laundry;
  • George, go to the beach;
  • Anna, go to the park.

Not only are we not saying anything to Luke here, but we are also giving clashing orders to Anna. So even though the absence of commands for Luke would allow infinitely many solutions, the impossibility of satisfying Anna's orders means that no solution exists.

Examples and final remarks

All in all, there are no strict rules. What appears to be an overdetermined system could turn out to be an underdetermined one, and an underdetermined system could have no solution.

Finally, notice that in mathematical reality commands usually address more than one person at a time. A system of equations in real life is something like:

\begin{cases} x + y = 1 \\ x - y = 2 \end{cases}

Here the intuition gets trickier, because each command mixes at least two people, and is harder to render in natural language. Still, the orders analogy is useful for understanding what underdetermined and overdetermined systems are and why they have infinitely many solutions or none.

Ex. 1 \begin{cases} x - 2y = 1 - z \\ x + z - 1 = 2y \\ x + z = 1 \\ x = 3 - z \end{cases}     in \mathbb{R}^3
An apparently overdetermined system which is actually underdetermined and does not even have a solution.

Ex. 2 \begin{cases} x + z = 1 \\ x = 3 - z \end{cases}     in \mathbb{R}^3
An apparently underdetermined system (two equations for three unknowns) which still does not have a solution: the two equations over-constrain the quantity x + z.
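
As a sanity check on Ex. 1, here is a minimal NumPy sketch (the matrix encoding is ours). It uses the Rouché-Capelli criterion: a solution exists iff rank(A) = rank([A|b]), and it is unique only when that common rank equals the number of unknowns.

import numpy as np

# Ex. 1 rewritten in matrix form A @ (x, y, z) = b
A = np.array([[1., -2., 1.],   # x - 2y + z = 1
              [1., -2., 1.],   # x - 2y + z = 1 (same equation in disguise)
              [1.,  0., 1.],   # x + z = 1
              [1.,  0., 1.]])  # x + z = 3
b = np.array([1., 1., 1., 3.])

rank_A = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
print(rank_A, rank_Ab)  # 2 3 -> ranks differ: no solution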

 

Projection methods in linear algebra numerics

Linear algebra classes often jump straight to the definition of a projector (as a matrix) when talking about orthogonal projections in linear spaces. As often happens, it is not clear how that definition arises. This is what is covered in this post.

Orthogonal projection: how to build a projector

Case 1 – 2D projection over (1,0)

It is quite straightforward to understand that orthogonal projection over (1,0) can be practically achieved by zeroing out the second component of any 2D vector, at least if the vector is expressed with respect to the canonical basis \{ e_1, e_2 \}. Albeit an idiotic statement, it is worth restating: the orthogonal projection of a 2D vector over (1,0) amounts to its first component alone.

How can this be put math-wise? Since we know that the dot product evaluates the similarity between two vectors, we can use that to extract the first component of a vector v. Once we have the magnitude of the first component, we only need to multiply that by e_1 itself, to know how much in the direction of e_1 we need to go. For example, starting from v = (5,6), first we get the first component as v \cdot e_1 = (5,6) \cdot (1,0) = 5; then we multiply this value by e_1 itself: 5e_1 = (5,0). This is in fact the orthogonal projection of the original vector. Writing down the operations we did in sequence, with proper transposing, we get

    \[e_1^T (e_1 v^T) = \begin{bmatrix} 1 \\ 0 \end{bmatrix} ([1, 0] \begin{bmatrix} 5 \\ 6 \end{bmatrix}) .\]

One simple and yet useful fact is that when we project a vector, its norm must not increase. This should be intuitive: the projection process either takes information away from a vector (as in the case above), or rephrases what is already there. In any case, it certainly does not add any. We may rephrase our opening fact with the following proposition:

PROP 1: ||v|| \geq ||Projection(v)||.

This can easily be seen through the Pythagorean theorem (and in fact it only holds for orthogonal projections, not oblique ones):

    \[||v||^2 = ||proj_u(v)||^2 + ||v - proj_u(v)||^2 \geq ||proj_u(v)||^2\]

Case 2 – 2D projection over (1,1)

Attempting to apply the same technique to an arbitrary projection target, however, does not seem to work. Suppose we want to project over (1,1). Repeating what we did above for a test vector [3,0], we would get

    \[\begin{bmatrix} 1 \\ 1 \end{bmatrix} ([3, 0] \begin{bmatrix} 1 \\ 1 \end{bmatrix}) =  [3,3].\]

This violates the previously discovered fact that the norm of the projection should be \leq the original norm, so it must be wrong. In fact, visual inspection reveals that the correct orthogonal projection of [3,0] is [\frac{3}{2}, \frac{3}{2}].

The caveat here is that the vector onto which we project must have norm 1. This is vital every time we care about the direction of something but not its magnitude, as is the case here. Normalizing [1,1] yields [\frac{1}{\sqrt 2}, \frac{1}{\sqrt 2}]. The projection of [3,0] over [\frac{1}{\sqrt 2}, \frac{1}{\sqrt 2}] is obtained through

    \[\begin{bmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{bmatrix} ([3, 0] \begin{bmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{bmatrix}) =  [\frac{3}{2}, \frac{3}{2}],\]

which now is indeed correct!

PROP 2: The vector on which we project must be a unit vector (i.e. a norm 1 vector).

Case 3 – 3D projection on a plane

A good thing to think about is what happens when we want to project over more than one vector. For example, what happens if we project a point in 3D space onto a plane? The idea is pretty much the same, and the technicalities amount to stacking in a matrix the vectors that span the plane onto which we project.

Suppose we want to project the vector v = [5,7,9] onto the plane spanned by \{ [1,0,0], [0,1,0] \}. The steps are the same: we still need to know how similar v is to each of the two individual vectors, and then to magnify those similarities in the respective directions.

    \[\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 5 \\ 7 \\ 9 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 5 \\ 7 \end{bmatrix} = 5 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 7 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 5 \\ 7 \\ 0 \end{bmatrix}\]

The only difference with the previous cases is that the vectors onto which we project are stacked together in matrix form, in such a shape that the operations we end up performing are the same as in the single vector cases.

The rise of the projector

As we have seen, the projection of a vector v over a set of orthonormal vectors Z is obtained as

    \[Projection_Z(v) = Z^T Z v^T .\]

Up to now, we have always performed the last product Z v^T first, taking advantage of associativity. It should come as no surprise that we can also do it the other way around: first Z^T Z, and only afterwards multiply the result by v^T. This Z^T Z makes up the projection matrix. However, the idea is much more understandable when written in the expanded form, as it shows the process which leads to the projector.

THEOREM 1: The projection of v over an orthonormal basis Z is

    \[Projection_Z(v) = Z^T Z v^T = \underbrace{P}_{Projector} v^T .\]

So here it is: take any basis of whatever linear space, make it orthonormal, stack it in a matrix (one vector per row), multiply it by itself transposed, and you get a matrix whose action is to drop any vector of the ambient space onto the spanned subspace. Neat.
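
A quick numerical check of the 2D example above (a minimal NumPy sketch; the rows of Z hold the basis vectors, matching the convention used in the formulas):

import numpy as np

# Rows of Z form an orthonormal set spanning the target subspace
Z = np.array([[1.0, 1.0]]) / np.sqrt(2)  # projecting onto span{(1,1)}
P = Z.T @ Z                              # the projector

v = np.array([3.0, 0.0])
print(P @ v)                  # [1.5 1.5], as in the worked example
print(np.allclose(P @ P, P))  # idempotence: True
print(np.linalg.norm(P @ v) <= np.linalg.norm(v))  # norm does not grow: True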

Projector matrix properties

  • The norm of the projected vector is less than or equal to the norm of the original vector.
  • A projection matrix is idempotent: once projected, further projections don't do anything else. This, in fact, is the only requirement that defines a projector, and it is the definition you find in textbooks: P^2 = P. If, moreover, the projection is orthogonal, as we have assumed up to now, then we must also have P = P^T (idempotency alone also allows oblique projections).
  • The eigenvalues of a projector are only 1 and 0. For an eigenvalue \lambda,

        \[\lambda v = Pv = P^2v = \lambda Pv = \lambda^2 v \Rightarrow \lambda = \lambda^2 \Rightarrow \lambda \in \{0,1\}\]

  • There exists a basis X of \mathbb{R}^N in which P takes the block form \begin{bmatrix} I_k & 0 \\ 0 & 0 \end{bmatrix}, with k being the rank of P. If we further decompose X = [X_1, X_2], with X_1 being N \times k and X_2 being N \times (N-k), the existence of the basis X shows that P sends points of \mathbb{R}^N into Im(X_1) = Im(P), while the remaining directions, spanned by X_2, make up Ker(P). It also shows that \mathbb{R}^N = Im(P) \oplus Ker(P).

Model Order Reduction

Is there any application of projection matrices to applied math? Indeed.

It is often the case (or, at least, the hope) that the solution to a differential problem lies in a low-dimensional subspace of the full solution space. If some \textbf{w}(t) \in \mathbb{R}^N is the solution to the Ordinary Differential Equation

    \begin{equation*} \frac{d\textbf{w}(t)}{dt} = \textbf{f}(\textbf{w}(t), t) \end{equation*}

then there is hope that there exists some subspace \mathcal{S} \subset \mathbb{R}^N, with dim(\mathcal{S}) < N, in which the solution lives. If that is the case, we may rewrite it as

    \[\textbf{w}(t) = \textbf{V}_\mathcal{S}\textbf{q}(t)\]

for some appropriate coefficients (q_i(t)), which are the components of \textbf{w}(t) over the basis \textbf{V}_\mathcal{S}.

Assuming that the basis \textbf{V} itself is time-invariant, and that in general \textbf{Vq(t)} will be a good but not perfect approximation of the real solution, the original differential problem can be rewritten as:

    \begin{equation*} \begin{split} \frac{d}{dt}\textbf{Vq(t)} =  \textbf{f}(Vq(t), t) + \textbf{r}(t) \\ \textbf{V}\frac{d}{dt}\textbf{q(t)} =  \textbf{f}(Vq(t), t) + \textbf{r}(t) \\ \end{split} \end{equation*}

where \textbf{r(t)} is the residual, i.e. the error committed by restricting the solution to the subspace.
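
The post stops here, but the standard next step (our addition, under the assumption that the columns of \textbf{V} are orthonormal) is a Galerkin condition: require the residual to be orthogonal to the subspace, i.e. \textbf{V}^T\textbf{r}(t) = 0. Multiplying the last equation by \textbf{V}^T then yields the reduced system

    \[\frac{d}{dt}\textbf{q}(t) = \textbf{V}^T \textbf{f}(\textbf{Vq}(t), t),\]

which evolves only the dim(\mathcal{S}) coefficients \textbf{q}(t) instead of the N components of \textbf{w}(t).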

Reproducing a transport instability in convection-diffusion equation

Drawing from the Larson-Bengzon FEM book, I wanted to experiment with transport instabilities. It looks like there might be an instability in my ocean-ice model, but before being able to address that, I wanted to wrap my head around the simplest 1D example one could find. And that is the convection-diffusion equation:

(1)   \begin{equation*} \begin{split} - \epsilon \Delta u + b \cdot \nabla u &= f  \ \ in \ \Omega \\ u &= 0 \ \ on \ \partial \Omega \end{split} \end{equation*}

The first term -\epsilon \Delta u is responsible for the smearing of the velocity field u proportionally to \epsilon, and is thus called the diffusion term. Intuitively, it controls how much the neighbors of a given point x are influenced by the behavior of x; how much velocities (or temperatures: think of whatever you want!) diffuse in the domain.

The second term b \cdot \nabla u controls how much the velocities u are transported in the direction of the vector field b, and is thus called the convective term. A requirement for this problem to be well-posed is that \nabla \cdot b = 0 — otherwise it would mean that we allow velocities to vanish or to come into existence.

The instability is particularly likely to happen close to a boundary where a Dirichlet boundary condition is enforced. The problem is not localized only close to the boundary though, as fluctuations can travel throughout the whole domain and lead the whole solution astray.

Transport instability in 1D

The simplest convection-diffusion equation in 1D has the following form:

(2)   \begin{equation*} -\epsilon u_{xx} + u_x = 1 \ \ in \ (0,1), \ \ u(0) = u(1) = 0 \end{equation*}

whose solution, for small \epsilon, is approximately just u = x. This goes well with the boundary condition at 0, but not with the condition at 1, where the solution needs to go down to 0 quite abruptly to satisfy the boundary condition.

It's easy to simulate the scenario with FEniCS and get this result (with \epsilon = 0.01 and the unit interval divided into 10 nodes):

[Figure: transport instability: the computed solution oscillates, with even and odd nodes following two separate trends]

in which we can make out two different trends: one followed by the odd points and one by the even points! In fact, if we discretize the 1D equation with finite elements we obtain:

(3)   \begin{equation*} -\epsilon \frac{u_{i+1}-2u_i+u_{i-1}}{h^2} + \frac{u_{i+1}-u_{i-1}}{2h} = 1 \end{equation*}

Aha! From this we see that, if \epsilon is not of the same order of magnitude as h (i.e. if \epsilon \ll h), then the first term becomes negligible. The problem then is that the second term contains only u_{i-1} and u_{i+1}, but not u_i. This makes it so that each node only talks to its second closest neighbor, which explains the behavior we saw in the plot before. It's as if the even nodes formed one solution and the odd nodes a separate one!

If \frac{\epsilon}{h} \approx 1, the solution that comes out is quite different:

[Figure: the resolved solution: a linear profile decaying rapidly towards the right boundary]

As we expected: a linear solution rapidly decaying towards the right. This is because in the above plot we had \epsilon = 0.01, h = 0.01, i.e. the unit interval divided into 100 nodes.
Also notice how the problem does not pop up if the boundary conditions agree with the ideal solution (i.e. if the BC on the right is 1 instead of 0).

Solving the transport instability with a stabilization coefficient

The easiest dynamic way to fix the issue is to introduce an artificial component to \epsilon to make sure that the first term of the transport equation is never neglected, regardless of the relationship between mesh size h and \epsilon. This is a stabilization parameter:

(4)   \begin{equation*} -(\epsilon + \beta h_{min}) \ u_{xx} + u_x = 1 \ \ in \ (0,1), \ \ u(0) = u(1) = 0 \end{equation*}

where h_{min} is the smallest mesh diameter. There is no single correct value for \beta: it quite depends on the other values (although it must be 0 \leq \beta < 1). Anyway, a good starting point is \beta = 0.5, which can then be tweaked according to results. This way feels a bit hacky though: "if we can't solve it for \epsilon, let's bump it up a bit" is pretty much the idea behind it.

With this formulation it's also possible to derive what mesh size is needed to actually use a particular value of \epsilon. For example, if we'd like the second derivative term to have a 10^{-4} coefficient, then we need a mesh size h_{min} = \frac{10^{-4}}{\beta} \approx 10^{-3}, achieved with a 300x300 mesh, for example (which you can find out with m=300; mesh=fenics.UnitSquareMesh(m,m); mesh.hmin()). A uniformly fine mesh might not be needed though: it is often enough to have a coarse mesh in regions where not much is happening, and a very fine one in problematic regions (such as boundaries, for this example).

Code — Convection-diffusion equation 1D

from fenics import *
import matplotlib.pyplot as plt

mesh = UnitIntervalMesh(100)
V = FunctionSpace(mesh, 'P', 1)

# Homogeneous Dirichlet conditions on both ends
bcu = [
    DirichletBC(V, Constant(0), 'near(x[0], 0)'),
    DirichletBC(V, Constant(0), 'near(x[0], 1)'),
]

u = TrialFunction(V)
v = TestFunction(V)
u_ = Function(V)
f = Constant(1)
epsilon = Constant(0.01)
beta = Constant(0.5)   # stabilization parameter
hmin = mesh.hmin()     # smallest mesh diameter

# Weak form of -(epsilon + beta*hmin) u'' + u' = f
a = (epsilon + beta*hmin)*dot(u.dx(0), v.dx(0))*dx + u.dx(0)*v*dx
L = f*v*dx

solve(a == L, u_, bcs=bcu)

print("||u|| = %s, ||u||_8 = %s" % (
    round(norm(u_, 'L2'), 2), round(norm(u_.vector(), 'linf'), 3)
))

fig2 = plt.scatter(mesh.coordinates(), u_.compute_vertex_values())
plt.savefig('velxy.png', dpi=300)
plt.close()
What is the Rossby number?

The Rossby number is used to describe whether a phenomenon is large-scale, i.e. whether it is affected by earth's rotation. But how do we actually quantify whether a fluid flow is affected by earth's rotation?

Consider two quantities L and U, with L being a characteristic length scale of the phenomenon (ex. distance between two peaks, distance between two isobars, length of the simulation domain) and U the horizontal velocity scale of the motion. The ratio \frac{L}{U} is the time it takes the motion to cover a distance L at velocity U. If this time is bigger than the period of earth's rotation, then the phenomenon IS affected by the rotation.

So if \frac{L}{U} \geq \frac{1}{\Omega}, then the phenomenon IS a large-scale one. Thus we can define \epsilon = \frac{U}{2L \Omega} and say that for \epsilon \leq 1 a phenomenon is large scale. Phenomena with small Rossby number are dominated by the Coriolis force, while those with large Rossby number are dominated by inertial forces (ex: a tornado). However, rotational effects are weaker at low latitudes (i.e. near the equator, where the local vertical component of the rotation vanishes), so the same flow can have a different Rossby number depending on where on earth it takes place.

(Notice that the 2\Omega appearing above should in theory be the Coriolis frequency f = 2 \Omega \sin(\phi), with \Omega the earth's rotation rate and \phi the latitude. In the geophysical context, flows are mostly horizontal (also due to density stratification in both atmosphere and ocean), and away from the equator \sin(\phi) is of order one, so it is often approximated with 1.)
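
To make this concrete, here is a minimal back-of-the-envelope sketch in Python (the numerical values are illustrative assumptions, not measurements):

OMEGA = 7.29e-5  # earth's rotation rate, rad/s

def rossby(U, L):
    # Rossby number Ro = U / (2 * Omega * L)
    return U / (2 * OMEGA * L)

# Mid-latitude cyclone: U ~ 10 m/s over L ~ 1000 km
print(rossby(U=10, L=1e6))  # ~0.07 -> rotation matters: large-scale
# Tornado: U ~ 50 m/s over L ~ 500 m
print(rossby(U=50, L=500))  # ~700 -> inertia-dominated: rotation negligible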

How do Dirichlet and Neumann boundary conditions affect Finite Element Methods variational formulations?

To solve a classical second-order differential problem

    \begin{equation*} -(au')' + bu' + cu = f \ in \ \Omega \end{equation*}

with FEM, we first need to derive its weak formulation. This is achieved by multiplying the equation by a test function \phi and then integrating by parts to get rid of second order derivatives:

(1)   \begin{equation*} \begin{split} 0 &= \int_\Omega (-(a u')' + b u' + c u - f) \phi dx \\ &= \underbrace{\int_\Omega (a u' \phi' + b u' \phi + c u \phi) dx}_{a(u, \phi)} \underbrace{- \int_\Omega f \phi dx - (a u' \phi)|_{\partial\Omega}}_{L(\phi)} \end{split} \end{equation*}

A typical FEM problem then reads like:

    \begin{equation*} \begin{split} \text{Find } u \in V \ s.t. \ a(u, \phi) + L(\phi) = 0 \ \ \forall \phi \in V, \\ \text{where } V \subseteq H^1(\Omega) = \{ v: \Omega \rightarrow \mathbb{R} : \int_\Omega v^2(x) + v'(x)^2 dx < \infty \}. \end{split} \end{equation*}

What is the difference between imposing Dirichlet boundary conditions (ex. u(\partial \Omega) = k) and Neumann ones (ex. u'(\partial \Omega) = k(x)) from a math perspective? Dirichlet conditions go into the definition of the space V, while Neumann conditions do not: they affect the variational formulation directly.

For example, in one dimension, adding the Dirichlet condition v(0) = v(1) = 0 results in the function space change V = H^1_0(\Omega) = \{ v \in H^1(\Omega) : v(0)=v(1)=0 \}. With this condition, the boundary term (a u' \phi)|_{\partial\Omega} also zeroes out in the variational problem, because the test function \phi belongs to H^1_0(\Omega).

On the other hand, by adding the Neumann condition u'(0) = u'(1) = 0, the space V = H^1(\Omega) does not change, even though the boundary term vanishes from the variational problem in the same way as for the Dirichlet condition. However, that term goes to zero not because of the test function anymore, but because of the value of the derivative u'. If the Neumann condition had specified a different value, such as u'(0) = u'(1) = 5, then the boundary term would not zero out!

In other words, Dirichlet conditions have the effect of further constraining the solution function space, while Neumann conditions only affect the equations.
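
A minimal FEniCS sketch of this difference (an illustrative example of our own: Poisson's equation with a homogeneous Dirichlet condition on the left end and a nonzero Neumann condition on the right):

from fenics import *

mesh = UnitIntervalMesh(50)
V = FunctionSpace(mesh, 'P', 1)

# Dirichlet condition: imposed on the space / linear system,
# it never appears in the variational forms below.
bc = DirichletBC(V, Constant(0), 'near(x[0], 0)')

u = TrialFunction(V)
v = TestFunction(V)
f = Constant(1)
g = Constant(5)  # Neumann datum: u'(1) = 5

# Neumann condition: enters the formulation directly, as the
# boundary term g*v*ds (at x[0] = 0 that term is overridden
# by the Dirichlet condition).
a = dot(u.dx(0), v.dx(0))*dx
L = f*v*dx + g*v*ds

u_ = Function(V)
solve(a == L, u_, bc)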

A gentle (and short) introduction to Gröbner Bases

Taken from my report for a Computer Algebra course.

Motivation

We know there are plenty of methods to solve a system of linear equations (to name a few: Gauss elimination, QR or LU factorization). In fact, it is straightforward to check whether a linear system has any solutions, and if it does, how many of them there are. But what if the system is made of non-linear equations? Groebner bases, and the field of computational algebra, were invented to answer these questions.

In this text we will recap the theory behind single-variable polynomials and extend it to multiple-variable ones, ultimately getting to the definition of Groebner bases.

In some cases, the transition from one to multiple variables is smooth and pretty much an extension of the simple case (for example for the Greatest Common Divisor algorithm). In other cases, however, there are conceptual jumps to be made. To give an example, a single variable polynomial always has a finite number of roots, while this does not hold for multivariate polynomials. Intuitively, the reason is that a polynomial in one variable describes a curve in the plane, which can only intersect the x-axis a finite number of times. On the other hand, a multivariate polynomial describes a surface in space, which will typically intersect the 0-plane in a continuum of points.

Preliminaries

All throughout these notes, it will be important to have in mind some basic algebra definitions.

To begin with, we ask what is the most basic (but useful) structure we can put on a set. For example, given the set of natural numbers, what do we need in order to allow basic manipulation (i.e. summation)? This leads us to the definition of a group.

DEF 1: A group is made of a set \mathcal{G} with one binary operation + such that:

  • The operation is closed: a+b \in \mathcal{G} \ \forall a,b \in \mathcal{G}
  • The operation is associative: a+(b+c)=(a+b)+c \ \forall a,b,c \in \mathcal{G}
  • The operation + has an identity element 0 s.t. g+0 = g \ \forall g \in \mathcal{G}
  • Each element has an inverse element: \forall g \in \mathcal{G}, \exists h \in \mathcal{G} : g+h=0

A group is usually denoted with (\mathcal{G}, +).
Notice that we did not ask anything about commutativity!

Then, the notion of group can be made richer and more complex: first into that of ring, then into that of field.

DEF 2: A ring is a group with an extra operation (\mathcal{G}, +, *) which satisfies the following properties:

  • The operation + is commutative: a+b=b+a \ \forall a,b \in \mathcal{G}
  • The operation * is closed: a*b \in \mathcal{G} \ \forall a,b \in \mathcal{G}
  • The operation * has an identity element 1 s.t. g*1 = g \ \forall g \in \mathcal{G}
  • The operation * is associative: a*(b*c)=(a*b)*c \ \forall a,b,c \in \mathcal{G}
  • The operation * is distributive with respect to +

DEF 3: A field \mathcal{K} is a ring in which all elements except 0 have an inverse with respect to the operation *.

All throughout these notes, the symbol \mathcal{K} will denote a field.

DEF 4: A monomial is a product x_1^{\alpha_1} \cdots x_n^{\alpha_n}, with \alpha_i \in \mathbb{N}. Its degree is the sum of the exponents.

DEF 5: A polynomial is a linear combination of monomials.

We conclude by noting that the space of polynomials with coefficients taken from a field \mathcal{K} forms a ring, denoted by \mathcal{K}[x_1, \cdots, x_n].

Affine varieties and ideals

Our first step towards formalizing the theory for non-linear systems is to understand what the space of solutions looks like. Just as linear spaces are the solution spaces of linear systems, there is an analogous object for non-linear systems, and that is the affine variety.

DEF 6: Given f_1, \cdots, f_s polynomials in \mathcal{K}[x_1, \cdots, x_n], the affine variety over them is the set of their common roots:

    \[V(f_1, \cdots, f_s) = \{ (a_1, \cdots, a_n) \in \mathcal{K}^n : f_i(a_1, \cdots, a_n) = 0 \ \forall i = 1, \cdots, s\}\]

EX 1: V(x_1+x_2-1, x_2+1) = \{ (2, -1) \}

When working with rings, as is our case, the notion of ideal is important. The reason for its importance is that ideals turn out to be kernels of ring homomorphisms; in other words, they are the "good sets" that can be used to take ring quotients.

DEF 7: An ideal is a subset I \subset \mathcal{K}[x_1, \cdots, x_n] such that:

  • 0 \in I
  • it is closed w.r.t +: f+g \in I \ \forall f,g \in I
  • it is closed w.r.t * for elements in the ring: f*g \in I \ \forall f \in I, g \in \mathcal{K}[x_1, \cdots, x_n]

Given some elements of a ring, we might wonder how to build the smallest ideal containing them.

DEF 8: Given polynomials f_1, \cdots, f_s, the ideal generated by them is the set of their combinations with coefficients taken from the ring:

    \[<f_1, \cdots, f_s> = \{ \sum_i^s h_i f_i, \ \ h_i \in \mathcal{K}[x_1, \cdots, x_n] \}\]

Having introduced ideals, we immediately find a result linked to our goal of inspecting non-linear systems: a simple way to check whether a system has no solutions.

THEO 1: If 1 \in I=<f_1, \cdots, f_s>, then V(I) = \emptyset.
PROOF: Since 1 \in I, it must be possible to write it as a combination of the form 1 = \sum h_i f_i. Now, suppose V(I) is not empty: then it contains a point a which is a common root of all the f_i. Evaluating our combination at a would give 1 = \sum h_i(a) f_i(a) = 0, which is absurd.

Groebner bases

Groebner bases give a computational method for solving non-linear systems of equations through an apt sequence of ideal intersections. To state their definition, we first need to know what a monomial ordering is. Intuitively, we can think of such an ordering as a way to compare monomials; the technical definition does not add much more insight. Different orderings are possible.

Once we have a way of ordering monomials, it is also possible to define the leading monomial (denoted LM) of a given polynomial. For single variable polynomials this is pretty straightforward, but for the multivariate case we need to fix an ordering first (some possible options are: lexicographic, graded lexicographic, graded reverse lexicographic).

DEF 9: Given a monomial ordering, a Groebner basis of an ideal I w.r.t. the ordering is a finite subset G = \{ g_1, \cdots, g_s \} \subset I s.t. <LM(g_1), \cdots, LM(g_s)> = <LM(I)>.

This basis is a generating set for the ideal, but notice how it depends on the ordering! Finally, it is possible to prove that every ideal has a Groebner basis (a consequence of Hilbert's basis theorem).

From here on, the rationale is that, given a system of polynomial equations, we can see the polynomials as generators of some ideal. That ideal has a Groebner basis, and there is an algorithm to build one (Buchberger's algorithm). From there, apt ideal operations allow us to solve the system by eliminating variables.

We now describe this elimination algorithm with an example:

(1)   \begin{equation*}  \begin{cases} x^2+y+z=1 \\ x + y^2 +z=1 \\ x+y+z^2=1 \end{cases} \end{equation*}

Given the ideal

    \[I = <x^2+y+z-1, x + y^2 +z-1, x+y+z^2-1>,\]

then a Groebner basis with respect to the lexicographic order (x > y > z) is

(2)   \begin{equation*} \begin{cases} g_1=x+y+z^2-1 \\ g_2=y^2-y-z^2+z \\ g_3=2yz^2+z^4-z^2\\ g_4=z^6-4z^4+4z^3-z^2 \end{cases} \end{equation*}

which can be used to compute the solutions of the initial system (1).

To do so, first consider the ideal I \cap \mathbb{C}[z], which practically corresponds to all polynomials in I where x,y are not present. In our case, we are left with only one element of the basis involving z alone: g_4=z^6-4z^4+4z^3-z^2. The roots of g_4 are 0,1,-1 \pm \sqrt{2}.

The values for z can then be used to find the possible values for y using polynomials g_3, g_2, which only involve y,z. Finally, once the possible values for y,z are known, they can be used to find the corresponding values for x through g_1.

This example will yield the following solutions:

(3)   \begin{equation*} \begin{cases} (1, 0, 0), (0, 1, 0), (0, 0, 1), \\ (-1 + \sqrt{2}, -1 + \sqrt{2}, -1 + \sqrt{2}), \\ (-1 - \sqrt{2}, -1 - \sqrt{2}, -1 - \sqrt{2}) \end{cases} \end{equation*}
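
The whole computation can be checked with a computer algebra system. A minimal SymPy sketch (the normalization of the basis elements may differ slightly from the one printed above):

from sympy import symbols, groebner, solve

x, y, z = symbols('x y z')
polys = [x**2 + y + z - 1,
         x + y**2 + z - 1,
         x + y + z**2 - 1]

# Groebner basis w.r.t. the lexicographic order x > y > z
G = groebner(polys, x, y, z, order='lex')
print(G.exprs)  # matches g_1, ..., g_4 above (up to normalization)

# The last element involves z alone: its roots start the back-substitution
print(solve(G.exprs[-1], z))  # [0, 1, -1 - sqrt(2), -1 + sqrt(2)]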

What is the difference between Finite Differences and Finite Element Methods?

With Finite Differences, we discretize space (i.e. we put a grid on it) and we seek the values of the solution function at the mesh points. We still solve a discretized differential problem.

With Finite Elements, we approximate the solution as a (finite) sum of functions defined on the discretized space. These functions make up a basis of the approximation space, and the most commonly used are the hat functions. We end up with a linear system whose unknowns are the weights associated with each of the basis functions: i.e., how much does each basis function count towards our particular solution of our particular problem?

Brutally put: it is finding the values of the solution function at the grid points (Finite Differences) vs finding the weights of the linear combination of hat functions (Finite Elements).
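
As a tiny illustration of the two viewpoints, here is a minimal NumPy sketch for the toy problem -u'' = 1 on (0,1) with u(0) = u(1) = 0 (the problem choice and the uniform mesh are our own assumptions):

import numpy as np

n, h = 9, 0.1  # 9 interior grid points on (0, 1), spacing h = 1/10

# Finite differences: the unknowns are the values of u at the grid points.
A_fd = (2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
u_fd = np.linalg.solve(A_fd, np.ones(n))

# P1 finite elements on the same mesh: the unknowns are the weights of the
# hat functions; stiffness matrix and load vector come from integrals
# of the basis functions.
A_fem = (2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
b_fem = h * np.ones(n)
w_fem = np.linalg.solve(A_fem, b_fem)

# For this particular problem the two linear systems coincide after scaling
# by h, so the hat-function weights equal the nodal values.
print(np.allclose(u_fd, w_fem))  # True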

A note on the hopes for Fully Homomorphic Signatures


This is taken from my Master Thesis on Homomorphic Signatures over Lattices.
See also But WHY is the Lattices Bounded Distance Decoding Problem difficult?.

What are homomorphic signatures

Imagine that Alice owns a large data set x, over which she would like to perform some computation. In a homomorphic signature scheme, Alice signs the data set with her secret key and uploads the signed data to an untrusted server. The server then performs the computation modeled by a function g, obtaining the result y = g(x) over the signed data.

Alongside the result y, the server also computes a signature \sigma_{g,y} certifying that y is the correct result of g(x). The signature should be short – at any rate, its size must be independent of the size of x. Using Alice's public verification key, anybody can verify the tuple (g,y,\sigma_{g,y}) without having to retrieve the whole data set x, nor to run the computation g(x) again on their own.

The signature \sigma_{g,y} is a homomorphic signature, where homomorphic has the same meaning as in the mathematical definition: 'mapping of a mathematical structure into another one in such a way that the result obtained by applying the operations to elements of the first structure is mapped onto the result obtained by applying the corresponding operations to their respective images in the second one'. In our case, the operations are represented by the function g, and the mapping is from the matrices U_i \in \mathbb{Z}_q^{n \times n} to the matrices V_i \in \mathbb{Z}_q^{n \times m}.

[Figure: diagram of a homomorphic signature scheme]

Notice how the very idea of homomorphic signatures challenges the basic security requirements of traditional digital signatures. In fact, for a traditional signature scheme we require that it should be computationally infeasible to generate a valid signature for a party without knowing that party's private key. Here, we need to be able to generate a valid signature on some data (i.e. results of computation, like g(x)) without knowing the secret key. What we require, though, is that it must be computationally infeasible to forge a valid signature \sigma' for a result y' \neq g(x). In other words, the security requirement is that it must not be possible to cheat on the signature of the result: if the provided result is validly signed, then it must be the correct result.

The next ideas stem from the analysis of the signature scheme devised by Gorbunov, Vaikuntanathan and Wichs. It relies on the Short Integer Solution hard problem on lattices. The scheme presents several limitations and possible improvements, but it is also the first homomorphic signature scheme able to evaluate arbitrary arithmetic circuits over signed data.

Def. – A signature scheme is said to be leveled homomorphic if it can only evaluate circuits of fixed depth d over the signed data, with d being a function of the security parameter. In particular, each signature \sigma_i comes with a noise level \beta_i: if, in combining the signatures into the result signature \sigma, the noise level grows to exceed a given threshold \beta^*, then the signature \sigma is no longer guaranteed to be correct.

Def. – A signature scheme is said to be fully homomorphic if it supports the evaluation of any arithmetic circuit (albeit possibly of fixed depth, i.e. leveled). In other words, there is no limitation on the "richness" of the function to be evaluated, although there may be one on its complexity.

Let us remark that, to date, no (non-leveled) fully homomorphic signature scheme has been devised yet. The state of the art still lies in leveled schemes. On the other hand, a great breakthrough was the invention of a fully homomorphic encryption scheme by Craig Gentry.

On the hopes for homomorphic signatures

The main limitation of the current construction (GVW15) is that verifying the correctness of the computation takes Alice roughly as much time as computing g(x) herself. What she gains, however, is that she does not have to store the data set long term: she can make do with just the signatures.

To us, this limitation makes intuitive sense, and it is worth comparing it with real life. In fact, if one wants to judge the work of someone else, they cannot just look at it without any preparatory work. Instead, they have to have spent (at least) a comparable amount of time studying/learning the content to be able to evaluate the work.

For example, suppose a good musician is asked to evaluate the performance of Beethoven's Ninth Symphony by some orchestra. Notice how anybody with some musical knowledge could evaluate whether what is being played makes sense (for instance, whether it actually is the Ninth Symphony and not something else). On the other hand, evaluating the perfection of the performance is something entirely different and requires years of study in the music field and in-depth knowledge of the particular symphony itself.

That is why hoping to devise a homomorphic scheme in which the verification time is significantly shorter than the computation time looks like hoping for too much. It may be easy to judge whether a result makes sense (for example, it is not a letter if we expected an integer), but it is difficult to assess perfect correctness.

However, there is one more caveat. If Alice has to verify the result of the same function g over two different data sets, then the verification cost is basically the same (amortized verification). Again, this makes sense: when one is skilled enough to evaluate a performance of the Ninth Symphony by the Berlin Philharmonic, they are also skilled enough to evaluate a performance of the same piece by the Vienna Philharmonic, without having to undergo any significant further work other than going and listening to the performance.

So, although it does not seem feasible to devise a scheme that guarantees the correctness of the result and in which the verification complexity is significantly less than the computation complexity, not all hope for improvements is lost. In fact, it may be possible to obtain a scheme in which verification is faster, but the correctness is only probabilistically guaranteed.

Back to our music analogy: we can imagine the evaluator listening to a handful of minutes of the symphony and judging the whole performance from the little they have heard. However, the orchestra has no idea at what time the evaluator will show up, nor for how long they will listen. Clearly, if the orchestra makes a mistake in those few minutes, the performance is not perfect; on the other hand, if what the evaluator hears is flawless, then there is some probability that the whole play is perfect.

Similarly, the scheme may be tweaked to only partially check the signature result, thus assigning a probabilistic measure of correctness. As a rough example, we may think of not computing the homomorphic transformations over the U_i matrices wholly, but only calculating a few randomly-placed entries. Then, if those entries are all correct, it is very unlikely that the result is wrong (and it quickly gets more unlikely as the number of checked entries increases, of course). After all, to cheat, the third party would need to guess several numbers in \mathbb{Z}_q, each having a 1/q likelihood of coming up!

Another idea would be for the music evaluator to delegate another person to check the quality of the performance, giving them some precise and detailed features to look for while hearing the play. In the homomorphic scheme, this may translate into looking for some specific features in the result: some characteristics we know a priori must be in the result. For example, we may know that the result must be a prime number, or must satisfy some constraint, or a relation with something much easier to check. In other words, we may be able to reduce the correctness check to a few fundamental traits that are very easy to check, but also provide some guarantee of correctness. This method seems much harder to model, though.
