A note on the hopes for Fully Homomorphic Signatures


This is taken from my Master Thesis on Homomorphic Signatures over Lattices.
See also But WHY is the Lattices Bounded Distance Decoding Problem difficult?.

What are homomorphic signatures

Imagine that Alice owns a large data set, over which she would like to perform some computation. In a homomorphic signature scheme, Alice signs the data set with her secret key and uploads the signed data to an untrusted server. The server then performs, over the signed data, the computation modeled by a function g, obtaining the result y = g(x).

Alongside the result y, the server also computes a signature \sigma_{g,y} certifying that y is the correct result for g(x). The signature should be short; at any rate, its size must be independent of the size of x. Using Alice’s public verification key, anybody can verify the tuple (g,y,\sigma_{g,y}) without having to retrieve the whole data set x or to re-run the computation g(x) themselves.
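
The workflow just described can be summarised as an interface. The sketch below is purely illustrative: all names (keygen, sign, evaluate, verify) are hypothetical placeholders, not the API of any real library or of a specific construction.

```python
from typing import Any, Callable, Protocol, Sequence

class HomomorphicSignatureScheme(Protocol):
    """Illustrative interface only; names and types are hypothetical."""

    def keygen(self, security_parameter: int) -> tuple[Any, Any]:
        """Return (secret_key, public_verification_key) for Alice."""
        ...

    def sign(self, secret_key: Any, data_set: Sequence[int]) -> list[Any]:
        """Alice signs her data set x before uploading it to the server."""
        ...

    def evaluate(self, g: Callable[..., int], data_set: Sequence[int],
                 signatures: Sequence[Any]) -> tuple[int, Any]:
        """Run by the untrusted server: returns y = g(x) together with a short
        signature sigma_{g,y} whose size does not depend on the size of x."""
        ...

    def verify(self, verification_key: Any, g: Callable[..., int],
               y: int, sigma: Any) -> bool:
        """Anyone can check the tuple (g, y, sigma) without holding x and
        without re-running g."""
        ...
```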

The signature \sigma_{g,y} is a homomorphic signature, where homomorphic has the same meaning as the mathematical definition: ‘mapping of a mathematical structure into another one in such a way that the result obtained by applying the operations to elements of the first structure is mapped onto the result obtained by applying the corresponding operations to their respective images in the second one’. In our case, the operations are represented by the function g, and the mapping is from the matrices U_i \in \mathbb{Z}_q^{n \times n} to the matrices V_i \in \mathbb{Z}_q^{n \times m}.


Notice how the very idea of homomorphic signatures challenges the basic security requirements of traditional digital signatures. In fact, for a traditional signature scheme we require that it be computationally infeasible to generate a valid signature for a party without knowing that party’s private key. Here, instead, we need to be able to generate a valid signature on some data (i.e. on results of computation, like g(x)) without knowing the secret key. What we require, though, is that it must be computationally infeasible to forge a valid signature \sigma' for a result y' \neq g(x). In other words, the security requirement is that it must not be possible to cheat on the signature of the result: if the provided result is validly signed, then it must be the correct result.

The next ideas stem from the analysis of the signature scheme devised by Gorbunov, Vaikuntanathan and Wichs (GVW15). It relies on the Short Integer Solution (SIS) problem, which is believed to be hard on lattices. The scheme has several limitations and possible improvements, but it is also the first homomorphic signature scheme able to evaluate arbitrary arithmetic circuits over signed data.

Def. – A signature scheme is said to be leveled homomorphic if it can only evaluate circuits of fixed depth d over the signed data, with d being a function of the security parameter. In particular, each signature \sigma_i comes with a noise level \beta_i: if, when combining the signatures into the result signature \sigma, the noise level grows beyond a given threshold \beta^*, then the signature \sigma is no longer guaranteed to be correct.
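
To make the role of the noise level concrete, here is a toy bookkeeping sketch. The threshold and the growth rules (noise adds up under addition gates, gets amplified under multiplication gates) are purely illustrative assumptions, not the actual bounds of GVW15 or of any real scheme.

```python
# Toy noise bookkeeping for a leveled scheme. BETA_STAR and the growth rules
# are illustrative assumptions only, not the bounds of any real construction.
BETA_STAR = 2 ** 40  # assumed threshold beyond which correctness is not guaranteed

def add_gate(beta_1: float, beta_2: float) -> float:
    # assumption: adding two signed values roughly adds their noise levels
    return beta_1 + beta_2

def mul_gate(beta_1: float, beta_2: float, amplification: float = 1000.0) -> float:
    # assumption: multiplication amplifies noise by some scheme-dependent factor
    return amplification * (beta_1 + beta_2)

def still_verifiable(beta: float) -> bool:
    return beta <= BETA_STAR

# A circuit made only of multiplication gates exhausts the noise budget after
# a fixed number of levels, which is why the supported depth d is bounded.
beta, depth = 1.0, 0
while still_verifiable(mul_gate(beta, beta)):
    beta = mul_gate(beta, beta)
    depth += 1
print(f"multiplicative depth supported under these toy assumptions: {depth}")
```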

Def. – A signature scheme is said to be fully homomorphic if it supports the evaluation of any arithmetic circuit (albeit possibly only up to some fixed depth, i.e. leveled). In other words, there is no limitation on the “richness” of the function to be evaluated, although there may be one on its complexity.

Let us remark that, to date, no (non-leveled) fully homomorphic signature scheme has been devised. The state of the art still lies in leveled schemes. For encryption, on the other hand, a great breakthrough was Craig Gentry’s construction of a fully homomorphic encryption scheme.

On the hopes for homomorphic signatures

The main limitation of the current construction (GVW15) is that verifying the correctness of the computation takes Alice roughly as much time as computing g(x) herself. What she gains, however, is that she does not have to store the data set long term, but can make do with the signatures alone.

To us, this limitation makes intuitive sense, and it is worth comparing it with real life. In fact, if one wants to judge the work of someone else, they cannot just look at it without any preparatory work. Instead, they have to have spent (at least) a comparable amount of time studying/learning the content to be able to evaluate the work.

For example, suppose a good musician is asked to evaluate a performance of Beethoven’s Ninth Symphony by some orchestra. Notice how anybody with some musical knowledge could evaluate whether what is being played makes sense (for instance, whether it actually is the Ninth Symphony and not something else). On the other hand, evaluating the perfection of the performance is something entirely different and requires years of study in the music field and in-depth knowledge of the particular symphony itself.

That is why hoping to devise a homomorphic scheme in which the verification time is significantly shorter than the computation time seems like hoping for too much. It may be easy to judge whether the result makes sense (for example, that it is not a letter if we expected an integer), but it is hard to assess its exact correctness.

However, there is one more caveat. If Alice has to verify the result of the same function g over two different data sets, then the verification cost is essentially the same as for one (amortized verification). Again, this makes sense: when one is skilled enough to evaluate the performance of the Ninth Symphony by the Berlin Philharmonic, they are also skilled enough to evaluate the performance of the same piece by the Vienna Philharmonic, without having to undergo any significant further work other than going and listening to the performance.

 

So, although it does not seem feasible to devise a scheme that guarantees the correctness of the result and in which the verification complexity is significantly less than the computation complexity, not all hope for improvements is lost. In fact, it may be possible to obtain a scheme in which verification is faster, but the correctness is only probabilistically guaranteed.

Back to our music analogy, we can imagine the evaluator listening to just a handful of minutes of the symphony and judging the whole performance from the little they have heard. However, the orchestra has no idea at what time the evaluator will show up, nor for how long they will listen. Clearly, if the orchestra makes a mistake in those few minutes, the performance is not perfect; on the other hand, if what the evaluator hears is flawless, then there is some probability that the whole performance is perfect.

Similarly, the scheme may be tweaked to only partially check the signature result, thus assigning a probabilistic measure of correctness. As a rough example, we may think of not computing the homomorphic transformations over the U_i matrices in full, but only calculating a few randomly placed entries. Then, if those entries are all correct, it is very unlikely that the result is wrong (and it quickly gets more unlikely as the number of checked entries increases, of course). After all, to cheat, the third party would need to guess several numbers in \mathbb{Z}_q, each having a 1/q likelihood of coming up!
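
As a back-of-the-envelope illustration of that last remark (the modulus q and the number of checks k below are arbitrary example values, not parameters of any actual scheme):

```python
# Probability that a cheating server survives k random spot checks, assuming
# each forged entry matches the honest value in Z_q only with probability 1/q.
q = 2 ** 13   # example modulus
k = 20        # number of randomly placed entries the verifier re-computes

p_undetected_forgery = (1 / q) ** k
print(f"chance of an undetected forgery: {p_undetected_forgery:.3e}")  # ~5e-79
```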

Another idea would be for the music evaluator to delegate another person to check the quality of the performance, giving them some precise and detailed features to look for while listening. In the homomorphic scheme, this may translate into looking for some specific features in the result, some characteristics we know a priori must hold. For example, we may know that the result must be a prime number, or must satisfy some constraint or relation that is much easier to check. In other words, we may be able to reduce the correctness check to a few fundamental traits that are very easy to verify, but still provide some guarantee of correctness. This method seems much harder to model, though.

But WHY is the Lattices Bounded Distance Decoding Problem difficult?


This is taken from my Master Thesis on Homomorphic Signatures over Lattices.

Introduction to lattices and the Bounded Distance Decoding Problem

A lattice is a discrete subgroup \mathcal{L} \subset \mathbb{R}^n, where the word discrete means that each x \in \mathcal{L} has a neighborhood in \mathbb{R}^n that, when intersected with \mathcal{L}, contains x only. One can think of lattices as grids, although the coordinates of the points need not be integers. Indeed, every (full-rank) lattice is isomorphic to \mathbb{Z}^n as a group, but it may be a grid of points with non-integer coordinates.

Another very nice way to define a lattice is: given n independent vectors b_i \in \mathbb{R}^n, the lattice \mathcal{L} generated by that basis is the set of all linear combinations of them with integer coefficients:

    \[\mathcal{L} = \{\sum\limits_{i=1}^{n} z_i b_i \ : \ z_i \in \mathbb{Z} \}\]
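
For instance, a small window of such a lattice can be enumerated directly from this definition (the two-dimensional basis below is just an arbitrary example):

```python
import itertools
import numpy as np

# A lattice point is an integer combination of the basis vectors.
basis = np.array([[0.5, 0.0],    # b_1
                  [0.0, 1.25]])  # b_2  (rows are the basis vectors)

points = [z1 * basis[0] + z2 * basis[1]
          for z1, z2 in itertools.product(range(-2, 3), repeat=2)]

print(points[:3])  # [array([-1. , -2.5]), array([-1.  , -1.25]), array([-1.,  0.])]
```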

Then, we can go on to define the Bounded Distance Decoding problem (BDD), which is used in lattice-based cryptography (more specifically, for example in trapdoor homomorphic encryption) and believed to be hard in general.

Given an arbitrary basis of a lattice \mathcal{L}, and a point x \in \mathbb{R}^n not necessarily belonging to \mathcal{L}, find the point of \mathcal{L} that is closest to x. We are also guaranteed that x is very close to one of the lattice points. Notice how we are relying on an arbitrary basis – if we claim to be able to solve the problem, we should be able to do so with any basis.

Bounded Distance Decoding problem example: given the blue points, devise an algorithm that pinpoints the one closest to the red target.

Now, as the literature goes, this is a problem that is hard in general, but easy if the basis is nice enough. So, for example for encryption, the idea is that we can encode our secret message as a lattice point, and then add to it some small noise (i.e. a small element v \in \mathbb{R}^n). This basically generates an instance of the BDD problem, and then the decoding can only be done by someone who holds the good basis for the lattice, while those having a bad basis are going to have a hard time decrypting the ciphertext.

However, although of course there is no proof of this (it is a problem believed to be hard), I wanted to get at least some clue as to why it should be easy with a nice basis and hard with a bad one (GGH is an example of a scheme that employs techniques based on this).

So now to our real question: why is the Bounded Distance Decoding problem hard (or easy)? Nobody I asked could answer my questions, nor could I find any resource detailing it, so here come my intuitions.

Why the Bounded Distance Decoding problem is easy with a nice basis

Let’s first say what a good basis is. A basis is good if it is made of nearly orthogonal short vectors. This is a pretty vague definition, so let’s make it more specific (and more restrictive): we want a basis in which each vector b_i is of the form (0, ..., 0, k, 0, ..., 0) for some k \in \mathbb{R}. One can imagine k being smaller than some fixed bound, say 10. (This shortness requirement is pretty vague, and its role will become clearer later.) In other words, a nice basis is the canonical one, in which each vector has been re-scaled by an independent real factor.

To get a flavor of why the Bounded Distance Decoding problem is easy with a nice basis, let’s work through an example. Consider \mathbb{R}^2, with b_0 = (\frac{1}{2}, 0), b_1 = (0, \frac{5}{4}) as basis vectors. Suppose we are given x = (\frac{3}{7}, \frac{9}{10}) as challenge point. It does not belong to the lattice generated by b_0, b_1, but it is only (\frac{1}{14}, \frac{7}{20}) away from the point (\frac{1}{2}, \frac{5}{4}), which does belong to the lattice.

Now, what does one have to do to solve this problem? Let’s get a graphical feeling for it and formalize it.

Bounded Distance Decoding problem example with a good basis.

We are looking for the lattice point closest to x. So, sitting on x, we are looking for the linear combination of the basis vectors with integer coefficients that is closest to us. Breaking it down component-wise, we are looking for k, j \in \mathbb{Z} and for offsets y, z \in \mathbb{R} as small as possible (in absolute value) such that:

    \[\begin{cases} \frac{3}{7} + y = \frac{1}{2} k \\ \frac{9}{10} + z = \frac{5}{4} j \end{cases}\]

This may seem a difficult optimization problem, but in truth it is very simple! The reason is that each of the equations is independent, so we can solve them one by one – the individual minimum problems are easy and can be solved quickly. (One could also put boundaries on y, z with respect to the norm of the basis vectors, but it is not vital now.)

So the overall complexity of solving BDD with a good basis is \Theta(n) times the cost of a single one-dimensional minimum problem, which is okay.
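
A quick sketch of this coordinate-wise resolution, using the numbers of the example above (the closest multiple of each scaling factor is found simply by rounding):

```python
# Solve BDD for a "stretched canonical" basis by rounding each coordinate of x
# to the nearest multiple of the corresponding scaling factor.
k = [1 / 2, 5 / 4]     # scaling factors of the good basis b_0, b_1
x = [3 / 7, 9 / 10]    # challenge point

coefficients = [round(x_i / k_i) for x_i, k_i in zip(x, k)]
closest_point = [c * k_i for c, k_i in zip(coefficients, k)]

print(coefficients)   # [1, 1]
print(closest_point)  # [0.5, 1.25], i.e. the lattice point (1/2, 5/4)
```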

Why the Bounded Distance Decoding problem is hard with a bad basis

A bad basis is any basis that fails at least one of the two conditions of a nice basis: it may be poorly orthogonal, or it may be made of long vectors. We will later try to understand what role these properties play in solving the problem; for now, let’s just consider an example again.

Another basis for the lattice generated by the nice basis we picked before ((\frac{1}{2}, 0), (0, \frac{5}{4})) is b_0 = (\frac{9}{2}, \frac{5}{4}), b_1 = (5, \frac{5}{4}). (Indeed, b_0 = 9 \cdot (\frac{1}{2}, 0) + (0, \frac{5}{4}) and b_1 = 10 \cdot (\frac{1}{2}, 0) + (0, \frac{5}{4}), and this change of basis is invertible over the integers, so the two bases generate the same lattice.) This is a bad basis: its vectors are long and nearly parallel.

Bounded Distance Decoding problem example with a bad basis.

Let’s write down the system of equations coordinate-wise, as we did for the nice basis. We are looking for k, j \in \mathbb{Z} and for offsets y, z \in \mathbb{R} as small as possible (in absolute value) such that:

    \[\begin{cases} \frac{3}{7} + y = \frac{9}{2} k + 5 j \\ \frac{9}{10} + z = \frac{5}{4} k + \frac{5}{4} j \end{cases}\]

Now look! This may look similar to before, but this time it really is a system: the equations are no longer independent, since the integer unknowns k and j appear in both of them. With four unknowns and only two equations, the system is under-determined, so in principle there are infinitely many solutions; on top of that, we are looking for the solution that minimizes the offsets. Especially for large n, solving this optimization problem can definitely be non-trivial!
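
The contrast can be seen numerically with Babai’s round-off heuristic (express x in the given basis, round the coefficients to integers, map back), which is essentially the coordinate-wise strategy above applied blindly to whatever basis one holds. The snippet below only illustrates the running two-dimensional example:

```python
import numpy as np

def babai_round_off(basis: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Basis vectors are the columns of `basis`."""
    coefficients = np.round(np.linalg.solve(basis, x))  # real coordinates, rounded
    return basis @ coefficients

x = np.array([3 / 7, 9 / 10])

good = np.array([[0.5, 0.00],
                 [0.0, 1.25]])   # columns: (1/2, 0), (0, 5/4)
bad = np.array([[4.50, 5.00],
                [1.25, 1.25]])   # columns: (9/2, 5/4), (5, 5/4)

print(babai_round_off(good, x))  # [0.5  1.25]  -> the true closest point
print(babai_round_off(bad, x))   # [-3.  0.]    -> nowhere near x
```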

On the differences between a good and a bad basis

So far so good: we have seen why the Bounded Distance Decoding problem is easy with a good basis and difficult with a bad one. But still, what exactly does a good basis have that makes the problem easy? How do its properties relate to the ease of solution?

We enforced two conditions: orthogonality and shortness. Actually, we even required something stronger than orthogonality: that the good basis be basically a stretched version of the canonical one, i.e. that each vector have only one non-zero entry.

Let’s think for a second in terms of the canonical basis \{e_i = (0, ..., 0, 1, 0, ..., 0)\}. This is what makes the minimum problems independent and allows for an easy resolution of the BDD problem. However, when dealing with cryptography, we cannot always use the same basis; we need some randomness. That is why we required a set of independent vectors, each having only one non-zero coordinate: this is the main feature that makes the problem easy (at least for the party holding the good basis).

We also asked for shortness. This does not give an immediate advantage to whoever holds the good basis, but it makes the problem harder for those holding the bad one. The idea is that, given a challenge point x \in \mathbb{R}^n, if we have short basis vectors we can take small steps from it and look around for nearby points. It may take some time to find the best one, but we are still not looking totally astray. Instead, if we have long vectors, every time we use one we have to make a big leap in one direction. In other words, whoever has the good basis knows the step size of the lattice and can take steps of appropriate size, slowly poking around; whoever has the bad basis takes huge jumps and may have a hard time pinpointing the right point.

It is true, though, that the features of a good basis usually only include shortness and orthogonality, and not the “rescaling of the canonical basis” we assumed in the first place. So, let’s consider a basis of that kind, like \{v_1 = (\frac{\sqrt{3}}{2}, \frac{1}{2}), v_2 = (-\frac{1}{2}, \frac{\sqrt{3}}{2})\}, which is just the canonical basis rotated by 30 degrees. If we wrote down the minimum problem we would have to solve given a challenge point, it would look pretty similar to the one with the bad basis, with the equations not being independent. Looks like bad luck, huh?

However, not all hope is lost! In fact, we can look for the rotation matrix that turns that basis into a stretching of the canonical one, obtaining v_1', v_2'. Then we can rotate the challenge point x as well, and solve the problem with respect to those new basis vectors. Of course, the point we find is not yet the solution to the original problem, but we can easily rotate it back to get the real solution!
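
Here is a minimal sketch of this rotate, solve, rotate-back idea, assuming the orthonormal basis above (the canonical basis rotated by 30 degrees) and an arbitrary challenge point:

```python
import numpy as np

theta = np.pi / 6                    # the basis above: e_1, e_2 rotated by 30 degrees
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
# columns of R are v_1 = (sqrt(3)/2, 1/2) and v_2 = (-1/2, sqrt(3)/2)

x = np.array([0.4, 1.3])             # arbitrary challenge point

x_rotated = R.T @ x                  # rotate x into the canonical frame
coefficients = np.round(x_rotated)   # coordinates now decouple: just round
closest = R @ coefficients           # rotate the solution back to the original frame

print(coefficients)  # [1. 1.]
print(closest)       # [0.366... 1.366...], i.e. v_1 + v_2
```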

However, given that using a basis of this kind does not make the opponent’s job any harder, but only increases the computational cost for the honest party, I do not see why it should ever be used. Instead, I guess the best choice for a good basis is a stretched canonical one.

(One remark: a truly orthogonal basis, however long its vectors, actually makes the problem easy, because the coordinates decouple exactly as in the stretched canonical case. What really leaves an opponent out of luck is a basis that is both long and far from orthogonal.)
