Toomas Krips: „Improving performance of secure real-number operations“

On the 12th of June, Toomas Krips defended his PhD on „Improving performance of secure real-number operations“ in which he describes methods on how to improve real-number computations on secure computations. In the following blog post, Toomas tells a story on secure computations, real numbers and his doctoral thesis.

There are two main components of secure computation – analysing data and privacy. Big data is all the rage nowadays and privacy seems to be not entirely dead, so I will give no explanations why we should care about these two. They are quite at odds with each other – it seems that to analyse data we need to access it and privacy seems to be a lot about protecting one’s data from those who would want to access it.

Secure computation (or SC for short) is largely about overcoming that problem. We note that one often does not need raw data as such but the outputs of analysed data. States, researchers, companies, and NGOs often do not need to know about the age, health, or drinking habits of any individual, but rather are interested in large-scale information, such as averages or trends or patterns.

Secure computation lives in this gap. It is a method for taking “encrypted” data, performing operations on it, and outputting an “encrypted” answer to a party that can decrypt it. Note that this means that nobody who is involved in the analysis actually sees any concrete data. The only thing that can be learned is the answer – just as we wanted. The reason that “encrypted” is in scare quotes is that the method used to make the data inaccessible to the computing parties is in some cases not encryption but some other method, such as secret sharing.

We know that it is theoretically possible to compute anything that is computable with secure computation techniques. The problem word here is of course “theoretically”. For many areas, converting standard computation to SC without thinking about it makes the process so vastly inefficient that it is unusable in practice. Thus, it is necessary to develop not only the security part of SC but to also build computational methods for this setting. My thesis is focused on that area of SC.

More specifically, my thesis tries to find tricks and methods to improve real-number (numbers like 2.958903, -3243.019 or 0.000000432) computations on SC. Originally, SC was done on integers (numbers like 23, -5, or 10383923) bits (numbers like, uh, 0 and 1), since they were easier to handle and you could build real numbers out of them theoretically. Which was fine and dandy but now people have been also working on non-integer types directly and one of those people is apparently me.

As any good fairy-tale, my thesis is built on three parts because three papers are what the institute requires from a doctoral student. All three are papers attempt to improve real-number secure computations, but do it with different methods.

First paper
The first paper is a story about two types of real-number data types and their happy reunion.
You see, when people want to represent real numbers in ordinary computations, they use two different data types. Well, mostly just one, but the other one deserves an honourable mention. These are called fixed-point numbers and floating-point numbers.

Fixed-point numbers are a method for representing where we really want to pretend that we are using integers. It is a bit, like if you have to store 142.542003, so you store 142542003 and agree to remember that there was supposed to be a dot there. They are nice because integers are simple to handle (especially in SC, as we saw before), however, they are sort of inflexible – you get the same precision no matter whether your value is between 0.63 and 0.64 or between 1 billion and 2 billion. That is not very nice.

The other type is floating-point numbers. Floating-point numbers are much more flexible and they are the data type that is used everywhere in normal computing for real numbers. However, they are quite a lot more complicated than fixed-point numbers and this is quite bad news in SC – because, as a rule of thumb, in SC, the more complicated your scheme is, the more if-s and when-s and flags there are, the more resource-consuming it will be. Floating-point numbers are slow in SC.

Therefore, the first paper aims to improve floating-point computations for a few functions. Floating-point numbers consist of three parts and the key trick here is noticing that one of those pieces looks a lot like a fixed-point number – and if we temporarily turn that part into a fixed-point number, everything is just so much nicer. However, this is so for only a few functions. For some other functions, the changing process is so costly that it eats up what we won from it.

Second paper
The second paper is a story about using your free lunches. There is a common SC framework called secret sharing that has quite a nice property – namely, it benefits greatly from the parallel composition. This means that it does not really matter if you want to do, say 1000 multiplication simultaneously or just one multiplication – it will take approximately the same amount of time. This property, of course, holds up to a certain number depending on the circumstances.

This feels like a very nice property so you would like to use and abuse it as much as possible. Often, when working with large amounts of data, this property is already fully used, but even then, at some point, you may want to do some operation on only a single value. For example – say that you want to compute the square root of an average of values in your database for statistical purposes – even if your database has a billion entries – there is still only one average. Could we still somehow use this ability?

It turns out that you can (provided a few ifs). We take inspiration from Monte Carlo methods – there you run a very large number of tests, and the amount of tests that pass gives you the answer. Now, note here that these tests can be run in parallel. This is what allows us to use our free lunch.

Third paper
The third paper is a story about a new kid on the block.

Remember the talk about fixed-point numbers and floating-point numbers? They are old and well-used in normal computation, but the rules in secure computation differ from the normal computation. Maybe we should search for more options for building real numbers. So this paper builds a new number type that we call golden numbers.

The name comes from the golden ratio φ that perhaps does not require further introduction in a popular science blog post because it is perhaps one of the most popular-science mathematical concepts there is.

The important part here, however, is that φ satisfies φ^2 = φ+1.

Our new number type is built with the following trick – to represent a real number x, you store a pair of integers, (a, b), so that x ≈ a – φb. For example, (10, 4) would represent
10- 4φ ≈ 3.53. Well, okay, but why should we do something so contrived?

The reason here is that it makes addition and multiplication nice. Say you want to add (a, b) and (c, d). Then the sum would be (a+c, b+d) – you could do the addition of real numbers by only doing integer addition.

Moreover, because φ^2 = φ+1, (a,b)*(c,d)=(a*c+b*d,b*c+a*d-b*d)
– it seems that you can do real number multiplication with only integer addition and multiplication. That would be nice because even with fixed-point numbers you have to do some expensive operations when you multiply in SC. Are we so much ahead of fixed-point and floating-point secure computation.

Well, the story is not that nice as expected. The problem is that when you multiply golden numbers, the integers used for representation grow very fast and you will get to overflow fast. We need to do something about it.

The trick here is that we build a suitable database of golden numbers (x1, y1), ..(xk, yk) so that x1 – φ y1≈ 0, x2 – φ y2≈ 0, …, xk – φ yk≈ 0. Then we can deduct these golden numbers from any other golden section to our hearts’ delight without changing the value of that number. If we do it cleverly enough, we can keep the integers used for representation small enough. Alas, this does not come free, but we have done it more cheaply than the comparable fixed-point operations.

Therefore, this was the story of secure computation, real numbers, three tricks and my doctoral thesis. Hope you liked it.

DSpace – https://dspace.ut.ee/handle/10062/63763

Shakshi Sharma: “Fighting Misinformation in the Digital Age”

Rahul Goel: “Mining Social Well-being Using Mobile Data”

Dmytro Fishman: “Developing a data analysis pipeline for automated protein profiling in immunology”