Monday, May 12, 2014

How to Hide What You're Looking at

Elaine Shi's talk at the MPC workshop in Aarhus covered her ORAM scheme and its usage in the context of garbled circuits.

The general motivation for ORAM is the fact that access patterns can leakage information. For example, imagine a genome database in the cloud. Even when the actual sequences are encrypted, the accessed position might leak the intention of the query because diseases are linked to certain positions in the genome.

Informally, oblivious random access machines are algorithms that allow memory-restricted clients to access the memory on a server without leaking the access pattern. It is known that this incurs at least an overhead logarithmic in the size of the server memory.

After a line of works that all involve some form of reshuffling, the scheme proposed by Shi et al. in 2011 was the first to avoid this. Moreover, it is conceptually easy and can be explained by about two dozen lines of pseudocode. The core idea is to organize the memory as a tree of buckets. Every block is assigned a random leaf node, which is stored on the client in a first step.  Reading is done by looking for the block in all buckets on the path to the assigned leaf node; a block is inserted by assigning it a new random leaf node and writing it to the root bucket. To distribute the blocks over the tree in their respective paths, a data-oblivious so-called eviction is run, which takes different forms in the different variants.

It is easy to prove that the resulting scheme is oblivious because the information given to the server is just the random path of a block, which is used only once. However, there is a chance that a bucket overflows, in which case there is no secure recovery. The neglibility of this event is much harder to prove.

The described construction requires the client to store the random paths of all blocks, which uses a lot of memory. However, it is possible to outsource this to server. Done recursively, this approach reduces the memorage usage on the client size to constant.

With O(log N) block size, the original recursive scheme incurs an overhead of O(log^3 N), which can be reduced to O(log^2 N) if the block size is chosen differently for the position maps

A newer variant dubbed Path ORAM uses logarithmic client memory to store a so-called stash where a whole path of buckets is loaded to for every access. After reading from and writing to the stash, the blocks are written back to a new random path as deeply as possible. Some blocks may remain in the stash. This approach allows a constant bucket size. Together with the recursion explained above, this results in a scheme that has the same overhead as the lower bound but with non-constant client memory and negligible failure probablity.

ORAM is useful in the context of MPC because it facilitates a significant speedup of certain computations. Basic MPC only allows to compute circuits. However, some computations are less efficient when implemented as a circuit instead of a RAM program. For example, accessing an array at one position has constant complexity in a RAM program, but an MPC circuit hiding the position has to access to whole array.

Elaine Shi presented a new compiler that compiles C code to a garbled circuit protocol using ORAM techniques. For the sake of efficiency, it does not allow to branch on secret values but it allows to use data structures such as arrays in an oblivious way. The goals of this compiler are usability for non-experts and efficiency improvements through static analysis. The latter is done by minimizing the use of ORAM and running computations on public and local data locally.

Branching on secret values would require that, for every step of the computation, the instruction is fetched from RAM and all possible instructions are executed in a way such that only the desired instruction takes effect. This includes further accesses to the RAM, which makes this approach quite expensive. Excluding branching does not rule out certain conditional constructs that involve executing both branches and then selecting the desired result according to a secret bit.

Being less oblivious also allows optimizations such as avoiding ORAM techniques for memory accesses that do not depend on secret data or using several ORAM instances for different data structures. For example, Dijkstra's shortest-path algorithm requires an array to store the weights of the edges and a priority queue during the execution.

One can conclude that the simplicity of binary tree ORAM makes it a useful tool to avoid the shortcomings of circuits in multiparty computation.

No comments:

Post a Comment