Scalable Secure Computation of Statistical Functions with Applications to k-Nearest Neighbors

01/22/2018
by Hayim Shaul, et al.

Given a set S of n d-dimensional points, the k-nearest neighbors (KNN) problem is to quickly find the k points in S that are nearest to a query point q. KNN has applications in machine learning, for classification and regression, and in search. The secure version of KNN, where either q or S is encrypted, has applications such as providing services over sensitive data (for example, medical or location data). In this work we present the first scalable and efficient algorithm for solving KNN with Fully Homomorphic Encryption (FHE) that is realized by a polynomial whose degree is independent of n, the number of points. We implemented our algorithm in an open-source library based on the HElib implementation of the Brakerski-Gentry-Vaikuntanathan FHE scheme, and ran experiments on MIT's OpenStack cloud. Our experiments show that, given a query point q, we can find the set of 20 nearest points out of more than 1000 points in less than an hour. Our result introduces a statistical coreset, a data summarization technique that allows statistical functions, such as moments, to be computed efficiently and scalably. As a central tool, we design a new coin toss technique, which we use to build the coreset. This coin toss technique and the computation of statistical functions may be of independent interest.
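
To make the problem statement concrete, here is a minimal plaintext sketch of brute-force KNN. It is not the FHE algorithm described above and involves no encryption; it only illustrates what "the k points in S nearest to q" means. The function name `k_nearest`, the use of Euclidean distance, and the sample data are assumptions made for illustration.

```python
import heapq
import math


def k_nearest(points, query, k):
    """Return the k points closest to `query`, nearest first.

    Plaintext illustration only: distances are computed in the clear,
    and Euclidean distance is an assumed choice of metric.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # nsmallest keeps only k candidates at a time and returns them
    # sorted by increasing distance to the query.
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))


# Hypothetical sample data: n = 5 points in d = 2 dimensions.
S = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0), (10.0, 0.0)]
q = (0.0, 0.1)

nearest = k_nearest(S, q, k=2)
print(nearest)
```

In the clear this selection relies on comparing and sorting distances, which is straightforward. The abstract's emphasis on a polynomial whose degree is independent of n reflects that such operations have to be expressed differently when the data is encrypted.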


