GALA: Greedy ComputAtion for Linear Algebra in Privacy-Preserved Neural Networks

05/05/2021
by   Qiao Zhang, et al.

Machine Learning as a Service (MLaaS) is enabling a wide range of smart applications on end devices. However, privacy-preserved computation remains expensive. Our investigation finds that the most time-consuming component of HE-based linear computation is a series of Permutation (Perm) operations, which are imperative for the dot product and convolution in privacy-preserved MLaaS. To this end, we propose GALA: Greedy computAtion for Linear Algebra in privacy-preserved neural networks, which views HE-based linear computation as a series of homomorphic Add, Mult, and Perm operations and chooses the least expensive operation at each linear computation step to reduce the overall cost. GALA makes the following contributions: (1) it introduces a row-wise weight-matrix encoding and combines it with the share generation needed for GC-based nonlinear computation, reducing the Perm operations for the dot product; (2) it designs a first-Add-second-Perm approach (named kernel grouping) to reduce the Perm operations for convolution. As such, GALA efficiently reduces the cost of HE-based linear computation, a critical building block in almost all recent frameworks for privacy-preserved neural networks, including GAZELLE (USENIX Security'18), DELPHI (USENIX Security'20), and CrypTFlow2 (CCS'20). With its deep optimization of HE-based linear computation, GALA can be integrated as a plug-and-play module into these systems to further boost their efficiency. Our experiments show that it achieves a significant speedup of up to 700x for the dot product and 14x for convolution under different data dimensions. Meanwhile, GALA boosts runtime by 2.5x, 2.7x, 3.2x, 8.3x, 7.7x, and 7.5x over GAZELLE, and by 6.5x, 6x, 5.7x, 4.5x, 4.2x, and 4.1x over CrypTFlow2, on AlexNet, VGG, ResNet-18, ResNet-50, ResNet-101, and ResNet-152, respectively.
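To make the cost model concrete, below is a minimal plaintext sketch (no actual HE; a numpy vector stands in for the slots of a packed ciphertext and np.roll for Perm). It shows the diagonal-encoded matrix-vector product used in GAZELLE-style linear layers, whose Perm count grows with the matrix dimension, and the linearity identity Perm(a, d) + Perm(b, d) = Perm(a + b, d) that a first-Add-second-Perm reordering exploits to trade expensive Perms for cheap Adds. The function names and toy dimensions here are illustrative assumptions, not code from the paper.

```python
import numpy as np

def perm(ct, d):
    """Stand-in for a homomorphic Permutation (Perm): rotate slots left by d."""
    return np.roll(ct, -d)

def diag_matvec(W, x):
    """Baseline diagonal encoding of a square matrix-vector product:
    y = sum_d diag_d(W) * Perm(x, d), costing n - 1 Perms for an n x n W."""
    n = W.shape[0]
    y, perms = np.zeros(n, dtype=W.dtype), 0
    for d in range(n):
        diag_d = np.array([W[i, (i + d) % n] for i in range(n)])  # d-th diagonal
        xd = x if d == 0 else perm(x, d)
        perms += (d != 0)
        y = y + diag_d * xd          # one Mult + one Add per diagonal
    return y, perms

# First-Add-second-Perm rests on the linearity of Perm: ciphertexts that
# need the same rotation can be Added first and rotated once,
#   perm(a, d) + perm(b, d) == perm(a + b, d)   # 2 Perms -> 1 Perm

rng = np.random.default_rng(0)
n = 8
W = rng.integers(0, 10, (n, n))
x = rng.integers(0, 10, n)
y, perms = diag_matvec(W, x)
assert np.array_equal(y, W @ x)          # matches the plain dot product
a, b, d = rng.integers(0, 10, n), rng.integers(0, 10, n), 3
assert np.array_equal(perm(a, d) + perm(b, d), perm(a + b, d))
print(f"diagonal encoding used {perms} Perms for n = {n}")  # -> 7
```

Since a homomorphic Perm is far costlier than a homomorphic Add, reordering Adds before Perms wherever rotation amounts coincide directly shrinks the dominant term in the linear-layer cost, which is the intuition behind the kernel-grouping and row-wise-encoding optimizations described above.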


Related research

09/04/2022  Joint Linear and Nonlinear Computation across Functions for Efficient Privacy-Preserving Neural Network Inference
While it is encouraging to witness the recent development in privacy-pre...

10/19/2022  RSC: Accelerating Graph Neural Networks Training via Randomized Sparse Computations
The training of graph neural networks (GNNs) is extremely time consuming...

06/25/2023  Im2win: Memory Efficient Convolution On SIMD Architectures
Convolution is the most expensive operation among neural network operati...

03/25/2020  ESSOP: Efficient and Scalable Stochastic Outer Product Architecture for Deep Learning
Deep neural networks (DNNs) have surpassed human-level accuracy in a var...

07/23/2019  PointAtrousGraph: Deep Hierarchical Encoder-Decoder with Atrous Convolution for Point Clouds
Motivated by the success of encoding multi-scale contextual information ...

05/03/2019  Convolution is outer product
The inner product operation between tensors is the corner stone of deep ...
