SIMC 2.0: Improved Secure ML Inference Against Malicious Clients
In this paper, we study the problem of secure ML inference against a malicious client and a semi-trusted server such that the client only learns the inference output while the server learns nothing. This problem is first formulated by Lehmkuhl et al. with a solution (MUSE, Usenix Security'21), whose performance is then substantially improved by Chandran et al.'s work (SIMC, USENIX Security'22). However, there still exists a nontrivial gap in these efforts towards practicality, giving the challenges of overhead reduction and secure inference acceleration in an all-round way. We propose SIMC 2.0, which complies with the underlying structure of SIMC, but significantly optimizes both the linear and non-linear layers of the model. Specifically, (1) we design a new coding method for homomorphic parallel computation between matrices and vectors. It is custom-built through the insight into the complementarity between cryptographic primitives in SIMC. As a result, it can minimize the number of rotation operations incurred in the calculation process, which is very computationally expensive compared to other homomorphic operations e.g., addition, multiplication). (2) We reduce the size of the garbled circuit (GC) (used to calculate nonlinear activation functions, e.g., ReLU) in SIMC by about two thirds. Then, we design an alternative lightweight protocol to perform tasks that are originally allocated to the expensive GCs. Compared with SIMC, our experiments show that SIMC 2.0 achieves a significant speedup by up to 17.4× for linear layer computation, and at least 1.3× reduction of both the computation and communication overheads in the implementation of non-linear layers under different data dimensions. Meanwhile, SIMC 2.0 demonstrates an encouraging runtime boost by 2.3∼ 4.3× over SIMC on different state-of-the-art ML models.
READ FULL TEXT