Privacy-preserving Similarity Calculation of Speaker Features Using Fully Homomorphic Encryption

Recent advances in machine learning are making Automatic Speech Recognition (ASR) more accurate and practical, as the rising number of smart devices with built-in voice processing shows. This poses serious privacy threats, because speech contains unique biometric characteristics and personal data. The privacy concern can be mitigated if the voice features are processed in the encrypted domain. Within this context, this paper proposes an algorithm that redesigns the back-end of a speaker verification system using fully homomorphic encryption. The solution exploits the Cheon-Kim-Kim-Song (CKKS) fully homomorphic encryption scheme to obtain a real-time, non-interactive solution. It contains a novel approach based on the Newton-Raphson method to overcome a limitation of the CKKS scheme, namely that it cannot directly compute the inverse square root of an encrypted number. This yields an efficient solution with lower multiplicative depth for a negligible loss in accuracy. The proposed algorithm is validated on a well-known speech dataset. It performs encrypted-domain verification in real time (with less than 1.3 seconds of delay) at the cost of a 2.8% loss in equal error rate compared to plain-domain verification.
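To illustrate why an inverse square root matters here, the following plain-Python sketch computes a cosine similarity score using only additions and multiplications, the operations CKKS supports natively. It uses the textbook Newton-Raphson iteration for 1/sqrt(x) and is not the paper's algorithm: the function names, the initial guess `y0`, and the iteration count are assumptions chosen for illustration, and the values are ordinary floats rather than ciphertexts.

```python
# Minimal sketch (assumption: plain floats stand in for CKKS ciphertexts).
# Only + and * are applied to the data, mimicking what CKKS evaluates natively.

def inv_sqrt_newton(x, y0, iterations=10):
    """Approximate 1/sqrt(x) with the textbook Newton-Raphson iteration
    y <- 0.5 * y * (3 - x * y * y).
    Converges when the initial guess satisfies 0 < y0 < sqrt(3 / x)."""
    y = y0
    for _ in range(iterations):
        y = 0.5 * y * (3.0 - x * y * y)
    return y


def cosine_similarity_no_division(a, b, y0=0.1, iterations=10):
    """Cosine similarity without division or square root.
    y0 and iterations are hypothetical defaults, not values from the paper."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a_sq = sum(p * p for p in a)
    norm_b_sq = sum(q * q for q in b)
    inv_norm_a = inv_sqrt_newton(norm_a_sq, y0, iterations)
    inv_norm_b = inv_sqrt_newton(norm_b_sq, y0, iterations)
    return dot * inv_norm_a * inv_norm_b


if __name__ == "__main__":
    enrolled = [1.0, 2.0, 3.0]
    probe = [2.0, 4.0, 6.0]  # parallel to `enrolled`
    print(cosine_similarity_no_division(enrolled, probe))
```

In an encrypted setting each iteration consumes multiplicative levels, so the number of iterations and the quality of the initial guess trade accuracy against depth, which is the trade-off the abstract refers to.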
