Yet another Improvement of Plantard Arithmetic for Faster Kyber on Low-end 32-bit IoT Devices

09/01/2023
by Junhao Huang, et al.

This paper presents another improved version of Plantard arithmetic that speeds up Kyber implementations on two low-end 32-bit IoT platforms (ARM Cortex-M3 and RISC-V) without SIMD extensions. Specifically, we further enlarge the input range of Plantard arithmetic without modifying its computation steps. After tailoring Plantard arithmetic to Kyber's modulus, we show that the input range of the Plantard multiplication by a constant is at least 2.45 times larger than that of the original design in TCHES 2022. We then present two optimization techniques for efficient Plantard arithmetic on Cortex-M3 and RISC-V, and show that Plantard arithmetic supersedes both Montgomery and Barrett arithmetic on low-end 32-bit platforms. Building on the enlarged input range and the efficient implementation of Plantard arithmetic on these platforms, we propose various optimization strategies for NTT/INTT. By taking advantage of the larger input range, we minimize or entirely eliminate the modular reduction of coefficients in NTT/INTT. Furthermore, we propose two memory optimization strategies that reduce 23.50 implementation when compared to its counterpart on Cortex-M4. These optimizations make the speed-version implementation more feasible on low-end IoT devices, and our NTT/INTT implementation shows considerable speedups compared to the state-of-the-art work. Overall, we demonstrate the applicability of the speed-version Kyber implementation on memory-constrained IoT platforms and set new speed records for Kyber on these platforms.
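To make the arithmetic concrete, the following is a minimal Python sketch of signed Plantard multiplication for Kyber's modulus q = 3329. It is not the authors' implementation. It assumes the word size l = 16 and the parameter alpha = 3, it does not reproduce the enlarged input range or the Cortex-M3/RISC-V optimizations described in the abstract, and the names `to_signed32`, `plantard_mul`, `plantard_const`, and `plantard_mul_const` are hypothetical.

```python
# Sketch of signed Plantard multiplication for Kyber (assumed: l = 16, alpha = 3).
Q = 3329                       # Kyber modulus
L = 16                         # assumed half word size; products live in 2*L = 32 bits
ALPHA = 3                      # assumed rounding parameter, valid because Q < 2**(L - ALPHA - 1)
MOD = 1 << (2 * L)             # 2^32
QINV = pow(Q, -1, MOD)         # q^{-1} mod 2^32 (needs Python 3.8+)


def to_signed32(x: int) -> int:
    """Reduce x modulo 2^32 into the signed range [-2^31, 2^31)."""
    x &= MOD - 1
    return x - MOD if x >= (MOD >> 1) else x


def plantard_mul(a: int, b: int) -> int:
    """Return r with r = -a*b*2^(-32) (mod Q) and |r| <= (Q-1)/2.

    Valid for small inputs, e.g. |a|, |b| <= Q * 2**ALPHA.
    """
    p = to_signed32(a * b * QINV)        # low 32 bits, read as signed
    p_hi = p >> L                        # arithmetic shift: floor(p / 2^16)
    return ((p_hi + (1 << ALPHA)) * Q) >> L


def plantard_const(b: int) -> int:
    """Precompute a 32-bit constant for b so that no domain correction is needed later."""
    b_plant = (b * -MOD) % Q             # b * (-2^32) mod Q cancels the -2^(-32) factor
    return (b_plant * QINV) % MOD


def plantard_mul_const(a: int, b_const: int) -> int:
    """Return a*b mod Q in signed form, where b_const = plantard_const(b)."""
    p = to_signed32(a * b_const)
    return (((p >> L) + (1 << ALPHA)) * Q) >> L


if __name__ == "__main__":
    zeta = plantard_const(17)            # 17 is used here only as an example constant
    r = plantard_mul_const(1234, zeta)
    print(r % Q == (1234 * 17) % Q)
```

The constant variant is the form that matters for NTT butterflies: a twiddle factor is folded into one 32-bit constant ahead of time, so each modular multiplication costs two multiplications and two shifts, and the result comes back already centered around zero.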
