Voltage Scaling for Partitioned Systolic Array in A Reconfigurable Platform

02/13/2021
by Rourab Paul, et al.

The rapid adoption of Field Programmable Gate Arrays (FPGAs) has accelerated research on hardware implementations of Deep Neural Networks (DNNs). Among DNN processors, domain-specific architectures such as Google's Tensor Processing Unit (TPU) have outperformed conventional GPUs. However, implementations of TPUs in reconfigurable hardware should emphasize energy savings to meet green computing requirements. Voltage scaling, a popular approach to energy savings, is delicate in FPGAs because it can cause timing failures if not applied carefully. In this work, we present an ultra-low-power FPGA implementation of a TPU for edge applications. We divide the systolic array of a TPU into different FPGA partitions, where each partition uses a different near-threshold computing (NTC) biasing voltage to run its FPGA cores. The biasing voltage for each partition is first estimated by the proposed offline schemes and then further calibrated by the proposed online scheme. We study the partitioning of the FPGA using four clustering algorithms based on the slack values of different design paths. To overcome the timing failures caused by NTC, higher-slack paths are placed in lower-voltage partitions and lower-slack paths are placed in higher-voltage partitions. The proposed architecture is simulated on an Artix-7 FPGA using the Vivado design suite and a Python tool. The simulation results substantiate the implementation of a voltage-scaled TPU in FPGAs and also justify its power efficiency.
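The slack-based placement rule described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the path names, slack values, and voltage levels are hypothetical, and a plain one-dimensional k-means stands in for whichever of the four clustering algorithms the paper evaluates. It only shows the idea that paths with more timing slack can tolerate a lower biasing voltage.

```python
# Illustrative sketch only: path names, slack values, and voltage levels are
# hypothetical; the clustering is a plain 1-D k-means, not necessarily one of
# the four algorithms studied in the paper.

def kmeans_1d(values, k, iterations=50):
    """Cluster scalar values into k groups; return the centers sorted ascending."""
    ordered = sorted(values)
    # Spread the initial centers evenly across the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        new_centers = [sum(g) / len(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)


def assign_partitions(slack_ns, voltages):
    """Map each path to a voltage: more slack means a lower biasing voltage."""
    centers = kmeans_1d(list(slack_ns.values()), len(voltages))
    # Lowest-slack cluster gets the highest voltage, highest-slack the lowest.
    ranked_voltages = sorted(voltages, reverse=True)
    assignment = {}
    for path, slack in slack_ns.items():
        cluster = min(range(len(centers)), key=lambda c: abs(slack - centers[c]))
        assignment[path] = ranked_voltages[cluster]
    return assignment


# Hypothetical per-path slack (ns) and candidate partition voltages (V).
slack_ns = {"mac_0": 0.4, "mac_1": 0.5, "mac_2": 2.1,
            "mac_3": 2.3, "mac_4": 4.8, "mac_5": 5.0}
voltages = [0.6, 0.8, 1.0]

partition_voltage = assign_partitions(slack_ns, voltages)
for path, volts in sorted(partition_voltage.items()):
    print(path, volts)
```

In this toy input the two tightest paths land in the highest-voltage partition and the two most relaxed paths in the lowest. In a real flow the slack values would presumably come from a timing report, and the voltages from the offline estimate followed by online calibration.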

research
08/17/2022

Near Threshold Computation of Partitioned Ring Learning With Error (RLWE) Post Quantum Cryptography on Reconfigurable Architecture

Ring Learning With Error (RLWE) algorithm is used in Post Quantum Crypto...
research
05/12/2019

Reconfigurable Hardware Implementation of the Successive Overrelaxation Method

In this chapter, we study the feasibility of implementing SOR in reconfi...
research
10/23/2017

Amorphous Dynamic Partial Reconfiguration with Flexible Boundaries to Remove Fragmentation

Dynamic partial reconfiguration (DPR) allows one region of a field-prog...
research
06/03/2021

Multiplierless MP-Kernel Machine For Energy-efficient Edge Devices

We present a novel framework for designing multiplierless kernel machine...
research
04/07/2021

NullaNet Tiny: Ultra-low-latency DNN Inference Through Fixed-function Combinational Logic

While there is a large body of research on efficient processing of deep ...
research
03/10/2018

Integrated Optimization of Partitioning, Scheduling and Floorplanning for Partially Dynamically Reconfigurable Systems

Confronted with the challenge of high performance for applications and t...
research
05/21/2019

FPGA-based Mining of Lyra2REv2 Cryptocurrencies

Lyra2REv2 is a hashing algorithm that consists of a chain of individual ...
