
Deploying Customized Data Representation and Approximate Computing in Machine Learning Applications

by Mahdi Nazemi, et al.
University of Southern California

Major advancements in general-purpose and customized hardware have been key enablers of the versatility and pervasiveness of machine learning models such as deep neural networks. To sustain this ubiquitous deployment and cope with the computational and storage complexity of these models, several solutions have been employed, including low-precision fixed-point representation of model parameters and approximate arithmetic operations. Studying the potency of such solutions in different applications requires integrating them into existing machine learning frameworks for high-level simulation, as well as implementing them in hardware to analyze their effects on power/energy dissipation, throughput, and chip area. Lop is a library for design space exploration that bridges the gap between machine learning and efficient hardware realization. It comprises a Python module, which can be integrated with some existing machine learning frameworks and implements various customizable data representations, including fixed-point and floating-point, as well as approximate arithmetic operations. Furthermore, it includes a highly parameterized Scala module, which allows synthesizing hardware based on these data representations and arithmetic operations. Lop allows researchers and designers to quickly compare the quality of their models under various data representations and arithmetic operations in Python, and to contrast the hardware cost of the viable representations by synthesizing them on their target platforms (e.g., FPGA or ASIC). To the best of our knowledge, Lop is the first library that allows both software simulation and hardware realization using customized data representations and approximate computing techniques.
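To make the idea of simulating a customized fixed-point representation in software more concrete, the following is a minimal Python sketch. It is not Lop's API: the function names `to_fixed_point` and `max_abs_error`, the signed two's-complement word format, and the round-to-nearest-with-saturation policy are all assumptions chosen for illustration.

```python
import math
from typing import Iterable


def to_fixed_point(x: float, word_bits: int, frac_bits: int) -> float:
    """Quantize x to a signed fixed-point value (hypothetical helper, not Lop's API).

    Assumes a two's-complement word of `word_bits` bits (sign included), of which
    `frac_bits` are fractional. Rounds to nearest (ties toward +infinity) and
    saturates on overflow.
    """
    scale = 1 << frac_bits                  # weight of one integer step is 1/scale
    q_min = -(1 << (word_bits - 1))         # most negative representable integer
    q_max = (1 << (word_bits - 1)) - 1      # most positive representable integer
    q = math.floor(x * scale + 0.5)         # round to the nearest integer step
    q = max(q_min, min(q_max, q))           # saturate instead of wrapping around
    return q / scale


def max_abs_error(values: Iterable[float], word_bits: int, frac_bits: int) -> float:
    """Worst-case quantization error over a set of values (hypothetical helper)."""
    return max(abs(v - to_fixed_point(v, word_bits, frac_bits)) for v in values)


if __name__ == "__main__":
    weights = [0.7312, -0.2048, 1.9]        # made-up example parameters
    # Sweep the number of fractional bits for a fixed 8-bit word.
    for frac_bits in (2, 4, 6):
        err = max_abs_error(weights, word_bits=8, frac_bits=frac_bits)
        print(f"Q{8 - frac_bits}.{frac_bits}: max abs error = {err:.6f}")
```

A sweep like this mirrors, in a very reduced form, the kind of comparison described above: candidate formats are first screened for accuracy in software, and only the viable ones would then be evaluated for hardware cost.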

