FIXAR: A Fixed-Point Deep Reinforcement Learning Platform with Quantization-Aware Training and Adaptive Parallelism

02/24/2021
by Je Yang, et al.

In this paper, we present FIXAR, a deep reinforcement learning platform that, for the first time, employs fixed-point data types and arithmetic units through a SW/HW co-design approach. Starting from 32-bit fixed-point data, Quantization-Aware Training (QAT) reduces the data precision based on the range of activations and retrains the network to minimize reward degradation. FIXAR introduces an adaptive array processing core composed of configurable processing elements that supports both intra-layer parallelism and intra-batch parallelism for high-throughput inference and training. Implemented on a Xilinx U50 FPGA, FIXAR achieves 25293.3 inferences per second (IPS) of training throughput and 2638.0 IPS/W of accelerator efficiency, which is 2.7 times faster and 15.4 times more energy efficient than a CPU-GPU platform, with no accuracy degradation.
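To make the range-based quantization step concrete, below is a minimal sketch of fixed-point "fake quantization" as it might appear inside a QAT loop: the integer/fraction bit split is chosen from the observed activation range, values are rounded and saturated to that grid, and the result is dequantized for the rest of the forward pass. The function name, the bit-allocation heuristic, and the NumPy setting are illustrative assumptions, not the paper's exact scheme.

```python
# Illustrative sketch (assumptions, not FIXAR's exact QAT algorithm):
# signed fixed-point fake quantization with a range-driven bit split.
import numpy as np

def fixed_point_fake_quant(x: np.ndarray, total_bits: int = 16) -> np.ndarray:
    """Round x onto a signed fixed-point grid sized to its observed range,
    then dequantize back to float (a 'fake quantization' pass)."""
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        return x
    # Integer bits needed to cover the observed range; one bit is reserved
    # for the sign, the remaining bits go to the fraction.
    int_bits = max(0, int(np.ceil(np.log2(max_abs))))
    frac_bits = max(0, total_bits - 1 - int_bits)
    scale = 2.0 ** frac_bits
    qmin, qmax = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    # Round and saturate, mirroring hardware saturation at the range boundary.
    q = np.clip(np.round(x * scale), qmin, qmax)
    return q / scale

# During QAT-style retraining, the forward pass would use these quantized
# values while gradients flow through unrounded (straight-through estimator).
activations = np.random.randn(4, 8).astype(np.float32) * 3.0
print(fixed_point_fake_quant(activations, total_bits=8))
```

In a training loop, this pass would be applied to activations (and weights) each iteration so the policy is retrained under the reduced precision it will see in the fixed-point datapath.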

