Learnable Mixed-precision and Dimension Reduction Co-design for Low-storage Activation

07/16/2022
by   Yu-Shan Tai, et al.

Recently, deep convolutional neural networks (CNNs) have achieved many eye-catching results. However, deploying CNNs on resource-constrained edge devices is limited by the memory bandwidth needed to transmit large intermediate data, i.e., activations, during inference. Existing research uses mixed-precision and dimension reduction to reduce computational complexity but pays less attention to applying them to activation compression. To further exploit the redundancy in activations, we propose a learnable mixed-precision and dimension reduction co-design system, which separates channels into groups and allocates a specific compression policy to each group according to its importance. In addition, the proposed dynamic searching technique enlarges the search space and finds the optimal bit-width allocation automatically. Our experimental results show that the proposed methods improve on existing mixed-precision methods by 3.54 on ResNet18 and MobileNetv2.
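The core idea described above (group channels by importance, then quantize each group at its own bit-width) can be illustrated with a minimal NumPy sketch. This is not the paper's method: the function name, the importance measure (mean absolute activation), the fixed group count, and the fixed bit-widths are all illustrative assumptions; the paper learns the per-group policies and bit-widths jointly with a dimension-reduction step.

```python
import numpy as np

def quantize_per_group(activation, num_groups=2, bitwidths=(8, 4)):
    """Toy sketch: rank channels by importance (here, mean |activation|),
    split them into groups, and quantize each group at its own bit-width.
    The paper learns the group policies and bit-widths; here they are fixed.

    activation: float array of shape (channels, height, width)
    returns: dequantized activation of the same shape, for inspection
    """
    # Channel importance proxy: mean absolute activation per channel.
    importance = np.abs(activation).mean(axis=(1, 2))
    order = np.argsort(-importance)                # most important first
    groups = np.array_split(order, num_groups)     # e.g. top half / bottom half

    out = np.empty_like(activation, dtype=np.float32)
    for chans, bits in zip(groups, bitwidths):
        sub = activation[chans]
        # Symmetric uniform quantization to `bits` bits for this group.
        qmax = 2 ** (bits - 1) - 1
        scale = max(float(np.abs(sub).max()), 1e-8) / qmax
        q = np.clip(np.round(sub / scale), -(qmax + 1), qmax)
        out[chans] = (q * scale).astype(np.float32)  # dequantize
    return out
```

Important channels keep 8-bit precision while less important ones drop to 4 bits, so the stored activation footprint shrinks with only a small reconstruction error on the low-importance channels.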

Related research:
- Compression-aware Projection with Greedy Dimension Reduction for Convolutional Neural Network Activations (10/17/2021)
- Rethinking Differentiable Search for Mixed-Precision Neural Networks (04/13/2020)
- Activation Density based Mixed-Precision Quantization for Energy Efficient Neural Networks (01/12/2021)
- Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers (05/30/2019)
- A MAC-less Neural Inference Processor Supporting Compressed, Variable Precision Weights (12/10/2020)
- Dynamic Sparse Graph for Efficient Deep Learning (10/01/2018)
- Knowledge Base Index Compression via Dimensionality and Precision Reduction (04/06/2022)
