Block-Sparse Recurrent Neural Networks

11/08/2017
by   Sharan Narang, et al.

Recurrent Neural Networks (RNNs) are used in state-of-the-art models in domains such as speech recognition, machine translation, and language modelling. Sparsity is a technique to reduce the compute and memory requirements of deep learning models. Sparse RNNs are easier to deploy on devices and high-end server processors. Even though sparse operations need less compute and memory relative to their dense counterparts, the speed-up observed by using sparse operations is less than expected on different hardware platforms. To address this issue, we investigate two different approaches to induce block sparsity in RNNs: pruning blocks of weights in a layer and using group lasso regularization to create blocks of weights with zeros. Using these techniques, we demonstrate that we can create block-sparse RNNs with sparsity ranging from 80% to 90% with a small loss in accuracy, which allows us to reduce the model size by roughly 10x. Additionally, we can prune a larger dense network to recover this loss in accuracy while maintaining high block sparsity and reducing the overall parameter count. Our technique works with a variety of block sizes up to 32x32. Block-sparse RNNs eliminate overheads related to data storage and irregular memory accesses while increasing hardware efficiency compared to unstructured sparsity.
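The abstract describes two ways to induce block sparsity: pruning whole blocks of weights and adding a group lasso regularizer that drives blocks toward zero. The NumPy sketch below is a minimal illustration of both ideas under simplifying assumptions (a single 2-D weight matrix whose dimensions divide evenly by the block size); the helpers block_norms, block_prune, and group_lasso_penalty are illustrative names, not the paper's implementation.

```python
import numpy as np

def block_norms(weights, block_size=(4, 4)):
    """L2 norm of each (bh x bw) block of a 2-D weight matrix.

    Assumes the matrix dimensions are divisible by the block size
    (simplified illustration, not the paper's code).
    """
    bh, bw = block_size
    rows, cols = weights.shape
    blocks = weights.reshape(rows // bh, bh, cols // bw, bw)
    return np.sqrt((blocks ** 2).sum(axis=(1, 3)))  # shape: (rows/bh, cols/bw)

def block_prune(weights, sparsity=0.9, block_size=(4, 4)):
    """Zero out the fraction `sparsity` of blocks with the smallest L2 norms."""
    bh, bw = block_size
    norms = block_norms(weights, block_size)
    threshold = np.quantile(norms, sparsity)
    block_mask = (norms > threshold).astype(weights.dtype)          # 1 = keep block
    full_mask = np.kron(block_mask, np.ones((bh, bw), weights.dtype))
    return weights * full_mask, full_mask

def group_lasso_penalty(weights, block_size=(4, 4), lam=1e-4):
    """Group lasso regularizer: lambda * sum of block L2 norms.

    Added to the training loss, this pushes entire blocks toward zero.
    """
    return lam * block_norms(weights, block_size).sum()

# Example: prune a random recurrent weight matrix to 90% block sparsity.
W = np.random.randn(256, 256).astype(np.float32)
W_sparse, mask = block_prune(W, sparsity=0.9, block_size=(4, 4))
print("block sparsity:", 1.0 - mask.mean())
```

In the paper the pruning threshold is grown gradually during training rather than applied in one shot; the single quantile-based cutoff above is only meant to show the block-wise mechanics.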


Related research

04/17/2017 · Exploring Sparsity in Recurrent Neural Networks
Recurrent Neural Networks (RNN) are widely used to solve a variety of pr...

09/06/2019 · RNN Architecture Learning with Sparse Regularization
Neural models for NLP typically use large numbers of parameters to reach...

06/16/2021 · Algorithm to Compilation Co-design: An Integrated View of Neural Network Sparsity
Reducing computation cost, inference latency, and memory footprint of ne...

06/13/2022 · EGRU: Event-based GRU for activity-sparse inference and learning
The scalability of recurrent neural networks (RNNs) is hindered by the s...

05/29/2019 · Rethinking Full Connectivity in Recurrent Neural Networks
Recurrent neural networks (RNNs) are omnipresent in sequence modeling ta...

04/26/2018 · Sparse Persistent RNNs: Squeezing Large Recurrent Networks On-Chip
Recurrent Neural Networks (RNNs) are powerful tools for solving sequence...

04/11/2023 · Training Large Language Models Efficiently with Sparsity and Dataflow
Large foundation language models have shown their versatility in being a...
