Compressing 1D Time-Channel Separable Convolutions using Sparse Random Ternary Matrices

03/31/2021
by Gonçalo Mordido, et al.

We demonstrate that the 1x1 convolutions in 1D time-channel separable convolutions may be replaced by constant, sparse random ternary matrices with weights in {-1, 0, +1}. Such layers perform no multiplications and require no training. Moreover, the matrices may be generated on the chip during computation and therefore require no memory access. With the same parameter budget, we can afford deeper and more expressive models, improving the Pareto frontiers of existing models on several tasks. For command recognition on Google Speech Commands v1, we improve the state-of-the-art accuracy from 97.21% to 97.41% at the same network size. Alternatively, we can lower the cost of existing models. For speech recognition on Librispeech, we halve the number of weights to be trained while sacrificing only about 1% of the floating-point baseline's word error rate.
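The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `ternary_matrix` and `pointwise_ternary`, the `sparsity` parameter, the uniform sampling scheme, and the use of a seed to stand in for on-chip generation are all assumptions made for illustration. It shows that a fixed matrix with entries in {-1, 0, +1} can be applied across channels with additions and subtractions only, and that it can be regenerated from a seed rather than stored.

```python
import numpy as np

def ternary_matrix(out_channels, in_channels, sparsity=0.9, seed=0):
    """Fixed random matrix with entries in {-1, 0, +1}.

    Assumption: entries are zeroed independently with probability `sparsity`
    and the remaining signs are uniform. The matrix is never trained and is
    fully determined by `seed`, so it can be regenerated instead of stored.
    """
    rng = np.random.default_rng(seed)
    signs = rng.choice(np.array([-1, 1], dtype=np.int8),
                       size=(out_channels, in_channels))
    keep = rng.random((out_channels, in_channels)) >= sparsity
    return (signs * keep).astype(np.int8)

def pointwise_ternary(x, w):
    """Apply w (out_channels, in_channels) to x (in_channels, time).

    Equivalent to a 1x1 convolution with weights w, but computed with
    additions and subtractions only: no multiplications are needed.
    """
    out = np.zeros((w.shape[0], x.shape[1]), dtype=x.dtype)
    for o in range(w.shape[0]):
        added = x[w[o] == 1].sum(axis=0)        # channels with weight +1
        subtracted = x[w[o] == -1].sum(axis=0)  # channels with weight -1
        out[o] = added - subtracted
    return out

# Usage: mix 8 input channels into 4 output channels over 5 time steps.
x = np.random.default_rng(1).standard_normal((8, 5))
w = ternary_matrix(4, 8, sparsity=0.5, seed=3)
y = pointwise_ternary(x, w)
```

Under these assumptions the result matches the ordinary matrix product `w @ x`, which is what a 1x1 convolution computes at each time step; the loop only makes the absence of multiplications explicit.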


research
01/27/2020

Scaling Up Online Speech Recognition Using ConvNets

We design an online end-to-end speech recognition system based on Time-D...
research
01/16/2017

Towards a New Interpretation of Separable Convolutions

In recent times, the use of separable convolutions in deep convolutional...
research
02/12/2021

Depthwise Separable Convolutions Allow for Fast and Memory-Efficient Spectral Normalization

An increasing number of models require the control of the spectral norm ...
research
10/08/2021

TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context

In this paper, we propose TitaNet, a novel neural network architecture f...
research
06/03/2019

Separable Layers Enable Structured Efficient Linear Substitutions

In response to the development of recent efficient dense layers, this pa...
research
02/23/2021

Memory-efficient Speech Recognition on Smart Devices

Recurrent transducer models have emerged as a promising solution for spe...
research
07/10/2020

Conditioned Time-Dilated Convolutions for Sound Event Detection

Sound event detection (SED) is the task of identifying sound events alon...
