End-to-End Optimized Speech Coding with Deep Neural Networks

10/25/2017
by Srihari Kankanahalli, et al.

Modern compression algorithms are often the result of laborious domain-specific research; industry standards such as MP3, JPEG, and AMR-WB took years to develop and were largely hand-designed. We present a deep neural network model that optimizes all the steps of a wideband speech coding pipeline (compression, quantization, entropy coding, and decompression) end-to-end, directly from raw speech data; no manual feature engineering is necessary, and the model trains in hours. In testing, our DNN-based coder performs on par with the AMR-WB standard at a variety of bitrates (9 kbps up to 24 kbps). It also runs in real time on a 3.8 GHz Intel CPU.
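
To put the quoted bitrates in perspective, the following is a minimal sketch of the arithmetic behind them, together with a toy version of the quantization and entropy-estimation steps named in the pipeline. It assumes 16 kHz, 16-bit PCM input (the sampling rate used for wideband speech by AMR-WB; the bit depth is an assumption). The uniform quantizer and the empirical entropy estimate are generic textbook stand-ins, not the method of the paper, and all function names are hypothetical.

```python
import math
from collections import Counter

SAMPLE_RATE_HZ = 16_000  # assumption: wideband speech sampling rate
BITS_PER_SAMPLE = 16     # assumption: 16-bit linear PCM input


def raw_bitrate_kbps(sample_rate_hz=SAMPLE_RATE_HZ, bits_per_sample=BITS_PER_SAMPLE):
    """Bitrate of the uncompressed signal in kbps."""
    return sample_rate_hz * bits_per_sample / 1000


def compression_ratio(target_kbps):
    """How many times smaller the coded stream is than raw PCM."""
    return raw_bitrate_kbps() / target_kbps


def quantize(values, levels):
    """Uniform scalar quantization of values in [-1, 1] to integer bin indices."""
    indices = []
    for v in values:
        v = min(1.0, max(-1.0, v))  # clip to the supported range
        indices.append(int(round((v + 1.0) / 2.0 * (levels - 1))))
    return indices


def empirical_entropy_bits(symbols):
    """Average bits per symbol an ideal entropy coder would need for this sequence."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    for kbps in (9, 24):
        print(f"{kbps} kbps -> about {compression_ratio(kbps):.1f}x smaller than raw PCM")
    symbols = quantize([-1.0, -0.2, 0.0, 0.1, 0.9], levels=4)
    print(symbols, empirical_entropy_bits(symbols))
```

Under these assumptions the raw signal is 256 kbps, so the 9 to 24 kbps range corresponds to roughly 28x to 11x compression. In a learned codec the quantizer operates on the network's latent representation rather than on raw samples, and the entropy of the quantized symbols is what determines the final bitrate.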

research
06/18/2019

Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding

Speech codecs learn compact representations of speech signals to facilit...
research
07/27/2019

DeepCABAC: A Universal Compression Algorithm for Deep Neural Networks

The field of video compression has developed some of the most sophistica...
research
07/23/2020

End-to-end Learning of Compressible Features

Pre-trained convolutional neural networks (CNNs) are powerful off-the-sh...
research
11/27/2017

DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess

We present an end-to-end learning method for chess, relying on deep neur...
research
07/07/2022

NESC: Robust Neural End-2-End Speech Coding with GANs

Neural networks have proven to be a formidable tool to tackle the proble...
research
05/12/2019

Deep Vocoder: Low Bit Rate Speech Compression of Speech with Deep Autoencoder

Inspired by the success of deep neural networks (DNNs) in speech process...
research
03/27/2021

Scalable and Efficient Neural Speech Coding

This work presents a scalable and efficient neural waveform codec (NWC) ...