
NESC: Robust Neural End-2-End Speech Coding with GANs

by Nicola Pia, et al.

Neural networks have proven to be a formidable tool for speech coding at very low bit rates. However, designing a neural coder that operates robustly under real-world conditions remains a major challenge. We therefore present the Neural End-2-End Speech Codec (NESC), a robust, scalable end-to-end neural speech codec for high-quality wideband speech coding at 3 kbps. The encoder uses a new architecture configuration that relies on our proposed Dual-PathConvRNN (DPCRNN) layer, while the decoder architecture is based on our previous work, Streamwise-StyleMelGAN. Our subjective listening tests on clean and noisy speech show that NESC is particularly robust to unseen conditions and signal perturbations.
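
To put the 3 kbps figure in perspective, the following is a minimal back-of-the-envelope sketch, not part of NESC itself. It assumes wideband speech sampled at 16 kHz, a 16-bit PCM reference signal, and a purely illustrative 10 ms frame duration; the abstract does not state the codec's actual framing, and the function names are hypothetical.

```python
def bits_per_frame(bitrate_bps: float, frame_ms: float) -> float:
    """Bits available to encode one frame at a constant bit rate."""
    return bitrate_bps * frame_ms / 1000.0


def compression_ratio(bitrate_bps: float,
                      sample_rate_hz: int = 16_000,   # wideband speech
                      bits_per_sample: int = 16) -> float:  # assumed PCM reference
    """Ratio of the uncompressed PCM bit rate to the coded bit rate."""
    return sample_rate_hz * bits_per_sample / bitrate_bps


BITRATE_BPS = 3_000   # 3 kbps, as stated in the abstract
FRAME_MS = 10.0       # assumed frame duration, for illustration only

print(f"bits per {FRAME_MS:.0f} ms frame: {bits_per_frame(BITRATE_BPS, FRAME_MS):.0f}")
print(f"compression vs. 16-bit PCM: {compression_ratio(BITRATE_BPS):.1f}x")
```

Under these assumptions the coder has only a few dozen bits per frame, which is why the latent representation must be quantized very aggressively at this rate.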

