Low Bit-Rate Wideband Speech Coding: A Deep Generative Model based Approach

02/04/2021
by Gang Min, et al.

Traditional low bit-rate speech coding approaches handle only narrowband speech sampled at 8 kHz, which limits further improvements in speech quality. Motivated by recent successful exploration of deep learning methods for image and speech compression, this paper presents a new approach that combines vector quantization (VQ) of mel-frequency cepstral coefficients (MFCCs) with a deep generative model called WaveGlow to provide efficient, high-quality speech coding. The only coded feature is an 80-dimensional MFCC vector for 16 kHz wideband speech, so speech coding at bit rates from 1000 to 2000 bit/s can be implemented scalably by applying different VQ schemes to the MFCC vector. This deep generative network based codec runs fast because the WaveGlow model abandons the sample-by-sample autoregressive mechanism. We evaluated the approach on the multi-speaker TIMIT corpus, and experimental results demonstrate that it provides better speech quality than the state-of-the-art classic MELPe codec at a lower bit rate.
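
To make the bit-rate scaling concrete, the sketch below shows one way a split VQ over an 80-dimensional feature vector could map to a target bit rate. It is an illustration only, not the paper's method: the abstract does not give the frame rate, the split layout, or the codebook sizes, so the 50 frames/s rate, the four 20-dimensional sub-vectors, the 1024-entry codebooks, and the function names `bit_rate`, `split_vq_encode`, and `split_vq_decode` are all assumptions. The codebooks are random stand-ins for trained ones.

```python
import numpy as np


def bit_rate(codebook_sizes, frames_per_second):
    """Bits per second when one codebook index per sub-vector is sent each frame."""
    # (k - 1).bit_length() is the number of bits needed to index k entries.
    bits_per_frame = sum((k - 1).bit_length() for k in codebook_sizes)
    return bits_per_frame * frames_per_second


def split_vq_encode(vector, codebooks):
    """Split the vector into contiguous sub-vectors and pick the nearest codeword in each."""
    indices, start = [], 0
    for codebook in codebooks:
        dim = codebook.shape[1]
        sub = vector[start:start + dim]
        # Squared Euclidean distance from the sub-vector to every codeword.
        distances = np.sum((codebook - sub) ** 2, axis=1)
        indices.append(int(np.argmin(distances)))
        start += dim
    return indices


def split_vq_decode(indices, codebooks):
    """Rebuild the quantized vector by concatenating the selected codewords."""
    return np.concatenate([cb[i] for cb, i in zip(codebooks, indices)])


# Hypothetical configuration (not from the paper): 4 sub-vectors of 20 dimensions,
# each with a 1024-entry codebook, i.e. 4 x 10 = 40 bits per frame.
rng = np.random.default_rng(0)
codebooks = [rng.standard_normal((1024, 20)) for _ in range(4)]

frame = rng.standard_normal(80)           # stand-in for one 80-dim feature vector
idx = split_vq_encode(frame, codebooks)   # what an encoder would transmit
recon = split_vq_decode(idx, codebooks)   # what a decoder would hand to the vocoder

print(bit_rate([1024] * 4, frames_per_second=50))  # 40 bits x 50 frames/s = 2000
print(bit_rate([1024] * 2, frames_per_second=50))  # 20 bits x 50 frames/s = 1000
```

Under these assumed numbers, halving the number of transmitted indices per frame moves the codec from 2000 bit/s to 1000 bit/s, which is the kind of scalability the abstract attributes to changing the VQ scheme. In a real system the decoded vector would condition the WaveGlow model rather than be used directly.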


research
12/01/2017

Wavenet based low rate speech coding

Traditional parametric coding of speech facilitates low rate but provide...
research
03/26/2020

Speech Quality Factors for Traditional and Neural-Based Low Bit Rate Vocoders

This study compares the performances of different algorithms for coding ...
research
07/01/2019

Analysis by Adversarial Synthesis -- A Novel Approach for Speech Vocoding

Classical parametric speech coding techniques provide a compact represen...
research
11/07/2018

High-quality speech coding with SampleRNN

We provide a speech coding scheme employing a generative model based on ...
research
08/09/2021

A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate

Recently, GAN vocoders have seen rapid progress in speech synthesis, sta...
research
10/14/2019

Low Bit-Rate Speech Coding with VQ-VAE and a WaveNet Decoder

In order to efficiently transmit and store speech signals, speech codecs...
research
02/18/2021

Generative Speech Coding with Predictive Variance Regularization

The recent emergence of machine-learning based generative models for spe...
