Variational Speech Waveform Compression to Catalyze Semantic Communications

12/10/2022
by Shengshi Yao, et al.

We propose a novel neural waveform compression method to catalyze emerging speech semantic communications. By introducing nonlinear transforms and variational modeling, we effectively capture the dependencies within speech frames and estimate the probability distribution of the speech features more accurately, which yields better compression performance. In particular, speech signals are analyzed and synthesized by a pair of nonlinear transforms, yielding latent features. An entropy model with a hyperprior is built to capture the probability distribution of the latent features, followed by quantization and entropy coding. The proposed waveform codec can be optimized flexibly toward an arbitrary rate, and it can also be easily optimized for any differentiable loss function, including the perceptual losses used in semantic communications. To further improve fidelity, we incorporate residual coding to mitigate the degradation arising from quantization distortion in the latent space. Results indicate that, at the same performance, the proposed method saves up to 27% of the coding rate compared with the widely used adaptive multi-rate wideband (AMR-WB) codec as well as emerging neural waveform coding methods.
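The rate term that an entropy model of this kind contributes to training can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a discretized Gaussian likelihood whose per-element mean and scale would come from a hyperprior, rounding as the quantizer, and a simple rate-plus-weighted-distortion objective. The names `estimate_bits`, `rd_loss`, and `lam` are hypothetical and chosen only for illustration.

```python
import math


def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def estimate_bits(latents, means, scales):
    """Estimated code length (bits) of rounded latents.

    Assumption: each quantized latent follows a Gaussian integrated over
    its unit-width quantization bin; the means and scales stand in for
    what a hyperprior network would predict.
    """
    total = 0.0
    for y, mu, sigma in zip(latents, means, scales):
        q = round(y)  # scalar quantization to the nearest integer
        p = gaussian_cdf(q + 0.5, mu, sigma) - gaussian_cdf(q - 0.5, mu, sigma)
        total += -math.log2(max(p, 1e-9))  # clamp to avoid log(0)
    return total


def rd_loss(bits, distortion, lam):
    """Rate-distortion objective: rate plus lam times distortion (assumed form)."""
    return bits + lam * distortion


if __name__ == "__main__":
    # Toy latents and entropy-model parameters, for illustration only.
    bits = estimate_bits([0.2, -1.4, 3.1], [0.0, -1.0, 3.0], [1.0, 0.5, 2.0])
    print(f"estimated rate: {bits:.2f} bits")
    print(f"loss: {rd_loss(bits, distortion=0.05, lam=100.0):.2f}")
```

In this sketch, a better match between the predicted mean and scale and the actual latents gives fewer estimated bits, which is why a more accurate probability model translates into a lower rate. Changing `lam` shifts the trade-off between rate and distortion, and the distortion term can be any differentiable loss.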
