Towards Error-Resilient Neural Speech Coding

07/03/2022
by Huaying Xue, et al.

Neural audio coding has recently shown very promising results in the literature, largely outperforming traditional codecs, but limited attention has been paid to its error resilience. Neural codecs trained with only source coding in mind tend to be extremely sensitive to channel noise, especially in wireless channels with high error rates. In this paper, we investigate how to improve the error resilience of neural audio codecs against the packet losses that often occur during real-time communications. We propose a feature-domain packet loss concealment algorithm (FD-PLC) for real-time neural speech coding. Specifically, we introduce a self-attention-based module that operates on the received latent features to recover lost frames in the feature domain before the decoder. A hybrid segment-level and frame-level frequency-domain discriminator is employed to guide the network to focus on both the generative quality of lost frames and their continuity with neighbouring frames. Experimental results on several error patterns show that the proposed scheme achieves better robustness than the corresponding error-free and error-resilient baselines. We also show that feature-domain concealment is superior to its waveform-domain counterpart applied as post-processing.
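As a rough illustration of what concealing in the feature domain means, the toy sketch below fills each lost latent frame with an attention-style weighted average of the frames that did arrive, before anything is handed to a decoder. It is not the FD-PLC module from the paper: the function name `conceal_lost_frames`, the `temperature` parameter, and the distance-based attention scores are all assumptions made for illustration, whereas the paper's module is learned. The toy also attends to frames on both sides of a gap, which a real-time system may not be able to do.

```python
import math


def conceal_lost_frames(frames, received, temperature=2.0):
    """Fill lost latent frames with a softmax-weighted mix of received frames.

    frames:   list of equal-length feature vectors; entries at lost
              positions are ignored (hypothetical layout, not the paper's).
    received: list of bools, True where the packet arrived.
    """
    known = [i for i, ok in enumerate(received) if ok]
    if not known:
        raise ValueError("at least one frame must be received")
    dim = len(frames[known[0]])

    output = []
    for t, ok in enumerate(received):
        if ok:
            # Received frames pass through unchanged.
            output.append(list(frames[t]))
            continue
        # Assumed scoring rule: closer received frames get higher scores.
        scores = [-abs(t - i) / temperature for i in known]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted average of the received frames, one dimension at a time.
        output.append([
            sum(w * frames[i][d] for w, i in zip(weights, known))
            for d in range(dim)
        ])
    return output


# The middle packet is lost; its stored values are placeholders.
latents = [[0.0, 0.0], [9.9, 9.9], [2.0, 4.0]]
arrived = [True, False, True]
recovered = conceal_lost_frames(latents, arrived)
```

In this example the lost frame is equidistant from its two neighbours, so it is replaced by their plain average. A learned self-attention module would instead derive its weights from the content of the received features.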


research
05/21/2023

Grace++: Loss-Resilient Real-Time Video Communication under High Network Latency

In real-time videos, resending any packets, especially in networks with ...
research
03/31/2022

A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction

Speaker extraction algorithm extracts the target speech from a mixture s...
research
05/11/2022

Real-Time Packet Loss Concealment With Mixed Generative and Predictive Model

As deep speech enhancement algorithms have recently demonstrated capabil...
research
03/26/2022

A Neural Vocoder Based Packet Loss Concealment Algorithm

The packet loss problem seriously affects the quality of service in Voic...
research
07/04/2022

TMGAN-PLC: Audio Packet Loss Concealment using Temporal Memory Generative Adversarial Network

Real-time communications in packet-switched networks have become widely ...
research
10/28/2019

The Effect of Erasure Coding on the Burstiness of Packet Loss

The perceived quality of real-time media delivered over IP networks depe...
research
02/27/2019

S-PRAC: Fast Partial Packet Recovery with Network Coding in Very Noisy Wireless Channels

Well-known error detection and correction solutions in wireless communic...
