Textual Echo Cancellation

08/13/2020
by   Shaojin Ding, et al.
0

In this paper, we propose Textual Echo Cancellation (TEC) - a framework for cancelling the text-to-speech (TTS) playback echo from overlapped speech recordings. Such a system can largely improve speech recognition performance and user experience for intelligent devices such as smart speakers, as the user can talk to the device while the device is still playing the TTS signal responding to the previous query. We implement this system by using a novel sequence-to-sequence model with multi-source attention that takes both the microphone mixture signal and the source text of the TTS playback as inputs, and predicts the enhanced audio. Experiments show that the textual information of the TTS playback is critical to the enhancement performance. Besides, the text sequence is much smaller in size compared with the raw acoustic signal of the TTS playback, and can be immediately transmitted to the device and the ASR server even before the playback is synthesized. Therefore, our proposed approach effectively reduces Internet communication and latency compared with alternative approaches such as acoustic echo cancellation (AEC).

READ FULL TEXT
research
10/20/2020

Replacing Human Audio with Synthetic Audio for On-device Unspoken Punctuation Prediction

We present a novel multi-modal unspoken punctuation prediction system fo...
research
04/28/2020

Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise

Attention-based sequence-to-sequence (seq2seq) speech synthesis has achi...
research
09/09/2019

Spreech: A System for Privacy-Preserving Speech Transcription

New Advances in machine learning and the abundance of speech datasets ha...
research
09/09/2019

Prεεch: A System for Privacy-Preserving Speech Transcription

New Advances in machine learning and the abundance of speech datasets ha...
research
04/23/2018

ASR Performance Prediction on Unseen Broadcast Programs using Convolutional Neural Networks

In this paper, we address a relatively new task: prediction of ASR perfo...
research
02/27/2022

ICASSP 2022 Acoustic Echo Cancellation Challenge

The ICASSP 2022 Acoustic Echo Cancellation Challenge is intended to stim...
research
10/21/2022

Audio-to-Intent Using Acoustic-Textual Subword Representations from End-to-End ASR

Accurate prediction of the user intent to interact with a voice assistan...

Please sign up or login with your details

Forgot password? Click here to reset