AVATAR: Robust Voice Search Engine Leveraging Autoregressive Document Retrieval and Contrastive Learning

09/04/2023
by   Yi-Cheng Wang, et al.

Voice input has become increasingly popular on mobile devices and seems to be displacing text input almost entirely. Through voice, a voice search (VS) system can provide a more natural way to meet users' information needs. However, errors from the automatic speech recognition (ASR) system can be catastrophic to the VS system. Recent lightweight autoregressive retrieval models have the potential to be deployed on mobile devices, leading to a more secure and personal VS assistant. Building on such models, this paper presents a novel study of VS leveraging autoregressive retrieval and tackles a crucial problem facing VS, namely the performance drop caused by ASR noise, via data augmentation and contrastive learning, showing how explicitly and implicitly modeling the noise patterns can alleviate the problem. A series of experiments conducted on the Open-Domain Spoken Question Answering (ODSQA) dataset confirms the approach's effectiveness and robustness relative to several strong baseline systems.
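
The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of one common choice, an InfoNCE-style loss, and not the authors' implementation. It assumes each clean query embedding is paired with the embedding of its ASR-noisy transcription, and that the other noisy queries in the batch act as negatives. The function names, the temperature value, and the toy two-dimensional embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(clean, noisy, temperature=0.1):
    """Mean InfoNCE loss over a batch (hypothetical helper).

    clean[i] and noisy[i] are assumed to embed the same query, once from the
    clean text and once from its ASR-noisy transcription. Every noisy[j] with
    j != i serves as an in-batch negative for clean[i].
    """
    losses = []
    for i, anchor in enumerate(clean):
        logits = [cosine(anchor, other) / temperature for other in noisy]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Negative log-probability of picking the matching noisy query.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: the loss is small when pairs line up, large when they do not.
clean_batch = [[1.0, 0.0], [0.0, 1.0]]
aligned_noisy = [[0.9, 0.1], [0.1, 0.9]]
shuffled_noisy = [[0.1, 0.9], [0.9, 0.1]]

aligned_loss = info_nce(clean_batch, aligned_noisy)
shuffled_loss = info_nce(clean_batch, shuffled_noisy)
```

Minimizing a loss of this kind pulls the representation of a noisy query toward that of its clean counterpart, which is one way to model ASR noise implicitly. Training directly on augmented noisy queries would be the explicit counterpart.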

research
09/26/2022

On the Impact of Speech Recognition Errors in Passage Retrieval for Spoken Question Answering

Interacting with a speech interface to query a Question Answering (QA) s...
research
05/18/2023

A Lexical-aware Non-autoregressive Transformer-based ASR Model

Non-autoregressive automatic speech recognition (ASR) has become a mains...
research
08/05/2019

V2S attack: building DNN-based voice conversion from automatic speaker verification

This paper presents a new voice impersonation attack using voice convers...
research
12/19/2018

Streaming Voice Query Recognition using Causal Convolutional Recurrent Neural Networks

Voice-enabled commercial products are ubiquitous, typically enabled by l...
research
10/08/2021

SCaLa: Supervised Contrastive Learning for End-to-End Automatic Speech Recognition

End-to-end Automatic Speech Recognition (ASR) models are usually trained...
research
10/05/2021

Voice Information Retrieval In Collaborative Information Seeking

Voice information retrieval is a technique that provides Information Ret...
research
04/10/2022

Deep Conditional Representation Learning for Drum Sample Retrieval by Vocalisation

Imitating musical instruments with the human voice is an efficient way o...
