SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability

08/07/2023
by   Xian Shi, et al.
0

Hotword customization is one of the important issues remained in ASR field - it is of value to enable users of ASR systems to customize names of entities, persons and other phrases. The past few years have seen both implicit and explicit modeling strategies for ASR contextualization developed. While these approaches have performed adequately, they still exhibit certain shortcomings such as instability in effectiveness. In this paper we propose Semantic-augmented Contextual-Paraformer (SeACo-Paraformer) a novel NAR based ASR system with flexible and effective hotword customization ability. It combines the accuracy of the AED-based model, the efficiency of the NAR model, and the excellent performance in contextualization. In 50,000 hours industrial big data experiments, our proposed model outperforms strong baselines in customization and general ASR tasks. Besides, we explore an efficient way to filter large scale incoming hotwords for further improvement. The source codes and industrial models proposed and compared are all opened as well as two hotword test sets.

READ FULL TEXT
research
03/02/2022

Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems

Contextual biasing is an important and challenging task for end-to-end a...
research
05/21/2023

CASA-ASR: Context-Aware Speaker-Attributed ASR

Recently, speaker-attributed automatic speech recognition (SA-ASR) has a...
research
04/01/2022

Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems

Large-scale language models (LLMs) such as GPT-2, BERT and RoBERTa have ...
research
02/22/2023

Improving Contextual Spelling Correction by External Acoustics Attention and Semantic Aware Data Augmentation

We previously proposed contextual spelling correction (CSC) to correct t...
research
06/29/2022

Contextual Density Ratio for Language Model Biasing of Sequence to Sequence ASR Systems

End-2-end (E2E) models have become increasingly popular in some ASR task...
research
06/04/2023

SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings

Contextual spelling correction models are an alternative to shallow fusi...
research
01/29/2023

Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model

Conventional ASR systems use frame-level phoneme posterior to conduct fo...

Please sign up or login with your details

Forgot password? Click here to reset