SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge

09/04/2023
by Jiaxu Zhu, et al.

Speech recognition has made excellent progress in recent years. However, purely data-driven approaches still struggle with domain mismatch and long-tailed data. Because knowledge-driven approaches can help alleviate these weaknesses, we introduce sememe-based semantic knowledge into speech recognition (SememeASR). A sememe, by its linguistic definition, is the minimum semantic unit of a language, and it represents the implicit semantic information behind each word well. Our experiments show that introducing sememe information improves speech recognition performance. Further experiments show that sememe knowledge improves the model's recognition of long-tailed data and enhances its domain generalization ability.

research
05/21/2019

Acoustic-to-Word Models with Conversational Context Information

Conversational context information, higher-level knowledge that spans ac...
research
06/14/2023

Research on an improved Conformer end-to-end Speech Recognition Model with R-Drop Structure

To address the issue of poor generalization ability in end-to-end speech...
research
08/07/2018

Dialog-context aware end-to-end speech recognition

Existing speech recognition systems are typically built at the sentence ...
research
05/30/2017

Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments

Eliminating the negative effect of non-stationary environmental noise is...
research
02/21/2023

Co-Driven Recognition of Semantic Consistency via the Fusion of Transformer and HowNet Sememes Knowledge

Semantic consistency recognition aims to detect and judge whether the se...
research
02/20/2023

Synergy between human and machine approaches to sound/scene recognition and processing: An overview of ICASSP special session

Machine Listening, as usually formalized, attempts to perform a task tha...
