CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice

05/29/2023
by Juan Zuluaga-Gomez, et al.

Despite recent advances in Automatic Speech Recognition (ASR), recognizing accented speech remains a major open problem. Research has shown that integrating accent information into a larger ASR framework can reduce errors on accented speech and so help build more inclusive ASR systems. We address multilingual accent classification with the ECAPA-TDNN and Wav2Vec 2.0/XLSR architectures, which have been shown to perform well on a variety of speech-related downstream tasks. We introduce a simple-to-follow recipe, aligned with the SpeechBrain toolkit, for accent classification based on Common Voice 7.0 (English) and Common Voice 11.0 (Italian, German, and Spanish). Furthermore, we establish a new state of the art for English accent classification, with accuracy as high as 95%. We also study the internal categorization of the Wav2Vec 2.0 embeddings through t-SNE, noting that there is a level of clustering based on phonological similarity. (Our recipe is open-source in the SpeechBrain toolkit, see: https://github.com/speechbrain/speechbrain/tree/develop/recipes)
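The idea of accents grouping by similarity in embedding space can be illustrated with a small, hedged sketch. This is not the authors' recipe and it does not use t-SNE: it swaps in a simpler proxy, comparing per-accent centroids by cosine similarity. The function names (`accent_centroids`, `closest_accent`), the accent labels, and the 3-dimensional toy vectors are all hypothetical; in practice the vectors would be utterance-level embeddings extracted from a model such as Wav2Vec 2.0.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def accent_centroids(embeddings, labels):
    """Group embeddings by accent label and average each group."""
    groups = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        groups[label].append(vec)
    return {label: centroid(vecs) for label, vecs in groups.items()}


def closest_accent(centroids, label):
    """Return the other accent whose centroid is most similar to `label`."""
    scored = [
        (cosine(centroids[label], vec), other)
        for other, vec in centroids.items()
        if other != label
    ]
    return max(scored)[1]


# Toy data (assumption): made-up 3-D vectors standing in for real embeddings.
toy_embeddings = [
    [1.0, 0.1, 0.0], [0.9, 0.2, 0.0],  # hypothetical accent "us"
    [0.8, 0.3, 0.1], [0.9, 0.3, 0.0],  # hypothetical accent "canada"
    [0.0, 0.2, 1.0], [0.1, 0.1, 0.9],  # hypothetical accent "india"
]
toy_labels = ["us", "us", "canada", "canada", "india", "india"]

centroids = accent_centroids(toy_embeddings, toy_labels)
print(closest_accent(centroids, "us"))
```

A centroid comparison only summarizes each accent as a single point, so it cannot show the within-class spread that a t-SNE plot reveals; it is meant as a quick sanity check rather than a replacement for the visualization described in the abstract.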
