Bringing NURC/SP to Digital Life: the Role of Open-source Automatic Speech Recognition Models

The NURC Project, which started in 1969 to study the cultured urban linguistic norm spoken in five Brazilian capitals, compiled a large corpus for each capital. The digitized NURC/SP comprises 375 inquiries totaling 334 hours of recordings made in the city of São Paulo. Although 47 inquiries have transcripts, these were not aligned with the audio, and the remaining 328 inquiries were not transcribed. This article presents an evaluation and error analysis of three automatic speech recognition models trained on spontaneous speech in Portuguese and one model trained on prepared speech. Using the WER (word error rate) and CER (character error rate) metrics on a manually aligned sample of NURC/SP, the evaluation allowed us to choose the best model to automatically transcribe 284 hours.
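To make the two metrics concrete, the sketch below shows one common way WER and CER are computed: the Levenshtein edit distance between a reference transcript and a model hypothesis, divided by the reference length in words or characters. It is an illustration only, not the evaluation code used for NURC/SP. The function names and the example sentences are hypothetical, and text normalization (casing, punctuation, filled pauses) is assumed to have been done beforehand.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (minimum number of substitutions, insertions and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def rank_models(reference, hypotheses):
    """Sort hypothetical model outputs by WER, using CER to break ties."""
    scores = {
        name: (wer(reference, hyp), cer(reference, hyp))
        for name, hyp in hypotheses.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Invented example sentences, not taken from the corpus.
    reference = "o projeto começou em são paulo"
    hypotheses = {
        "model_a": "o projeto começo em são paulo",
        "model_b": "o projeto começou são paulo hoje",
    }
    for name, (w, c) in rank_models(reference, hypotheses):
        print(f"{name}: WER={w:.3f} CER={c:.3f}")
```

Because a single wrong character makes the whole word count as an error, WER penalizes small spelling slips more heavily than CER does, which is one reason to report both metrics side by side.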


