A Speech Test Set of Practice Business Presentations with Additional Relevant Texts

08/02/2019
by   Dominik Macháček, et al.

We present a test corpus of audio recordings and transcriptions of presentations of students' enterprises, together with their slides and web pages. The corpus is intended for the evaluation of automatic speech recognition (ASR) systems, especially in conditions where prior availability of in-domain vocabulary and named entities is beneficial. The corpus consists of 39 presentations in English, each up to 90 seconds long. The speakers are high school students from European countries with English as their second language. We benchmark three baseline ASR systems on the corpus and show that their output is still imperfect.

research
06/27/2022

TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline

This paper introduces a new corpus of Mandarin-English code-switching sp...
research
01/26/2022

The Norwegian Parliamentary Speech Corpus

The Norwegian Parliamentary Speech Corpus (NPSC) is a speech dataset wit...
research
05/25/2023

Svarah: Evaluating English ASR Systems on Indian Accents

India is the second largest English-speaking country in the world with a...
research
03/26/2021

Construction of a Large-scale Japanese ASR Corpus on TV Recordings

This paper presents a new large-scale Japanese speech corpus for trainin...
research
02/15/2021

Jira: a Kurdish Speech Recognition System Designing and Building Speech Corpus and Pronunciation Lexicon

In this paper, we introduce the first large vocabulary speech recognitio...
research
02/11/2023

ASDF: A Differential Testing Framework for Automatic Speech Recognition Systems

Recent years have witnessed wider adoption of Automated Speech Recogniti...
research
04/06/2021

EasyCall corpus: a dysarthric speech dataset

This paper introduces a new dysarthric speech command dataset in Italian...
