Enabling Interactive Transcription in an Indigenous Community

11/12/2020
by Éric Le Ferrand, et al.

We propose a novel transcription workflow that combines spoken term detection with human-in-the-loop interaction, and we report a pilot experiment. The work is grounded in an almost zero-resource scenario involving two endangered languages, where only a few terms have been identified so far. We show that in the early stages of transcription, when the available data is insufficient to train a robust ASR system, the transcriptions of a small number of isolated words can be used to bootstrap the transcription of a speech collection.
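
The abstract does not say how detection or confirmation is implemented, so the following is only a minimal sketch of the general idea under stated assumptions: a few known words are searched for in untranscribed speech, and a person accepts or rejects each candidate. It swaps in a plain sliding-window dynamic time warping (DTW) match over toy one-dimensional features, which may differ from the authors' method. The names `dtw_distance`, `detect_term`, `interactive_pass`, the `confirm` callback, and the threshold value are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW alignment cost between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def detect_term(
    query: Sequence[float], utterance: Sequence[float], threshold: float
) -> List[Tuple[int, float]]:
    """Slide a query-length window over the utterance.

    Returns (start_frame, length-normalised score) pairs at or below the
    threshold, best match first.
    """
    width = len(query)
    hits = []
    for start in range(len(utterance) - width + 1):
        window = utterance[start:start + width]
        score = dtw_distance(query, window) / width
        if score <= threshold:
            hits.append((start, score))
    return sorted(hits, key=lambda hit: hit[1])


def interactive_pass(
    lexicon: Dict[str, Sequence[float]],
    utterances: Dict[str, Sequence[float]],
    confirm: Callable[[str, str, int, float], bool],
    threshold: float = 0.5,  # assumed value, chosen only for the toy data
) -> Dict[str, List[Tuple[int, str]]]:
    """Propose candidate term locations; keep only those a human confirms."""
    transcript: Dict[str, List[Tuple[int, str]]] = {}
    for utt_id, features in utterances.items():
        for term, query in lexicon.items():
            for start, score in detect_term(query, features, threshold):
                if confirm(term, utt_id, start, score):
                    transcript.setdefault(utt_id, []).append((start, term))
    return transcript


# Toy data: one known word and one untranscribed utterance.
lexicon = {"word": [1.0, 2.0, 3.0]}
utterances = {"u1": [0.0, 0.0, 1.0, 2.0, 3.0, 0.0]}

# Stand-in for the human: accept every proposed candidate.
sparse = interactive_pass(lexicon, utterances, lambda *args: True)
print(sparse)
```

In a real setting the features would be frame-level acoustic vectors rather than single numbers, and `confirm` would present the audio segment to a speaker instead of returning a fixed answer. Confirmed words could then be added back to the lexicon for another pass.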

research
06/11/2021

Spoken Term Detection Methods for Sparse Transcription in Very Low-resource Settings

We investigate the efficiency of two very different spoken term detectio...
research
11/12/2020

Cross-lingual and Multilingual Spoken Term Detection for Low-Resource Indian Languages

Spoken Term Detection (STD) is the task of searching for words or phrase...
research
04/16/2023

A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers

In this paper we propose a novel virtual simulation-pilot engine for spe...
research
10/06/2020

Textual Supervision for Visually Grounded Spoken Language Understanding

Visually-grounded models of spoken language understanding extract semant...
research
04/15/2022

Automated speech tools for helping communities process restricted-access corpora for language revival efforts

Many archival recordings of speech from endangered languages remain unan...
