Unsupervised Word Segmentation from Speech with Attention

06/18/2018
by Pierre Godard, et al.

We present a first attempt at attentional word segmentation performed directly from the speech signal, with the ultimate goal of automatically identifying lexical units in a low-resource, unwritten language (UL). Our methodology assumes that recordings in the UL are paired with translations in a well-resourced language. It uses Acoustic Unit Discovery (AUD) to convert speech into a sequence of pseudo-phones, which is then segmented using the soft alignments produced by a neural machine translation model. We evaluate on an actual Bantu UL, Mboshi; comparisons to monolingual and bilingual baselines illustrate the potential of attentional word segmentation for language documentation.
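The segmentation step can be illustrated with a minimal sketch. It assumes, as a simplification that the abstract does not spell out, that each pseudo-phone is assigned to the translation word with the highest attention weight and that a word boundary is placed wherever that assignment changes. The function name `segment_by_attention`, the unit labels, and the attention values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def segment_by_attention(
    units: Sequence[str],
    attention: Sequence[Sequence[float]],
) -> List[List[str]]:
    """Group pseudo-phones into word candidates from a soft-alignment matrix.

    attention[i][j] is the weight linking pseudo-phone i to translation word j.
    Assumption: hard-assign each unit to its highest-weighted word and start a
    new segment whenever the assigned word differs from the previous unit's.
    """
    if len(units) != len(attention):
        raise ValueError("need exactly one attention row per pseudo-phone")

    segments: List[List[str]] = []
    previous_word = None
    for unit, row in zip(units, attention):
        # Index of the translation word with the largest weight for this unit.
        best_word = max(range(len(row)), key=row.__getitem__)
        if best_word != previous_word:
            segments.append([])  # alignment changed: open a new segment
        segments[-1].append(unit)
        previous_word = best_word
    return segments


# Toy input: five pseudo-phones, three translation words (made-up values).
units = ["a1", "a7", "a3", "a3", "a9"]
attention = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.7, 0.2],
    [0.1, 0.2, 0.7],
]
segments = segment_by_attention(units, attention)
print(segments)
```

On this toy input the first two units align to the first word, the next two to the second, and the last to the third, so three segments are produced. A real system would take the attention matrix from a trained translation model rather than from hand-written values.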

research
02/16/2018

Bayesian Models for Unit Discovery on a Very Low Resource Language

Developing speech technologies for low-resource languages has become a v...
research
08/14/2018

R-grams: Unsupervised Learning of Semantic Units in Natural Language

This paper introduces a novel type of data-driven segmented unit that we...
research
10/18/2019

Controlling Utterance Length in NMT-based Word Segmentation with Attention

One of the basic tasks of computational language documentation (CLD) is ...
research
06/08/2021

Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings

When documenting oral-languages, Unsupervised Word Segmentation (UWS) fr...
research
06/29/2019

Empirical Evaluation of Sequence-to-Sequence Models for Word Discovery in Low-resource Settings

Since Bahdanau et al. [1] first introduced attention for neural machine ...
research
03/30/2020

Investigating Language Impact in Bilingual Approaches for Computational Language Documentation

For endangered languages, data collection campaigns have to accommodate ...
research
07/27/2018

A small Griko-Italian speech translation corpus

This paper presents an extension to a very low-resource parallel corpus ...
