Unsupervised Domain Discovery using Latent Dirichlet Allocation for Acoustic Modelling in Speech Recognition

09/08/2015
by Mortaza Doulaty, et al.

Speech recognition systems are often highly domain dependent, a fact widely reported in the literature. However, the concept of domain is complex and not bound to clear criteria, so it is often not evident whether data should be considered out-of-domain. While both acoustic and language models can be domain specific, the work in this paper concentrates on acoustic modelling. We present a novel method to perform unsupervised discovery of domains using Latent Dirichlet Allocation (LDA) modelling. A set of hidden domains is assumed to exist in the data, and each audio segment is treated as a weighted mixture of domain properties. Classifying audio segments into domains allows the creation of domain-specific acoustic models for automatic speech recognition. Experiments are conducted on a dataset of diverse speech data covering radio and TV broadcasts, telephone conversations, meetings, lectures and read speech, with a joint training set of 60 hours and a test set of 6 hours. Maximum A Posteriori (MAP) adaptation to LDA-based domains was shown to yield relative Word Error Rate (WER) improvements of up to 16% compared with models adapted with human-labelled prior domain knowledge.
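The idea of treating each audio segment as a weighted mixture of hidden domains can be illustrated with a small sketch. It is not the authors' implementation: it assumes each segment has already been converted into a sequence of discrete "acoustic word" ids (for example by vector-quantising frame features, a step the abstract does not describe), and it fits a plain LDA model with collapsed Gibbs sampling. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy data are all hypothetical.

```python
import random


def lda_gibbs(segments, n_domains, vocab_size,
              alpha=0.1, beta=0.01, n_iter=50, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative sketch only).

    segments: list of segments, each a list of integer acoustic-word ids
              (assumed input representation, not taken from the paper).
    Returns theta, the per-segment domain weights; each row sums to 1.
    """
    rng = random.Random(seed)
    seg_dom = [[0] * n_domains for _ in segments]            # count(segment, domain)
    dom_word = [[0] * vocab_size for _ in range(n_domains)]  # count(domain, word)
    dom_total = [0] * n_domains                              # count(domain)

    # Random initial domain assignment for every token.
    assign = []
    for s, seg in enumerate(segments):
        z = []
        for w in seg:
            k = rng.randrange(n_domains)
            z.append(k)
            seg_dom[s][k] += 1
            dom_word[k][w] += 1
            dom_total[k] += 1
        assign.append(z)

    for _ in range(n_iter):
        for s, seg in enumerate(segments):
            for i, w in enumerate(seg):
                # Remove the token's current assignment from the counts.
                k = assign[s][i]
                seg_dom[s][k] -= 1
                dom_word[k][w] -= 1
                dom_total[k] -= 1
                # Conditional probability of each domain for this token.
                weights = [
                    (seg_dom[s][j] + alpha)
                    * (dom_word[j][w] + beta)
                    / (dom_total[j] + vocab_size * beta)
                    for j in range(n_domains)
                ]
                k = rng.choices(range(n_domains), weights=weights)[0]
                assign[s][i] = k
                seg_dom[s][k] += 1
                dom_word[k][w] += 1
                dom_total[k] += 1

    # Smoothed per-segment domain mixture weights.
    return [
        [(c + alpha) / (len(seg) + n_domains * alpha) for c in seg_dom[s]]
        for s, seg in enumerate(segments)
    ]


# Toy data: two segments drawn from word ids 0-2, two from word ids 3-5.
toy_segments = [
    [0, 1, 2, 0, 1, 2, 0, 1],
    [1, 2, 0, 0, 2, 1, 1, 0],
    [3, 4, 5, 3, 4, 5, 3, 4],
    [5, 4, 3, 3, 5, 4, 4, 3],
]
theta = lda_gibbs(toy_segments, n_domains=2, vocab_size=6)

# Hard domain label per segment: the domain with the largest weight.
dominant = [max(range(len(row)), key=row.__getitem__) for row in theta]
```

Each row of `theta` is one segment's mixture over the hidden domains, and `dominant` reduces it to a single label. Those labels could then be used to group training data for domain-specific adaptation such as MAP. How the paper actually derives its labels and adapts its models is not specified in the abstract.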


