The 2015 Sheffield System for Transcription of Multi-Genre Broadcast Media

12/21/2015
by Oscar Saz, et al.
We describe the University of Sheffield system for the transcription task of the 2015 Multi-Genre Broadcast (MGB) challenge, which involves transcribing multi-genre broadcast shows. Transcription was one of four tasks proposed in the MGB challenge, which aims to advance the state of the art in automatic speech recognition, speaker diarisation and automatic alignment of subtitles for broadcast media. Four topics are investigated in this work: data selection techniques for training with unreliable data, automatic speech segmentation of broadcast media shows, acoustic modelling and adaptation in highly variable environments, and language modelling of multi-genre shows. The final system operates in multiple passes, using an initial unadapted decoding stage to refine segmentation, followed by three adapted passes: a hybrid DNN pass with input features normalised by speaker-based cepstral normalisation, another hybrid stage with input features normalised by speaker feature-MLLR transformations, and finally a bottleneck-based tandem stage with noise and speaker factorisation. The combination of these three system outputs provides a final error rate of 27.5% on multi-genre shows.
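As an illustration of the speaker-based cepstral normalisation used for the first adapted pass, the following is a minimal sketch and not the authors' implementation. It assumes that the normalisation covers both mean and variance, that diarisation has already assigned a speaker label to each segment, and that features are plain lists of floats; the function name `speaker_cmvn` and the `eps` parameter are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def speaker_cmvn(segments, eps=1e-8):
    """Normalise features to zero mean and unit variance per speaker.

    segments: list of (speaker_id, frames), where frames is a list of
    equal-length feature vectors (lists of floats).
    Returns a list of (speaker_id, normalised_frames) in the same order.
    Assumption: mean AND variance are normalised; the source only says
    "speaker-based cepstral normalisation".
    """
    # Accumulate per-speaker first- and second-order statistics
    # over all segments attributed to that speaker.
    count = defaultdict(int)
    total = {}
    total_sq = {}
    for speaker, frames in segments:
        for frame in frames:
            if speaker not in total:
                total[speaker] = [0.0] * len(frame)
                total_sq[speaker] = [0.0] * len(frame)
            count[speaker] += 1
            for d, x in enumerate(frame):
                total[speaker][d] += x
                total_sq[speaker][d] += x * x

    # Convert the sums into a mean and standard deviation per dimension.
    stats = {}
    for speaker, n in count.items():
        mean = [s / n for s in total[speaker]]
        var = [max(sq / n - m * m, 0.0)
               for sq, m in zip(total_sq[speaker], mean)]
        std = [sqrt(v) + eps for v in var]  # eps avoids division by zero
        stats[speaker] = (mean, std)

    # Apply each speaker's statistics to every one of its segments.
    normalised = []
    for speaker, frames in segments:
        mean, std = stats[speaker]
        normalised.append((
            speaker,
            [[(x - m) / s for x, m, s in zip(frame, mean, std)]
             for frame in frames],
        ))
    return normalised
```

Pooling statistics over all segments of a speaker, rather than over each segment separately, gives more stable estimates for short segments, but it relies on the speaker labels being accurate.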

research
06/23/2022

Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems

Fundamental modelling differences between hybrid and end-to-end (E2E) au...
research
06/14/2019

Cumulative Adaptation for BLSTM Acoustic Models

This paper addresses the robust speech recognition problem as an adaptat...
research
06/10/2016

Automatic Genre and Show Identification of Broadcast Media

Huge amounts of digital videos are being produced and broadcast every da...
research
11/16/2015

Latent Dirichlet Allocation Based Organisation of Broadcast Media Archives for Deep Neural Network Adaptation

This paper presents a new method for the discovery of latent domains in ...
research
02/01/2018

Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription

State-of-the-art English automatic speech recognition systems typically ...
research
08/23/2019

Gender Representation in French Broadcast Corpora and Its Impact on ASR Performance

This paper analyzes the gender representation in four major corpora of F...
research
08/04/2023

Speaker Diarization of Scripted Audiovisual Content

The media localization industry usually requires a verbatim script of th...
