Transfer Learning for Speech Recognition on a Budget

06/01/2017
by Julius Kunze, et al.

End-to-end training of automated speech recognition (ASR) systems requires massive data and compute resources. We explore transfer learning based on model adaptation as an approach for training ASR models under constrained GPU memory, throughput and training data. We conduct several systematic experiments adapting a Wav2Letter convolutional neural network originally trained for English ASR to the German language. We show that this technique allows faster training on consumer-grade resources while requiring less training data in order to achieve the same accuracy, thereby lowering the cost of training ASR models in other languages. Model introspection revealed that small adaptations to the network's weights were sufficient for good performance, especially for inner layers.
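The model-adaptation idea described above (reusing the inner layers of an English-trained convolutional network and retraining it for German) can be sketched as follows. This is a minimal, hypothetical PyTorch illustration, not the authors' code: the layer sizes, feature dimensions, and label counts are assumed, and the "pretrained" English weights are random stand-ins for a real Wav2Letter model.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a Wav2Letter-style 1-D convolutional acoustic
# model; the real architecture and pretrained weights are not reproduced here.
def make_wav2letter_like(num_labels: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv1d(40, 64, kernel_size=11, padding=5),  # 40 filterbank channels (assumed)
        nn.ReLU(),
        nn.Conv1d(64, 64, kernel_size=11, padding=5),
        nn.ReLU(),
        nn.Conv1d(64, num_labels, kernel_size=1),      # per-frame label scores (e.g. for CTC)
    )

# "Pretrained" English model (randomly initialized here, for illustration only).
english_model = make_wav2letter_like(num_labels=28)  # 26 letters + space + blank

# Adapt to German: copy the inner layers, replace only the output layer so it
# covers the larger German grapheme set (umlauts and eszett add labels).
german_model = make_wav2letter_like(num_labels=32)
german_model[:-1].load_state_dict(english_model[:-1].state_dict())

# Optionally freeze the transferred inner layers, so training only makes
# small adaptations in the new output layer.
for p in german_model[:-1].parameters():
    p.requires_grad = False

x = torch.randn(1, 40, 100)   # (batch, features, frames)
out = german_model(x)         # shape: (1, 32, 100)
```

In practice one would then fine-tune with a CTC-style loss on German transcripts; because most weights are reused, far less data and compute are needed than for training from scratch.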

Related research

- Coarse-To-Fine And Cross-Lingual ASR Transfer (09/02/2021)
- Deep Speech Based End-to-End Automated Speech Recognition (ASR) for Indian-English Accents (04/03/2022)
- Transfer Learning Approaches for Streaming End-to-End Speech Recognition System (08/12/2020)
- A transfer learning based approach for pronunciation scoring (11/01/2021)
- Towards Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription (07/20/2022)
- End-to-End Bengali Speech Recognition (09/21/2020)
