Using fine-tuning and min lookahead beam search to improve Whisper

09/19/2023
by Andrea Do, et al.

The performance of Whisper in low-resource languages is still far from perfect. Besides the scarcity of training data for these languages, we identify limitations in the beam search algorithm used in Whisper. To address these issues, we fine-tune Whisper on additional data and propose an improved decoding algorithm. On Vietnamese, fine-tuning Whisper-Tiny with LoRA reduces WER by 38.49 relative to the zero-shot Whisper-Tiny setting, a further reduction of 1.45 compared to full-parameter fine-tuning. Additionally, using the Filter-Ends and Min Lookahead decoding algorithms reduces WER by 2.26 on average across a range of languages compared to standard beam search. These results generalise to larger Whisper model sizes. We also prove a theorem showing that Min Lookahead outperforms the standard beam search algorithm used in Whisper.
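
As a concrete illustration of the fine-tuning setup described above, the sketch below attaches LoRA adapters to Whisper-Tiny using the Hugging Face transformers and peft libraries. The adapter rank, scaling, dropout, and target modules shown here are illustrative assumptions, not the configuration reported in the paper.

```python
# Hedged sketch: LoRA fine-tuning of Whisper-Tiny with transformers + peft.
# Hyperparameters below are illustrative assumptions, not the paper's values.
from transformers import WhisperForConditionalGeneration, WhisperProcessor
from peft import LoraConfig, get_peft_model

model_name = "openai/whisper-tiny"
processor = WhisperProcessor.from_pretrained(model_name, language="vi", task="transcribe")
model = WhisperForConditionalGeneration.from_pretrained(model_name)

# Attach low-rank adapters to the attention projections; base weights stay frozen.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# From here, train with Seq2SeqTrainer or a custom loop on a Vietnamese ASR
# dataset, exactly as one would for full-parameter fine-tuning.
```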
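
The abstract does not spell out the decoding changes, so the following is only a rough sketch of how a lookahead term can be folded into beam search. The scoring rule used here (the minimum per-step log-probability along a short greedy rollout) is our own guess at what "Min Lookahead" might mean, Filter-Ends is not modelled at all, and `next_logprobs` is a hypothetical interface standing in for a real decoder step.

```python
# Hedged sketch of beam search with a pessimistic lookahead bonus, loosely
# inspired by the abstract's "Min Lookahead"; the paper's exact rule may differ.
from typing import Callable, List, Tuple

# A "decoder step": maps a token prefix to candidate (token, log-prob) pairs.
LogProbFn = Callable[[Tuple[int, ...]], List[Tuple[int, float]]]


def greedy_min_logp(prefix: Tuple[int, ...], next_logprobs: LogProbFn,
                    eos: int, depth: int) -> float:
    """Roll out greedily for up to `depth` steps; return the minimum step log-prob."""
    worst = 0.0
    for _ in range(depth):
        token, logp = max(next_logprobs(prefix), key=lambda pair: pair[1])
        worst = min(worst, logp)
        if token == eos:
            break
        prefix = prefix + (token,)
    return worst


def beam_search_with_lookahead(next_logprobs: LogProbFn, bos: int, eos: int,
                               beam_size: int = 5, max_len: int = 50,
                               lookahead: int = 2) -> Tuple[int, ...]:
    beams = [((bos,), 0.0)]            # (token sequence, cumulative log-prob)
    finished: List[Tuple[Tuple[int, ...], float]] = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for token, logp in next_logprobs(seq):
                new_seq, new_score = seq + (token,), score + logp
                if token == eos:
                    finished.append((new_seq, new_score))
                    continue
                # Rank candidates by cumulative score plus a pessimistic
                # lookahead term; only the cumulative score is carried forward.
                bonus = greedy_min_logp(new_seq, next_logprobs, eos, lookahead)
                candidates.append((new_seq, new_score, new_score + bonus))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[2], reverse=True)
        beams = [(seq, score) for seq, score, _ in candidates[:beam_size]]
    finished.extend(beams)
    # Return the best hypothesis under simple length normalisation.
    return max(finished, key=lambda f: f[1] / max(len(f[0]) - 1, 1))[0]
```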
