Few-shot learning with attention-based sequence-to-sequence models

11/08/2018
by Bertrand Higy, et al.

End-to-end approaches have recently become popular as a means of simplifying the training and deployment of speech recognition systems. However, they often require large amounts of data to perform well on large vocabulary tasks. With the aim of making end-to-end approaches usable by a broader range of researchers, we explore the potential to use end-to-end methods in small vocabulary contexts where smaller datasets may be used. A significant drawback of small-vocabulary systems is the difficulty of expanding the vocabulary beyond the original training samples; we therefore also study strategies to extend the vocabulary with only a few examples per new class (few-shot learning). Our results show that an attention-based encoder-decoder can be competitive against a strong baseline on a small vocabulary keyword classification task, reaching 97.5% [...]. [...] shows promising results on the few-shot learning problem, where a simple strategy achieved 34.8% [...] each new class. This score goes up to 80.3% [...].
