Unsupervised vs. transfer learning for multimodal one-shot matching of speech and images

08/14/2020
by Leanne Nortje, et al.

We consider the task of multimodal one-shot speech-image matching. An agent is shown a picture along with a spoken word describing the object in the picture, e.g. cookie, broccoli or ice-cream. After observing one paired speech-image example per class, it is shown a new set of unseen pictures and asked to pick the "ice-cream". Previous work attempted to tackle this problem using transfer learning: supervised models are trained on labelled background data not containing any of the one-shot classes. Here we compare transfer learning to unsupervised models trained on unlabelled in-domain data. On a dataset of paired isolated spoken and visual digits, we specifically compare unsupervised autoencoder-like models to supervised classifier and Siamese neural networks. In both unimodal and multimodal few-shot matching experiments, we find that transfer learning outperforms unsupervised training. We also present experiments towards combining the two methodologies, but find that transfer learning still performs best (despite idealised experiments showing the benefits of unsupervised learning).
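
The matching task described above can be illustrated with a minimal sketch. The abstract does not specify the matching procedure, so the two-step nearest-neighbour lookup below is an assumption: the spoken query is first matched to the closest spoken example in the one-shot support set, and the image paired with that example is then matched to the closest unseen image. The embeddings are assumed to be fixed-dimensional vectors produced by any of the compared models; cosine distance and all function names (`cosine_distance`, `nearest`, `one_shot_match`) are hypothetical choices for illustration.

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest(query, candidates):
    """Index of the candidate with the smallest cosine distance to the query."""
    return min(range(len(candidates)),
               key=lambda i: cosine_distance(query, candidates[i]))

def one_shot_match(spoken_query, support_set, matching_images):
    """Pick the unseen image that best matches a spoken query.

    support_set: list of (speech_embedding, image_embedding) pairs,
                 one pair per class (the one-shot examples).
    matching_images: list of embeddings of the unseen images.
    """
    # Step 1 (speech to speech): find the support example whose spoken
    # word is closest to the query.
    support_speech = [speech for speech, _ in support_set]
    k = nearest(spoken_query, support_speech)
    # Step 2 (image to image): use that example's paired image to find
    # the closest image in the unseen matching set.
    _, paired_image = support_set[k]
    return nearest(paired_image, matching_images)

# Toy embeddings (made up for illustration, not from the paper).
support_set = [
    ([1.0, 0.0], [0.0, 1.0, 0.0]),  # e.g. "cookie"
    ([0.0, 1.0], [1.0, 0.0, 0.0]),  # e.g. "ice-cream"
]
matching_images = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.1], [0.0, 0.0, 1.0]]
query = [0.1, 0.95]  # a new spoken instance of the second class
match = one_shot_match(query, support_set, matching_images)
```

Under this sketch, the comparison in the abstract amounts to swapping the model that produces the embeddings (unsupervised autoencoder-like versus supervised classifier or Siamese network) while the lookup itself stays fixed.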

research
12/10/2020

Direct multimodal few-shot learning of speech and images

We propose direct multimodal few-shot models that learn a shared embeddi...
research
11/09/2018

Multimodal One-Shot Learning of Speech and Images

Imagine a robot is shown new concepts visually together with spoken tags...
research
08/04/2023

Self-Normalizing Neural Network, Enabling One Shot Transfer Learning for Modeling EDFA Wavelength Dependent Gain

We present a novel ML framework for modeling the wavelength-dependent ga...
research
06/20/2023

Visually grounded few-shot word learning in low-resource settings

We propose a visually grounded speech model that learns new words and th...
research
09/04/2020

GPU-based Self-Organizing Maps for Post-Labeled Few-Shot Unsupervised Learning

Few-shot classification is a challenge in machine learning where the goa...
research
12/05/2022

Minimum Class Confusion based Transfer for Land Cover Segmentation in Rural and Urban Regions

Transfer Learning methods are widely used in satellite image segmentatio...
research
10/01/2020

Using ROC and Unlabeled Data for Increasing Low-Shot Transfer Learning Classification Accuracy

One of the most important characteristics of human visual intelligence i...
