Learning Efficient Representations for Keyword Spotting with Triplet Loss

by   Roman Vygon, et al.

In the past few years, triplet loss-based metric embeddings have become a de facto standard for several important computer vision problems, most notably person re-identification. In speech recognition, by contrast, the metric embeddings generated by the triplet loss are rarely used, even for classification problems. We fill this gap by showing that a combination of two representation learning techniques, a triplet loss-based embedding and a variant of kNN for classification instead of cross-entropy loss, significantly (by 26 [...] on LibriSpeech-derived LibriWords datasets. To do so, we propose a novel phonetic similarity-based triplet mining approach. We also match the current best published SOTA for 10+2-class classification on Google Speech Commands dataset V2 with an architecture that is about 6 times more compact, and improve the current best published SOTA for 35-class classification on Google Speech Commands dataset V2 by over 40 [...]
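To make the two ingredients named in the abstract concrete, here is a minimal NumPy sketch of a hinge-style triplet loss and a plain majority-vote kNN classifier over embeddings. It is an illustration under stated assumptions, not the authors' implementation: the margin value, the Euclidean distance, the toy 2-D embeddings, and the function names `triplet_loss` and `knn_predict` are all hypothetical, and the paper's kNN variant and phonetic similarity-based triplet mining are not reproduced here.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Mean hinge triplet loss over a batch of embeddings of shape (batch, dim).

    Assumption: Euclidean distance and margin=0.5 are illustrative choices.
    """
    d_pos = np.linalg.norm(anchor - positive, axis=1)  # anchor to same-class sample
    d_neg = np.linalg.norm(anchor - negative, axis=1)  # anchor to other-class sample
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()


def knn_predict(train_emb, train_labels, query_emb, k=3):
    """Plain kNN: majority vote among the k nearest training embeddings.

    Assumption: labels are non-negative integers; this is ordinary kNN,
    not the specific variant used in the paper.
    """
    # Pairwise distances, shape (num_queries, num_train)
    dists = np.linalg.norm(query_emb[:, None, :] - train_emb[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_labels[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])


# Toy 2-D "embeddings" for two hypothetical keyword classes (0 and 1).
train_emb = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [1.0, 1.0], [0.9, 1.0], [1.0, 0.9],
])
train_labels = np.array([0, 0, 0, 1, 1, 1])
queries = np.array([[0.05, 0.05], [0.95, 0.95]])

preds = knn_predict(train_emb, train_labels, queries, k=3)

# A well-separated triplet incurs no loss; swapping positive and negative does.
loss_good = triplet_loss(train_emb[:1], train_emb[1:2], train_emb[3:4])
loss_bad = triplet_loss(train_emb[:1], train_emb[3:4], train_emb[1:2])
```

In this setup the network would be trained so that triplets look like the `loss_good` case, after which classification reduces to a nearest-neighbour lookup in the embedding space rather than a softmax layer trained with cross-entropy.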




