Tackling Early Sparse Gradients in Softmax Activation Using Leaky Squared Euclidean Distance

11/27/2018
by Wei Shen, et al.

Softmax activation is commonly used to output a probability distribution over categories based on a distance metric. In scenarios such as one-shot learning, the metric is often chosen to be the squared Euclidean distance between the query sample and the category prototype. This practice works well most of the time. However, we find that choosing squared Euclidean distance may cause distance explosion, which makes gradients extremely sparse in the early stage of back propagation. We term this phenomenon the early sparse gradients problem. Although it does not harm the convergence of the model, it may set up a barrier to further model improvement. To tackle this problem, we propose using leaky squared Euclidean distance to impose a restriction on distances. In this way, we can avoid distance explosion and increase the magnitude of gradients. Extensive experiments are conducted on the Omniglot and miniImageNet datasets. We show that using leaky squared Euclidean distance improves one-shot classification accuracy on both datasets.
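The effect described in the abstract can be illustrated with a small sketch. It is not the authors' code, and it does not implement the leaky distance, whose definition is not given in the abstract. It only assumes the common setup in which the logits fed to softmax are negative squared Euclidean distances to each prototype, and the toy vectors, the `scale` parameter, and all function names are hypothetical. When the embeddings are large, the distances explode, softmax saturates, and the cross-entropy gradient with respect to some logits becomes vanishingly small.

```python
import math


def squared_euclidean(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def softmax_over_distances(query, prototypes):
    """Softmax over negative squared distances (assumed logit definition)."""
    logits = [-squared_euclidean(query, p) for p in prototypes]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_gradients(probs, true_index):
    """Gradient of cross-entropy w.r.t. the logits: p_k - 1[k == true]."""
    return [p - (1.0 if k == true_index else 0.0) for k, p in enumerate(probs)]


def demo(scale):
    """Toy query and prototypes (hypothetical), multiplied by `scale`."""
    query = [scale * 1.0, scale * 0.0]
    prototypes = [
        [scale * 0.0, scale * 1.0],   # true class, index 0
        [scale * 1.0, scale * 0.2],
        [scale * -1.0, scale * 0.0],
    ]
    probs = softmax_over_distances(query, prototypes)
    return logit_gradients(probs, true_index=0)


if __name__ == "__main__":
    print("small embeddings:", demo(0.5))
    print("large embeddings:", demo(10.0))
```

With the small scale, every class receives a noticeable gradient. With the large scale, the gradient for the farthest prototype is effectively zero, which is one plausible reading of the sparsity the abstract refers to.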

research
06/10/2022

A new distance measurement and its application in K-Means Algorithm

K-Means clustering algorithm is one of the most commonly used clustering...
research
01/10/2016

On Clustering Time Series Using Euclidean Distance and Pearson Correlation

For time series comparisons, it has often been observed that z-score nor...
research
07/11/2022

Deep Squared Euclidean Approximation to the Levenshtein Distance for DNA Storage

Storing information in DNA molecules is of great interest because of its...
research
10/27/2016

Local Similarity-Aware Deep Feature Embedding

Existing deep embedding methods in vision tasks are capable of learning ...
research
07/15/2020

An Õ(n^5/4) Time ε-Approximation Algorithm for RMS Matching in a Plane

The 2-Wasserstein distance (or RMS distance) is a useful measure of simi...
research
03/21/2021

Hierarchical Representation based Query-Specific Prototypical Network for Few-Shot Image Classification

Few-shot image classification aims at recognizing unseen categories with...