Optimizing Non-decomposable Measures with Deep Networks

01/31/2018
by Amartya Sanyal, et al.

We present a class of algorithms capable of directly training deep neural networks with respect to large families of task-specific performance measures that are structured and non-decomposable, such as the F-measure and the Kullback-Leibler divergence. This is a departure from standard deep learning techniques, which typically train networks with decomposable losses such as the squared or cross-entropy loss. We demonstrate that training directly with task-specific loss functions yields much faster and more stable convergence across problems and datasets. Our proposed algorithms and implementations have several novel features, including (i) convergence to first-order stationary points despite optimizing complex objective functions, (ii) use of fewer training samples to achieve a desired level of convergence, (iii) a substantial reduction in training time, and (iv) seamless integration into existing symbolic gradient frameworks. We implement our techniques on a variety of deep architectures, including multi-layer perceptrons and recurrent neural networks, and show that on a range of benchmark and real-world datasets, our algorithms outperform traditional approaches to training deep networks, as well as some recent approaches to task-specific training of neural networks.
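The abstract does not spell out the algorithms themselves, so for intuition only, here is a minimal PyTorch sketch of one common way to train on a non-decomposable measure: replace the hard counts in the F1 score with predicted probabilities, so the measure becomes differentiable and can be minimized end-to-end. This is an illustration, not the paper's method; the function name soft_f1_loss, the epsilon smoothing, and the toy model are all assumptions.

```python
import torch
import torch.nn as nn

def soft_f1_loss(logits, targets, eps=1e-8):
    """Differentiable surrogate for (1 - F1) on binary labels in {0, 1}."""
    probs = torch.sigmoid(logits)
    tp = (probs * targets).sum()           # soft true positives
    fp = (probs * (1.0 - targets)).sum()   # soft false positives
    fn = ((1.0 - probs) * targets).sum()   # soft false negatives
    f1 = (2.0 * tp) / (2.0 * tp + fp + fn + eps)
    return 1.0 - f1

# Usage: the surrogate plugs in wherever a decomposable loss would go.
model = nn.Linear(10, 1)  # stand-in for a deep network
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 10), torch.randint(0, 2, (32, 1)).float()
loss = soft_f1_loss(model(x), y)
loss.backward()
opt.step()
```

The sketch also shows what "non-decomposable" means in practice: because the F-measure couples predictions across examples through shared counts, it cannot be written as a sum of per-example losses; the surrogate sidesteps this by computing the soft counts over the whole batch before forming the ratio.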
