Two-temperature logistic regression based on the Tsallis divergence

05/19/2017
by Ehsan Amid, et al.

We develop a variant of multiclass logistic regression that achieves three properties: i) we minimize a non-convex surrogate loss which makes the method robust to outliers, ii) our method allows transitioning between non-convex and convex losses by the choice of the parameters, iii) the surrogate loss is Bayes consistent, even in the non-convex case. The algorithm has one weight vector per class, and the surrogate loss is a function of the linear activations (one per class). The surrogate loss of an example with linear activation vector a and class c has the form -log_{t_1} exp_{t_2}(a_c - G_{t_2}(a)), where the two temperatures t_1 and t_2 "temper" the log and the exp, respectively, and G_{t_2} is a generalization of the log-partition function. We motivate this loss using the Tsallis divergence. As the temperature of the logarithm becomes smaller than the temperature of the exponential, the surrogate loss becomes "more quasi-convex". Various tunings of the temperatures recover previous methods, and tuning the degree of non-convexity is crucial in the experiments. The choice t_1 < 1 and t_2 > 1 performs best experimentally. We explain this by showing that t_1 < 1 caps the surrogate loss and t_2 > 1 makes the predictive distribution heavy-tailed.
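To make the loss concrete, here is a minimal NumPy sketch. This is not the authors' code: the function names are my own, the 30-iteration budget is an arbitrary choice, and the fixed-point computation of the normalizer G_{t_2} follows the scheme described in the companion bi-tempered loss paper listed under related research below.

import numpy as np

def tempered_log(x, t):
    # log_t(x) = (x^(1 - t) - 1) / (1 - t); recovers log(x) as t -> 1.
    if t == 1.0:
        return np.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def tempered_exp(x, t):
    # exp_t(x) = [1 + (1 - t) x]_+^(1 / (1 - t)); recovers exp(x) as t -> 1.
    if t == 1.0:
        return np.exp(x)
    return np.maximum(0.0, 1.0 + (1.0 - t) * x) ** (1.0 / (1.0 - t))

def log_partition(a, t, n_iter=30):
    # Normalizer G_t(a) satisfying sum_c exp_t(a_c - G_t(a)) = 1, found by
    # fixed-point iteration; valid for t > 1, the regime used below.
    mu = np.max(a)
    a_shifted = a - mu
    for _ in range(n_iter):
        z = np.sum(tempered_exp(a_shifted, t))
        a_shifted = z ** (1.0 - t) * (a - mu)
    z = np.sum(tempered_exp(a_shifted, t))
    return -tempered_log(1.0 / z, t) + mu

def two_temperature_loss(a, c, t1, t2):
    # Surrogate loss -log_{t1} exp_{t2}(a_c - G_{t2}(a)) for linear
    # activation vector a and correct class index c.
    return -tempered_log(tempered_exp(a[c] - log_partition(a, t2), t2), t1)

# Activations for 3 classes, true class 0, in the paper's preferred
# regime t1 < 1 (bounded loss) and t2 > 1 (heavy-tailed predictions).
a = np.array([2.0, -1.0, 0.5])
print(two_temperature_loss(a, c=0, t1=0.7, t2=1.3))

The sketch also makes the two claims at the end of the abstract easy to see: since exp_{t_2} is nonnegative and log_{t_1}(0) = -1/(1 - t_1) is finite for t_1 < 1, the surrogate loss can never exceed 1/(1 - t_1), and for t_2 > 1 the tempered exponential decays polynomially rather than exponentially, which gives the predictive distribution its heavy tail.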


Related research

- Robust Bi-Tempered Logistic Loss Based on Bregman Divergences (06/08/2019): We introduce a temperature into the exponential function and replace the...
- Logitron: Perceptron-augmented classification model based on an extended logistic loss function (04/05/2019): Classification is the most important process in data analysis. However, ...
- Consistent Robust Adversarial Prediction for General Multiclass Classification (12/18/2018): We propose a robust adversarial prediction framework for general multicl...
- Smoothly Giving up: Robustness for Simple Models (02/17/2023): There is a growing need for models that are interpretable and have reduc...
- A General Theory for Structured Prediction with Smooth Convex Surrogates (02/05/2019): In this work we provide a theoretical framework for structured predictio...
- Bregman-divergence-guided Legendre exponential dispersion model with finite cumulants (K-LED) (10/04/2019): Exponential dispersion model is a useful framework in machine learning a...
- Nonconvex Extension of Generalized Huber Loss for Robust Learning and Pseudo-Mode Statistics (02/22/2022): We propose an extended generalization of the pseudo Huber loss formulati...
