Binary Choice with Asymmetric Loss in a Data-Rich Environment: Theory and an Application to Racial Justice

10/16/2020
by Andrii Babii, et al.

The importance of asymmetries in prediction problems arising in economics has long been recognized. In this paper, we focus on binary choice problems in a data-rich environment with general loss functions. In contrast to asymmetric regression problems, binary choice with general loss functions and high-dimensional datasets is challenging and not well understood. Econometricians have studied binary choice problems for a long time, but the literature does not offer computationally attractive solutions in data-rich environments. The machine learning literature, by contrast, offers many computationally attractive algorithms that form the basis for many of the automated procedures implemented in practice, but it focuses on symmetric loss functions that are independent of individual characteristics. One of the main contributions of our paper is to show that theoretically valid predictions of binary outcomes with arbitrary loss functions can be achieved via a very simple reweighting of logistic regression or of other state-of-the-art machine learning techniques, such as boosting or (deep) neural networks. We apply our analysis to racial justice in pretrial detention.
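
The reweighting idea can be illustrated with a minimal sketch. The abstract does not state the paper's exact weighting scheme, so what follows is the standard cost-sensitive weighting, used here as an assumption: each positive observation is weighted by a false-negative cost c_fn and each negative observation by a false-positive cost c_fp. Under this weighting, thresholding the fitted probability at 1/2 corresponds to thresholding the true probability at c_fp / (c_fp + c_fn). The function names, the simulated data, and the cost values are all hypothetical. The costs are constants here, but they could equally be set per observation.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def loss_weights(y, c_fp, c_fn):
    # Assumed weighting: positives carry the false-negative cost,
    # negatives carry the false-positive cost.
    return [c_fn if yi == 1 else c_fp for yi in y]


def fit_weighted_logit(x, y, w, lr=0.5, steps=2000):
    # Plain gradient descent on the weighted logistic loss
    # for a model with an intercept and a single feature.
    b0, b1 = 0.0, 0.0
    total = sum(w)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi, wi in zip(x, y, w):
            r = wi * (sigmoid(b0 + b1 * xi) - yi)
            g0 += r
            g1 += r * xi
        b0 -= lr * g0 / total
        b1 -= lr * g1 / total
    return b0, b1


def predict(x, b0, b1):
    # Predict 1 when the fitted (reweighted) probability exceeds 1/2.
    return [1 if sigmoid(b0 + b1 * xi) > 0.5 else 0 for xi in x]


# Simulated data from a logistic model (illustrative only).
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(300)]
y = [1 if random.random() < sigmoid(-0.5 + 1.5 * xi) else 0 for xi in x]

# Symmetric costs versus a false negative that is five times as costly.
sym = predict(x, *fit_weighted_logit(x, y, loss_weights(y, 1.0, 1.0)))
asym = predict(x, *fit_weighted_logit(x, y, loss_weights(y, 1.0, 5.0)))
print("predicted positives (symmetric): ", sum(sym))
print("predicted positives (asymmetric):", sum(asym))
```

Raising the false-negative cost pushes the fitted decision rule toward predicting the positive class more often. The same weights could in principle be passed to any learner that accepts per-observation weights.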


