Risk Aversion In Learning Algorithms and an Application To Recommendation Systems

05/10/2022
by   Andreas Haupt, et al.

Consider a bandit learning environment. We demonstrate that popular learning algorithms such as Upper Confidence Bound (UCB) and ε-Greedy exhibit risk aversion: when presented with two arms of the same expectation but different variance, the algorithms tend not to choose the riskier, i.e., higher-variance, arm. We prove that ε-Greedy chooses the risky arm with probability tending to 0 when faced with a deterministic arm and a Rademacher-distributed arm. We show experimentally that UCB also behaves risk-aversely, and that risk aversion persists in early rounds of learning even when the riskier arm has a slightly higher expectation. We calibrate our model to a recommendation system and show that algorithmic risk aversion can decrease consumer surplus and increase homogeneity. We discuss extensions to other bandit algorithms and to reinforcement learning, and investigate the implications of algorithmic risk aversion for decision theory.
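The risk-averse dynamic described for ε-Greedy can be illustrated with a small simulation. The sketch below is an illustration under assumed parameters (fixed ε = 0.1, sample-mean estimates, random tie-breaking), not the paper's experimental setup: it pits a deterministic arm paying 0 against a Rademacher arm paying ±1 with equal probability. Both arms have mean 0, yet the learner pulls the risky arm well under half the time, because a run of −1s drives the risky arm's estimate below the safe arm's and only occasional exploration revisits it.

```python
import numpy as np

def eps_greedy_risky_fraction(T=1000, eps=0.1, runs=200, seed=0):
    """Average fraction of rounds an epsilon-greedy learner pulls the
    risky (Rademacher, +/-1) arm when the alternative is a deterministic
    arm paying 0.  Both arms have mean 0; all parameters are illustrative."""
    rng = np.random.default_rng(seed)
    risky_pulls = 0
    for _ in range(runs):
        q = np.zeros(2)   # sample-mean reward estimates (arm 0 safe, arm 1 risky)
        n = np.zeros(2)   # pull counts
        for _ in range(T):
            if rng.random() < eps:
                a = int(rng.integers(2))  # explore: pick an arm uniformly
            else:
                # exploit: greedy choice with random tie-breaking
                a = int(rng.choice(np.flatnonzero(q == q.max())))
            r = 0.0 if a == 0 else (1.0 if rng.random() < 0.5 else -1.0)
            n[a] += 1
            q[a] += (r - q[a]) / n[a]     # incremental sample-mean update
            risky_pulls += (a == 1)
    return risky_pulls / (runs * T)
```

Despite equal means, the returned fraction sits well below 0.5 with these settings: negative excursions of the risky arm's sample mean are escaped only through exploration, so the learner lingers on the safe arm. The paper's theorem concerns the limiting choice probability; this sketch only exhibits the qualitative asymmetry.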

Related research

- Risk-averse Contextual Multi-armed Bandit Problem with Linear Payoffs (06/24/2022)
- Communication-Efficient Collaborative Best Arm Identification (08/18/2022)
- The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation (06/04/2019)
- Rebounding Bandits for Modeling Satiation Effects (11/13/2020)
- Adversarial Sleeping Bandit Problems with Multiple Plays: Algorithm and Ranking Application (07/27/2023)
- Functional Bandits (05/10/2014)
- The Typical Behavior of Bandit Algorithms (10/11/2022)
