Adaptive prior probabilities via optimization of risk and entropy

03/18/2018
by Armen E. Allahverdyan, et al.

An agent choosing between several actions tends to take the one with the lowest loss. But this choice is arguably too rigid (not adaptive) to be useful in complex situations, e.g. where the exploration-exploitation trade-off is relevant, or in creative task solving. Here we study an agent that, given a certain average utility invested into adaptation, chooses his actions via probabilities obtained through optimizing the entropy. As we argue, entropy minimization corresponds to a risk-averse agent, whereas a risk-seeking agent will maximize the entropy. The entropy minimization can (under certain conditions) recover the epsilon-greedy probabilities known in reinforcement learning. We show that entropy minimization, in contrast to its maximization, leads to rudimentary forms of intelligent behavior: (i) The agent accounts for extreme events, especially when he did not invest much into adaptation. (ii) He chooses the action with the lesser loss (the lesser of two evils) when confronted with two actions with comparable losses. (iii) The agent is subject to effects similar to cognitive dissonance and frustration. None of these features is shown by the risk-seeking agent, whose probabilities are given by the maximum entropy. Mathematically, the difference between entropy maximization and its minimization corresponds to maximizing a concave function over a convex domain (convex programming) versus minimizing it (concave programming).
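
The contrast between the two optimization problems in the abstract can be illustrated with a minimal Python sketch. It is not the paper's formulation: it assumes a simplified setup in which the only constraints are normalization and a fixed average loss, and the names `max_entropy_probs`, `min_entropy_probs`, and the example losses are hypothetical. Under these assumptions the entropy maximizer has the Gibbs form, while the entropy minimizer sits at a vertex of the feasible polytope, because a concave function attains its minimum over a polytope at an extreme point. With two equality constraints, such a vertex has at most two nonzero probabilities. The sketch does not reproduce the epsilon-greedy result, which the abstract says holds only under certain conditions.

```python
import math
from itertools import combinations


def entropy(p):
    """Shannon entropy in nats; zero-probability terms contribute 0."""
    return -sum(x * math.log(x) for x in p if x > 0)


def gibbs(losses, beta):
    """Gibbs weights p_i proportional to exp(-beta * loss_i), computed stably."""
    e = [-beta * l for l in losses]
    m = max(e)
    w = [math.exp(x - m) for x in e]
    z = sum(w)
    return [x / z for x in w]


def max_entropy_probs(losses, avg_loss, iters=200):
    """Maximum entropy at fixed average loss (assumed equality constraint).

    Assumes min(losses) < avg_loss < max(losses) and that the required
    beta lies in [-50, 50]. The mean loss decreases with beta, so
    bisection on beta is enough.
    """
    lo, hi = -50.0, 50.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        mean = sum(p * l for p, l in zip(gibbs(losses, mid), losses))
        if mean > avg_loss:
            lo = mid  # mean too high: increase beta
        else:
            hi = mid
    return gibbs(losses, 0.5 * (lo + hi))


def min_entropy_probs(losses, avg_loss):
    """Minimum entropy at fixed average loss (assumed equality constraint).

    Enumerates the vertices of the feasible polytope: each has support
    on at most two actions i, j whose mixture reproduces avg_loss.
    """
    n = len(losses)
    best = None
    for i, j in combinations(range(n), 2):
        if losses[i] == losses[j]:
            continue
        q = (losses[j] - avg_loss) / (losses[j] - losses[i])  # weight on i
        if 0.0 <= q <= 1.0:
            p = [0.0] * n
            p[i], p[j] = q, 1.0 - q
            if best is None or entropy(p) < entropy(best):
                best = p
    return best


# Hypothetical example: two comparable losses and one extreme loss.
example_losses = [1.0, 2.0, 10.0]
p_max = max_entropy_probs(example_losses, avg_loss=3.0)
p_min = min_entropy_probs(example_losses, avg_loss=3.0)
print("max entropy:", p_max, entropy(p_max))
print("min entropy:", p_min, entropy(p_min))
```

In this toy setting the maximum-entropy solution spreads probability over all actions, whereas the minimum-entropy solution concentrates on at most two of them. Which two it picks depends on the losses and on the assumed constraint.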


research
03/11/2021

Understanding the origin of information-seeking exploration in probabilistic objectives for control

The exploration-exploitation trade-off is central to the description of ...
research
07/10/2020

Generalized Maximum Entropy for Supervised Classification

The maximum entropy principle advocates to evaluate events' probabilitie...
research
07/25/2018

Variational Bayesian Reinforcement Learning with Regret Bounds

We consider the exploration-exploitation trade-off in reinforcement lear...
research
02/05/2020

Entropy Minimization vs. Diversity Maximization for Domain Adaptation

Entropy minimization has been widely used in unsupervised domain adaptat...
research
03/12/2022

Maximization of Mathai's Entropy under the Constraints of Generalized Gini and Gini mean difference indices and its Applications in Insurance

Statistical Physics, Diffusion Entropy Analysis and Information Theory c...
research
08/28/2023

Entropy-Based Strategies for Multi-Bracket Pools

Much work in the March Madness literature has discussed how to estimate ...
research
12/16/2021

High-dimensional logistic entropy clustering

Minimization of the (regularized) entropy of classification probabilitie...
