Online Convex Optimization Perspective for Learning from Dynamically Revealed Preferences

by Violet Xinying Chen, et al.

We study the problem of online learning (OL) from revealed preferences: a learner wishes to learn an agent's private utility function by observing the agent's utility-maximizing actions in a changing environment. We adopt an online inverse optimization setup, in which the learner observes a stream of the agent's actions in an online fashion and the learning performance is measured by the regret associated with a loss function. Due to the inverse optimization component, establishing convexity is difficult for all of the usual loss functions in the literature. We address this challenge by designing a new loss function that is convex under relatively mild assumptions. Moreover, we establish that the regret with respect to our new loss function also bounds the regret with respect to all other usual loss functions. This allows us to design a flexible OL framework that enables a unified treatment of loss functions and supports a variety of online convex optimization algorithms. We demonstrate, with theoretical and empirical evidence, that our framework based on the new loss function (in particular, online Mirror Descent) has significant advantages over other OL algorithms from the literature in terms of eliminating technical assumptions, regret performance, and solution time.
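The framework above plugs a convex loss into standard online convex optimization algorithms such as online Mirror Descent. As a rough illustration only (this is not the paper's loss function or feasible set), the sketch below runs entropic online mirror descent on the probability simplex; the per-round gradient oracle `grad_fn` and the step size `eta` are hypothetical placeholders standing in for whatever loss the learner adopts:

```python
import numpy as np

def online_mirror_descent(grad_fn, T, d, eta=0.1):
    """Entropic online mirror descent on the probability simplex.

    grad_fn(t, x) returns a (sub)gradient of the round-t loss at x.
    Returns the list of iterates x_0, ..., x_T.
    """
    x = np.full(d, 1.0 / d)  # start at the uniform distribution
    iterates = [x.copy()]
    for t in range(T):
        g = grad_fn(t, x)
        # Exponentiated-gradient step (mirror step under the entropy
        # mirror map), followed by normalization back onto the simplex.
        w = x * np.exp(-eta * g)
        x = w / w.sum()
        iterates.append(x.copy())
    return iterates
```

With linear losses this is the classical multiplicative-weights update; the average regret of such a scheme decays at the usual O(1/sqrt(T)) rate under bounded gradients, which is the kind of guarantee the abstract's convexity result makes available.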




