Adaptive Decision-Making with Constraints and Dependent Losses: Performance Guarantees and Applications to Online and Nonlinear Identification
We consider adaptive decision-making problems where an agent optimizes a cumulative performance objective by repeatedly choosing among a finite set of options. Compared to the classical prediction-with-expert-advice set-up, we consider situations where losses are constrained and derive algorithms that exploit the additional structure in optimal and computationally efficient ways. Our algorithm and our analysis is instance dependent, that is, suboptimal choices of the environment are exploited and reflected in our regret bounds. The constraints handle general dependencies between losses (even across time), and are flexible enough to also account for a loss budget, which the environment is not allowed to exceed. The performance of the resulting algorithms is highlighted in two numerical examples, which include a nonlinear and online system identification task.
READ FULL TEXT