
Asynchronous Stochastic Optimization Robust to Arbitrary Delays
We consider stochastic optimization with delayed gradients where, at eac...
Minimax Regret for Stochastic Shortest Path
We study the Stochastic Shortest Path (SSP) problem in which an agent ha...
Online Markov Decision Processes with Aggregate Bandit Feedback
We study a novel variant of online finite-horizon Markov Decision Proces...
Near-optimal Regret Bounds for Stochastic Shortest Path
Stochastic shortest path (SSP) is a well-known problem in planning and c...
Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
We consider the problem of learning in Linear Quadratic Control systems ...
Apprenticeship Learning via Frank-Wolfe
We consider the applications of the Frank-Wolfe (FW) algorithm for Appre...
Average reward reinforcement learning with unknown mixing times
We derive and analyze learning algorithms for policy evaluation, apprent...
Learning Linear-Quadratic Regulators Efficiently with only √(T) Regret
We present the first computationally-efficient algorithm with O(√(T)) r...
Learning and Generalization for Matching Problems
We study a classic algorithmic problem through the lens of statistical l...
Incentivizing the Dynamic Workforce: Learning Contracts in the Gig-Economy
In principal-agent models, a principal offers a contract to an agent to ...
Online Linear Quadratic Control
We study the problem of controlling linear time-invariant systems with k...
Planning and Learning with Stochastic Action Sets
In many practical uses of reinforcement learning (RL) the set of actions...
Online Learning with Feedback Graphs Without the Graphs
We study an online learning framework introduced by Mannor and Shamir (2...
Alon Cohen
