
Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √(T) Regret
We consider the task of learning to control a linear dynamical system un...

The Pendulum Arrangement: Maximizing the Escape Time of Heterogeneous Random Walks
We identify a fundamental phenomenon of heterogeneous one dimensional ra...

Bandit Linear Control
We consider the problem of controlling a known linear dynamical system u...

Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
We consider the problem of learning in Linear Quadratic Control systems ...

A General Approach to Multi-Armed Bandits Under Risk Criteria
Different risk-related criteria have received recent interest in learnin...
Asaf Cassel