
V-Formation via Model Predictive Control
We present recent results that demonstrate the power of viewing the prob...

Robust Batch Policy Learning in Markov Decision Processes
We study the sequential decision making problem in Markov decision proce...

Nearly Minimax Optimal Regret for Learning Infinite-Horizon Average-Reward MDPs with Linear Function Approximation
We study reinforcement learning in an infinite-horizon average-reward se...

A Sample-Efficient Algorithm for Episodic Finite-Horizon MDP with Constraints
Constrained Markov Decision Processes (CMDPs) formalize sequential decis...

Recurrent Model Predictive Control
This paper proposes an offline algorithm, called Recurrent Model Predic...

Adaptive Neighborhood Resizing for Stochastic Reachability in Multi-Agent Systems
We present DAMPC, a distributed, adaptive-horizon and adaptive-neighborh...

Estimating action plans for smart poultry houses
In poultry farming, the systematic choice, update, and implementation of...
ARES: Adaptive Receding-Horizon Synthesis of Optimal Plans
We introduce ARES, an efficient approximation algorithm for generating optimal plans (action sequences) that take an initial state of a Markov Decision Process (MDP) to a state whose cost is below a specified (convergence) threshold. ARES uses Particle Swarm Optimization, with adaptive sizing for both the receding horizon and the particle swarm. Inspired by Importance Splitting, the length of the horizon and the number of particles are chosen such that at least one particle reaches a next-level state, that is, a state where the cost decreases by a required delta from the previous-level state. The level relation on states and the plans constructed by ARES implicitly define a Lyapunov function and an optimal policy, respectively, both of which could be explicitly generated by applying ARES to all states of the MDP, up to some topological equivalence relation. We also assess the effectiveness of ARES by statistically evaluating its rate of success in generating optimal plans. The ARES algorithm resulted from our desire to clarify whether flying in V-formation is a flocking policy that optimizes energy conservation, clear view, and velocity alignment; that is, we were interested in whether one could find optimal plans that bring a flock from an arbitrary initial state to a state exhibiting a single connected V-formation. For flocks with 7 birds, ARES is able to generate a plan that leads to a V-formation in 95% of the tested initial configurations, within 63 seconds on average. ARES can also be easily customized into a model-predictive controller (MPC) with an adaptive receding horizon and statistical guarantees of convergence. To the best of our knowledge, our adaptive-sizing approach is the first to provide convergence guarantees in receding-horizon techniques.
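The abstract's core loop (pick a horizon and swarm size, run PSO until some particle reaches a next-level state whose cost drops by the required delta, then commit that plan segment and repeat) can be sketched in a few lines. This is a minimal illustration on a toy point-mass system, not the paper's flocking model: the `cost` and `step` functions, the PSO hyperparameters, and the geometric delta schedule are all illustrative assumptions.

```python
import numpy as np

def cost(state):
    # Toy cost: distance from the origin (stands in for the V-formation
    # fitness combining energy, clear view, and velocity alignment).
    return float(np.linalg.norm(state))

def step(state, action):
    # Toy dynamics: the action directly nudges the state.
    return state + action

def pso_plan(state, horizon, n_particles, rng, iters=30):
    """PSO over action sequences of length `horizon`; returns (plan, cost)."""
    def plan_cost(plan):
        s = state
        for a in plan:
            s = step(s, a)
        return cost(s)

    pos = rng.uniform(-1.0, 1.0, (n_particles, horizon, state.size))
    vel = np.zeros_like(pos)
    pbest, pbest_cost = pos.copy(), np.array([plan_cost(p) for p in pos])
    g = pbest[pbest_cost.argmin()].copy()
    g_cost = pbest_cost.min()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (g - pos)
        pos = pos + vel
        costs = np.array([plan_cost(p) for p in pos])
        better = costs < pbest_cost
        pbest[better], pbest_cost[better] = pos[better], costs[better]
        if costs.min() < g_cost:
            g, g_cost = pos[costs.argmin()].copy(), costs.min()
    return g, g_cost

def ares_sketch(state, threshold=0.1, delta_frac=0.25, max_grows=8, rng=None):
    """Level-based planning: each level must lower the cost by `delta`."""
    rng = np.random.default_rng(0) if rng is None else rng
    plan = []
    while cost(state) > threshold:
        level = cost(state)
        delta = delta_frac * level            # required per-level cost decrease
        horizon, particles = 1, 20
        for _ in range(max_grows):
            best, best_cost = pso_plan(state, horizon, particles, rng)
            if best_cost <= level - delta:    # a particle reached the next level
                break
            horizon += 1                      # adaptively grow the horizon ...
            particles *= 2                    # ... and the swarm
        else:
            raise RuntimeError("failed to reach the next level")
        for a in best:                        # commit the level's plan segment
            state = step(state, a)
        plan.extend(best)
    return plan, state
```

Because each committed segment is required to cut the cost by a fixed fraction, the sketch inherits the geometric progress that makes the level relation behave like a Lyapunov function; the paper's statistical convergence guarantee is obtained differently, by evaluating the success rate over many runs.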