
Reinforcement Learning for Datacenter Congestion Control
We approach the task of network congestion control in datacenters using ...
Acting in Delayed Environments with Non-Stationary Markov Policies
The standard Markov Decision Process (MDP) formulation hinges on the ass...
The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems
With deep reinforcement learning (RL) methods achieving results that exc...
A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound
Policy evaluation in reinforcement learning is often conducted using two...
How to Combine Tree-Search Methods in Reinforcement Learning
Finite-horizon lookahead policies are abundantly used in Reinforcement L...
Multiple-Step Greedy Policies in Online and Approximate Reinforcement Learning
Multiple-step lookahead policies have demonstrated high empirical compet...
Beyond the One Step Greedy Approach in Reinforcement Learning
The famous Policy Iteration algorithm alternates between policy improvem...
Safe Exploration in Continuous Action Spaces
We address the problem of deploying a reinforcement learning (RL) agent ...
Chance-Constrained Outage Scheduling using a Machine Learning Proxy
Outage scheduling aims at defining, over a horizon of several months to ...
Finite Sample Analyses for TD(0) with Function Approximation
TD(0) is one of the most commonly used algorithms in reinforcement learn...
Unit Commitment using Nearest Neighbor as a Short-Term Proxy
We devise the Unit Commitment Nearest Neighbor (UCNN) algorithm to be us...
Hierarchical Decision Making In Electricity Grid Management
The power grid is a complex and vital system that necessitates careful r...
