Neural Inventory Control in Networks via Hindsight Differentiable Policy Optimization

06/20/2023
by Matias Alvo, et al.

Inventory management offers unique opportunities for reliably evaluating and applying deep reinforcement learning (DRL). Rather than evaluating DRL algorithms only against one another or against human experts, we can compare them to the optimum itself in several problem classes with hidden structure. Our DRL methods consistently recover near-optimal policies in these settings, even when given raw state vectors of up to 600 dimensions. In other settings, they can vastly outperform problem-specific heuristics. To apply DRL reliably, we leverage two insights. First, one can directly optimize the hindsight performance of any policy using stochastic gradient descent. This relies on (i) the ability to backtest any policy's performance on a subsample of historical demand observations, and (ii) the differentiability of the total cost incurred on any subsample with respect to the policy parameters. Second, we propose a natural neural network architecture for problems with weak (or aggregate) coupling constraints between locations in an inventory network. This architecture employs weight duplication for “sibling” locations in the network, together with state summarization. We justify this architecture through an asymptotic guarantee, and empirically affirm its value in handling large-scale problems.
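The first insight can be illustrated with a deliberately small sketch. It is not the authors' implementation: the neural policy is swapped for a one-parameter base-stock policy at a single location with backlogged demand and zero lead time, and the gradient is a hand-derived pathwise derivative rather than the output of an automatic differentiation library. The cost coefficients, the uniform demand history, and the names `backtest` and `train` are all assumptions made for illustration.

```python
import random

HOLDING_COST = 1.0   # assumed per-unit holding cost (illustrative)
BACKLOG_COST = 9.0   # assumed per-unit backlog penalty (illustrative)

def backtest(base_stock, demands):
    """Replay one demand trace; return (average cost, d cost / d base_stock)."""
    inventory, d_inventory = 0.0, 0.0  # state and its derivative w.r.t. base_stock
    total_cost, total_grad = 0.0, 0.0
    for demand in demands:
        # Order up to the base-stock level: level = max(base_stock, inventory).
        if base_stock >= inventory:
            level, d_level = base_stock, 1.0
        else:
            level, d_level = inventory, d_inventory
        inventory, d_inventory = level - demand, d_level
        # Piecewise-linear period cost, differentiated along the sample path.
        if inventory >= 0:
            total_cost += HOLDING_COST * inventory
            total_grad += HOLDING_COST * d_inventory
        else:
            total_cost += -BACKLOG_COST * inventory
            total_grad += -BACKLOG_COST * d_inventory
    n = len(demands)
    return total_cost / n, total_grad / n

def train(history, horizon=20, steps=3000, lr=0.05, seed=0):
    """SGD on hindsight cost, using random windows of the demand history."""
    rng = random.Random(seed)
    base_stock = 0.0
    for _ in range(steps):
        start = rng.randrange(len(history) - horizon + 1)
        _, grad = backtest(base_stock, history[start:start + horizon])
        base_stock -= lr * grad
    return base_stock

# Synthetic "historical" demand, uniform on [0, 10] (an assumption).
demand_rng = random.Random(1)
history = [demand_rng.uniform(0.0, 10.0) for _ in range(2000)]
learned = train(history)
print(f"learned base-stock level: {learned:.2f}")
```

In this toy setting the optimal level is known in closed form: it is the 9 / (9 + 1) = 0.9 quantile of demand, about 9 for the assumed distribution. The learned level can therefore be checked against the optimum itself, which mirrors the evaluation idea described in the abstract.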

