Interpretable Reinforcement Learning via Neural Additive Models for Inventory Management

by Julien Siems, et al.

The COVID-19 pandemic has highlighted the importance of supply chains and the role of digital management in reacting to dynamic changes in the environment. In this work, we focus on developing dynamic inventory ordering policies for a multi-echelon, i.e. multi-stage, supply chain. Traditional inventory optimization methods aim to determine a static reordering policy. As a result, these policies cannot adjust to dynamic changes such as those observed during the COVID-19 crisis. On the other hand, conventional strategies offer the advantage of being interpretable, which is a crucial feature for supply chain managers who need to communicate decisions to their stakeholders. To overcome the rigidity of static policies without giving up this advantage, we propose an interpretable reinforcement learning approach that aims to be as interpretable as the traditional static policies while being as flexible and environment-agnostic as other deep learning-based reinforcement learning solutions. We propose to use Neural Additive Models as an interpretable dynamic policy of a reinforcement learning agent, and show that this approach is competitive with a standard fully connected policy. Finally, we use the interpretability property to gain insights into a complex ordering strategy for a simple, linear three-echelon inventory supply chain.
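To illustrate why a Neural Additive Model lends itself to interpretation, the following is a minimal sketch of an additive policy, not the authors' implementation. It assumes a hypothetical state of three scalar features (one inventory level per echelon), a single scalar order quantity as the action, and untrained random weights; the names FeatureNet and NAMPolicy and the hidden-layer size are illustrative choices, and no reinforcement learning training loop is shown.

```python
import math
import random


class FeatureNet:
    """Tiny one-hidden-layer network mapping ONE scalar feature to a scalar."""

    def __init__(self, hidden, rng):
        # Random, untrained weights; a real agent would learn these.
        self.w1 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
        self.b1 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
        self.w2 = [rng.gauss(0.0, 1.0) / hidden for _ in range(hidden)]

    def __call__(self, x):
        return sum(
            w2 * math.tanh(w1 * x + b1)
            for w1, b1, w2 in zip(self.w1, self.b1, self.w2)
        )


class NAMPolicy:
    """Additive policy: action = bias + sum of per-feature contributions."""

    def __init__(self, n_features, hidden=8, seed=0):
        rng = random.Random(seed)
        self.nets = [FeatureNet(hidden, rng) for _ in range(n_features)]
        self.bias = 0.0

    def contributions(self, state):
        # Each feature passes through its own network, so its effect on
        # the action can be inspected (or plotted) in isolation.
        return [net(x) for net, x in zip(self.nets, state)]

    def act(self, state):
        return self.bias + sum(self.contributions(state))


# Hypothetical state: inventory levels at the three echelons.
policy = NAMPolicy(n_features=3)
state = [10.0, 5.0, 2.0]
for i, c in enumerate(policy.contributions(state)):
    print(f"echelon {i} contribution: {c:+.4f}")
print(f"order quantity (untrained): {policy.act(state):+.4f}")
```

Because the features never interact inside the model, changing one inventory level changes only that feature's contribution. Evaluating each per-feature network over a range of inputs gives a one-dimensional curve that can be read directly, which is the property the abstract relies on for interpretability.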



MARLIM: Multi-Agent Reinforcement Learning for Inventory Management

Maintaining a balance between the supply and demand of products by optim...

Cooperative Multi-Agent Reinforcement Learning for Inventory Management

With Reinforcement Learning (RL) for inventory management (IM) being a n...

Deep Reinforcement Learning for a Two-Echelon Supply Chain with Seasonal Demand

This paper leverages recent developments in reinforcement learning and d...

The Role of Intent-Based Networking in ICT Supply Chains

The evolution towards Industry 4.0 is driving the need for innovative so...

Interpretable Reinforcement Learning with Ensemble Methods

We propose to use boosted regression trees as a way to compute human-int...

Identifying contributors to supply chain outcomes in a multi-echelon setting: a decentralised approach

Organisations often struggle to identify the causes of change in metrics...

Calculus of Consent via MARL: Legitimating the Collaborative Governance Supplying Public Goods

Public policies that supply public goods, especially those involve colla...
