Smart Containers With Bidding Capacity: A Policy Gradient Algorithm for Semi-Cooperative Learning

05/01/2020
by Wouter van Heeswijk, et al.

Smart modular freight containers, as envisioned in the Physical Internet paradigm, are equipped with sensors, data storage capability and intelligence that enable them to route themselves from origin to destination without manual intervention or central governance. In this self-organizing setting, containers can autonomously place bids on transport services in a spot market. However, individual containers may find it difficult to learn good bidding policies because their observations are limited. By sharing information and costs with one another, smart containers can jointly learn bidding policies, even while competing for the same transport capacity. We replicate this behavior by learning stochastic bidding policies in a semi-cooperative multi-agent setting. To this end, we develop a reinforcement learning algorithm based on the policy gradient framework. Numerical experiments show that sharing only bids and acceptance decisions leads to stable bidding policies. Additional system information improves performance only marginally; individual job properties suffice to place appropriate bids. Furthermore, we find that carriers may have incentives not to share information with the smart containers. The experiments suggest several directions for follow-up research, in particular on the interaction between smart containers and transport services in self-organizing logistics.
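To make the idea of a stochastic bidding policy learned by policy gradient more concrete, here is a minimal single-agent sketch. It is not the authors' algorithm: the Gaussian policy with a fixed standard deviation, the single job feature (`distance`), the carrier's acceptance rule, the rejection penalty and every function name are assumptions chosen purely for illustration, and the semi-cooperative sharing between containers is left out.

```python
import math
import random

SIGMA = 0.5  # fixed exploration noise of the policy (assumed value)


def mean_bid(theta, distance):
    """Mean of the Gaussian bidding policy, linear in one job feature."""
    return theta[0] + theta[1] * distance


def grad_log_prob(theta, distance, bid):
    """Gradient of log N(bid; mean_bid, SIGMA^2) with respect to theta."""
    scale = (bid - mean_bid(theta, distance)) / SIGMA ** 2
    return [scale, scale * distance]


def reward(bid, distance):
    """Toy reward (assumption): the carrier accepts any bid that covers a
    hidden cost of 2 * distance; a rejected job incurs a fixed penalty."""
    carrier_cost = 2.0 * distance
    return -bid if bid >= carrier_cost else -10.0


def train(episodes=5000, alpha=0.001, seed=0):
    """REINFORCE with a running-average baseline on the toy bidding problem."""
    rng = random.Random(seed)
    theta = [5.0, 0.0]  # initial policy parameters (assumed)
    baseline = 0.0
    for _ in range(episodes):
        distance = rng.uniform(1.0, 3.0)                   # sample a job
        bid = rng.gauss(mean_bid(theta, distance), SIGMA)  # sample a bid
        advantage = reward(bid, distance) - baseline
        baseline += 0.05 * advantage                       # update baseline
        grad = grad_log_prob(theta, distance, bid)
        # Gradient ascent step on the expected reward.
        theta = [t + alpha * advantage * g for t, g in zip(theta, grad)]
    return theta
```

Calling `theta = train()` and then `mean_bid(theta, 2.0)` gives the learned average bid for a job with distance 2.0 in this toy setting. The fixed noise level and the linear mean keep the gradient easy to read; a fuller treatment would also learn the standard deviation and condition on more job properties.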

research
11/23/2020

Consolidation via Policy Information Regularization in Deep RL for Multi-Agent Games

This paper introduces an information-theoretic constraint on learned pol...
research
02/18/2021

Strategic bidding in freight transport using deep reinforcement learning

This paper presents a multi-agent reinforcement learning algorithm to re...
research
03/29/2021

Deep reinforcement learning of event-triggered communication and control for multi-agent cooperative transport

In this paper, we explore a multi-agent reinforcement learning approach ...
research
08/06/2018

Learning to Share and Hide Intentions using Information Regularization

Learning to cooperate with friends and compete with foes is a key compon...
research
10/20/2021

Independent Natural Policy Gradient Always Converges in Markov Potential Games

Multi-agent reinforcement learning has been successfully applied to full...
research
05/27/2018

Contextual Policy Optimisation

Policy gradient methods have been successfully applied to a variety of r...
research
02/23/2021

PolicySpace2: modeling markets and endogenous housing policies

Policymakers decide on alternative policies facing restricted budgets an...