Optimizing Empty Container Repositioning and Fleet Deployment via Configurable Semi-POMDPs

by   Riccardo Poiani, et al.

With the continuous growth of the global economy and markets, resource imbalance has risen to be one of the central issues in real logistic scenarios. In marine transportation, this trade imbalance leads to Empty Container Repositioning (ECR) problems. Once the freight has been delivered from an exporting country to an importing one, the laden will turn into empty containers that need to be repositioned to satisfy new goods requests in exporting countries. In such problems, the performance that any cooperative repositioning policy can achieve strictly depends on the routes that vessels will follow (i.e., fleet deployment). Historically, Operation Research (OR) approaches were proposed to jointly optimize the repositioning policy along with the fleet of vessels. However, the stochasticity of future supply and demand of containers, together with black-box and non-linear constraints that are present within the environment, make these approaches unsuitable for these scenarios. In this paper, we introduce a novel framework, Configurable Semi-POMDPs, to model this type of problems. Furthermore, we provide a two-stage learning algorithm, "Configure Conquer" (CC), that first configures the environment by finding an approximation of the optimal fleet deployment strategy, and then "conquers" it by learning an ECR policy in this tuned environmental setting. We validate our approach in large and real-world instances of the problem. Our experiments highlight that CC avoids the pitfalls of OR methods and that it is successful at optimizing both the ECR policy and the fleet of vessels, leading to superior performance in world trade environments.


Configurable Markov Decision Processes

In many real-world problems, there is the possibility to configure, to a...

Optimizing Sequential Experimental Design with Deep Reinforcement Learning

Bayesian approaches developed to solve the optimal design of sequential ...

Self-Supervised Policy Adaptation during Deployment

In most real world scenarios, a policy trained by reinforcement learning...

Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification

Many real-world physical control systems are required to satisfy constra...

Action Set Based Policy Optimization for Safe Power Grid Management

Maintaining the stability of the modern power grid is becoming increasin...

Instance generation tool for on-demand transportation problems

We present REQreate, a tool to generate instances for on-demand transpor...

Efficient XAI Techniques: A Taxonomic Survey

Recently, there has been a growing demand for the deployment of Explaina...

Please sign up or login with your details

Forgot password? Click here to reset