Constrained Reinforcement Learning for Dynamic Material Handling

by   Chengpeng Hu, et al.

As one of the core parts of flexible manufacturing systems, material handling involves storage and transportation of materials between workstations with automated vehicles. The improvement in material handling can impulse the overall efficiency of the manufacturing system. However, the occurrence of dynamic events during the optimisation of task arrangements poses a challenge that requires adaptability and effectiveness. In this paper, we aim at the scheduling of automated guided vehicles for dynamic material handling. Motivated by some real-world scenarios, unknown new tasks and unexpected vehicle breakdowns are regarded as dynamic events in our problem. We formulate the problem as a constrained Markov decision process which takes into account tardiness and available vehicles as cumulative and instantaneous constraints, respectively. An adaptive constrained reinforcement learning algorithm that combines Lagrangian relaxation and invalid action masking, named RCPOM, is proposed to address the problem with two hybrid constraints. Moreover, a gym-like dynamic material handling simulator, named DMH-GYM, is developed and equipped with diverse problem instances, which can be used as benchmarks for dynamic material handling. Experimental results on the problem instances demonstrate the outstanding performance of our proposed approach compared with eight state-of-the-art constrained and non-constrained reinforcement learning algorithms, and widely used dispatching rules for material handling.


Decentralized Multi-AGV Task Allocation based on Multi-Agent Reinforcement Learning with Information Potential Field Rewards

Automated Guided Vehicles (AGVs) have been widely used for material hand...

Deep reinforcement learning under signal temporal logic constraints using Lagrangian relaxation

Deep reinforcement learning (DRL) has attracted much attention as an app...

Hybrid intelligence for dynamic job-shop scheduling with deep reinforcement learning and attention mechanism

The dynamic job-shop scheduling problem (DJSP) is a class of scheduling ...

Recursive Constraints to Prevent Instability in Constrained Reinforcement Learning

We consider the challenge of finding a deterministic policy for a Markov...

Competitors-Aware Stochastic Lap Strategy Optimisation for Race Hybrid Vehicles

World Endurance Championship (WEC) racing events are characterised by a ...

A Memetic Algorithm with Reinforcement Learning for Sociotechnical Production Scheduling

The following article presents a memetic algorithm with applying deep re...

Adaptive Learning for the Resource-Constrained Classification Problem

Resource-constrained classification tasks are common in real-world appli...

Please sign up or login with your details

Forgot password? Click here to reset