Combinatorial Optimization by Graph Pointer Networks and Hierarchical Reinforcement Learning

11/12/2019
by Qiang Ma, et al.

In this work, we introduce Graph Pointer Networks (GPNs) trained using reinforcement learning (RL) for tackling the traveling salesman problem (TSP). GPNs build upon Pointer Networks by adding a graph embedding layer on the input, which captures relationships between nodes. Furthermore, to approximate solutions to constrained combinatorial optimization problems such as the TSP with time windows, we train hierarchical GPNs (HGPNs) using RL; these learn a hierarchical policy to find an optimal city permutation under constraints. Each layer of the hierarchy is designed with a separate reward function, resulting in stable training. Our results demonstrate that GPNs trained on small-scale TSP50/100 problems generalize well to larger-scale TSP500/1000 problems, with shorter tour lengths and faster computational times. We verify that for constrained TSP variants such as the TSP with time windows, the feasible solutions found via hierarchical RL training outperform previous baselines. In the spirit of reproducible research, we make our data, models, and code publicly available.
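
To make the two quantities in the abstract concrete, the following is a minimal sketch of how a candidate city permutation might be scored: its closed-tour length (the TSP objective) and the number of time-window violations (the feasibility constraint in the TSP with time windows). The function names, unit travel speed, the rule that an early arrival waits until the window opens, and the toy data are all assumptions made for illustration; this is not the reward design or code from the paper.

```python
import math

def tour_length(coords, tour):
    """Euclidean length of the closed tour that visits `coords` in `tour` order."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % n]])
        for i in range(n)
    )

def time_window_violations(coords, windows, tour):
    """Count cities reached after their window closes.

    Assumptions (hypothetical, not from the paper): travel at unit speed,
    start at tour[0] at time 0, and wait if arriving before a window opens.
    """
    t = 0.0
    late = 0
    for prev, city in zip(tour, tour[1:]):
        t += math.dist(coords[prev], coords[city])
        open_t, close_t = windows[city]
        t = max(t, open_t)  # wait until the window opens if early
        if t > close_t:
            late += 1
    return late

# Toy instance: three cities on a 3-4-5 right triangle.
coords = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
windows = [(0.0, 100.0), (0.0, 4.0), (0.0, 10.0)]

feasible = [0, 1, 2]    # reaches city 1 at t=3, city 2 at t=7
infeasible = [0, 2, 1]  # reaches city 1 at t=9, after its window closes at 4

print(tour_length(coords, feasible))
print(time_window_violations(coords, windows, feasible))
print(time_window_violations(coords, windows, infeasible))
```

Both orderings have the same tour length here, yet only one respects every time window, which shows why a constrained variant needs a feasibility signal in addition to the length objective.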


research
10/06/2021

Hybrid Pointer Networks for Traveling Salesman Problems Optimization

In this work, a novel idea is presented for combinatorial optimization p...
research
11/29/2016

Neural Combinatorial Optimization with Reinforcement Learning

This paper presents a framework to tackle combinatorial optimization pro...
research
01/07/2021

Active Screening for Recurrent Diseases: A Reinforcement Learning Approach

Active screening is a common approach in controlling the spread of recur...
research
06/22/2020

Constrained Combinatorial Optimization with Reinforcement Learning

This paper presents a framework to tackle constrained combinatorial opti...
research
11/07/2020

A Reinforcement Learning Approach to the Orienteering Problem with Time Windows

The Orienteering Problem with Time Windows (OPTW) is a combinatorial opt...
research
10/16/2019

On Learning Paradigms for the Travelling Salesman Problem

We explore the impact of learning paradigms on training deep neural netw...
research
12/24/2021

An Efficient Combinatorial Optimization Model Using Learning-to-Rank Distillation

Recently, deep reinforcement learning (RL) has proven its feasibility in...
