Generalization in Deep RL for TSP Problems via Equivariance and Local Search

10/07/2021
by Wenbin Ouyang, et al.

Deep reinforcement learning (RL) has proved to be a competitive heuristic for solving small instances of the traveling salesman problem (TSP), but its performance on larger instances is insufficient. Since training on large instances is impractical, we design a novel deep RL approach with a focus on generalizability. Our proposition, consisting of a simple deep learning architecture that learns with novel RL training techniques, exploits two main ideas. First, we exploit equivariance to facilitate training. Second, we interleave efficient local search heuristics with the usual RL training to smooth the value landscape. To validate the whole approach, we empirically evaluate our proposition on random and realistic TSP problems against relevant state-of-the-art deep RL methods. Moreover, we present an ablation study to understand the contribution of each of its components.
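
The abstract does not say which symmetries or which local search heuristic are used. As a rough illustration only, the sketch below assumes rotation and translation of the city coordinates as the symmetry (tour length is unchanged by both, which is the kind of structure an equivariant model can exploit) and 2-opt as the local search. The names tour_length, rotate_translate, and two_opt are hypothetical and are not taken from the paper.

```python
import math
import random


def tour_length(coords, tour):
    """Total length of the closed tour that visits coords in the given order."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % n]]) for i in range(n)
    )


def rotate_translate(coords, angle, shift):
    """Rotate every city about the origin by `angle`, then translate by `shift`.

    Assumption: rotations and translations stand in for whatever symmetries
    the paper actually exploits.
    """
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in coords]


def two_opt(coords, tour):
    """Reverse tour segments for as long as a reversal shortens the tour.

    Assumption: 2-opt stands in for the paper's local search heuristics.
    """
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # the two edges share a city, so there is no move
                a, b = coords[tour[i]], coords[tour[i + 1]]
                c, d = coords[tour[j]], coords[tour[(j + 1) % n]]
                # Change in length if edges (a,b),(c,d) become (a,c),(b,d).
                delta = (
                    math.dist(a, c) + math.dist(b, d)
                    - math.dist(a, b) - math.dist(c, d)
                )
                if delta < -1e-12:
                    tour[i + 1 : j + 1] = reversed(tour[i + 1 : j + 1])
                    improved = True
    return tour


random.seed(0)
cities = [(random.random(), random.random()) for _ in range(30)]
initial = list(range(30))  # stand-in for a tour produced by a learned policy
refined = two_opt(cities, initial)
moved = rotate_translate(cities, 0.7, (3.0, -2.0))

print("initial length:", tour_length(cities, initial))
print("after 2-opt:   ", tour_length(cities, refined))
print("after rotation:", tour_length(moved, initial))
```

In this sketch the tour passed to two_opt is just the identity ordering; in an RL setting it would be the tour sampled from the policy, with the refined tour feeding back into training. How that feedback is wired up is not described in the abstract.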
