Learning Improvement Heuristics for Solving the Travelling Salesman Problem

12/12/2019
by Yaoxin Wu, et al.

Recent studies that use deep learning to solve the Travelling Salesman Problem (TSP) focus on construction heuristics, whose solutions may still be far from optimal. To improve solution quality, additional procedures such as sampling or beam search are required. However, these procedures still rely on the same construction policy, which is less effective at refining a solution. In this paper, we propose to directly learn improvement heuristics for solving the TSP using deep reinforcement learning. We first present a reinforcement learning formulation of the improvement heuristic, in which the policy guides the selection of the next solution. Then, we propose a self-attention-based deep architecture as the policy network. Extensive experiments show that improvement policies learned by our approach yield better results than state-of-the-art methods, even when starting from random initial solutions. Moreover, the learned policies are more effective than traditional hand-crafted ones and are robust to initial solutions of either high or poor quality.
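
To make the idea of an improvement heuristic concrete, the following is a minimal sketch of the loop the abstract describes: start from an initial tour and repeatedly let a policy pick the next solution. Several things here are assumptions not stated in the abstract: the local move is taken to be a 2-opt segment reversal, the names `tour_length`, `two_opt`, `random_policy`, and `improve` are hypothetical, and a uniformly random choice stands in for the learned self-attention policy network.

```python
import math
import random


def tour_length(coords, tour):
    """Total Euclidean length of the closed tour over the given coordinates."""
    n = len(tour)
    return sum(math.dist(coords[tour[k]], coords[tour[(k + 1) % n]]) for k in range(n))


def two_opt(tour, i, j):
    """Return a new tour with the segment between positions i and j reversed.

    Assumption: 2-opt is used as the local move; the abstract does not name one.
    """
    i, j = min(i, j), max(i, j)
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def random_policy(coords, tour, rng):
    """Placeholder policy: picks two distinct positions uniformly at random.

    A learned policy network would replace this function.
    """
    i, j = rng.sample(range(len(tour)), 2)
    return i, j


def improve(coords, tour, policy, steps, seed=0):
    """Apply `steps` policy-selected moves and keep the best tour seen so far."""
    rng = random.Random(seed)
    current = list(tour)
    best, best_len = list(current), tour_length(coords, current)
    for _ in range(steps):
        i, j = policy(coords, current, rng)
        # Every move is accepted, even a worsening one; only `best` is monotone.
        current = two_opt(current, i, j)
        cur_len = tour_length(coords, current)
        if cur_len < best_len:
            best, best_len = list(current), cur_len
    return best, best_len


if __name__ == "__main__":
    coords = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
    initial = [0, 1, 2, 3]  # a self-crossing tour of the unit square
    best, best_len = improve(coords, initial, random_policy, steps=50)
    print(best, round(best_len, 3))
```

In this sketch the policy is just a function from the current tour to a pair of positions, so a trained network could be dropped in without changing the loop. Tracking the best tour separately from the current one lets the search pass through worse solutions without losing the best result.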


