A Lightweight CNN-Transformer Model for Learning Traveling Salesman Problems

05/03/2023
by Minseop Jung, et al.

Transformer-based models achieve state-of-the-art performance even on large-scale Traveling Salesman Problems (TSPs). However, they rely on fully connected attention and therefore suffer from high computational complexity and GPU memory usage. We propose a lightweight CNN-Transformer model built on a CNN embedding layer and partial self-attention. The CNN embedding layer allows our model to learn spatial features from the input data better than standard Transformer models, and the proposed partial self-attention removes considerable redundancy present in fully connected attention. Experiments show that the proposed model outperforms other state-of-the-art Transformer-based models in terms of TSP solution quality, GPU memory usage, and inference time, consuming approximately 20% less GPU memory than those models. Our code is publicly available at https://github.com/cm8908/CNN_Transformer3
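As a rough illustration of the two ideas in the abstract, the sketch below embeds 2-D city coordinates with a convolution and restricts each position's self-attention to a local window. All names, default shapes, and the band-style neighborhood mask here are our assumptions for illustration, not the authors' implementation; their exact partial-attention pattern and hyperparameters are in the repository linked above.

import torch
import torch.nn as nn

class CNNTransformerEncoder(nn.Module):
    # CNN embedding over 2-D city coordinates followed by a Transformer
    # encoder whose self-attention is masked to a local band ("partial"
    # attention) instead of full pairwise attention.
    def __init__(self, d_model=128, n_heads=8, n_layers=6, window=16):
        super().__init__()
        # Conv1d mixes each city's (x, y) with its neighbors in the input
        # sequence, giving the embedding a notion of local spatial structure.
        self.embed = nn.Conv1d(2, d_model, kernel_size=3, padding=1)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.window = window

    def forward(self, coords):  # coords: (batch, n_cities, 2)
        x = self.embed(coords.transpose(1, 2)).transpose(1, 2)
        n = x.size(1)
        # Boolean attention mask: True = blocked. Each city attends only to
        # positions within `window` steps, cutting the O(n^2) attention cost.
        idx = torch.arange(n, device=x.device)
        mask = (idx[None, :] - idx[:, None]).abs() > self.window
        return self.encoder(x, mask=mask)

Calling CNNTransformerEncoder()(torch.rand(1, 100, 2)) returns per-city embeddings of shape (1, 100, 128), which a decoder would then consume to construct a tour. The band mask is only one possible "partial" sparsity pattern chosen for this sketch.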


Related research:

08/09/2021 · RaftMLP: Do MLP-based Models Dream of Winning Over Computer Vision?
For the past ten years, CNN has reigned supreme in the world of computer...

04/26/2021 · Visformer: The Vision-friendly Transformer
The past year has witnessed the rapid development of applying the Transf...

05/24/2023 · Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator
The transformer model is known to be computationally demanding, and proh...

09/11/2021 · HYDRA – Hyper Dependency Representation Attentions
Attention is all we need as long as we have enough data. Even so, it is ...

08/28/2023 · Attention Visualizer Package: Revealing Word Importance for Deeper Insight into Encoder-Only Transformer Models
This report introduces the Attention Visualizer package, which is crafte...

07/05/2022 · Swin Deformable Attention U-Net Transformer (SDAUT) for Explainable Fast MRI
Fast MRI aims to reconstruct a high fidelity image from partially observ...

10/18/2021 · NormFormer: Improved Transformer Pretraining with Extra Normalization
During pretraining, the Pre-LayerNorm transformer suffers from a gradien...
