Deep Prompt Tuning for Graph Transformers

09/18/2023
by Reza Shirkavand, et al.

Graph transformers have gained popularity in various graph-based tasks by addressing challenges faced by traditional Graph Neural Networks. However, the quadratic complexity of self-attention and the extensive layering in graph transformer architectures present challenges when applying them to graph-based prediction tasks. Fine-tuning, the common approach, is resource-intensive and requires storing multiple copies of large models. We propose deep graph prompt tuning as an alternative to fine-tuning for leveraging large graph transformer models in downstream graph-based prediction tasks. Our method introduces trainable feature nodes to the graph and prepends task-specific tokens to the graph transformer, enhancing the model's expressive power. By freezing the pre-trained parameters and updating only the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies, making it suitable for small datasets and scalable to large graphs. Through extensive experiments on datasets of various sizes, we demonstrate that deep graph prompt tuning achieves performance comparable or even superior to fine-tuning, despite using significantly fewer task-specific parameters. Our contributions include the introduction of prompt tuning for graph transformers, its application to both graph transformers and message-passing graph neural networks, improved efficiency and resource utilization, and compelling experimental results. This work highlights a promising approach to leveraging pre-trained models for graph-based prediction tasks and opens new opportunities for exploring and advancing graph representation learning.
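To make the idea concrete, the following is a minimal PyTorch sketch of prompt tuning around a frozen transformer encoder: learnable prompt tokens are prepended to the node sequence, the pre-trained weights are frozen, and only the prompts and a small task head are trained. The class and argument names (GraphPromptTuning, num_prompt_tokens) are illustrative, the stand-in encoder ignores graph structure, and the paper's "deep" variant injects prompts beyond the input layer; this is not the authors' implementation.

import torch
import torch.nn as nn


class GraphPromptTuning(nn.Module):
    """Sketch: trainable prompt tokens prepended to a frozen encoder."""

    def __init__(self, pretrained_encoder, hidden_dim, num_prompt_tokens, num_classes):
        super().__init__()
        self.encoder = pretrained_encoder
        # Freeze all pre-trained parameters; only the prompts and the
        # task head below receive gradients.
        for p in self.encoder.parameters():
            p.requires_grad = False

        # Trainable task-specific prompt tokens.
        self.prompts = nn.Parameter(torch.randn(num_prompt_tokens, hidden_dim) * 0.02)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, node_feats):
        # node_feats: (batch, num_nodes, hidden_dim)
        b = node_feats.size(0)
        prompts = self.prompts.unsqueeze(0).expand(b, -1, -1)
        x = torch.cat([prompts, node_feats], dim=1)  # prepend prompt tokens
        h = self.encoder(x)                          # frozen transformer
        graph_repr = h.mean(dim=1)                   # simple mean readout
        return self.head(graph_repr)


# Toy usage with a stand-in encoder (a real graph transformer would also
# consume edge / structural information).
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)
model = GraphPromptTuning(encoder, hidden_dim=64, num_prompt_tokens=8, num_classes=2)
logits = model(torch.randn(3, 10, 64))  # 3 graphs, 10 nodes each
print(logits.shape)                     # torch.Size([3, 2])

Under this setup, only the prompt embeddings and the classification head are task-specific, so a single frozen copy of the pre-trained model can serve many downstream tasks.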


Related research

- Prompt Tuning for Graph Neural Networks (09/30/2022)
- G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks (05/17/2023)
- Optimizing Deeper Transformers on Small Datasets: An Application on Text-to-SQL Semantic Parsing (12/30/2020)
- Attention over pre-trained Sentence Embeddings for Long Document Classification (07/18/2023)
- Universal Representation for Code (03/04/2021)
- Cure the headache of Transformers via Collinear Constrained Attention (09/15/2023)
- Mnemosyne: Learning to Train Transformers with Transformers (02/02/2023)
