GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding

05/16/2023
by Jia-Chen Gu, et al.

Addressing the issue of who says what to whom in multi-party conversations (MPCs) has recently attracted a lot of research attention. However, existing methods for MPC understanding typically embed interlocutors and utterances into sequential information flows, or exploit only superficial features of the inherent graph structures in MPCs. To address this, we present a plug-and-play and lightweight method named graph-induced fine-tuning (GIFT), which can adapt various Transformer-based pre-trained language models (PLMs) for universal MPC understanding. In detail, the full and equivalent connections among utterances in a regular Transformer ignore the sparse but distinctive dependencies of one utterance on another in MPCs. To distinguish the different relationships between utterances, four types of edges are designed to integrate graph-induced signals into the attention mechanism, refining PLMs originally designed for processing sequential texts. We evaluate GIFT by implementing it in three PLMs and testing performance on three downstream tasks: addressee recognition, speaker identification and response selection. Experimental results show that GIFT significantly improves the performance of all three PLMs on the three downstream tasks and two benchmarks with only 4 additional parameters per encoding layer, achieving new state-of-the-art performance on MPC understanding.
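As a rough illustration of how graph-induced signals of this kind could be injected into self-attention, the sketch below adds one learnable scalar per edge type to an encoding layer (matching the "4 additional parameters per encoding layer" mentioned in the abstract) and uses it to reweight attention scores between utterance tokens. This is a minimal, hypothetical reconstruction, not the authors' implementation: the edge-type names (reply, replied-by, same-speaker, other), the multiplicative reweighting, and all module and parameter names are assumptions made only for illustration.

```python
# Minimal sketch of graph-induced attention (NOT the authors' code).
# Assumed edge types between utterance tokens: 0=reply, 1=replied-by,
# 2=same-speaker, 3=other. All names below are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GraphInducedSelfAttention(nn.Module):
    def __init__(self, hidden_size: int, num_heads: int, num_edge_types: int = 4):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.qkv = nn.Linear(hidden_size, 3 * hidden_size)
        self.out = nn.Linear(hidden_size, hidden_size)
        # The only new parameters: one learnable scalar per edge type per layer.
        self.edge_weight = nn.Parameter(torch.ones(num_edge_types))

    def forward(self, x: torch.Tensor, edge_type: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden)
        # edge_type: (batch, seq_len, seq_len) integer tensor with values in [0, 3],
        # derived from the conversation's reply-to and speaker structure.
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        # Reweight each token-pair score by the scalar for its edge type.
        w = self.edge_weight[edge_type]            # (batch, seq_len, seq_len)
        scores = scores * w.unsqueeze(1)           # broadcast over attention heads
        attn = F.softmax(scores, dim=-1)
        ctx = (attn @ v).transpose(1, 2).reshape(b, n, -1)
        return self.out(ctx)
```

In such a setup, the edge-type matrix would be constructed from the MPC's reply-to links and speaker labels before encoding, so the same pre-trained weights can be fine-tuned with only these few extra scalars per layer.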


