Conditional Self-Attention for Query-based Summarization

02/18/2020
by Yujia Xie, et al.

Self-attention mechanisms have achieved great success on a variety of NLP tasks thanks to their flexibility in capturing dependencies between arbitrary positions in a sequence. For problems such as query-based summarization (Qsumm) and knowledge graph reasoning, where each input sequence is associated with an extra query, explicitly modeling such conditional contextual dependencies can lead to more accurate solutions; existing self-attention mechanisms, however, cannot capture them. In this paper, we propose conditional self-attention (CSA), a neural network module designed for conditional dependency modeling. CSA works by adjusting the pairwise attention between input tokens in a self-attention module with the matching score of the inputs to the given query, so the contextual dependencies modeled by CSA are highly relevant to the query. We further study variants of CSA defined by different types of attention. Experiments on the Debatepedia and HotpotQA benchmark datasets show that CSA consistently outperforms the vanilla Transformer and previous models on the Qsumm problem.
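The sketch below illustrates the general idea of conditioning pairwise self-attention on a query, as described in the abstract. It is a minimal PyTorch example, not the paper's exact formulation: the projection layers, the sigmoid dot-product matching score, and all tensor shapes are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConditionalSelfAttentionSketch(nn.Module):
    """Self-attention whose pairwise scores are modulated by an extra query."""

    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.match_proj = nn.Linear(d_model, d_model)  # token-to-query relevance (assumed form)
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
        # x:     (batch, seq_len, d_model) token representations
        # query: (batch, d_model)          pooled representation of the extra query
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)

        # Ordinary self-attention scores between every pair of tokens.
        scores = torch.matmul(q, k.transpose(-1, -2)) * self.scale  # (batch, L, L)

        # Matching score of each token to the query
        # (sigmoid-gated dot product, an assumed choice for this sketch).
        match = torch.sigmoid(
            (self.match_proj(x) * query.unsqueeze(1)).sum(dim=-1) * self.scale
        )  # (batch, L)

        # Condition the pairwise attention on the query: tokens that match the
        # query poorly contribute less before normalization.
        cond_scores = scores + torch.log(match + 1e-9).unsqueeze(1)
        attn = F.softmax(cond_scores, dim=-1)  # (batch, L, L)
        return torch.matmul(attn, v)


# Example usage with arbitrary shapes:
x = torch.randn(2, 16, 64)                           # 2 sequences, 16 tokens each
query = torch.randn(2, 64)                           # one pooled query per sequence
out = ConditionalSelfAttentionSketch(64)(x, query)   # -> (2, 16, 64)
```

Adding the log of the matching score before the softmax multiplies each unnormalized attention weight by the attended token's relevance to the query, which is one simple way to realize "adjusting pairwise attention with a matching score"; the paper studies several variants of this conditioning.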
