KSAT: Knowledge-infused Self Attention Transformer – Integrating Multiple Domain-Specific Contexts

10/09/2022
by Kaushik Roy, et al.

Domain-specific language understanding requires integrating multiple pieces of relevant contextual information. For example, both suicide-related and depression-related behavior (multiple contexts) appear in the text “I have a gun and feel pretty bad about my life, and it wouldn't be the worst thing if I didn't wake up tomorrow”. In self-attention architectures, domain specificity is typically handled by fine-tuning on excerpts from relevant domain-specific resources (datasets and external knowledge, e.g., medical textbook chapters on mental health diagnosis related to suicide and depression). We propose a modified self-attention architecture, the Knowledge-infused Self Attention Transformer (KSAT), that integrates multiple domain-specific contexts through the use of external knowledge sources. To accomplish this, KSAT introduces knowledge-guided biases in dedicated self-attention layers, one for each knowledge source. In addition, KSAT provides mechanisms for controlling the trade-off between learning from data and learning from knowledge. Our quantitative and qualitative evaluations show that (1) the KSAT architecture provides novel, human-understandable ways to precisely measure and visualize the contributions of the infused domain contexts, and (2) KSAT performs competitively with other knowledge-infused baselines and significantly outperforms baselines that use fine-tuning for domain-specific tasks.
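
The abstract only sketches the mechanism, so the snippet below shows one plausible way a knowledge-guided bias and a learnable data-vs-knowledge gate could enter a self-attention layer. This is a minimal illustrative sketch in PyTorch, not the authors' implementation: the class name, the `alpha` gate, and the `knowledge_bias` input are assumptions introduced here for illustration.

```python
# Hypothetical sketch of a knowledge-biased self-attention layer (not the KSAT release).
# A precomputed knowledge_bias matrix shifts the attention scores, and a learnable
# gate `alpha` controls the trade-off between data-driven and knowledge-guided attention.
import math
import torch
import torch.nn as nn

class KnowledgeBiasedSelfAttention(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)
        # Gate in (0, 1) after sigmoid: near 0 -> mostly data-driven attention,
        # near 1 -> mostly knowledge-guided attention.
        self.alpha = nn.Parameter(torch.zeros(1))
        self.d_model = d_model

    def forward(self, x: torch.Tensor, knowledge_bias: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        # knowledge_bias: (batch, seq_len, seq_len), e.g. token-pair affinities
        # derived from one external knowledge source.
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_model)
        gate = torch.sigmoid(self.alpha)
        # Convex mix of data-driven scores and knowledge-derived scores.
        mixed = (1 - gate) * scores + gate * knowledge_bias
        attn = torch.softmax(mixed, dim=-1)
        return self.out_proj(attn @ v)

# Usage: one such layer per knowledge source, stacked with standard transformer blocks.
x = torch.randn(2, 16, 64)
bias = torch.randn(2, 16, 16)          # placeholder knowledge affinities
layer = KnowledgeBiasedSelfAttention(64)
out = layer(x, bias)                   # -> (2, 16, 64)
```

Under this reading, dedicating one such biased layer to each knowledge source and inspecting the per-layer gate would give a directly measurable, visualizable account of how much each infused domain context contributes, which is the kind of interpretability the abstract claims.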
