Structural analysis of an all-purpose question answering model

04/13/2021
by Vincent Micheli et al.

Attention is a key component of the now-ubiquitous pre-trained language models. By learning to focus on relevant pieces of information, these Transformer-based architectures have proven capable of tackling several tasks at once, sometimes even surpassing their single-task counterparts. To better understand this phenomenon, we conduct a structural analysis of a new all-purpose question answering model that we introduce. Surprisingly, this model retains single-task performance even in the absence of a strong transfer effect between tasks. Through attention head importance scoring, we observe that attention heads specialize in particular tasks and that some heads are more conducive to learning than others, in both the multi-task and single-task settings.
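
Attention head importance scoring commonly follows the gradient-based scheme of Michel et al. (2019): each head is given a multiplicative gate fixed at 1, and the head's score is the average absolute gradient of the loss with respect to that gate. The sketch below is a minimal illustration of that scheme, not the paper's own code: it presumes a HuggingFace-style PyTorch model that accepts a head_mask keyword and returns a loss when labels are supplied, and data_loader, num_layers, and num_heads are placeholder names.

    import torch

    def head_importance(model, data_loader, num_layers, num_heads, device="cpu"):
        """Score each head as the expected absolute gradient of the loss
        with respect to a per-head gate held at 1 (Michel et al., 2019)."""
        mask = torch.ones(num_layers, num_heads, device=device, requires_grad=True)
        scores = torch.zeros(num_layers, num_heads, device=device)
        model.to(device).eval()
        for batch in data_loader:
            batch = {k: v.to(device) for k, v in batch.items()}
            # The mask multiplies each head's attention weights inside the
            # model, so its gradient measures how sensitive the loss is to
            # that head.
            loss = model(**batch, head_mask=mask).loss
            loss.backward()
            scores += mask.grad.abs().detach()
            mask.grad = None
            model.zero_grad(set_to_none=True)
        # One common normalization: scores sum to 1 within each layer.
        return scores / scores.sum(dim=-1, keepdim=True)

Computed separately on each task's data, such scores make specialization visible: a head that ranks highly for one task but near zero for the others is task-specialized in the sense described in the abstract.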

Related research

10/07/2021 · A Comparative Study of Transformer-Based Language Models on Extractive Question Answering
Question Answering (QA) is a task in natural language processing that ha...

03/05/2020 · Talking-Heads Attention
We introduce "talking-heads attention" - a variation on multi-head atten...

04/13/2021 · What's in your Head? Emergent Behaviour in Multi-Task Transformer Models
The primary paradigm for multi-task training in natural language process...

10/01/2020 · ISAAQ – Mastering Textbook Questions with Pre-trained Transformers and Bottom-Up and Top-Down Attention
Textbook Question Answering is a complex task in the intersection of Mac...

03/14/2022 · Choose Your QA Model Wisely: A Systematic Study of Generative and Extractive Readers for Question Answering
While both extractive and generative readers have been successfully appl...

07/03/2020 · Eliminating Catastrophic Interference with Biased Competition
We present here a model to take advantage of the multi-task nature of co...

09/16/2023 · NOWJ1@ALQAC 2023: Enhancing Legal Task Performance with Classic Statistical Models and Pre-trained Language Models
This paper describes the NOWJ1 Team's approach for the Automated Legal Q...
