Salient Span Masking for Temporal Understanding

03/22/2023
by Jeremy R. Cole, et al.

Salient Span Masking (SSM) has proven to be an effective strategy for improving closed-book question answering performance. SSM extends general masked language model pretraining by creating additional unsupervised training sentences that mask a single entity or date span, thus oversampling factual information. Despite the success of this paradigm, the span types and sampling strategies are relatively arbitrary and not widely studied for other tasks. Thus, we investigate SSM from the perspective of temporal tasks, where learning a good representation of various temporal expressions is important. To that end, we introduce Temporal Span Masking (TSM) intermediate training. First, we find that SSM alone improves downstream performance on three temporal tasks by an average of +5.8 points. Further, adding the TSM task yields additional improvements (an average of +0.29 points). Together, these constitute the new best reported results on the targeted tasks. Our analysis suggests that the effectiveness of SSM stems from the sentences chosen in the training data rather than from the mask choice: sentences containing entities frequently also contain temporal expressions. Nonetheless, the additional targeted spans of TSM can still improve performance, especially in a zero-shot setting.
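To make the span-masking recipe concrete, here is a minimal, hypothetical sketch of how a single temporal span might be masked out of a training sentence. The regex-based tagger, the `<extra_id_0>` sentinel token, and the function name `temporal_span_masking` are illustrative assumptions for this sketch, not the authors' pipeline, which presumably relies on a proper temporal or entity tagger.

```python
import re

# Hypothetical sentinel token; T5-style models use "<extra_id_0>", BERT-style models use "[MASK]".
MASK_TOKEN = "<extra_id_0>"

# A deliberately simple regex for temporal expressions (bare years and month-day-year dates).
# The paper's setup presumably uses a real temporal tagger; this pattern is only illustrative.
TEMPORAL_PATTERN = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|September|October|"
    r"November|December)\s+\d{1,2},\s+\d{4}\b"   # e.g. "March 22, 2023"
    r"|\b(?:1[0-9]|20)\d{2}\b"                   # e.g. "1969", "2023"
)

def temporal_span_masking(sentence: str):
    """Yield (masked_input, target_span) pairs, one per temporal expression.

    Each pair masks exactly one span, mirroring the single-span-per-example
    setup described for salient span masking.
    """
    for match in TEMPORAL_PATTERN.finditer(sentence):
        masked = sentence[:match.start()] + MASK_TOKEN + sentence[match.end():]
        yield masked, match.group(0)

if __name__ == "__main__":
    text = "Apollo 11 landed on the Moon on July 20, 1969, during the Cold War."
    for masked_input, target in temporal_span_masking(text):
        print(masked_input, "->", target)
```

In this sketch, each sentence containing a temporal expression yields one additional unsupervised example per span, which is how the oversampling of factual and temporal information described in the abstract would arise.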


