Neural Language Modeling for Contextualized Temporal Graph Generation

10/20/2020
by Aman Madaan, et al.

This paper presents the first study on using large-scale pre-trained language models for automated generation of an event-level temporal graph for a document. Despite the huge success of neural pre-training methods in NLP tasks, their potential for temporal reasoning over event graphs has not been sufficiently explored. Part of the reason is the difficulty of obtaining large training corpora with human-annotated events and temporal links. We address this challenge by using existing IE/NLP tools to automatically generate a large quantity (89,000) of system-produced document-graph pairs, and propose a novel formulation of the contextualized graph generation problem as a sequence-to-sequence mapping task. These strategies enable us to leverage and fine-tune pre-trained language models on the system-induced training data for the graph generation task. Our experiments show that our approach is highly effective in generating structurally and semantically valid graphs. Further, evaluation on a challenging hand-labeled, out-of-domain corpus shows that our method outperforms the closest existing method by a large margin on several metrics. Code and pre-trained models are available at https://github.com/madaan/temporal-graph-gen.
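To make the sequence-to-sequence formulation concrete, below is a minimal sketch of fine-tuning a pre-trained language model to map a document to a linearized temporal graph. It assumes the HuggingFace transformers library; the t5-base checkpoint, the DOT-style target serialization, and the example document/graph pair are illustrative assumptions, not the authors' exact configuration (their actual pipeline is in the repository linked above).

    # Sketch: fine-tune a pre-trained seq2seq LM so that, conditioned on a
    # document, it generates a linearized temporal graph over the events
    # mentioned in that document. Model choice and graph format are assumed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("t5-base")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

    document = (
        "The committee met on Monday. After reviewing the report, "
        "it approved the new budget."
    )
    # Hypothetical DOT-style linearization: nodes are events, labeled edges
    # are temporal links between them.
    target_graph = (
        'digraph { "met" -> "reviewing" [label="before"]; '
        '"reviewing" -> "approved" [label="before"]; }'
    )

    inputs = tokenizer(document, return_tensors="pt", truncation=True)
    labels = tokenizer(target_graph, return_tensors="pt", truncation=True).input_ids

    # One training step: standard seq2seq cross-entropy on the graph tokens,
    # conditioned on the document tokens.
    outputs = model(**inputs, labels=labels)
    outputs.loss.backward()

    # At inference time the graph is decoded token by token and can be parsed
    # back into nodes and temporal edges.
    generated = model.generate(**inputs, max_length=128)
    print(tokenizer.decode(generated[0], skip_special_tokens=True))

One convenient property of a textual graph serialization such as DOT is that standard graph parsers can then check the generated output for structural validity (well-formed nodes and edges), which is one of the evaluation axes the abstract mentions.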


Related research

04/11/2022 · Explanation Graph Generation via Pre-trained Language Models: An Empirical Study with Contrastive Learning
Pre-trained sequence-to-sequence language models have led to widespread ...

10/16/2020 · Substance over Style: Document-Level Targeted Content Transfer
Existing language models excel at writing from scratch, but many real-wo...

09/20/2021 · BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
We present BARTpho with two versions – BARTpho_word and BARTpho_syllable...

08/30/2022 · Annotated Dataset Creation through General Purpose Language Models for non-English Medical NLP
Obtaining text datasets with semantic annotations is an effortful proces...

01/22/2023 · An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models
Large-scale Pre-Trained Language Models (PTLMs) capture knowledge from m...

10/06/2020 · Modeling Preconditions in Text with a Crowd-sourced Dataset
Preconditions provide a form of logical connection between events that e...

09/20/2022 · Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models
Prompting, which casts downstream applications as language modeling task...
