Effective Use of Transformer Networks for Entity Tracking

09/05/2019
by Aditya Gupta, et al.

Tracking entities in procedural language requires understanding the transformations arising from actions on entities as well as those entities' interactions. While self-attention-based pre-trained language encoders like GPT and BERT have been successfully applied across a range of natural language understanding tasks, their ability to handle the nuances of procedural texts is still untested. In this paper, we explore the use of pre-trained transformer networks for entity tracking tasks in procedural text. First, we test standard lightweight approaches for prediction with pre-trained transformers, and find that these approaches underperform even simple baselines. We show that much stronger results can be attained by restructuring the input to guide the transformer model to focus on a particular entity. Second, we assess the degree to which transformer networks capture the process dynamics, investigating such factors as merged entities and oblique entity references. On two different tasks, ingredient detection in recipes and QA over scientific processes, we achieve state-of-the-art results, but our models still largely attend to shallow context clues and do not form complex representations of intermediate entity or process state.

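To make the abstract's key modeling idea concrete, the following is a minimal sketch of entity-conditioned input restructuring: the target entity is supplied as its own input segment so the encoder's self-attention is guided toward it, and a classification head predicts whether that ingredient is present after the given recipe steps. This is an illustration under assumptions, not the paper's exact setup; the input template, label set, and checkpoint (`bert-base-uncased` here) are placeholders, and the classification head below is untrained and would be fine-tuned on the task data in practice.

```python
# Sketch of entity-conditioned input restructuring for ingredient
# detection (assumed format; the paper's exact template may differ).
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,  # 0 = entity absent, 1 = entity present
)
model.eval()

entity = "butter"
steps = "Melt the butter in a pan. Whisk in the flour and milk."

# Entity-first input: "[CLS] entity [SEP] process text [SEP]", so every
# layer of self-attention can condition on the entity being queried.
inputs = tokenizer(entity, steps, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
prediction = logits.argmax(dim=-1).item()
print(f"{entity!r} predicted present: {bool(prediction)}")
```

Framing the entity as part of the input itself, rather than pooling token representations after encoding, is what the abstract refers to as restructuring the input to make the model "focus on a particular entity."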
