Back to the Future -- Sequential Alignment of Text Representations

09/08/2019
by   Johannes Bjerva, et al.

Language evolves over time in many ways relevant to natural language processing tasks. For example, recent occurrences of the tokens 'BERT' and 'ELMO' in publications refer to neural network architectures rather than persons. This type of temporal signal is typically overlooked, but it is important if one aims to deploy a machine learning model over an extended period of time. In particular, language evolution causes data drift between time-steps in sequential decision-making tasks. Examples of such tasks include prediction of paper acceptance for yearly conferences (regular intervals) and author stance prediction for rumours on Twitter (irregular intervals). Inspired by successes in computer vision, we tackle data drift by sequentially aligning learned representations. We evaluate on three challenging tasks that vary in time-scale, linguistic unit, and domain. On these tasks, our method outperforms several strong baselines, including one that uses all available data. We argue that, due to its low computational expense, sequential alignment is a practical solution to dealing with language evolution.
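The abstract says only that representations are aligned sequentially, inspired by computer vision; it does not name the alignment procedure. As a rough illustration, the sketch below assumes subspace alignment, a domain-adaptation technique from computer vision in which the PCA basis of one time-step is mapped onto the PCA basis of the next. The function names (`pca_basis`, `align_step`, `sequential_alignment`), the subspace dimension `d`, and the random input data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def pca_basis(X, d):
    """Return the top-d principal directions of X as columns (n_features x d)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T


def align_step(X_src, X_tgt, d):
    """Assumed technique: subspace alignment of one time-step onto the next.

    X_src holds features from the earlier time-step, X_tgt from the later one.
    Returns both feature sets expressed in the d-dimensional target subspace.
    """
    P_s = pca_basis(X_src, d)
    P_t = pca_basis(X_tgt, d)
    M = P_s.T @ P_t  # d x d matrix mapping the source basis onto the target basis
    Z_src = (X_src - X_src.mean(axis=0)) @ P_s @ M
    Z_tgt = (X_tgt - X_tgt.mean(axis=0)) @ P_t
    return Z_src, Z_tgt


def sequential_alignment(steps, d):
    """Yield aligned (earlier, later) feature pairs for consecutive time-steps."""
    for X_prev, X_next in zip(steps, steps[1:]):
        yield align_step(X_prev, X_next, d)


# Hypothetical data: three time-steps, 50 documents each, 8-dimensional features.
rng = np.random.default_rng(0)
steps = [rng.normal(size=(50, 8)) for _ in range(3)]
pairs = list(sequential_alignment(steps, d=3))
```

Under this assumption, a classifier for a given time-step would be trained on the aligned features of the preceding step (`Z_src`) and applied to the features of the current step (`Z_tgt`). Each alignment needs only two SVDs and one small matrix product, which is in keeping with the low computational expense the abstract claims.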

research
11/15/2022

RobBERT-2022: Updating a Dutch Language Model to Account for Evolving Language Use

Large transformer-based language models, e.g. BERT and GPT-3, outperform...
research
08/17/2022

On the evolution of research in hypersonics: application of natural language processing and machine learning

Research and development in hypersonics have progressed significantly in...
research
10/16/2019

Evolution of transfer learning in natural language processing

In this paper, we present a study of the recent advancements which have ...
research
10/13/2020

Language Networks: a Practical Approach

This manuscript provides a short and practical introduction to the topic...
research
06/24/2022

Text and author-level political inference using heterogeneous knowledge representations

The inference of politically-charged information from text data is a pop...
research
07/02/2022

Language statistics at different spatial, temporal, and grammatical scales

Statistical linguistics has advanced considerably in recent decades as d...
