
Metadata Might Make Language Models Better

by   Kaspar Beelen, et al.

This paper discusses the benefits of including metadata when training language models on historical collections. Using 19th-century newspapers as a case study, we extend the time-masking approach proposed by Rosin et al. (2022) and compare different strategies for inserting temporal, political and geographical information into a Masked Language Model. After fine-tuning several DistilBERT models on enhanced input data, we provide a systematic evaluation of these models on a set of evaluation tasks: pseudo-perplexity, metadata mask-filling and supervised classification. We find that showing relevant metadata to a language model has a beneficial impact and may even produce more robust and fairer models.
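The core idea, inserting metadata into the model's input and probing it via metadata mask-filling, can be sketched as follows. This is a minimal illustration under assumed conventions: the bracketed token scheme (`[YEAR:…]`, `[PLACE:…]`, `[POL:…]`) is hypothetical and not necessarily the paper's exact input format.

```python
def add_metadata_prefix(text, year=None, place=None, politics=None):
    """Prepend special metadata tokens (hypothetical scheme) to raw text
    before masked-language-model fine-tuning."""
    prefix = []
    if year is not None:
        prefix.append(f"[YEAR:{year}]")
    if place is not None:
        prefix.append(f"[PLACE:{place}]")
    if politics is not None:
        prefix.append(f"[POL:{politics}]")
    return " ".join(prefix + [text])


def mask_metadata(tokens):
    """Mask only the metadata tokens, as in a metadata mask-filling probe:
    the model must recover e.g. the year or place from the text alone."""
    return [
        "[MASK]" if tok.startswith("[") and tok.endswith("]") else tok
        for tok in tokens
    ]


enhanced = add_metadata_prefix("The Queen opened Parliament.",
                               year=1855, place="London")
# enhanced == "[YEAR:1855] [PLACE:London] The Queen opened Parliament."
probe = mask_metadata(enhanced.split())
# probe == ["[MASK]", "[MASK]", "The", "Queen", "opened", "Parliament."]
```

In practice such metadata tokens would be registered as special tokens with the tokenizer so they are never split into subwords, and standard random masking would be applied to the remaining text during fine-tuning.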

