No News is Good News: A Critique of the One Billion Word Benchmark

10/25/2021
by Helen Ngo, et al.

The One Billion Word Benchmark is a dataset derived from the WMT 2011 News Crawl, commonly used to measure language modeling ability in natural language processing. We train models solely on Common Crawl web scrapes partitioned by year, and demonstrate that they perform worse on this task over time due to distributional shift. Analysis of this corpus reveals that it contains several examples of harmful text, as well as outdated references to current events. We suggest that the temporal nature of news and its distributional shift over time make it poorly suited for measuring language modeling ability, and we discuss the potential impact and considerations for researchers building language models and evaluation datasets.
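
The year-partitioned evaluation described in the abstract can be illustrated with a toy sketch. The code below is not the authors' method: it swaps in an add-one-smoothed unigram model for a neural language model, and the corpora, the benchmark text, and the names `corpora_by_year`, `benchmark`, and `ppl_by_year` are all hypothetical. It only shows the general idea of scoring one fixed evaluation text against models trained on data from different years.

```python
import math
from collections import Counter


def train_unigram(tokens):
    """Return (token counts, total token count) for a unigram model."""
    counts = Counter(tokens)
    return counts, sum(counts.values())


def perplexity(counts, total, eval_tokens, vocab_size):
    """Perplexity of eval_tokens under an add-one-smoothed unigram model."""
    log_prob = 0.0
    for tok in eval_tokens:
        # Add-one smoothing keeps unseen tokens from getting zero probability.
        p = (counts.get(tok, 0) + 1) / (total + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(eval_tokens))


# Hypothetical toy corpora keyed by year (not real Common Crawl data).
corpora_by_year = {
    2011: "the election results were announced on the news today".split(),
    2020: "the streaming app update was released on the phone today".split(),
}

# Hypothetical fixed evaluation text standing in for a 2011 news benchmark.
benchmark = "the election news was announced today".split()

# Shared vocabulary so that perplexities are comparable across years.
vocab = set(benchmark)
for toks in corpora_by_year.values():
    vocab.update(toks)

ppl_by_year = {
    year: perplexity(*train_unigram(toks), benchmark, len(vocab))
    for year, toks in corpora_by_year.items()
}

for year in sorted(ppl_by_year):
    print(year, round(ppl_by_year[year], 2))
```

In this toy setup the model trained on the 2011 text assigns the benchmark a lower perplexity than the one trained on the 2020 text, because its vocabulary overlaps more with the fixed evaluation text. That is the kind of gap the abstract attributes to distributional shift.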

research
11/18/2021

How News Evolves? Modeling News Text and Coverage using Graphs and Hawkes Process

Monitoring news content automatically is an important problem. The news ...
research
02/07/2023

What do Language Models know about word senses? Zero-Shot WSD with Language Models and Domain Inventories

Language Models are the core for almost any Natural Language Processing ...
research
11/27/2019

SimpleBooks: Long-term dependency book dataset with simplified English vocabulary for word-level language modeling

With language modeling becoming the popular base task for unsupervised r...
research
12/20/2022

A Measure-Theoretic Characterization of Tight Language Models

Language modeling, a central task in natural language processing, involv...
research
12/14/2021

Towards Interactive Language Modeling

Interaction between caregivers and children plays a critical role in hum...
research
05/25/2023

Measuring the Effect of Influential Messages on Varying Personas

Predicting how a user responds to news events enables important applicat...
research
09/14/2021

Types of Out-of-Distribution Texts and How to Detect Them

Despite agreement on the importance of detecting out-of-distribution (OO...
