Addressing Barriers to Reproducible Named Entity Recognition Evaluation

by Chester Palen-Michel et al.

To address what we believe is a looming crisis of unreproducible evaluation for named entity recognition tasks, we present guidelines for reproducible evaluation. The guidelines we propose are extremely simple, focusing on transparency regarding how chunks are encoded and scored, but very few papers currently being published fully comply with them. We demonstrate that despite the apparent simplicity of NER evaluation, unreported differences in the scoring procedure can produce changes in scores that are both of noticeable magnitude and statistically significant. We provide SeqScore, an open-source toolkit that addresses many of the issues that cause replication failures and makes following our guidelines easy.
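The scoring pitfall described above is easy to reproduce. Below is a minimal, self-contained Python sketch (not SeqScore's implementation; the label sequences, repair-mode names, and helper functions are illustrative assumptions) showing how two plausible, routinely unreported choices for handling an invalid BIO label sequence yield different entity-level F1 scores for identical model output.

    def extract_chunks(labels, repair="conll"):
        """Return the set of (start, end, type) chunks in a BIO label sequence.

        repair="conll":   a stray I- tag begins a new chunk, mirroring the
                          behavior of the classic conlleval script.
        repair="discard": chunks that begin with a stray I- tag are dropped.
        """
        chunks, start, ctype, valid = [], None, None, True

        def flush(end):
            # Close any open chunk; keep it only if it started legally
            # (with B-) or if the repair mode tolerates stray I- starts.
            nonlocal start, ctype, valid
            if ctype is not None and (valid or repair == "conll"):
                chunks.append((start, end, ctype))
            start, ctype, valid = None, None, True

        for i, label in enumerate(list(labels) + ["O"]):  # sentinel flushes tail
            if label == "O":
                flush(i)
            elif label.startswith("B-"):
                flush(i)
                start, ctype, valid = i, label[2:], True
            elif ctype == label[2:]:
                pass  # legal I- continuation of the open chunk
            else:
                flush(i)  # stray I- tag: chunk start is invalid
                start, ctype, valid = i, label[2:], False
        return set(chunks)

    def entity_f1(gold, pred):
        """Entity-level F1 over sets of (start, end, type) chunks."""
        tp = len(gold & pred)
        precision = tp / len(pred) if pred else 0.0
        recall = tp / len(gold) if gold else 0.0
        denom = precision + recall
        return 2 * precision * recall / denom if denom else 0.0

    gold = extract_chunks(["B-PER", "I-PER", "O", "B-LOC"])
    pred_labels = ["I-PER", "I-PER", "O", "B-LOC"]  # I-PER without a B-PER start

    for repair in ("conll", "discard"):
        pred = extract_chunks(pred_labels, repair=repair)
        print(f"{repair:8s} F1 = {entity_f1(gold, pred):.3f}")

Run as-is, the "conll" repair scores these predictions at F1 = 1.000 while the "discard" repair scores them at F1 = 0.667, so a paper reporting only "F1" on such output is not reproducible without stating the repair choice. Analogous divergences arise from unreported chunk-encoding choices (e.g., BIO vs. IOB vs. BIOES), which is why the guidelines call for transparency about both how chunks are encoded and how they are scored.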


