A Realistic Study of Auto-regressive Language Models for Named Entity Typing and Recognition

by Elena V. Epure, et al.

Despite the impressive results of language models for named entity recognition (NER), their generalization to varied textual genres, a growing entity type set, and new entities remains a challenge. Collecting thousands of annotations for training or fine-tuning in each new case is expensive and time-consuming. In contrast, humans can easily identify named entities given some simple instructions. Inspired by this, we challenge the reliance on large datasets and study pre-trained language models for NER in a meta-learning setup. First, we test named entity typing (NET) in a zero-shot transfer scenario. Then, we perform NER by giving a few examples at inference. We propose a method to select seen and rare or unseen names when only the pre-trained model is accessible, and we report results on these groups. The results show that: (1) auto-regressive language models as meta-learners can perform NET and NER fairly well, especially for regular or seen names; (2) name irregularity, when often present for a certain entity type, can become an effective exploitable cue; (3) names with words foreign to the model have the most negative impact on results; and (4) the model seems to rely more on name cues than on context cues in few-shot NER.
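To make the few-shot setup concrete, the following is a minimal sketch of how a handful of labeled examples could be packed into a prompt for an auto-regressive model at inference time, and how the generated continuation could be read back as a list of names. The prompt layout, the "none" convention, and the function names `build_few_shot_prompt` and `parse_completion` are illustrative assumptions, not the format used in the paper; the call to the language model itself is left out.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    examples: List[Tuple[str, List[str]]],
    query: str,
    entity_type: str,
) -> str:
    """Concatenate labeled demonstrations, then end with the unlabeled query.

    The "Sentence: ... / <type>: ..." layout is a hypothetical format
    chosen for illustration only.
    """
    blocks = []
    for sentence, names in examples:
        # Demonstrations with no entity of this type are answered with "none".
        answer = ", ".join(names) if names else "none"
        blocks.append(f"Sentence: {sentence}\n{entity_type}: {answer}")
    # The final block is left open so the model continues with the answer.
    blocks.append(f"Sentence: {query}\n{entity_type}:")
    return "\n\n".join(blocks)


def parse_completion(completion: str) -> List[str]:
    """Read the first generated line as a comma-separated list of names."""
    first_line = completion.strip().split("\n")[0].strip()
    if not first_line or first_line.lower() == "none":
        return []
    return [name.strip() for name in first_line.split(",") if name.strip()]


if __name__ == "__main__":
    demos = [
        ("Paris is the capital of France.", ["Paris", "France"]),
        ("She plays the violin every evening.", []),
    ]
    print(build_few_shot_prompt(demos, "Berlin is cold in winter.", "Location"))
```

In this sketch the model would be asked to continue the returned prompt, and only the first line of its output is kept, since a model prompted this way may go on to generate further "Sentence:" blocks.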




