
Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0

by Francesco De Toni, et al.

In this work, we explore whether the recently demonstrated zero-shot abilities of the T0 model extend to Named Entity Recognition for out-of-distribution languages and time periods. Using a historical newspaper corpus in three languages as a test bed, we use prompts to extract possible named entities. Our results show that a naive approach to prompt-based zero-shot multilingual Named Entity Recognition is error-prone, but they also highlight the potential of such an approach for historical languages that lack labeled datasets. Moreover, we find that T0-like models can be probed to predict the publication date and language of a document, which could be useful for the study of historical texts.

