Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0

04/11/2022
by Francesco De Toni, et al.

In this work, we explore whether the recently demonstrated zero-shot abilities of the T0 model extend to Named Entity Recognition for out-of-distribution languages and time periods. Using a historical newspaper corpus in three languages as a test bed, we use prompts to extract possible named entities. Our results show that a naive approach to prompt-based zero-shot multilingual Named Entity Recognition is error-prone, but they also highlight the potential of such an approach for historical languages that lack labeled datasets. Moreover, we find that T0-like models can be probed to predict the publication date and language of a document, which could be highly relevant to the study of historical texts.
