Contextualization and Generalization in Entity and Relation Extraction

06/15/2022
by Bruno Taillé, et al.

During the past decade, neural networks have become prominent in Natural Language Processing (NLP), notably for their capacity to learn relevant word representations from large unlabeled corpora. These word embeddings can then be transferred and fine-tuned for diverse end applications during a supervised training phase. More recently, in 2018, the transfer of entire pretrained Language Models and the preservation of their contextualization capacities enabled models to reach unprecedented performance on virtually every NLP benchmark, sometimes even outperforming human baselines. However, even as models reach such impressive scores, their comprehension abilities still appear shallow, which reveals the limitations of benchmarks in providing useful insights into the factors behind performance and in accurately measuring understanding capabilities. In this thesis, we study the behaviour of state-of-the-art models regarding generalization to facts unseen during training in two important Information Extraction tasks: Named Entity Recognition (NER) and Relation Extraction (RE). Indeed, traditional benchmarks present substantial lexical overlap between the mentions and relations used for training and those used for evaluation, whereas the main interest of Information Extraction is to extract previously unknown information. We propose empirical studies that separate performance according to mention and relation overlap with the training set, and find that pretrained Language Models are mainly beneficial for detecting unseen mentions, in particular out of domain. While this makes them suited for real use cases, there is still a gap in performance between seen and unseen mentions that hurts generalization to new facts. In particular, even state-of-the-art end-to-end Relation Extraction (ERE) models rely on a shallow retention heuristic, basing their predictions more on the surface forms of arguments than on context.
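
The seen/unseen separation described above boils down to a lexical comparison of evaluation mentions against the training set. The sketch below illustrates the idea only; it is not the thesis' exact protocol, and the function name `partition_by_overlap` and the toy mention lists are illustrative, with mentions assumed to be available as (surface form, entity type) pairs.

```python
from collections import defaultdict

def partition_by_overlap(train_mentions, test_mentions):
    """Split test mentions into 'seen' and 'unseen' buckets, depending on
    whether their exact surface form also occurs in the training set.
    Each mention is a (surface_form, entity_type) pair."""
    train_surfaces = {surface.lower() for surface, _ in train_mentions}
    buckets = defaultdict(list)
    for surface, etype in test_mentions:
        key = "seen" if surface.lower() in train_surfaces else "unseen"
        buckets[key].append((surface, etype))
    return buckets

# Toy example: per-bucket metrics (precision/recall/F1) can then be computed
# separately to expose the performance gap between seen and unseen mentions.
train = [("Barack Obama", "PER"), ("Paris", "LOC")]
test = [("Barack Obama", "PER"), ("Jacinda Ardern", "PER"), ("Paris", "LOC")]
for bucket, mentions in partition_by_overlap(train, test).items():
    print(bucket, mentions)
```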


