Automated Testing and Improvement of Named Entity Recognition Systems

by Boxi Yu, et al.

Named entity recognition (NER) systems have seen rapid progress in recent years due to the development of deep neural networks. These systems are widely used in various natural language processing applications, such as information extraction, question answering, and sentiment analysis. However, the complexity and intractability of deep neural networks can make NER systems unreliable in certain circumstances, resulting in incorrect predictions. For example, NER systems may misidentify female names as chemicals or fail to recognize the names of minority groups, leading to user dissatisfaction. To tackle this problem, we introduce TIN, a novel, widely applicable approach for automatically testing and repairing various NER systems. The key idea for automated testing is that the NER predictions of the same named entities under similar contexts should be identical. The core idea for automated repairing is that similar named entities should have the same NER prediction under the same context. We use TIN to test two SOTA NER models and two commercial NER APIs, i.e., Azure NER and AWS NER. We manually verify 784 of the suspicious issues reported by TIN and find that 702 are erroneous issues, leading to high precision (85.0% […]) […] over-labeling, incorrect category, and range error. For automated repairing, TIN achieves a high error reduction rate (26.8% […]) […] under test, successfully repairing 1,056 of the 1,877 reported NER errors.
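
The testing idea stated in the abstract (the same named entity should receive the same prediction under similar contexts) can be illustrated with a small consistency check. The following is only a minimal sketch under stated assumptions, not TIN's implementation: `find_suspicious`, `toy_ner`, the `"O"` placeholder for a missing prediction, and the dictionary-based prediction format are all hypothetical names chosen for illustration, and the abstract does not say how TIN generates the similar sentences.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical format: a prediction maps each entity string to its predicted label.
Prediction = Dict[str, str]
NerFn = Callable[[str], Prediction]


def find_suspicious(ner: NerFn, original: str, variant: str) -> List[Tuple[str, str, str]]:
    """Return (entity, label_in_original, label_in_variant) for every entity that
    appears in both sentences but is labeled differently.
    An entity the model did not label is reported as "O"."""
    pred_a, pred_b = ner(original), ner(variant)
    # Only entities whose text occurs in both sentences can be compared.
    shared = [e for e in set(pred_a) | set(pred_b) if e in original and e in variant]
    issues = []
    for entity in sorted(shared):
        a, b = pred_a.get(entity, "O"), pred_b.get(entity, "O")
        if a != b:
            issues.append((entity, a, b))
    return issues


def toy_ner(sentence: str) -> Prediction:
    """Stand-in model with a deliberate context-sensitive bug (illustration only)."""
    pred: Prediction = {}
    if "Alice" in sentence:
        pred["Alice"] = "PERSON" if "visited" in sentence else "CHEMICAL"
    if "Paris" in sentence:
        pred["Paris"] = "LOCATION"
    return pred


issues = find_suspicious(
    toy_ner,
    "Alice visited Paris last week.",
    "Alice travelled to Paris last week.",
)
print(issues)
```

In this sketch, an inconsistent label for a shared entity is reported as a suspicious issue only; as in the abstract, such reports would still need verification before being counted as errors.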


