Comparing and combining some popular NER approaches on Biomedical tasks

05/30/2023
by   Harsh Verma, et al.
0

We compare three simple and popular approaches for NER: 1) SEQ (sequence-labeling with a linear token classifier) 2) SeqCRF (sequence-labeling with Conditional Random Fields), and 3) SpanPred (span-prediction with boundary token embeddings). We compare the approaches on 4 biomedical NER tasks: GENIA, NCBI-Disease, LivingNER (Spanish), and SocialDisNER (Spanish). The SpanPred model demonstrates state-of-the-art performance on LivingNER and SocialDisNER, improving F1 by 1.3 and 0.6 F1 respectively. The SeqCRF model also demonstrates state-of-the-art performance on LivingNER and SocialDisNER, improving F1 by 0.2 F1 and 0.7 respectively. The SEQ model is competitive with the state-of-the-art on the LivingNER dataset. We explore some simple ways of combining the three approaches. We find that majority voting consistently gives high precision and high F1 across all 4 datasets. Lastly, we implement a system that learns to combine the predictions of SEQ and SpanPred, generating systems that consistently give high recall and high F1 across all 4 datasets. On the GENIA dataset, we find that our learned combiner system significantly boosts F1(+1.2) and recall(+2.1) over the systems being combined. We release all the well-documented code necessary to reproduce all systems at https://github.com/flyingmothman/bionlp.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/05/2023

CLaC at SemEval-2023 Task 2: Comparing Span-Prediction and Sequence-Labeling approaches for NER

This paper summarizes the CLaC submission for the MultiCoNER 2 task whic...
research
06/06/2019

GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling

Current state-of-the-art systems for sequence labeling are typically bas...
research
02/24/2021

NLRG at SemEval-2021 Task 5: Toxic Spans Detection Leveraging BERT-based Token Classification and Span Prediction Techniques

Toxicity detection of text has been a popular NLP task in the recent yea...
research
04/25/2022

Local Hypergraph-based Nested Named Entity Recognition as Query-based Sequence Labeling

There has been a growing academic interest in the recognition of nested ...
research
05/29/2023

Extrinsic Factors Affecting the Accuracy of Biomedical NER

Biomedical named entity recognition (NER) is a critial task that aims to...
research
03/23/2021

TMR: Evaluating NER Recall on Tough Mentions

We propose the Tough Mentions Recall (TMR) metrics to supplement traditi...
research
10/07/2019

Why Attention? Analyzing and Remedying BiLSTM Deficiency in Modeling Cross-Context for NER

State-of-the-art approaches of NER have used sequence-labeling BiLSTM as...

Please sign up or login with your details

Forgot password? Click here to reset