Improving Tagging Consistency and Entity Coverage for Chemical Identification in Full-text Articles

11/20/2021
by Hyunjae Kim, et al.

This paper is a technical report on our system submitted to the chemical identification task of the BioCreative VII Track 2 challenge. The distinguishing feature of this challenge is that the data consist of full-text articles, whereas existing datasets usually contain only titles and abstracts. To address this setting effectively, we aim to improve tagging consistency and entity coverage with methods such as majority voting within the same article for named entity recognition (NER) and a hybrid approach that combines a dictionary and a neural model for normalization. In experiments on the NLM-Chem dataset, we show that our methods improve model performance, particularly in terms of recall. Finally, in the official evaluation of the challenge, our system ranked 1st in NER, significantly outperforming the baseline model and more than 80 submissions from 16 teams.
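
To make the idea of majority voting within an article concrete, the following is a minimal sketch and not the authors' implementation. It assumes that NER predictions for one article are available as (surface form, label) pairs, that mentions are grouped by exact surface string, and that "O" marks a span not tagged as an entity. The function name `majority_vote_labels` and the label names are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote_labels(mentions):
    """Relabel every mention with the most frequent label of its surface form.

    `mentions` is a list of (surface_text, label) pairs predicted within a
    single article. Grouping by exact surface string is an assumption made
    for this sketch; the report may group or weight mentions differently.
    """
    # Count how often each label was predicted for each surface form.
    label_counts = defaultdict(Counter)
    for text, label in mentions:
        label_counts[text][label] += 1

    # Replace each prediction with the majority label of its surface form.
    resolved = []
    for text, _ in mentions:
        winner, _count = label_counts[text].most_common(1)[0]
        resolved.append((text, winner))
    return resolved


# Hypothetical predictions: "aspirin" is missed once in the article body.
predictions = [
    ("aspirin", "Chemical"),
    ("aspirin", "O"),
    ("aspirin", "Chemical"),
    ("water", "O"),
]
consistent = majority_vote_labels(predictions)
```

In this toy input the missed "aspirin" mention is recovered because the other two occurrences in the same article were tagged as chemicals, which illustrates how enforcing consistency can raise recall on long full-text articles.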
