Active Learning with Tabular Language Models

11/08/2022
by Martin Ringsquandl, et al.

Despite recent advances in tabular language model research, real-world applications remain challenging. In industry, tables abound in spreadsheets, but acquiring substantial amounts of labels is expensive, since only experts can annotate the often highly technical and domain-specific tables. Active learning could potentially reduce labeling costs; however, so far there is no work on active learning in conjunction with tabular language models. In this paper, we investigate different acquisition functions in a real-world industrial tabular language model use case for sub-cell named entity recognition. Our results show that cell-level acquisition functions with built-in diversity can significantly reduce the labeling effort, while enforced table diversity is detrimental. We further see open fundamental questions concerning computational efficiency and the perspective of human annotators.
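
To make the notion of a cell-level acquisition function concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes a hypothetical pool in which each unlabeled cell, keyed by (table id, row, column), carries the model's per-token label distributions. It scores every cell by its mean token entropy and queries the most uncertain cells. The names `token_entropy`, `cell_uncertainty`, and `select_cells` are illustrative, and the sketch deliberately omits any diversity term.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one token's label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def cell_uncertainty(token_probs):
    """Mean token entropy over a cell; 0.0 for a cell without tokens."""
    if not token_probs:
        return 0.0
    return sum(token_entropy(p) for p in token_probs) / len(token_probs)


def select_cells(pool, budget):
    """Return the ids of the `budget` most uncertain cells.

    `pool` maps a hypothetical cell id (table_id, row, col) to a list of
    per-token label distributions predicted by the current model.
    """
    ranked = sorted(
        pool,
        key=lambda cell_id: cell_uncertainty(pool[cell_id]),
        reverse=True,
    )
    return ranked[:budget]


# Toy pool (made-up probabilities): three cells from two tables.
pool = {
    ("t1", 0, 0): [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01]],
    ("t1", 0, 1): [[0.40, 0.35, 0.25], [0.34, 0.33, 0.33]],
    ("t2", 3, 2): [[0.70, 0.20, 0.10]],
}
queried = select_cells(pool, budget=2)
```

In this toy pool the near-uniform cell ("t1", 0, 1) ranks first and the confidently predicted cell ("t1", 0, 0) is left unqueried. A table-level variant would instead aggregate these scores per table before ranking.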

