Pre-trained Language Model Based Active Learning for Sentence Matching

by Guirong Bai, et al.

Active learning can significantly reduce the annotation cost of data-driven techniques. However, previous active learning approaches for natural language processing mainly depend on the entropy-based uncertainty criterion and ignore the characteristics of natural language. In this paper, we propose a pre-trained language model based active learning approach for sentence matching. Unlike previous active learning approaches, it provides linguistic criteria for measuring instances and helps select more informative instances for annotation. Experiments demonstrate that our approach achieves higher accuracy with fewer labeled training instances.
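The entropy-based uncertainty criterion that the paper contrasts against can be sketched in a few lines. The snippet below is a minimal, generic illustration of that baseline (not the paper's proposed method): instances whose predicted class distribution has the highest Shannon entropy are selected for annotation. The `pool` structure and function names are illustrative assumptions, not from the paper.

```python
import math

def entropy(probs):
    # Shannon entropy of a predictive distribution (natural log).
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_annotation(pool, k):
    """Entropy-based uncertainty sampling: pick the k unlabeled
    instances whose model predictions are most uncertain.

    `pool` maps instance ids to predicted class distributions
    (lists of probabilities summing to 1).
    """
    ranked = sorted(pool.items(), key=lambda kv: entropy(kv[1]), reverse=True)
    return [instance_id for instance_id, _ in ranked[:k]]

# Example: a uniform distribution is maximally uncertain, so "a" ranks first.
pool = {"a": [0.5, 0.5], "b": [0.9, 0.1], "c": [0.6, 0.4]}
print(select_for_annotation(pool, 2))  # → ['a', 'c']
```

The paper's contribution is to replace this purely distributional score with linguistic criteria derived from a pre-trained language model; the sketch above only shows the baseline being improved upon.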



