Support Vector Machine Active Learning Algorithms with Query-by-Committee versus Closest-to-Hyperplane Selection

01/24/2018
by Michael Bloodgood, et al.

This paper investigates and evaluates support vector machine active learning algorithms for use with imbalanced datasets, which commonly arise in applications such as information extraction. Algorithms based on closest-to-hyperplane selection and query-by-committee selection are combined with methods for addressing imbalance, such as positive amplification based on prevalence statistics from initial random samples. Three algorithms (ClosestPA, QBagPA, and QBoostPA) are presented and carefully evaluated on datasets for text classification and relation extraction. The ClosestPA algorithm is shown to consistently outperform the other two in a variety of ways, and insights are provided as to why this is the case.
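As a rough illustration of the two ingredients named in the abstract, the sketch below shows closest-to-hyperplane selection for a linear SVM together with one possible way to derive a positive amplification factor from an initial random sample. The abstract does not give the paper's exact formulation, so everything here is an assumption made for illustration: the function names, the linear decision function with weights `w` and bias `b`, and the choice of the negative-to-positive ratio as the amplification factor.

```python
from typing import List, Sequence


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed score w.x + b of a linear SVM; a smaller |score| means closer to the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def closest_to_hyperplane(
    w: Sequence[float], b: float, pool: Sequence[Sequence[float]], batch_size: int
) -> List[int]:
    """Return indices of the batch_size unlabeled pool points with the smallest |w.x + b|."""
    ranked = sorted(range(len(pool)), key=lambda i: abs(decision_value(w, b, pool[i])))
    return ranked[:batch_size]


def positive_amplification(initial_labels: Sequence[int]) -> float:
    """Assumed amplification factor: negatives / positives in the initial random sample.

    This ratio is an illustrative choice, not necessarily the paper's definition.
    """
    positives = sum(1 for y in initial_labels if y == 1)
    negatives = len(initial_labels) - positives
    if positives == 0:
        raise ValueError("initial random sample contains no positive examples")
    return negatives / positives


# Hypothetical model and unlabeled pool, for illustration only.
w, b = [1.0, -1.0], 0.0
pool = [[3.0, 0.0], [0.5, 0.25], [1.0, 2.0], [2.0, 2.5]]

query_indices = closest_to_hyperplane(w, b, pool, batch_size=2)
pos_weight = positive_amplification([1, -1, -1, -1, -1])
```

In a full active learning loop, the selected indices would be sent for annotation and the factor would be used as the relative misclassification cost of positive examples when the SVM is retrained. A query-by-committee variant would instead rank pool points by disagreement among several models.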


