Granular conditional entropy-based attribute reduction for partially labeled data with proxy labels

01/23/2021
by Can Gao, et al.

Attribute reduction is one of the most important research topics in rough set theory, and many rough set-based attribute reduction methods have been proposed. However, most of them are designed for either labeled data or unlabeled data, whereas many real-world applications involve partially labeled data. In this paper, we propose a rough set-based semi-supervised attribute reduction method for partially labeled data. Specifically, using prior class distribution information about the data, we first develop a simple yet effective strategy to produce proxy labels for unlabeled data. We then integrate the concept of information granularity into an information-theoretic measure to obtain a novel granular conditional entropy measure, and we prove its monotonicity. Furthermore, we provide a fast heuristic algorithm to generate the optimal reduct of partially labeled data; it accelerates attribute reduction by removing irrelevant examples and excluding redundant attributes simultaneously. Extensive experiments on UCI data sets demonstrate that the proposed semi-supervised attribute reduction method is promising and even compares favourably, in terms of classification performance, with supervised methods applied to the labeled data and to the unlabeled data with their true labels.
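To make the general idea of entropy-guided attribute reduction concrete, the following is a minimal sketch, not the method of the paper. It uses the standard Shannon conditional entropy H(D | B) over the equivalence classes induced by an attribute subset B, rather than the granular conditional entropy proposed here, and it assumes every example already carries a label (true or proxy). It also omits the acceleration step that removes irrelevant examples. The names `conditional_entropy` and `greedy_reduct` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, labels, attrs):
    """Shannon conditional entropy H(D | attrs) of the labels given the
    partition of the examples induced by the attribute indices in attrs."""
    blocks = defaultdict(list)
    for row, label in zip(rows, labels):
        # Examples with equal values on attrs fall into the same block.
        blocks[tuple(row[a] for a in attrs)].append(label)
    n = len(rows)
    h = 0.0
    for block in blocks.values():
        size = len(block)
        for count in Counter(block).values():
            p = count / size
            h -= (size / n) * p * log2(p)
    return h


def greedy_reduct(rows, labels, tol=1e-12):
    """Forward greedy search: repeatedly add the attribute that lowers the
    conditional entropy the most, until it matches that of the full set."""
    all_attrs = list(range(len(rows[0])))
    target = conditional_entropy(rows, labels, all_attrs)
    selected = []
    remaining = list(all_attrs)
    current = conditional_entropy(rows, labels, selected)
    while remaining and current - target > tol:
        best = min(
            remaining,
            key=lambda a: conditional_entropy(rows, labels, selected + [a]),
        )
        selected.append(best)
        remaining.remove(best)
        current = conditional_entropy(rows, labels, selected)
    return selected


# Toy data (assumed for illustration): attribute 1 determines the label,
# attributes 0 and 2 are duplicates of each other and carry no label information.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
labels = [0, 1, 0, 1]
reduct = greedy_reduct(rows, labels)
```

On this toy table the search stops after selecting attribute 1 alone, because that attribute already brings the conditional entropy down to the value obtained with all attributes. A greedy search of this kind yields a reduct candidate but does not by itself guarantee a minimal one.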
