Subgroup Discovery in Unstructured Data

07/15/2022
by Ali Arab, et al.

Subgroup discovery is a descriptive and exploratory data mining technique that identifies subgroups of a population exhibiting interesting behavior with respect to a variable of interest. It has numerous applications in knowledge discovery and hypothesis generation, yet it remains inapplicable to unstructured, high-dimensional data such as images. The reason is that subgroup discovery algorithms define descriptive rules over (attribute, value) pairs; in unstructured data, however, an attribute is not well defined. Even where a notion of attribute intuitively exists, such as a pixel in an image, the high dimensionality of the data means that these attributes are not informative enough to be used in a rule. In this paper, we introduce the subgroup-aware variational autoencoder, a novel variational autoencoder that learns a representation of unstructured data that leads to higher-quality subgroups. Our experimental results demonstrate that the method learns high-quality subgroups while supporting the interpretability of the concepts.
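To make the notion of a rule over (attribute, value) pairs concrete, the following is a minimal sketch of scoring such rules on tabular data. It is not the method of the paper: the quality measure used here, weighted relative accuracy (WRAcc), is a common choice in the subgroup discovery literature and is an assumption, since the abstract does not name a measure. The helper names `covers` and `wracc`, and the toy records, are hypothetical and serve only as illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]
# A rule is a conjunction of (attribute, value) pairs.
Rule = List[Tuple[str, object]]


def covers(rule: Rule, row: Row) -> bool:
    """Return True if the row satisfies every (attribute, value) pair."""
    return all(row.get(attr) == value for attr, value in rule)


def wracc(rule: Rule, rows: List[Row], target: str) -> float:
    """Weighted relative accuracy of a rule for a binary target.

    WRAcc = coverage * (target rate in subgroup - target rate overall).
    This measure is an assumption; the abstract does not specify one.
    """
    n = len(rows)
    covered = [r for r in rows if covers(rule, r)]
    if n == 0 or not covered:
        return 0.0
    p_all = sum(bool(r[target]) for r in rows) / n
    p_sub = sum(bool(r[target]) for r in covered) / len(covered)
    return (len(covered) / n) * (p_sub - p_all)


# Hypothetical toy population; "disease" is the variable of interest.
rows = [
    {"smoker": "yes", "age": "old", "disease": True},
    {"smoker": "yes", "age": "young", "disease": True},
    {"smoker": "no", "age": "old", "disease": False},
    {"smoker": "no", "age": "young", "disease": False},
]

for rule in ([("smoker", "yes")], [("age", "old")]):
    print(rule, wracc(rule, rows, target="disease"))
```

In this toy population the rule on `smoker` separates the target and scores higher than the rule on `age`, which carries no information about it. With images, a per-pixel attribute would rarely yield such a rule, which is the gap the abstract describes.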


research
04/24/2018

Mask-aware Photorealistic Face Attribute Manipulation

The task of face attribute manipulation has found increasing application...
research
08/12/2019

Variational Autoencoded Regression: High Dimensional Regression of Visual Data on Complex Manifold

This paper proposes a new high dimensional regression method by merging ...
research
04/01/2022

Semi-FairVAE: Semi-supervised Fair Representation Learning with Adversarial Variational Autoencoder

Adversarial learning is a widely used technique in fair representation l...
research
08/11/2022

RandomSCM: interpretable ensembles of sparse classifiers tailored for omics data

Background: Understanding the relationship between the Omics and the phe...
research
10/12/2021

Label scarcity in biomedicine: Data-rich latent factor discovery enhances phenotype prediction

High-quality data accumulation is now becoming ubiquitous in the health ...
research
04/12/2020

Variational Autoencoders with Normalizing Flow Decoders

Recently proposed normalizing flow models such as Glow have been shown t...
research
06/28/2022

Adaptive Multi-view Rule Discovery for Weakly-Supervised Compatible Products Prediction

On e-commerce platforms, predicting if two products are compatible with ...
