Finding Frequent Entities in Continuous Data

05/08/2018
by Ferran Alet, et al.

In many applications that involve processing high-dimensional data, it is important to identify a small set of entities that account for a significant fraction of detections. Rather than formalize this as a clustering problem, in which all detections must be grouped into hard or soft categories, we formalize it as an instance of the frequent items or heavy hitters problem, which finds groups of tightly clustered objects that have a high density in the feature space. We show that the heavy hitters formulation generates solutions that are more accurate and effective than the clustering formulation. In addition, we present a novel online algorithm for heavy hitters, called HAC, which addresses problems in continuous space, and demonstrate its effectiveness on real video and household domains.

