Sequence Graph Transform (SGT): A Feature Extraction Function for Sequence Data Mining (Extended Version)

08/11/2016
by Chitta Ranjan, et al.

The ubiquitous presence of sequence data across fields such as the web, healthcare, bioinformatics, and text mining has made sequence mining a vital research area. However, sequence mining is particularly challenging because it is difficult to find a (dis)similarity or distance between sequences. A distance measure is not obvious because sequences are unstructured: they are arbitrary strings of arbitrary length. Feature representations such as n-grams are often used, but they either fail to capture both short- and long-term sequence patterns or are computationally expensive. We propose a new function, Sequence Graph Transform (SGT), that extracts short- and long-term sequence features and embeds them in a finite-dimensional feature space. Importantly, SGT has low computational cost and can extract any amount of short- to long-term patterns without any increase in computation, which is also proved theoretically in this paper. As a result, SGT yields superior results, with significantly higher accuracy and lower computation than existing methods. We show this through several experiments and through real-world applications of SGT to clustering, classification, search, and visualization.
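To make the n-gram trade-off mentioned in the abstract concrete, here is a minimal Python sketch of plain n-gram count features. It is not SGT and is not taken from the paper; the function name `ngram_features` and the two-symbol alphabet are assumptions made for illustration. It shows how sequences of different lengths map to vectors of the same size, and how that size grows as the alphabet size raised to the power n when longer patterns are wanted.

```python
from collections import Counter
from itertools import product


def ngram_features(sequence, alphabet, n):
    """Return n-gram counts in a fixed order (hypothetical helper, not SGT)."""
    # Count every contiguous window of length n in the sequence.
    counts = Counter(
        tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)
    )
    # One dimension per possible n-gram: len(alphabet) ** n entries in total.
    return [counts[gram] for gram in product(alphabet, repeat=n)]


alphabet = ["A", "B"]          # assumed toy alphabet
seq_short = list("ABAB")
seq_long = list("ABABBBA")

# Both sequences land in the same 4-dimensional space (AA, AB, BA, BB).
bigrams_short = ngram_features(seq_short, alphabet, 2)
bigrams_long = ngram_features(seq_long, alphabet, 2)

# Capturing longer patterns means a larger n, and the dimension grows as 2 ** n.
trigram_dim = len(ngram_features(seq_long, alphabet, 3))
```

With a realistic alphabet the exponential growth in dimension is what makes large n costly, which is the limitation the abstract attributes to n-gram style representations.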


research
11/03/2022

Convolution channel separation and frequency sub-bands aggregation for music genre classification

In music, short-term features such as pitch and tempo constitute long-te...
research
12/04/2017

Long-Term Visual Object Tracking Benchmark

In this paper, we propose a new long video dataset (called Track Long an...
research
12/12/2018

Long-Term Feature Banks for Detailed Video Understanding

To understand the world, we humans constantly need to relate the present...
research
05/17/2019

Reference-Based Sequence Classification

Sequence classification is an important data mining task in many real wo...
research
06/20/2022

A Novel Long-term Iterative Mining Scheme for Video Salient Object Detection

The existing state-of-the-art (SOTA) video salient object detection (VSO...
research
09/05/2023

iLoRE: Dynamic Graph Representation with Instant Long-term Modeling and Re-occurrence Preservation

Continuous-time dynamic graph modeling is a crucial task for many real-w...
research
06/13/2021

InfoBehavior: Self-supervised Representation Learning for Ultra-long Behavior Sequence via Hierarchical Grouping

E-commerce companies have to face abnormal sellers who sell potentially-...
