Modifying the Symbolic Aggregate Approximation Method to Capture Segment Trend Information

by Muhammad Marwan Muhammad Fuad, et al.

The Symbolic Aggregate approXimation (SAX) is a very popular symbolic dimensionality reduction technique for time series data, as it has several advantages over other dimensionality reduction techniques. One of its major advantages is its efficiency, as it uses precomputed distances. The other main advantage is that in SAX the distance measure defined on the reduced space lower bounds the distance measure defined on the original space. This enables SAX to return exact results in query-by-content tasks. Yet SAX has an inherent drawback: its inability to capture segment trend information. Several researchers have attempted to enhance SAX by proposing modifications that include trend information. However, these come at the expense of giving up one or more of the advantages of SAX. In this paper we investigate three modifications of SAX that add trend-capturing ability to it. These modifications retain the features of SAX in terms of simplicity, efficiency, and the exact results it returns. They are simple procedures based on a segmentation of the time series different from that used in classic-SAX. We test the performance of these three modifications on 45 time series datasets of different sizes, dimensions, and nature on a classification task, and we compare it to that of classic-SAX. The results show that one of these modifications outperforms classic-SAX and that another gives slightly better results than classic-SAX.
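To make the properties mentioned in the abstract concrete, here is a minimal sketch of classic SAX as it is commonly described in the literature: z-normalization, piecewise segment means, Gaussian breakpoints, and a precomputed symbol-distance table used by the lower-bounding distance. It does not implement the three modifications studied in the paper, whose details are not given in the abstract. All function names (`znorm`, `paa`, `breakpoints`, `sax_word`, `dist_table`, `mindist`) are illustrative assumptions, and the sketch assumes the series length is divisible by the number of segments.

```python
import bisect
import math
from statistics import NormalDist


def znorm(series):
    """Z-normalize a series (population standard deviation)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in series]


def paa(series, n_segments):
    """Mean of each of n_segments equal-length segments."""
    n = len(series)
    if n % n_segments != 0:
        raise ValueError("sketch assumes len(series) divisible by n_segments")
    seg = n // n_segments
    return [sum(series[i * seg:(i + 1) * seg]) / seg for i in range(n_segments)]


def breakpoints(alphabet_size):
    """Cut points splitting the standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def sax_word(series, n_segments, alphabet_size):
    """SAX word as a list of symbol indices in 0..alphabet_size-1."""
    bps = breakpoints(alphabet_size)
    means = paa(znorm(series), n_segments)
    return [bisect.bisect_right(bps, m) for m in means]


def dist_table(alphabet_size):
    """Precomputed symbol-to-symbol distances (0 for equal/adjacent symbols)."""
    bps = breakpoints(alphabet_size)
    table = [[0.0] * alphabet_size for _ in range(alphabet_size)]
    for r in range(alphabet_size):
        for c in range(alphabet_size):
            if abs(r - c) > 1:
                table[r][c] = bps[max(r, c) - 1] - bps[min(r, c)]
    return table


def mindist(word_a, word_b, series_length, table):
    """Lower bound on the Euclidean distance of the z-normalized series."""
    w = len(word_a)
    total = sum(table[a][b] ** 2 for a, b in zip(word_a, word_b))
    return math.sqrt(series_length / w) * math.sqrt(total)


# Two series whose segments have equal means but opposite trends.
rising_then_falling = [1, 2, 3, 4, 8, 7, 6, 5]
falling_then_rising = [4, 3, 2, 1, 5, 6, 7, 8]
word_x = sax_word(rising_then_falling, 2, 4)
word_y = sax_word(falling_then_rising, 2, 4)
```

In this sketch `word_x` and `word_y` are identical, because each symbol depends only on the segment mean. This is the trend-blindness the abstract refers to. The table returned by `dist_table` is computed once per alphabet size, which is where the efficiency from precomputed distances comes from.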




