Self-Bounded Prediction Suffix Tree via Approximate String Matching

02/09/2018
by Dongwoo Kim, et al.

Prediction suffix trees (PSTs) are an effective tool for sequence modelling and prediction. Current PST prediction techniques rely on exact matching between the suffix of the current sequence and the previously observed sequence. We present a provably correct algorithm for learning a PST with approximate suffix matching, which relaxes the exact matching condition. We then present a self-bounded enhancement of the algorithm in which the depth of the suffix tree grows automatically in response to the model's performance on a training sequence. In experiments on synthetic datasets and three real-world datasets, the approximate matching PST achieves better predictive performance than other PST variants.
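
To make the contrast between exact and approximate suffix matching concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes a simple count-based context table in place of a learned tree, and it assumes Hamming distance as the relaxed matching criterion; the paper's actual matching rule, its online learning updates, and its self-bounded depth growth are not reproduced here. The names `build_contexts`, `predict`, and `max_mismatch` are hypothetical.

```python
from collections import Counter, defaultdict


def build_contexts(seq, max_depth):
    """Map every context of length 0..max_depth to counts of the symbol that followed it."""
    contexts = defaultdict(Counter)
    for i, nxt in enumerate(seq):
        for d in range(max_depth + 1):
            if i - d < 0:
                break
            contexts[tuple(seq[i - d:i])][nxt] += 1
    return contexts


def hamming(a, b):
    """Number of positions at which two equal-length tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(contexts, history, max_depth, max_mismatch=0):
    """Predict the next symbol from the longest matching suffix of `history`.

    max_mismatch=0 is exact matching; max_mismatch>0 also accepts stored
    contexts within that Hamming distance (an assumption of this sketch).
    """
    for d in range(min(max_depth, len(history)), -1, -1):
        suffix = tuple(history[len(history) - d:])
        votes = Counter()
        for ctx, counts in contexts.items():
            if len(ctx) == d and hamming(ctx, suffix) <= max_mismatch:
                votes.update(counts)
        if votes:
            return votes.most_common(1)[0][0]
    return None


# Toy training sequence: "ab" is always followed by "c", but "d" dominates overall.
seq = list("abcddddabcdddd")
contexts = build_contexts(seq, max_depth=2)

# "x" is an unseen (noisy) symbol at the end of the history.
exact = predict(contexts, list("ax"), max_depth=2, max_mismatch=0)
approx = predict(contexts, list("ax"), max_depth=2, max_mismatch=1)
print(exact, approx)
```

In this toy case, exact matching finds no stored context ending in the unseen symbol and falls back to the empty context, while the relaxed match can still use the depth-2 context `("a", "b")`. The linear scan over all stored contexts is chosen for readability, not efficiency.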

