Fast and linear-time string matching algorithms based on the distances of q-gram occurrences

02/19/2020
by Satoshi Kobayashi, et al.

Given a text T of length n and a pattern P of length m, the string matching problem is the task of finding all occurrences of P in T. In this study, we propose an algorithm that solves this problem in O((n + m)q) time by considering the distance between two adjacent occurrences of the same q-gram in P. We also propose a theoretical improvement that runs in O(n + m) time, although it is not necessarily faster in practice. We compare the execution times of our algorithms and existing ones on various real and artificial datasets, including an English text, a genome sequence, and a Fibonacci string. The experimental results show that our algorithm is as fast as state-of-the-art algorithms in many cases, particularly when the pattern appears frequently in the text.
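
To make the two notions in the abstract concrete, the following Python sketch shows (1) a naive baseline for the string matching problem and (2) one way to tabulate the distances between adjacent occurrences of the same q-gram in P. This is not the authors' algorithm, which the abstract does not spell out. The function names `find_occurrences` and `qgram_adjacent_distances` are hypothetical and chosen only for illustration, and the baseline runs in O(nm) time rather than the bounds stated above.

```python
def find_occurrences(text, pattern):
    """Naive baseline (not the paper's method): every start index of pattern in text."""
    n, m = len(text), len(pattern)
    return [i for i in range(n - m + 1) if text[i:i + m] == pattern]


def qgram_adjacent_distances(pattern, q):
    """Map each q-gram of pattern to the distances between its adjacent occurrences.

    Illustrative assumption: "distance" means the difference between the start
    positions of two consecutive occurrences of the same q-gram in pattern.
    """
    last_seen = {}   # q-gram -> start position of its most recent occurrence
    distances = {}   # q-gram -> list of gaps between consecutive occurrences
    for i in range(len(pattern) - q + 1):
        gram = pattern[i:i + q]
        if gram in last_seen:
            distances.setdefault(gram, []).append(i - last_seen[gram])
        last_seen[gram] = i
    return distances


if __name__ == "__main__":
    print(find_occurrences("abababa", "aba"))
    print(qgram_adjacent_distances("abcabcab", 2))
```

In a pattern such as "abcabcab" with q = 2, every repeated q-gram recurs at a fixed gap, which is the kind of regularity a distance-based method could plausibly exploit. Q-grams that occur only once in the pattern have no entry in the result.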

research
03/10/2023

Optimal-Hash Exact String Matching Algorithms

String matching is the problem of finding all the occurrences of a patte...
research
01/31/2022

Fuzzy Segmentations of a String

This article discusses a particular case of the data clustering problem,...
research
11/09/2021

Pattern Matching on Grammar-Compressed Strings in Linear Time

The most fundamental problem considered in algorithms for text processin...
research
06/28/2023

Approximate Cartesian Tree Matching: an Approach Using Swaps

Cartesian tree pattern matching consists of finding all the factors of a...
research
02/17/2020

Detecting k-(Sub-)Cadences and Equidistant Subsequence Occurrences

The equidistant subsequence pattern matching problem is considered. Give...
research
08/23/2021

On Specialization of a Program Model of Naive Pattern Matching in Strings (Extended Abstract)

We have proved that for any pattern p the tail recursive program model o...
research
04/24/2017

GaKCo: a Fast GApped k-mer string Kernel using COunting

String Kernel (SK) techniques, especially those using gapped k-mers as f...
