A Data-Structure for Approximate Longest Common Subsequence of A Set of Strings

08/04/2020
by   Sepideh Aghamolaei, et al.

Given a set I of k strings, their longest common subsequence (LCS) is the longest string that is a subsequence of every string in I. A data structure for this problem preprocesses I so that the LCS of a set of query strings Q with the strings of I can be computed faster. Since the problem is NP-hard for arbitrary k, we allow an error in which some characters may be replaced by other characters. We define the approximation version of the problem with an extra input m, the length of the regular expression (regex) that describes the input; the approximation factor is the logarithm of the number of possibilities in the regex returned by the algorithm, divided by the logarithm of the number of possibilities in the regex with the minimum number of possibilities.
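To make the exact problem concrete, the following is a minimal sketch of computing the LCS of a small set of strings by memoized search over one index per string. It is not the data structure or the approximation algorithm from the paper; the function name `lcs_of_set` is hypothetical, and the running time grows exponentially with k, which is consistent with the NP-hardness noted above.

```python
from functools import lru_cache


def lcs_of_set(strings):
    """Return one longest common subsequence of all strings (exact, exponential in k).

    Illustrative baseline only; this is an assumption-laden sketch, not the
    paper's data structure.
    """
    strings = tuple(strings)
    if not strings:
        return ""

    @lru_cache(maxsize=None)
    def solve(positions):
        # positions[i] is the index in strings[i] from which matching may continue.
        if any(p >= len(s) for p, s in zip(positions, strings)):
            return ""
        best = ""
        # Try every character that still occurs in the first string as the
        # next character of the common subsequence.
        for c in set(strings[0][positions[0]:]):
            next_positions = []
            for p, s in zip(positions, strings):
                j = s.find(c, p)  # earliest occurrence is always a safe choice
                if j == -1:
                    break  # c is missing from some string, so it cannot be used
                next_positions.append(j + 1)
            else:
                candidate = c + solve(tuple(next_positions))
                if len(candidate) > len(best):
                    best = candidate
        return best

    return solve(tuple(0 for _ in strings))


if __name__ == "__main__":
    print(lcs_of_set(["abcde", "ace", "aXcYe"]))
```

When several subsequences share the maximum length, which one is returned depends on iteration order, so only the length should be relied on in that case.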

research
06/25/2018

Fast entropy-bounded string dictionary look-up with mismatches

We revisit the fundamental problem of dictionary look-up with mismatches...
research
04/16/2019

Dynamic Packed Compact Tries Revisited

Given a dynamic set K of k strings of total length n whose characters ar...
research
05/23/2023

Engineering Rank/Select Data Structures for Big-Alphabet Strings

Big-alphabet strings are common in several scenarios such as information...
research
06/29/2020

Pattern Masking for Dictionary Matching

In the Pattern Masking for Dictionary Matching (PMDM) problem, we are gi...
research
04/25/2019

SafeStrings: Representing Strings as Structured Data

Strings are ubiquitous in code. Not all strings are created equal, some ...
research
01/25/2022

The development of a portable elbow exoskeleton with a Twisted Strings Actuator to assist patients with upper limb inhabitation

Over the years, the number of exoskeleton devices utilized for upper-lim...
research
02/17/2016

Lexis: An Optimization Framework for Discovering the Hierarchical Structure of Sequential Data

Data represented as strings abounds in biology, linguistics, document mi...
