LCP-Aware Parallel String Sorting

06/03/2020
by Jonas Ellert, et al.

When sorting strings lexicographically, it is not always necessary to inspect all symbols. For example, the lexicographical rank of "europar" amongst the strings "eureka", "eurasia", and "excells" depends only on its so-called relevant prefix "euro". The distinguishing prefix size D of a set of strings is the number of symbols that actually need to be inspected to establish the lexicographical ordering of all strings. Efficient string sorters should be D-aware, i.e., their complexity should depend on D rather than on the total number N of symbols in all strings. While there are many D-aware sorters in the sequential setting, there appear to be no such results in the PRAM model. We propose a framework that yields a D-aware modification of any existing PRAM string sorter. The derived algorithms are work-optimal with respect to their original counterparts: if the original algorithm requires O(w(N)) work, the derived one requires O(w(D)) work. The execution time increases only by a small factor that is logarithmic in the length of the longest relevant prefix. Our framework works universally for deterministic and randomized algorithms in all variations of the PRAM model, so that future improvements in (D-unaware) parallel string sorting will directly result in improvements in D-aware parallel string sorting.
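To make the notions of relevant prefix and distinguishing prefix size concrete, the following sequential Python sketch computes both for the example strings from the abstract. It is only an illustration of the definitions, not the PRAM framework the paper proposes. The function names `lcp` and `relevant_prefix_lengths` are hypothetical, and the sketch assumes one common convention: the relevant prefix of a string is one symbol longer than its longest common prefix with any other string in the set, capped at the string's own length.

```python
def lcp(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def relevant_prefix_lengths(strings: list[str]) -> list[int]:
    """Relevant prefix length of each string, in input order.

    Assumed convention (not taken from the paper): the relevant prefix is
    one symbol longer than the longest common prefix shared with any other
    string, capped at the string's own length. In sorted order that longest
    common prefix is always shared with a direct neighbour, so only the
    neighbours need to be compared.
    """
    order = sorted(range(len(strings)), key=lambda i: strings[i])
    result = [0] * len(strings)
    for pos, i in enumerate(order):
        best = 0
        if pos > 0:
            best = max(best, lcp(strings[i], strings[order[pos - 1]]))
        if pos + 1 < len(order):
            best = max(best, lcp(strings[i], strings[order[pos + 1]]))
        result[i] = min(len(strings[i]), best + 1)
    return result


strings = ["europar", "eureka", "eurasia", "excells"]
lengths = relevant_prefix_lengths(strings)
prefixes = [s[:k] for s, k in zip(strings, lengths)]
D = sum(lengths)                    # distinguishing prefix size
N = sum(len(s) for s in strings)    # total number of symbols
print(prefixes, D, N)
```

For this input the relevant prefixes are "euro", "eure", "eura", and "ex", so D = 14 while N = 27. The sketch sorts the full strings first, which is exactly what a D-aware sorter avoids; it is used here only to keep the definition easy to check.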


