String-to-String Interpretations with Polynomial-Size Output

05/30/2019 ∙ by Mikołaj Bojańczyk, et al. ∙ RWTH Aachen University ∙ University of Warsaw

String-to-string MSO interpretations are like Courcelle's MSO transductions, except that a single output position can be represented using a tuple of input positions instead of just a single input position. In particular, the output length is polynomial in the input length, as opposed to MSO transductions, which have output of linear length. We show that string-to-string MSO interpretations are exactly the polyregular functions. The latter class has various characterizations, one of which is that it consists of the string-to-string functions recognized by pebble transducers. Our main result implies the surprising fact that string-to-string MSO interpretations are closed under composition.




1 Introduction

A string-to-string function is called regular if it is computed by a deterministic two-way automaton with output. There are many equivalent models for the same class of functions: string-to-string MSO transductions [engelfriet2001mso], streaming string transducers [alur2011streaming], and various kinds of combinator-based formalisms [alur2014regular, DBLP:conf/lics/DaveGK18, DBLP:conf/lics/BojanczykDK18].

A deterministic two-way automaton can visit each input position at most once in each state, since otherwise it would loop forever. This means that the length of the run, and hence the size of the output word, is linear in the length of the input string. One way to go beyond linear-size outputs was proposed by Milo, Suciu, and Vianu [milo2003typechecking], following earlier work by Globerman and Harel [Globerman:1996he]: equip the automaton with pebbles which can be used to mark positions in the input word. To avoid making the model Turing-powerful, the pebbles are required to observe a so-called stack discipline: the pebbles are organised in a stack, and only the top-most pebble can be moved. In [DBLP:journals/corr/abs-1810-08760], it is shown that pebble transducers are equivalent to several other models: a higher-order functional programming language [DBLP:journals/corr/abs-1810-08760, Section 4], an imperative programming language with for-loops [DBLP:journals/corr/abs-1810-08760, Section 3], combinators [DBLP:journals/corr/abs-1810-08760, end of Section 4], and compositions of certain simple atomic functions [DBLP:journals/corr/abs-1810-08760, Section 1]. Because of the multitude of models and their polynomial-size outputs, the class of functions recognised by these models is called the polyregular functions.
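To make the step from linear to polynomial output concrete, here is a minimal Python sketch of a function with quadratic-size output, written in the spirit of the for-loop model mentioned above. The function name prefixes_concat and the choice of example are ours and are not taken from the cited papers; the two loop variables only loosely play the role of two pebbles, with the inner variable acting as the top-most pebble that moves while the outer one stays fixed.

```python
def prefixes_concat(word: str) -> str:
    """Concatenate all non-empty prefixes of `word`.

    Illustrative (hypothetical) example of a function whose output
    has length n(n+1)/2 for an input of length n.
    """
    out = []
    # Outer loop: the bottom "pebble" i marks the end of the current prefix.
    for i in range(len(word)):
        # Inner loop: the top "pebble" j scans positions 0..i.
        # Only j moves here; i stays put until the inner loop finishes.
        for j in range(i + 1):
            out.append(word[j])
    return "".join(out)


if __name__ == "__main__":
    w = "abc"
    result = prefixes_concat(w)
    print(result)  # aababc
    n = len(w)
    # The output length is quadratic in the input length.
    assert len(result) == n * (n + 1) // 2
```

Each output letter is determined by a pair (i, j) of input positions with j ≤ i, so the output length grows quadratically. No deterministic two-way automaton without pebbles can produce this output, since its output is at most linear in the input length.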

The list of models for polyregular functions described in [DBLP:journals/corr/abs-1810-08760] does not include any logical model. In this paper, we fix that omission. As mentioned above, for the regular functions, which have linear-size output, the logical model consists of string-to-string MSO transductions. In an MSO transduction, each position of the output string is interpreted as a single position of the input string. A natural idea to capture polyregular functions is to consider what we call string-to-string MSO interpretations, where a position of the output string is represented by a k-tuple of positions in the input string. At first glance, this idea looks suspicious: if string-to-string MSO interpretations were equivalent to polyregular functions, then they would be closed under composition, because the class of polyregular functions is. However, composing two string-to-string MSO interpretations