# Distributed graph problems through an automata-theoretic lens

We study the following algorithm synthesis question: given the description of a locally checkable graph problem Π for paths or cycles, determine in which instances Π is solvable, determine what is the distributed round complexity of solving Π in the usual 𝖫𝖮𝖢𝖠𝖫 model of distributed computing, and construct an asymptotically optimal distributed algorithm for solving Π. To answer such questions, we represent Π as a nondeterministic finite automaton ℳ over a unary alphabet. We classify the states of ℳ into repeatable states, flexible states, mirror-flexible states, loops, and mirror-flexible loops; all of these can be decided in polynomial time. We show that these five classes of states completely answer all questions related to the solvability and distributed computational complexity of Π on cycles. On paths, there is one case in which the question of solvability coincides with the classical universality problem for unary regular languages, and hence determining if a given problem Π is always solvable is co-𝖭𝖯-complete. However, we show that all other questions, including the question of determining the distributed round complexity of Π and finding an asymptotically optimal algorithm for solving Π, can be answered in polynomial time.

## 1 Introduction

In this work, we introduce an automata-theoretic perspective for studying locality in distributed computing, and we use it to completely resolve questions related to the solvability and distributed computational complexity of locally checkable graph problems on unlabeled paths and cycles. In particular, we show that almost all such questions can be decided in polynomial time, with only a couple of exceptions that are (co-)$\mathsf{NP}$-complete. All of our positive results are constructive: in addition to determining the distributed computational complexity of any given problem, we can also synthesize an asymptotically optimal distributed algorithm for solving the problem.

#### Background: locality and round complexity in distributed computing.

One of the fundamental questions in distributed computing is locality: given a graph problem, how far does an individual node of a graph need to see in order to pick its own part of the solution?

In classical centralized sequential computing, a particularly successful idea has been the comparison of deterministic and nondeterministic models of computing. The classical question of $\mathsf{P}$ vs. $\mathsf{NP}$ is a prime example: given a problem in which solutions are easy to verify, are they also easy to find?

The distributed analog of this idea is formally captured in the study of so-called locally checkable labeling ($\mathsf{LCL}$) problems in the $\mathsf{LOCAL}$ model of distributed computing. $\mathsf{LCL}$ problems are graph problems in which the solutions are labelings of nodes and/or edges that can be verified locally: if a solution looks feasible in all constant-radius neighborhoods, then it is also globally feasible. A simple example of an $\mathsf{LCL}$ problem is proper $k$-coloring of a graph for a constant $k$: if a labeling of the nodes looks like a proper $k$-coloring in the radius-$1$ neighborhood of each node, then it is by definition a feasible solution.

In the $\mathsf{LOCAL}$ model of computing [23, 25], we assume that the nodes of the input graph are labeled with unique identifiers from $\{1, 2, \dotsc, \operatorname{poly}(n)\}$, where $n$ is the number of nodes; the labeling is chosen by the adversary. A distributed algorithm with a time complexity $T(n)$ is then a function that maps the radius-$T(n)$ neighborhood of each node into its local output. The local output of a node is its own part of the solution, e.g., its own color in the graph coloring problem.

If we interpret the input graph as a computer network, with nodes as computers and edges as communication links, then in $T$ synchronous communication rounds all nodes can gather full information about their radius-$T$ neighborhood. Hence time (number of communication rounds) and distance (how far one needs to see) are interchangeable in the $\mathsf{LOCAL}$ model. In what follows, we will primarily use the term round complexity.

#### Prior work: the complexity landscape of LCL problems.

Now we have a natural distributed analog of the classical $\mathsf{P}$ vs. $\mathsf{NP}$ question: given a problem that can be verified locally, how easy is it to solve locally? Put otherwise, given an $\mathsf{LCL}$ problem, what is its time complexity in the $\mathsf{LOCAL}$ model? This is a question that was already introduced by Naor and Stockmeyer in 1995, but the systematic study of the complexity landscape of $\mathsf{LCL}$ problems was started only very recently, around 2016 [7, 6, 4, 5, 11, 18, 16, 10, 19, 27, 8].

By now we have a relatively complete understanding of the possible complexity classes: to give a simple example, if we look at deterministic algorithms in the $\mathsf{LOCAL}$ model, there are $\mathsf{LCL}$ problems with complexity $\Theta(\log^* n)$, and there are also $\mathsf{LCL}$ problems with complexity $\Theta(\log n)$, but it can be shown that there is no $\mathsf{LCL}$ problem with complexity between $\omega(\log^* n)$ and $o(\log n)$ [10, 13, 23, 7].

However, much less is known about how to decide the complexity of a given $\mathsf{LCL}$ problem. Unsurprisingly, many such questions are undecidable in general, and undecidability holds already in relatively simple settings such as $\mathsf{LCL}$s on 2-dimensional grids.

#### Our focus: LCLs on paths and cycles.

In this work, we focus on a setting in which questions related to distributed complexity are known to be decidable: paths and cycles. We believe this will also pave the way for resolving analogous questions on trees.

In cycles and paths, there are only three possible complexities: $O(1)$, $\Theta(\log^* n)$, or $\Theta(n)$. Furthermore, randomness does not help in this case; this is a major difference in comparison with trees, in which there are problems in which randomness helps exponentially.

In the most general setting, we may have input labels, which are elements from some input alphabet that are attached to the nodes of the graph, and the $\mathsf{LCL}$ can refer to them. In this case the distributed complexity is decidable but unfortunately it is known to be at least $\mathsf{PSPACE}$-hard. In the other extreme we have $\mathsf{LCL}$s on unlabeled directed cycles, and in this case there is a simple graph-theoretic characterization of the distributed complexity of any given problem.

However, many questions are left open by prior work, and these are the questions that we will resolve in this work:

• What happens in undirected cycles?

• What happens if we study paths instead of cycles?

• Can we also characterize the existence of a solution for all graphs in a graph class?

To illustrate these questions, consider the following problems that can be expressed as $\mathsf{LCL}$s:

• $\Pi_1$: finding a proper $2$-coloring,

• $\Pi_2$: finding a globally consistent orientation.

The round complexity of $\Pi_1$ is $\Theta(n)$ both in cycles and paths, regardless of whether they are directed or undirected, while the complexity of $\Pi_2$ is $\Theta(n)$ in the undirected setting but it becomes $O(1)$ in the directed setting. Problems $\Pi_1$ and $\Pi_2$ are always solvable on paths, $\Pi_2$ is always solvable on cycles, but if we have an odd cycle, then a solution to $\Pi_1$ does not exist. In particular, for $\Pi_1$ there are infinitely many solvable instances and infinitely many unsolvable instances. Our goal in this work is to develop a framework that enables us to make this kind of observation automatically for any given $\mathsf{LCL}$ problem. Indeed, prior work does not exclude the possibility that answering all such questions would be possible in polynomial time (in the length of the description of the $\mathsf{LCL}$).

#### New idea: automata-theoretic perspective.

In this work we introduce an automata-theoretic perspective for studying the solvability and the distributed complexity of $\mathsf{LCL}$ problems. A labeling of a directed path with symbols from some alphabet $\Gamma$ can be interpreted as a string, and then a locally checkable problem can be interpreted as a regular language. We could then represent an $\mathsf{LCL}$ problem $\Pi$ as a finite automaton $\mathcal{M}$ such that $\mathcal{M}$ accepts a string $x$ if and only if a directed path labeled with $x$ is a feasible solution to $\Pi$. We could then try to study questions related to the distributed complexity of $\Pi$ by studying classical automata-theoretic properties of $\mathcal{M}$.

However, this is not the perspective that we take in this work, and this does not seem to lead to a useful theory of $\mathsf{LCL}$ problems. To see one challenge, consider these problems:

• $\Pi_2$: finding a proper $2$-coloring,

• $\Pi_3$: finding a proper $3$-coloring.

These are fundamentally different problems from the perspective of $\mathsf{LCL}$s in the $\mathsf{LOCAL}$ model: problem $\Pi_2$ requires $\Theta(n)$ rounds while problem $\Pi_3$ is solvable in $\Theta(\log^* n)$ rounds, but if we consider analogous automata $\mathcal{M}_2$ and $\mathcal{M}_3$ that recognize these solutions, it is not easy to identify a classical automata-theoretic concept that would separate these cases.

Instead, we take the following perspective (this is a simplified version of the idea):

Assume $\Pi$ is an $\mathsf{LCL}$ problem in which the set of output symbols is $\Gamma$. We interpret $\Pi$ as a nondeterministic finite automaton $\mathcal{M}$ over a unary alphabet such that the set of states of $\mathcal{M}$ is $\Gamma$.

At first this approach may seem counterintuitive, but as we will see in this work, it enables us to connect classical automata-theoretic concepts to properties of $\mathsf{LCL}$s.

To give one nontrivial example, consider the question of whether a given $\mathsf{LCL}$ problem can be solved in $O(\log^* n)$ rounds. This turns out to be directly connected to the existence of synchronizing words [9, 15], in the following nondeterministic sense: we say that $w$ is a synchronizing word for an NFA $\mathcal{M}$ that takes $\mathcal{M}$ into state $t$ if, given any starting state $s$, there is a sequence of state transitions that takes $\mathcal{M}$ from $s$ to state $t$ when it processes $w$. Such a word is known as a D3-directing word, and it has been further studied in [14, 22, 17]. We will show that the following holds (up to some minor technicalities):

An $\mathsf{LCL}$ problem $\Pi$ on directed paths and cycles has a round complexity of $O(\log^* n)$ if and only if the corresponding NFA $\mathcal{M}$ over the unary alphabet has a D3-directing word.

Moreover, the existence of such a word can be decided in polynomial time in the size of the NFA $\mathcal{M}$, or equivalently, in the size of the description of the $\mathsf{LCL}$ problem $\Pi$.

#### Our contributions.

We study $\mathsf{LCL}$ problems in unlabeled cycles and paths, both with and without consistent orientation. We use a formalism that is expressive enough to capture all such problems. We show how to answer the following questions in a mechanical manner, for any given $\mathsf{LCL}$ problem $\Pi$ in any of these settings:

• How many unsolvable instances are there (none, constantly many, or infinitely many)?

• How many solvable instances are there (none, constantly many, or infinitely many)?

• What is the round complexity of $\Pi$ for solvable instances ($O(1)$, $\Theta(\log^* n)$, or $\Theta(n)$)?

We show that all such questions are not only decidable but they are in (co-)$\mathsf{NP}$, and almost all such questions are in $\mathsf{P}$, with the exception of a couple of specific questions that are (co-)$\mathsf{NP}$-complete. We also give a complete classification of all possible case combinations: for example, we show that if there are infinitely many unsolvable instances, then the complexity of the problem for solvable instances cannot be $\Theta(\log^* n)$.

We give a uniform automata-theoretic formalism that enables us to study such questions, and that makes it possible to leverage prior work on automata theory. We also develop new efficient algorithms for some automata-theoretic questions that to our knowledge have not been studied before.

#### Comparison with prior work.

In comparison with [11, 1, 8, 24], our work gives a more fine-grained perspective: instead of merely discussing decidability, we explore the question of which of the decision problems are in $\mathsf{P}$ or $\mathsf{NP}$.

In comparison with the discussion of directed cycles in prior work, our work studies a much broader range of settings. Previously, it was not expected that the simple characterization of $\mathsf{LCL}$s on directed cycles could be extended in a straightforward manner to paths or undirected cycles. For example, we can define an infinite family of orientation problems that can be solved in undirected cycles in $O(1)$ rounds but that require a nontrivial algorithm; such problems do not exist in directed cycles, as $O(1)$-round solvability implies trivial $0$-round solvability. Nevertheless, as we will see in this work, we can develop an effective characterization of all $\mathsf{LCL}$ problems in all of these settings.

Furthermore, we study the graph-theoretic question of the existence of a solution in addition to the algorithmic question of the complexity of finding a solution, and relate solvability with complexity in a systematic manner; we are not aware of prior work that would do the same in the context of $\mathsf{LCL}$s in the $\mathsf{LOCAL}$ model.

#### Future work: LCLs on trees.

We envision that our approach can be extended to the study of $\mathsf{LCL}$s beyond unlabeled paths and cycles. At least in principle, one could represent $\mathsf{LCL}$s on bounded-degree trees by replacing automata with tree automata, and one could represent $\mathsf{LCL}$s with input labels by considering automata with an alphabet size more than $1$. It is not yet known if the distributed complexity of $\mathsf{LCL}$s on bounded-degree trees is decidable. However, it is known that deciding the distributed complexity of $\mathsf{LCL}$s on paths and cycles with input labels is $\mathsf{PSPACE}$-hard, and the structure of a tree can be used to encode input labels; therefore deciding the round complexity of a given $\mathsf{LCL}$ is at least $\mathsf{PSPACE}$-hard on unlabeled bounded-degree trees. However, there is hope that one could find an interesting subfamily of $\mathsf{LCL}$s that are sufficiently expressive to capture most of the fundamental problems considered in the literature, yet simple enough that the fundamental properties can be decided in polynomial time.

## 2 Representation of LCLs as automata

$\mathsf{LCL}$ problems, broadly speaking, are problems in which the task is to label nodes and/or edges with labels from a constant-size alphabet, subject to local constraints. That is, a solution is globally feasible if it looks good in all radius-$r$ neighborhoods for some constant $r$. In this section we will develop a way to represent all $\mathsf{LCL}$ problems on paths and cycles as nondeterministic automata.

Figure 1: Examples of how to encode LCL problems in the half-edge formalism, and how to represent the problem as an automaton. Here the problems are symmetric, so they are well-specified also on undirected cycles. For maximal matching, ports incident to matched nodes are labeled with “1”, ports incident to unmatched nodes are labeled with “0”, and the edge constraints ensure that there are no unmatched nodes adjacent to each other.

### 2.1 Half-edge formalism and node-edge-checkable problems

$\mathsf{LCL}$ problems come in many different forms, and we have to be able to capture, among others, problems of the following forms:

• The problem may ask for a labeling of nodes, a labeling of edges, a labeling of the endpoints of the edges, an orientation of the edges, or any combination of these.

• The input graph can be a path or a cycle.

• The input graph may be directed or undirected.

As discussed in the recent papers [2, 3], a rather elegant way to capture all $\mathsf{LCL}$ problems is the following approach:

• Each edge is split into two halves, which we call ports.

• The task is to label each port with a label from some finite set $\Gamma$.

• There is a node constraint that specifies which label combinations are feasible for the ports incident to a node.

• There is an edge constraint that specifies which label combinations are feasible for the two ports of an edge.

$\mathsf{LCL}$ problems that are specified in this formalism are called node-edge-checkable problems. It is usually fairly easy to encode any given $\mathsf{LCL}$ problem in a natural manner in this formalism; see Figure 1 for examples. Here maximal matching serves as an example of a problem in which the natural encoding of indicating which edges are part of the matching does not work (it does not capture maximality) but with a few additional labels we can precisely define a problem that is equivalent to maximal matchings.

In general, if we have any $\mathsf{LCL}$ problem $\Pi$ (in which the problem description can refer to radius-$r$ neighborhoods for some constant $r$), we can define an equivalent problem $\Pi'$ that can be represented in the node-edge formalism, modulo constant-time preprocessing and postprocessing. In brief, one label in the new problem $\Pi'$ corresponds to the labeling of a sub-path of length $O(r)$ in $\Pi$. Now given a solution of $\Pi$, one can construct a solution of $\Pi'$ in $O(r)$ rounds, and given a solution of $\Pi'$, one can construct a solution of $\Pi$ in zero rounds. Moreover, $\Pi'$ can be specified in the node-edge formalism. We will give the details in Appendix D.

#### Notation.

We will use the following notation to specify node-edge-checkable problems:

• The edge constraint $\mathcal{C}_{\text{edge}}$ consists of all ordered pairs $(a, b)$ such that we can label the first port of an edge with $a$ and the second port with $b$.

• The (internal) node constraint $\mathcal{C}_{\text{node}}$ consists of all ordered pairs $(a, b)$ such that we can label the first port of a node with $a$ and the second port with $b$.

• The head constraint $\mathcal{C}_{\text{head}}$ consists of all labels that can appear on the port adjacent to the first node of a path.

• The tail constraint $\mathcal{C}_{\text{tail}}$ consists of all labels that can appear on the port adjacent to the last node of a path.

If the input graph is a directed cycle or path, “first”, “second”, and “last” are well-defined by the globally consistent orientation given in the input. If the input graph is an undirected cycle or path, then there is no distinction between the first and the second port of a node or an edge, and no distinction between the first and the last node. In that case $\mathcal{C}_{\text{edge}}$ and $\mathcal{C}_{\text{node}}$ are symmetric relations and $\mathcal{C}_{\text{head}} = \mathcal{C}_{\text{tail}}$; we call such a problem symmetric and otherwise the problem is asymmetric. If the input graph is a cycle, we set $\mathcal{C}_{\text{head}} = \mathcal{C}_{\text{tail}} = \emptyset$. For brevity, we will usually write the pair $(a, b)$ simply as $ab$.

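For the maximal matching problem on undirected cycles (see Figure 1), we have $\mathcal{C}_{\text{edge}} = \{\mathsf{MM}, \mathsf{11}, \mathsf{10}, \mathsf{01}\}$ and $\mathcal{C}_{\text{node}} = \{\mathsf{M1}, \mathsf{1M}, \mathsf{00}\}$. The problem is symmetric.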

### 2.2 Turning node-edge-checkable problems into automata

Now consider an $\mathsf{LCL}$ problem $\Pi$ that is specified in the node-edge formalism. Construct a nondeterministic finite automaton $\mathcal{M}$ as follows; see Figures 1 and 2 for examples.

• The set of states is $\mathcal{C}_{\text{edge}}$.

• There is a transition from $(a, b)$ to $(c, d)$ whenever $(b, c) \in \mathcal{C}_{\text{node}}$.

• $(a, b)$ is a starting state whenever $a \in \mathcal{C}_{\text{head}}$.

• $(a, b)$ is an accepting state whenever $b \in \mathcal{C}_{\text{tail}}$.

We will interpret $\mathcal{M}$ as an NFA over the unary alphabet. Note that there can be multiple starting states; the automaton can choose the starting state nondeterministically.

Figure 2: Five versions of the vertex 2-coloring problem, with different starting states and accepting states. Here (a) and (d) are the only problems that are symmetric; therefore problems (b), (c), and (e) are not meaningful on undirected paths.
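The construction above can be sketched in a few lines of Python. This is only an illustration: the function name `build_automaton`, the tuple encoding of states, and the constraint sets for maximal matching (read off from the description of Figure 1) are assumptions made for this sketch.

```python
from itertools import product

def build_automaton(c_edge, c_node, c_head=frozenset(), c_tail=frozenset()):
    """Return (states, transitions, starts, accepts) for a node-edge-checkable problem.

    States are the pairs in c_edge, and there is a transition
    (a, b) -> (c, d) whenever (b, c) is in c_node.
    """
    states = set(c_edge)
    transitions = {s: set() for s in states}
    for (a, b), (c, d) in product(states, repeat=2):
        if (b, c) in c_node:
            transitions[(a, b)].add((c, d))
    # A state (a, b) is starting if a may touch the first node of a path,
    # and accepting if b may touch the last node of a path.
    starts = {(a, b) for (a, b) in states if a in c_head}
    accepts = {(a, b) for (a, b) in states if b in c_tail}
    return states, transitions, starts, accepts

# Assumed encoding of maximal matching on cycles (no head or tail constraint).
C_EDGE = {("M", "M"), ("1", "1"), ("1", "0"), ("0", "1")}
C_NODE = {("M", "1"), ("1", "M"), ("0", "0")}
states, transitions, starts, accepts = build_automaton(C_EDGE, C_NODE)
for state in sorted(transitions):
    print(state, "->", sorted(transitions[state]))
```

On cycles both the head and the tail constraint are empty, so the automaton has no starting or accepting states; only its transition structure matters.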

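We define the following concepts.

Definition (generating paths and cycles). Automaton $\mathcal{M}$ can generate the cycle $(x_1, x_2, \dotsc, x_m)$ if each $x_i$ is a state of $\mathcal{M}$, there is a state transition from $x_i$ to $x_{i+1}$ for each $1 \le i < m$, and there is a state transition from $x_m$ to $x_1$.

Automaton $\mathcal{M}$ can generate the path $(x_1, x_2, \dotsc, x_m)$ if each $x_i$ is a state of $\mathcal{M}$, $x_1$ is a starting state, $x_m$ is an accepting state, and there is a state transition from $x_i$ to $x_{i+1}$ for each $1 \le i < m$. Note that $\mathcal{M}$ can generate cycles even if there are no starting states or accepting states.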

Consider the state machines in Figure 1. The state machine for consistent orientation can generate the following cycles:

 (HT), (TH), (HT,HT), (TH,TH), (HT,HT,HT), (TH,TH,TH), …

The state machine for maximal matching can generate the following cycles:

 (11,MM), (MM,11), (10,01,MM), (01,MM,10), (MM,10,01), (11,MM,11,MM), (MM,11,MM,11), …
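The two lists above can be reproduced by brute force. The sketch below is only an illustration: the helper name `generated_cycles` and the string encoding of states are assumptions, and the transition tables are written out by hand from the description of the two automata.

```python
from itertools import product

def generated_cycles(transitions, length):
    """All state sequences (x_1, ..., x_m) with m = length that form a cycle,
    i.e. x_i -> x_{i+1} for every i and x_m -> x_1."""
    states = sorted(transitions)
    found = []
    for seq in product(states, repeat=length):
        closed = all(
            seq[(i + 1) % length] in transitions[seq[i]] for i in range(length)
        )
        if closed:
            found.append(seq)
    return found

# Assumed transition tables: each state is written as a two-letter string.
orientation = {"HT": {"HT"}, "TH": {"TH"}}
matching = {"11": {"MM"}, "MM": {"11", "10"}, "10": {"01"}, "01": {"MM"}}

for m in (1, 2, 3):
    print(m, generated_cycles(orientation, m), generated_cycles(matching, m))
```

The enumeration is exponential in the cycle length and is meant only for checking small examples by hand.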

If we start with a symmetric problem, the automaton will be mirror-symmetric in the following sense: there is a state transition $ab \to cd$ if and only if there is a state transition $dc \to ba$, and the automaton can generate $(x_1, x_2, \dotsc, x_m)$ if and only if it can generate $(\bar{x}_m, \dotsc, \bar{x}_2, \bar{x}_1)$, where $\bar{x}$ denotes the mirror of state $x$, that is, $\overline{ab} = ba$. All automata in Figure 1 have this property, while in Figure 2 only automata (a) and (d) are mirror-symmetric.
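Mirror-symmetry of a given transition table is easy to test mechanically. The following sketch assumes that each state is a two-character string, so that mirroring a state is string reversal; the function names are hypothetical.

```python
def mirror(state):
    """Swap the two port labels of a state, e.g. 'HT' -> 'TH'."""
    return state[::-1]

def is_mirror_symmetric(transitions):
    """Check that whenever s -> t is a transition, so is mirror(t) -> mirror(s).

    Applying the same check to the mirrored transition gives the converse,
    so testing one direction over all transitions is enough.
    """
    return all(
        mirror(s) in transitions.get(mirror(t), set())
        for s, targets in transitions.items()
        for t in targets
    )

orientation = {"HT": {"HT"}, "TH": {"TH"}}
matching = {"11": {"MM"}, "MM": {"11", "10"}, "10": {"01"}, "01": {"MM"}}
one_way = {"HT": {"HT"}}  # orientation in one fixed direction only

print(is_mirror_symmetric(orientation), is_mirror_symmetric(matching),
      is_mirror_symmetric(one_way))
```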

#### Automata capture node-edge-checkable problems.

These observations follow directly from the definitions:

• Let $\Pi$ be a symmetric or asymmetric problem. Automaton $\mathcal{M}$ can generate a cycle $(x_1, x_2, \dotsc, x_m)$ if and only if the following is a feasible solution for problem $\Pi$: Take a directed cycle with $m$ nodes and $m$ edges and walk along the cycle in the positive direction, starting at an arbitrary edge. Label the ports of the first edge with $x_1$, the ports of the second edge with $x_2$, etc.

• Let $\Pi$ be a symmetric problem. Automaton $\mathcal{M}$ can generate a cycle $(x_1, x_2, \dotsc, x_m)$ if and only if the following is a feasible solution for problem $\Pi$: Take an undirected cycle with $m$ nodes and $m$ edges and walk the cycle in some consistent direction, starting at an arbitrary edge. Label the ports of the first edge with $x_1$, the ports of the second edge with $x_2$, etc.

• Let $\Pi$ be a symmetric or asymmetric problem. Automaton $\mathcal{M}$ can generate a path $(x_1, x_2, \dotsc, x_m)$ if and only if the following is a feasible solution for problem $\Pi$: Take a directed path with $m + 1$ nodes and $m$ edges and walk along the path in the positive direction, starting with the first edge. Label the ports of the first edge with $x_1$, the ports of the second edge with $x_2$, etc.

• Let $\Pi$ be a symmetric problem. Automaton $\mathcal{M}$ can generate a path $(x_1, x_2, \dotsc, x_m)$ if and only if the following is a feasible solution for problem $\Pi$: Take an undirected path with $m + 1$ nodes and $m$ edges and walk along the path in some consistent direction, starting with the first edge. Label the ports of the first edge with $x_1$, the ports of the second edge with $x_2$, etc.

Hence, for example, the question of whether a given problem $\Pi$ is solvable in a path with $m$ edges is equivalent to the question of whether $\mathcal{M}$ accepts the unary string of length $m - 1$, since a path $(x_1, \dotsc, x_m)$ uses $m - 1$ state transitions. Similarly, the question of whether $\Pi$ is solvable in a cycle with $m$ edges is equivalent to the question of whether there is a state $q$ such that $\mathcal{M}$ can return to state $q$ after processing the unary string of length $m$.

However, the key question is what can be said about the complexity of solving $\Pi$ in a distributed setting. As we will see, this is also captured in the structural properties of $\mathcal{M}$.
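As a small illustration of this correspondence, the sketch below lists which cycle and path sizes are solvable up to a given bound by simulating the automaton on unary input. The function names and the two-state encoding of vertex 2-coloring (states `12` and `21`) are assumptions made for this example.

```python
def solvable_cycle_lengths(transitions, max_len):
    """Numbers of edges m <= max_len for which some state can return to
    itself in exactly m transitions."""
    result = []
    # reach[q] = set of states reachable from q in exactly m steps
    reach = {q: {q} for q in transitions}
    for m in range(1, max_len + 1):
        reach = {
            q: {t for s in reach[q] for t in transitions[s]} for q in transitions
        }
        if any(q in reach[q] for q in transitions):
            result.append(m)
    return result

def solvable_path_lengths(transitions, starts, accepts, max_edges):
    """Numbers of edges m <= max_edges for which some walk with m states
    (m - 1 transitions) leads from a starting to an accepting state."""
    result = []
    current = set(starts)
    for m in range(1, max_edges + 1):
        if current & accepts:
            result.append(m)
        current = {t for s in current for t in transitions[s]}
    return result

# Assumed encoding of vertex 2-coloring: an edge is labeled 12 or 21.
two_col = {"12": {"21"}, "21": {"12"}}
print(solvable_cycle_lengths(two_col, 6))
print(solvable_path_lengths(two_col, {"12", "21"}, {"12", "21"}, 5))
print(solvable_path_lengths(two_col, {"12"}, {"21"}, 6))
```

The first call reports only even cycles, the second call reports every path, and the third call models the variant in which both endpoints must have color 1.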

## 3 Classification of all LCL problems on cycles

### 3.1 Types of states

Consider a problem $\Pi$ with automaton $\mathcal{M}$. We introduce the following definitions; see Figure 3 for examples:

• Repeatable state: State $q$ is repeatable if there is a walk $q \leadsto q$ of positive length in $\mathcal{M}$.

• Flexible state: State $q$ is flexible with flexibility $K$ if for all $k \ge K$ there is a walk $q \leadsto q$ of length exactly $k$ in $\mathcal{M}$.

• Loop: State $q$ is a loop if there is a state transition $q \to q$ in $\mathcal{M}$.

For a symmetric problem $\Pi$ we also define the following, where $\bar{q}$ denotes the mirror of state $q$:

• Mirror-flexible state: State $q$ is mirror-flexible with flexibility $K$ if for all $k \ge K$ there are walks $q \leadsto q$, $q \leadsto \bar{q}$, $\bar{q} \leadsto q$, and $\bar{q} \leadsto \bar{q}$ of length exactly $k$ in $\mathcal{M}$.

• Mirror-flexible loop: State $q$ is a mirror-flexible loop with flexibility $K$ if $q$ is a mirror-flexible state with flexibility $K$ and $q$ is also a loop.

Note that if $q$ is a mirror-flexible loop, then so is $\bar{q}$, as the problem is symmetric.

Figure 3: Examples of LCL problems with repeatable, flexible, and mirror-flexible states. Labels A–K refer to the problem types in Table 1. Here is a brief description of each sample problem: A: orient the edges so that each consistently oriented fragment consists of at least two edges, one with the label pair 12 and at least one with the label pair 34. B: either find a consistent orientation (encoded with labels 1–2) or find a proper 3-coloring of the edges (encoded with labels 3–5). C: consistent orientation. D: orientation in the positive direction. E: edge 3-coloring. F: consistent orientation together with an edge 3-coloring. G: orientation in the positive direction together with an edge 3-coloring. H: edge 2-coloring. I: orientation in the positive direction together with an edge 2-coloring. J–K: problems only solvable on paths of length at most 2 (assuming appropriate starting and accepting states).
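The first three properties can be tested with standard graph techniques. The sketch below is not taken from the paper's appendix; it uses the textbook fact that, inside a strongly connected component, closed walks of every sufficiently large length exist exactly when the greatest common divisor of the closed-walk lengths (the period) is 1. The function names are hypothetical.

```python
from collections import deque
from math import gcd

def reachable(transitions, source):
    """States reachable from source by a walk of length at least 1."""
    seen, queue = set(), deque(transitions[source])
    while queue:
        s = queue.popleft()
        if s not in seen:
            seen.add(s)
            queue.extend(transitions[s])
    return seen

def classify(transitions, q):
    """Return which of 'repeatable', 'loop', 'flexible' hold for state q."""
    props = set()
    forward = reachable(transitions, q)
    if q not in forward:
        return props  # no closed walk through q
    props.add("repeatable")
    if q in transitions[q]:
        props.add("loop")
    # Strongly connected component of q: states on some closed walk through q.
    component = {s for s in forward if q in reachable(transitions, s)}
    # Breadth-first distances from q inside the component.
    dist = {q: 0}
    queue = deque([q])
    while queue:
        u = queue.popleft()
        for v in transitions[u]:
            if v in component and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # The period is the gcd of dist[u] + 1 - dist[v] over all edges u -> v.
    period = 0
    for u in component:
        for v in transitions[u]:
            if v in component:
                period = gcd(period, dist[u] + 1 - dist[v])
    if period == 1:
        props.add("flexible")
    return props

matching = {"11": {"MM"}, "MM": {"11", "10"}, "10": {"01"}, "01": {"MM"}}
two_col = {"12": {"21"}, "21": {"12"}}
print(classify(matching, "MM"), classify(two_col, "12"))
```

In the maximal matching automaton the state `MM` lies on closed walks of lengths 2 and 3, so it is flexible; in the 2-coloring automaton every closed walk has even length, so no state is flexible.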

### 3.2 Flexibility and synchronizing words

Flexibility is a key concept that we will use in our characterization of problems. We will now connect it to the automata-theoretic concept of synchronizing words.

First, let us make a simple observation that allows us to study automata by their strongly connected components: Let $C$ be a strongly connected component of automaton $\mathcal{M}$, and let $q$ be a state in $C$. Then $q$ is flexible in $C$ if and only if $q$ is flexible in $\mathcal{M}$.

###### Proof.

A walk from $q$ back to $q$ in $\mathcal{M}$ cannot leave $C$. ∎

Recall that a word $w$ is called a D3-directing word for NFA $\mathcal{M}$ if there is a state $t$ such that, starting with any state $s$ of $\mathcal{M}$, there is a sequence of state transitions that takes $\mathcal{M}$ to state $t$ when it processes $w$. We show that this specific notion of a nondeterministic synchronizing word is, in essence, equivalent to the concept of flexibility: Consider a strongly connected component $C$ of some automaton $\mathcal{M}$. The following statements are equivalent:

1. There is a flexible state in $C$.

2. All states of $C$ are flexible.

3. There is a D3-directing word for $C$.

###### Proof.

(1)$\Rightarrow$(2): Assume that state $q$ has flexibility $K$. Let $x$ be another state in $C$. As it is in the same strongly connected component, there is some $r$ such that we can walk from $x$ to $q$ and back in $r$ steps. Therefore for any $k \ge K$ we can walk from $x$ back to $x$ in $k + r$ steps by following the route $x \leadsto q \leadsto q \leadsto x$. Hence $x$ is a flexible state with flexibility at most $K + r$.

(2)$\Rightarrow$(3): Assume that state $q$ has flexibility $K$, and there is a walk of length at most $d$ from any state $x$ to state $q$. Then we can walk from any state $x$ to $q$ in exactly $K + d$ steps: first in $d' \le d$ steps we can reach $q$ and then in $K + d - d' \ge K$ steps we can walk from $q$ back to itself. Hence the unary word of length $K + d$ is a D3-directing word for automaton $C$ that takes it from any state to state $q$.

(3)$\Rightarrow$(1): Assume that there is some D3-directing word that can take one from any state of $C$ to state $q$ in exactly $K$ steps. Then we can also walk from $q$ to itself in $k$ steps for any $k \ge K$: first take $k - K$ steps arbitrarily inside $C$, and then walk back to $q$ in exactly $K$ steps. ∎
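To make the notion of a D3-directing word concrete, the following sketch searches for the shortest unary word that can take every state to a common state. It is a plain bounded search with a caller-supplied limit `max_len` (an assumption of this sketch), not the polynomial-time procedure referred to in the text.

```python
def directing_word_length(transitions, max_len):
    """Return (k, targets) for the smallest k <= max_len such that every state
    can reach each state in `targets` in exactly k steps, or None."""
    # reach[q] = set of states reachable from q in exactly k steps
    reach = {q: {q} for q in transitions}
    for k in range(max_len + 1):
        common = set.intersection(*reach.values())
        if common:
            return k, common
        reach = {
            q: {t for s in reach[q] for t in transitions[s]} for q in transitions
        }
    return None

matching = {"11": {"MM"}, "MM": {"11", "10"}, "10": {"01"}, "01": {"MM"}}
two_col = {"12": {"21"}, "21": {"12"}}
print(directing_word_length(matching, 20))
print(directing_word_length(two_col, 20))
```

For the maximal matching automaton the search stops at length 4 with target state `MM`, in line with the equivalence above, while the 2-coloring automaton, which has no flexible state, yields no directing word within the bound.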

Hence, in what follows, we can freely use any of the above perspectives when reasoning about the distributed complexity of $\mathsf{LCL}$ problems. Mirror-flexibility can then be seen as a mirror-symmetric extension of D3-directing words.

There is also a natural connection between flexibility and Markov chains. Automaton $\mathcal{M}$ over the unary alphabet can be viewed as the diagram of a Markov chain for unknown probabilities of the transitions. If we assume that every edge will have a non-zero probability, then a strongly connected component of the automaton is an irreducible Markov chain, and in such a component the notion of flexibility coincides with the notion of aperiodicity.

### 3.3 Results

Our main result is summarized in Table 1; see Figure 3 for examples. What was already well-known by prior work [11, 1] is that there are only three possible complexities: $O(1)$, $\Theta(\log^* n)$, and $\Theta(n)$. However, our work gives for the first time a concise classification of exactly which problems belong to which complexity class. In Appendix B we show that our classification is correct and complete.

The entire classification can be computed efficiently. In particular, all of the following properties can be decided in polynomial time in the size of the automaton: repeatable states, flexible states, loops, mirror-flexible states and mirror-flexible loops. The non-trivial cases here are flexibility and mirror-flexibility; we present the proofs in Appendix A.

#### The role of mirror-flexibility.

Consider the following problem that we call distance-$k$ anchoring; here the selected edges are called anchors: A distance-$k$ anchoring is a maximal subset of edges that splits the cycle in fragments of length at least $k$. This problem can be solved in $O(\log^* n)$ rounds (e.g. by applying maximal independent set algorithms in the $k$th power of the line graph of the input graph). Now consider an $\mathsf{LCL}$ problem $\Pi$ that has a flexible state $q$ with flexibility $k$. It is known by prior work that we can now solve $\Pi$ on directed cycles in $O(\log^* n)$ rounds, as follows: Solve distance-$k$ anchoring and label the anchor edges with the label pair of state $q$. As state $q$ is flexible, we can walk along the cycle from one anchor to another, and find a way to fill in the fragment between two anchors with a feasible label sequence.

Mirror-flexibility plays a similar role for undirected cycles: the key difference is that the anchor edges cannot be consistently oriented, and hence we need to be able to also fill a gap between state $q$ and its mirror $\bar{q}$, in any order. It is easy to see that mirror-flexibility then implies $O(\log^* n)$-round solvability. What is more surprising is that the converse also holds: $O(\log^* n)$-round solvability necessarily implies the existence of a mirror-flexible state.
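The step of filling in a fragment between two anchors amounts to finding a closed walk of a prescribed length through the flexible state. A minimal centralized sketch of that step is given below; the function name `closed_walk` and the dynamic-programming formulation are assumptions of this example, and the distributed anchoring itself is not modeled.

```python
def closed_walk(transitions, q, length):
    """Return a walk [q, ..., q] with exactly `length` transitions, or None."""
    # can_finish[j] = states from which q is reachable in exactly j steps
    can_finish = [{q}]
    for _ in range(length):
        prev = can_finish[-1]
        can_finish.append({s for s in transitions if transitions[s] & prev})
    if q not in can_finish[length]:
        return None
    walk = [q]
    for remaining in range(length, 0, -1):
        # Pick any successor from which q is still reachable in time.
        nxt = next(
            t for t in sorted(transitions[walk[-1]])
            if t in can_finish[remaining - 1]
        )
        walk.append(nxt)
    return walk

matching = {"11": {"MM"}, "MM": {"11", "10"}, "10": {"01"}, "01": {"MM"}}
for gap in (2, 3, 7):
    print(gap, closed_walk(matching, "MM", gap))
```

Because `MM` is flexible with flexibility 2 in this automaton, every gap of at least 2 transitions between two anchors labeled `MM` can be filled.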

#### A new canonical problem for constant-time solvability.

One of the new conceptual contributions of this work is related to the following problem, which we call distance-$k$ orientation: A distance-$k$ orientation is an orientation in which each consistently oriented fragment has length at least $k$. The problem is trivial to solve in directed cycles in $0$ rounds, but the case of undirected cycles is not equally simple. However, with some thought, one can see that the problem can be solved in $O(1)$ rounds also on undirected cycles. This shows that there are infinite families of nontrivial $O(1)$-time solvable problems, and hence it seems at first challenging to concisely and efficiently characterize all such problems. However, as we will see in Appendix B, distance-$k$ orientation can be seen as the canonical $O(1)$-time solvable problem on undirected cycles. We show that any problem $\Pi$ that is $O(1)$-time solvable on undirected cycles has to be of type A, and any such problem can be solved in two steps: first find a distance-$k$ orientation for some constant $k$ that only depends on the structure of $\mathcal{M}$, and then map the distance-$k$ orientation to a feasible solution of $\Pi$.

We can summarize the key new observations related to undirected cycles as follows:

• $O(1)$ rounds $\iff$ mirror-flexible loop $\iff$ solvable with distance-$k$ orientation

• $O(\log^* n)$ rounds $\iff$ mirror-flexible state $\iff$ solvable with distance-$k$ anchoring

## 4 Classification of all LCL problems on paths

#### What is similar: distributed complexity.

Broadly speaking, efficient distributed solvability on paths is not that different from efficient solvability on cycles (see Table 1). Consider an $\mathsf{LCL}$ problem $\Pi$ and the state machine $\mathcal{M}$. Without loss of generality, we can remove all states that are not reachable from a starting state, and all states from which there is no path to an accepting state, as such states can never appear in any feasible labeling of a path. The removal of irrelevant states can be done in polynomial time, and hence throughout this work we assume that such states have already been eliminated and, to avoid trivialities, the resulting automaton is nonempty.
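The removal of irrelevant states is a pair of graph searches, one forward from the starting states and one backward from the accepting states. A minimal sketch follows; the function names and the toy automaton are made up for illustration.

```python
def closure(edges, sources):
    """All states reachable from `sources` (including them) along `edges`."""
    seen, stack = set(sources), list(sources)
    while stack:
        s = stack.pop()
        for t in edges[s]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def prune(transitions, starts, accepts):
    """Keep only states that lie on some walk from a starting state to an
    accepting state."""
    reverse = {s: set() for s in transitions}
    for s, targets in transitions.items():
        for t in targets:
            reverse[t].add(s)
    useful = closure(transitions, starts) & closure(reverse, accepts)
    pruned = {s: transitions[s] & useful for s in useful}
    return pruned, starts & useful, accepts & useful

# Toy automaton: 'd' is unreachable from the start, 'e' cannot reach 'c'.
toy = {"a": {"b"}, "b": {"b", "c", "e"}, "c": set(), "d": {"b"}, "e": set()}
print(prune(toy, {"a"}, {"c"}))
```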

Now consider, for example, the case of directed paths. If there is a loop $q$ in $\mathcal{M}$, we can solve $\Pi$ in constant time. By assumption $q$ can be reached from some starting state $s$ and we can reach some accepting state $t$ from $q$. Hence near the endpoints of a path we can label according to the walks $s \leadsto q$ and $q \leadsto t$, and fill in everything in between with $q$; the round complexity is simply the maximum of the lengths of the (shortest) walks $s \leadsto q$ and $q \leadsto t$. Similarly, if $q$ is not a loop but a flexible state with flexibility $k$, we can find a distance-$k$ anchoring for the internal part of the path, use $q$ at the anchor points, and fill the gaps just like in the case of a cycle. The case of undirected paths and mirror-flexibility is analogous.
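The labeling produced in the loop case can be written down explicitly. The sketch below is a centralized illustration only: it assumes that the given state is a loop in an already pruned automaton, and the function names and the three-state example are hypothetical. In the distributed setting each node would derive the same labels from its distance to the nearest endpoint.

```python
from collections import deque

def shortest_walk(transitions, sources, targets):
    """Shortest state sequence from some source to some target, or None."""
    parent = {s: None for s in sources}
    queue = deque(sources)
    while queue:
        s = queue.popleft()
        if s in targets:
            walk = []
            while s is not None:
                walk.append(s)
                s = parent[s]
            return walk[::-1]
        for t in transitions[s]:
            if t not in parent:
                parent[t] = s
                queue.append(t)
    return None

def label_path(transitions, starts, accepts, loop, num_edges):
    """Label a directed path with num_edges edges using the loop state.

    Returns one state per edge, or None if the path is too short for this
    construction (short paths need a separate case analysis).
    """
    prefix = shortest_walk(transitions, starts, {loop})   # ends with loop
    suffix = shortest_walk(transitions, {loop}, accepts)  # starts with loop
    fill = num_edges - (len(prefix) - 1) - (len(suffix) - 1)
    if fill < 1:
        return None
    return prefix[:-1] + [loop] * fill + suffix[1:]

example = {"s": {"q"}, "q": {"q", "t"}, "t": set()}
print(label_path(example, {"s"}, {"t"}, "q", 5))
```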

Furthermore, negative results on cycles imply negative results on paths. To see this, consider a hypothetical algorithm $\mathcal{A}$ that solves $\Pi$ efficiently in directed paths. Then we could also apply $\mathcal{A}$ to each local neighborhood of a long directed cycle, and hence $\mathcal{A}$ would also solve $\Pi$ efficiently in directed cycles. If $\Pi$ cannot be solved in $T(n) = o(n)$ rounds in directed cycles, it cannot be solved in $T(n)$ rounds in directed paths, either. The same holds for the undirected case. Hence the classification of distributed complexities in Table 1 generalizes to paths almost verbatim.

#### What is new: solvability.

In directed cycles, global problems (i.e., problems of round complexity $\Theta(n)$, types H and I) came in only one possible flavor: there are infinitely many solvable instances and infinitely many unsolvable instances. A simple example is the problem of finding a proper $2$-coloring: even cycles are solvable and odd cycles are unsolvable. Our classification for cycles implies that it is not possible to have an $\mathsf{LCL}$ problem of complexity $\Theta(n)$ in directed cycles that is always solvable.

This is clearly different in directed paths. As a simple example, $2$-coloring a path is a global problem on directed paths that is always solvable. Figure 2 shows both examples of $\mathsf{LCL}$s that are solvable in all paths (e.g. $2$-coloring), and examples of $\mathsf{LCL}$s that are solvable in infinitely many paths and unsolvable in infinitely many paths (e.g. $2$-coloring in which all endpoints must have color $1$). It is also easy to construct problems that are solvable in all but finitely many instances and problems that are solvable only in finitely many instances. However, can we efficiently tell the difference between these cases if we are given a description of an $\mathsf{LCL}$ problem?

This is a question in which the automata-theoretic perspective gives direct answers. In essence, the question is rephrased as follows: for which values of $n$ does a nondeterministic finite automaton $\mathcal{M}$ accept the unary string of length $n$? Whether $\mathcal{M}$ accepts all such strings is the classical universality problem for unary languages. Prior work directly implies the following:

• Zero vs. finitely many unsolvable instances: Consider the following decision problem: given an automaton $\mathcal{M}$, answer “yes” if $\mathcal{M}$ accepts all strings, “no” if $\mathcal{M}$ rejects at least one but only finitely many strings, and answer “yes” or “no” otherwise. This problem can be solved in polynomial time, as a consequence of Chrobak’s theorem [30, 12].

• Zero vs. infinitely many unsolvable instances: Consider the following decision problem: given an automaton $\mathcal{M}$, answer “yes” if $\mathcal{M}$ accepts all strings, “no” if $\mathcal{M}$ rejects infinitely many strings, and answer “yes” or “no” otherwise. This is a well-known co-NP-complete problem.

We give the details in Appendix C.
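To make the rephrased question concrete, here is a minimal Python sketch that lists which string lengths a unary NFA accepts, up to a chosen bound. The dictionary encoding of transitions and the name `accepted_lengths` are assumptions of this example; it uses the standard NFA acceptance condition, which is a simplification of the half-edge formalism used in the paper.

```python
def accepted_lengths(transitions, start_states, accepting_states, max_len):
    """Return all n <= max_len such that the unary NFA accepts the string of length n.

    transitions maps each state to an iterable of successor states
    (hypothetical encoding; there is a single input symbol).
    """
    accepting = set(accepting_states)
    current = set(start_states)  # states reachable after reading n symbols
    accepted = set()
    for n in range(max_len + 1):
        if current & accepting:
            accepted.add(n)
        current = {t for s in current for t in transitions.get(s, ())}
    return accepted


# Toy automaton: two alternating colors, starting in color 1.
alternate = {1: [2], 2: [1]}
fixed_end = accepted_lengths(alternate, {1}, {1}, 10)    # must also end in color 1
free_end = accepted_lengths(alternate, {1}, {1, 2}, 10)  # any final color is fine
print(sorted(fixed_end), sorted(free_end))
```

In the toy automaton, prescribing the final color leaves infinitely many accepted and infinitely many rejected lengths, while allowing either final color makes every length accepted.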

#### Discussion.

We have seen that questions about the solvability of LCLs in paths are, unsurprisingly, related to classical automata-theoretic questions, as we can directly interpret a path as a string. Our work on LCLs in cycles can then be seen as an extension of classical questions to cyclic words. In particular, we see that an automaton “accepts” all but finitely many cyclic words if and only if there is a flexible state in the automaton, or equivalently if a D3-directing word exists for the automaton. Our work shows that all such questions on cyclic words can be decided in polynomial time, even if their classical non-cyclic analogs are in some cases co-NP-complete.

## References

•  Alkida Balliu, Sebastian Brandt, Yi-Jun Chang, Dennis Olivetti, Mikaël Rabie, and Jukka Suomela. The Distributed Complexity of Locally Checkable Problems on Paths is Decidable. In Proc. 38th ACM Symposium on Principles of Distributed Computing (PODC 2019), pages 262–271. ACM Press, 2019.
•  Alkida Balliu, Sebastian Brandt, Yuval Efron, Juho Hirvonen, Yannic Maus, Dennis Olivetti, and Jukka Suomela. Classification of distributed binary labeling problems, 2019.
•  Alkida Balliu, Sebastian Brandt, Juho Hirvonen, Dennis Olivetti, Mikaël Rabie, and Jukka Suomela. Lower bounds for maximal matchings and maximal independent sets. In Proc. 60th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2019), pages 481–497. IEEE, 2019.
•  Alkida Balliu, Sebastian Brandt, Dennis Olivetti, and Jukka Suomela. Almost global problems in the LOCAL model. In Proc. 32nd International Symposium on Distributed Computing (DISC 2018), Leibniz International Proceedings in Informatics (LIPIcs), pages 9:1–9:16. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2018.
•  Alkida Balliu, Sebastian Brandt, Dennis Olivetti, and Jukka Suomela. How much does randomness help with locally checkable problems?, 2019.
•  Alkida Balliu, Juho Hirvonen, Janne H Korhonen, Tuomo Lempiäinen, Dennis Olivetti, and Jukka Suomela. New classes of distributed time complexity. In Proc. 50th ACM Symposium on Theory of Computing (STOC 2018), pages 1307–1318. ACM Press, 2018.
•  Sebastian Brandt, Orr Fischer, Juho Hirvonen, Barbara Keller, Tuomo Lempiäinen, Joel Rybicki, Jukka Suomela, and Jara Uitto. A lower bound for the distributed Lovász local lemma. In Proc. 48th ACM Symposium on Theory of Computing (STOC 2016), pages 479–488. ACM Press, 2016.
•  Sebastian Brandt, Juho Hirvonen, Janne H Korhonen, Tuomo Lempiäinen, Patric R J Östergård, Christopher Purcell, Joel Rybicki, Jukka Suomela, and Przemysław Uznański. LCL problems on grids. In Proc. 36th ACM Symposium on Principles of Distributed Computing (PODC 2017), pages 101–110. ACM Press, 2017.
•  Ján Černý. Poznámka k homogénnym experimentom s konečnými automatmi. Matematicko-fyzikálny časopis, 14(3):208–216, 1964.
•  Yi-Jun Chang, Tsvi Kopelowitz, and Seth Pettie. An Exponential Separation between Randomized and Deterministic Complexity in the LOCAL Model. In Proc. 57th IEEE Symposium on Foundations of Computer Science (FOCS 2016), pages 615–624. IEEE, 2016.
•  Yi-Jun Chang and Seth Pettie. A Time Hierarchy Theorem for the LOCAL Model. SIAM Journal on Computing, 48(1):33–69, 2019.
•  Marek Chrobak. Finite automata and unary languages. Theoretical Computer Science, 47:149–158, 1986.
•  Richard Cole and Uzi Vishkin. Deterministic coin tossing with applications to optimal parallel list ranking. Information and Control, 70(1):32–53, 1986.
•  Henk Don and Hans Zantema. Synchronizing non-deterministic finite automata. Journal of Automata, Languages and Combinatorics, 23(4):307–328, 2018.
•  David Eppstein. Reset Sequences for Monotonic Automata. SIAM Journal on Computing, 19(3):500–510, 1990.
•  Manuela Fischer and Mohsen Ghaffari. Sublogarithmic Distributed Algorithms for Lovász Local Lemma, and the Complexity Hierarchy. In Proc. 31st International Symposium on Distributed Computing (DISC 2017), pages 18:1–18:16, 2017.
•  Zsolt Gazdag, Szabolcs Iván, and Judit Nagy-György. Improved upper bounds on synchronizing nondeterministic automata. Information Processing Letters, 109(17):986–990, 2009.
•  Mohsen Ghaffari, David G Harris, and Fabian Kuhn. On Derandomizing Local Distributed Algorithms. In Proc. 59th IEEE Symposium on Foundations of Computer Science (FOCS 2018), pages 662–673, 2018.
•  Mohsen Ghaffari and Hsin-Hao Su. Distributed Degree Splitting, Edge Coloring, and Orientations. In Proc. 28th ACM-SIAM Symposium on Discrete Algorithms (SODA 2017), pages 2505–2523. Society for Industrial and Applied Mathematics, 2017.
•  Markus Holzer and Martin Kutrib. Descriptional and computational complexity of finite automata—A survey. Information and Computation, 209(3):456–470, 2011.
•  B. Imreh and M. Steinby. Directable nondeterministic automata. Acta Cybernetica, 14(1):105–115, 1999.
•  Balázs Imreh and Masami Ito. On regular languages determined by nondeterministic directable automata. Acta Cybernetica, 17(1):1–10, 2005.
•  Nathan Linial. Locality in Distributed Graph Algorithms. SIAM Journal on Computing, 21(1):193–201, 1992.
•  Moni Naor and Larry Stockmeyer. What Can be Computed Locally? SIAM Journal on Computing, 24(6):1259–1277, 1995.
•  David Peleg. Distributed Computing: A Locality-Sensitive Approach. Society for Industrial and Applied Mathematics, 2000.
•  J. L. Ramírez-Alfonsín. Complexity of the Frobenius problem. Combinatorica, 16(1):143–147, 1996.
•  Václav Rozhoň and Mohsen Ghaffari. Polylogarithmic-Time Deterministic Network Decomposition and Distributed Derandomization. In Proc. 52nd Annual ACM Symposium on Theory of Computing (STOC 2020), 2020.
•  Jeffrey Shallit. The Frobenius Problem and Its Generalizations. In Proc. 12th International Conference on Developments in Language Theory (DLT 2008), volume 5257 of LNCS, pages 72–83, Berlin, Heidelberg, 2008. Springer.
•  L. J. Stockmeyer and A. R. Meyer. Word problems requiring exponential time. In Proc. 5th Annual ACM Symposium on Theory of Computing (STOC 1973), pages 1–9, New York, New York, USA, 1973. ACM Press.
•  Anthony Widjaja To. Unary finite automata vs. arithmetic progressions. Information Processing Letters, 109(17):1010–1014, 2009.

## Appendix A Efficient computation of the classification of LCL problems

In view of Table 1, the task of determining to which class a given LCL problem $\Pi$ belongs can be reduced to testing certain graph properties of the automaton $\mathcal{M}$. In this section, we show that checking whether a state is flexible or mirror-flexible can be done in polynomial time, and so the optimal distributed complexity of an LCL problem can also be decided in polynomial time.

Let $Q$ be the set of states of $\mathcal{M}$. For each $q \in Q$ we define:

• $L_q$ is the set of values $\ell$ such that there is a walk $q \leadsto q$ of length $\ell$ in $\mathcal{M}$.

• is the restriction of to walks of length at most .

For any automaton and for any state , we have .

###### Proof.

We show that for each , we can find such that and for some integers . By applying this argument recursively to each , we can eventually write any as a linear combination of sufficiently small numbers . Hence if all values in are multiples of some , all values in have to be also multiples of .

Therefore it suffices to show that for each walk of the form of length , it is possible to find shorter returning walks of the form of lengths such that for some integers .

We write , where

. Since this vector has

elements, by the pigeonhole principle, there exists a state that appears at least three times. Therefore, can be decomposed into four walks: , , , and , where and . We write to denote the length of .

Now define , , and ; the lengths of these paths are , , and . Now the length of can be expressed as . Since and , the three lengths are all smaller than , as required. ∎
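The decomposition step in this proof can be checked on concrete walks with a small Python sketch. It assumes a walk is given as a list of states (so its length is the number of list entries minus one); the names `split_long_walk` and `length` are hypothetical. Whenever some state occurs three times, the walk contains two consecutive closed sub-walks at that state, and removing both, only the second, or only the first gives three shorter walks with the same endpoints.

```python
from collections import defaultdict


def length(walk):
    """Length of a walk given as a list of states (number of steps)."""
    return len(walk) - 1


def split_long_walk(walk):
    """Return three shorter walks (skip_both, keep_first, keep_second), or None.

    Uses the first state that occurs three times, at indices i < j < k.
    """
    positions = defaultdict(list)
    for index, state in enumerate(walk):
        positions[state].append(index)
        if len(positions[state]) == 3:
            i, j, k = positions[state]
            skip_both = walk[: i + 1] + walk[k + 1:]    # drop both closed sub-walks
            keep_first = walk[: j + 1] + walk[k + 1:]   # drop only the second one
            keep_second = walk[: i + 1] + walk[j + 1:]  # drop only the first one
            return skip_both, keep_first, keep_second
    return None


walk = ["q", "a", "r", "b", "r", "c", "r", "q"]
skip_both, keep_first, keep_second = split_long_walk(walk)
print(length(walk), length(keep_first), length(keep_second), length(skip_both))
```

In this example the original length is recovered as the sum of the two longer pieces minus the shortest one, which is the kind of integer combination the proof relies on.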

A state $q$ is flexible if and only if $\gcd(L_q) = 1$.

###### Proof.

If $\gcd(L_q) = x > 1$, then $kx + 1 \notin L_q$ and hence there is no walk $q \leadsto q$ of length $kx + 1$ for any $k$, and $q$ cannot be flexible.

For the other direction, given a set of positive integers with , the Frobenius number of the set is the largest number such that cannot be expressed as a linear combination of , where each coefficient is a non-negative integer. It is known that  .

By Lemma A, and . Hence implies that for all , it is possible to find a length- walk by combining some returning walks of length at most , and so is flexible. ∎
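For small inputs, the Frobenius number can be computed directly, which may help when experimenting with the flexibility of concrete automata. The following Python sketch is a simple dynamic program, not the bound used in the proof; the function name `frobenius_number` is an assumption. It stops once it has seen as many consecutive representable integers as the smallest input value, because from then on every integer can be reached by adding that smallest value.

```python
from functools import reduce
from math import gcd


def frobenius_number(values):
    """Largest integer that is not a non-negative integer combination of values.

    Returns -1 if every non-negative integer is representable.
    Requires positive integers with gcd 1.
    """
    values = sorted(set(values))
    if not values or values[0] <= 0:
        raise ValueError("values must be positive integers")
    if reduce(gcd, values) != 1:
        raise ValueError("gcd of the values must be 1")
    smallest = values[0]
    representable = [True]  # 0 is the empty combination
    run = 1                 # current run of consecutive representable integers
    largest_gap = -1
    n = 0
    while run < smallest:
        n += 1
        ok = any(v <= n and representable[n - v] for v in values)
        representable.append(ok)
        if ok:
            run += 1
        else:
            run = 0
            largest_gap = n
    return largest_gap


print(frobenius_number([3, 5]))
```

For two coprime values $a$ and $b$ the result agrees with the classical formula $ab - a - b$.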

We remark that the problem of calculating the Frobenius number is NP-hard when the input numbers are encoded in binary. Nevertheless, the flexibility of a given automaton can be found efficiently.

Testing whether a state is flexible and finding its flexibility number is solvable in polynomial time.

###### Proof.

By Lemma A, it is sufficient to test if $\gcd(L_q) = 1$, and by Lemma A, it suffices to find the restricted set of short returning walk lengths and compute its greatest common divisor, which can be done in polynomial time. ∎
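A minimal Python sketch of this test is given below. It assumes the automaton is encoded as a dictionary from each state to its successors, and it uses a default walk-length cutoff of twice the number of states; that cutoff is an assumption of this example (chosen to be generous), and the names `returning_walk_lengths` and `is_flexible` are hypothetical.

```python
from functools import reduce
from math import gcd


def returning_walk_lengths(transitions, q, max_len):
    """Lengths n in 1..max_len for which a walk of length n leads from q back to q."""
    lengths = set()
    frontier = {q}  # states reachable from q in exactly n steps
    for n in range(1, max_len + 1):
        frontier = {t for s in frontier for t in transitions.get(s, ())}
        if q in frontier:
            lengths.add(n)
    return lengths


def is_flexible(transitions, q, max_len=None):
    """Test whether the gcd of the short returning walk lengths of q equals 1."""
    states = set(transitions) | {t for ts in transitions.values() for t in ts}
    if max_len is None:
        max_len = 2 * len(states)  # assumed cutoff, see the lead-in
    lengths = returning_walk_lengths(transitions, q, max_len)
    return bool(lengths) and reduce(gcd, lengths) == 1


two_cycle = {"a": ["b"], "b": ["a"]}                        # only even returning walks
mixed = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}           # returning walks of length 2 and 3
print(is_flexible(two_cycle, "a"), is_flexible(mixed, "a"))
```

Both helper functions run in time polynomial in the number of states and transitions, since each step only updates a set of states.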

Testing whether a state is mirror-flexible and finding its mirror-flexibility number is solvable in polynomial time.

###### Proof.

Follows from Lemma A: a state $q$ is mirror-flexible if and only if $q$ is flexible, its mirror $\overline{q}$ is reachable from $q$, and $q$ is reachable from $\overline{q}$. Reachability between two states can be tested in polynomial time. ∎
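The reachability part of this criterion can be sketched in Python with a breadth-first search. The example assumes that a state is a pair of labels and that its mirror is the pair with the two entries swapped; this encoding and the names `reachable`, `mirror`, and `mirror_is_mutually_reachable` are assumptions of the sketch. The flexibility test itself is not repeated here.

```python
from collections import deque


def reachable(transitions, source, target):
    """Breadth-first search: is target reachable from source (in zero or more steps)?"""
    seen = {source}
    queue = deque([source])
    while queue:
        state = queue.popleft()
        if state == target:
            return True
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def mirror(state):
    """Mirror of a state encoded as a pair of labels (assumed encoding)."""
    first, second = state
    return (second, first)


def mirror_is_mutually_reachable(transitions, state):
    """Check that the mirror is reachable from the state and vice versa."""
    other = mirror(state)
    return reachable(transitions, state, other) and reachable(transitions, other, state)


swapping = {("a", "b"): [("b", "a")], ("b", "a"): [("a", "b")]}
one_way = {("a", "b"): [("a", "b")]}
print(mirror_is_mutually_reachable(swapping, ("a", "b")),
      mirror_is_mutually_reachable(one_way, ("a", "b")))
```

A state would then be reported as mirror-flexible only if this check succeeds and the state also passes a flexibility test.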

Given an LCL problem $\Pi$, its type can be computed in polynomial time.

###### Proof.

The non-trivial cases are captured in Lemmas A and A. ∎

## Appendix B Correctness of the classification of LCL problems on cycles

In this appendix, we show that the classification of LCL problems on cycles in Table 1 is correct and complete. To streamline the proofs we use the term cyclepath (similarly to circleline) to refer to a graph that is either a path or a cycle. We first prove the round complexity of each type and then the solvability. The connection between the proofs and the results they establish is depicted in Table 2.

### B.1 Round complexity lower bounds

In all proofs in this section, we need a technical assumption that $\mathcal{M}$ contains a repeatable state. This ensures that for every number $n_0$, we can find an $n$-node solvable instance for some $n \ge n_0$. This assumption is necessary: if $\mathcal{M}$ does not contain a repeatable state, then we can find a number $n_0$ such that for all $n \ge n_0$ the problem $\Pi$ has no solution on a cyclepath of $n$ nodes, and so the round complexity of $\Pi$ is trivially $O(1)$ in all solvable instances.

Let $\Pi$ be an LCL problem on directed cyclepaths. Suppose that the automaton $\mathcal{M}$ contains a repeatable state, but it does not contain a loop. Then the round complexity of $\Pi$ is $\Omega(\log^* n)$.

###### Proof.

We show how to turn any legal labeling of $\Pi$ into an edge $3$-coloring in a constant number of rounds. As $3$-coloring of edges requires $\Omega(\log^* n)$ rounds, so does $\Pi$.

Let $Q$ be the set of states of $\mathcal{M}$, and consider a valid solution of $\Pi$. Such a labeling can be easily turned into an edge coloring with a constant number of colors: an edge that was labeled with a given pair of labels in the solution is colored with that pair as its color. As there are no loops in $\mathcal{M}$, adjacent edges must have different label pairs and hence different colors. Finally, we can reduce the number of colors to $3$ in a constant number of rounds (w.r.t. $n$) with the trivial algorithm that eliminates colors one at a time. ∎

Figure 4: An illustration of the proof of Theorem B.1. Pairs (f(e),h(e)) form a proper edge coloring.
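The color-elimination step can be simulated centrally, as in the following Python sketch for a cycle of at least three edges. The list encoding (entry `i` is the color of edge `i`, adjacent to edges `i - 1` and `i + 1` modulo the length) and the names `reduce_to_three_colors` and `is_proper` are assumptions of this example. Each pass of the outer loop corresponds to one round in which all edges of the currently largest color recolor themselves; this is safe because edges of the same color are never adjacent in a proper coloring.

```python
def is_proper(colors):
    """Check that cyclically adjacent edges have different colors."""
    n = len(colors)
    return all(colors[i] != colors[(i + 1) % n] for i in range(n))


def reduce_to_three_colors(colors):
    """Reduce a proper edge coloring of a cycle to the colors 0, 1, 2.

    One outer iteration per eliminated color, so the number of rounds
    depends only on the initial number of colors, not on the cycle length.
    """
    colors = list(colors)
    n = len(colors)
    for k in range(max(colors), 2, -1):
        for i in range(n):
            if colors[i] == k:
                used = {colors[(i - 1) % n], colors[(i + 1) % n]}
                # Two neighbors block at most two of the three target colors.
                colors[i] = min(c for c in (0, 1, 2) if c not in used)
    return colors


initial = [0, 5, 3, 4, 1, 7, 2]
reduced = reduce_to_three_colors(initial)
print(reduced)
```

The sketch picks the smallest free color; any other free color in the target palette would work equally well.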

Let $\Pi$ be an LCL problem on undirected cyclepaths. Suppose that the automaton $\mathcal{M}$ contains a repeatable state, but it does not contain a loop. Then the round complexity of $\Pi$ is $\Omega(\log^* n)$.

###### Proof.

We use an idea similar to Theorem B.1, with one extra ingredient. Assume that is a feasible solution of . First construct a labeling of the edges with (at most) colors as follows: an edge that was labeled with the pair in will be colored with the color in (note that the colors are now unordered pairs).

Now such a labeling is not necessarily a proper coloring. There may be an arbitrarily long sequence of edges that have the same label , for some ; such a path is called monochromatic. However, this would arise only if contains a sequence of the form . Within such a path, we can find a partial labeling of the nodes as follows: nodes that have both ports labeled with are colored with , and nodes that have both ports labeled with are colored with ; all other nodes are left uncolored. See Figure 4 for an illustration.

Now we have two ingredients: a not-necessarily-proper edge coloring with colors, and a partial node coloring with colors. These complement each other: all internal nodes in monochromatic paths of are properly -colored in . Hence we can use to find a proper edge -coloring of each monochromatic path, e.g. as follows: Nodes of color are active and send proposals to adjacent nodes of color (proposals are sent in the order of unique identifiers), nodes of color accept the first proposal that they get (breaking ties with unique identifiers), and this way we can find a maximal matching within each monochromatic path. Each such matching forms one color class in ; we delete the edges that are colored and repeat. After three such iterations all internal edges of monochromatic paths are properly colored in ; then is easy to extend so that also the edges near the endpoints of monochromatic paths have colors different from their monochromatic neighbors (monochromatic paths of length two are also easy to -color). Now the pairs form a proper edge coloring with colors, and we can finally reduce the number of colors down to . ∎

For both of the following lemmas to be applicable also to the case of a path, we always assume that the “witness” of any specific behavior happens somewhere in the middle of a cyclepath and not next to the endpoints.

Let be an problem that is solvable in cyclepaths of length for infinitely many values of . Assume that solves in for all solvable instances, and assume that for arbitrarily large values of , we can find a cyclepath of length such that there are two edges and with the following properties:

• The distance between and , and the distance between each and the nearest degree- node (if any) is more than .

• Algorithm labels both and with the same state that is not flexible.

Then the round complexity of $\mathcal{A}$ has to be $\Omega(n)$.

###### Proof.

We give the proof for the case of a path; the case of a cycle is similar. To reach a contradiction, assume the complexity of is sublinear. Pick a sufficiently large such that the algorithm runs in rounds and paths of length are solvable. Decompose the path in fragments

$$G = (P_0, N_1, P_1, N_2, x, P_2),$$

where is the radius- neighborhood of , each is a path of nodes, and is one node. Now we can move one node to construct another path

$$G' = (P_0, N_1, P_1, x, N_2, P_2).$$

Path has the same length as , and hence is also a solvable instance and has to be able to find a feasible solution. As the radius- neighborhoods of and are the same in and , algorithm will label them with in both and . But as is not flexible, we can this way eventually construct an instance in which the distance between the two edges with label is such that does not have a walk of length from back to itself, and hence cannot produce a valid solution. ∎

Let be a symmetric problem that is solvable in undirected cyclepaths of length for infinitely many values of . Assume that solves in for all solvable instances, and assume that for arbitrarily large values of , we can find a cyclepath of length such that there is an edge with the following properties:

• The distance between and the nearest degree- node (if any) is more than .

• The algorithm labels this edge with a state that is not mirror-flexible.

Then the round complexity of $\mathcal{A}$ has to be $\Omega(n)$.

###### Proof.

We give the proof for the case of a path; the case of a cycle is similar. To reach a contradiction, assume the complexity of is sublinear. Pick a sufficiently large such that the algorithm runs in rounds and paths of length are solvable. For the purposes of this proof, orient the path so that the distance between and the end of the path is at least . Let be an edge between and the end of the path such that the distance between and , and the distance between and the endpoint is at least . Decompose the path in fragments

$$G = (P_0, N_1, P_1, N_2, P_2),$$

where is the radius- neighborhood of , and each is a path of nodes. Let be the mirror image of path , i.e., the same nodes in the opposite direction; then will label the midpoint of with , the mirrored version of state . Construct the following paths:

$$\begin{aligned}
G_1 &= (P_0, N_2, P_1, N_1, P_2), \\
G_2 &= (P_0, \overline{N}_1, P_1, N_2, P_2), \\
G_3 &= (P_0, N_2, P_1, \overline{N}_1, P_2).
\end{aligned}$$

Now all such paths have length , and hence they are also solvable and is expected to produce a feasible solution. Such a solution in gives a walk in , gives a walk , gives a walk , and gives a walk . Putting these together, we can construct walks , , , and .

Finally, we can move nodes one by one from to in each of to construct such walks of any sufficiently large length. It follows that is mirror-flexible, which is a contradiction. ∎

Let $\Pi$ be an LCL problem. Suppose that the automaton $\mathcal{M}$ contains a repeatable state, but it does not contain a flexible state. Then the round complexity of $\Pi$ is $\Omega(n)$.

###### Proof.

We can apply Lemma B.1: the algorithm can only use non-flexible states, and it has to use some non-flexible state repeatedly. ∎

Let $\Pi$ be a symmetric LCL problem on undirected cyclepaths. Suppose that $\mathcal{M}$ contains a repeatable state, but it does not contain a mirror-flexible loop. Then the round complexity of $\Pi$ is $\Omega(\log^* n)$.

###### Proof.

Consider an algorithm $\mathcal{A}$ that solves $\Pi$, and look at the behavior of $\mathcal{A}$ in sufficiently large instances, far away from the endpoints of the paths (if any). There are two cases:

1. Algorithm $\mathcal{A}$ sometimes outputs a loop state (which by assumption cannot be mirror-flexible). Then by Lemma B.1 we obtain a lower bound of $\Omega(n)$.

2. Otherwise $\mathcal{A}$ essentially solves the restriction of $\Pi$ where loop states are not allowed (except near the endpoints of the path), and we can use Theorem B.1 to obtain a lower bound of $\Omega(\log^* n)$. ∎

Let $\Pi$ be a symmetric LCL problem on undirected cyclepaths. Suppose that $\mathcal{M}$ contains a repeatable state, but it does not have a mirror-flexible state. Then the round complexity of $\Pi$ is $\Omega(n)$.

###### Proof.

Again consider an algorithm $\mathcal{A}$ that solves $\Pi$, and look at the behavior of $\mathcal{A}$ in sufficiently large instances, far away from the endpoints of the paths (if any). There are two cases:

1. Algorithm $\mathcal{A}$ sometimes outputs a flexible state (which by assumption cannot be mirror-flexible). Then by Lemma B.1 we obtain a lower bound of $\Omega(n)$.

2. Otherwise $\mathcal{A}$ essentially solves the restriction of $\Pi$ where flexible states are not allowed (except near the endpoints of the path); therefore it is using some non-flexible state repeatedly far from endpoints, and Lemma B.1 applies. ∎

### B.2 Round complexity upper bounds

Let us first consider the trivial case of automata without repeatable states.

Let $\Pi$ be an LCL problem. Suppose $\mathcal{M}$ does not have a repeatable state. Then $\Pi$ can be solved in constant time in solvable instances.

###### Proof.

Let $Q$ be the set of states of $\mathcal{M}$. As $\mathcal{M}$ does not have a repeatable state, $\Pi$ is not solvable in any cycle, and it is only solvable in some paths whose length is bounded by a function of $|Q|$. Hence $\Pi$ can be solved in constant time by brute force (and also in constant time all nodes can detect if the given instance is solvable). ∎

In the rest of this section, we design efficient algorithms for solving problems with flexible or mirror-flexible states. We present the algorithms first for the case of a cycle. The case of a path is then easy to solve: we can first label the path as if it was a cycle, remove the labels near the endpoints (up to distance , where is bounded by the (mirror-)flexibility of a chosen (mirror-)flexible state plus the number of states in ), and fill constant-length path fragments near the endpoints by brute force. We refer to this process as fixing the ends.

Let $\Pi$ be an LCL problem on directed cyclepaths. Suppose $\mathcal{M}$ has a loop. Then the round complexity of $\Pi$ is $O(1)$.

###### Proof.

All nodes can be labeled by a loop state. In a path we will then fix the ends. ∎

Let $\Pi$ be an LCL problem. Suppose $\mathcal{M}$ has a mirror-flexible loop. Then the round complexity of $\Pi$ is $O(1)$.

###### Proof.

Let be a mirror-flexible loop state of mirror-flexibility . Let be an even constant. The first step is to construct a distance- orientation (Definition 3.3); this can be done in rounds.

We say that an edge is a boundary edge if there is another edge with a different orientation within distance less than from ; otherwise is an internal edge. Note that each consistently oriented fragment contains at least one internal edge.

The internal edges are labeled as follows: each edge with orientation “” is assigned label , and each edge with orientation “” is assigned label , i.e., the mirror of .

We are left with gaps of length between the labeled edges. As is mirror-flexible, we can find paths and of length to fill in such gaps. Finally, in a path we will fix the ends. ∎

Let $\Pi$ be an LCL problem on directed cyclepaths. Suppose $\mathcal{M}$ has a flexible state. Then the round complexity of $\Pi$ is $O(\log^* n)$.

###### Proof.

Let be a flexible state of flexibility . This time we first construct a distance- anchoring (Definition 3.3); this can be done in rounds. Let the set of anchors be . If an edge is in , we label its ports by . We are left with the gaps, which can be of size between and (anchoring is maximal). As is flexible, for each gap of size we can find a returning walk of length exactly and fill it by the states along such walk. Finally, in a path we will fix the ends. ∎
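The gap-filling step needs, for a given gap size, a concrete returning walk of exactly that length. The following Python sketch finds one by a layered search and backtracking; the dictionary encoding of the automaton and the name `returning_walk` are assumptions of this example. It returns `None` when no walk of the requested length exists, which for a flexible state can only happen for lengths below its flexibility.

```python
def returning_walk(transitions, q, length):
    """Return a list of states q, ..., q forming a walk with exactly `length` steps, or None."""
    # layers[i] holds all states reachable from q in exactly i steps.
    layers = [{q}]
    for _ in range(length):
        layers.append({t for s in layers[-1] for t in transitions.get(s, ())})
    if q not in layers[length]:
        return None
    # Backtrack from q at step `length` to q at step 0.
    walk = [q]
    current = q
    for i in range(length - 1, -1, -1):
        current = next(s for s in layers[i] if current in transitions.get(s, ()))
        walk.append(current)
    walk.reverse()
    return walk


automaton = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
filler = returning_walk(automaton, "a", 5)
print(filler)
```

The states strictly between the two endpoints of the returned walk are the labels used to fill a gap between two consecutive anchors.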

Let $\Pi$ be an LCL problem on undirected cyclepaths. Suppose $\mathcal{M}$ has a mirror-flexible state. Then the round complexity of $\Pi$ is $O(\log^* n)$.

###### Proof.

The proof is very similar to a previous proof, only with some minor changes as now we are in the undirected setting.

Let be a mirror-flexible state of flexibility . First, we construct a distance- anchoring (Definition 3.3); this can be done in rounds. Let the set of anchors be . If an edge is in , we label its ports by either or its mirror arbitrarily (breaking symmetry with unique identifiers). We are left with the gaps, which can be of size between and (anchoring is maximal). As is mirror-flexible, for each gap of size we can find a returning walk of length exactly and fill the gap no matter the combinations of anchors (, , or ). Finally, in a path we will fix the ends. ∎

### B.3 Solvability

In this part, we consider the solvability of an LCL problem. That is, for a given graph class (the set of all cycles or the set of all paths), we ask how many graphs are solvable instances (instances that admit a legal labeling) with respect to the given problem $\Pi$.

Let $\Pi$ be an LCL problem. If $\mathcal{M}$ has a repeatable state, then the number of solvable instances is infinite.

###### Proof.

Let be a repeatable state, i.e., there is a walk of some length . Now for every , cycles of length are solvable, as we can generate cycles of the form .

In paths, by assumption is reachable from some starting state and we can reach some accepting state from ; let be the length of a walk . Now for every , paths of length are solvable, as we can generate paths of the form . ∎

Let $\Pi$ be an LCL problem. If $\mathcal{M}$ has a flexible state, the number of unsolvable instances is at most $C$, where $C$ is a constant.

###### Proof.

Let be a flexible state with flexibility . All cycles of length are now trivially solvable, as we have a walk of length .

In paths, by assumption is reachable from some starting state and we can reach some accepting state from ; let be the length of a walk . Now all paths of length are solvable, as we have a walk of length . ∎

Let $\Pi$ be an LCL problem on cycles. If $\mathcal{M}$ has a loop, the number of unsolvable instances is zero.

###### Proof.

As $\mathcal{M}$ has a loop, returning walks of all lengths exist and all cycles can be labeled. ∎

Let $\Pi$ be an LCL problem on cycles. If $\mathcal{M}$ does not have a repeatable state, the number of solvable instances is zero.

###### Proof.

Any legal labeling on cycles has to contain a repeatable state. ∎

Let $\Pi$ be an LCL problem. Assume $\mathcal{M}$ does not have any flexible state. Then there are infinitely many unsolvable instances on cycles.

###### Proof.

Let $Q$ be the set of states of $\mathcal{M}$. Since no state is flexible in $\mathcal{M}$, by Lemma A we have $\gcd(L_q) > 1$ for all $q \in Q$. Pick

$$b = \prod_{q \in Q} \gcd(L_q).$$

Now $kb + 1 \notin L_q$ for any $q \in Q$ and any natural number $k$. Therefore it is not possible to use any state $q$ in a cycle of length $kb + 1$, as a feasible solution in such a cycle would form a walk $q \leadsto q$ of length $kb + 1$. Hence there are infinitely many unsolvable instances. ∎
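This argument can be observed on a toy automaton with a short Python sketch that lists which cycle lengths admit a closed walk. The encoding (states as labels, transitions as a dictionary) and the name `solvable_cycle_lengths` are assumptions of this example, and treating a cycle of a given length as solvable exactly when some state has a returning walk of that length is the simplification the sketch relies on.

```python
def solvable_cycle_lengths(transitions, max_len):
    """Lengths n in 1..max_len for which some state has a returning walk of length n."""
    states = set(transitions) | {t for ts in transitions.values() for t in ts}
    solvable = set()
    for q in states:
        frontier = {q}  # states reachable from q in exactly n steps
        for n in range(1, max_len + 1):
            frontier = {t for s in frontier for t in transitions.get(s, ())}
            if q in frontier:
                solvable.add(n)
    return solvable


# Toy automaton for alternating two colors: every returning walk has even length.
two_coloring = {1: [2], 2: [1]}
lengths = solvable_cycle_lengths(two_coloring, 12)
print(sorted(lengths))
```

Here both states have returning walk lengths with greatest common divisor 2, so the product of the divisors is 4, and the lengths 1, 5, and 9 (one more than a multiple of 4) are among the unsolvable ones.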

Let $\Pi$ be an LCL problem. Suppose $\mathcal{M}$ does not have a repeatable state. Then there are at most constantly many solvable instances.

###### Proof.

Let $Q$ be the set of states of $\mathcal{M}$. As $\mathcal{M}$ does not have a repeatable state, all walks that can form a legal labeling have length bounded by a function of $|Q|$. So all sufficiently long paths are unsolvable instances. ∎

## Appendix C Complexity of deciding solvability in paths

Theorem C shows that the unary NFA universality problem becomes polynomial-time solvable once we have a promise that the automaton rejects only finitely many strings. The theorem implies that distinguishing between $0$ unsolvable instances and finitely many (but at least one) unsolvable instances is in polynomial time, for LCLs both on paths and on cycles. Although the automaton $\mathcal{M}$ used in the half-edge formalism has a different acceptance condition than that of the standard NFA, it is straightforward to transform $\mathcal{M}$ into an equivalent NFA with the standard NFA acceptance condition (i.e., there is one starting state and a set of accepting states).

There is a polynomial-time algorithm $\mathcal{A}$ that achieves the following for any given unary NFA $M$. If $M$ does not reject any string, then the output of $\mathcal{A}$ is Yes. If $M$ rejects at least one but only finitely many strings, then the output of $\mathcal{A}$ is No. If $M$ rejects infinitely many strings, the output of $\mathcal{A}$ can be either No or Yes.

###### Proof.

This is an immediate consequence of Chrobak’s theorem [30, 12], which shows that any unary NFA is equivalent to some NFA in the Chrobak normal form, and the number of states in is at most . An NFA is in Chrobak normal form if it can be constructed as follows. Start with a directed path and directed cycles , for each , where is the length of . Add a transition from to for each . The starting state is . The set of accepting states can be arbitrary.

The algorithm works as follows. It tests whether accepts all strings of length at most . If so, then the output is Yes; otherwise, the output is No. To see the correctness, we only need to show that whenever rejects at least one but only finitely many strings, then the output of is No. To show this, it suffices to prove that if there is a string of length higher than that is rejected by , then there must be infinitely many strings rejected by .

Let be the length of . Now consider some NFA that is in the Chrobak normal form and is equivalent to . We can assume that the number of states in is at most . Define