Two strings and are said to be anagrams of each other if is a permutation of . A single string is anagramish if its first half and its second half are anagrams of each other. A string is anagram-free if it does not contain an anagramish substring.
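As a concrete illustration of these definitions, the following is a minimal Python sketch. The function names `is_anagramish` and `is_anagram_free` are ours, not the paper's, and we assume that the empty string does not count as anagramish.

```python
from collections import Counter


def is_anagramish(s):
    """Return True if s has even positive length and its two halves are anagrams."""
    n = len(s)
    if n == 0 or n % 2 == 1:
        return False  # assumption: only non-empty even-length strings qualify
    return Counter(s[: n // 2]) == Counter(s[n // 2:])


def is_anagram_free(s):
    """Return True if no contiguous substring of s is anagramish."""
    n = len(s)
    for half in range(1, n // 2 + 1):           # half-length of the candidate substring
        for start in range(n - 2 * half + 1):   # starting position of the candidate
            if is_anagramish(s[start:start + 2 * half]):
                return False
    return True
```

Both functions accept any sliceable sequence of hashable symbols, so they work equally on Python strings and on tuples of colours. An exhaustive search with this checker over a three-letter alphabet finds anagram-free strings of length 7 (for example `"abacaba"`) but none of length 8.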
In 1961, Erdős [3] asked if there exist arbitrarily long strings over an alphabet of size 4 that are anagram-free. (Footnote 1: This was an incredibly prescient question, since it is not at all obvious that there exist arbitrarily long anagram-free strings over any finite alphabet. The only justification for choosing the constant 4 is that a short case analysis rules out the possibility of length-8 anagram-free strings over an alphabet of size 3.) In 1968, Evdokimov [4, 5] showed the existence of arbitrarily long anagram-free strings over an alphabet of size 25, and in 1970 Pleasants [10] showed that an alphabet of size 5 is sufficient. Erdős’s question was not fully resolved until 1992, when Keränen [7] answered it in the affirmative.
A path in a graph is anagramish under a vertex -colouring if is an anagramish string. The colouring is an anagram-free colouring of if no path in is anagramish under . The minimum integer for which has an anagram-free vertex -colouring is called the anagram-free chromatic number of and is denoted by .
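The definition can be checked by brute force on very small graphs. The sketch below is only an illustration under our own conventions: the graph is an adjacency dictionary, the colouring is a dictionary from vertices to colours, and the names `simple_paths` and `is_anagram_free_colouring` are hypothetical. It enumerates every simple path, so its running time is exponential in general.

```python
from collections import Counter


def simple_paths(adj):
    """Yield every simple path of the graph as a tuple of vertices."""
    def extend(path, seen):
        yield tuple(path)
        for w in adj[path[-1]]:
            if w not in seen:
                path.append(w)
                seen.add(w)
                yield from extend(path, seen)
                path.pop()
                seen.remove(w)

    for v in adj:
        yield from extend([v], {v})


def is_anagram_free_colouring(adj, colour):
    """Return True if no simple path has an anagramish colour sequence."""
    for path in simple_paths(adj):
        k = len(path)
        if k % 2 == 0:
            first = Counter(colour[v] for v in path[: k // 2])
            second = Counter(colour[v] for v in path[k // 2:])
            if first == second:
                return False
    return True
```

For example, on the 4-cycle (which is also the smallest square grid with two rows) the colouring a, b, a, c in cyclic order passes this check, while a, b, a, b fails because the path through all four vertices has colour sequence abab.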
For a graph family , or if the maximum is undefined. The results on anagram-free strings discussed in the preceding paragraph can be interpreted in terms of where is the family of all paths. Slightly more complicated than paths are trees. Wilson and Wood [13] showed that for the family of trees and Kamčev et al. [6] showed that even for the family of binary trees.
One positive result in this context is that of Wilson and Wood [13], who showed that every tree of pathwidth has . Since trees are graphs of treewidth 1, it is natural to ask if this result can be extended to show that every graph of treewidth and pathwidth has for some . Carmi et al. [2] showed that such a generalization is not possible for any by giving examples of -vertex graphs of pathwidth (and treewidth ) with . The obvious remaining gap left by these two works is graphs of treewidth . Our main result is to show that , the square grid has . Since has pathwidth 2, we have:
For every , there exists a graph of pathwidth that has no anagram-free vertex -colouring.
Wilson [11, Section 7.1] conjectured that , so this work confirms this conjecture. Prior to the current work, it was not even known if the family of square grids had anagram-free colourings using a constant number of colours.
In a larger context, this lower bound gives more evidence that, except for a few special cases (paths [4, 10, 7], trees of bounded pathwidth [13], and highly subdivided graphs [12]), the qualitative behaviour of anagram-free chromatic number is not much different from that of treedepth/centered colouring [9]. Very roughly: For most graph classes, every graph in the class has an anagram-free colouring using a bounded number of colours precisely when every graph in the class has a colouring using a bounded number of colours in which every path contains a colour that appears only once in the path.
The remainder of this paper is organized as follows: Section 2 gives some definitions and states a key lemma that shows that, under a certain periodicity condition, every sufficiently long string contains a substring that is -close to being anagramish. In Section 3 we prove the main result stated in Section 1. In Section 4 we prove the key lemma. Section 5 concludes with some final remarks about the (non-)constructiveness of our proof technique.
2 Periodicity in Strings
An alphabet is a finite non-empty set. A string over is a (possibly empty) sequence with for each The length of is the length, , of the sequence. For an integer , (the -fold cartesian product of with itself) is the set of all length- strings over . The Kleene closure is the set of all strings over . For each , is called a substring of . (Note the convention that does not include , so has length .)
Let be a string over an alphabet and, for each , define . The histogram of is the integer-valued vector indexed by elements of . Observe that a string is anagramish if and only if or, equivalently, . For each , let . Then is a useful measure of how far a string is from being anagramish and if and only if is anagramish.
A string is -periodic if each length- substring of contains every character in . We make use of the following lemma, which states that every sufficiently long -periodic string contains a long substring that is -close to being anagramish.
For each and each , there exists a positive integer such that each -periodic string contains a substring of length such that .
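The periodicity condition used in this lemma is easy to test directly. The following is a small sketch; the name `is_periodic` and the parameter name `ell` (the window length) are our own choices, and we assume that a string shorter than the window length satisfies the condition vacuously.

```python
def is_periodic(s, alphabet, ell):
    """Return True if every length-ell substring of s contains every character of alphabet."""
    needed = set(alphabet)
    # Slide a window of length ell over s; each window must cover the whole alphabet.
    return all(needed <= set(s[i:i + ell]) for i in range(len(s) - ell + 1))
```

For instance, `"aabbaabb"` over the alphabet `"ab"` passes with window length 3 but fails with window length 2, because of the substring `"aa"`.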
The proof of Section 2 is deferred to Section 4. We now give some intuition as to how it is used. The process of checking if a string is anagramish is often viewed as finding common terms in the first and second halves and crossing them both out. If this results in a complete cancellation of all terms, then the string is anagramish. Section 2 tells us that we can always find a long substring where, after exhaustive cancellation, only an -fraction of the original terms remain. Informally, the substring is -close to being anagramish.
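The cancellation view can be made concrete as follows. This is a sketch only: the function name `leftover_after_cancellation` is ours, and counting the uncancelled terms in both halves is one reasonable way to quantify the distance from being anagramish, not necessarily the exact measure used in the lemma.

```python
from collections import Counter


def leftover_after_cancellation(s):
    """Count the terms of s left over after cancelling matching terms between its halves."""
    if len(s) % 2 == 1:
        raise ValueError("expected a string of even length")
    half = len(s) // 2
    first, second = Counter(s[:half]), Counter(s[half:])
    # Counter subtraction keeps only positive counts, i.e. the uncancelled terms.
    return sum((first - second).values()) + sum((second - first).values())
```

A string is anagramish exactly when this count is zero (and the string is non-empty). For `"aabbab"` the halves `"aab"` and `"bab"` leave one `a` and one `b` uncancelled, so 2 of the 6 terms remain.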
Section 2 says that, if up to terms in each half of were each allowed to cancel two terms each in the other half of , then it would be possible to complete the cancellation process. To achieve this type of one-versus-two cancellation in our setting, we decompose our coloured pathwidth- graph into pieces of constant size. The vertices in each piece can be covered with one path or partitioned into two paths. In this way an occurrence of a particular coloured piece in one half can be matched with two like-coloured pieces and in the other half. We construct a single path that contains all vertices in and only half the vertices in each of and . In this way, the colours of vertices can cancel the colours of the vertices in .
Let be an alphabet and let be a function such that,
if for some substring of then ; and
for each , there exists at least one such that .
Then there exists and such that, for each ,
there exists such that ; and
every string in is -periodic.
Take to be a minimal subset of such that there exists with , for each . Such a exists by (A2) and the fact that is finite. By definition, satisfies (C1) so we need only show that it also satisfies (C2). If then we are done since every string over a -character alphabet is -periodic.
For , the minimality of implies that, for any , there exist such that for each . Therefore (A1) implies that, for each , for each . Set . Now (A1) implies that, for every , every length -substring of contains every character in , so is -periodic. ∎
3 Proof of Section 1
For each , let be the square grid with top row and bottom row (see Figure 1). For convenience, we let . For each with , define and we call a -block. Note that each -block is isomorphic to with the mapping given by and for each .
For each , let be the set of all functions , i.e., all vertex -colourings of . Given some , each -block defines a vertex colouring of defined as for each .
Our strategy will be to break up into small pieces using -blocks that all have the same colouring. Observe that any string defines a vertex -colouring of the graph where for each . Indeed, this is a bijection between -colourings of and strings in .
If for each then there exists such that, for each there exists such that is an anagram-free vertex colouring of and every such is -periodic.
For any , let if is an anagram-free colouring of and let otherwise. Then has property (A1) of Section 2 since any substring of of defines a colouring of that appears in the colouring of ; if the colouring of is not anagram-free then neither is the colouring of . By assumption, , so has some anagram-free vertex -colouring, so also satisfies property (A2) of Section 2. The result now follows from Section 2. ∎
Let and be defined as in Section 3. Let be an arbitrary element of and let . For a string , define , define . Fix a vertex -colouring of and define the colouring of as follows:
for each , ; and
for each , .
See Figure 2. In words, is decomposed into blocks each of whose length is a multiple of . There are colourful blocks of lengths and these are interleaved with boring blocks, each of length . The colourful blocks have their vertex colours determined by . The boring blocks are all coloured the same way, by .
Define the function so that if is an anagram-free colouring of and otherwise. Observe that any substring of defines a colouring of that appears in the colouring of . Therefore satisfies property (A1) of Section 2. Furthermore, if for each , then Section 3 implies that there exists a string with for each . Therefore satisfies property (A2) of Section 2. Therefore, Section 2 implies the following result:
If for each then there exists such that, for each there exists an -periodic string such that is an anagram-free vertex colouring of .
If for each then, for each and , there exists and a string with such that is an anagram-free vertex colouring of .
Section 3 shows the existence of colourings of for arbitrarily large values of that are defined by strings that are -close to being anagramish. The last step, done in the next lemma, is to show that there is enough flexibility when constructing a path in that we can find a path that has an anagramish colour sequence.
For any , there exists and such that, for any integer and any -periodic with , the graph contains a path such that is anagramish.
For each define and define sets and as follows:
If then .
If then .
If then .
Let and . The sets and are chosen so that they satisfy the following global independence constraint: There is no pair such that . To see that this is possible, first observe that, because is not in or we need only concern ourselves with pairs where both or pairs where both . Thus, we can choose the elements of , for each and then independently choose the elements of , for each .
We show how to choose the elements of for each . The same method works for choosing the elements in . Observe that, because is -periodic, for each . This allows us to greedily choose the elements in for each . At each step we simply avoid choosing if or have already been chosen in some previous step. At any step in the process, at most elements have already been chosen in previous steps and each of these eliminates at most options. Therefore, there will always be an element available to choose, provided that . In particular, for any , works.
We now construct the path in a piecewise fashion. Refer to Figure 3. For each , let . The subgraphs are what are referred to above as colourful blocks. The colouring of by is defined by .
For each such that , group the elements of into pairs. For each pair , contains the path through the top row of and the path through the bottom row of . For each element , contains the zig-zag path with both endpoints in the top row of and that contains every vertex of . (Note that the zig-zag path begins and ends in the top row because is a -block for a multiple of ; in particular, is even.)
Figure 4: Subpaths of through colourful blocks: A top path and a bottom path contribute the same amount as a single zig-zag path.
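The claim in the caption can be checked on a small example. The sketch below uses our own coordinates, which are an assumption and not the paper's notation: a block with two rows and `r` columns has vertices `(row, col)` with row 0 on top. The helper names `top_path`, `bottom_path` and `zigzag_path` are hypothetical.

```python
from collections import Counter


def top_path(r):
    """Path through the top row of a block with two rows and r columns."""
    return [(0, col) for col in range(r)]


def bottom_path(r):
    """Path through the bottom row of a block with two rows and r columns."""
    return [(1, col) for col in range(r)]


def zigzag_path(r):
    """Path visiting every vertex: down in even columns, up in odd columns."""
    path = []
    for col in range(r):
        rows = (0, 1) if col % 2 == 0 else (1, 0)
        path.extend((row, col) for row in rows)
    return path


def colour_histogram(path, colour):
    """Multiset of colours seen along a path."""
    return Counter(colour[v] for v in path)
```

When `r` is even, `zigzag_path(r)` starts and ends in the top row. For any colouring of the block, the colours on the zig-zag path form the same multiset as the colours on a top path in one copy of the block together with a bottom path in another identically coloured copy, which is the one-versus-two cancellation used in the construction.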
For each such that we proceed symmetrically to the previous case, but reversing the roles of and . Specifically, we group the elements of into pairs. For each pair , contains the path through the top row of and the path through the bottom row of . For each element , contains the zig-zag path with both endpoints in the top row of and that contains every vertex of .
For each , contains the top row of .
The rules above define the intersection, , of with each colourful block of . If is the path through the bottom (top) row of then we call a bottom (top) block. If is the zig-zag path that contains every vertex of then we call a zig-zag block. Note that and this implies that the number of bottom blocks among is the same as the number of bottom blocks among . Indeed, this number is exactly .
We now define how behaves for the boring blocks, that we name . The first boring block comes immediately before . Each boring block , for comes immediately after and immediately before . In almost every case, uses the path through the top row of . The only exceptions are when or are bottom blocks. Note that, because of the global independence constraint, these two cases are mutually exclusive. See Figure 5.
When is a bottom block uses a path that begins at the bottom row of but moves immediately to the top row of and uses the entire path along the top row. We call this a downup path.
When is a bottom block, uses a path that begins at the top row of and moves immediately to the bottom row of and uses the entire path along the bottom row. We call this an updown path.
This completely defines the path . All that remains is to argue that is anagramish.
Observe that the number of downup paths and the number of updown paths in is exactly the same as the number of bottom blocks among which is exactly . Similarly, the number of updown paths and downup paths in is exactly . Now every path that is neither downup nor updown uses the top row. This implies that the sequence of colours contributed to by the intersection of with is a permutation of the sequence of colours contributed to by the intersection of with .
Finally, by construction, each pair of top and bottom blocks in contributes exactly the same amount as a single matching zig-zag block in . Specifically, if , and , is a top block, is a bottom block and is a zig-zag block, then the contributions of and to cancel out the contribution of . After doing this cancellation exhaustively, all that remains are top blocks, which also cancel each other perfectly. This completes the proof. ∎
For completeness, we wrap up the proof of Section 1:
Proof of Section 1.
Assume for the sake of contradiction that there exists some such that for each . With this assumption, Section 3 shows that for every there exists a string with , , and such that is an anagram-free colouring of .
Section 3 shows that, for any string with and , the graph contains a path that is anagramish under . This contradicts the fact that is an anagram-free colouring of . We therefore conclude that, for every there exists an such that . ∎
4 Proof of Section 2
All that remains is to prove Section 2, which we do now.
Proof of Section 2.
Define an even-length string over the alphabet to be -unbalanced if and -balanced otherwise. If is -balanced for each then is balanced. Observe that, if is balanced then . A string is everywhere unbalanced if it contains no balanced substring of length . Our goal therefore is to show that there is an upper bound on the length of any -periodic everywhere unbalanced string.
Let be a positive integer (that determines and whose value will be discussed later), and let . Let be an -periodic everywhere unbalanced string of length over the alphabet . The fact that is -periodic, implies that the . Assume, without loss of generality, that is a multiple of .
Consider the complete binary tree of height whose leaves, in order, are length- strings whose concatenation is and for which each internal node is the substring obtained by concatenating the node’s left and right child. Note that for each and each , the fact that is -periodic and is a multiple of implies that .
For each , let . Since is everywhere unbalanced, . Therefore,
and therefore, there exists some such that . At this point we are primarily concerned with appearances of , so let , and, for each node , let .
For each non-leaf node of , let denote a child of such that . (It is helpful to think of as being ordered so that each right child with sibling has .) For a non-leaf node the fact that is -unbalanced implies that
From this point on we use the following shorthands. For any , , , and . Summarizing, we have a complete binary tree of height and with the following properties:
For each , .
For each non-leaf node , .
For each , let denote the set of nodes for which the path from the root of to contains exactly nodes in , excluding . See Figure 6. Observe that, since each node in has an ancestor in ,
We will show that there exists an integer such that, for each ,
In this way,
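\begin{align*}
(h+1)n/\ell \le L(X) &= \sum_{i=0}^{h} L(X_i) \\
&\le \sum_{i=0}^{h} L(X_{t\lfloor i/t\rfloor}) && \text{(since $L(X_i)$ is non-increasing in $i$)} \\
&= t\cdot\sum_{i=0}^{h/t} L(X_{it}) && \text{(for $h$ a multiple of $t$)} \\
&\le t\cdot\sum_{i=0}^{\infty} \left(1-(1/2)^{t+1}\right)^{i} L(X_0) && \text{(by Equation 1)} \\
&\le tn\cdot\sum_{i=0}^{\infty} \left(1-(1/2)^{t+1}\right)^{i} && \text{(since $L(X_0)\le n$)} \\
&= tn2^{t+1}
\end{align*}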
which is a contradiction for sufficiently large ; in particular, for .
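For readers who want the last step of this chain spelled out, the following short derivation is a sketch based only on the quantities that appear in the chain above. The closed form is the standard geometric series with ratio $q = 1-(1/2)^{t+1}$:

\[
\sum_{i=0}^{\infty} q^{i} = \frac{1}{1-q} \quad (0 \le q < 1),
\qquad\text{so}\qquad
\sum_{i=0}^{\infty} \left(1-(1/2)^{t+1}\right)^{i} = \frac{1}{(1/2)^{t+1}} = 2^{t+1}.
\]

Dividing both ends of the chain by $n$ and multiplying by $\ell$ then gives

\[
(h+1)n/\ell \le tn2^{t+1} \iff h+1 \le \ell\, t\, 2^{t+1},
\]

which fails as soon as $h \ge \ell\, t\, 2^{t+1}$.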
It remains to establish Equation 1, which we do now. Define and, for each , define to be the subset of that are descendants of some node in . See Figure 6. To upper bound observe that can be split into two sets and defined as follows: The nodes do not have an ancestor in and therefore . The nodes in do have an ancestor in and therefore have an ancestor in . Iterating this argument, we obtain
\begin{align*}
L(X_{i+t}) &\le (1/2)L(A_0) + (1/2)L(A_1) + \cdots + (1/2)L(A_{t-1}) + L(A_t) \\
&\le (1/2)L(A_0) + (1/4)L(A_0) + \cdots + (1/2)^{t} L(A_0) + L(A_t) \\
&= \left(1-(1/2)^{t}\right)L(A_0) + L(A_t) = \left(1-(1/2)^{t}\right)L(X_i) + L(A_t) .
\end{align*}
So all that remains to establish Equation 1 is to prove that .
To do this, observe that, for each ,
Since is -periodic,
for . ∎
Although an explicit upper bound on could be extracted from the proof of Section 2 it would likely be far from tight. We suspect that there is a Fourier analytic proof that would give better quantitative bounds. We have not pursued this, because we have no idea how to explicitly upper bound , for reasons discussed in the next paragraph.
Section 2 and its proof give absolutely no clues to help find a concrete bound on or to find a minimal set . Indeed, for some choices of , doing so can be a difficult problem. Consider the example where and is the predicate that tells whether or not its input is anagram-free. It is easy to see that this predicate satisfies (A1) and the result of Pleasants [10], published in 1970, shows that this satisfies (A2). The question of whether or is then the question of determining whether there exist arbitrarily long anagram-free strings on an alphabet of size 4. This was the open problem posed by Erdős [3] in 1961 and again by Brown [1] in 1971 and not resolved until 1992 when Keränen [7, 8] showed that the answer, in this case, is that . However, if this were not the case, then determining would be the question of determining the length of the longest anagram-free string over an alphabet of size 4.
Our proof uses Section 2 twice and each application uses a predicate that is considerably more complicated than asking if the input string is anagram-free. It seems unlikely that we will obtain concrete upper bounds on as a function of except, possibly, through the use of computer search. The resulting value is used in the application of Section 2 and also within the proof of Section 3.
- [1] T. C. Brown. Is there a sequence on four symbols in which no two adjacent segments are permutations of one another? The American Mathematical Monthly, 78(8):886–888, 1971. doi:10.1080/00029890.1971.11992892.
- [2] Paz Carmi, Vida Dujmović, and Pat Morin. Anagram-free chromatic number is not pathwidth-bounded. In Andreas Brandstädt, Ekkehard Köhler, and Klaus Meer, editors, Graph-Theoretic Concepts in Computer Science - 44th International Workshop, WG 2018, Cottbus, Germany, June 27-29, 2018, Proceedings, volume 11159 of Lecture Notes in Computer Science, pages 91–99. Springer, 2018. doi:10.1007/978-3-030-00256-5_8.
- [3] P. Erdős. Some unsolved problems. Magyar Tud Akad Mat Kutato Int Kozl, 6:221–254, 1961.
- [4] A. A. Evdokimov. Strongly asymmetric sequences generated by finite number of symbols. Doklady Akademii Nauk SSSR, 179:1268–1271, 1968.
- [5] A. A. Evdokimov. Strongly asymmetric sequences generated by finite number of symbols. Soviet Mathematics Doklady, 9:536–539, 1968.
- [6] Nina Kamčev, Tomasz Łuczak, and Benny Sudakov. Anagram-free colourings of graphs. Comb. Probab. Comput., 27(4):623–642, 2018. doi:10.1017/S096354831700027X.
- [7] Veikko Keränen. Abelian squares are avoidable on 4 letters. In Werner Kuich, editor, Automata, Languages and Programming, 19th International Colloquium, ICALP92, Vienna, Austria, July 13-17, 1992, Proceedings, volume 623 of Lecture Notes in Computer Science, pages 41–52. Springer, 1992. doi:10.1007/3-540-55719-9_62.
- [8] Veikko Keränen. A powerful abelian square-free substitution over 4 letters. Theor. Comput. Sci., 410(38-40):3893–3900, 2009. doi:10.1016/j.tcs.2009.05.027.
- [9] Jaroslav Nešetřil and Patrice Ossona de Mendez. Tree-depth, subgraph coloring and homomorphism bounds. Eur. J. Comb., 27(6):1022–1041, 2006. doi:10.1016/j.ejc.2005.01.010.
- [10] P. A. B. Pleasants. Non-repetitive sequences. Proceedings of the Cambridge Philosophical Society, 68:267–274, 1970. doi:10.1017/S0305004100046077.
- [11] Tim E. Wilson. Anagram-free Graph Colouring and Colour Schemes. Ph.D. thesis, Monash University, 2019. doi:10.26180/5c72eca26d5c7.
- [12] Tim E. Wilson and David R. Wood. Anagram-free colorings of graph subdivisions. SIAM J. Discret. Math., 32(3):2346–2360, 2018. doi:10.1137/17M1145574.
- [13] Tim E. Wilson and David R. Wood. Anagram-free graph colouring. Electron. J. Comb., 25(2):P2.20, 2018. doi:10.37236/6267.