Palindromes are words that read the same from left to right and from right to left. In natural languages we can find several well-known, and sometimes amusing, examples of palindromes. In the theory of formal languages, questions related to palindromes are usually elegantly formulated but difficult to solve. An example of this is the famous HKS conjecture, whose first vague formulation appeared in 1995 in the paper , and which remains unsolved in its general case. Another recent conjecture is the one about the palindromic length of aperiodic words, defined in , which still resists many attempts to prove it.
This article is devoted to words rich in palindromes. The definition of richness is a natural consequence of a simple observation made by Droubay, Justin and Pirillo in , that each word of length contains at most distinct palindromes, counting also the empty word which is considered to be a palindrome of length zero. A word of length is said to be rich in palindromes if distinct palindromes occur in it. For example, (Czech and Italian for pineapple) is rich in taste but also in palindromes since it contains the 7 palindromes: . Another rich word is , since it contains the palindromes . A counterexample is given by the word of length that only contains palindromes, namely , and thus is not rich in palindromes (nor in taste, according to the first author).
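The counting behind this definition can be checked by brute force. The following is a minimal sketch, not part of the original article; the function names are hypothetical. It assumes that the pineapple example, whose symbols are missing above, is the word `ananas`. The word `00101100` is an illustrative non-rich word chosen here and is not necessarily the counterexample meant in the text.

```python
def palindromic_factors(word: str) -> set[str]:
    """Return all distinct palindromic factors of `word`, including the empty word."""
    palindromes = {""}
    for start in range(len(word)):
        for end in range(start + 1, len(word) + 1):
            factor = word[start:end]
            if factor == factor[::-1]:
                palindromes.add(factor)
    return palindromes


def is_rich(word: str) -> bool:
    """A word of length n is rich if it contains n + 1 distinct palindromes."""
    return len(palindromic_factors(word)) == len(word) + 1


if __name__ == "__main__":
    # Seven palindromes in a word of length six: the word is rich.
    print(sorted(palindromic_factors("ananas"), key=len))
    print(is_rich("ananas"))
    # Only eight palindromes in a word of length eight: not rich.
    print(is_rich("00101100"))
```

The enumeration is cubic in the length of the word, which is adequate for small experiments only.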
Any factor of a rich word is rich as well. Therefore the set of all rich words over a given finite alphabet is a so-called factorial language. Its factor complexity is known to be subexponential  and superpolynomial . Nevertheless the gap between the best known upper and lower bounds on the number of rich words of length over a fixed alphabet is still huge, in particular for alphabets of cardinality at least . Improving the lower bound requires constructing more rich words. In this paper we look for rewriting rules (formally homomorphisms of the free monoid) which allow us to construct new rich words from already known rich words. An example of such a rewriting rule is the so-called Fibonacci morphism: rewrites to and rewrites to . Applying the Fibonacci morphism to a rich word, say , we get a new rich word .
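A rewriting rule of this kind is easy to experiment with. The sketch below is an illustration under assumptions: it takes the Fibonacci morphism in its standard form over the letters `0` and `1` (`0` rewrites to `01`, `1` rewrites to `0`), since the symbols used in the text are missing, and the input word `0010` is chosen here only as an example.

```python
def apply_morphism(morphism: dict[str, str], word: str) -> str:
    """Apply a morphism of the free monoid letter by letter."""
    return "".join(morphism[letter] for letter in word)


def is_rich(word: str) -> bool:
    """Brute-force richness test: count distinct palindromic factors."""
    palindromes = {""}
    for start in range(len(word)):
        for end in range(start + 1, len(word) + 1):
            factor = word[start:end]
            if factor == factor[::-1]:
                palindromes.add(factor)
    return len(palindromes) == len(word) + 1


# Assumed standard form of the Fibonacci morphism.
FIBONACCI = {"0": "01", "1": "0"}

if __name__ == "__main__":
    word = "0010"
    image = apply_morphism(FIBONACCI, word)
    print(word, is_rich(word))    # the input word is rich
    print(image, is_rich(image))  # and so is its image 0101001
```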
Because of the result of J. Vesti on extendability of rich words (), it is enough to apply rewriting rules only to infinite words, i.e., to sequences of the form , where is an element of a finite alphabet for each . The definition of palindromic richness can be naturally extended to infinite words. We say that is rich in palindromes if all its finite prefixes are rich in palindromes. The most studied class of binary words, namely Sturmian words, consists of words rich in palindromes. Therefore, rich words can be considered as one of the possible generalizations of Sturmian words. The Rote complementary symmetric sequences form another class of binary rich words. Apart from these two classes, only a few isolated examples of binary rich words are known (for example the period doubling word). On -ary alphabet, with , two disjoint classes of rich words were found: the episturmian words and the words coding -interval exchange transformation with the symmetric permutation of intervals. On a ternary alphabet a further class of rich words is described in .
Here we study two types of morphisms: Arnoux-Rauzy morphisms and morphisms from Class . Arnoux-Rauzy morphisms over -ary alphabet preserve the episturmian words. We show in Theorem 29 that Arnoux-Rauzy morphisms preserve the set of all rich words. morphisms were introduced in . Harju, Vesti and Zamboni used these morphisms in  to prove a weak version of the HKS conjecture for rich words. In Theorem 32 we characterize morphisms which preserve richness on a binary alphabet. These morphisms contain Sturmian morphisms as a subclass.
The article is organized as follows: In Preliminaries, necessary definitions and tools for the study of richness are presented. In Section 3 we define Class morphisms and list some of their properties, including their relation to morphisms from Class . Morphisms from Class which are moreover marked are studied in Section 4. Section 5 is devoted to the proofs of our main results on morphisms preserving richness. These results are applied to find rich words in Section 6. Some suggestions for further research are presented in the last section.
For all undefined terms we refer to . Let be a finite alphabet and the free monoid over . The elements of are called letters and the elements of are called (finite) words. The length of a word , with , is the number . The identity of the monoid is called the empty word and it is denoted by . Note that is the only word of length .
If a word can be written as , with , we say that is a factor of , is a prefix of and is a suffix of . If , then we define and . Given a word , with , we define its mirror image as . A word is called a palindrome (or palindromic) if .
A (right) infinite word is an infinite sequence of letters . We denote by the set of infinite words over . In this paper we use bold letters to denote infinite words. The notions of factor, prefix and suffix extend naturally to infinite words. An infinite word is called ultimately periodic if for some words . If we will call the word (purely) periodic. A word that is not ultimately periodic is called aperiodic. A word is called recurrent if any of its factors appears at least twice (thus an infinite number of times). It is uniformly recurrent if these factors appear with bounded gaps, i.e., if for every factor of there exists an integer such that all factors of length of are of the form for some .
The language of an infinite word , is the set of its finite factors, that is
We say that the language is periodic (resp. aperiodic, recurrent, uniformly recurrent) if the word is periodic (resp. aperiodic, recurrent, uniformly recurrent). A language is said to be closed under reversal if for any we have as well. It is known that if is closed under reversal, then is recurrent.
The set of complete return words to a word (with respect to ) is the set of words in having exactly two factors equal to , one as a proper prefix and the other as a proper suffix. It is known that is uniformly recurrent if and only if it is recurrent and for every word the set is finite. If is a complete return word to , then is called a (right) return word to . We denote by the set of return words to . Clearly .
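Return words can be read off from consecutive occurrences of a factor. The sketch below is an illustration only: it works on a finite word rather than on the language of an infinite word, the function names are hypothetical, and the sample word is a prefix of the Fibonacci word over the assumed letters `0` and `1`.

```python
def occurrences(word: str, factor: str) -> list[int]:
    """Start positions of all (possibly overlapping) occurrences of a non-empty factor."""
    size = len(factor)
    return [i for i in range(len(word) - size + 1) if word[i:i + size] == factor]


def complete_return_words(word: str, factor: str) -> set[str]:
    """Factors of `word` spanning two consecutive occurrences of `factor`."""
    positions = occurrences(word, factor)
    return {word[i:j + len(factor)] for i, j in zip(positions, positions[1:])}


def return_words(word: str, factor: str) -> set[str]:
    """Complete return words with the final occurrence of `factor` removed."""
    positions = occurrences(word, factor)
    return {word[i:j] for i, j in zip(positions, positions[1:])}


if __name__ == "__main__":
    fibonacci_prefix = "0100101001001"
    print(complete_return_words(fibonacci_prefix, "0"))  # {'010', '00'}
    print(return_words(fibonacci_prefix, "0"))           # {'01', '0'}
```

Each complete return word is the corresponding return word followed by the factor itself, in line with the relation stated above.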
Let us consider an infinite word and its language . The extension graph of a word with respect to is the undirected bipartite graph . Its set of vertices is the disjoint union of on one side and on the other side of the bipartite graph. Edges are formed by the elements of (see  for a more detailed explanation). When it is clear from the context we will write and instead of and .
Let us consider an infinite word such that its language contains the following words of length : (but not ); and the following words of length : (but not ). The graphs , and are shown in Figure 1.
A word is left special (resp. right special) if (resp ). A bispecial word is a word that is both left and right special. We denote by the set of all bispecial factors of . We also define the bilateral order of a word as
An infinite word over an alphabet is episturmian if its language is closed under reversal and contains for each at most one word of length which is right-special. It is strict episturmian, or Arnoux-Rauzy, if it has exactly one right-special word for each length and moreover each right-special factor is such that . Episturmian words over a binary alphabet are called Sturmian (note that when , every episturmian word is also a strict episturmian word).
In  it is proved that a finite word has at most factors that are palindromes. The defect of a finite word is defined as the difference between and the actual number of palindromes contained in . Finite words with zero defect are called rich words. If is a factor of then . In particular, factors of a rich word are rich as well. Droubay, Justin and Pirillo demonstrated that a finite word is rich if and only if any prefix of has the so-called Property Ju: the longest palindromic suffix of is unioccurrent in . A direct consequence of Property Ju is the following lemma (see [4, Lemma 4.4] for a proof).
Lemma 2 ().
Let and . Assume that is rich.
If both and occur in , then occurrences of and in alternate, in other words, any complete return word to in contains and any complete return word to in contains .
If is a factor of such that occurs in exactly once as a prefix of and occurs in exactly once as a suffix of , then is a palindrome.
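Property Ju, recalled before Lemma 2, gives a richness test that scans the prefixes of a word one by one. The following sketch is a straightforward and unoptimized illustration with hypothetical function names; the two sample words are chosen here and do not come from the text.

```python
def longest_palindromic_suffix(word: str) -> str:
    """Return the longest suffix of `word` that is a palindrome."""
    for start in range(len(word)):
        suffix = word[start:]
        if suffix == suffix[::-1]:
            return suffix
    return ""


def count_occurrences(word: str, factor: str) -> int:
    """Count (possibly overlapping) occurrences of a non-empty factor."""
    size = len(factor)
    return sum(1 for i in range(len(word) - size + 1) if word[i:i + size] == factor)


def is_rich_by_property_ju(word: str) -> bool:
    """Rich iff the longest palindromic suffix of every prefix occurs once in that prefix."""
    for length in range(1, len(word) + 1):
        prefix = word[:length]
        suffix = longest_palindromic_suffix(prefix)
        if count_occurrences(prefix, suffix) != 1:
            return False
    return True


if __name__ == "__main__":
    print(is_rich_by_property_ju("ananas"))    # True
    # The longest palindromic suffix 00 of the whole word also occurs as a prefix.
    print(is_rich_by_property_ju("00101100"))  # False
```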
The following result gives us a more precise characterization of rich words in the case of a binary alphabet.
Lemma 3 ().
Let be a finite word over . The word is not rich if and only if there exists a non-palindromic word such that and occur in .
The definition of richness can be extended to infinite words. The defect of an infinite word is defined as the number (finite or infinite) . An infinite word is rich if for any the prefix of length contains exactly different palindromes (see ). Infinite words with zero defect correspond exactly to rich words. Infinite words with finite (but not necessarily zero) defect are called almost rich. Such words have been studied in . A simple test for richness in periodic words was provided by Brlek, Hamel, Nivat and Reutenauer in [11, Theorems 4 and 6].
Theorem 4 ().
If is periodic and contains infinitely many palindromes, then there exist two palindromes such that . Moreover, is rich if its prefix of length is rich.
To test richness in general infinite words we introduce the following notion. Given an infinite word and a palindrome , we define the number of palindromic extensions of in as
The following relation between bispecial factors and richness in the language of an infinite word has been proved in [3, Theorem 11].
Theorem 5 ().
Let be such that is closed under reversal. Then is rich if and only if any bispecial factor satisfies
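The quantities entering Theorem 5 can be computed on the factor set of a long finite prefix. The sketch below rests on two assumptions, because the displayed formulas are missing here: the bilateral order of a word is taken as the number of two-sided extensions, minus the number of left extensions, minus the number of right extensions, plus one, and the condition of Theorem 5 is taken in the form known from the literature (bilateral order equal to the number of palindromic extensions minus one for palindromic bispecial factors, and zero otherwise). All function names are hypothetical, and the factor set of a finite prefix only approximates the language of the infinite word.

```python
def factors(word: str) -> set[str]:
    """All factors of a finite word, including the empty word."""
    return {word[i:j] for i in range(len(word) + 1) for j in range(i, len(word) + 1)}


def left_extensions(language: set[str], w: str, alphabet: str) -> set[str]:
    return {a for a in alphabet if a + w in language}


def right_extensions(language: set[str], w: str, alphabet: str) -> set[str]:
    return {b for b in alphabet if w + b in language}


def bilateral_order(language: set[str], w: str, alphabet: str) -> int:
    """Assumed formula: #two-sided - #left - #right + 1."""
    both = {(a, b) for a in alphabet for b in alphabet if a + w + b in language}
    left = left_extensions(language, w, alphabet)
    right = right_extensions(language, w, alphabet)
    return len(both) - len(left) - len(right) + 1


def palindromic_extensions(language: set[str], w: str, alphabet: str) -> set[str]:
    """Letters a such that a + w + a belongs to the language."""
    return {a for a in alphabet if a + w + a in language}


def satisfies_assumed_condition(language: set[str], w: str, alphabet: str) -> bool:
    """Check the assumed form of the condition of Theorem 5 for one factor w."""
    expected = len(palindromic_extensions(language, w, alphabet)) - 1 if w == w[::-1] else 0
    return bilateral_order(language, w, alphabet) == expected


if __name__ == "__main__":
    language = factors("0100101001001")  # prefix of the Fibonacci word
    for w in ["", "0"]:
        print(repr(w), bilateral_order(language, w, "01"),
              satisfies_assumed_condition(language, w, "01"))
```

Factors close to the end of the prefix may lack extensions they have in the infinite word, so only factors short relative to the prefix give reliable values.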
Extendability of finite rich words into infinite rich words was studied by Vesti. He proved the following property (see [34, Propositions 2.11 and 2.12]).
Proposition 6 ().
Let be rich. Then there exist an infinite aperiodic rich word and an infinite periodic rich word such that is a factor of both of them.
Let be two alphabets. A morphism is a monoid morphism from to . A morphism is called non-erasing if for all letters .
Let us consider a morphism . If is such that begins with and tends to infinity with , then the morphism is usually called a substitution. The unique word, denoted , which has all words as prefixes is called a fixed point of the substitution .
A morphism is called primitive if there exists an integer such that for all , the letter appears as a factor in . If is a primitive morphism, then for every fixed point of , the language is uniformly recurrent (see [31, Proposition 13]).
A morphism is right conjugate to a morphism (or equivalently is left conjugate to ) if there exists a word, called the conjugate word, such that for each . We say that is the rightmost conjugate to , if is right conjugate to and is the only right conjugate to . In this case we denote such rightmost conjugate as . Analogously we define the leftmost conjugate to and denote it .
If there exists no rightmost (resp. leftmost) conjugate to , then is conjugate to itself by a non-empty conjugate word . Such a morphism is called cyclic and there exists such that for each letter . A fixed point of a cyclic morphism is periodic (). A morphism that is not cyclic is called acyclic.
Over an alphabet we define the morphisms and for each as follows:
These morphisms together with permutations of letters in are called elementary Arnoux-Rauzy morphisms. The monoid generated by elementary Arnoux-Rauzy morphisms is called the Arnoux-Rauzy monoid. A morphism from this monoid is said to be an Arnoux-Rauzy morphism. Arnoux-Rauzy morphisms are known to send episturmian words to episturmian words (see ).
Let us consider the morphism , with defined by
Such a morphism, called the Fibonacci morphism, is an Arnoux-Rauzy morphism since it can be obtained as composition , where is the permutation interchanging the two letters. Its fixed-point is called the Fibonacci sequence.
Similarly, the morphism , with defined by
is an Arnoux-Rauzy morphism, called the Tribonacci morphism. Its fixed point is called the Tribonacci sequence.
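Prefixes of these fixed points can be generated by iterating the morphism on its first letter. The sketch below assumes the standard forms of the two morphisms over digit alphabets (Fibonacci: `0` to `01`, `1` to `0`; Tribonacci: `0` to `01`, `1` to `02`, `2` to `0`), since the displayed definitions are missing here; the letters used in the article may differ.

```python
def apply_morphism(morphism: dict[str, str], word: str) -> str:
    """Apply a morphism of the free monoid letter by letter."""
    return "".join(morphism[letter] for letter in word)


def iterate(morphism: dict[str, str], letter: str, steps: int) -> str:
    """Return the image of `letter` under `steps` applications of the morphism."""
    word = letter
    for _ in range(steps):
        word = apply_morphism(morphism, word)
    return word


# Assumed standard forms of the two morphisms.
FIBONACCI = {"0": "01", "1": "0"}
TRIBONACCI = {"0": "01", "1": "02", "2": "0"}

if __name__ == "__main__":
    print(iterate(FIBONACCI, "0", 5))   # 0100101001001
    print(iterate(TRIBONACCI, "0", 4))  # 0102010010201
```

Because the image of the first letter starts with that letter, each iterate is a prefix of the next one, so the iterates converge to the fixed point.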
A morphism over the binary alphabet is called a standard Sturmian morphism if it belongs to the monoid generated by and , where is the permutation of the two letters and is the Fibonacci morphism defined in Example 7.
The basic set in which we will look for morphisms preserving richness is the so-called Class . It was introduced in , to study relations between rich and almost rich words.
A morphism belongs to Class , if there exists a palindrome such that
is a palindromic complete return word to for each ,
for each pair , .
Clearly, if Class , then is non-erasing, as a complete return word to is longer than itself. In this paper we focus on morphisms for which . In this case, following , we call the palindrome in Definition 8 the marker of the morphism Class .
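Membership in this class can be tested directly for a given candidate marker. The sketch below reads Definition 8 as follows, which is an assumption because the formulas are missing here: for every letter, the image of the letter followed by the marker is a palindrome containing exactly two occurrences of the marker, one as a proper prefix and one as a proper suffix, and these words are pairwise distinct. Only non-empty markers are handled, the function names are hypothetical, and the sample morphisms are the assumed standard forms of the Fibonacci and elementary Arnoux-Rauzy morphisms.

```python
def count_occurrences(word: str, factor: str) -> int:
    """Count (possibly overlapping) occurrences of a non-empty factor."""
    size = len(factor)
    return sum(1 for i in range(len(word) - size + 1) if word[i:i + size] == factor)


def is_complete_return_word(candidate: str, marker: str) -> bool:
    """Exactly two occurrences of the marker: a proper prefix and a proper suffix."""
    return (
        len(candidate) > len(marker)
        and candidate.startswith(marker)
        and candidate.endswith(marker)
        and count_occurrences(candidate, marker) == 2
    )


def satisfies_definition(morphism: dict[str, str], marker: str) -> bool:
    """Check the assumed reading of Definition 8 for a non-empty palindromic marker."""
    if not marker or marker != marker[::-1]:
        return False
    extended = [image + marker for image in morphism.values()]
    if len(set(extended)) != len(extended):
        return False
    return all(w == w[::-1] and is_complete_return_word(w, marker) for w in extended)


if __name__ == "__main__":
    print(satisfies_definition({"0": "01", "1": "0"}, "0"))               # True
    print(satisfies_definition({"a": "a", "b": "ab", "c": "ac"}, "a"))   # True
    print(satisfies_definition({"a": "a", "b": "ba"}, "a"))              # False: baa is not a palindrome
```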
One can easily check that the following standard Sturmian morphisms belong to Class
and with marker ,
and with marker .
Let be positive integers, . Then the morphism
belongs to Class and the associated marker is .
For each , the elementary Arnoux-Rauzy morphism defined in (2) belongs to Class and is its marker. A morphism which permutes letters of the alphabet, i.e., a permutation on , is from Class as well, the associated marker being . The morphism does not belong to Class , but it is conjugate to Class with the conjugate word .
Each Arnoux-Rauzy morphism is conjugate to a morphism in Class .
As proven in [23, Lemma 2.5], if a morphism is conjugate to and is conjugate to , then their composition is conjugate to . Using the previous example, any elementary Arnoux-Rauzy morphism is conjugate to a morphism from Class . As shown in , the Class is closed under composition and thus any composition of elementary Arnoux-Rauzy morphisms belongs to Class . ∎
Let us list some properties of morphisms from Class . They are stated in  as Remark 5.2 and Proposition 5.3.
Lemma 13 ().
Let be morphisms in Class . Denote and the markers associated to and , respectively. Then, for any
is a palindrome if and only if is a palindrome;
is a morphism in Class and its associated marker is .
Using the previous lemma, we deduce several auxiliary useful properties.
Let be a morphism in Class with marker and .
If , then occurs in exactly times and is a prefix of .
The set of return words to in is .
If , then .
Item (1). We proceed by induction on . By definition of , the claim is valid for . Assume now that , with and . Then . By definition of , the factor is a prefix of and it has 2 occurrences in . Therefore we can write . By induction hypothesis, is a prefix of and occurs times in . Consequently, is a prefix of and has occurrences in .
Item (2). Let . First we show that for every , the word has as a prefix. Indeed, the word is longer than for some . By Item (1), the word has as a prefix. Therefore, is a prefix of as well.
Since is a prefix of and of , the word is a prefix of . By definition of , occurs twice in . This implies that is a return word to in .
Item (3). Let . Since the language is extendable and the morphism is non-erasing, there exist letters such that and . Thus is an element of . It is enough to show that has as a prefix. By Item (1), we know that has as a prefix. Since , the word starts with as well. ∎
Let be a morphism in Class with marker . Let , with , , such that .
If has as a prefix and as a suffix, then there exists a unique such that . Moreover, if an index is an occurrence of in , then , where the index is an occurrence of in .
is closed under reversal if and only if is closed under reversal.
The factor is a concatenation of return words to . By Item (2) of Lemma 14, the factor can be written as for a certain word . Since for every , such a factor is given uniquely. Moreover, Item (2) of Lemma 14 implies that occurs in only as a prefix of for some .
To prove the second part of the corollary, we first assume that is closed under reversal and let . Then there exists a factor such that occurs in . By Item (3) of Lemma 14, . Since , then according to Item (2) of Lemma 13 the factor belongs to . Obviously, occurs in and thus belongs to too.
Now, let us assume that is closed under reversal and let . Then and thus, using Item (2) of Lemma 13, . By the first statement of this corollary, belongs to . ∎
If Class , then any factor of can be decomposed (up to a short prefix and suffix) uniquely into a concatenation of return words to . Therefore, the synchronizing delay as defined in  is at most the length of a factor which does not contain . In other words, the synchronizing delay of is at most
If a morphism belongs to Class , then it is acyclic.
Assume that is cyclic and in . Then there exist words , distinct letters and positive integers , with such that and . Moreover, and are complete return words to . In particular, there exists a nonempty word such that . Hence we have which implies that appears at least thrice as a factor of , a contradiction. ∎
Let be a morphism in Class and be its marker.
If is a right conjugate to , then belongs to Class . In particular, Class .
If , then the marker is the conjugate word between and , i.e.,
First note that, by Proposition 17, the morphism is acyclic and thus and are well defined.
Item (1). Let us assume that there exists a letter such that for each . Then each has a suffix . Since is a palindrome, has a prefix . Thus is a palindrome as well and the palindrome occurs in exactly twice (otherwise we would have a third occurrence of as a factor of ). Therefore, Class and is the associated marker of . We have shown that obtained from by right conjugation using one letter belongs to Class . As each right conjugation can be obtained step by step by conjugations using one letter only, the assertion is proven.
Item (2). Since has a prefix , we can define a new morphism by for each . Obviously, is left conjugate to and is the conjugate word. Since is a palindrome, for each . By our assumption, is rightmost conjugate and thus there exist two letters, say and , such that . Therefore, and thus is the leftmost conjugate to . ∎
The following class was introduced by Hof, Knill and Simon in  in order to construct words containing infinitely many palindromic factors. Such words play an important role in the study of the spectrum of the Schrödinger operator associated with an aperiodic sequence.
We say that a morphism belongs to Class , if there exists a palindrome such that for each , where is a palindrome.
It is easy to see that any fixed point of a substitution from Class contains infinitely many palindromes. Let us explain the relation between Class and Class .
As demonstrated in [27, Lemma 20], an acyclic morphism is conjugate to a morphism from Class if and only if for each . If it is the case, the conjugation word is a palindrome. Using this result and Item of Propositions 18 and 17, we deduce that any morphism from Class is conjugate to an acyclic morphism from Class . The converse is not true. For example, the morphism given by
belongs to Class and it is acyclic. In particular,
and the conjugate word is Nevertheless, in the word occurs four times. Therefore, no conjugate of belongs to Class .
4. Marked morphisms
In our attempt to find morphisms preserving richness we will restrict to morphisms for which "desubstitution" is easy. We use marked morphisms. Our definition is slightly more general than the definition A. Frid uses in . Let and denote respectively the first and the last letter of a non-empty word .
We say that an acyclic morphism is right marked (resp. left marked) if the mapping (resp. ) is injective on .
A morphism is marked if it is both right marked and left marked.
A marked morphism is called well-marked if the mappings above are the identity on .
Let be a morphism in Class . If is right marked then it is left marked too.
This is a direct consequence of Item (2) in Proposition 18. ∎
If Class is right marked then there exists a positive integer such that is well-marked.
The statement follows simply from three facts:
The mappings and are permutations on .
If is a permutation on elements, then for , the power is the identity.
Class is closed under composition (see ).
Let and Class be a marked morphism over with marker , such that . Let such that .
If is an occurrence of the word in , then is an occurrence of in .
If is an occurrence of the word in , then is an occurrence of in .
Let us set . Let us consider a such that is a return word to in . By Item (2) of Lemma 14, the index is an occurrence of for some . Since is a palindrome, . By Item (2) of Proposition 18, . Since the mapping is injective, we have .
Now, let us assume that is an occurrence of the word in . Since has a prefix , there exists an index such that is a return word to in . By the same argument as above, is an occurrence of in for some . Obviously, and thus . ∎
Let be a marked morphism from Class with the marker such that . Denote by the permutation for each .
Let and . Then
Let be a marked morphism from Class such that and be its marker. Let .
If and is a factor of , then there exists such that .
for any . In particular, if and only if .
for any .
for any palindromic .
Item (1). Let be a bispecial factor in and a factor of . There exist two letters and , such that . We show that is a suffix of . Assume the contrary, i.e., that , with , such that the factor occurs in only once and . Let be the first letter of . Then by Lemma 23, is a prefix of , where , a contradiction with the definition of . Therefore is a suffix of . By the same argument, is also a prefix of . Corollary 15 implies that .
Item (2). Proposition 24 says that is an edge in the extension graph if and only if is an edge in . Therefore the bijection is an isomorphism of these two graphs.
Item (3). A consequence of the previous item.
Let be recurrent and and be two mutually conjugate morphisms over . Clearly, the languages of and coincide. Since the palindromic richness can be seen as a property of a language and not of an infinite word itself, it is enough to examine richness for one of these languages.
Let be a marked morphism from Class and such that its language is closed under reversal. If is rich, then is rich.
Let us recall that if a language is closed under reversal then it is recurrent. Because of Remark 26, we may assume without loss of generality that . Let denote the marker associated to . By Corollary 15, the language of is closed under reversal as well. Therefore, to demonstrate the richness of we may use Theorem 5. If , then by Item (1) of Corollary 25, . Since is rich, we have that if is a palindrome, and otherwise. Recall that by Lemma 13, is a palindrome if and only if is a palindrome. Therefore, Items (3) and (4) of Corollary 25 imply that if is a palindrome, and otherwise. Consequently, is rich. ∎
5. Morphisms preserving richness
Our aim is to study the implication opposite to Corollary 27, i.e., for which morphisms from Class , richness of forces richness of . The behaviour of morphisms from this class may differ according to the infinite word to which the morphism is applied. This is illustrated by an example of a morphism from Class which can be found in [4, Example 5.6]. This morphism maps a rich sequence over a ternary alphabet to a binary sequence with infinite palindromic defect. The same morphism also maps the Tribonacci sequence (which is rich) to a rich sequence.
() Let be an episturmian sequence and be a morphism of Class . Then the sequence is almost rich.
Here we consider arbitrary rich sequences and we map them by an Arnoux-Rauzy morphism.
Let be an Arnoux-Rauzy morphism over the alphabet and such that its language is closed under reversal. Then is rich if and only if is rich.
Since the Arnoux-Rauzy monoid is generated by elementary morphisms, let us prove the result for these morphisms.
Let us first consider the case for a certain letter . This morphism belongs to Class and is its associated palindrome. Moreover, is marked. By Corollary 27, if is rich, then is rich as well. To prove the opposite implication let us assume that is rich and let us prove that for all bispecial words in Equation (1) is satisfied. The empty word is clearly a palindrome. The only palindromic extension of the empty word is , and . The only bispecial letter is , that is clearly a palindrome. Using Items (3) and (4) of Corollary 25 we have:
Let us now consider a bispecial word of length at least . Since the only left-special (resp. right-special) letter is , we can write for a certain word . Moreover for a certain word , hence and, by Item (2) of Corollary 25, . Using Item (3) of Lemma 13, we have that is a palindrome if and only if is a palindrome. Using the same reasoning as before we find that whenever is a palindrome, then
while whenever is not a palindrome, then
Thus, according to Theorem 5, is rich.
Let us now consider the case for a certain letter . We note that is conjugate to . Due to Remark 26 we have that
Finally, let us consider a morphism permuting letters. Since the definition of richness is based on counting palindromes occurring in a factor of , permutations of letters do not change the number of palindromes and the statement of the theorem is trivially verified. ∎
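A finite sanity check of the easy direction is possible on prefixes. The sketch below is an illustration under assumptions and not a verification of Theorem 29, which concerns infinite words: it takes an elementary Arnoux-Rauzy morphism in the assumed form `0` to `0`, `1` to `01`, `2` to `02`, applies it to a prefix of the Tribonacci word written over the digits `0`, `1`, `2`, and tests the image by brute force. The image is a prefix of an episturmian word, so it is expected to be rich.

```python
def apply_morphism(morphism: dict[str, str], word: str) -> str:
    """Apply a morphism of the free monoid letter by letter."""
    return "".join(morphism[letter] for letter in word)


def is_rich(word: str) -> bool:
    """Brute-force richness test: count distinct palindromic factors."""
    palindromes = {""}
    for start in range(len(word)):
        for end in range(start + 1, len(word) + 1):
            factor = word[start:end]
            if factor == factor[::-1]:
                palindromes.add(factor)
    return len(palindromes) == len(word) + 1


# Assumed form of an elementary Arnoux-Rauzy morphism with distinguished letter 0.
PSI_0 = {"0": "0", "1": "01", "2": "02"}
# Prefix of the Tribonacci word (fourth iterate of 0 -> 01, 1 -> 02, 2 -> 0).
TRIBONACCI_PREFIX = "0102010010201"

if __name__ == "__main__":
    image = apply_morphism(PSI_0, TRIBONACCI_PREFIX)
    print(is_rich(TRIBONACCI_PREFIX))  # True
    print(image, is_rich(image))
```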
On a binary alphabet the set of morphisms from Class which preserve richness can be extended even more. Indeed, on a binary alphabet, any acyclic morphism is marked and moreover a morphism from Class