A finite word is a quasiperiod of a word if and only if each position of is covered by an occurrence of . A word with a quasiperiod is called quasiperiodic. For instance, is quasiperiodic and has two quasiperiods: and . Likewise, an infinite word may have several, or even infinitely many quasiperiods; in the latter case, we call it multi-scale quasiperiodic. The study of quasiperiodicity began on finite words in the context of text algorithms [1, 6], and was subsequently generalized to right infinite words [5, 7, 10], to symbolic dynamical systems , and to two-dimensional words  where it is a special case of the tiling problem. Finally, a previous article  provided a method to determine the set of quasiperiods of an arbitrary right infinite word. It also characterized periodic words and standard Sturmian words in terms of quasiperiods. This is interesting, because periodic words are the simplest possible infinite words, and Sturmian words are a widely studied class [7, 9, 12] which could be defined as the least complex non-periodic words. These results suggest that quasiperiodicity has some expressive power, and that the set of quasiperiods is an interesting object to study in order to get information about infinite words.
The current paper extends to the biinfinite case (-words) some results from . The motivations for this are threefold.
In the two-dimensional case, quasiperiodic -words and -words behave quite differently . This difference is not specific to the dimension , so it seems natural to start by understanding the differences in quasiperiodicity between -words and -words.
Quasiperiodicity has been considered not only on infinite words, but also on subshifts . However, the shift map does not preserve quasiperiodicity in the right infinite case, and this leads to annoying technicalities. The biinfinite case is sometimes considered more natural for subshifts because it turns the shift map into a bijection. Moreover, it also turns the shift map into a quasiperiodicity-preserving map, which makes the study of quasiperiodic subshifts much more convenient.
Finally, a previous article  gave a characterization of standard Sturmian words in terms of quasiperiods. Intuitively, the condition “standard” was only needed because of problems at the origin. By moving to the biinfinite case, we remove the origin so we can hope for a characterization of all Sturmian words. (We have not achieved this yet, but it is a possible continuation of our work.)
The current article makes a first step toward the resolution of these questions: it generalizes the method to study the set of quasiperiods of an arbitrary word from  to the biinfinite case. This is not a trivial task because, by contrast with the right infinite case, we might have several quasiperiods with the same length. (In the right infinite case, all quasiperiods are prefixes, thus there may be only one quasiperiod of a given length.) Therefore we need to determine not only the lengths of the quasiperiods, but also for each length which factors are quasiperiods and which are not.
Many natural results about quasiperiodicity on -words turned out to be surprisingly difficult to generalize to -words because of this problem. In addition to showing how to determine the set of quasiperiods of an arbitrary -word, we investigate the relations existing between two quasiperiods of the same length inside a given biinfinite word. More precisely, we show that the following conditions are decidable, given two words of the same length:
there exists a biinfinite word both and -quasiperiodic;
each -quasiperiodic biinfinite word contains infinitely many occurrences of ;
each -quasiperiodic biinfinite word is also -quasiperiodic;
in any word with quasiperiods and , the derivated sequences of and are equal.
Derivated sequences are a tool previously used to build examples and counter-examples of quasiperiodic words and to show independence results . A derivated sequence can be thought of as a normal form for quasiperiodic words. Intuitively, when two derivated sequences are equal, the considered quasiperiods contain the same information about .
Finally, we give a complete description of the set of quasiperiods of each biinfinite Sturmian word. In particular, we show that each biinfinite Sturmian word has infinitely many quasiperiods. This contrasts with the right infinite case, where two Sturmian words of each slope have no quasiperiods.
The paper is structured as follows.
In Section 2, we provide a method to study the quasiperiods of an arbitrary biinfinite word, i.e., a description of the set of quasiperiods of an arbitrary word.
In Section 3, we define three relations over couples of words: compatible, definite, and positive. Those relations are decidable by an algorithm. We show that the couple is compatible if and only if there exists a biinfinite word having both and as quasiperiods (Item 1 above). Moreover, the couple is definite and positive if and only if all -quasiperiodic words are also -quasiperiodic (Item 3).
In Section 4, we show that the couple is positive if and only if in any word which is both and -quasiperiodic, the derivated sequences along and are equal (Item 4). We also prove that is definite if and only if each -quasiperiodic word contains infinitely many copies of (Item 2).
In Section 5, we determine the set of quasiperiods of each biinfinite Sturmian word. In the process we show that all biinfinite Sturmian words have infinitely many quasiperiods.
Finally in Section 6, we conclude with a few related open questions and state our acknowledgements.
2 Determining the quasiperiods of biinfinite words
We quickly review classical definitions and notation. Let denote two finite words and a finite or infinite word. As usual, denotes the length of and the concatenation and . We note the letter of ; letters are often considered as words of length . We write for the empty word. If is of length and satisfies , then we say that is a factor of which occurs at position and which covers positions to (included). The word is a quasiperiod of if each position of is covered by an occurrence of . In particular, if is finite or right infinite, then is a prefix of . If is a word and two different letters such that and are both factors of , we say that is right special in . Symmetrically, if and are factors of , then is left special in . If is a factor of , then we say that is a successor of , and conversely that is a predecessor of in . A word has a unique successor (resp. predecessor) if and only if it is not right (resp. left) special. Finally, denotes the number of occurrences of in . Unless stated otherwise, all infinite words are biinfinite, i.e. indexed by .
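The covering condition can be tested directly on finite words. The following Python sketch is only an illustration: the function names `occurrences`, `is_quasiperiod` and `quasiperiods` and the test word are hypothetical choices made here, positions are 0-indexed, and the scan is deliberately naive (quadratic time).

```python
def occurrences(q, w):
    """Start positions of all (possibly overlapping) occurrences of q in w."""
    return [i for i in range(len(w) - len(q) + 1) if w.startswith(q, i)]


def is_quasiperiod(q, w):
    """True if every position of the finite word w is covered by an occurrence of q."""
    if not q:
        return False
    covered_up_to = 0  # first position not yet known to be covered
    for i in occurrences(q, w):
        if i > covered_up_to:
            return False  # position covered_up_to lies in a gap
        covered_up_to = i + len(q)
    return covered_up_to >= len(w)


def quasiperiods(w):
    """All quasiperiods of a finite word w (they are necessarily prefixes of w)."""
    return [w[:k] for k in range(1, len(w) + 1) if is_quasiperiod(w[:k], w)]


print(quasiperiods("abaababaaba"))
```

With this convention a finite word is always a quasiperiod of itself, so the last entry of the returned list is the word itself.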
We now have enough vocabulary to state the main theorem of , adapted to the biinfinite case.
Let denote an infinite word, a factor of and a letter.
Suppose is a quasiperiod and a factor of . The word is a quasiperiod if and only if is not right special.
Suppose is a quasiperiod and a factor of . The word is a quasiperiod if and only if is not left special.
Suppose is a quasiperiod of . The word is a quasiperiod if and only if either is not a factor of , or if occurs at least times in .
Suppose is a quasiperiod of . The word is a quasiperiod if and only if either is not a factor of , or if occurs at least times in .
A proof of Theorem 2.1 can be found in  in the right infinite case; the adaptation to the biinfinite case is immediate. That theorem basically states that it is enough to study the set of right special factors and square factors which are also prefixes to get the set of quasiperiods of a given right infinite word. As special and square factors are well-understood in combinatorics on words, it generally takes little additional work to get the set of quasiperiods of a given right infinite word. We will comment on the biinfinite version of the theorem, which we just stated, in a few paragraphs.
We can extend this theorem a bit further, but to do so we need the notion of overlap.
Let denote a finite word. An overlap of is a word having as a prefix and as a suffix, such that . More generally, a -overlap of is a word of the form , where is a -overlap and is such that is an overlap of .
The quantity is called the span of the overlap. If is fixed, then an overlap is uniquely determined by its span, thus we note the overlap of having span (if it exists). We write the -overlap built from overlaps , , etc. and we call the span of this overlap.
An overlap (without any explicit ) is thus a -overlap. An infinite word is -quasiperiodic if and only if two consecutive occurrences of in always form an overlap.
In general, we might have more than two occurrences of in an overlap of . For instance, contains occurrences of . We say that is a proper -overlap of if is a -quasiperiodic word containing exactly occurrences of . We write when we mean that is a proper -overlap of . A proper overlap is implicitly a proper -overlap.
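Overlaps and proper overlaps are easy to enumerate for a given finite word. The sketch below makes two assumptions that should be checked against the definitions above: an overlap of a word q is taken to be a word of length greater than |q| and at most 2|q| having q as a prefix and as a suffix, and its span is taken to be its length minus |q| (the offset between the two occurrences of q). The function names are hypothetical.

```python
def count_occurrences(q, w):
    """Number of (possibly overlapping) occurrences of q in w."""
    return sum(1 for i in range(len(w) - len(q) + 1) if w.startswith(q, i))


def overlap(q, span):
    """The overlap of q with the given span, or None if it does not exist.

    Assumption: the span is the offset between the two occurrences of q,
    so it ranges from 1 to len(q).
    """
    n = len(q)
    if not 1 <= span <= n:
        return None
    # q must agree with itself shifted by `span` positions
    if q[span:] != q[:n - span]:
        return None
    return q[:span] + q


def proper_spans(q):
    """Spans whose overlap contains exactly two occurrences of q."""
    spans = []
    for span in range(1, len(q) + 1):
        w = overlap(q, span)
        if w is not None and count_occurrences(q, w) == 2:
            spans.append(span)
    return spans


print(overlap("aba", 2), proper_spans("aba"), proper_spans("aaa"))
```

For instance, under these assumptions the overlap of `aaa` with span 2 is `aaaaa`, which contains three occurrences of `aaa` and is therefore not proper.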
Let denote a word and letters. If is a factor of an overlap of , then .
Let denote an overlap of ; by definition of an overlap, there exist words , (possibly empty) such that and . If is a factor of , then is a factor of . Let denote the words such that . Observe that , that is a prefix and a suffix of to conclude that . Thus we can simplify into , which implies and . ∎
Let denote an infinite word, and a quasiperiod of length of . A successor of is a quasiperiod of if and only if is not right special. A predecessor of is a quasiperiod of if and only if is not left special.
Let , denote letters and denote a word such that is a quasiperiod and a factor of . If is a quasiperiod of and is also a factor of for a letter , then is a factor of an overlap of . Lemma 2.3 shows that : a contradiction. Conversely if is not right special, then every occurrence of continues into an occurrence of ; since covers , so does . The left special case is symmetric. ∎
Theorem 2.1 and Proposition 2.4 together imply that, in order to understand the set of quasiperiods of a biinfinite word, it is enough to know its set of special factors and its set of square factors. These two types of factors are already well-studied and well-understood in combinatorics on words, therefore we can reuse this knowledge when we need to get the set of quasiperiods of an infinite word.
Proposition 2.4 has another interesting consequence: if an infinite, aperiodic word has a quasiperiod of some length , then it also has a left-special quasiperiod and a right special quasiperiod of length . More precisely, the set of quasiperiods of some length is given by a union of chains of the form , where is left special, is right special, no other is special, and is the (unique) successor of for each . If belongs to such a chain, we call its left-special predecessor and its right special successor.
After working out several examples, one may conjecture that there is at most one right special (and thus one left special) quasiperiod of a given length in any biinfinite word. In this case, there would be at most one chain of quasiperiods of a given length, so it would be easy to determine the set of quasiperiods of an arbitrary biinfinite word. Unfortunately the following example disproves this conjecture. Let , and be defined by:
where the end of each occurrence of in is shown by a space. The definition of makes it clear that is a quasiperiod of . As the excerpt of suggests, is also a quasiperiod of : since the word is ultimately periodic, the same behaviour repeats to the left and to the right. It can be directly observed in the excerpt that both and are right special. This example is the simplest “pathological case” which we mentioned in the introduction.
3 Checking implications between two quasiperiods
In this section we show that it is decidable to check, given two finite words and of the same length, which of the following is true:
Any -quasiperiodic biinfinite word is also -quasiperiodic;
there exists an infinite word which is - and -quasiperiodic, and another one which is just -quasiperiodic;
no infinite word may have both quasiperiods and at the same time.
First we develop a bit of vocabulary to state the conditions in a convenient way.
Let denote two different words of the same length and a proper overlap of . The word has at most one occurrence of .
By a classical lemma [8, Prop. 1.3.4], there exist finite words and an integer satisfying and . Moreover, is a primitive word. If it were not, call its primitive root and observe that an occurrence of would start at position in , yielding three occurrences of in , a contradiction. Additionally, we have either or . Indeed, if and , we would have and , implying , a contradiction with the definition of an overlap. We treat the cases and separately.
First, suppose . As , all occurrences of in must start at positions between and (included). Call the prefix of length of . The word is a factor of . Because is primitive, each factor of length occurs only once in , except itself [8, Prop. 1.3.2]. This means that there can only be one occurrence of , and therefore of , starting in the first letters of .
Now suppose . By the previous remarks, this implies , thus and is primitive. As a consequence, each factor of length in occurs only once, except itself (otherwise, for some finite words , and [8, Proposition 1.3.2] contradicts primitivity). In particular , if it occurs at all, occurs only once. ∎
Let denote finite nonempty words of the same length and natural integers. If the proper overlap exists and contains as a factor, then we write for the position of in ; otherwise is not defined. (Lemma 3.1 ensures that if exists, then it is unique.) If both and exist, then we define the quantity
otherwise, is undefined.
We insist on the fact that is defined only where is defined and contains an occurrence of . If is not a proper overlap (i.e. it contains more than two occurrences of , like ), then is not defined. The quantity is defined if and only if both and are. Moreover, and are not symmetric: .
Here is the intuitive interpretation of . Let denote natural integers such that is a proper -overlap of . By Lemma 3.1, the word has at most occurrences of . Suppose it has exactly two. If these two occurrences form an overlap of , then is the span of this overlap. If these two occurrences do not overlap, then there exists a nonempty word such that is a factor of ; in this case, . If has fewer than two occurrences of , then is not defined.
In Equation (1) we had and ; in this case the function is given by:
Computing given two finite words and of the same length can be done in time. For each and for each between and (included), compute and ; in each of them, test whether appears as a factor; if so, use Equation (2) to compute the value of . Otherwise, is not defined. The computation of and , and the search for , can be done in time using an optimal string-searching algorithm.
Let , be finite words and the set of integers such that the proper overlap exists and contains one occurrence of . Then, for all in , the following equation holds:
In particular, for all integers in we have: ; the relation implies that ; and the relation implies that .
Since contains exactly one occurrence of for all , Lemma 3.1 implies that contains exactly two occurrences of for all . As a consequence, is always defined. Conversely, is not defined if , since does not contain two occurrences of .
Now we have enough machinery to state conditions on which characterize situations where -quasiperiodicity implies -quasiperiodicity, or implies non--quasiperiodicity.
Let denote finite nonempty words of the same length. The couple is:
compatible if there exist integers such that is defined;
definite if is defined wherever is;
positive if is defined at least on one couple and is nonnegative wherever it is defined.
Since is computable in time , those relations are testable with the same time complexity.
Let denote two finite, nonempty words of the same length.
The couple is non-compatible if and only if -quasiperiodicity implies non--quasiperiodicity.
The couple is definite and positive if and only if -quasiperiodicity implies -quasiperiodicity.
The couple is compatible, but not definite positive if and only if there exists a biinfinite word with quasiperiods and , and another biinfinite word with only quasiperiod .
We prove the three statements separately.
Statement 1. Let denote a -quasiperiodic word. It contains a factor of the form . By hypothesis is not defined, which means that contains either or occurrences of . By Lemma 3.1, at least one position in is not covered by , so is not -quasiperiodic.
Conversely, suppose that each -quasiperiodic word is non--quasiperiodic. Let denote a pair of integers such that the proper -overlap exists and consider the infinite periodic word given by . Since is not -quasiperiodic, either or (or both) contains fewer than two occurrences of . In other terms, either or is not defined, and by Lemma 3.3 the other one is not defined either. Since this reasoning holds for any where exists, the function is nowhere defined.
Statement 2. Suppose is definite and positive and consider a -quasiperiodic biinfinite word. Any position in is covered by an occurrence of ; let denote the integers such that this occurrence is the middle one in the proper -overlap . By hypothesis, is defined and positive, so contains a proper overlap of . Lemma 3.1 implies that this proper overlap of covers the middle occurrence of . Consequently, any position in is covered by an occurrence of .
Conversely, suppose that -quasiperiodicity implies -quasiperiodicity. Let denote an arbitrary pair of integers such that the proper -overlap exists and denote the periodic biinfinite word given by . By hypothesis this word is -quasiperiodic, so by Lemma 3.1 the word contains a proper overlap of . Consequently, is defined and positive.
Statement 3. The proof is immediate as this statement exhausts all possibilities not covered by Statements 1 and 2.
4 On compatible and positive couples of quasiperiods
In this section, we investigate what the property “compatible and positive” implies for a couple of words (not necessarily definite). We get a characterization in terms of derivated sequences, and another one in terms of chains of quasiperiods.
The concept of derivated sequence originates from Mouchard’s work on quasiperiodic finite words , and was later used by Marcus and Monteil to establish independence results between quasiperiodicity and other properties on right infinite words . We start by recalling the definition.
Let denote a biinfinite word and one of its quasiperiods. The sequence of positions of in is the sequence of positions of occurrences of in , in increasing order, such that is the position of the leftmost occurrence covering the position . If is the sequence of positions of in , then is called the derivated sequence of along .
For example, in Equation (1), the derivated sequence of along is and the derivated sequence along is . Observe that a word is -quasiperiodic if and only if its derivated sequence along is bounded by . In this case, the derivated sequence contains enough information to reconstruct the initial word.
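On a finite word, the same idea can be sketched in a few lines of Python. This is a finite-word analogue only, under the assumption (consistent with the bound stated above) that the derivated sequence lists the differences between consecutive positions of the quasiperiod; the indexing convention around position 0 of a biinfinite word is ignored, and the function names and the test word are hypothetical.

```python
def occurrences(q, w):
    """Start positions of all (possibly overlapping) occurrences of q in w."""
    return [i for i in range(len(w) - len(q) + 1) if w.startswith(q, i)]


def derivated_sequence(q, w):
    """Differences between consecutive positions of q in the finite word w."""
    positions = occurrences(q, w)
    return [b - a for a, b in zip(positions, positions[1:])]


def rebuild(q, gaps):
    """Rebuild a q-quasiperiodic finite word from q and a list of gaps.

    Each gap must lie between 1 and len(q); this sketch does not check
    that the gap is the span of an actual overlap of q.
    """
    for d in gaps:
        if not 1 <= d <= len(q):
            raise ValueError("gap out of range for a quasiperiodic word")
    # each occurrence contributes its first `d` letters, the last one is whole
    return "".join(q[:d] for d in gaps) + q


w = "abaababaaba"
print(derivated_sequence("aba", w), derivated_sequence("abaaba", w))
print(rebuild("aba", derivated_sequence("aba", w)) == w)
```

In this toy example both quasiperiods allow the word to be rebuilt from their own sequence of gaps, which illustrates why a derivated sequence behaves like a normal form.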
Chains of quasiperiods were already mentioned in Section 2. Recall the following consequence of Theorem 2.1 and Proposition 2.4: in an infinite word , the set of quasiperiods of some length is given by a union of chains of the form , where is left special, is right special, no other is special, and is the (unique) successor of for each . Equation (1) shows an example of a word having two such chains for length .
Let denote a biinfinite word and , denote two quasiperiods of of the same length. The following statements are equivalent:
the couple is compatible and positive;
for every word having quasiperiods and , the derivated sequences along and are equal;
for every word having quasiperiods and , those quasiperiods belong to the same chain.
We actually prove something slightly more precise: the next proposition implies Theorem 4.2.
Let denote a biinfinite word and , denote two quasiperiods of of the same length. The following statements are equivalent:
for all integers such that the proper -overlap exists and is a factor of , we have ;
the derivated sequences of along and along are equal up to a shift of one position;
the quasiperiods and belong to the same chain in .
The next lemma gives in Proposition 4.3, because and are nonnegative integers. However it is actually more general and we will also reuse it later in the proof.
Let denote an infinite word and , two quasiperiods of of the same length. The derivated sequences of and in are the same if and only if either: for each pair of natural integers such that is defined, we have ; or for each such pair , we have .
Call the sequence of positions of in , and similarly the sequence of positions of in ; observe that . The fact that and respectively translate to
By replacing in the previous equation, we get the two possibilities
which both imply that the derivated sequences are equal up to a shift. The converse argument works symmetrically. ∎
The next lemma proves in Proposition 4.3, which is the most technical part.
Let denote an infinite word and , two quasiperiods of of the same length. Suppose that for each pair of integers such that the proper -overlap exists and is a factor of , we have . Then the derivated sequences of along and along are identical, up to a shift of one position.
Let denote all the integers such that the proper overlap exists and is a factor of ; sort the by increasing length. If are integers such that the proper -overlap exists and is a factor of , then we call an occurring couple.
We only need to prove that either for each such that is an occurring couple, we have ; or that for each such , we have . Indeed, if for all (or the opposite one), then for each occurring couple we have ; by Lemma 3.3 we deduce that ; subsequently Lemma 4.4 shows that this is sufficient to finish our proof. Therefore now we argue that for each such that is an occurring couple.
Let and all the integers such that is an occurring couple; sort the by increasing length (in particular is a subsequence of ). The couple is not necessarily an occurring couple, but Lemma 3.3 guarantees that is well-defined and that . We can assume that ; if it is not the case, then by Lemma 3.3 and without loss of generality we consider the function instead of . Now reason by contradiction and consider the smallest integer such that . By hypothesis we can rule out , so three cases remain to be analysed.
Case 1. If , then use Lemma 3.3 to write , which is equivalent to . The quantity is negative so we have , which is a contradiction since is the smallest possible span.
Case 2. If , then consider the -quasiperiodic word whose derivated sequence is ; it would have and . Thus , but Lemma 3.3 implies : we have a contradiction.
Case 3. Finally, suppose we have . Recall that is minimal and that . By Lemma 3.3 we have , and by Lemma 3.3 the relations and imply that as well. Therefore we have four -overlaps of with spans ; ; ; and ; all with the same induced -overlap, which has span .
From we deduce that and have a common factor . In either case, the middle occurrence of is contained in this factor and the last occurrence of starts in this factor. This situation is displayed in Figure 2. There exist words such that is a suffix of , and is a suffix of . Call the words satisfying , and without loss of generality suppose that . Observe that both and are suffixes of the same word , and . As are also both prefixes of , we deduce that has a period . From the same argument proves that there exist words such that is a prefix of , and is a prefix of . Call the words satisfying , and without loss of generality suppose that . Then remark that has a period . In order to simplify notation in the rest of the proof, let and .
We have . Therefore . Since is a prefix and a suffix of , by a length argument has a non-empty suffix which is a prefix of ; call it . We have . By the Fine-Wilf Theorem (see [8, Prop. 1.3.5 and its proof]), has a period of length . A period of is a suffix of and a period of is a prefix of ; by a divisibility argument, each of these periods is itself -periodic, therefore is -periodic. Besides, observe that is a prefix and a suffix of , therefore is -periodic as well.
Let us show that, for all , we have . The only case where this could fail is if and do not belong to the same occurrence of nor to the same occurrence of in . Since , the only possible span for the -overlap (and the -overlap) covering positions and is . But then, Figure 2 shows that the prefix of length of and the prefix of length of are equal; since the latter one is -periodic, the former one is as well. By a length argument, in any case and is periodic: a contradiction. ∎
Finally, the next lemma gives in Proposition 4.3.
Let denote a biinfinite word and , denote two quasiperiods of of the same length. The derivated sequences of along and along are equal if and only if, up to swapping and , there exists a chain of quasiperiods with and , such that is the successor of .
Let and denote the sequences of positions of and in . If the derivated sequences are equal up to a shift of one position, then there exists an integer in such that for all we have . In particular always starts at the same position inside . Put differently, this means that there exists a word of length such that is a suffix of and each occurrence of in is the prefix of an occurrence of . Set and the implication is proved.
Conversely suppose that there is a family of quasiperiods such that and and is the unique successor of for each . By Proposition 2.4, none of the is right special except maybe . Therefore, there is an occurrence of exactly positions after each occurrence of in . Lemma 3.1 ensures that no other occurrences of appear in , therefore we can conclude that for each integer . ∎
Finally, the next proposition characterizes definite couples of quasiperiods.
Let denote two words of the same length. The couple is definite if and only if each -quasiperiodic biinfinite word contains infinitely many occurrences of .
If is definite, then is defined whenever the proper -overlap exists; therefore each proper overlap of contains one occurrence of . If a biinfinite word is -quasiperiodic, then it contains infinitely many occurrences of proper overlaps of , and therefore infinitely many occurrences of .
Conversely, suppose that each -quasiperiodic infinite word contains infinitely many occurrences of . Suppose that is an integer such that the proper overlap exists, but does not contain an occurrence of . Then the periodic biinfinite word given by does not contain any occurrence of ; but it should also contain infinitely many occurrences of by hypothesis. Therefore we have a contradiction and each proper overlap contains an occurrence of , which shows that is definite. ∎
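The argument above suggests a direct test for definiteness: enumerate the proper overlaps of the first word and check that each of them contains the second word. The Python sketch below follows that reading of the proof and is not taken from the article; it assumes that the span of an overlap is the offset between its two occurrences, and the names `proper_overlaps` and `is_definite` are hypothetical.

```python
def count_occurrences(q, w):
    """Number of (possibly overlapping) occurrences of q in w."""
    return sum(1 for i in range(len(w) - len(q) + 1) if w.startswith(q, i))


def proper_overlaps(q):
    """All overlaps of q that contain exactly two occurrences of q."""
    n = len(q)
    result = []
    for span in range(1, n + 1):
        if q[span:] == q[:n - span]:  # the overlap with this span exists
            w = q[:span] + q
            if count_occurrences(q, w) == 2:
                result.append(w)
    return result


def is_definite(q, r):
    """True if every proper overlap of q contains an occurrence of r.

    Assumption: this is the characterization of a definite couple (q, r)
    used in the proof above, for two words of the same length.
    """
    return all(r in w for w in proper_overlaps(q))


print(is_definite("aab", "aba"), is_definite("aba", "bab"))
```

For example, the only proper overlap of `aab` is `aabaab`, which contains `aba`, whereas the proper overlap `abaaba` of `aba` does not contain `bab`; accordingly the periodic word obtained by repeating `aba` has no occurrence of `bab`.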
5 Quasiperiods of biinfinite Sturmian words
If is an infinite word (either indexed by or by ), then denotes the number of distinct factors of length in and is the complexity function of . An infinite word is Sturmian if and only if it is not ultimately periodic and satisfies for each integer . Equivalently, a word is Sturmian if and only if it is not eventually periodic and has exactly one right special factor and one left special factor of each length. Sturmian words are an important and well-studied class of infinite words (see [9, Chapter 2] and [2, Chapter 6]). Now we determine the set of quasiperiods of any biinfinite Sturmian word. To this end, if is a biinfinite word, let denote the number of quasiperiods of length in .
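These two characterizations can be observed experimentally on the Fibonacci word, a standard example of a Sturmian word. The Python sketch below works on a finite prefix only, so it approximates the infinite word; the prefix length, the range of tested factor lengths and all function names are choices made for this illustration.

```python
def fibonacci_prefix(iterations):
    """Prefix of the Fibonacci word obtained by iterating a -> ab, b -> a on 'a'."""
    w = "a"
    for _ in range(iterations):
        w = "".join("ab" if c == "a" else "a" for c in w)
    return w


def factors(w, n):
    """Set of distinct factors of length n occurring in the finite word w."""
    return {w[i:i + n] for i in range(len(w) - n + 1)}


def right_special(w, n):
    """Factors of length n followed by at least two different letters in w."""
    longer = factors(w, n + 1)
    return {u for u in factors(w, n)
            if sum(u + c in longer for c in set(w)) >= 2}


def left_special(w, n):
    """Factors of length n preceded by at least two different letters in w."""
    longer = factors(w, n + 1)
    return {u for u in factors(w, n)
            if sum(c + u in longer for c in set(w)) >= 2}


w = fibonacci_prefix(12)  # a prefix of length 377
for n in range(1, 11):
    print(n, len(factors(w, n)), len(right_special(w, n)), len(left_special(w, n)))
```

A finite prefix can only miss factors of the infinite word, never add new ones, so the counts computed this way are lower bounds; the prefix is chosen long enough for the small lengths tested here.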
Let denote a biinfinite Sturmian word and a nonnegative integer.
We have if and only if has a nonempty bispecial factor of length .
If and denotes the shortest bispecial factor of with , then quasiperiods of length in are exactly the factors of length in .
We prove the two statements separately.
Statement 1. If has a nonempty bispecial factor of length , say , then there exists a letter such that the (unique) right special factor of length in is , because any suffix of a right special factor is also right special. By Theorem 2.1 the word is not a quasiperiod of ; by Proposition 2.4 if had a quasiperiod of length , then it would have a right special quasiperiod of this length. Consequently has no quasiperiod of length .
Conversely suppose that has no bispecial factor of length . Since is Sturmian, it has exactly one right-special and one left-special factor of length , so its set of factors of length may be written , where is right special and has successors and ; both and have successor , which is left special; each other , and has respectively , and as a (unique) successor. We have , but we might have or . Figure 3 shows a graph of the “successor” relation. Observe that and (otherwise, we would have a bispecial factor of length ). As a consequence, the maximal distance between two consecutive occurrences of is , which is bounded by . In other terms, is a quasiperiod of .
Statement 2. Let denote a biinfinite Sturmian word and suppose that for some integer . The left special and the right special factors of length of , call them and , are both quasiperiods by Proposition 2.4. Call the shortest factor of having as a prefix and as a suffix. The set of factors of length of is given by a sequence , where and , such that is the successor of for each . By Proposition 2.4 again, the set is the set of quasiperiods of length of . Observe that is, by definition, exactly the shortest bispecial factor of not shorter than . ∎
Since any Sturmian word has infinitely many bispecial factors, and the differences between the lengths of consecutive bispecial factors are unbounded, we have:
Each biinfinite Sturmian word has infinitely many quasiperiods. Moreover, is unbounded.
As explained in the introduction, the biinfinite case may give nicer results about quasiperiodicity of subshifts and of Sturmian words. This paper provided a toolbox to study the quasiperiods of biinfinite words, but many questions are still to be answered.
An -word is periodic if and only if for each large enough