1 Introduction
A binary word of length n is prefix normal if, for all k, no substring of length k has more 1s than the prefix of length k. For example, the following are the 14 prefix normal words of length 5: 00000, 10000, 10001, 10010, 10100, 10101, 11000, 11001, 11010, 11011, 11100, 11101, 11110, 11111.
The word 10110, for instance, is not prefix normal, because the substring 11 has more 1s than the prefix 10; similarly, 10011 is not prefix normal, because the substring 11 has more 1s than the prefix of length 2. The number of prefix normal words for n = 1, 2, ..., 10 is 2, 3, 5, 8, 14, 23, 41, 70, 125, 224, respectively.
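The definition can be checked directly by brute force; the following small sketch (the function names are ours, not from the paper) reproduces the counts listed in OEIS sequence A194850:

```python
from itertools import product

def is_prefix_normal(w: str) -> bool:
    """For every length k, no substring of length k may contain
    more 1s than the prefix of length k."""
    n = len(w)
    for k in range(1, n + 1):
        prefix_ones = w[:k].count("1")
        max_ones = max(w[i:i + k].count("1") for i in range(n - k + 1))
        if max_ones > prefix_ones:
            return False
    return True

def pnw(n: int) -> int:
    """Count prefix normal words of length n by exhaustive search."""
    return sum(is_prefix_normal("".join(bits))
               for bits in product("01", repeat=n))
```

This quadratic-per-length test is only meant to make the definition concrete; the paper's later sections develop far more efficient machinery.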
This enumeration sequence is included as sequence A194850 in The On-Line Encyclopedia of Integer Sequences (OEIS) [30], where further values are listed. It is not difficult to show that pnw(n), the number of prefix normal words of length n, grows exponentially. Some bounds and partial enumeration results were presented in [12], and it was conjectured there that pnw(n) = 2^(n - Theta(log^2 n)). This conjecture was recently proved by Balister and Gerke in [5]. Finding a closed form formula or generating function for pnw(n), however, remains an open problem.
Prefix normal words were originally introduced in [18] by two of the current authors, in the context of Binary Jumbled Pattern Matching (BJPM): Given a binary string of length n and a pair of non-negative integers (x, y), decide whether the string has a substring with x 1s and y 0s. While the online version of this problem can be solved naively with a sliding window, the indexed version has attracted much attention during the past decade [8, 9, 24, 22, 3, 13, 4, 14, 2, 21, 23, 17, 1]. As was shown in [18, 12], every binary word can be assigned two canonical prefix normal words, called its prefix normal forms, which can then be used to answer BJPM queries in constant time.
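The constant-time query mechanism rests on an interval property: sliding a fixed-length window changes its 1-count by at most one, so the achievable 1-counts of windows of a given length form an interval. The following hedged sketch (all names are ours; the precomputation here is naive and quadratic, unlike the sophisticated indexes cited above) illustrates the idea:

```python
def bjpm_index(s: str):
    """For every window length m, record the minimum and maximum
    number of 1s over all substrings of s of length m."""
    n = len(s)
    ones = [0] * (n + 1)  # prefix sums of 1s
    for i, c in enumerate(s):
        ones[i + 1] = ones[i] + (c == "1")
    lo, hi = {}, {}
    for m in range(1, n + 1):
        counts = [ones[i + m] - ones[i] for i in range(n - m + 1)]
        lo[m], hi[m] = min(counts), max(counts)
    return lo, hi

def has_substring(lo, hi, x: int, y: int) -> bool:
    """O(1) BJPM query: is there a substring with x 1s and y 0s?
    By the interval property it suffices to test membership in
    [lo[m], hi[m]] for the window length m = x + y."""
    m = x + y
    return m in lo and lo[m] <= x <= hi[m]
```

The maximum-ones values hi[m] are exactly the prefix weights of one of the two prefix normal forms mentioned above, which is how prefix normal forms enter the picture.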
1.1 Our contributions
In this paper, we deal with the question of generating all prefix normal words of a given length n. In combinatorial generation, the aim is to exhaustively list all instances of a combinatorial object. Typically, the number of these instances grows exponentially, and time is measured per object, excluding the time needed to output the objects. For an introduction to combinatorial generation, see [26].
The current best generation algorithm for prefix normal words runs in O(n) time per word [15]. Our algorithm improves on this considerably, using amortized O(log^2 n) time per word.^1 [Footnote 1: This algorithm was originally presented in [11], where we proved that it ran in amortized O(n) time per word, and conjectured an amortized polylogarithmic running time per word. Based on the result of [5] on the asymptotic number of prefix normal words, we have been able to prove the amortized O(log^2 n) running time per word.] It is based on the theory of bubble languages [27, 28, 31], an interesting class of binary languages defined by the following property: L is a bubble language if, for every word in L, replacing the first occurrence of 01 (if any) by 10 results in another word in L [27, 28]. Many important languages are bubble languages, including binary necklaces, Lyndon words, and k-ary Dyck words.^2 [Footnote 2: Those languages are actually first-10 bubble, while prefix normal words are first-01 bubble: the difference is simply exchanging the roles of 0 and 1 in the definition, see [27, 28].] A generic generation algorithm for bubble languages was given in [28], yielding cool-lex Gray codes for each subset of a bubble language containing all strings of a fixed length and weight (number of 1s). In general, a (combinatorial) Gray code is an exhaustive listing of all instances of a combinatorial object such that successive objects in the listing are "close" in some well-defined sense. In the case of cool-lex order, successive strings differ by at most two swaps. The generic algorithm's efficiency depends only on a language-dependent subroutine called an oracle, which in the best case leads to CAT (constant amortized time) generation algorithms.
In the following, we show that the set of all prefix normal words forms a bubble language; it is the first new and interesting language shown to be a bubble language since the original exposition [27, 28]. We develop an oracle for prefix normal words and apply the generic generation algorithm to obtain a cool-lex ordering of prefix normal words with given length n and weight k. Concatenating these lists in increasing order of weight, we obtain a Gray code for all prefix normal words of length n in which successive words differ by at most two swaps or by a swap and a bit flip. We then present an optimized oracle for prefix normal words, and, based on recent results from [5], we prove that our new generation algorithm runs in amortized O(log^2 n) time per word. Even though the previous O(n) time per word algorithm of [15] also provided a Gray code for prefix normal words (albeit with respect to a different measure of closeness), we are achieving a very considerable improvement in running time.
As an example, the listing of prefix normal words of length 7 that results from our algorithm, partitioned by weight, is given in Table 1.
0000000  1000000  1010000  1101000  1101100  1110110  1110111  1111111 
1001000  1010100  1110100  1111010  1111011  
1000100  1100100  1101010  1101101  1111101  
1000010  1010010  1100110  1110101  1111110  
1000001  1100010  1110010  1101011  
1100000  1010001  1101001  1110011  
1001001  1010101  1111001  
1100001  1100101  1111100  
1110000  1100011  
1110001  
1111000 
A second contribution of this paper is a new characterization of bubble languages. We show that bubble languages can be described in terms of a closure property in the computation tree of a simple recursive generation algorithm for all binary strings. We believe that this view could aid other researchers in applying the powerful tool of bubble languages and their accompanying Gray codes. In fact, it was the discovery that prefix normal words formed a bubble language that led to an efficient generation algorithm and Gray code for our language.
The final part of the paper deals with membership testing, i.e. deciding whether a given binary word is prefix normal. Several quadratic-time membership testers for prefix normal words were given in [18, 12]. The best worst-case time tester can be obtained by using the connection to Indexed Binary Jumbled Pattern Matching (BJPM), for which the current best algorithm, by Chan and Lewenstein, runs in truly subquadratic time [13]. We present a new membership tester for prefix normal words which applies a simple two-phase approach and is conjectured to run in average-case O(n) time, where the average is taken over all words of length n.
1.2 Related work
In addition to the connection to jumbled indexing, prefix normal words are also increasingly being studied for their own sake. Enumeration and language-theoretic results were given by Burcsi et al. in [12], and Balister and Gerke strengthened some of these results in [5]: in particular, they proved a conjecture about the asymptotic growth of the number of prefix normal words and gave a new result about the size of the prefix normal equivalence classes. Cicalese et al. gave a generation algorithm with linear running time per word in [15], and studied infinite prefix normal words in [16]. Prefix normal words and prefix normal forms have been applied to a certain family of graphs by Blondin Massé et al. [7], and were shown to pertain to a new class of languages connected to the Reflected Binary Gray Code by Sawada et al. [29]. Very recently, Fleischmann et al. presented further results on the size of the equivalence classes in [FKNP20].
1.3 Overview
The paper is organized as follows. In Section 2, we give the necessary terminology and some basic facts about prefix normal words, and we develop a result on the average critical prefix length of a prefix normal word. This result will later be used in the analysis of our generation algorithm. In Section 3, we give a simple generation algorithm which, based on the result of [5], is proved to run in amortized O(n) time per word. In Section 4, we present our novel view of bubble languages. In Section 5, we introduce our new generation algorithm, which uses the bubble framework. In Section 6, we present the new membership tester. We close with some open problems in Section 7.
2 Preliminaries
A binary word (or string) w over Sigma = {0,1} is a finite sequence of elements from Sigma. Its length is denoted by |w|, and the i-th symbol of w by w_i, for 1 <= i <= |w|. We denote by Sigma^n the set of words over Sigma of length n, by Sigma* the set of all words over Sigma, and by epsilon the empty word. Let w be a word. If w = uv for some words u, v, we say that u is a prefix of w and v is a suffix of w. A substring of w is a prefix of a suffix of w. A binary language is any subset of Sigma*. We denote by |w|_c the number of occurrences in w of the character c in {0,1}. The number of 1s in w, |w|_1, is also called the weight of w. For a binary language L, let L(n) denote the subset of all strings in L with length n, and L(n,k) that of all strings in L with length n and weight k.
We denote by swap(w, i, j) the string obtained from w by exchanging the characters in positions i and j.
We define combinatorial Gray codes, following [26, ch. 5]: Given a set of combinatorial objects S and a relation ~ on S (the closeness relation), a combinatorial Gray code for S is a listing s_1, s_2, ..., s_m of the elements of S such that s_i ~ s_{i+1} for i = 1, ..., m-1. If we also require that s_m ~ s_1, then the code is called cyclic.
2.1 Prefix normal words
Let w be a binary word. For each 0 <= i <= |w|, we define P(w, i) = |w_1 ... w_i|_1, the weight of the prefix of length i, and F(w, i) = max{ |u|_1 : u is a substring of w of length i }, the maximum weight of length-i substrings of w. The function F is sometimes called the maximum-ones function, while in the context of compact data structures, the function P is often called rank [25].
Definition 2.1
A word w is called prefix normal if for all 1 <= i <= |w|, F(w, i) <= P(w, i) (equivalently, F(w, i) = P(w, i), since the prefix of length i is itself a substring of length i). We denote by L_PN the language of prefix normal words, and by pnw(n) the number of prefix normal words of length n.
In [18, 12] it was shown that for every word w there exists a unique word w', called its prefix normal form, such that for all i, F(w, i) = F(w', i), and w' is prefix normal. We give the formal definition:
Definition 2.2
Given a word w, the prefix normal form of w, denoted PNF(w), is the prefix normal word w' given by P(w', i) = F(w, i), for 0 <= i <= |w|. Two words w, v are prefix normal equivalent if PNF(w) = PNF(v).
As an example, the word w = 10011 has the maximum-ones function F(w, 1), ..., F(w, 5) = 1, 2, 2, 2, 3, as can be checked easily. It is furthermore not difficult to see that for all i >= 1: F(w, i) - F(w, i-1) is 0 or 1. Thus the sequence of first differences of F yields a binary word, in this case the word 11001 = PNF(10011). In Table 2 we list all prefix normal words of length 5, each followed by the set of binary words with this prefix normal form.
w  Words with this prefix normal form  w  Words with this prefix normal form

00000  {00000}  11001  {11001, 10011}
10000  {10000, 01000, 00100, 00010, 00001}  11010  {11010, 10110, 01101, 01011}
10001  {10001}  11011  {11011}
10010  {10010, 01001}  11100  {11100, 01110, 00111}
10100  {10100, 01010, 00101}  11101  {11101, 10111}
10101  {10101}  11110  {11110, 01111}
11000  {11000, 01100, 00110, 00011}  11111  {11111}
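The first-difference construction of the prefix normal form can be sketched as follows (a naive quadratic-time computation of the maximum-ones function; the helper name is ours):

```python
def prefix_normal_form(w: str) -> str:
    """Compute PNF(w): the i-th bit is F(w, i) - F(w, i-1), where
    F(w, i) is the maximum number of 1s over substrings of length i."""
    n = len(w)
    F = [0] * (n + 1)
    for i in range(1, n + 1):
        F[i] = max(w[j:j + i].count("1") for j in range(n - i + 1))
    # the first differences of F form a binary word, the prefix normal form
    return "".join("1" if F[i] > F[i - 1] else "0" for i in range(1, n + 1))
```

A word is prefix normal exactly when this function returns the word itself.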
The next lemma lists some properties of prefix normal words which will be needed in the following. Proofs can be found in [12].
Lemma 2.3 ([12])
Let w be a binary word.

1. w is prefix normal if and only if all of its prefixes are prefix normal.

2. If w is prefix normal, then so is w0.

3. Let w be prefix normal. Then the word w1 is prefix normal if and only if, for every suffix v of w, |v1|_1 <= P(w1, |v| + 1).

4. w is prefix normal if and only if PNF(w) = w.
We refer the interested reader to [12] for more on prefix normal words.
2.2 Critical prefix length
It will often be useful to write binary words as w = 1^s 0^t gamma, where s, t >= 0 and gamma is either the empty word or a binary word beginning with 1. In other words, s is the length of the first, possibly empty, run of 1s, t is the length of the first run of 0s, and gamma is the remaining, possibly empty, suffix. Note that this representation is unique.
Definition 2.4
Let w = 1^s 0^t gamma, where s, t >= 0 and gamma is in 1{0,1}* or empty. We refer to 1^s 0^t as w's critical prefix, and denote by cr(w) = s + t the critical prefix length of w.
For example, the critical prefix length of 1100101 is 4, that of 0001011 is 3, and that of 1111111 is 7.
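Computing the critical prefix length (the length s + t of the first, possibly empty, run of 1s followed by the first run of 0s) is a single left-to-right scan; a minimal sketch:

```python
def critical_prefix_length(w: str) -> int:
    """cr(w) = s + t for the unique factorization w = 1^s 0^t gamma,
    where gamma is empty or begins with 1."""
    n, s = len(w), 0
    while s < n and w[s] == "1":   # length of the first run of 1s
        s += 1
    t = s
    while t < n and w[t] == "0":   # end of the first run of 0s
        t += 1
    return t                       # position after 1^s 0^t equals s + t
```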
Lemma 2.5
The expected critical prefix length of a binary string of length n is 3 - (n + 3)/2^n.
Proof. Let w = 1^s 0^t gamma, with s, t >= 0, be chosen uniformly at random from {0,1}^n, and let X be a random variable with X(w) = cr(w). We have that cr(w) = n if and only if w = 1^a 0^(n-a), with 0 <= a <= n, so cr(w) = n for n + 1 words. Otherwise cr(w) = c for some c < n, in which case w = 1^a 0^(c-a) 1 gamma' with 0 <= a < c and gamma' in {0,1}^(n-c-1); there are c * 2^(n-c-1) such words for each c. Therefore

E[X] = sum_{c=1}^{n-1} c^2 * 2^(-c-1) + n(n+1) * 2^(-n) = 3 - (n^2 + 2n + 3) * 2^(-n) + (n^2 + n) * 2^(-n) = 3 - (n+3)/2^n,

where we have used in the last two equations that sum_{c>=1} c^2 * 2^(-c-1) = 3, and that the tail of this infinite sum has the closed form sum_{c>=n} c^2 * 2^(-c-1) = (n^2 + 2n + 3)/2^n.
Remark: It can be shown in a similar way that the expected critical prefix length of a randomly chosen infinite binary word is 3.
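The expectation can be confirmed by exhaustive computation over all words of a given length; the closed form 3 - (n + 3)/2^n asserted below is our own calculation, checked here by brute force:

```python
from itertools import product

def critical_prefix_length(w: str) -> int:
    """Length of the first run of 1s plus the first run of 0s."""
    s = 0
    while s < len(w) and w[s] == "1":
        s += 1
    t = s
    while t < len(w) and w[t] == "0":
        t += 1
    return t

def expected_cpl(n: int) -> float:
    """Average critical prefix length over all 2^n binary words."""
    total = sum(critical_prefix_length("".join(b))
                for b in product("01", repeat=n))
    return total / 2**n
```

As n grows, the value rapidly approaches the limit 3 of the infinite case.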
Lemma 2.6
The sequence S(n) = sum over all words w of length n of cr(w) obeys the recurrence S(n+1) = 2 * S(n) + n + 2.
Proof. Consider all strings of length n and what happens to their critical prefix length if one character is appended. For a string w with cr(w) < n, the critical prefix length stays the same, and since w yields the two new strings w0 and w1, it is counted twice. The remaining strings are either of the form 1^a 0^(n-a) with a < n, in which case appending a 0 will increase the critical prefix length by 1, and appending a 1 will not; there are n many of these. Else the string is 1^n, in which case appending a 0 or a 1 will both increase the critical prefix length by 1. So altogether we get S(n+1) = 2 * S(n) + n + 2.
Incidentally, the sequence S(n), n >= 1, is listed in the OEIS [30], along with a recurrence of order two. That S(n) also obeys the first-order recurrence of Lemma 2.6 can be seen by computing the difference S(n+2) - S(n+1) and substituting the recursive formula of order two for both terms.
Next we show that the expected critical prefix length of the prefix normal form of a randomly chosen word is O(log n). Note that this is not the same as the expected critical prefix length of a random prefix normal word, due to the fact that the equivalence class sizes of the prefix normal equivalence vary considerably (see [18], and Thm. 2 in [5]).
Lemma 2.7
Given a random word w in {0,1}^n, let w' be the prefix normal form of w. Then the expected critical prefix length of w' is O(log n).
Proof. Let w' = PNF(w) = 1^s 0^t gamma, with the r.v.'s S = s and T = t. It is known that the expected maximum length of a run in a random word of length n is O(log n) [19]. Clearly, S equals the length of the longest run of 1s of w, thus E[S] = O(log n). What about T? Consider a run of 1s of w of maximum length S. If w has more than S 1s, then there is a substring of w consisting of this run and one more 1; the number of 0s in this substring is an upper bound on T. Since these 0s form one single run, their number is again O(log n) in expectation. If w has exactly S 1s, then all of its 1s form one single run, and T can only be bounded by n - S. However, the number of words with at most one run of 1s is O(n^2), so the contribution of these words to E[T] is O(n^3 / 2^n) and hence negligible. So we have E[cr(w')] = E[S] + E[T] = O(log n).
It is easy to see that the number of prefix normal words grows exponentially (just note that 1^(ceil(n/2)) v is prefix normal for every v of length floor(n/2), whence pnw(n) >= 2^(floor(n/2))). Balister and Gerke [5] recently proved a conjecture from [12] about the asymptotic number of prefix normal words:
Theorem 2.8 ([5], Thm. 1)
The number of prefix normal words of length n is pnw(n) = 2^(n - Theta(log^2 n)).
We will use this theorem to prove an upper bound on the expected value of cr(w) for prefix normal words w. We need the following lemma.
Lemma 2.9
Let c > 0 be a constant such that pnw(n) >= 2^(n - c log^2 n), and let k = k(n) = c log^2 n + 3 log n. Let X be a random variable taking the values cr(w) for uniformly random prefix normal words w of length n. Then E[X] <= k + o(1).
Proof. Consider all prefix normal words w with cr(w) <= k. These contribute at most k to E[X]. Now consider all prefix normal words with cr(w) > k. There are at most (k+2) * 2^(n-k-1) binary words with cr(w) > k, since these words must begin with one of the k + 2 patterns 1^a 0^(k+1-a), 0 <= a <= k + 1, and therefore there are at most this many prefix normal words with cr(w) > k. Each prefix normal word can only contribute at most n to the average. So the contribution to the average, summed over all prefix normal words with cr(w) > k, is at most n(k+2) * 2^(n-k-1) / pnw(n), which is o(1), since pnw(n) >= 2^(n - c log^2 n) and k = c log^2 n + 3 log n, and hence negligible: E[X] <= k + o(1).
Theorem 2.10
The expected length of the critical prefix of a prefix normal word of length n is O(log^2 n).
3 A Simple Generation Algorithm for Prefix Normal Words
Our first generation algorithm uses Lemma 2.3: (1) A word is prefix normal if and only if all of its prefixes are prefix normal; (2) if w is prefix normal, then so is w0, but not necessarily w1; and (3) w1 is prefix normal if and only if, for every suffix v of w, the number of 1s in v1 is at most P(w1, |v| + 1). Words w for which w1 is not prefix normal are called extension critical. Thus, whether a word w is extension critical can be tested in time linear in |w|.
We can therefore generate all prefix normal words of length n by iteratively generating all prefix normal words of length i, for i = 1, ..., n-1, and extending each one by a 0 if it is extension critical, or by both a 0 and a 1 if it is not. This yields a computation tree whose leaves are precisely the prefix normal words of length n. We refer to this algorithm as the Simple Generation Algorithm.
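A sketch of the Simple Generation Algorithm, with the extension-critical test of Lemma 2.3(3) checking every suffix of w1 against the prefix function (the naming is ours):

```python
def extension_critical(w: str) -> bool:
    """w (assumed prefix normal) is extension critical iff w + '1'
    is not prefix normal: some suffix of w + '1' has more 1s than
    the prefix of the same length."""
    v = w + "1"
    P, ones = [], 0
    for c in v:                       # prefix weights of v
        ones += (c == "1")
        P.append(ones)
    ones = 0
    for l, c in enumerate(reversed(v), start=1):
        ones += (c == "1")            # 1s in the suffix of length l
        if ones > P[l - 1]:
            return True
    return False

def simple_generation(n: int):
    """Extend every prefix normal word by 0 (always) and by 1
    (only if it is not extension critical)."""
    words = [""]
    for _ in range(n):
        nxt = []
        for w in words:
            nxt.append(w + "0")
            if not extension_critical(w):
                nxt.append(w + "1")
        words = nxt
    return words
```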
Theorem 3.1
The Simple Generation Algorithm generates all prefix normal words of length n in amortized O(n) time per word.
Proof. Notice that an extension critical test is performed in each inner node, taking O(i) time if the node is at depth i. An inner node at depth i corresponds to a prefix normal word of length i, so the number of tests equals the total number of prefix normal words of length smaller than n. From Theorem 2.8 (Balister and Gerke, 2019 [5]) it follows that most prefix normal words are not extension critical; in particular, more than half of all prefix normal words of a given length can be extended by both a 0 and a 1. Therefore, pnw(i+1) >= (3/2) * pnw(i), and by induction pnw(i) <= (2/3)^(n-i) * pnw(n), implying that the total time taken by the algorithm to generate all prefix normal words of length n is O(sum_{i<n} i * pnw(i)) = O(n * pnw(n)).
4 Bubble Languages and Combinatorial Generation
In this section we give a brief introduction to bubble languages, mostly summarising results from [27, 28]. However, our presentation is different in that it presents the generation of a bubble language as a restriction of an algorithm for generating all binary words. This view also yields a new characterization of bubble languages in terms of the computation tree of this generation algorithm (Prop. 4.2).
Definition 4.1 ([27, 28])
A language L contained in {0,1}* is called a first-01 bubble language if, for every word w in L, exchanging the first occurrence of 01 (if any) by 10 results in another word in L. It is called a first-10 bubble language if, for every word w in L, exchanging the first occurrence of 10 (if any) by 01 results in another word in L. If not further specified, by bubble language we mean first-01 bubble.
For example, the languages of binary Lyndon words and necklaces are bubble languages. As was shown in [27], a language is a bubble language if and only if each of its fixed-weight subsets is a bubble language. This implies that for generating a bubble language, it suffices to generate its fixed-weight subsets.
Next we consider combinatorial generation of binary strings. Let b be a binary string of length n, let k be its weight, and let p_1 < p_2 < ... < p_k denote the positions of the 1s in b. Clearly, we can obtain b from the word 1^k 0^(n-k) with the following algorithm: first swap the last 1 with the 0 in position p_k, then swap the (k-1)-st 1 with the 0 in position p_(k-1), etc. Note that every 1 is moved at most once, and in particular, once the i-th 1 is moved into position p_i, the suffix b_(p_i) ... b_n remains fixed for the rest of the algorithm.
These observations lead to the Recursive Swap Generation Algorithm (Algorithm 1). Starting from the string b = 1^s 0^t gamma, it generates recursively all length-n binary strings with the same weight and fixed suffix gamma. The call RecursiveSwap(k, n-k) generates all binary strings of length n with weight k. The algorithm swaps the last 1 of the first run of 1s with each of the 0s of the first run of 0s, thereby generating a new string each time, for which it makes a recursive call. During the execution of the algorithm, the current string resides in a global array B = b_1 ... b_n. The function Swap(i, j) swaps the values stored in b_i and b_j. In the subroutine Visit() we can print the contents of this array, or increment a counter, or check some property of the current string. Crucially, Visit() is called on every string exactly once.
Let T(n, k) denote the computation tree of RecursiveSwap(k, n-k). As an example, Fig. 1 illustrates such a computation tree (ignore for now the highlighted words). In slight abuse of notation, in the following we identify a node with the string it represents. The depth of T(n, k) equals k, the number of 1s, while the maximum degree (number of children) is n-k, the number of 0s. Consider the subtree rooted at b = 1^s 0^t gamma: its depth is s and the maximum degree of its nodes is t; the number of children of b itself is exactly t (provided s >= 1), and b's i-th child is 1^(s-1) 0^i 1 0^(t-i) gamma. Note that the suffix 1 0^(t-i) gamma remains unchanged in the entire subtree rooted at this child; that this subtree is isomorphic to the computation tree of 1^(s-1) 0^i; and that the critical prefix length strictly decreases along any downward path in the tree. The algorithm performs a post-order traversal of the tree, yielding a listing of the strings of length n with weight k, in what is referred to as cool-lex order [31, 28, 27].
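The recursive description above can be sketched as follows (a transcription in which all names are ours); the post-order visits yield cool-lex order, in which successive strings differ by at most two swaps:

```python
def recursive_swap_order(n: int, k: int):
    """Visit all binary strings of length n and weight k by a
    post-order traversal of the recursive-swap computation tree."""
    B = [1] * k + [0] * (n - k)
    out = []

    def rec(s: int, t: int):
        # current string is 1^s 0^t gamma; swap the last 1 of the
        # first run with each 0 of the first 0-run, recurse, undo
        if s > 0:
            for i in range(1, t + 1):
                B[s - 1], B[s - 1 + i] = B[s - 1 + i], B[s - 1]
                rec(s - 1, i)
                B[s - 1], B[s - 1 + i] = B[s - 1 + i], B[s - 1]
        out.append("".join(map(str, B)))  # Visit() in post-order

    rec(k, n - k)
    return out
```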
We can express the property of bubble language in terms of the computation tree as follows:
Proposition 4.2
A language L is a bubble language if and only if, for every n and k, its fixed-density subset L(n, k) is closed w.r.t. parents and left siblings in the computation tree T(n, k) of the Recursive Swap Generation Algorithm. In particular, if L(n, k) is non-empty, then it forms a subtree of T(n, k) rooted in 1^k 0^(n-k).
Proof. Follows immediately from the definition of bubble languages.
Using Prop. 4.2, the Recursive Swap Generation Algorithm can be applied to generate any fixed-weight bubble language L(n, k), as long as we have a way of deciding, for a node b already known to be in L, which is its rightmost child (if any) that is still in L. If such a child exists, and it is the j-th child 1^(s-1) 0^j 1 0^(t-j) gamma, then the bubble property ensures that all children to its left are also in L. Thus, the for-loop in Algorithm 1 can simply be replaced by "for i = 1, ..., j".
The framework provided in [27, 28] to list the strings in L(n, k) for a given bubble language L can thus be viewed as a restriction of the Recursive Swap Generation Algorithm: Given a string b = 1^s 0^t gamma in L, compute the largest integer j such that 1^(s-1) 0^j 1 0^(t-j) gamma is in L, in other words, the index of the rightmost child of node b which is still in L, called the bubble upper bound.^3 [Footnote 3: In [27, 28], actually a "bubble lower bound" is computed. Because we feel it simplifies the discussion, here we introduce a related value called the "bubble upper bound". The bubble lower bound is equal to t minus the bubble upper bound.] This simple framework is outlined in Algorithm 2 for a given bubble language L, where the current word is stored globally in the array B. The function Oracle(s, t) returns the bubble upper bound for the current word with respect to L. The membership tester Member(B) returns true if and only if B is in L. The initial call is GenBubble(k, n-k) with B initialized to 1^k 0^(n-k).
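Specialized to prefix normal words, the framework can be sketched as follows (a hedged illustration using a naive membership tester in place of a true oracle; by the bubble property the children inside the language form a contiguous prefix, so we may stop at the first failed test):

```python
def is_prefix_normal(w: str) -> bool:
    n = len(w)
    return all(max(w[i:i + k].count("1") for i in range(n - k + 1))
               <= w[:k].count("1") for k in range(1, n + 1))

def gen_bubble_pn(n: int, k: int):
    """Post-order traversal of the subtree of prefix normal words
    rooted at 1^k 0^(n-k); yields a fixed-weight cool-lex listing."""
    B = [1] * k + [0] * (n - k)
    out = []

    def rec(s: int, t: int):
        if s > 0:
            for i in range(1, t + 1):
                B[s - 1], B[s - 1 + i] = B[s - 1 + i], B[s - 1]
                ok = is_prefix_normal("".join(map(str, B)))
                if ok:
                    rec(s - 1, i)
                B[s - 1], B[s - 1 + i] = B[s - 1 + i], B[s - 1]
                if not ok:
                    break      # no later sibling can be in the language
        out.append("".join(map(str, B)))

    rec(k, n - k)
    return out
```

Concatenating the lists for all weights lists every prefix normal word of length n exactly once.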
It was further shown in [27] that cool-lex order, the order in which the generic algorithm visits the strings of L(n, k), gives a Gray code. This can be seen on the tree as follows:
Lemma 4.3
Let b be a node in the computation tree T(n, k). Then each of the following can be obtained from b by a single swap operation: (a) any sibling of b, (b) the parent of b, and (c) any node on the leftmost path in the subtree rooted in b.
Proof. Let u and v be siblings, and let their parent be 1^s 0^t gamma. Then there exist i different from j such that u = 1^(s-1) 0^i 1 0^(t-i) gamma and v = 1^(s-1) 0^j 1 0^(t-j) gamma. Then v = swap(u, s+i, s+j), while the parent is swap(u, s, s+i). For (c), let b = 1^s 0^t gamma; the nodes on the leftmost path in the subtree rooted at b are of the form u = 1^(s-m) 0 1^m 0^(t-1) gamma for some 1 <= m <= s; then u = swap(b, s-m+1, s+1).
We now report the main result on bubble languages from [27, 28], for which we give a proof using Prop. 4.2.
Proposition 4.4 ([27, 28])
Any fixed-length bubble language L(n), where 1^k 0^(n-k) is in L for all k with L(n, k) non-empty, can be generated such that subsequent strings differ by at most two swaps, or by a swap and a bit flip. Given a membership tester Member() which runs in f(n) time, this generation algorithm takes amortized O(f(n)) time per word.
Proof. For a fixed-weight subset L(n, k), let T_L denote the subtree of T(n, k) corresponding to L(n, k). Note that in a post-order traversal of T_L, the string visited immediately after a node b is either the leftmost descendant in T_L of the next sibling of b, or, if b is the last child, the parent of b. By Prop. 4.2, we have that the leftmost descendant in T_L of any node lies on the leftmost path of the subtree of T(n, k) rooted at that node. Thus, by Lemma 4.3, the next string can be reached in one or two swaps.
By concatenating the lists for weights k = 0, 1, ..., n, the procedure GenerateAll() shown in Algorithm 3 will exhaustively list L(n) for a given bubble language L. To see the Gray code property, notice that for any weight k, the last string visited is 1^k 0^(n-k), while the first string visited for the next weight k+1 is the leftmost descendant of the root 1^(k+1) 0^(n-k-1), i.e. a string of the form 1^(k+1-m) 0 1^m 0^(n-k-2) for some m >= 0, which is one swap and one bit flip away from 1^k 0^(n-k).
For the running time, notice that for a node b in L, we do at most j + 1 membership tests, where j is the bubble upper bound for b. The j successful tests can be charged to the children of b, while the possible last unsuccessful test can be charged to b itself.
Remark: It is even possible to give a cyclic Gray code for L(n), by listing the fixed-weight subsets first by the odd weights (increasing), followed by the even weights (decreasing).
The oracle of Algorithm 2 applies a simple membership tester to compute the bubble upper bound for a given node b in L. However, we do not actually need a general membership tester, since all we want to know is which of the children of a node already known to be in L are in L; moreover, the membership tester is allowed to use other information, which it can build up iteratively while examining earlier nodes. In the next section, we will apply this method to the language of prefix normal words.
5 A Gray Code for Prefix Normal Words
In this section, we prove that the set of prefix normal words is a bubble language. Then, using the bubble framework and applying a basic quadratic-time membership tester, we show how to generate all words in L_PN(n, k) in Gray code order. By concatenating the lists together for all weights in increasing order, we obtain an algorithm to list L_PN(n) as a Gray code in amortized O(n^2) time per word. By then providing an enhanced membership tester for prefix normal words specific to the bubble framework, we further show how this Gray code can be generated in amortized O(log^2 n) time per word.
Theorem 5.1
L_PN is a bubble language.
Proof. Let w be a prefix normal word containing an occurrence of 01, and let w' be the word obtained from w by replacing the first occurrence of 01 with 10. Then w = 1^s 0^t 1 gamma and w' = 1^s 0^(t-1) 1 0 gamma, for some s, t >= 1 (note that s >= 1, since a prefix normal word containing a 1 must begin with 1). Let v' be a substring of w'. We have to show that |v'|_1 <= P(w', |v'|).

Note that for any i, P(w', i) >= P(w, i). In fact, P(w', s+t) = P(w, s+t) + 1 = s + 1, and for every i different from s+t, P(w', i) = P(w, i). Now if v' does not contain the position s+t of the moved 1, then v' is contained in 1^s 0^(t-1) or in 0 gamma. In the first case, v' is a substring of w, and thus |v'|_1 <= P(w, |v'|) <= P(w', |v'|). In the second case, v' is either a substring of gamma, hence of w, and we are done as before, or v' = 0 delta with delta a prefix of gamma; then |v'|_1 = |delta|_1 <= P(w, |delta|) <= P(w', |v'|). If v' contains both positions s+t and s+t+1, then the substring v of w occupying the same positions satisfies |v|_1 = |v'|_1, since the exchanged 0 and 1 both lie within it; thus again |v'|_1 <= P(w, |v'|) <= P(w', |v'|). It remains to consider substrings v' ending in position s+t, i.e. v' = alpha 1 with alpha a suffix of 1^s 0^(t-1). If alpha contains no 1, then |v'|_1 = 1 <= P(w', |v'|), since s >= 1. Otherwise alpha = 1^j 0^(t-1) for some 1 <= j <= s, so |v'|_1 = j + 1 and |v'| = j + t. If j = s, then v' is the prefix of w' of length s + t, and there is nothing to show. If j < s, then P(w', j+t) = P(w, j+t) = min(j+t, s) >= j + 1, since t >= 1 and j <= s - 1. Considering all cases, w' is prefix normal.
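Theorem 5.1 can also be checked exhaustively for small lengths (a brute-force sanity check; the helper names are ours):

```python
from itertools import product

def is_prefix_normal(w: str) -> bool:
    n = len(w)
    return all(max(w[i:i + k].count("1") for i in range(n - k + 1))
               <= w[:k].count("1") for k in range(1, n + 1))

def bubble_step(w: str):
    """Replace the first occurrence of '01' by '10' (None if no '01')."""
    i = w.find("01")
    return None if i < 0 else w[:i] + "10" + w[i + 2:]

def bubble_closed(n: int) -> bool:
    """Check the first-01 bubble property on all prefix normal words
    of length n."""
    for bits in product("01", repeat=n):
        w = "".join(bits)
        if is_prefix_normal(w):
            w2 = bubble_step(w)
            if w2 is not None and not is_prefix_normal(w2):
                return False
    return True
```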
Since there is a membership tester for prefix normal words that runs in O(n^2) time, e.g. as described in Algorithm 4, the aforementioned Gray codes for both L_PN(n, k) and L_PN(n) can be generated in amortized O(n^2) time per word (Prop. 4.4). We show the computation tree in Fig. 1, with prefix normal words highlighted in bold. The complete listing for n = 7 is given in Table 1.
5.1 A More Efficient Approach
Now we develop a more efficient membership tester for L_PN that is specific to the tests required by an oracle for bubble languages. In particular, membership tests are only made on strings of the form 1^(s-1) 0^i 1 0^(t-i) gamma, given that 1^s 0^t gamma is in L_PN.
Lemma 5.2
Let w = 1^s 0^t gamma be a word in L_PN, where s >= 2 and t >= 1. Let w' = 1^(s-1) 0^i 1 0^(t-i) gamma for some 1 <= i <= t. Then w' is not in L_PN if and only if either

1. gamma contains a substring with s many 1s and length at most s + i - 1, or

2. there is a prefix delta of gamma with 1 + |delta|_1 > P(w', 1 + (t - i) + |delta|).
Proof. The proof is illustrated in Fig. 2. Note that P(w', s-1+i) = s-1; more generally, P(w', l) = l for l <= s-1, P(w', l) = s-1 for s-1 <= l <= s+i-1, P(w', l) = s for s+i-1 < l <= s+t, and P(w', l) = s + P(gamma, l-s-t) for l > s+t.

(<=) The prefix of w' of length s-1+i has s-1 many 1s. If condition 1 holds, then w' contains a substring of length s-1+i, or less, with at least s many 1s, so w' is not in L_PN. Similarly, if condition 2 holds, then the substring 1 0^(t-i) delta of w' has more 1s than the prefix of w' of the same length, so again w' is not in L_PN.

(=>) Assume w' is not in L_PN, and let v be a substring of w' with |v|_1 > P(w', |v|). First suppose v does not contain the moved 1 in position s+i. Then v cannot be contained in 1^(s-1) 0^i, whose substrings are all prefix-dominated, so v is contained in 0^(t-i) gamma, which is also a substring of w. Since w is in L_PN, we get |v|_1 <= P(w, |v|), and since P(w', l) < P(w, l) only for s <= l <= s+i-1, where P(w', l) = s-1, it follows that |v|_1 = s and |v| <= s+i-1; the minimal substring of gamma containing the 1s of v then witnesses condition 1. Now suppose v contains the moved 1, i.e. v = alpha 1 beta with alpha a suffix of 1^(s-1) 0^i and beta a prefix of 0^(t-i) gamma. If alpha contains a 1, then alpha = 1^j 0^i with 1 <= j <= s-1, and v has the same length and the same number of 1s as the substring 1^(j+1) 0^t delta of w, where beta = 0^(t-i) delta; a violation would thus require |v| <= s+i-1 and |v|_1 >= s, which, with |v| = j+t+1+|delta| and |v|_1 = j+1+|delta|_1, forces t <= i-1, contradicting i <= t. Thus alpha consists of 0s only, and dropping these leading 0s only strengthens the violation, so we may assume v = 1 0^b or v = 1 0^(t-i) delta with delta a prefix of gamma. Since s >= 2, a substring 1 0^b satisfies |1 0^b|_1 = 1 <= P(w', 1), so v = 1 0^(t-i) delta and 1 + |delta|_1 > P(w', 1 + (t-i) + |delta|), i.e. condition 2 holds.
In order to apply Lemma 5.2 quickly, the oracle needs fast access to information about the suffix gamma, such as the length of a shortest substring of gamma containing s many 1s; let phi denote this value. By maintaining phi as GenBubble iterates through the prefix normal words, we can apply the previous lemma to optimize a membership tester. Pseudocode is given in Algorithm 5. This function requires the passing of the variables s and t from the function Oracle, recalling that the current string is stored globally.