On Greedy Algorithms for Binary de Bruijn Sequences

02/23/2019 ∙ by Zuling Chang, et al. ∙ 0

We propose a general greedy algorithm for binary de Bruijn sequences, called the Generalized Prefer-Opposite (GPO) Algorithm, and its modifications. By identifying specific feedback functions and initial states, we demonstrate that most previously known greedy algorithms that generate binary de Bruijn sequences are particular cases of our new algorithm.




I Introduction

There are $2^{2^{n-1}-n}$ binary de Bruijn sequences of order $n$ [1]. Each has period $2^n$, in which every binary $n$-tuple occurs exactly once.

Relatively fast generation of a few de Bruijn sequences has a long history in the literature. Any primitive polynomial of degree $n$ generates a maximal length sequence (also known as an $m$-sequence) that can be easily modified to a de Bruijn sequence [2]. There are at least three different algorithms to generate a Ford sequence, which is the lexicographically smallest de Bruijn sequence. They were treated, respectively, by Fredricksen and Maiorana in [3], by Fredricksen in [4], and by Ralston in [5]. There are frameworks for construction via successor rules. More recent necklace-based successor rules and their co-necklace variants can be found in several works, e.g., [6], [7], [8], and [9].

Many, often less efficient, methods that produce much larger numbers of de Bruijn sequences are known. The majority of them work by joining smaller cycles into de Bruijn sequences. Prominent examples include methods to join cycles generated by pure cycling registers or pure summing registers given by Fredricksen in [10], Etzion and Lempel in [11], and Huang in [12]. Examples of recent works on more general component cycles are the work of Li et al. [13] and Chang et al. [14].

The focus of this work, however, is on greedy algorithms to generate de Bruijn sequences. Many such algorithms are known, e.g., Prefer-One and Prefer-Same in [4] and Prefer-Opposite in [15]. They have since been generalized using the notion of preference functions in [16]. A paper of Wang et al. [17], which was recently presented at SETA 2018, discusses a greedy algorithm based on the feedback function for .

In general one can come up with numerous feedback functions to generate de Bruijn sequences using greedy algorithms. It had unfortunately been rather challenging to confirm which ones of these functions actually work. We show how to circumvent this by specifying feedback functions that come with a certificate of correctness.

We state the Generalized Prefer-Opposite (GPO) Algorithm and prove sufficient conditions on the feedback functions to ensure that the algorithm indeed generates de Bruijn sequences. This leads us to numerous classes of special feedback functions that can be used to generate de Bruijn sequences via the algorithm. As a corollary, we show that Prefer-One, Prefer-Zero, as well as others based on preference functions are special cases of the GPO algorithm.

To include even more classes of feedback functions, we put forward suitable modifications of the GPO Algorithm. Several new families of de Bruijn sequences can then be generated in a greedy manner.

After this introduction come preliminary notions and results in Section II. We introduce the GPO Algorithm in Section III. Sufficient conditions for the algorithm to produce de Bruijn sequences are then proved. The three subsections describe three respective families of de Bruijn sequences, with some analysis of their properties. Section IV shows how to modify the GPO Algorithm when the sufficient condition is not met. The modification results in numerous instances of successful construction of de Bruijn sequences. Three more families of such sequences are then showcased. We end with a conclusion that, for any de Bruijn sequence of order $n$ and any $n$-string initial state, one can always find a feedback function that the GPO Algorithm can take as an input to produce that sequence.

II Preliminaries

Let be integers. We denote by and by . An $n$-stage shift register is a clock-regulated circuit with $n$ consecutive storage units. Each of the units holds a bit and, as the clock pulses, the bit is shifted to the next stage in line. The output is a new bit $s_n$ based on the $n$ bits $\mathbf{s}_0 = s_0, s_1, \ldots, s_{n-1}$ called the initial state. The corresponding feedback function $f(x_0, x_1, \ldots, x_{n-1})$ is the Boolean function that, on input $\mathbf{s}_0$, outputs $s_n$.

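With a function $f(x_0, x_1, \ldots, x_{n-1})$, a feedback shift register (FSR) outputs a sequence $\mathbf{s} = s_0, s_1, \ldots$ satisfying $s_{n+\ell} = f(s_\ell, s_{\ell+1}, \ldots, s_{\ell+n-1})$ for $\ell \geq 0$. Let $N$ be the smallest positive integer satisfying $s_{i+N} = s_i$ for all $i \geq 0$. Then $\mathbf{s}$ is $N$-periodic, or has period $N$, and one writes $\mathbf{s} = (s_0, s_1, \ldots, s_{N-1})$. We call $\mathbf{s}_i = s_i, s_{i+1}, \ldots, s_{i+n-1}$ the $i$-th state of $\mathbf{s}$ and states $\mathbf{s}_{i-1}$ and $\mathbf{s}_{i+1}$ the predecessor and successor of $\mathbf{s}_i$, respectively.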
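The recurrence defining an FSR sequence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `fsr_stream` and `fsr_period` are hypothetical, and the feedback function $x_0 + x_1$ with $n = 3$ (the recurrence of the primitive polynomial $x^3 + x + 1$) is chosen here purely as a small example.

```python
from itertools import islice


def fsr_stream(f, init):
    """Yield s_0, s_1, ... where s_{l+n} = f(s_l, ..., s_{l+n-1})."""
    state = tuple(init)
    while True:
        yield state[0]
        # Shift left by one position and append the feedback bit.
        state = state[1:] + (f(*state),)


def fsr_period(f, init):
    """Return the least N > 0 such that the state equals `init` again after N steps.

    Raises ValueError when `init` does not lie on a cycle of the state
    graph, which can only happen for a singular feedback function.
    """
    start = tuple(init)
    state = start
    for steps in range(1, 2 ** len(start) + 1):
        state = state[1:] + (f(*state),)
        if state == start:
            return steps
    raise ValueError("initial state is not on a cycle")


def xor_first_two(x0, x1, x2):
    """Illustrative non-singular feedback function x0 + x1 over GF(2)."""
    return x0 ^ x1


# One period of the sequence started at state 001, followed by its period.
print(list(islice(fsr_stream(xor_first_two, (0, 0, 1)), 7)))
print(fsr_period(xor_first_two, (0, 0, 1)))
```

Starting from a nonzero state, this particular function yields a sequence of period $7 = 2^3 - 1$, while the all-zero state is a fixed point.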

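An $n$-string $c_0, c_1, \ldots, c_{n-1}$ is often written concisely as $c_0 c_1 \ldots c_{n-1}$, especially in algorithms and tables. The complement $\overline{c}$ of $c \in \{0, 1\}$ is $1 + c$, computed modulo $2$. The complement of a string is the corresponding string produced by taking the complement of each element. The string of zeroes of length $\ell$ is denoted by $0^{\ell}$. Analogously, $1^{\ell}$ denotes the string of ones of length $\ell$. Given a state $\mathbf{c} = c_0, c_1, \ldots, c_{n-1}$, we say that $\widehat{\mathbf{c}} = \overline{c_0}, c_1, \ldots, c_{n-1}$ and $\widetilde{\mathbf{c}} = c_0, \ldots, c_{n-2}, \overline{c_{n-1}}$ are, respectively, the conjugate state and the companion state of $\mathbf{c}$.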
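As a small illustration, the following Python sketch encodes states as tuples of bits and assumes the usual convention that the conjugate flips the first bit of a state while the companion flips the last bit. The function names are hypothetical and chosen only for this example.

```python
def complement(bits):
    """Complement every bit of a string given as a tuple of 0/1 values."""
    return tuple(1 - b for b in bits)


def conjugate(state):
    """Flip the first bit of the state."""
    return (1 - state[0],) + tuple(state[1:])


def companion(state):
    """Flip the last bit of the state."""
    return tuple(state[:-1]) + (1 - state[-1],)


def possible_successors(state):
    """The two states that may follow `state` in any binary sequence."""
    return {tuple(state[1:]) + (b,) for b in (0, 1)}


c = (0, 1, 1, 0)
print(conjugate(c), companion(c), complement(c))
# A state and its conjugate share the same two possible successors,
# and those two successors are companions of each other.
print(possible_successors(c) == possible_successors(conjugate(c)))
```

The last line reflects the fact used repeatedly in the proofs: conjugate states compete for the same pair of companion successors.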

A function $f(x_0, x_1, \ldots, x_{n-1}) = x_0 + g(x_1, \ldots, x_{n-1})$, where $g$ is a Boolean function, is said to be non-singular. An FSR with a non-singular feedback function will generate periodic sequences [2, p. 116]. Otherwise $f$ is said to be singular and some states would have two distinct preimages under $f$.

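The state graph of the FSR with feedback function $f$ is a directed graph $\mathcal{G}_f$ whose vertices are all of the $n$-stage states. There is a directed edge from a state $\mathbf{u} = u_0, u_1, \ldots, u_{n-1}$ to a state $\mathbf{v} = u_1, u_2, \ldots, f(u_0, u_1, \ldots, u_{n-1})$. We call $\mathbf{u}$ a child of $\mathbf{v}$ and $\mathbf{v}$ the parent of $\mathbf{u}$. We allow $\mathbf{u} = \mathbf{v}$, in which case $\mathcal{G}_f$ contains a loop. A leaf in $\mathcal{G}_f$ is a vertex with no child, i.e., a leaf has outdegree $1$ and indegree $0$. We say that a vertex $\mathbf{w}$ is a descendant of $\mathbf{u}$ if there is a directed path that starts at $\mathbf{w}$ and ends at $\mathbf{u}$, which in turn is called an ancestor of $\mathbf{w}$. A rooted tree $T_{f,\mathbf{b}}$ in $\mathcal{G}_f$ is the largest tree in $\mathcal{G}_f$ in which one vertex $\mathbf{b}$ has been designated the root with the edge that emanates out of $\mathbf{b}$ removed from $\mathcal{G}_f$. In this work the orientation is towards the root $\mathbf{b}$, i.e., $T_{f,\mathbf{b}}$ is an in-tree or an arborescence converging to a root [18, Chapter 6].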
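A state graph of this kind is easy to build explicitly for small $n$. The sketch below is an illustration only; it assumes that each state points to its unique successor under the feedback function (its parent) and that the children of a state are its preimages. The names `state_graph`, `leaves`, and `constant_zero` are hypothetical.

```python
from itertools import product


def state_graph(f, n):
    """Return (parent, children) maps of the state graph of the FSR with feedback f.

    parent[u] is the unique state reached from u in one step;
    children[v] lists every state u whose parent is v.
    """
    states = list(product((0, 1), repeat=n))
    parent = {u: u[1:] + (f(*u),) for u in states}
    children = {v: [] for v in states}
    for u, v in parent.items():
        children[v].append(u)
    return parent, children


def leaves(children):
    """States with no child, i.e. indegree 0 in the state graph."""
    return sorted(v for v, kids in children.items() if not kids)


def constant_zero(*bits):
    """Illustrative singular feedback function that always outputs 0."""
    return 0


parent, children = state_graph(constant_zero, 3)
print(leaves(children))          # every state ending in 1 is a leaf here
print(children[(0, 0, 0)])       # the all-zero state is its own child (a loop)
```

With the constant-zero function every state ending in $1$ is a leaf, and every other state has exactly two children.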

III Generalized Prefer-Opposite Algorithm

This section describes the GPO algorithm and proves sufficient conditions on the feedback function to guarantee that the algorithm generates de Bruijn sequences.

For a given feedback function $f$ and initial state $\mathbf{b}$, the GPO algorithm is presented here as Algorithm 1. Notice that it does not always produce de Bruijn sequences. Theorem 2 provides sufficient conditions on the feedback function and the initial state for the GPO algorithm to generate a de Bruijn sequence of order $n$.

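Input: A feedback function $f(x_0, x_1, \ldots, x_{n-1})$ and an initial state $\mathbf{b} = b_0, b_1, \ldots, b_{n-1}$.
Output: A binary sequence.
$\mathbf{c} = c_0, c_1, \ldots, c_{n-1} \gets \mathbf{b}$
do
     Print($c_0$)
     $y \gets f(c_0, c_1, \ldots, c_{n-1})$
     if $c_1, c_2, \ldots, c_{n-1}, \overline{y}$ has not appeared before then
          $\mathbf{c} \gets c_1, c_2, \ldots, c_{n-1}, \overline{y}$
     else
          $\mathbf{c} \gets c_1, c_2, \ldots, c_{n-1}, y$
while $\mathbf{c} \neq \mathbf{b}$
Algorithm 1 Generalized Prefer-Opposite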
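The following Python sketch is one possible reading of Algorithm 1, not the authors' reference implementation. It assumes that the rule prefers the complement of the feedback bit whenever the resulting state is new, and it adds a step bound (an assumption of this sketch) so that inputs for which the run never returns to the initial state raise an error instead of looping forever. The names `gpo` and `is_de_bruijn` are hypothetical.

```python
def gpo(f, init):
    """Run the Generalized Prefer-Opposite rule from the initial state `init`.

    At each step the complement of y = f(state) is appended if that yields
    a state not seen before; otherwise y itself is appended.
    """
    start = tuple(init)
    n = len(start)
    state = start
    seen = {start}
    out = []
    for _ in range(2 ** n):  # safety bound added for this sketch
        out.append(state[0])
        y = f(*state)
        preferred = state[1:] + (1 - y,)
        state = preferred if preferred not in seen else state[1:] + (y,)
        seen.add(state)
        if state == start:
            return out
    raise ValueError("run did not return to the initial state within 2^n steps")


def is_de_bruijn(seq, n):
    """Check that the cyclic sequence has length 2^n and contains every n-tuple once."""
    length = len(seq)
    windows = {tuple(seq[(i + j) % length] for j in range(n)) for i in range(length)}
    return length == 2 ** n and len(windows) == 2 ** n


# Constant-zero feedback with the all-zero initial state, order 3.
sequence = gpo(lambda *bits: 0, (0, 0, 0))
print("".join(map(str, sequence)), is_de_bruijn(sequence, 3))
```

With the constant-zero feedback function the rule always tries to append a $1$ first, so this particular run coincides with Prefer-One.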

We establish an important lemma that will be frequently invoked.

Lemma 1

Given the state graph $\mathcal{G}_f$ of the FSR with feedback function $f$, let the state $\mathbf{u}$ be a vertex with two children. Let the GPO Algorithm start with an initial state $\mathbf{b}$. By the time the algorithm visits $\mathbf{u}$ it must have visited both children of $\mathbf{u}$.


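Let $\mathbf{v}$ and $\widehat{\mathbf{v}}$ be the two children of $\mathbf{u}$ in $\mathcal{G}_f$. Since $\mathbf{u}$ is not a leaf, one of the children, say $\mathbf{v}$, must have been its predecessor in the sequence produced by the GPO Algorithm thus far. Suppose, for a contradiction, that the algorithm visits $\mathbf{u}$ without having visited $\widehat{\mathbf{v}}$. At this point, the other possible successor of both $\mathbf{v}$ and $\widehat{\mathbf{v}}$, which is the companion state $\widetilde{\mathbf{u}}$ of $\mathbf{u}$, must not have been visited. The fact that $\mathbf{u}$ is the actual successor of $\mathbf{v}$ contradicts the rule of the algorithm since the successor of $\mathbf{v}$ ought to have been $\widetilde{\mathbf{u}}$.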

Theorem 2

The GPO Algorithm, on input a function $f(x_0, x_1, \ldots, x_{n-1})$ and an initial state $\mathbf{b} = b_0, b_1, \ldots, b_{n-1}$, generates a binary de Bruijn sequence of order $n$ if the state graph $\mathcal{G}_f$ of the FSR satisfies the following two conditions:

  1. all of the states, except for the leaves, have exactly two children;

  2. there is a unique directed path from any state to $\mathbf{b}$.


We first show that it is impossible for Algorithm 1 to visit any state twice before it visits the initial state $\mathbf{b}$ the second time. For a contradiction, suppose that $\mathbf{u}$ is the first state to be visited twice. The assignment rule precludes visiting a state of the form $c_1, \ldots, c_{n-1}, \overline{f(c_0, \ldots, c_{n-1})}$ twice. Hence, $\mathbf{u}$ must be of the form $c_1, \ldots, c_{n-1}, f(c_0, \ldots, c_{n-1})$, implying that $\mathbf{u}$ has as a child the state $c_0, c_1, \ldots, c_{n-1}$ in $\mathcal{G}_f$ and is not a leaf. By Lemma 1, when $\mathbf{u}$ is first visited by the algorithm, its two children must have been visited. The second time $\mathbf{u}$ is visited, one of its two children must have also been visited twice. This contradicts the assumption that $\mathbf{u}$ is the first vertex to have been visited twice.

The algorithm continues until it visits $\mathbf{b}$ again. Any other state, say $\mathbf{v}$, has two possible successors. At least one of the two must not have been visited yet since their two possible predecessors include $\mathbf{v}$.

Now it suffices to show that Algorithm 1 visits all states. Since there is a unique directed path to $\mathbf{b}$ from any other state, all of the other states are descendants of $\mathbf{b}$. By Lemma 1, by the time the algorithm revisits $\mathbf{b}$, it must have visited $\mathbf{b}$'s two children, even if one of the two is $\mathbf{b}$ itself. The same lemma implies that the respective child(ren) of these two children of $\mathbf{b}$ must have been visited beforehand. By repeated applications of the lemma we confirm that all of the descendants of $\mathbf{b}$ must have been covered in the running of Algorithm 1.
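For small orders the two conditions of Theorem 2 can be checked by brute force. The sketch below is an illustration under the assumption that a state's children are its preimages under the state transition; because every state has exactly one outgoing edge, a directed path to the initial state is unique whenever it exists, so the second condition reduces to reachability. The name `satisfies_theorem_2` is hypothetical.

```python
from itertools import product


def satisfies_theorem_2(f, init):
    """Check the two sufficient conditions on the state graph of f with root `init`."""
    root = tuple(init)
    states = list(product((0, 1), repeat=len(root)))
    succ = {u: u[1:] + (f(*u),) for u in states}

    # Condition 1: every state has either no child or exactly two children.
    indegree = {u: 0 for u in states}
    for u in states:
        indegree[succ[u]] += 1
    condition_1 = all(d in (0, 2) for d in indegree.values())

    # Condition 2: following the single outgoing edge from any state
    # eventually reaches the root.
    def reaches_root(u):
        for _ in range(len(states)):
            if u == root:
                return True
            u = succ[u]
        return u == root

    condition_2 = all(reaches_root(u) for u in states)
    return condition_1 and condition_2


print(satisfies_theorem_2(lambda *bits: 0, (0, 0, 0, 0)))            # constant zero, root 0000
print(satisfies_theorem_2(lambda *bits: 0, (0, 0, 0, 1)))            # root is a leaf
print(satisfies_theorem_2(lambda x0, x1, x2: x0 ^ x1, (0, 0, 1)))    # non-singular function
```

A non-singular feedback function never passes the check, since every state in its state graph has exactly one child.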

The initial state in Prefer-One is $0^n$. The next bit of the sequence is $1$ if the newly formed $n$-stage state has not previously appeared in the sequence. If otherwise, then the next bit is $0$. For , for example, the algorithm outputs . Prefer-Zero, introduced by Martin in [19], works in a similar manner, interchanging $0$ and $1$ in the rule of Prefer-One. Knuth nicknamed Prefer-Zero the granddaddy construction of de Bruijn sequences [20]. For , the granddaddy outputs , which is the complement of .
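The classical rule can be written down directly, independently of Algorithm 1. The sketch below is illustrative only: the function name `prefer` and the choice to stop once neither bit yields a new state are assumptions of this example, and the first $2^n$ bits of the run are returned as one period of the cyclic sequence.

```python
def prefer(n, preferred):
    """Prefer-One (preferred=1) or Prefer-Zero (preferred=0) of order n.

    Starts from the constant string of the non-preferred bit and appends
    the preferred bit whenever the new n-bit state has not appeared yet.
    """
    other = 1 - preferred
    state = (other,) * n
    seen = {state}
    bits = list(state)
    while True:
        for b in (preferred, other):
            candidate = state[1:] + (b,)
            if candidate not in seen:
                break
        else:
            # Neither extension gives a new state: the run is complete.
            return bits[: 2 ** n]
        seen.add(candidate)
        bits.append(b)
        state = candidate


for n in (3, 4):
    ones = prefer(n, 1)
    zeros = prefer(n, 0)
    print(n, "".join(map(str, ones)), "".join(map(str, zeros)))
```

For each order the Prefer-Zero output printed here is the bitwise complement of the Prefer-One output.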

Corollary 3

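Let . Prefer-One is a special case of the GPO Algorithm with $f(x_0, x_1, \ldots, x_{n-1}) = 0$ and $\mathbf{b} = 0^n$. Prefer-Zero is a special case of the GPO Algorithm with $f(x_0, x_1, \ldots, x_{n-1}) = 1$ and $\mathbf{b} = 1^n$.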

To generate de Bruijn sequences by Algorithm 1 it suffices to find families of feedback functions and initial states that satisfy the conditions in Theorem 2. There are numerous such combinations with only three families of them explicitly discussed in the present work.

III-A Family One

Our next result gives a family of de Bruijn sequences that can be greedily constructed from de Bruijn sequences of lower orders. It provides an alternative proof to [16, Theorem 2.4] in the binary case.

Theorem 4

Let be a feedback function whose FSR generates a de Bruijn sequence of order . We fix a positive integer . The GPO Algorithm generates the family of de Bruijn sequences of order on input any string of length in as the initial state and


It suffices to prove that with in Equation (1) satisfies the two conditions in Theorem 2. Since the coefficient of in is , any non-leaf state has two children. Hence, contains a directed cycle whose vertices are the -stage states of . All other states are vertices in the trees whose respective roots are the states of the above cycle. Since there is a unique directed path from any state to some state in the cycle and there is a unique directed path from to , the second condition is satisfied.
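The construction can be tried out numerically. The sketch below is a hedged illustration, not the paper's code: it assumes that Equation (1) applies the order-$m$ feedback function $g$ to the last $m$ entries of the $n$-stage state, i.e. $f(x_0, \ldots, x_{n-1}) = g(x_{n-m}, \ldots, x_{n-1})$ with $n > m$, and it uses a hypothetical reading of the GPO rule that prefers the complement of the feedback bit. The names `gpo`, `lift`, `cyclic_windows`, and `g3` are introduced only for this example.

```python
def gpo(f, init):
    """Prefer the complement of f(state) when it leads to an unseen state."""
    start = tuple(init)
    state, seen, out = start, {start}, []
    for _ in range(2 ** len(start)):  # safety bound added for this sketch
        out.append(state[0])
        y = f(*state)
        preferred = state[1:] + (1 - y,)
        state = preferred if preferred not in seen else state[1:] + (y,)
        seen.add(state)
        if state == start:
            return out
    raise ValueError("run did not return to the initial state")


def lift(g, m, n):
    """Assumed form of Equation (1): apply g to the last m of n entries."""
    return lambda *x: g(*x[n - m:])


def cyclic_windows(seq, n):
    """All length-n windows of the periodic sequence, one per starting index."""
    length = len(seq)
    return [tuple(seq[(i + j) % length] for j in range(n)) for i in range(length)]


def g3(x0, x1, x2):
    """Illustrative order-3 feedback function generating (00010111)."""
    return x0 ^ x1 ^ ((1 - x1) * (1 - x2))


m, n = 3, 5
lower_order = [0, 0, 0, 1, 0, 1, 1, 1]      # de Bruijn sequence of order 3
f = lift(g3, m, n)
for b in cyclic_windows(lower_order, n):    # the n-stage states of the order-m sequence
    out = gpo(f, b)
    ok = len(out) == 2 ** n and len(set(cyclic_windows(out, n))) == 2 ** n
    print("".join(map(str, b)), "".join(map(str, out)), ok)
```

Each initial state is a window of length $n$ taken from the periodic lower-order sequence, matching the requirement that the initial state lie on the cycle of the state graph.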

Example 1

is de Bruijn with


Table I provides the distinct de Bruijn sequences in the family.

No. Initial State The Resulting de Bruijn Sequence
TABLE I: All de Bruijn Sequences in with in (2)

While implementing Theorem 4 for we observe that when , distinct initial states yield distinct de Bruijn sequences of order . We can then generate distinct de Bruijn sequences for a valid . But if , then there are collisions in the output. Two pairs of initial states, namely and , generate the same sequence. We formalize and prove the general assertion in Theorem 5. To make its proof easier to follow we start with an illustration.

Let and . Figure 1 is the state graph corresponding to the FSR that produces the de Bruijn sequence . Figures 2 and 3 give, respectively, the state graphs corresponding to the functions and .

Fig. 1: The state graph for .

Fig. 2: The state graph for .

Fig. 3: The state graph for .

For fixed and , with and , let denote the consecutive states that form a cycle in . These states are typeset in gray in Figures 1 to 3. As progresses from to , we follow the directed edges to traverse all of the states clockwise from top left. For example, the state as shown in Figure 3 is . We underline the first bits of the states in and and put a line over their last bits.

For a fixed , let be the graph obtained by removing the edges in from the state graph . Note that has disjoint trees as components. Let denote the tree in that contains as its root, for . In Figure 3, for instance, the vertex set of is .

The algorithm, on input and initial state , yields . Note that the states in occur in the exact same order as they do in . In one period the states appear in the order

The same holds for with initial state . The states in follow the same order of appearance as they do in the resulting de Bruijn sequence . The states appear in the order listed in Table II.

No. State No. State No. State No. State
TABLE II: order of appearance in a run of the GPO algorithm

Let be the consecutive states of . There is a natural bijection between and for given by

Hence, and so on. Slightly abusing the notation, we use to denote the tree in whose root is with being the first bits of .

Let , for , be the set that contains and the vertices of except for its root . Notice that elements in always come in pairs as conjugate states. The first two states are conjugate as are the last two states. Their last bits are and . This fact follows from how is defined in equation (1) where is modified by shifting the focus to the last entries, instead of the first entries. In Figure 3 we have

One of the two states in each conjugate pair belonging to has a successor which is a leaf in either or . The successor of is , which is a leaf in . The successor of is , which is a leaf in .

We will use the enumeration of the trees that contribute some leaves to a well-chosen subsequence of a de Bruijn sequence as a tool in the proof of the next theorem. Figure 2 will be useful to confirm the only two cases where a collision in the output occurs.

Theorem 5

The number of distinct de Bruijn sequences in the family with is


Let be fixed. Let be the de Bruijn sequence of order produced by an with feedback function . Let be defined based on as in Equation (1). From the state graph we let and be the subgraphs defined earlier. Arithmetic operations on the indices are taken modulo in this proof.

Let be the consecutive states of length in the directed cycle . Choose one of the vertices, which are in a one-to-one correspondence with the -stage states of , arbitrarily as . Recall that is the largest tree in whose root is . The vertex sets of the trees partition the set of vertices of .

We choose an arbitrary index and let be the de Bruijn sequence generated by the algorithm on initial state . Lemma 1 implies that, among the states , the second state that the algorithm visits must be . For a contradiction, let with be the second state visited. Then, both of its children in must have already been visited. This is impossible since one of the children, which is , has not been visited yet. Thus, by their order of appearance in , the states are

Combining this fact and Lemma 1 makes it clear that by the time is visited, all states in for all must have been visited. So in the remaining run of the algorithm, each state visited after must belong only to for some . Similarly, suppose that we run the algorithm with initial state for some to generate . By the time is visited, all states in for must have been visited. Each of the remaining states to visit belongs only to for some .

We now examine what may allow while . Without loss of generality, let . If , then we can partition the set of consecutive states visited by the algorithm on initial state into two parts. Part contains the sequence of states starting from the successor of in up until the state . This includes all states in for . Part hosts the sequence of states starting from the successor of in until the state . This includes all states in for . We say that the two parts are self-closed since all of the successors of each state in Part , except for , are also contained in Part . Similarly with Part .

Going back to , we consider each state in Part , except for , and check if the successor of is a leaf. If yes, then we identify the corresponding tree in by its root. Because Part is self-closed, we claim that all trees corresponding to elements in Part must also be in Part . To confirm this claim, we use the bijection between and for to associate


and let . This allows us to identify as and vice versa.

The algorithm’s assignment rule requires that the respective successors of the conjugate states and must be either one of the two companion states and . Notice that must be a leaf in the tree whose root has, as its first bits, . The latter is the companion state of the child of in .

Let a pair of conjugate states and whose common last bits is be given. One of their two possible successors must be a leaf in whose root is the companion state of . We generalize this observation to vertices in . All of the states come in conjugate pairs whose respective last bits are . Each pair has a state whose successor is a leaf in . Enumerating the corresponding trees, we obtain

Hence, going through each conjugate pair in and identifying the tree that contains a successor which is a leaf gives us the following list of trees:

Part must then have the following property. Because it is self-closed, it contains not only all of the states of the trees for , but also all states of the trees with respective roots . In total, the states in Part come from

trees. The assumption that , however, implies that there are distinct trees that contribute their states to Part . This is impossible if . Thus, whenever , distinct initial states and generate distinct de Bruijn sequences.

Now, let and be the initial state. Suppose that Part contains states contributed by only one tree and let . Then and the only relevant leaf (in ) is , which is the successor of in . Thus, we have , i.e.,

There are only two cases that satisfy this constraint.

  1. Part includes , containing and , while Part contains all of the other trees.

  2. Part consists of , containing and , while Part contains all of the other trees.

Starting with , the last state of the generated de Bruijn sequence must be . If we begin with , then the second and third states of the resulting de Bruijn sequence must be and , respectively. These two de Bruijn sequences are shift equivalent. A similar argument can be made for the initial states and .

If Part fully contains the trees and , then the above analysis confirms that it also contains leaves belonging to the trees and . Because we have

which is impossible since is de Bruijn. One can proceed inductively to come to the same conclusion for the cases where there are more than trees that contribute their states to Part . This completes the proof.

III-B Family Two

Here is another family of de Bruijn sequences that can be greedily constructed.

Theorem 6

Given and an integer , let the feedback function be


and the initial state be any -stage state of the sequence . Then the GPO Algorithm generates the family of de Bruijn sequences of order .


First, notice that the coefficient of in as given in Equation (5) is . Second, we confirm that contains a cycle whose vertices are all of the -stage states of . All other states are vertices in the trees whose respective roots are the states of this cycle. Checking that the two conditions in Theorem 2 are satisfied is in the end rather straightforward.

Remark 1

We make two notes on the family. First, the produced sequence is the same as the output of Prefer-Same when . Our description here simplifies the one given in [4] on the said algorithm. Second, we exclude since the resulting does not satisfy the two conditions in Theorem 2, although, with the initial state , the GPO Algorithm generates the same de Bruijn sequence as the one produced by Prefer-Zero.

Example 2