We consider the power of Boolean circuits with MOD gates. First, we introduce a few basic notions of computational complexity, and describe the standard models with which we study the complexity of problems. We then define the model of Boolean circuits, equate a restricted class of circuits with an algebraic model, and present some results from working with this algebra.
2 Computational Complexity
Computational complexity is the study of problems in terms of the resources needed to compute their answers. When a problem is posed, we wish to find a decision procedure that is most efficient in terms of certain constraints (e.g., time or space) that will correctly solve the problem each time.
Depending on the amount of time or space we have access to, or the model on which we are attempting to solve the problem (e.g., a Turing machine, Boolean circuit, or quantum computer), we may or may not be able to come up with such a procedure, and these attempts are what tell us about the fundamental difficulty of the problem in question. In this section, we give some definitions of simple computational concepts and briefly discuss the field of computational complexity. For readers interested in a more in-depth discussion of the field, we recommend one of the many accessible textbooks on the subject.
2.1 Languages and Decision Problems
First, we will give some basic definitions in order that we may talk about computation and what is, or is not, computable.
Definition 2.1 A decision problem is a question in some formal system that has a yes or no answer. A decision problem has a Boolean output in $\{0, 1\}$, unlike a functional problem, whose solution may consist of several bits.
Definition 2.2 A formal language is a set of strings of symbols constrained by rules specific to it.
When talking about computational problems, we will often pose them as a decision problem about the membership of the input in a specific language. The use of the term “language” may be somewhat deceptive, as we use it to encompass a wide variety of mathematical problems. For example, the set of strongly connected graphs forms a language, and a decision procedure that decides membership in this language would correctly determine whether a given graph is strongly connected. We may also extend decision problems to functional problems in a natural way, simply by posing a series of decision problems that together give us a functional answer. For instance, if we wish to add two integers, we would pose a series of decision problems, each checking the value of one bit of the result. In this way, it is clear that decision problems are highly extensible and therefore encompass a great many complex problems.
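As a minimal sketch of this reduction, the snippet below recovers the sum of two integers by asking one yes/no question per output bit. The names `bit_of_sum` and `add_via_decisions` and the `width` parameter are illustrative assumptions, not notation from the text.

```python
def bit_of_sum(a: int, b: int, i: int) -> bool:
    """Decision problem: is bit i of a + b equal to 1?"""
    return ((a + b) >> i) & 1 == 1


def add_via_decisions(a: int, b: int, width: int) -> int:
    """Rebuild a + b from `width` yes/no answers (width must cover the result)."""
    return sum(1 << i for i in range(width) if bit_of_sum(a, b, i))


print(add_via_decisions(5, 9, 5))  # 14
```

Each call to `bit_of_sum` is a membership question for the language of triples whose sum has a 1 in the given position, so the functional answer costs `width` decision queries.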
2.2 Turing Machines
To talk about the nature of computing, it is important that we define a model that may actually carry out any computation we might specify. It must be simple, so that its operation is clear, yet powerful enough to implement any computation we can think of. Below, we define the standard Turing machine model, first proposed by A. M. Turing in 1936.
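Definition 2.3 A Turing machine is a model of computation which is defined by the tuple $(Q, \Gamma, q_0, F, \delta)$, where $Q$ is a finite set of states, $\Gamma$ a finite set of tape alphabet symbols, $q_0 \in Q$ the start state, $F \subseteq Q$ the set of final or accepting states, and $\delta$ the transition function, where $\delta : Q \times \Gamma \to Q \times \Gamma \times \{L, R\}$.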
The Turing machine’s tape is an infinite sequence of cells, each of which contains a letter of the tape alphabet. In the case of circuit complexity, we are only concerned with Boolean logic, and so the tape cells will contain only letters selected from $\{0, 1\}$. The tape head is a device that moves along the tape and is able to read or write symbols from the tape alphabet at its current position. The machine functions by maintaining a state from the state set $Q$, where it may transition from one state to another based on the state it is currently in and the value of the tape alphabet character at which the tape head rests. During any transition, it may also write a new character from the tape alphabet onto its current position, and move left or right on the tape. We may extend this model by changing the way the transition function is defined, but the standard Turing machine model is no less expressive than any non-standard model we care to define.
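The operation described above can be sketched as a short simulator. This is an illustration under stated assumptions, not part of the formal definition: the blank symbol `_`, the convention that a missing transition halts the machine, and the step budget are all choices made here for the example.

```python
from collections import defaultdict

BLANK = "_"  # assumed blank symbol; the text does not name one


def run_tm(delta, start, accepting, tape_input, max_steps=10_000):
    """Simulate a one-tape Turing machine.

    delta maps (state, symbol) -> (new_state, written_symbol, move),
    with move in {"L", "R"}. A missing entry halts the machine.
    Returns (accepted, non-blank tape contents).
    """
    tape = defaultdict(lambda: BLANK, enumerate(tape_input))
    state, head = start, 0
    for _ in range(max_steps):
        key = (state, tape[head])
        if key not in delta:
            break
        state, tape[head], move = delta[key]
        head += 1 if move == "R" else -1
    else:
        raise RuntimeError("step budget exhausted")
    cells = [i for i, s in tape.items() if s != BLANK]
    if not cells:
        return state in accepting, ""
    contents = "".join(tape[i] for i in range(min(cells), max(cells) + 1))
    return state in accepting, contents


# A machine that flips every bit and accepts at the first blank.
flip = {
    ("scan", "0"): ("scan", "1", "R"),
    ("scan", "1"): ("scan", "0", "R"),
    ("scan", BLANK): ("done", BLANK, "R"),
}
print(run_tm(flip, "scan", {"done"}, "0110"))  # (True, '1001')
```

The step budget exists only because halting cannot be decided in general; a real Turing machine has no such limit.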
A fundamental limitation of Turing machines, or of any less powerful or equivalent model, is their incapacity to solve certain problems. We call these problems undecidable, and we note that the halting problem, the question of whether a Turing machine halts on an arbitrary input, is undecidable. So, an algorithm for deciding it cannot exist, which was shown by A. M. Turing in 1936.
2.3 Complexity Classes
We know that Turing machines may compute many complex problems, but we have yet to discuss the difficulty that computing any given problem may have. Considering the time and space resources theoretically needed to compute a given function using a Turing machine will give us a concrete understanding of the difficulty of solving a problem in practice. We will now introduce some well-known general complexity classes.
Definition 2.4 The class $\mathsf{P}$ is the set of languages that are decidable by a Turing machine using $n^{O(1)}$ (polynomial) time.
Definition 2.5 The class is the set of languages that are decidable by a Turing machine using space.
Definition 2.6 The class $\mathsf{NP}$ is the set of languages that are decidable by a nondeterministic Turing machine using $n^{O(1)}$ (polynomial) time.
Though these complexity classes are defined in terms of Turing machines, they in fact correspond to classes of problems independent of the model on which they are computed. This suggests that these classes are natural and robust, and allows for the study of their fine structure within different computation models. It has been established that the expressiveness of a Turing machine with a polynomial amount of time is equivalent to that of polynomially large uniform Boolean circuits or Boolean queries using first-order logic with polynomially many blocks of quantifiers. The main goal of complexity theory is to understand the relationships between these complexity classes, and to that end, we use different models to study their fine structure. One of the most important open questions in computer science is whether $\mathsf{P} = \mathsf{NP}$. It has been shown by Karp and Lipton that if $\mathsf{NP}$ has polynomially sized circuits, then the polynomial hierarchy collapses to its second level. Since we have many tools for proving circuit lower bounds, many lines of research have attempted to prove a superpolynomial circuit lower bound for $\mathsf{NP}$, but with little success.
3 Algebra, Logic, and Boolean Circuits
Now, we introduce three more computation models, those which will help motivate the main question of this paper.
Definition 3.1 A monoid is a set $M$ together with an operation $\cdot$, such that $\cdot$ is associative, and $M$ contains an identity element $e$ such that for all $m \in M$, $e \cdot m = m \cdot e = m$.
A monoid is thus a group without the requirement of inverses, a concept familiar to those who have studied group theory. A particularly important group, for our purposes, is the permutation group on $n$ elements, denoted by $S_n$. This is the group of actions on a set of $n$ elements by permutation. There are clearly $n!$ elements of $S_n$, as there are $n!$ permutations of $n$ elements. We will denote an element of $S_n$ in the standard cycle notation; for example, the cycle $(1\,2\,3)$ sends the element 1 to 2, 2 to 3, and 3 to 1. The product of two cycles is the composition of the two permutations.
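To make the cycle notation concrete, here is a small sketch that builds permutations from cycles and composes them. The helper names and the left-to-right composition order ("first p, then q") are assumptions chosen for the example; the text does not fix a convention.

```python
from itertools import permutations


def from_cycle(cycle, n):
    """Permutation of {1..n} as a tuple: entry i-1 holds the image of i."""
    perm = list(range(1, n + 1))
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        perm[a - 1] = b
    return tuple(perm)


def compose(p, q):
    """Apply p first, then q (an assumed convention)."""
    return tuple(q[p[i] - 1] for i in range(len(p)))


c = from_cycle((1, 2, 3), 3)   # sends 1 -> 2, 2 -> 3, 3 -> 1
t = from_cycle((1, 2), 3)      # swaps 1 and 2

print(c)                                  # (2, 3, 1)
print(compose(c, t))                      # (1, 3, 2)
print(len(list(permutations(range(3)))))  # 6 = 3!
```

Composing the 3-cycle with itself three times returns the identity, which is a quick sanity check on the chosen convention.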
3.1 Algebraic Programs
Using algebra, we present another model of computation apart from Turing machines. An $M$-program is a sequence of instructions that evaluate to monoid elements, which allows us to compute a function by computing the product of the resulting sequence of monoid elements.
Definition 3.2 An $M$-program is a sequence of instructions, each of which can be modeled by a function from a given input bit to an element of a monoid $M$. The result of the $M$-program is the product of the resulting monoid elements. So, a program $P$ of length $\ell$ that reads input bits $i_1, \ldots, i_\ell$ with functions $f_j : \{0, 1\} \to M$ can be expressed as a sequence of instructions:
$$P = \langle i_1, f_1 \rangle \langle i_2, f_2 \rangle \cdots \langle i_\ell, f_\ell \rangle,$$
where each term $\langle i_j, f_j \rangle$ is an instruction to evaluate the function $f_j$ on the bit $i_j$. On input $a$, $P$’s output is given by:
$$P(a) = f_1(a_{i_1}) \, f_2(a_{i_2}) \cdots f_\ell(a_{i_\ell}),$$
and $P$ accepts if and only if $P(a) \in M'$ for a given $M' \subseteq M$.
So, the $M$-program generates a sequence of monoid elements, and then evaluates the word problem for the monoid $M$. We define the length of the program as the number of instructions used. The computational power of different monoids and groups is important to us, since the Tsukiji problem can be expressed as a program over a certain group.
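A program over a monoid can be evaluated in a few lines. The sketch below is an illustration only: the function name `run_program`, the representation of an instruction as a pair (bit index, function), and the choice of the additive group of integers mod 2 as the example monoid are all assumptions made here.

```python
from functools import reduce


def run_program(instructions, op, identity, accepting, bits):
    """Evaluate a monoid program.

    instructions: list of (bit_index, f) where f maps a bit to a monoid element.
    op, identity: the monoid operation and its identity element.
    accepting: the set of monoid elements on which the program accepts.
    """
    values = (f(bits[i]) for i, f in instructions)
    return reduce(op, values, identity) in accepting


# Parity as a program over the integers mod 2 under addition:
# instruction j reads bit j and contributes the bit itself.
def parity_program(n):
    return [(j, lambda b: b) for j in range(n)]


add_mod2 = lambda x, y: (x + y) % 2

print(run_program(parity_program(4), add_mod2, 0, {1}, (1, 0, 1, 1)))  # True
print(run_program(parity_program(4), add_mod2, 0, {1}, (1, 1, 0, 0)))  # False
```

The length of this program is the number of instructions, here one per input bit; non-commutative monoids allow the same bit to be read several times with different effects.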
3.2 The Group
We are interested in a particular group, denoted by , and its actions on , which is of interest to our problem. We present some important relations and identities between the five generators of this group (a, b, c, d, e) in Table 1.
This provides an interesting algebraic foundation for the Tsukiji problem, which is central to our investigation of the power of circuits with MOD gates. There is an equivalence, within a polynomial factor, of programs over this group to the class of functions computable by depth 3 circuits formed by a $\mathrm{MOD}_3$ gate of $\mathrm{MOD}_2$ gates of binary AND gates (see Theorem 5.2), and so finding a lower bound on the size of such a circuit is equivalent to finding a lower bound on the length of programs over this group.
Table 1: The Group
3.3 First Order Logic and Complexity
Here, we briefly introduce the idea of descriptive complexity, the study of computation from the perspective of logic. We can ascertain the difficulty of a problem by investigating what tools are needed to express the problem with logical queries. We present a few results here that characterize certain classes of interest to our problem at hand with their logical counterparts. For further reading on descriptive complexity and proofs of the following claims, consult .
Definition 3.3 $\mathsf{FO}$ is the class of problems whose solutions can be expressed using first-order logical queries.
We can extend the power of first-order logic by adding new quantifiers, and in turn extend the expressiveness of $\mathsf{FO}$. We are mainly interested in the modular counting quantifier and the majority quantifier, which are relevant to the Tsukiji problem. The modular counting quantifier, for counting mod $p$, applied to a formula $\varphi(x)$ is true if and only if the number of values of $x$ such that $\varphi(x)$ is true is congruent to 0 mod $p$. Similarly, the majority quantifier is true if and only if $\varphi(x)$ is true for a majority of the values of $x$. Adding these quantifiers, we define extensions of $\mathsf{FO}$ as follows:
Definition 3.4 $\mathsf{FO}+\mathsf{MOD}_p$ is the class of problems whose solutions can be expressed using first-order logical queries with counting quantifiers modulo $p$.
Definition 3.5 $\mathsf{FO}+\mathsf{MAJ}$ is the class of problems whose solutions can be expressed using first-order logical queries with majority quantifiers.
It can be shown that the class $\mathsf{FO}+\mathsf{MOD}_p$ is equivalent in power to the circuit class $\mathsf{AC}^0[p]$, and the class $\mathsf{FO}+\mathsf{MAJ}$ is equivalent in power to the circuit class $\mathsf{TC}^0$, both of which we shall define in the next section.
4 Boolean Circuits and Complexity
We now introduce our model of interest, Boolean circuits, which are collections of interconnected gates which perform basic computations. This is different from such an abstraction as the Turing model of computation, since it allows us to look very closely at low-level computation, and find lower bounds on functions of little complexity.
First, we define some basic components of the circuit model.
Definition 4.1 A gate is a component of a circuit that performs a basic computational function.
In the standard circuit model, we consider only the gate functions AND, OR, and NOT, which constitute the standard basis $\{\wedge, \vee, \neg\}$, but here we consider the added expressiveness of circuits with counting (modulo $m$) and majority gates. Adding these gates constitutes the circuit classes $\mathsf{ACC}^0$ and $\mathsf{TC}^0$, respectively. In particular, the Tsukiji problem requires AND, $\mathrm{MOD}_2$, and $\mathrm{MOD}_3$ gates.
Now, a circuit is a collection of gates, namely, a set of input gates, output gates, and other computation gates which are connected via wires. In particular, the input gates are typically connected to some collection of computation gates, and eventually, each path from an input gate will terminate in an output gate. In this way, the circuit computes a function on its input bits and outputs the bit or bit string stored in the output gates. Circuits are acyclic, and we therefore avoid any wire connections from any computation gate to any input gate or ancestral gates. This allows us to avoid issues with timing, changing output values, and circuits that evolve over time.
Definition 4.2 A circuit can be modeled by the tuple $(G, W, I, O)$. Here $G$ is a set of gates and $W$ a set of wires connecting the gates, which together form a directed acyclic graph. $I \subseteq G$ is the set of gates corresponding to input bits, and $O \subseteq G$ the set of output gates. The value of a gate is the output of the function it computes on the wires that connect into it. The output of a gate is passed on the output wires. The output of the circuit is the output value(s) of the output gate(s).
Oftentimes, a circuit will simply compute a decision problem, which restricts the circuit to producing one bit, but this construction can easily be extended to functional problems. In order to measure the complexity of a given circuit, we consider its size (the number of gates or wires) and the depth (the longest path from an input to output gate). We also consider the fan-in (the maximum number of wires used for input to a gate) and the fan-out (the maximum number of wires used for output from a gate) of the circuit in question. The complexity measures of size and depth loosely correspond to the measures of space and time, respectively, which result from the Turing model of computation. We say that a circuit has unbounded fan-in or fan-out to say that we may use an unlimited amount of wires as input or as output, respectively.
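The measures above can be illustrated on a small circuit stored as a directed acyclic graph. This is a sketch under assumptions made for the example: the dictionary encoding, the gate names, and the choice to count only non-input gates toward size are not taken from the text.

```python
from functools import lru_cache

GATE_FUNCS = {"AND": all, "OR": any, "NOT": lambda xs: not xs[0]}


def evaluate(circuit, output, inputs):
    """Evaluate gate `output`; circuit maps name -> (kind, predecessor names)."""
    @lru_cache(maxsize=None)
    def value(g):
        kind, preds = circuit[g]
        if kind == "IN":
            return bool(inputs[g])
        return bool(GATE_FUNCS[kind]([value(p) for p in preds]))
    return value(output)


def size(circuit):
    """Number of non-input gates (one of the two size conventions in the text)."""
    return sum(1 for kind, _ in circuit.values() if kind != "IN")


def depth(circuit, g):
    """Length of the longest path from an input gate to gate g."""
    kind, preds = circuit[g]
    return 0 if kind == "IN" else 1 + max(depth(circuit, p) for p in preds)


# XOR of two bits over the standard basis.
xor = {
    "x": ("IN", ()), "y": ("IN", ()),
    "nx": ("NOT", ("x",)), "ny": ("NOT", ("y",)),
    "a": ("AND", ("x", "ny")), "b": ("AND", ("nx", "y")),
    "out": ("OR", ("a", "b")),
}
print(evaluate(xor, "out", {"x": 1, "y": 0}))  # True
print(size(xor), depth(xor, "out"))            # 5 3
```

Here every gate has fan-in at most 2 and the input gates have fan-out 2, so this circuit fits the bounded fan-in setting.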
As we have defined circuits, they may only compute a function with an input of a given length. Turing machines, on the other hand, may compute a function given an input of arbitrary length. In order to extend our model to have this same capacity, and to be able to compare the two models, we must introduce the notion of circuit families.
Definition 4.3 A circuit family $\{C_n\}$ which computes a function $f$ is a collection of circuits such that for every $n$, $C_n$ computes $f$ on inputs of length $n$.
We can now describe the asymptotic complexity of a function in terms of the size and depth of the circuit family that computes it. Furthermore, augmenting or restricting our basis functions and the fan-in and fan-out of the circuit will determine the expressiveness of the circuit classes we care to define.
4.2 Bounds for General Boolean Functions
We may now introduce a few well-established complexity bounds on certain Boolean functions of interest to our problem. The work of Shannon and Lupanov established asymptotic lower and upper bounds on the circuit size of most Boolean functions on $n$ variables.
Theorem 4.1 For any $\varepsilon > 0$, the ratio of $n$-ary Boolean functions computable by circuits over the standard basis with $(1 - \varepsilon)\, 2^n / n$ gates approaches 0 as $n \to \infty$. In other words, for large $n$, most Boolean functions have size $\Omega(2^n / n)$.
Theorem 4.2 Every $n$-ary Boolean function can be computed by circuits with $O(2^n / n)$ gates over the basis $\{\wedge, \vee, \neg\}$.
Combining these two theorems, it is easy to see that for large $n$, most Boolean functions have complexity $\Theta(2^n / n)$. However, we are interested in studying functions of complexity $n^{O(1)}$, which are considered feasible to implement in Boolean circuits.
It is important to mention how individual circuits are constructed, given any circuit family. For example, the halting problem is known to be uncomputable in the Turing paradigm, but we may define a circuit family which solves the unary version of this problem. On inputs of length $n$, if the $n$-th Turing machine $M_n$ halts on itself, the circuit $C_n$ is simply the constant gate 1; otherwise, it is the constant gate 0. This is obviously a well-defined circuit family, but it is not constructible, since determining how to build the individual circuits would give us a decision procedure for the halting problem.
To mitigate this conceptual problem, we introduce the notion of uniformity. A uniform circuit family is one whose circuits may be described by a Turing machine constrained to polynomial time or logarithmic space. This allows us to prove equivalences between Turing classes and circuit classes. We may also show equivalences by augmenting Turing machines with an “advice” tape which varies with input size, thus making it a non-uniform machine, which could allow it to have the same non-uniform power that circuits may have. Clearly, the circuit family that decides the halting problem is a non-uniform circuit family, as it cannot be constructed by a Turing machine given even unbounded time or space resources.
4.4 Circuit Complexity Classes
We now define a number of complexity classes in which to place problems computable by circuits. We are mostly interested in circuits with no more than polynomial size or polylogarithmic depth as a function of the input length. These highly constrained classes of circuits cannot easily simulate other computational models, and therefore the basis from which we may select our gate functions is extremely important in determining their power. We shall assume that the following classes we define are all logarithmic space-uniform, and can therefore be described (or built) by a Turing machine with access to logarithmic space. Furthermore, we assume that, unless stated otherwise, each circuit class we define may select gates from the standard basis $\{\wedge, \vee, \neg\}$.
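Definition 4.4 $\mathsf{AC}^i$ is the set of all languages that are recognizable by polynomially sized circuits of unbounded fan-in gates over $\{\wedge, \vee, \neg\}$ and $O(\log^i n)$ depth.
Definition 4.5 $\mathsf{NC}^i$ is the set of all languages that are recognizable by polynomially sized circuits of bounded fan-in gates over $\{\wedge, \vee, \neg\}$ and $O(\log^i n)$ depth.
Definition 4.6 $\mathsf{AC}^0[m]$ is the set of all languages that are recognizable by polynomially sized circuits of unbounded fan-in gates over $\{\wedge, \vee, \neg\}$ together with $\mathrm{MOD}_m$ gates and $O(1)$ depth. The union of $\mathsf{AC}^0[m]$ for all $m$ is denoted as $\mathsf{ACC}^0$, known as $\mathsf{AC}^0$ with counters.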
Definition 4.7 is the set of all languages that are recognizable by polynomially sized circuits of unbounded fan-in gates and binary and gates and depth.
Definition 4.8 $\mathsf{TC}^0$ is the set of all languages that are recognizable by polynomially sized circuits of unbounded fan-in gates over $\{\wedge, \vee, \neg\}$ together with MAJ gates and $O(1)$ depth.
Now, a $\mathrm{MOD}_m$ gate returns 1 if and only if the bit-sum of its inputs is equal to 0 modulo $m$, and returns 0 otherwise. Similarly, a MAJ gate returns 1 if and only if half or more of its input bits evaluate to 1. Now that we have these definitions of uniform circuit classes, we will mention some of their important relationships to standard complexity classes.
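The two gate types just described translate directly into code. The function names below are illustrative assumptions; the behavior follows the wording of the paragraph above, including the "half or more" threshold for majority.

```python
def mod_gate(m, bits):
    """Return 1 iff the number of 1-inputs is congruent to 0 modulo m."""
    return int(sum(bits) % m == 0)


def maj_gate(bits):
    """Return 1 iff half or more of the inputs are 1."""
    return int(2 * sum(bits) >= len(bits))


print(mod_gate(3, [1, 1, 1]))   # 1
print(mod_gate(2, [1, 0, 0]))   # 0
print(maj_gate([1, 0]))         # 1
print(maj_gate([1, 0, 0]))      # 0
```

Under this convention the parity of the inputs is the negation of a modulo 2 gate, that is, `1 - mod_gate(2, bits)`.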
Theorem 4.3 $\mathsf{NC}^1 \subseteq \mathsf{L} \subseteq \mathsf{NL} \subseteq \mathsf{AC}^1$.
Proof We can show that $\mathsf{NC}^1 \subseteq \mathsf{L}$ by constructing an algorithm that evaluates $\mathsf{NC}^1$ circuits in logspace, and thus, a Turing machine with logarithmic space may simulate any $\mathsf{NC}^1$ circuit. We may do this with a simple recursive algorithm that uses Boolean operators. $\mathsf{NC}^1$ circuits have only bounded fan-in AND, OR, and NOT gates (where we may push the NOT gates down to the inputs and therefore eliminate them from the interior of the circuit). The algorithm works as follows. We start at the output gate. For the recursive step: if the gate we are at is an AND gate, return the logical AND of the values of the algorithm at the right parent and the left parent. Similarly, if the gate we are at is an OR gate, return the logical OR of the values of the algorithm at the right parent and the left parent. If the gate we are at is an input gate, return the value of the gate. Clearly, this evaluates the circuit as it was designed, and requires only logarithmic space to keep pointers of where it is in the recursion, which has no more than logarithmic depth.
Since an $\mathsf{NC}^1$ circuit has only $O(\log n)$ depth, we need only keep track of $O(\log n)$ bits of pointers (since each node has some bounded number of children). And so, for logspace uniform $\mathsf{NC}^1$ circuits, there exists a Turing machine which can describe the circuit using logarithmic space. We can use the machine to construct such circuits, and so we may evaluate any circuit in $\mathsf{NC}^1$ using logarithmic space, implying that $\mathsf{NC}^1$ is contained in $\mathsf{L}$.
$\mathsf{L} \subseteq \mathsf{NL}$ is trivially true, as for any $\mathsf{L}$ machine, we may construct an $\mathsf{NL}$ machine that behaves identically by copying the machine and not including any nondeterministic transitions. So the $\mathsf{L}$ machine is simulated by our copycat machine.
We may show that $\mathsf{NL} \subseteq \mathsf{AC}^1$ by constructing an $\mathsf{AC}^1$ circuit which solves an $\mathsf{NL}$-complete problem. We will use graph reachability (REACH), the problem of whether a graph contains a path from node $s$ to node $t$, as our $\mathsf{NL}$-complete problem. Consider the predicate $P(x, y, d)$, which is true if and only if there exists a path from node $x$ to node $y$ of length less than or equal to $d$. We create a root node that is equivalent to $P(s, t, n)$, where $n$ is the total number of nodes in the graph, and therefore an upper bound on the length of any simple path in the graph. We wish to see if a midpoint exists; we make this root node an unbounded fan-in OR gate on $s$-$m$-$t$ for every node $m$, where $s$-$m$-$t$ means there is a path from $s$ to $t$ with midpoint $m$. These $s$-$m$-$t$ nodes can be expressed as AND gates on $P(s, m, \lceil n/2 \rceil)$ and $P(m, t, \lceil n/2 \rceil)$. We can then define these in turn as OR gates over all midpoints and then AND gates as above, and so on in this alternating fashion, until we reach nodes of the form $P(x, y, 1)$, which we may treat as input nodes based on whether there is an edge from $x$ to $y$ or $x = y$. Clearly this is still $O(\log n)$ depth, since we halve $d$ every other step until it is 1, and it is of polynomial size because at each depth we only consider ordered pairs or ordered triples of nodes, since each gate is asking whether there is a path from node $x$ to $y$ or from $x$ to $y$ through $m$. Clearly, there can be at most $O(n^3)$ nodes at each level, and there are $O(\log n)$ levels, and so the total size of this construction is $O(n^3 \log n)$, or $n^{O(1)}$.
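The alternating OR/AND construction can be mirrored by a recursive function in which `any` plays the role of the unbounded fan-in OR gate and `and` the binary AND gate. This is a sketch, not the circuit itself: the function names are hypothetical, and the distance bound is rounded up to a power of two (an assumption made here) so that halving is exact.

```python
from functools import lru_cache


def reach(adj, s, t):
    """Decide whether t is reachable from s; adj[x] is the set of successors of x."""
    n = len(adj)
    d = 1
    while d < n:  # round the length bound up to a power of two
        d *= 2

    @lru_cache(maxsize=None)
    def path(x, y, dist):
        # True iff there is a path from x to y of length at most dist.
        if dist == 1:
            return x == y or y in adj[x]       # input level: edge or equality
        half = dist // 2
        return any(                            # OR over all midpoints m
            path(x, m, half) and path(m, y, half)  # AND of the two halves
            for m in range(n)
        )

    return path(s, t, d)


graph = [{1}, {2}, set(), {0}]   # edges 0->1, 1->2, 3->0
print(reach(graph, 0, 2))  # True
print(reach(graph, 2, 0))  # False
```

The recursion depth is logarithmic in the number of nodes, and the memoized calls correspond to the polynomially many gates indexed by pairs of nodes at each level.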
4.5 Lower Bounds
Here we state some lower bound results concerning circuit and logic complexity classes, without proof. Please refer to  for detailed proofs.
Theorem 4.4 (Furst-Saxe-Sipser) The parity function, or counting modulo 2, is not contained in $\mathsf{AC}^0$. Equivalently, we say that the parity function cannot be expressed by a circuit with constant depth, polynomial size, and unbounded fan-in, with gates selected from the standard basis $\{\wedge, \vee, \neg\}$.
Theorem 4.5 (Smolensky’s Theorem) For distinct primes $p$ and $q$, counting mod $q$ cannot be performed by circuits of polynomial size, constant depth, unbounded fan-in, and gates selected from $\{\wedge, \vee, \neg, \mathrm{MOD}_p\}$. Equivalently, $\mathrm{MOD}_q \notin \mathsf{AC}^0[p]$, and vice versa.
Smolensky showed that each prime counting class cannot capture the expressiveness of any relatively prime counting class. This, however, leaves open the question of the power of circuits which include counting gates for two relatively prime integers. As an instance of this problem, we consider the work by Tsukiji on the power of circuits with both $\mathrm{MOD}_2$ and $\mathrm{MOD}_3$ gates, and ask whether or not this class of circuits has some unexpected computational power.
5 The Constant Degree Hypothesis
Our main avenue of investigation was into the Constant Degree Hypothesis, a conjecture on the expressiveness of sums of polynomials and restricted circuit classes. We present another model for understanding the computational power of these circuits we have mentioned above, using multilinear polynomials. Following are some definitions and theorems on the subject, and, unless stated otherwise, the theorems presented here are Barrington’s.
Definition 5.1 A multilinear polynomial is a polynomial over a field $F$ that is linear in all of its variables, i.e. it contains no terms of the form $x_i^k$ for $k > 1$. Its degree is the degree of its maximum term, which is equal to the number of variables in the term.
Definition 5.2 A linear form, over a field $F$, on $n$ variables, $x_1, \ldots, x_n$, is a polynomial represented by $a_0 + \sum_{i=1}^{n} a_i x_i$, with each $a_i \in F$.
Definition 5.3 A quadratic form, over a field $F$, on $n$ variables, $x_1, \ldots, x_n$, is a polynomial represented by $a_0 + \sum_{i=1}^{n} a_i x_i + \sum_{1 \le i < j \le n} a_{ij} x_i x_j$, with each $a_i, a_{ij} \in F$.
Definition 5.4 A linear character, over a field $F$, is a function from strings of length $n$ to $F$, given by $x \mapsto \omega^{\ell(x)}$, where $\ell$ is a linear form and $\omega$ is a generator for $F^*$.
Definition 5.5 A quadratic character, over a field $F$, is a function from strings of length $n$ to $F$, given by $x \mapsto \omega^{q(x)}$, where $q$ is a quadratic form and $\omega$ is a generator for $F^*$.
For our purposes, we will only consider forms over the field $\mathbb{Z}_2$ and characters over the field $\mathbb{Z}_3$, and thus, $\omega = 2$, as 2 acts as a generator for $\mathbb{Z}_3^*$. By summing together these characters, we may compute functions by mapping a set of input bits to an output integer selected from $\{0, 1, 2\}$, computed arithmetically. We consider the number of characters in said sums as a measure of complexity.
Definition 5.6 The support of a function is the number of input strings on which the function evaluates to a non-zero number.
Definition 5.7 The $d$-weight of a function is the minimum number of degree $d$ characters whose sum is equal to the function. The 1-weight of a function is the minimum number of linear characters needed to compute it, and the 2-weight is the minimum number of quadratic characters needed.
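The definitions of characters and support can be exercised on small inputs. The sketch below assumes forms over the integers mod 2 and characters over the integers mod 3 with generator 2, as in the surrounding discussion; the helper names are hypothetical.

```python
from itertools import product


def linear_form(coeffs, const=0):
    """Linear form over Z_2: const + sum of coeffs[i] * x[i], reduced mod 2."""
    return lambda x: (const + sum(a * b for a, b in zip(coeffs, x))) % 2


def character(form):
    """Character over Z_3 with generator 2: x -> 2 ** form(x) mod 3."""
    return lambda x: pow(2, form(x), 3)


def support(f, n):
    """Number of n-bit inputs on which f is non-zero modulo 3."""
    return sum(1 for x in product((0, 1), repeat=n) if f(x) % 3 != 0)


chi1 = character(linear_form((1, 1)))   # 2^(x1 + x2)
chi2 = character(linear_form((0, 0)))   # the constant character 2^0 = 1
f = lambda x: (chi1(x) + chi2(x)) % 3   # a sum of two linear characters

print([f(x) for x in product((0, 1), repeat=2)])  # [2, 0, 0, 2]
print(support(f, 2))                              # 2
```

Since `f` is written as a sum of two linear characters, its 1-weight is at most 2; the code does not attempt to find the minimum.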
We note that the following proofs are taken from the Sindelar paper, and have been modified in order to improve readability or to narrow their scope.
Theorem 5.1 The complexity of a function under the following models is the same up to within a polynomial factor:
1. Size of depth 2 circuits formed by a $\mathrm{MOD}_3$ gate of $\mathrm{MOD}_2$ gates.
2. Sums of linear characters over $\mathbb{Z}_3$ with forms over $\mathbb{Z}_2$.
3. Lengths of programs over $S_3$.
Proof (1 $\Leftrightarrow$ 2) The sums of linear characters may be simulated by circuits as follows. For each linear character, we connect a $\mathrm{MOD}_2$ gate to the inputs appearing in the linear form in its exponent, and to the constant gate 1 if a constant term is included in the form; this gate computes the parity of the form. We connect an additional constant gate to these $\mathrm{MOD}_2$ gates, which will flip the parity bit, and so the output of each such gate will be 0 if the parity of the linear form is even, and 1 otherwise. We couple each $\mathrm{MOD}_2$ gate with a constant gate (which will always output 1), and so if the form evaluates to 1, the bit-sum of these two gates is 2, and otherwise, 1. We connect all of these to a single $\mathrm{MOD}_3$ gate, which will then compute the sum of all linear characters modulo 3. So we have constructed a depth 2 circuit of a $\mathrm{MOD}_3$ gate of $\mathrm{MOD}_2$ gates, using only $O(s)$ gates for a sum of $s$ characters.
Similarly, we may simulate an arbitrary depth 2 circuit of a $\mathrm{MOD}_3$ gate of $\mathrm{MOD}_2$ gates with a sum of linear characters over $\mathbb{Z}_3$ with forms over $\mathbb{Z}_2$ as follows. Suppose the depth 2 circuit contains $s$ $\mathrm{MOD}_2$ gates. For each $\mathrm{MOD}_2$ gate, we define a linear form $\ell_j$ in which $x_i$ is present if the input $x_i$ is connected to the gate in question. We then define a linear character, $2^{\ell_j(x)}$, for each of these forms, and, taking the sum, we are done. So we are able to simulate arbitrary circuits of this form with only $s$ linear characters.
Here, we do not prove the equivalence, within a polynomial factor, of the complexity (given by length) of programs over $S_3$ to these models; a formal proof appears in the literature.
A similar result holds for sums of quadratic characters over $\mathbb{Z}_3$ with forms over $\mathbb{Z}_2$.
Theorem 5.2 The complexity of a function under the following models is the same up to within a polynomial factor:
1. Size of depth 3 circuits formed by a $\mathrm{MOD}_3$ gate of $\mathrm{MOD}_2$ gates of binary AND gates.
2. Sums of quadratic characters over $\mathbb{Z}_3$ with forms over $\mathbb{Z}_2$.
3. Lengths of programs over the group introduced in Section 3.2.
Proof This proof is similar to that of Theorem 5.1. We may simulate sums of quadratic characters over $\mathbb{Z}_3$ with forms over $\mathbb{Z}_2$ by simulating the quadratic terms with binary AND gates (e.g., $x_i x_j = 1$ if and only if $x_i \wedge x_j = 1$), and the rest of the construction follows that of Theorem 5.1.
Likewise, the depth 3 circuits of a $\mathrm{MOD}_3$ gate of $\mathrm{MOD}_2$ gates of binary AND gates may be simulated by sums of quadratic characters over $\mathbb{Z}_3$ with forms over $\mathbb{Z}_2$ by constructing the sums analogously to those of Theorem 5.1, but with the added simulation of the AND gates by the quadratic terms in the forms belonging to the quadratic characters.
Again, we do not prove here the equivalence, within a polynomial factor, of the complexity of programs over the group of Section 3.2 to these two models.
Conjecture 5.1 (The Constant Degree Hypothesis) An exponential number of constant degree characters are required to sum to the AND function over a finite field $F$.
Barrington showed that programs over $S_3$ require exponential length to compute AND, and together with the result that the complexity of programs over $S_3$ is equivalent to the number of characters required for a sum (up to a linear factor), this tells us that an exponential number of linear characters are required for AND. We provide an alternate proof (due to Tsukiji), via a probabilistic argument, which shows that an exponential number of linear characters are needed to sum to AND over any finite field $F$, not just $\mathbb{Z}_3$.
Theorem 5.3 (Lower Bound for AND with Linear Characters) Over a finite field of order $k$, at least $\left(\frac{k}{k-1}\right)^n$ linear characters are required to sum to AND.
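Proof Let AND be the sum of $s$ linear characters $\omega^{\ell_1}, \ldots, \omega^{\ell_s}$, and let $r$ be a random linear form over $F$. Then $\omega^{r} \cdot \mathrm{AND} = \sum_{i=1}^{s} \omega^{r + \ell_i}$. Clearly, each $r + \ell_i$ is a linear form, as both $r$ and $\ell_i$ are linear forms themselves. If there exists some input $x$ with $\mathrm{AND}(x) \neq 0$, then $x$ is the all-ones input, and so $\omega^{r} \cdot \mathrm{AND} = \omega^{r(1, \ldots, 1)} \cdot \mathrm{AND}$, where $r(1, \ldots, 1)$ is the linear form $r$ with all inputs as 1’s. So, $\omega^{r} \cdot \mathrm{AND}$ is a non-zero constant multiple of AND and must have degree $n$. Since this is equal to $\sum_{i} \omega^{r + \ell_i}$, this sum must also have degree $n$, which implies that at least one of the terms in the sum has degree $n$. But $\omega^{r + \ell_i}$ has degree $n$ only when $r + \ell_i$ does, which only occurs when, for each $x_j$, the coefficient of $x_j$ in $r$ is not the additive inverse of the coefficient of $x_j$ in $\ell_i$. For any given $x_j$, randomly choosing an element that is not the additive inverse occurs with probability $\frac{k-1}{k}$ over a field of order $k$, and since there are $n$ terms in the sum $r + \ell_i$, the total probability of a random form added to $\ell_i$ yielding a character of degree $n$ is $\left(\frac{k-1}{k}\right)^n$. Since some term must have degree $n$ for every choice of $r$, the union bound gives $s \left(\frac{k-1}{k}\right)^n \geq 1$. So there must be at least $\left(\frac{k}{k-1}\right)^n$ linear characters in order to guarantee there is a term with degree $n$. We conclude that we need at least $\left(\frac{k}{k-1}\right)^n$ linear characters to sum to AND over $F$.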
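The probability used in this argument can be checked by brute force for small parameters. The sketch below assumes a prime field size `k`, so that arithmetic mod `k` is field arithmetic, and looks only at the coefficients of the variables; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def full_degree_probability(k, coeffs):
    """Fraction of random forms r over Z_k (k prime, assumed) such that
    r + l has every variable coefficient non-zero, where l has `coeffs`."""
    n = len(coeffs)
    good = sum(
        1
        for r in product(range(k), repeat=n)
        if all((rj + aj) % k != 0 for rj, aj in zip(r, coeffs))
    )
    return Fraction(good, k ** n)


k, coeffs = 3, (1, 0, 2, 1)
p = full_degree_probability(k, coeffs)
print(p)      # 16/81
print(1 / p)  # 81/16, the resulting lower bound on the number of characters
```

The result does not depend on the particular coefficients of the fixed form, only on the field size and the number of variables.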
Theorem 5.4 The 2-weight of AND is at most $2^{n/2}$.
Proof This is clear from the following product of quadratic characters (for even $n$, over $\mathbb{Z}_3$ with $\omega = 2$, where the constant $2 = 2^1$ is itself a character):
$$\mathrm{AND}(x_1, \ldots, x_n) = \prod_{i=1}^{n/2} \left( 2^{x_{2i-1} x_{2i}} + 2 \right).$$
This product is the sum of $2^{n/2}$ quadratic characters, because each multiplication of polynomials of two terms at most doubles the number of terms in the product. This product is equal to AND, since if all the bits evaluate to 1, then each term in the product takes the form $2 + 2 \equiv 1 \pmod 3$, and so the total product is equal to 1. Further, if any bit evaluates to 0, then there is a term in the product such that $1 + 2 \equiv 0 \pmod 3$, and so the total product is equal to 0. So the 2-weight of AND is less than, or equal to, $2^{n/2}$.
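A product of this kind can be verified exhaustively for small even input lengths. The sketch below assumes characters over the integers mod 3 with generator 2 and uses the factor 2^(x y) + 2 for each pair of bits, which is one concrete choice satisfying the two congruences in the proof; the function names are hypothetical.

```python
from itertools import product


def and_via_product(x):
    """Product over pairs of (2^(x_i * x_{i+1}) + 2), reduced mod 3."""
    result = 1
    for i in range(0, len(x), 2):
        result = result * (pow(2, x[i] * x[i + 1], 3) + 2) % 3
    return result


def expand(n):
    """Expand the product into quadratic characters.

    Each term is (pairs, c), standing for 2^(sum of x_i*x_j over pairs + c mod 2).
    """
    terms = [((), 0)]
    for i in range(0, n, 2):
        with_monomial = [(pairs + ((i, i + 1),), c) for pairs, c in terms]
        with_constant = [(pairs, (c + 1) % 2) for pairs, c in terms]
        terms = with_monomial + with_constant
    return terms


def and_via_sum(x):
    total = 0
    for pairs, c in expand(len(x)):
        exponent = (sum(x[i] * x[j] for i, j in pairs) + c) % 2
        total += pow(2, exponent, 3)
    return total % 3


n = 6
print(len(expand(n)))  # 8 characters, i.e. 2 ** (n // 2)
print(all(and_via_product(x) == and_via_sum(x) == int(all(x))
          for x in product((0, 1), repeat=n)))  # True
```

Merging exponents mod 2 is valid here because the generator 2 has order 2 in the multiplicative group mod 3.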
We wish to know whether or not this is the optimal way to produce the AND function. If it is, it would prove that AND is not computable by polynomially sized circuits of this form. We conjecture that this is the case, as we have not been able to find a more efficient way to produce this function.
Conjecture 5.2 (Tsukiji) The 2-weight of AND is exactly $2^{n/2}$.
5.1 Witt Rank and Decomposition
One way of classifying the different quadratic characters is to decompose them into linearly independent terms. By doing this, we may draw isomorphisms between characters that seem different, but are really the same with respect to a change of basis that maps linear terms to linear terms. A character’s Witt decomposition allows us to consider such sets, and its Witt rank allows us to compare different quadratic characters.
Definition 5.8 The Witt decomposition of a quadratic form is an expression where the ’s are linearly independent linear forms. The Witt rank is the number in the decomposition.
Definition 5.9 For any , there are + 2 unique Witt decompositions. By change of basis, any quadratic form may be written in the one of the quadratic forms , in what we call its Witt Normal Form.
We note again that the following proofs are taken from the Sindelar paper , some of which are slightly modified to increase readability or accuracy.
Theorem 5.5 The Witt rank of a quadratic form is unique.
Proof We must reference a few facts that are proven later in this paper in order to prove this theorem. In Theorem 5.8 we show that any two quadratic forms that differ only by linear terms have the same Witt rank. For an arbitrary quadratic form, consider the family of all quadratic forms that have the same pure quadratic part, i.e., the same quadratic terms. One of these forms has support that is a function of the Witt rank, by Theorem 5.11, and therefore must have a unique Witt rank since support is clearly unique. So there are two functions with unique Witt rank and support in the family, and , with Witt rank , , and support , respectively. Since , are in the same family, they must differ by a linear term. This linear character is non-constant, since it may not be 0, and if it were 1, , which is never true for any , . There cannot be a difference in support , since adding a non-constant linear term can only change the support by if the linear is not in the quadratic form, or by 0 if the linear is already in the form.
Now that we have defined Witt rank, we wish to see how a form’s rank informs some of its basic properties. First, we shall show how the Witt rank provides a bound on the minimum number of linear characters needed to construct a function.
Theorem 5.6 Any quadratic character of Witt rank is the sum of at most linear characters.
Proof We sketch the Witt decomposition algorithm below, and we use it to write as . We note that we may write any as . To see this, consider the table below that depicts all the possible values of the linear terms . Clearly, we have shown that the sums of these linear characters and the original quadratic character are equivalent, and so we may write any term in the product as the sum of four linear characters, and the product of polynomials of at most 4 terms is at most .
Table 2: Writing a Quadratic Character as a Sum of Linear Characters
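One decomposition of this kind can be checked directly. The sketch below assumes that a character is the map sending an input to (-1) raised to the value of the form, read in Z_3 (so that -1 is represented by 2), and that "linear" characters may carry a constant term. The particular four forms 1, a + 1, b + 1 and a + b are our own choice for illustration and need not be the ones tabulated by the author; the helper names are hypothetical.

```python
from itertools import product

def chi(bit):
    """Value of (-1)^bit reduced into Z_3 = {0, 1, 2} (assumed convention)."""
    return 1 if bit % 2 == 0 else 2

def product_character(a, b):
    """Quadratic character of the single product term a*b."""
    return chi(a * b)

def four_linear_characters(a, b):
    """Sum mod 3 of the characters of the affine forms 1, a+1, b+1 and a+b."""
    return (chi(1) + chi(a + 1) + chi(b + 1) + chi(a + b)) % 3

# One row per setting of the two linear terms: (a, b, quadratic, sum of linears).
rows = [(a, b, product_character(a, b), four_linear_characters(a, b))
        for a, b in product((0, 1), repeat=2)]
for row in rows:
    print(row)
```

The last two entries of every printed row agree, which is the kind of equivalence the table is meant to exhibit.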
Theorem 5.7 Algorithm 1 correctly produces the Witt decomposition of any quadratic form .
Proof On each iteration, the algorithm takes two linear terms, creates two independent basis elements, and removes all instances of them from the form. So, on the next iteration, the new basis elements must be independent, since they cannot contain the linear terms that have been removed.
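As an independent cross-check of the rank that such a decomposition produces, one may use a different technique from Algorithm 1: the standard fact that, over GF(2), the number of product terms in a decomposition equals half the rank of the alternating matrix built from the pure quadratic part. The sketch below is not the decomposition algorithm itself. The function name `witt_rank` and the encoding of a form as a list of index pairs are assumptions made for illustration.

```python
def witt_rank(n, quadratic_terms):
    """Half the GF(2) rank of the alternating matrix of the x_i*x_j terms.

    quadratic_terms is an iterable of index pairs (i, j) with i != j;
    a pair listed twice cancels, as it would in arithmetic mod 2.
    """
    rows = [0] * n  # row i of the matrix, stored as an n-bit integer
    for i, j in quadratic_terms:
        rows[i] ^= 1 << j
        rows[j] ^= 1 << i
    rank = 0
    for col in range(n):
        # Find a row at or below the current rank with a 1 in this column.
        pivot = next((r for r in range(rank, n) if rows[r] >> col & 1), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Clear this column from every other row (Gauss-Jordan step).
        for r in range(n):
            if r != rank and rows[r] >> col & 1:
                rows[r] ^= rows[rank]
        rank += 1
    return rank // 2

print(witt_rank(4, [(0, 1), (2, 3)]))   # x0*x1 + x2*x3
print(witt_rank(3, [(0, 1), (0, 2)]))   # x0*(x1 + x2)
```

Because only the quadratic terms enter the matrix, this computation ignores linear and constant terms entirely, in line with Theorem 5.8.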
Theorem 5.8 Quadratic forms with the same pure quadratic part, i.e., that only differ in linear or constant terms, have the same Witt rank.
Proof The idea is to select and such that the quadratic form is free of some , . We accomplish this by letting be the sum of linear terms which include and such that, for all for which is a quadratic term in . Analogously, we let be the sum of linear terms which include and such that is a quadratic term in . We also add the constant 1 to if the linear term is in , and add the constant 1 to if the linear term is in . So if the pure quadratic parts of and are the same, the only difference in the first iteration is the possible inclusion of the constant 1. This means that the difference between the two sets of , will be the sum of linear characters, the product of with the constant and the product of with its respective constant. So in each step, only the linear terms will change between and . Since each step only changes the quadratic terms, the decomposition takes the same number of steps for each quadratic form, and therefore, the Witt rank is the same.
Theorem 5.9 (Tsukiji) If has Witt rank then is the sum of at most full-rank quadratic characters.
Proof Let = via its Witt decomposition. Since these terms are linearly independent, they form a basis for a subspace of linear forms. We may augment this set with linearly independent forms in order to get a full basis for the linear functions of variables. Then . The first term has clearly already been put into a Witt decomposition, and therefore has Witt rank , or full Witt rank. The third term, , contains only linear and constant terms, and thus can be combined with the first term without changing its Witt rank, by Theorem 5.8, which we will denote as . The second term may be written as . By Theorem 5.6, each term in the product may be written as a sum of four linear characters with respect to the new basis . We can rewrite the product as the product of sums of four linear characters, which will give us a sum of at most linear characters. So we have , where is a linear character. Since adding these linear characters does not change the Witt rank, we may multiply the sum by to get the sum of full rank quadratic characters. We conclude that any Witt rank quadratic character may be expressed as the sum of at most full-rank quadratic characters.
Corollary 5.9.1 (Tsukiji) If is the sum of arbitrary quadratic characters, is also the sum of at most full-rank quadratic characters.
Proof Suppose that , where the ’s are quadratic forms and let be a random quadratic form. Then we have that We pick such that is less than . Then, the probability that all terms have rank greater than is , which is less than 1, and so there must be some quadratic character such that causes all of the terms to have at least this Witt rank. By Theorem 5.9, each of these characters may be written as the sum of full rank characters, and so we have full rank characters. If this sum is equal to , we multiply each character by 2 to produce , which clearly does not change the Witt rank of the forms.
Barrington conjectured that there is a strong correlation between Witt rank, support, and 2-weight, and that a bound on the 2-weight of a function on variables can be given in terms of the support of the function. The Tsukiji problem would then be a specific example of this conjecture, since is a support 1 function. We present the conjecture and some related theorems, with a proof of the 2-weight-support trade-off for .
Theorem 5.10 The support of a non-constant linear form is .
Proof We proceed by induction. For the base case, with , there are only two non-zero linear forms, and , both of which have support 1. Assume that linear forms over variables have support . A linear form over variables has an term or it does not. If it does not, then it is a linear form over variables and has support by our inductive hypothesis. Over variables, each input on which the variables yield a non-zero result is repeated, once when the th bit is zero and once when it is one, so the support must be . If it does have an term, then when = 0, the form is not affected. But when = 1, all of the 0’s in the variables form become 1’s, and vice versa. Since there were 0’s, we now have support . This closes the induction, and we conclude that this holds for all linear forms.
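For small numbers of variables the statement can also be confirmed by exhaustive enumeration. The following sketch assumes forms over GF(2) with an optional constant term, and checks that every form with at least one variable present is non-zero on exactly half of the 2^n inputs, that is, on 2^(n-1) of them. The function names are hypothetical.

```python
from itertools import product

def support(coeffs, constant):
    """Number of inputs on which constant + sum(c_i * x_i) is 1 mod 2."""
    n = len(coeffs)
    return sum(
        (constant + sum(c * x for c, x in zip(coeffs, point))) % 2
        for point in product((0, 1), repeat=n)
    )

def check_linear_supports(n):
    """True if every non-constant linear form on n variables has support 2^(n-1)."""
    return all(
        support(coeffs, constant) == 2 ** (n - 1)
        for coeffs in product((0, 1), repeat=n)
        if any(coeffs)              # skip the constant forms 0 and 1
        for constant in (0, 1)
    )

for n in range(1, 6):
    print(n, check_linear_supports(n))
```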
Corollary 5.10.1 The support of a non-constant multilinear polynomial of degree d is at least .
Proof If n = d, it is clear that the support of the function must be at least one, since a non-constant function cannot be zero everywhere. So the support must be at least . Suppose that the minimum support for variables is . We proceed by induction. A polynomial of degree over variables may be written as , where and are polynomials over the first n variables, and each has degree at most . If evaluates to 0, the number of non-zero solutions is the support of for variables, which, by our inductive assumption, is at least . If evaluates to 1, then the number of solutions is the support of , which is a polynomial of degree at most , and therefore, again by our inductive assumption, is at least . So the total support of the multilinear polynomial is . By induction, this holds for all , and since was arbitrary, it holds for all as well.
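The bound can be checked exhaustively for very small n. The sketch below assumes polynomials over GF(2) and the bound 2^(n-d) for degree d on n variables. It enumerates every non-zero Boolean function by its truth table, recovers its multilinear polynomial with the binary Moebius transform, and records the smallest support seen at each degree. The function name is hypothetical, and the enumeration is only feasible for n up to about 4.

```python
def min_support_by_degree(n):
    """Map each degree d to the smallest support of a degree-d polynomial."""
    size = 1 << n
    best = {}
    for table in range(1, 1 << size):           # skip the zero polynomial
        # anf[m] becomes the coefficient of the monomial with variable mask m.
        anf = [(table >> x) & 1 for x in range(size)]
        for i in range(n):                      # binary Moebius transform
            for x in range(size):
                if x >> i & 1:
                    anf[x] ^= anf[x ^ (1 << i)]
        degree = max(bin(m).count("1") for m in range(size) if anf[m])
        support = bin(table).count("1")
        best[degree] = min(support, best.get(degree, size))
    return best

print(min_support_by_degree(3))
```

For n = 3 the minimum at each degree d is exactly 2^(3-d), so the bound is tight there; it is attained, for example, by a single monomial of degree d.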
The following is a correction of Theorem 6.14 in .
Theorem 5.11 A family of quadratic forms with Witt rank over variables has elements with support , with support , and the rest have support .
Proof We first note that if there is a quadratic form with support , then there is a quadratic form + 1 which will flip the bit output for any input, which shall give support . By a similar argument, if there is a quadratic form with support , there must be one of support , namely + 1.
Now, we proceed by induction. For the base case, with = 1, consider the form consisting of the term . The number of cases in which = 1 is , and similarly for = 1. But because they are linearly independent, the number of inputs on which their product is 1 must be . If the Witt decomposition is instead, then the support of is , and since and are linearly independent, then there are = places where they both must be 1. Thus the total support is . Now, if we consider the forms , with , we now have a form in the same family, meaning they have the same support, which we may write as , where , are other linear forms and some . We have therefore shown that for a family with Witt rank 1 and decomposition , has support , has support , and has support .
Given that any family of rank has exactly elements of the form that have support , consider a decomposition of rank + 1 with . The term has support by the base case. Since they are linearly independent then both and evaluate to 1 on different inputs. Now, adding any combination of linear terms will again transform the form into another in the same family, and with the same support. Since there are ways to do so, there must be forms of this support. So the support of these forms is then . If we add the constant 1 to any of these forms, we shall flip each of the bits of evaluation, and thus we also have quadratic forms with support .
Finally, let the decomposition be . By the inductive hypothesis, we have that has support . The term has support by the base case. Since they are linearly independent, both and are 1 on different inputs. So the support is then . Thus we have shown for a family of quadratic forms with Witt rank + 1 there are elements of support of the form , of support of the form , and the rest have support , and therefore, this holds for families of quadratic characters of any Witt rank by induction.
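The three support values can be observed directly on normal forms. The sketch below assumes forms over GF(2) and takes the representative x0*x1 + x2*x3 + ... with r product terms; for such a form on n variables the support is 2^(n-1) - 2^(n-r-1), adding the constant 1 gives 2^(n-1) + 2^(n-r-1), and adding an unused variable as an independent linear term gives 2^(n-1). The function names and the dictionary keys are hypothetical.

```python
from itertools import product

def support(form, n):
    """Number of inputs in {0,1}^n on which the form evaluates to 1."""
    return sum(form(x) for x in product((0, 1), repeat=n))

def normal_form(r):
    """The form x0*x1 + x2*x3 + ... with r product terms, over GF(2)."""
    return lambda x: sum(x[2 * k] * x[2 * k + 1] for k in range(r)) % 2

def family_supports(n, r):
    """Supports of the rank-r normal form, its complement, and a linear shift."""
    base = normal_form(r)
    result = {
        "pure": support(base, n),
        "plus_one": support(lambda x: (base(x) + 1) % 2, n),
    }
    if 2 * r < n:  # a spare variable exists to serve as an independent linear term
        result["plus_linear"] = support(lambda x: (base(x) + x[2 * r]) % 2, n)
    return result

for r in (1, 2, 3):
    print(r, family_supports(6, r))
```

With n = 6 the values obtained are 16, 24 and 28 for the pure forms and 48, 40 and 36 for their complements.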
Conjecture 5.3 (Barrington) If any function has 2-weight and support , then .
Theorem 5.12 The conjectured 2-weight support trade-off is true for .
Proof For the case, a quadratic character is nonzero for any form , and so a function must be nonzero everywhere. This function always has support .
For the = 2 case, we have functions of the form . These may be factored as , which evaluates to 0 if and only if . The support of any quadratic form is at most , where is its Witt rank (by Theorem 5.11), and so there must be at least zero evaluations of . Now, if has a Witt rank of 1, then it is a linear form and has support . Otherwise, the form has at least non-zero evaluations, where 1. So , and thus, .
For the = 3 case, the function is a sum of a 2-weight 2 function and a 2-weight 1 function. As we saw in case 1, the support of the 2-weight 1 function is , and so the 2-weight 3 function in question is nonzero wherever one of the two functions is nonzero and the other is zero, so its support is at least the size of the symmetric difference of the two functions’ support sets. So, because the 2-weight 1 function has support , and the 2-weight 2 function has support at most = , the difference has support at least , and so , and we have that .
For the = 4 case, it is clear that we are dealing with the sum of two 2-weight 2 functions, and . We write and , in a similar fashion to the factoring of a 2-weight 2 function in case 2. If , then the function is nonzero whenever , since this will cause one of the ’s to evaluate to zero and the other to have a nonzero evaluation. The support of is at least and so they must disagree in at least places. So, the function has support , and we have that .
Now, if , we may write the function as . This function evaluates to a nonzero value only if and . So it is nonzero when If , then we can write the function as . This product cannot be the zero polynomial, since otherwise, the function is identically zero and does not have weight 4. It is a multilinear polynomial of degree at most 4, and so its support is at least . So we have that .
6 Previous Work
As inspiration for this work, we took many of the above results from Sindelar , and hope to expand on the analytic results he was able to obtain. Here, we shall discuss briefly the experiments that he performed, and the results he reported.
Sindelar wished to find a property of quadratic characters that would satisfy similar conditions to those of linear characters, namely that a random quadratic character possesses said property with low probability, but the product of a random character with the function must possess it. Sindelar began by constructing a database of all possible functions from to , and saved information about the 2-weight functions and the sums to produce them. Using different bases, he generated datasets to see the effect of constraining quadratic characters to certain sets that share specific properties, in order to understand the effect that these properties have on the weight of .
6.1 Observed 2-weight of
Sindelar wanted to experimentally observe the 2-weight of . To do this, he built a table of all functions from to , and treated each function as a node in a massive graph, where the edges corresponded to the addition of quadratic characters. He then performed breadth-first search from the zero function, and found that the minimal number of quadratic characters needed to sum to the function on 4 variables was 4, consistent with the previously known fact that the 2-weight of is 4. One such sum of these characters was , which the reader may verify for all settings of the ’s.
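The search described above can be sketched in a few lines for very small n. This is our own minimal reconstruction and not Sindelar's code. It assumes that functions take values in Z_3, that a quadratic character sends an input x to (-1)^q(x) read in Z_3 for a polynomial q of degree at most 2 over GF(2) (constant and linear terms allowed), and that an edge adds one character pointwise mod 3. The names `quadratic_characters` and `two_weight` are hypothetical. In pure Python the search is practical only for n = 2 or 3, since the graph has 3^(2^n) nodes.

```python
from collections import deque
from itertools import combinations, product

def quadratic_characters(n):
    """All value tables of (-1)^q(x) in Z_3, for q of degree at most 2."""
    points = list(product((0, 1), repeat=n))
    monomials = [()] + [(i,) for i in range(n)] + list(combinations(range(n), 2))
    characters = set()
    for coeffs in product((0, 1), repeat=len(monomials)):
        values = []
        for x in points:
            bit = sum(c for c, m in zip(coeffs, monomials)
                      if all(x[i] for i in m)) % 2
            values.append(1 if bit == 0 else 2)   # (-1)^bit, with -1 == 2 mod 3
        characters.add(tuple(values))
    return points, sorted(characters)

def two_weight(n, target):
    """Fewest quadratic characters summing to target, found by BFS from zero."""
    points, characters = quadratic_characters(n)
    goal = tuple(target(x) % 3 for x in points)
    start = (0,) * len(points)
    depth = {start: 0}
    queue = deque([start])
    while queue:
        f = queue.popleft()
        if f == goal:
            return depth[f]
        for c in characters:
            g = tuple((a + b) % 3 for a, b in zip(f, c))
            if g not in depth:
                depth[g] = depth[f] + 1
                queue.append(g)
    return None   # unreachable target

print(two_weight(2, lambda x: int(all(x))))   # AND of two variables
```

Breadth-first search visits functions in order of the number of characters added, so the depth at which the target is first reached is its 2-weight under the stated assumptions.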
7 Experiments and Results
In order to make progress on the Constant Degree Hypothesis, we decided to investigate functions from to , and therefore, we hope to find that the 2-weight of . Unfortunately, we could not implement the same graph search that Sindelar used to observe the 2-weight of , since exhaustively generating all functions of a given 2-weight over 6 variables quickly becomes intractable as the 2-weight increases. So instead, we began by randomly sampling functions of small 2-weight (2, 3, 4, 6, …) in order to get a feel for their support distributions.
7.1 Random Sampling Functions from to
Our first approach was to randomly generate functions of small 2-weight, in order to get a rough sketch of their support distributions. Moreover, we believed this sampling would serve as corroborative evidence for the Constant Degree Hypothesis, in that we did not expect to see any functions of 2-weight less than 8 with support 1. Below are the support distributions of one hundred thousand randomly sampled functions for each of several 2-weights.
Table 3: Supports of Randomly Sampled Functions of 2-weight 2
support:      16    24    28    32    36    40    48
functions:    65  7084 21111 43488 21175  7014    63
Table 4: Supports of Randomly Sampled Functions of 2-weight 3
support:      32    34    36    38    40    42    44    46    48    50    52    54    56    58    60
functions:     7    10    93   503  1934  5077 11550 18467 23198 19784 12584  5219  1356   205    13
Table 5: Supports of Randomly Sampled Functions of 2-weight 4
support:      20    23    24    25    26    27    28    29    30    31    32
functions:     1     1     1     6     6    24    99   188   405   764  1397
support:      33    34    35    36    37    38    39    40    41    42    43    44
functions:  2139  2979  4242  5737  7430  8995  9910 10460 10265  8937  7488  6107
support:      45    46    47    48    49    50    51    52    52    54    55    57
functions:  4714  3191  2071  1209   687   316   134    55    16    23     2     1
Table 6: Supports of Randomly Sampled Functions of 2-weight 6
support:      25    27    28    29    30    31    32    33    34    35
functions:     3     4    24    33    81   203   383   720  1162  1971
support:      36    37    38    39    40    41    42    43    44    45    46
functions:  2925  4406  5754  7459  8986 10214 10400 10172  9225  7904  6067
support:      47    48    49    50    51    52    53    54    55    56    57
functions:  4603  3148  1897  1202   574   306   104    50    16     3     1
From these tables, we can see that finding 2-weight 2, 3, 4, or 6 functions of very small or very large support is relatively rare. Now, since the function has support 1 for all (when = 1 for all ), we might not randomly generate the in a feasible amount of time, if it were to have such a 2-weight. This lends support to the belief that the function may not be computable by a function of the above sampled weights.
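The sampling procedure of this subsection can be sketched as follows. This is a minimal reconstruction under stated assumptions, not the code used to produce the tables: each quadratic form is drawn by including every monomial of degree at most 2 independently with probability 1/2, characters are (-1)^q(x) read in Z_3, and a sample is the pointwise sum mod 3 of `weight` such characters. A sum of this kind has 2-weight at most `weight`, not necessarily exactly `weight`. The function name, its parameters and the fixed seed are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations, product

def sample_support_distribution(n, weight, samples, seed=0):
    """Histogram of supports of random sums of `weight` quadratic characters."""
    rng = random.Random(seed)
    points = list(product((0, 1), repeat=n))
    monomials = [()] + [(i,) for i in range(n)] + list(combinations(range(n), 2))
    # Value of each monomial at each point, computed once.
    mono_values = [[int(all(x[i] for i in m)) for x in points] for m in monomials]

    def random_character():
        bits = [0] * len(points)
        for vals in mono_values:
            if rng.random() < 0.5:              # include this monomial in q
                bits = [b ^ v for b, v in zip(bits, vals)]
        return [1 if b == 0 else 2 for b in bits]   # (-1)^q(x) in Z_3

    histogram = Counter()
    for _ in range(samples):
        total = [0] * len(points)
        for _ in range(weight):
            total = [(t + c) % 3 for t, c in zip(total, random_character())]
        histogram[sum(1 for t in total if t)] += 1
    return histogram

hist = sample_support_distribution(6, 2, 300)
print(sorted(hist.items()))
```

For `weight` equal to 2 on six variables, the sum is non-zero exactly where the two forms agree, so the supports that can occur are the zero counts of a single quadratic form: 16, 24, 28, 32, 36, 40 and 48, as in Table 3, together with the degenerate values 0 and 64 when the two forms differ by a constant.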
7.2 Characterizing Functions of 2-weight 3
Now, it is computationally impractical to exhaustively generate all functions of 2-weight 3, as there are some functions of this kind. Therefore, creating a graph with this many nodes in it and running breadth-first search on it is intractable.
To mitigate this, we take advantage of the fact that any 2-weight 3 function, , for quadratic forms , , and , may be written as , for different quadratic forms and . Clearly, and have the same support, as it is the same function. By change of basis, we say that has the same support as the function , where is in Witt Normal Form. In our case, with = 6, we know that must be one of