# Detecting Patterns Can Be Hard: Circuit Lower Bounds for the String Matching Problem

Detecting patterns in strings and images is a fundamental and well-studied problem. We study the circuit complexity of the string matching problem under two popular choices of gates: De Morgan and threshold gates. For strings of length n and patterns of length log n ≪ k ≤ n − o(n), we prove super-polynomial lower bounds for De Morgan circuits of depth 2, and nearly linear lower bounds for depth 2 threshold circuits. For unbounded depth and k ≥ 2, we prove a linear lower bound for (unbounded fan-in) De Morgan circuits. For certain values of k, we prove an Ω(√n / log n) lower bound for general (no depth restriction) threshold circuits. Our proof for threshold circuits builds on a curious connection between detecting patterns and evaluating Boolean functions when the truth table of the function is given explicitly. Finally, we provide upper bounds on the size of circuits that solve the string matching problem.

## 1 Introduction

One of the most basic and frequently encountered problems by minds and machines is that of detecting patterns in perceptual inputs. A classical example of such a pattern recognition problem is that of string matching: deciding whether a given pattern is contained in a given string. The one dimensional setting readily extends to the 2D case where one is looking for a 2D pattern in a 2D image (e.g., [ABF94]). String-matching problems arise in a host of applications such as speech recognition, text processing, web search and computational biology [CHL14], and hence it is of interest to devise efficient computational devices for such tasks. It is no wonder that efficient algorithms for string matching have been studied both with respect to traditional exact algorithms [BM77, KMP77, GS83, NR02] as well as more recent algorithmic frameworks such as streaming [PP09], sketching [BYJKK04] and property testing [BEKR17].

We focus mostly on the one-dimensional case, where strings and patterns are over an alphabet $\Sigma$. Our results generalize without much difficulty to the case of 2D patterns and images (as an illustration, we demonstrate this for our learning problem in Section 6.2). We formalize our problem of study in the following definition.

###### Definition 1.1.

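In the string matching problem, the input is a string $x = (x_1, \dots, x_n)$ of $n$ symbols from $\Sigma$ and a pattern $y = (y_1, \dots, y_k)$ of $k$ symbols from $\Sigma$. The goal is to compute the string matching function $\mathrm{SM}_{n,k}(x; y)$, which is defined as $\mathrm{SM}_{n,k}(x; y) = 1$ if and only if the pattern $y$ occurs as a substring in the string $x$, i.e., for some $i \in \{0, \dots, n-k\}$ it holds that $x_{i+j} = y_j$ for all $j \in \{1, \dots, k\}$.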
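As a point of reference for the definition above, here is a minimal brute-force sketch of the string matching function in Python. The name `string_match` and the use of Python strings for $x$ and $y$ are illustrative assumptions, not notation from the paper.

```python
def string_match(x: str, y: str) -> int:
    """Return 1 if the pattern y occurs as a contiguous substring of x, else 0."""
    n, k = len(x), len(y)
    # Try every shift i = 0, ..., n - k and compare the window with the pattern.
    for i in range(n - k + 1):
        if all(x[i + j] == y[j] for j in range(k)):
            return 1
    return 0


print(string_match("0110", "11"))  # the pattern occurs at shift 1
print(string_match("0101", "11"))  # the pattern does not occur
```

This direct check takes time proportional to $n \cdot k$ in the worst case; it is meant only to pin down the function being computed, not to compete with the linear-time algorithms cited in the text.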

It has been known for more than four decades that the string matching problem can be solved in linear time [BM77, KMP77, GS83]. In this work we focus on nonuniform and learning algorithms for string matching problems, which appear to have received significantly less attention. In the nonuniform setting we seek Boolean circuits that compute $\mathrm{SM}_{n,k}$ (we consider Boolean strings in this case, i.e., $\Sigma = \{0,1\}$). In the learning setting, we assume that a pattern of length at most $k$ is fixed and seek to (PAC) learn the corresponding string matching function given i.i.d. samples (we also consider the agnostic setting). For our learning problem we focus on patterns of size at most $k$ (as opposed to size exactly $k$) since this choice appears more natural with respect to text classification.

Our interest in nonuniform and learning algorithms for string matching stems from several sources. Specialized hardware can be used in computer vision and machine learning applications for finding features and patterns in parallel, and hence it is desirable to come up with small circuits achieving this goal. Furthermore, there is a considerable interest in parallel implementations of string matching algorithms [BCDadH97, BG92, Gal85, CCG93], and studying circuit complexity (especially at low-depth) can be instrumental towards understanding parallelization questions.

Features and patterns can be useful in classifying texts (e.g., classifying a written document based on an occurrence of a word in it) which motivates studying classifiers which hinge on the existence of a pattern in a string. Using patterns and shapes to classify 2D arrays of bits (images) in the PAC learning framework (i.e. visual concepts [KR96, Shv90]) has received attention as well. While our classifier is too simple to be used in practical methods of text processing (which rely on multiple patterns and on the number of occurrences of these in texts [Seb02]), our hope is that our preliminary analysis may be a first step towards understanding more sophisticated classifiers that are used in NLP applications.

We begin by studying the circuit complexity of the string matching problem, considering two popular circuit families: threshold circuits and De Morgan circuits (i.e., circuits with gates computing AND, OR and NOT). Recall that a linear threshold function (LTF) parameterized by weights $w \in \mathbb{R}^n$ and a threshold $\theta \in \mathbb{R}$ is a Boolean function over $n$ Boolean variables which on input $x \in \{0,1\}^n$ outputs $1$ if and only if $\sum_{i=1}^{n} w_i x_i \ge \theta$. A threshold circuit is a Boolean circuit whose gates compute LTFs. The similarity between LTFs and artificial neurons has resulted in a large body of research aimed at applying findings and methods from circuit complexity to understanding neural networks [HMP93, PS88, Par94, MCPZ13, Mur71]. In particular, several works have studied how to efficiently implement basic arithmetic functions such as addition and multiplication using threshold circuits of small depth [SBKH93, Raz92, SB91]. This motivates the examination of the effectiveness of threshold circuits for other basic algorithmic tasks such as string matching.

Our (standard) circuit complexity measure is the number of gates (excluding input gates) needed to implement the pattern matching function. We study circuits with unbounded fanin, where every gate can be connected to an unbounded number of other gates. Hence, in contrast to the case of bounded fanin where linear lower bounds are trivial, establishing nonconstant lower bounds is not immediate in the unbounded fanin case. For example, the AND of $n$ inputs can be implemented with a single gate in the unbounded fanin De Morgan case, whereas it requires $\Omega(n)$ gates for bounded fanin circuits. As another example, it is easy to implement the equality function (a special case of $\mathrm{SM}_{n,k}$ where $n = k$), which checks whether two $n$-bit strings are equal, using three threshold gates. There are few explicit examples of functions that are known to have nonconstant lower bounds (for threshold circuits), and proving such lower bounds is considered to be challenging [ROS94]. Indeed, it is mentioned in [Juk12] with respect to threshold circuits that “even proving non constant lower bounds is a nontrivial task”. We first illustrate that the string matching function admits a nearly linear (in $n$) implementation at low depth (excluding depth 2 De Morgan circuits). Thereafter our main focus is on fine-grained complexity, seeking to establish lower bounds that are as large as possible.
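The remark about equality and three threshold gates can be illustrated with a small sketch. The text does not spell out a construction, so the one below is an assumption: it uses a standard idea with exponentially large weights, reading the two strings as integers $X$ and $Y$ and testing $X \ge Y$ and $Y \ge X$. The helper names `ltf` and `equality_three_gates` are hypothetical.

```python
from itertools import product


def ltf(weights, theta, inputs):
    """Linear threshold gate: outputs 1 iff sum_i weights[i] * inputs[i] >= theta."""
    return int(sum(w * v for w, v in zip(weights, inputs)) >= theta)


def equality_three_gates(x, y):
    """Decide x == y for equal-length bit tuples using three threshold gates."""
    n = len(x)
    # Weights 2^i encode x and y as integers X and Y; both gates read x followed by y.
    w_ge = [2 ** i for i in range(n)] + [-(2 ** i) for i in range(n)]
    w_le = [-w for w in w_ge]
    g1 = ltf(w_ge, 0, x + y)          # gate 1: X - Y >= 0
    g2 = ltf(w_le, 0, x + y)          # gate 2: Y - X >= 0
    return ltf([1, 1], 2, (g1, g2))   # gate 3: AND of the two gates


# Exhaustive check against direct comparison for 4-bit strings.
assert all(equality_three_gates(x, y) == int(x == y)
           for x in product((0, 1), repeat=4)
           for y in product((0, 1), repeat=4))
```

The weights here grow exponentially in $n$, which is consistent with the text's later remark that constant-size threshold circuits for equality rely on unrestricted weights.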

We proceed by studying the complexity of the learning problem where we seek to learn an unknown pattern that classifies strings according to the string matching function. Here our main interest is in the sample complexity of the problem, namely how many i.i.d. samples are needed in order to find a pattern which with high confidence has “close” classification behavior to that of the unknown pattern. We obtain nearly tight bounds on the sample complexity of this learning problem by analyzing the VC dimension of the set of all “string matching functions”. We also provide an efficient algorithm that solves the learning problem once provided with a sufficient number of i.i.d. samples.

### 1.1 Our results

#### Nonuniform algorithms

Recall that in the nonuniform setting, we study Boolean circuits, and thus consider the Boolean string matching problem where $\Sigma = \{0,1\}$. For certain values of $k$, it is easy to establish tight asymptotic bounds on the circuit complexity of $\mathrm{SM}_{n,k}$. For example, for $k = 1$, the function $\mathrm{SM}_{n,k}$ can be computed by a circuit of constant size and depth (obviously, such a circuit has unbounded fan-in), as we only need to check whether the input text contains a one or a zero:

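$$\mathrm{SM}_{n,k}(x_1, \dots, x_n; y_1) = \Big(\bigvee_{i=1}^{n} x_i \wedge y_1\Big) \vee \Big(\bigvee_{i=1}^{n} \overline{x_i} \wedge \overline{y_1}\Big).$$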
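The single-symbol formula can be checked exhaustively against the definition for a small text length. The sketch below assumes bits are represented as Python integers 0 and 1; the function names are illustrative.

```python
from itertools import product


def sm_k1_formula(x, y1):
    """Evaluate (OR_i x_i AND y1) OR (OR_i NOT x_i AND NOT y1) on bits."""
    has_one = any(x)                    # OR of the text bits
    has_zero = any(1 - b for b in x)    # OR of the negated text bits
    return int((has_one and y1 == 1) or (has_zero and y1 == 0))


def sm_k1_direct(x, y1):
    """Direct definition for a one-symbol pattern: does y1 occur in x?"""
    return int(y1 in x)


# Compare the formula with the definition on all texts of length 5.
assert all(sm_k1_formula(x, y1) == sm_k1_direct(x, y1)
           for x in product((0, 1), repeat=5) for y1 in (0, 1))
```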

Here we will mostly assume that .

In order to obtain more general results, we prove lower and upper bounds for (almost) all regimes of $k$ (as a function of $n$). Throughout the paper, $\log$ denotes the logarithm of base $2$. For an integer $n$, $[n]$ denotes the set $\{1, \dots, n\}$.

Depth-1 De Morgan and threshold circuits compute functions monotone in each of their inputs. Since $\mathrm{SM}_{n,k}$ is not monotone in its inputs, it cannot be computed by a depth-1 circuit. (Indeed, $\mathrm{SM}_{n,k}$ is neither monotone non-increasing nor monotone non-decreasing in some of its inputs.)
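The non-monotonicity claim can be confirmed by brute force for small parameters. The sketch below searches, for a chosen input variable, for one input where raising that variable from 0 to 1 raises the output and another where it lowers the output. The variable indexing (text first, then pattern) and the function names are assumptions made for illustration.

```python
from itertools import product


def sm(x, y):
    """String matching on bit tuples: 1 iff y is a contiguous substring of x."""
    k = len(y)
    return int(any(x[i:i + k] == y for i in range(len(x) - k + 1)))


def monotonicity_witnesses(n, k, var):
    """Return (up, down): inputs where setting variable var to 1 raises / lowers SM.

    Variables 0..n-1 index the text and n..n+k-1 index the pattern.
    """
    up = down = None
    for bits in product((0, 1), repeat=n + k):
        if bits[var] != 0:
            continue
        flipped = bits[:var] + (1,) + bits[var + 1:]
        before = sm(bits[:n], bits[n:])
        after = sm(flipped[:n], flipped[n:])
        if before < after and up is None:
            up = bits
        if before > after and down is None:
            down = bits
    return up, down


for var in range(5):
    print(var, monotonicity_witnesses(3, 2, var))
```

For $n = 3$ and $k = 2$ every one of the five variables has both kinds of witness, so the function is not monotone in any single input in either direction.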

Below we state our results for different circuit families of depth greater than $1$. We start with De Morgan circuits.

###### Theorem 1.2 (De Morgan circuits of bounded depth).

Let be parameters for the problem.

• Depth 2 upper bound: There exists a De Morgan circuit of depth 2 and size $O(n \cdot 2^k)$ computing $\mathrm{SM}_{n,k}$.

• Depth 3 upper bound: There exists a De Morgan circuit of depth 3 and size $O(nk)$ computing $\mathrm{SM}_{n,k}$.

• Depth 2 lower bound: Any De Morgan circuit of depth 2 computing $\mathrm{SM}_{n,k}$ must be of size at least

$\Omega(2^{2k})$ if $k \le \log n + 1$; if $\log n + 1 \le k \le \sqrt{n}$; if $k \ge \sqrt{n}$.

Note that for the regime of our results for depth 2 are tight up to a multiplicative constant. We also note that this theorem gives super-polynomial lower bounds on the size of De Morgan circuits of depth 2 for .

Next, we prove that the circuit complexity of $\mathrm{SM}_{n,k}$ for De Morgan circuits with no restrictions on the depth and fan-in must be at least linear in $n$.

###### Theorem 1.3 (De Morgan circuits with no depth restrictions).

Let be parameters for the problem.

• Upper bound: There exists a De Morgan circuit of size and depth computing .

• Lower bound: Any De Morgan circuit computing must be of size at least .

In fact, a similar argument proves a slightly stronger lower bound of for . Since in this paper we are focused on asymptotic results for all regimes of , we omit this proof.

Turning to threshold circuits we prove the following theorem.

###### Theorem 1.4 (Threshold circuits computing $\mathrm{SM}_{n,k}$).

Let be parameters for the problem.

• Upper bound: There exists a threshold circuit of depth 2 and size computing .

• Lower bound for unbounded depth: Suppose . Then any threshold circuit computing must be of size at least .

• Lower bound for depth 2: Suppose . Then any threshold circuit of depth 2 computing must be of size at least .

In particular, for certain values of any threshold circuit computing has size . If we restrict our attention to depth– threshold circuits, then for certain ’s we have a nearly linear (e.g., ) lower bound for depth– threshold circuits computing . We stress that there are no restrictions on the weights of the threshold gates in these lower bounds. Our proof also implies (weaker) lower bounds for (see subsection 5.2 for details).

For the proof techniques, we start our study of the complexity of $\mathrm{SM}_{n,k}$ with De Morgan circuits of depth 2. We first prove a lower bound on the size of minterms and maxterms of $\mathrm{SM}_{n,k}$ (see Section 3 for precise definitions), then prove an estimate on the number of zeros and ones of $\mathrm{SM}_{n,k}$, and then use them to derive lower bounds on the size of De Morgan circuits. In Section 5, we prove lower bounds for threshold circuits by reducing the problem of computing a “sparse hard” function to computing $\mathrm{SM}_{n,k}$. Perhaps surprisingly, we show that the string matching problem can encode a truth table of an arbitrary sparse Boolean function (by sparse we mean a function with few preimages of $1$). We also observe that our lower bounds apply to 2-dimensional pattern matching problems without much difficulty (we omit the details).

All of our upper bounds for circuits computing $\mathrm{SM}_{n,k}$ are straightforward and may have been discovered before. We include them here since, other than [Gal85], we have not been able to find a source indicating circuit upper bounds for $\mathrm{SM}_{n,k}$.

#### Learning algorithms: Tight bounds on VC dimension

We seek to understand the sample complexity of PAC-learning the string matching function, where the text is an arbitrary string of length $n$ and the pattern is a fixed string of length at most $k$. Towards this aim we prove (almost) tight bounds on the VC dimension of the class of these functions. The VC dimension essentially determines the sample complexity needed to learn the pattern from a set of i.i.d. samples in the PAC learning framework. We formalize these notions below.

Let $\Sigma$ be a fixed finite alphabet of size $|\Sigma|$. By $\Sigma^n$ we denote the set of strings over $\Sigma$ of length $n$, and by $\Sigma^{\le k}$ we denote the set of strings of length at most $k$. We study the VC dimension of the class of functions, where each function is identified with a pattern of length at most $k$, and outputs $1$ only on the strings containing this pattern. Recall that the length $k$ of the pattern can be a function of $n$. We now define the set of functions we wish to learn:

###### Definition 1.5.

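For a fixed finite alphabet $\Sigma$ and an integer $k$, let us define the class of Boolean functions $\mathcal{H}_{k,\Sigma}$ over $\Sigma^n$ as follows. Every function $h_\sigma \in \mathcal{H}_{k,\Sigma}$ is parameterized by a pattern $\sigma \in \Sigma^{\le k}$ of length at most $k$. Hence, $\mathcal{H}_{k,\Sigma} = \{h_\sigma : \sigma \in \Sigma^{\le k}\}$. For a string $x \in \Sigma^n$, $h_\sigma(x) = 1$ if and only if $x$ contains $\sigma$ as a substring.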

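We now formalize what it means to PAC-learn $\mathcal{H}_{k,\Sigma}$. Let $D$ be a distribution over $\Sigma^n$. Suppose we are trying to learn $h_\sigma$ for $\sigma \in \Sigma^{\le k}$. Given $\tau \in \Sigma^{\le k}$, the loss of $h_\tau$ with respect to $h_\sigma$ is defined as

$$L_{D,\sigma}(\tau) = \Pr_{x \sim D}[h_\tau(x) \ne h_\sigma(x)].$$

Following the notion of PAC-learning [Val84, SSBD14], we can now define what we mean by learning $\mathcal{H}_{k,\Sigma}$.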

###### Definition 1.6.

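An algorithm $A$ is said to PAC-learn $\mathcal{H}_{k,\Sigma}$ if for every distribution $D$ over $\Sigma^n$ and every $\sigma \in \Sigma^{\le k}$, for all $\varepsilon, \delta \in (0,1)$ the following holds. Given $m = m(\varepsilon, \delta)$ i.i.d. samples $(x_1, h_\sigma(x_1)), \dots, (x_m, h_\sigma(x_m))$ where each $x_i$ is sampled according to the distribution $D$, $A$ returns with probability at least $1 - \delta$ a function $h_\tau \in \mathcal{H}_{k,\Sigma}$ such that $L_{D,\sigma}(\tau) \le \varepsilon$. Here the probability is taken with respect to the i.i.d. samples as well as the possible random choices made by the algorithm $A$.

Throughout, we refer to $\delta$ as the confidence parameter and $\varepsilon$ as the accuracy parameter.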
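To make the realizable setting concrete, the following is a minimal sketch of a learner that returns some pattern consistent with the labeled samples. It is not the algorithm of the paper (which is not reproduced in this section); it rests on the observation that the target pattern must be a substring of every positively labeled string, so only substrings of one positive example need to be tried. The function name, the restriction to non-empty patterns, and the choice to give up when no positive example is present are all assumptions of this sketch.

```python
def consistent_pattern(samples, k):
    """Return a pattern of length 1..k consistent with all labeled samples, or None.

    samples is a list of (string, label) pairs, with label 1 iff the unknown
    pattern occurs in the string (the realizable case).
    """
    positives = [x for x, label in samples if label == 1]
    negatives = [x for x, label in samples if label == 0]
    if not positives:
        return None  # this sketch does not handle samples without a positive example
    # The target occurs in every positive string, so it is a substring of the first one.
    first = positives[0]
    candidates = {first[i:i + m]
                  for m in range(1, k + 1)
                  for i in range(len(first) - m + 1)}
    for tau in sorted(candidates, key=lambda s: (len(s), s)):
        in_all_positives = all(tau in x for x in positives)
        in_some_negative = any(tau in x for x in negatives)
        if in_all_positives and not in_some_negative:
            return tau
    return None


samples = [("0110", 1), ("1100", 1), ("0101", 0), ("0000", 0)]
print(consistent_pattern(samples, 2))
```

There are at most $n \cdot k$ candidate substrings, and each is checked against every sample, so the search runs in time polynomial in $n$, $k$ and the number of samples.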

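In Definition 1.6 we consider the realizable case. Namely, there exists $h_\sigma \in \mathcal{H}_{k,\Sigma}$ that we want to learn. One can also consider the agnostic case. Consider a distribution $D$ over $\Sigma^n \times \{0,1\}$. We now define the loss of $h_\tau$ as

$$L_D(\tau) = \Pr_{(x,y) \sim D}[h_\tau(x) \ne y],$$

namely the measure under $D$ of all pairs $(x, y)$ with $h_\tau(x) \ne y$ [SSBD14]. In the agnostic case we wish to find, given i.i.d. samples $(x_1, y_1), \dots, (x_m, y_m)$ drawn from $D$, a pattern $\tau$ such that $L_D(\tau) \le \min_{\sigma} L_D(\sigma) + \varepsilon$ (where the minimum is taken over all $\sigma \in \Sigma^{\le k}$). Thus agnostically PAC-learning generalizes the realizable case where $\min_{\sigma} L_D(\sigma) = 0$.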

To analyze the sample complexity required to learn a function from we first define VC dimension.

###### Definition 1.7.

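Let $\mathcal{F}$ be a class of functions from a set $X$ to $\{0,1\}$, and let $S \subseteq X$. A dichotomy of $S$ is one of the possible labelings of the points of $S$ using a function from $\mathcal{F}$. $S$ is shattered by $\mathcal{F}$ if $\mathcal{F}$ realizes all $2^{|S|}$ dichotomies of $S$. The VC dimension of $\mathcal{F}$, $\mathrm{VC}(\mathcal{F})$, is the size of the largest set $S$ shattered by $\mathcal{F}$.

In particular, $\mathrm{VC}(\mathcal{H}_{k,\Sigma}) \ge d$ if and only if there is a set $S$ of $d$ strings of length $n$ such that for every $T \subseteq S$, there exists a pattern of length at most $k$ occurring in all the strings in $T$ and not occurring in any of the strings in $S \setminus T$.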
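The shattering condition can be tested directly for tiny instances by enumerating all patterns. The sketch below assumes a binary alphabet by default and considers only non-empty patterns; both choices, and the function names, are assumptions made for illustration.

```python
from itertools import product


def patterns_up_to(k, alphabet="01"):
    """All non-empty patterns of length at most k over the alphabet."""
    return ["".join(p) for m in range(1, k + 1) for p in product(alphabet, repeat=m)]


def is_shattered(strings, k, alphabet="01"):
    """Check whether patterns of length 1..k realize all 2^|S| labelings of strings."""
    labelings = {tuple(int(p in s) for s in strings)
                 for p in patterns_up_to(k, alphabet)}
    return len(labelings) == 2 ** len(strings)


# Two strings of length 4: with k = 2 the patterns "0", "1", "00", "11" give
# the four labelings (1,1), (0,1), (1,0), (0,0); with k = 1 only two appear.
print(is_shattered(["0000", "0101"], 2))
print(is_shattered(["0000", "0101"], 1))
```

The enumeration is exponential in $k$, so this is only useful as a sanity check of the definition on very small examples.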

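A class of functions of VC dimension $d$ is PAC-learnable with accuracy $\varepsilon$ and confidence $\delta$ in $\Theta\big((d + \log(1/\delta))/\varepsilon\big)$ samples [BEHW89, EHKV89, Han16], and is agnostic PAC-learnable in $\Theta\big((d + \log(1/\delta))/\varepsilon^2\big)$ samples [AB09, SSBD14]. Thus, tight bounds on the VC dimension of a class of functions give tight bounds on its sample complexity.

Our main result is a tight bound on the VC dimension of $\mathcal{H}_{k,\Sigma}$ (up to low order terms). That is: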

###### Theorem 1.8.

Let $\Sigma$ be a finite alphabet of size $|\Sigma|$, then

$$\mathrm{VC}(\mathcal{H}_{k,\Sigma}) = \min\big(\log|\Sigma| \cdot (k - O(\log k)),\ \log n + O(\log\log n)\big).$$

It follows that the sample complexity of learning patterns is . We also show that there are efficient polynomial time algorithms solving this learning problem. See Corollary 6.5 for details.

We prove our upper bound on the VC dimension by a double counting argument. This argument uses Sperner families to show that shattering implies a “large” family of non-overlapping patterns, which, on the other hand, is constrained by the length of the strings that we shatter. The lower bound is materialized by the idea to have patterns and strings such that the th string is a concatenation of all patterns with the binary expansion of their index having the th bit equal . We construct a family of patterns with the property that for any pair of distinct strings , their concatenation does not contain a string . Using this family (with some additional technical requirements) we are able to show that shatters a set of strings implying our lower bound on the VC dimension.

## 2 Related work

#### Nonuniform algorithms

We are not aware of previous works regarding circuit lower bounds for string matching (neither for De Morgan nor for threshold circuits).

As noted, proving super constant lower bounds for the size of threshold circuits computing explicit functions is nontrivial. Recall that the inner product function (a.k.a. IP2) over the field with two elements receives $x, y \in \{0,1\}^n$ and computes their inner product $\sum_{i=1}^{n} x_i y_i \bmod 2$. Using gate elimination, it was proven in [GT93] that any threshold circuit (with no depth restrictions) computing IP2 requires size . A communication complexity approach was used in [ROS94] to prove nearly linear lower bounds for Boolean functions such as equality (deciding whether two strings of length $n$ are identical). These authors considered threshold functions with polynomial weights (for unrestricted weights the equality function has a threshold circuit of constant size). Nisan [Nis93] proved that a Boolean function with two-way communication complexity (for randomized protocols) cannot be computed by unrestricted threshold circuits of size . It follows that several well studied functions such as Disjointness and Generalized Inner Product have lower bounds on the size of threshold circuits computing them. However, neither the gate elimination method nor the communication complexity approach seems to give nonconstant bounds for threshold circuits for the string matching problem. In particular, the methods that build on discrepancy bounds (e.g., Lindsey's Lemma [Juk12]) utterly fail for the string matching problem ($\mathrm{SM}_{n,k}$ has very large monochromatic rectangles).

Stronger lower bounds are known for threshold circuit of bounded depth. In [HMP93], an exponential (in ) lower bound was given for the inner product function over the field with two elements (IP2) for depth threshold circuits with polynomial (in ) weights. These results have been strengthened by [FKL01] where it was shown that the exponential lower bounds for IP2 hold for depth 2 threshold circuits even if the weights of the middle (hidden) layer are unbounded. On the positive side, several arithmetic functions such as division and multiplication were shown to admit efficient implementations using threshold circuits of constant depth [SBKH93, SB91]. Superlinear lower bounds on the number of gates of arbitrary depth threshold circuits as well as depth 3 threshold circuits (with polynomial weights on the top layer) were proven recently by Kane and Williams  [KW16]. We remark that these techniques do not seem to apply to the function. Informally, [KW16] shows that a random restriction decreases the size of a depth threshold circuit by a factor of with high probability. If we apply this result to , then in order to beat the lower bound of Theorem 1.4, we need to have . For such a large value of , any (small) partition of the input coordinates gives the constant function after a random restriction, which does not lead to a lower bound. Another way to apply the random restriction lemma of [KW16] would be to fix “a hard text” of length (which encodes a hard function), and then apply a random restriction to the remaining inputs. Since we have fixed out of inputs, the resulting lower bound is too small to improve upon the lower bounds of Theorem 1.4.

For De Morgan circuits, Håstad’s celebrated switching lemma [Hås87] established exponential lower bounds for bounded depth circuits computing explicit functions (e.g., majority, parity). We note that in contrast to the parity function, the string matching function admits a polynomial size circuit of depth 3. It is unclear (to us) how to leverage known tools for proving lower bounds for small depth circuits (such as the switching lemma) towards proving super linear lower bounds for small depth De Morgan circuits computing $\mathrm{SM}_{n,k}$.

Upper bounds on the circuit complexity of the 2D image matching problem under projective transformations were studied in [Ros16]. In this problem, which is considerably more complicated than the pattern matching problems we study, the goal is to find a projective transformation under which one image “resembles” another (we refer to [Ros16] for the precise definition of distance used there for two images). Here, images are 2D square arrays of dimension $n \times n$ containing discrete values (colors). In particular, it is proven that this image matching problem is in $\mathsf{TC}^1$ (it admits a threshold circuit of polynomial size and logarithmic depth in $n$). These results concern a different problem than the string matching considered here and do not seem to imply the upper bounds we obtain for circuits solving the string matching problem.

The idea to lower bound the circuit complexity of Boolean functions that arise in feature detection was studied in [LM01, LM02]. In [LM01, LM02] it is assumed that there are two types of features and and detectors corresponding to the two types of features are situated on a 1D or 2D grid. The binary outputs of these features are represented by an array of positions: (where if the feature is detected in position , and otherwise) and an array which is analogously defined with respect to . The Boolean function outputs if there exist with such that , and otherwise. This function is advocated in [LM02] as a simple example of a detection problem in vision that requires to identify spatial relationship among features. It is shown that this problem can be solved by threshold gates. A 2-dimensional analogue where the indices and represent two-dimensional coordinates and one is interested whether there exist indices and such that and is above and to the right of the location is studied in [LM02]. Recently, the two-dimensional version was studied in [UYZ15] where a -gate threshold implementation was given along with a lower bound of for the size of any threshold circuit for this problem. We remark that the problem studied in [LM01, LM02, UYZ15] is different from ours, and different proof ideas are needed for establishing lower bounds in our setting.

There are several algorithms that solve the string matching problem in linear time [BM77, KMP77]. There has also been significant interest in obtaining parallel algorithms for string matching. In [Gal85], it is mentioned (without a proof) that the parallel algorithm described there implies an De Morgan circuit of depth . It is also noted in [Gal85] that the classical Boyer-Moore and Knuth-Morris-Pratt algorithms do not seem to parallelize, thus implementing these algorithms by small depth circuits of size may be difficult or impossible.

#### Learning patterns

The language of all strings (of arbitrary length) containing a fixed pattern is regular and can be recognized by a finite automaton. There is a large literature on learning finite automata (e.g., [Ang87, FKR97, RR97]). This literature is mostly concerned with various active learning models and it does not imply our bounds on the sample complexity of learning $\mathcal{H}_{k,\Sigma}$.

Motivated by computer vision applications, several works have considered the notion of visual concepts: namely a set of shapes that can be used to classify images in the PAC-learning framework [KR96, Shv90]. Their main idea is that occurrences of shapes (such as lines, squares etc.) in images can be used to classify images and that furthermore the representational class of DNFs can represent occurrences of shapes in images. For example, it is easy to represent the occurrence of a fixed pattern of length $k$ in a string of size $n$ as a DNF with $n-k+1$ clauses (see e.g., Theorem 3.1). We note that these works do not study the VC dimension of our pattern matching problems (or VC bounds in general). We also observe that no polynomial algorithm is known for learning DNFs and that there is some evidence that the problem of learning DNF is intractable [DSS16]. Hence the results in [KR96, Shv90] do not imply that our pattern learning problem (represented as a DNF) can be done in polynomial time.

## 3 Bounded depth circuits

In this section we prove Theorem 1.2.

### 3.1 Constructing bounded depth De Morgan circuits for $\mathrm{SM}_{n,k}$

In this section we prove the upper bounds (i.e., the first two items) of Theorem 1.2.

###### Lemma 3.1.

For any $n$ and $k$ there exists a De Morgan circuit of depth 2 and size $O(n \cdot 2^k)$ computing $\mathrm{SM}_{n,k}$.

###### Proof.

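First we note that equality of two $k$-bit strings can be implemented using a DNF of width $2k$ and size (number of clauses) $2^k$. Indeed, denoting the two inputs by $z = (z_1, \dots, z_k)$ and $w = (w_1, \dots, w_k)$, let

$$\mathrm{EQ}(z_1, \dots, z_k; w_1, \dots, w_k) = \bigvee_{a = (a_1, \dots, a_k) \in \{0,1\}^k} \Big( \bigwedge_{i=1}^{k} (z_i = a_i) \wedge \bigwedge_{i=1}^{k} (w_i = a_i) \Big),$$

where $(z_i = a_i)$ is equal to $z_i$ if $a_i = 1$, and to $\neg z_i$ otherwise.

For each $i \in \{0, \dots, n-k\}$ let $\mathrm{EQ}_i$ be the DNF that outputs 1 if and only if $(x_{i+1}, \dots, x_{i+k}) = (y_1, \dots, y_k)$. Taking $\bigvee_{i=0}^{n-k} \mathrm{EQ}_i$ we obtain a circuit of depth 3 that computes the $\mathrm{SM}_{n,k}$ function. In order to turn it into a depth-2 circuit, note that the second and the third layers consist of OR gates, and hence can be collapsed to one layer. This way we get a depth-2 circuit of size $O(n \cdot 2^k)$. ∎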
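The collapsed DNF from this proof can be written out explicitly for small parameters. In the sketch below each clause is represented by a pair (shift, candidate window value), which is an encoding chosen here for illustration; the clause count $(n-k+1) \cdot 2^k$ and agreement with the definition are checked by brute force.

```python
from itertools import product


def sm(x, y):
    """String matching on bit tuples: 1 iff y is a contiguous substring of x."""
    k = len(y)
    return int(any(x[i:i + k] == y for i in range(len(x) - k + 1)))


def sm_dnf_clauses(n, k):
    """One clause per shift i and per value a in {0,1}^k.

    The clause (i, a) is the AND of 2k literals asserting x[i:i+k] == a and y == a.
    """
    return [(i, a) for i in range(n - k + 1) for a in product((0, 1), repeat=k)]


def eval_dnf(clauses, x, y):
    """Top OR gate over all clauses."""
    k = len(y)
    return int(any(x[i:i + k] == a and y == a for i, a in clauses))


n, k = 5, 2
clauses = sm_dnf_clauses(n, k)
assert len(clauses) == (n - k + 1) * 2 ** k
assert all(eval_dnf(clauses, x, y) == sm(x, y)
           for x in product((0, 1), repeat=n)
           for y in product((0, 1), repeat=k))
```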

The next Lemma is likely to have been discovered multiple times. We attribute it to folklore.

###### Lemma 3.2.

There exists a De Morgan circuit of depth 3 and size $O(nk)$ computing $\mathrm{SM}_{n,k}$.

###### Proof.

First we note that the equality function of two $k$-bit strings can be implemented using a CNF of width $2$ and size (number of clauses) $2k$. Indeed, we can check equality of two bits $z$ and $w$ using the circuit $(z \vee \neg w) \wedge (\neg z \vee w)$. Therefore, we can implement equality of two $k$-bit strings using the CNF formula

$$\mathrm{EQ}(z_1, \dots, z_k; w_1, \dots, w_k) = \bigwedge_{i=1}^{k} \big( (z_i \vee \neg w_i) \wedge (\neg z_i \vee w_i) \big).$$

From here on we can proceed as in the previous proof, namely, for each $i \in \{0, \dots, n-k\}$ let $\mathrm{EQ}_i$ be the CNF that outputs 1 if and only if $(x_{i+1}, \dots, x_{i+k}) = (y_1, \dots, y_k)$. Taking $\bigvee_{i=0}^{n-k} \mathrm{EQ}_i$ we obtain a circuit of depth 3 that computes the $\mathrm{SM}_{n,k}$ function. The output gate has fan-in $n-k+1$, the gates in the second layer have fan-in $2k$, and the gates in the first layer have fan-in $2$. Therefore, the total size of the circuit is $O(nk)$. ∎
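The depth-3 construction can be simulated gate by gate, which also makes the gate count explicit. The sketch below assumes bits are Python integers and counts one gate per OR or AND, so the total is $(n-k+1)(2k+1) + 1$; the function name is illustrative.

```python
from itertools import product


def sm_depth3(x, y):
    """Evaluate the OR-of-CNFs circuit and return (output, number of gates)."""
    n, k = len(x), len(y)
    gate_count = 0
    shift_outputs = []
    for i in range(n - k + 1):
        # Bottom layer: two fan-in-2 OR gates per compared pair of bits.
        ors = []
        for j in range(k):
            z, w = bool(x[i + j]), bool(y[j])
            ors.append(z or not w)      # z OR (NOT w)
            ors.append((not z) or w)    # (NOT z) OR w
        gate_count += 2 * k
        # Middle layer: one AND gate of fan-in 2k per shift.
        shift_outputs.append(all(ors))
        gate_count += 1
    # Top layer: a single OR gate of fan-in n - k + 1.
    gate_count += 1
    return int(any(shift_outputs)), gate_count


n, k = 5, 2
for x in product((0, 1), repeat=n):
    for y in product((0, 1), repeat=k):
        out, gates = sm_depth3(x, y)
        expected = int(any(x[i:i + k] == y for i in range(n - k + 1)))
        assert out == expected
        assert gates == (n - k + 1) * (2 * k + 1) + 1
```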

### 3.2 Lower bounds for depth 2 De Morgan circuits

For the lower bound we need the following definition.

###### Definition 3.3.

A minterm (maxterm) of a Boolean function $f$ is a set of variables of $f$, such that some assignment to those variables makes $f$ output $1$ ($0$) irrespective of the assignment to the other variables. The width of a minterm (maxterm) is the number of variables in it.

First we find the minimal width of minterms and maxterms of $\mathrm{SM}_{n,k}$.

###### Lemma 3.4.

For any values of $n$ and $k$:

1. Every minterm of $\mathrm{SM}_{n,k}$ has width at least $2k$.

2. Every maxterm of $\mathrm{SM}_{n,k}$ has width at least $2\sqrt{n-k+1}$.

3. If , then every maxterm of has width at least .

###### Proof.

For a string of length , and integers , by we denote the substring of starting at the position and ending at the position .

1. For every and every , we show that any assignment to variables can be extended to an assignment which forces to output . We prove this statement by induction on . The base cases and hold trivially. Indeed, in both cases , so fixes only one variable which clearly is not enough to make constant. We now prove the induction step. Consider the first character of the text.

If does not assign a value to , then we reduce the problem to a problem on strings and with the same number of assigned variables. Indeed, by the induction hypothesis we can find an assignment to the variables such that does not contain . Now we set to ensure that does not start with the string .

If does assign a value to then we consider two cases. If assigns a value to , then we apply the induction hypothesis to the strings and . Indeed, we have at most fixed variables among those variables, thus we have an assignment such that does not contain . This implies that does not contain as a substring. In the remaining case where assigns a value to but not to , we set and use the induction hypothesis for the strings and (these two strings have at most fixed variables).

2. Consider a substitution $\rho$ which fixes $t$ variables in the text and $p$ variables in the pattern. In order to force $\mathrm{SM}_{n,k}$ to output $0$, for every shift $i$ there must be an index $j$ such that $\rho$ assigns a value to both $x_{i+j}$ and $y_j$. Thus, each of the $t$ assigned variables in the text “covers” at most $p$ shifts. Since the total number of shifts is $n-k+1$, this implies that $t \cdot p \ge n-k+1$. Therefore, by the inequality of arithmetic and geometric means we get that $t + p \ge 2\sqrt{t \cdot p} \ge 2\sqrt{n-k+1}$.

3. Since we get that . The desired bound follows by noting that the function is monotone decreasing for .
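For very small parameters the minimal widths of minterms and maxterms can be found by exhaustive search, which gives a sanity check on the counting arguments in this proof. The sketch below is a brute-force search whose cost grows exponentially in $n + k$; the function names and the encoding of the input as text bits followed by pattern bits are assumptions made for illustration.

```python
from itertools import combinations, product
from math import sqrt


def sm(bits, n, k):
    """String matching where bits holds the n text bits followed by the k pattern bits."""
    x, y = bits[:n], bits[n:]
    return int(any(x[i:i + k] == y for i in range(n - k + 1)))


def min_forcing_width(n, k, value):
    """Smallest number of fixed variables that forces SM_{n,k} to equal value."""
    total = n + k
    for width in range(total + 1):
        for fixed in combinations(range(total), width):
            free = [v for v in range(total) if v not in fixed]
            for assignment in product((0, 1), repeat=width):
                bits = [0] * total
                for v, b in zip(fixed, assignment):
                    bits[v] = b
                forced = True
                # The partial assignment forces the output if no completion disagrees.
                for rest in product((0, 1), repeat=len(free)):
                    for v, b in zip(free, rest):
                        bits[v] = b
                    if sm(tuple(bits), n, k) != value:
                        forced = False
                        break
                if forced:
                    return width
    return None


n, k = 4, 2
print("minimal minterm width:", min_forcing_width(n, k, 1))
print("minimal maxterm width:", min_forcing_width(n, k, 0))
assert min_forcing_width(n, k, 0) >= 2 * sqrt(n - k + 1)
```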

Next we prove tight bounds on the number of satisfying and non-satisfying inputs of $\mathrm{SM}_{n,k}$.

###### Lemma 3.5.

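For $\mathrm{SM}_{n,k}$, let $O$ and $Z$ be the preimages of $1$ and of $0$, respectively. That is,

$$O = \{(x,y) \in \{0,1\}^{n+k} : \mathrm{SM}_{n,k}(x;y) = 1\}, \qquad Z = \{(x,y) \in \{0,1\}^{n+k} : \mathrm{SM}_{n,k}(x;y) = 0\}.$$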

Then for every ; for ; for every .

###### Proof.

In order to estimate the number of satisfying inputs of $\mathrm{SM}_{n,k}$, we use the following result from [GW07]:

###### Theorem 3.6 (Corollary 2.2 in [GW07]).

Let $x$ be a uniformly distributed random binary string of length $n$, and let $X_{n,k}$ denote the number of distinct substrings of $x$ of length $k$. Then there exist $\varepsilon$ and $\mu$ such that for all values of $n$ and $k$:

$$\mathbb{E}[X_{n,k}] = 2^k - 2^k (1 - 2^{-k})^{n-k+1} + O(n^{-\varepsilon} \mu^k).$$

The upper bound on $|O|$ follows immediately from the following observation. A string of length $n$ can contain at most $n-k+1$ different substrings of length $k$. For the lower bound on $|O|$ we consider two regimes of $k$.
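The link between the number of satisfying inputs and the number of distinct substrings can be checked by brute force: for each text $x$, the patterns accepted together with $x$ are exactly the distinct length-$k$ windows of $x$. The sketch below counts satisfying inputs both ways for small parameters and checks the trivial upper bound; the function names are illustrative.

```python
from itertools import product


def count_ones_by_windows(n, k):
    """Sum over all texts x of the number of distinct length-k windows of x."""
    total = 0
    for x in product((0, 1), repeat=n):
        total += len({x[i:i + k] for i in range(n - k + 1)})
    return total


def count_ones_direct(n, k):
    """Count pairs (x, y) with y occurring in x, straight from the definition."""
    return sum(any(x[i:i + k] == y for i in range(n - k + 1))
               for x in product((0, 1), repeat=n)
               for y in product((0, 1), repeat=k))


n, k = 6, 2
ones = count_ones_by_windows(n, k)
assert ones == count_ones_direct(n, k)
# Each text has at most min(2^k, n - k + 1) distinct windows.
assert ones <= 2 ** n * min(2 ** k, n - k + 1)
print(ones, 2 ** (n + k) - ones)  # number of ones and number of zeros
```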

First, if , then from Theorem 3.6 we have

$$|O| \ge 2^n \cdot \mathbb{E}[X_{n,k}] \ge 2^{n+k}\big(1 - (1 - 2^{-k})^{n-k+1}\big) \ge 2^{n+k}\big(1 - \exp(-(n-k+1)/2^k)\big) \ge 2^{n+k}(1 - 1/e).$$

If , then again from Theorem 3.6 we have

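$$|O| \ge 2^n \cdot \mathbb{E}[X_{n,k}] \ge 2^n\big(2^k - 2^k (1 - 2^{-k})^{n-k+1}\big) \ge 2^n\Big(2^k - 2^k\Big(1 - \frac{n-k+1}{2^k} + \frac{(n-k+1)^2}{2 \cdot 2^{2k}}\Big)\Big) \ge \Omega\big(2^n (n-k+1)\big).$$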

From the relation and a trivial upper bound on , we have that for .

In order to prove the lower bound , we consider the pattern string . The number of strings of length which do not contain satisfies the generalized Fibonacci recurrence:

$$F_n = \sum_{i=1}^{k} F_{n-i}.$$

From the known bounds on the generalized Fibonacci numbers (see, e.g., Lemma 3.6 in [Wol98]) we have , which implies the lower bound on .
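The generalized Fibonacci recurrence can be checked numerically. The specific pattern used in the text is not legible in this copy, so the sketch below assumes the all-ones pattern of length $k$ (the all-zeros pattern behaves identically by symmetry) and uses the initial values $F_m = 2^m$ for $m < k$; both are assumptions of this illustration.

```python
from itertools import product


def avoiders_by_recurrence(n, k):
    """Number of binary strings of length n with no run of k ones.

    Uses F_m = F_{m-1} + ... + F_{m-k} for m >= k and F_m = 2^m for m < k.
    """
    f = [2 ** m for m in range(k)]
    for m in range(k, n + 1):
        f.append(sum(f[m - i] for i in range(1, k + 1)))
    return f[n]


def avoiders_brute_force(n, k):
    """Count strings of length n that do not contain k consecutive ones."""
    target = (1,) * k
    return sum(all(x[i:i + k] != target for i in range(n - k + 1))
               for x in product((0, 1), repeat=n))


for n in range(1, 11):
    assert avoiders_by_recurrence(n, 3) == avoiders_brute_force(n, 3)
print(avoiders_by_recurrence(8, 3))
```

The recurrence holds because a string avoiding a run of $k$ ones starts with fewer than $k$ ones followed by a zero, and the rest is again an avoiding string.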

###### Theorem 3.7.

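We have the following bounds for depth-2 De Morgan circuits for $\mathrm{SM}_{n,k}$.

1. The DNF-size of $\mathrm{SM}_{n,k}$ is

$$\mathrm{DNF}(\mathrm{SM}_{n,k}) \ge \Omega\big(2^k \cdot \min(2^k, n-k+1)\big).$$

2. For CNFs computing $\mathrm{SM}_{n,k}$ we have the following lower bounds on their sizes.

$\mathrm{CNF}(\mathrm{SM}_{n,k}) \ge \Omega(2^{n/(10k)})$ if $k \le \log n + 1$; if $\log n + 1 \le k \le \sqrt{n}$; if $k \ge \sqrt{n}$.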

Theorem 3.7 implies Theorem 1.2 since any De Morgan circuit of depth 2 is either a CNF or a DNF.

###### Proof.

First we prove the lower bound for s. From Lemma 3.4, every minterm of has width at least , and thus in any computing every clause must be of width at least . Therefore, every clause in such a evaluates to 1 on at most inputs of the function. By Lemma 3.5, . Thus, every for must contain at least clauses.

The proof idea for CNFs is similar. We say that a clause covers an input if this clause evaluates to 0 on . Note that a clause of width covers at most elements in . For the parameters we claim first that every clause of a CNF computing $\mathrm{SM}_{n,k}$ must be of width at least depending on the range of (as follows from Lemma 3.4). This implies that the number of clauses in any CNF computing $\mathrm{SM}_{n,k}$ is at least . Below, we use Lemma 3.5 and Lemma 3.4 to estimate and for different ranges of .

If , then by Lemma 3.5. By Lemma 3.4, the width of each maxterm is at least . Thus, the number of clauses in any CNF computing $\mathrm{SM}_{n,k}$ must be at least

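$$\Omega\left(|Z| / 2^{n+k-c}\right) \ge \Omega\left(|Z| / 2^{n-n/k}\right) \ge \Omega\left(2^{n/k}\left(1 - 2^{-k}\right)^{n}\right) \ge \Omega\left(2^{n/(10k)}\right),$$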

where the last bound follows from the inequality which holds for all .

For , Lemma 3.5 gives us an lower bound on . Lemma 3.4 provides a lower bound on the width of maxterms: for , , and for , . The desired bounds on the number of clauses in any computing now follow immediately. ∎

## 4 De Morgan circuits

In this section we prove Theorem 1.3.

###### Proof.


###### Claim 4.1.

Suppose that has a circuit of size and depth . Then for all has a circuit of size and depth .

###### Proof.

Let . In order to compute for a string and a pattern , we first compute the following subcircuits .

For , is the circuit applied to the string and the pattern . For , is the circuit applied to and .

We claim that is an OR of for . Indeed, for , if and only if is a substring of starting at a position from to . Thus, an OR of for checks occurrences of in starting at a position at most . And if and only if is a substring of starting at a position greater than .

We use gates to compute ’s, and one OR gate to compute . Therefore, the circuit size of our construction is , and the depth is . ∎

Galil [Gal85] mentions a circuit of size and depth for . In particular, has a circuit of size and depth . This, together with Claim 4.1, implies that has a circuit of size and depth .
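The idea behind Claim 4.1, reducing string matching on a long string to an OR of string matching instances on short overlapping windows, can be illustrated as follows. This is a minimal sketch under stated assumptions: the window length `m` and the choice of window starting positions are hypothetical parameters chosen here for illustration and need not coincide with those used in the claim. The function names are hypothetical as well.

```python
from itertools import product


def sm(x, y):
    """Brute-force string matching: 1 if y occurs in x contiguously, else 0."""
    k = len(y)
    return int(any(x[i:i + k] == y for i in range(len(x) - k + 1)))


def sm_by_windows(x, y, m):
    """Compute sm(x, y) as an OR of sm on overlapping windows of length m.

    Assumption: k <= m <= n. Consecutive windows are shifted by m - k + 1,
    so they overlap in k - 1 symbols and every occurrence of y lies entirely
    inside some window. One extra window is aligned with the end of x.
    """
    n, k = len(x), len(y)
    assert k <= m <= n
    step = m - k + 1
    starts = list(range(0, n - m + 1, step))
    if starts[-1] != n - m:
        starts.append(n - m)  # final window covers the tail of x
    # Each window plays the role of one copy of the small circuit;
    # `any` plays the role of the single top OR gate.
    return int(any(sm(x[s:s + m], y) for s in starts))


if __name__ == "__main__":
    n, k, m = 9, 3, 5  # small illustrative parameters
    for xb in product("01", repeat=n):
        for yb in product("01", repeat=k):
            x, y = "".join(xb), "".join(yb)
            assert sm(x, y) == sm_by_windows(x, y, m)
    print("window decomposition agrees with direct matching")
```

With window length proportional to $k$, the number of windows is $O(n/k)$, and the final OR adds one level of depth, which is the shape of the size and depth accounting in the proof above.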

• Now we prove a lower bound of . Suppose that a circuit computes the function, and consider an input to the circuit. We prove that has at least gates using the gate elimination method. Specifically, we show that for any fixing of the bits for , the restricted function depends on the bit . Since the function depends on , any circuit computing it must have or among its inputs. Without loss of generality we assume that appears as an input. Now we show that we can fix the input so that at least one gate of the circuit is removed.

Indeed, if appears as an input to an OR gate, we can set , which fixes the output of the gate to 1. We can then remove the gate from the circuit and propagate the constant 1 (this may also affect other gates). Similarly, if appears as an input to an AND gate, we can set , which fixes the output of the gate to 0, and remove the gate from the circuit. Therefore, we can remove at least gates from the circuit, and hence the size of the original circuit computing was at least .

It remains to prove the following claim.

###### Claim 4.2.

Let . For any fixing of the bits for , the restricted function depends on the bit .

###### Proof.

Let be the values of the fixed bits of . In order to show that the restricted function depends on , we show that there exist two inputs: and , such that and and are extensions of which differ only in the position .

We set all non-fixed bits of to , except for . Now we set to be equal to everywhere except for the position , where . Now we see that the string does not contain two ones in a row, while does. Since , we can set to be an arbitrary substring of of length which contains and . By the definition of we have and because does not contain the substring . ∎

As already noted the and depth is mentioned in [Gal85] without a proof. Without this upper bound, our best upper bound on the size of an unbounded depth De Morgan circuit computing is (see Theorem 1.2).

## 5 Threshold circuits

In this section we prove Theorem 1.4.