PRAMs over integers do not compute maxflow efficiently

11/16/2018 ∙ by Luc Pellissier, et al. ∙ Université Paris 13

Finding lower bounds in complexity theory has proven to be an extremely difficult task. In this article, we analyse two proofs of complexity lower bounds: Ben-Or's proof of the minimal height of algebraic computation trees deciding certain problems, and Mulmuley's proof that restricted Parallel Random Access Machines (prams) over the integers cannot decide P-complete problems efficiently. We present the aforementioned models of computation in a framework inspired by dynamical systems and models of linear logic: graphings. This interpretation allows us to connect the classical proofs to topological entropy, an invariant of these systems; to devise an algebraic formulation of parallelism in computational models; and finally to strengthen Mulmuley's result by separating the geometrical insights of the proof from those related to computation, and blending these with Ben-Or's proof. Looking forward, the interpretation of algebraic complexity theory in terms of dynamical systems may shed new light on research programs such as Geometric Complexity Theory.


1. Introduction

While the general theory of computability focused on studying what a computable function is, computer scientists quickly realised that this notion was not meaningful in practice. Indeed, one can always define a computable function such that no current computer could compute its values for two-digit inputs within the next, say, ten years.

This led researchers to work on defining and understanding the notion of feasible computation, i.e. characterising the set of functions which can be effectively computed. Within the span of a single year, three different papers (hartmanisstearns; cobham; edmonds65) tackled this question, and all of them provided the same answer: feasible functions are those for which there exists a program whose running time is asymptotically bounded by a polynomial in the size of the input. This is how the first complexity class was born: the class of polynomial-time computable functions.

Very quickly, other classes were defined, some of them considering constraints on space rather than time. The question of classifying the complexity classes became one of the main questions in the field, and a number of important results were obtained within the first years.

Lower bounds.

As part of the classification problem, complexity theory has traditionally been concerned with proving separation results. Among the numerous open separation problems lies the much advertised Ptime vs. NPtime problem of showing that some problems considered hard to solve but efficient to verify do not have a polynomial time algorithm solving them.

Proving that two classes A and B, with A contained in B, are not equal can be reduced to finding lower bounds for problems in B: by proving that certain problems cannot be solved with less than certain resources on a specific model of computation, one shows that they lie outside A. Conversely, proving such a separation result provides a lower bound for the problems that are complete for B (cooknpcomplete) – i.e. problems that are in some way universal for the class B.

Alas, proven lower bound results are very few, and most separation problems remain generally accepted conjectures. For instance, a proof that the class of non-deterministic exponential-time problems is not included in what is thought of as a very small class of circuits was not achieved until very recently (Williams).

The failure of most proof techniques has been studied in itself, leading to negative results that are commonly called barriers. Altogether, these results show that all the proof methods we know are ineffective with respect to proving interesting lower bounds. Indeed, there are three barriers: relativisation (relativization), natural proofs (naturality) and algebrization (algebraization), and every known proof method hits at least one of them. This shows the need for new methods: in the words of S. Aaronson and A. Wigderson (algebraization), "We speculate that going beyond this limit [algebrization] will require fundamentally new methods." However, to this day, only one research program aimed at proving new separation results is commonly believed to have the ability to bypass all barriers: Mulmuley and Sohoni's Geometric Complexity Theory (gct) program (GCTsurvey2).

Geometric Complexity Theory

is widely considered to be a promising research program that might lead to interesting results. It is also widely believed to require new and extremely sophisticated pieces of mathematics in order to achieve its goal. The program aims to prove the Ptime ≠ NPtime lower bound by showing that certain algebraic varieties (representing the permanent and the determinant, which are believed (Valiant) to have different complexity if Ptime ≠ NPtime) cannot be embedded one into the other. Although this program has led to interesting developments as far as pure mathematics is concerned, it has not yet enhanced our understanding of complexity lower bounds (indeed, even for Mulmuley himself, such understanding will not be achieved in our lifetimes (100years)). Recently, some negative results (Ikenmeyer) have closed the easiest path towards it promised by gct.

The gct program was inspired, according to its creators, by a lower bound result obtained by Mulmuley (Mulmuley99). Specifically, it was proved that the maxflow problem (deciding whether a certain quantity can flow from a source to a target in a weighted graph) is not solvable efficiently in a specific parallel model: the pram without bit operations. The maxflow problem is quite interesting, as it is known to be in Ptime (by reduction to linear programming, or via the Ford-Fulkerson algorithm (FordFulkerson)), but no efficient parallel algorithm solving it is known. This lower bound proof, despite being the main inspiration for the well-known gct research program, remains seldom cited and has not led to variations applied to other problems. At first sight it relies heavily on techniques and results from algebraic geometry, such as the Milnor-Thom theorem. (Let us notice that, even though this is not mentioned by Mulmuley, the Milnor-Thom theorem had already been used to prove lower bounds, cf. papers by Dobkin and Lipton (DobkinLipton76), Steele and Yao (SteeleYao82), Ben-Or (Ben-Or83), and references therein.)

Implicit Computational Complexity.

Another approach to complexity theory that has emerged in recent years is Implicit Computational Complexity (icc). Related to logical approaches to computational complexity such as Descriptive Complexity, the aim of icc is to study algorithmic complexity only in terms of restrictions of languages and computational principles. The field has been established since Bellantoni and Cook's landmark paper (bellantonicook) and the subsequent work by Leivant and Marion (leivantmarion1; leivantmarion2).

As part of icc techniques, some approaches derive from the proofs-as-programs (or Curry–Howard) correspondence. At its core, this correspondence allows one to view the execution of a program as the cut-elimination procedure of a corresponding proof in a formal deductive system (e.g. sequent calculus). Initially stated for intuitionistic logic (howard), the correspondence extends to resource-aware logics such as linear logic, which is well suited to the study of computation. This approach to icc therefore relies on restrictions of the deductive system considered in order to characterise complexity classes. In particular, several variants of linear logic were shown to characterise FPtime – the variant of Ptime that computes a function and not just a boolean predicate: bounded linear logic (BLL), soft linear logic (SLL), dual light affine logic (DLAL) and light linear logic (LLL).

Dynamic Semantics

The geometry of interaction program was proposed by Girard (Girard:GoI0) shortly after the inception of linear logic. In opposition to traditional denotational semantics – e.g. domains –, the goi program aims at giving an account of proofs and programs which also interprets their dynamical features, i.e. cut-elimination/execution. This program is well suited for tackling problems involving computational complexity; indeed, geometry of interaction's first model was used to prove the optimality of Lamping's reduction in the λ-calculus (Gonthier:1992). More recently, a series of characterisations of complexity classes were obtained using goi techniques (seiller-conl; seiller-lsp; seiller-lpls; seiller-ptime).

Among the most recent and full-fledged embodiments of this program lie the second author's Interaction Graphs models (seiller-igm; seiller-iga; seiller-igf; seiller-igg). These models, in which proofs/programs are interpreted as graphings – generalisations of dynamical systems –, encompass all previous goi models introduced by Girard (seiller-igg). In particular, Interaction Graphs allow for modelling quantitative features of programs/proofs (seiller-igf).

Semantic Approach to Complexity

Based on a study of several Interaction Graphs models characterising complexity classes (seiller-ignda; seiller-igpda), the second author has proposed to use graphings to develop a semantic approach to complexity theory (seiller-towards). The basic idea behind this program is to model and study programs as dynamical systems that act on a space – thought of as the space of configurations. As dynamical systems are inherently deterministic, graphings are needed to extend the approach to probabilistic and/or non-deterministic programs. One can then study a program through the geometry of the associated graphing (for instance, a configuration caught in a loop corresponds to a point of the space with finite orbit).

The second author conjectures that advanced methods developed within the theory of dynamical systems, in particular methods specific to ergodic theory using techniques from operator algebras, could enable new proof techniques for separation results. It can be argued that such techniques should be able to bypass the barriers (seiller-why).

2. Contents of the paper

2.1. Computation models as graphings.

The present work reports on first investigations into how the interpretation of programs as graphings could lead to separation techniques, by rephrasing two well-known lower bound proofs. The interpretation of programs relies on two ingredients:

  • the interpretation of models of computation as monoid actions. In our setting, we view the computational principles of a computational model as elements that act on a configuration space. As these actions can be composed, but are not necessarily reversible, it is natural to interpret them as forming a monoid acting on the configuration space. As, moreover, we are interested in keeping control over our computations (knowing whether a computation is finished, failed, succeeded, …), we consider actions that decompose into a part that computes, using the principles of computation, and a part that merely modifies a control state;

  • the realization of programs as graphings. We abstract programs as graphs whose vertices are subspaces of the product of the configuration space and the control states, and whose edges are labelled by elements of the acting monoid, acting on the subspaces at their sources.

The basic intuitions here can be summarised by the following slogan: "Computation, as a dynamical process, can be modelled as a dynamical system". Of course, the above affirmation cannot be true of all computational processes; for instance the traditional notion of dynamical system is deterministic. In practice, one works with a generalisation of dynamical systems named graphings; introduced as part of a family of models of linear logic, graphings have been shown to model non-deterministic and probabilistic computation.

To do so, we consider that a model of computation is given by a set of generators (corresponding to computation principles) together with their action on a space (representing the configuration space). In other words, we define a model of computation as an action of a monoid (presented by generators and relations) on a space. This action can then be specified to be continuous, measurable, …, depending on the properties we are interested in.

A program in such a model of computation is then viewed as a graph whose vertices are subspaces of the configuration space and whose edges are labelled by generators of the monoid: in this way, both partial operations and branching are allowed. This point of view is very general, as it allows one to study, as special cases, models of computation that are discrete or continuous, algebraic, rewriting-based, …

2.2. Entropy

We fix a monoid action α for the following discussion. One important aspect of the representation of abstract programs as graphings is that restrictions of graphings correspond to known notions from mathematics. In a very natural way, a deterministic α-graphing defines a partial dynamical system. Conversely, a partial dynamical system whose graph is contained in the measured preorder induced by α (seiller-towards) can be associated to an α-graphing.

The study of deterministic models of computation can thus profit from the methods of the theory of dynamical systems. In particular, the methods employed in this paper relate to the classical notion of topological entropy. The topological entropy of a dynamical system is a value representing the average exponential growth rate of the number of orbit segments distinguishable with a finite (but arbitrarily fine) precision. The definition is based on the notion of open covers: for each finite open cover U, one can compute the entropy h(f, U) of a map f w.r.t. U, and the entropy of the map is then the supremum of these values as U ranges over the set of all finite open covers. As we are considering graphings, which correspond to partial maps, we explain how the techniques adapt to this more general setting, and define the entropy h(G, U) of a graphing G w.r.t. a cover U, as well as the topological entropy of G, defined as the supremum of the values h(G, U) where U ranges over all finite open covers.
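For reference, the standard definitions for a (total) continuous map f : X → X read as follows, where N(U) denotes the minimal cardinality of a finite subcover of U; the paper adapts these to partial maps and graphings.

```latex
h(f,\mathcal{U}) \;=\; \lim_{n\to\infty} \frac{1}{n}\,
  \log N\!\big(\mathcal{U} \vee f^{-1}\mathcal{U} \vee \cdots \vee f^{-(n-1)}\mathcal{U}\big),
\qquad
h_{\mathrm{top}}(f) \;=\; \sup_{\mathcal{U}\ \text{finite open cover}} h(f,\mathcal{U}).
```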

While the precise results described in this paper use the entropy w.r.t. a specific cover (similar bounds could be obtained from the topological entropy, but would lack precision), the authors believe entropy could play a much more prominent role in future lower bound proofs. Indeed, while the entropy w.r.t. a single cover quantifies only one aspect of the computation, namely the branchings, the topological entropy, computed by considering all possible covers, provides a much more precise picture of the dynamics involved. In particular, it provides information about the computational principles of the acting monoid; this information may lead to more precise bounds based on how some principles are much more complex than others, providing lower bounds on possible simulations of the former by the latter.

While only the entropy w.r.t. a given cover will be essential in this work, the overall techniques related to entropy provide a much clearer picture. In particular, the definition of entropic co-trees (13) is quite natural from this point of view and clarifies the methods employed by e.g. Ben-Or and Mulmuley.

2.3. Ben-Or’s proof

One lower bound result related to Mulmuley's techniques is the bound obtained by Steele and Yao (SteeleYao82) on algebraic decision trees. Algebraic decision trees are finite ternary trees describing a program deciding a subset of the ambient real space: each node tests whether a chosen polynomial takes a positive, negative, or null value at the point considered. A d-th order algebraic decision tree is an algebraic decision tree in which all polynomials have degree bounded by d.
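As an illustration, the evaluation of such a tree can be sketched as follows; the class names and the example tree are ours, purely for illustration, and not part of the paper's formalism.

```python
# A minimal sketch of algebraic decision trees: each internal node branches
# on the sign of a polynomial at the input point; leaves accept or reject.

class Leaf:
    def __init__(self, accept):
        self.accept = accept

    def decide(self, point):
        return self.accept

class Node:
    def __init__(self, poly, neg, zero, pos):
        # poly: a polynomial, given as a function evaluated at the point
        self.poly, self.neg, self.zero, self.pos = poly, neg, zero, pos

    def decide(self, point):
        value = self.poly(point)
        if value < 0:
            return self.neg.decide(point)
        if value == 0:
            return self.zero.decide(point)
        return self.pos.decide(point)

# A height-1, 2nd-order tree deciding the closed unit disc in the plane:
# a single test on the sign of x^2 + y^2 - 1.
disc = Node(lambda p: p[0] ** 2 + p[1] ** 2 - 1,
            neg=Leaf(True), zero=Leaf(True), pos=Leaf(False))
```

The set decided by such a tree is a finite union of semi-algebraic sets, one per accepting leaf, which is what makes the Milnor-Thom-style bounds below applicable.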

In a very natural manner, an algebraic decision tree can be represented as a graphing over the trivial monoid action on the ambient space. We use entropy to bound the number of connected components of subsets decided by such graphings. These bounds are obtained by combining a bound in terms of entropy with a variant of the Milnor-Thom theorem due to Ben-Or. The latter, recalled below (Theorem 2), bounds the number of connected components of a semi-algebraic set in terms of the number of polynomial (in)equalities defining it, their maximal degree, and the dimension of the ambient space.
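For reference, the bound in question states (as we recall it from Ben-Or's article, where the exact constants should be checked) that a subset of n-dimensional real space defined by s polynomial equations and inequalities, each of degree at most d ≥ 2, has at most

```latex
d\,(2d-1)^{\,n+s-1}
```

connected components.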

Theorem 29 ().

Let T be a d-th order algebraic decision tree deciding a subset W of the ambient space. Then the number of connected components of W is bounded exponentially in the height h of T, with a base depending on d and on the dimension of the space (the precise bound follows from Theorem 2).

This result of Steele and Yao adapts in a straightforward manner to a notion of algebraic computation trees describing the construction of the polynomials to be tested by means of multiplications and additions of the coordinates. The authors remark that this result uses techniques quite similar to those of Mulmuley's lower bounds for the model of prams without bit operations. It is also strongly similar to the techniques used by Cucker in proving a separation between parallel and sequential polynomial time over the reals (Cucker92).

However, a refinement of Steele and Yao's method was quickly obtained by Ben-Or, yielding a similar result for an extended notion of algebraic computation trees allowing for divisions and square roots. We here adapt Ben-Or's techniques within the framework of graphings, in order to apply this refined approach to Mulmuley's framework, leading to a strengthened lower bound result.

Adapting Ben-Or's method, we obtain a proof of the following result on computational graphings in the abstract model of computation of algebraic computation trees. The class of computational graphings contains the interpretations of algebraic computation trees, and the result generalises Ben-Or's by bounding the number of connected components of the subset decided by a computational graphing. This bound depends on the number of edges of the computational graphing, as well as its algebraic degree (18).

Theorem 43 ().

Let G be a computational graphing representative, c its number of edges, and d its algebraic degree. Suppose G computes the membership problem for a subset W in k steps, i.e. for each element x of the space, G accepts x if and only if x belongs to W. Then W has at most a number of connected components exponential in k, with a base depending on c and d.

This reformulation of Ben-Or's techniques is then applied to strengthen a lower bound obtained by Mulmuley (Mulmuley99). While Mulmuley's model of prams without bit operations is a restriction of the usual notion of algebraic prams over the integers, we obtain here similar lower bounds for the unrestricted model. For this purpose, we first need to show how parallelism can be accommodated within the framework of abstract models of computation and graphings.

2.4. prams and the crew

We are able to introduce prams acting over the integers in this setting. They can be described as machines with a finite number of processors, each having access to a private memory on top of the shared memory, and each able to perform the arithmetic operations as well as branching and indirect addressing. Interestingly, we can represent these machines in the graphings framework in two steps: first, by defining the sram model, with just one processor; and then by performing an algebraic operation at the level of the algebraic models of computation themselves.

So, in a way, parallel computation is modelled per se, at the level of models. As usual, one is bound to choose a mode of interaction between the different processes when dealing with shared memory. We will consider here only the case of Concurrent Read Exclusive Write (crew): all processes can read the shared memory concurrently, but if several processes try to write to the shared memory, only the process with the smallest index is allowed to do so.

The heart of our approach to parallelism is commutation. Among all instructions, those affecting only the private memories of distinct processors may commute, while two instructions affecting the central memory may not. We capture this by a notion of product for monoids that generalises both the direct product and the free product: we specify, through a conflict relation, which of the generators can and cannot commute, allowing us to build a monoid representing the simultaneous action.
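The commutation structure just described can be sketched as a toy normal-form computation; the encoding below (letters as generators, a component map, a conflict set) is ours and purely illustrative, not the paper's construction.

```python
# Toy sketch of a conflicted sum: generators are single letters, each tagged
# with the component (monoid) it comes from; `conflict` lists the pairs of
# generators that interact with the shared memory and must NOT commute.

def normal_form(word, component, conflict):
    """Repeatedly swap adjacent letters that are allowed to commute
    (different components, not related by the conflict relation) into
    lexicographic order, yielding a canonical-looking representative."""
    w = list(word)
    changed = True
    while changed:
        changed = False
        for i in range(len(w) - 1):
            a, b = w[i], w[i + 1]
            commute = (component[a] != component[b]
                       and (a, b) not in conflict
                       and (b, a) not in conflict)
            if commute and b < a:
                w[i], w[i + 1] = b, a
                changed = True
    return "".join(w)

# Two processors: generators "a", "b" (first) and "x", "y" (second);
# "a" and "x" both write to the shared memory, hence conflict.
component = {"a": 0, "b": 0, "x": 1, "y": 1}
conflict = {("a", "x")}
```

With an empty conflict relation every cross-component pair commutes and the construction collapses to the direct product; with a full conflict relation nothing commutes across components, recovering the free product, exactly as discussed in Section 4.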

2.5. Mulmuley’s geometrization

In contrast to Ben-Or's model, the pram machines decide sets of integers rather than sets of reals, making the use of algebraico-geometric results to uncover their geometry much less obvious. Mulmuley's proof relies on twin geometrizations: one of a special optimization problem, represented by a surface (Subsec. 8.1-8.2); the other obtained by building explicitly, given a pram, a set of algebraic surfaces such that the points accepted by the machine are exactly the integer points enclosed by these surfaces.

Finally, the proof is concluded by a purely geometrical theorem (Thm. 10) expressing a tension between the two geometrizations. (We would like to stress that this separation into three movements, with a geometrical tour de force, is not explicit in the original article; we nonetheless believe it greatly improves the exposition.) Our work focuses here only on the construction of a set of algebraic surfaces representing the computation of a pram; the remaining part of our proof follows Mulmuley's original technique closely.

Building surfaces

The first step in Mulmuley's proof is to use the parametric complexity results of Carstensen (Carstensen:1983) to represent an instance of the decision problem associated to maxflow so that it naturally induces a partition of the parameter space, which can then be represented by a particular surface.

The second step is to represent any partition induced by the run of a machine by a set of surfaces in the ambient space, in order to be able to use geometric methods.

Let K be a compact subset of the ambient space and P a partition of K. P can be extended to a partition of the whole space in a number of ways, as pictured in Fig. 1. In particular, P can always be extended to a partition of the whole space such that all the cells are compact and the boundaries of the cells are all algebraic (resp. smooth, analytic) surfaces.

In general, such surfaces have no reason to be easy to compute, and the more structure they are endowed with, the harder they are to compute. In the specific case of prams, the decomposition can naturally be represented by algebraic surfaces whose degree is moreover bounded. This choice of representation might not carry over to other models of computation, for which it might be more interesting to consider surfaces of a different kind.

Figure 1. Two curves that define the same partition of a compact set

The method for building such a set of algebraic surfaces is reminiscent of the technique we used for Ben-Or's result: build a tree summarising the computation of a specific pram and build, along this tree, a system of polynomial equations on a larger space than the space of variables actually used by the machine, the extra dimensions allowing one to encode full-fledged division. This system of integer polynomials of bounded degree then defines surfaces exactly matching our needs.

2.6. The main result

Interestingly, this allows us to extend to prams Ben-Or's technique of adding new variables to handle operations such as division and square root, which is a mild improvement over Mulmuley's proof. (Indeed, as noted in his article, the method can handle additional instructions as long as arbitrary bits are not easy to compute: in our model, bits of low order are easy to compute – parity is just the remainder of a division – but computing the middle-order bits of a number is difficult, see Prop. 3.) By considering that the length of an input is the minimal length of a binary word representing it, we get a realistic cost model for prams, for which we can prove:
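Concretely, Ben-Or's trick replaces each division or square-root instruction by a fresh variable subject to polynomial constraints, for instance:

```latex
z = x/y \;\rightsquigarrow\; z \cdot y = x \ \ (y \neq 0),
\qquad
w = \sqrt{x} \;\rightsquigarrow\; w^2 = x \ \ (w \geq 0),
```

so that the whole computation is described by polynomial (in)equations of bounded degree, at the price of extra dimensions.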

Theorem 63 ().

Let M be a pram without bit operations with 2^(O(log^c N)) processors, where N is the length of the inputs and c is any positive integer.

M does not decide maxflow in O(log^c N) steps.

Considering the class of computational problems that can be decided by a pram over the integers in time logarithmic in the length of the inputs, with a number of processors polylogarithmic in the length of the inputs, we have thus proved that maxflow – although it is Ptime-complete – does not belong to this class.

2.7. Conclusion

This work not only provides a strengthened lower bound result, but also shows how the semantic techniques based on abstract models of computation and graphings can shed new light on some lower bound techniques. In particular, it establishes a relationship between lower bounds and the notion of entropy which, although arguably still superficial in this work, could become deeper and provide new insights and finer techniques.

Showing that the interpretation of programs as graphings can translate, and even refine, such strong lower bound results is also important from another perspective. Indeed, the techniques of Ben-Or and Mulmuley (as well as other results by e.g. Cucker (Cucker92) and Yao (YaoBetti)) seem at first sight restricted to algebraic models of computation, due to their use of the Milnor-Thom theorem, which holds only for real semi-algebraic sets. However, the second author's characterisations of Boolean complexity classes in terms of graphings acting on algebraic spaces (seiller-ignda) open the possibility of using such algebraic methods to provide lower bounds for Boolean models of computation.

3. Abstract Models of Computation, Abstract Programs


Given a set S, we denote by S* the free monoid on S, i.e. the set of finite sequences of elements of S.

Definition 1 ().

We recall that a presentation ⟨G | R⟩ of a monoid M is given by a set G of generators and a set R of relations (pairs of words in G*) such that M is isomorphic to the quotient of G* by the congruence generated by R.

Definition 2 ().

Let M be a monoid and X a space. An action of M on X is a monoid morphism α from M to the monoid End(X) of endomorphisms of X. We denote actions by α : M ↷ X, sometimes omitting the morphism α.

In this definition, we purposely chose not to specify the kind of space considered. As a consequence, if one considers a discrete space X (i.e. a set), the endomorphisms will simply be functions from X to X. Similarly, if X is a topological space, the endomorphisms will be continuous maps (hence α will be a continuous action). Etc.
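In the discrete case, a monoid action can be sketched very directly; the generators below ("inc" and "double", acting on the integers) are a toy example of ours, not taken from the paper.

```python
# A monoid action on a discrete space, given on generators: a word in the
# generators acts by composing the actions of its letters.

def act(word, gens, point):
    """Apply the action of `word` (a sequence of generator names, read
    left to right) to `point`."""
    for g in word:
        point = gens[g](point)
    return point

# The monoid generated by "inc" (successor) and "double", acting on Z.
gens = {"inc": lambda n: n + 1, "double": lambda n: 2 * n}
```

Here act(["inc", "double"], gens, 3) increments and then doubles; the discrete setting places no continuity or measurability requirement on the generators' actions.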

Definition 3 ().

An abstract model of computation (amc) is defined as a triple (G, R, α), where ⟨G | R⟩ is a presentation of a monoid M and α is a monoid action of M on a space X. We denote such an amc by α : ⟨G | R⟩ ↷ X.

Remark.

Although it might seem enough to define an abstract model of computation solely as a monoid action, the choice of a presentation of the monoid by generators and relations is important. First, when considering several models of computation, one wants to consider a notion of compilation: an element of one model is compilable in another when, up to an automorphism of the underlying space, its action coincides with the restriction of the action of an element of the other model. To use this notion of compilation in a meaningful way, one would want to quantify the complexity of compilation. This can be done only by considering a definition of the monoid by generators and relations, allowing one to consider the degree of an element – the length of the smallest word in the generators representing it – and therefore the degree of a compilation. (Let us notice that this notion has already been considered in relation to dynamical systems, where it is used to define what are called the algebraic entropy and the fundamental-group entropy (KatokHasselblatt, Section 3.1).)

Although we will not consider the notion of compilation in this work, it is remarkable that the representation of the monoid by generators and relations is needed in the definition of the parallelisation of actions – defined below as the crew operation.

Definition 4 ().

A graphing representative G w.r.t. a monoid action α : M ↷ X is defined as a set E of edges together with, for each edge e in E, a pair (S_e, m_e) of a subspace S_e of X – the source of e – and an element m_e of M – the realiser of e.

Graphings come in different flavours (discrete, topological, measurable), depending on the type of space one wishes to consider: if X is a topological space, the action will be continuous; if X is a measure space, the action will be measurable. While the notion of graphing representative does not depend on this choice, the notion of graphing is defined as a quotient of the space of graphing representatives w.r.t. an adequate notion of equivalence. We will here consider the notion of topological graphing (seiller-igg), which we will simply call graphing. In this case, the notion of equivalence is easier to define than in the case of measurable graphings, as the latter requires one to consider almost-everywhere equality.

Definition 5 (Refinement).

A graphing representative F is a refinement of a graphing representative G, noted F ≤ G, if there exists a partition of the edge set of F, indexed by the edges of G (we allow the cells of the partition to be empty), such that for each edge e of G: the sources of the edges in the cell indexed by e partition the source of e, and each of these edges has the same realiser as e.

This notion induces an equivalence relation: F ∼ G if and only if there exists H with H ≤ F and H ≤ G.

Definition 6 ().

A graphing is an equivalence class of graphing representatives w.r.t. the equivalence relation generated by refinements.

A graphing is deterministic if its representatives are deterministic, i.e. if any representative is such that, for every point x of the space, there is at most one edge whose source contains x.
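A toy graphing representative over a discrete action on the integers can be sketched as follows; the generators, edges and helper names are ours, for illustration only.

```python
# Each edge is a pair of a source predicate (carving out a subspace) and a
# realiser word in the generators of the acting monoid.

gens = {"inc": lambda n: n + 1, "half": lambda n: n // 2}

edges = [
    (lambda n: n % 2 == 1, ["inc"]),             # odd numbers: increment
    (lambda n: n % 2 == 0 and n > 0, ["half"]),  # positive evens: halve
]

def is_deterministic(edges, sample):
    """Determinism: at most one edge's source contains any given point
    (checked here on a finite sample of the space)."""
    return all(sum(1 for src, _ in edges if src(x)) <= 1 for x in sample)

def step(edges, gens, x):
    """Fire the edge whose source contains x, applying its realiser;
    return None when no edge applies (the computation halts)."""
    for src, word in edges:
        if src(x):
            for g in word:
                x = gens[g](x)
            return x
    return None
```

Since the two sources are disjoint, this graphing is deterministic and iterating `step` defines a partial dynamical system: 5 steps to 6, 6 steps to 3, and the dynamics halts at 0, which lies in no source.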

Definition 7 ().

An abstract program within an amc is defined as a finite set of control states together with a graphing w.r.t. the underlying monoid action, considered on the product of the space with the set of control states.

An abstract program is deterministic if its underlying graphing is deterministic.

4. The crew

In this section, we explain how the abstract framework described in the last section can be used to model parallel computation. As usual, one is bound to choose a mode of interaction between the different processes when dealing with shared memory. We will consider here only the case of Concurrent Read Exclusive Write (crew): all processes can read the shared memory concurrently, but if several processes try to write to the shared memory, only the process with the smallest index is allowed to do so.

We abstract the crew mode of interaction at the level of monoids, by performing an operation reminiscent of the amalgamated sum (Bourbaki:AlgI-III, A, I, §7, 3) – in that it also generalises the free product – but chosen relative to the monoid actions. For this, we suppose that we have two monoid actions whose underlying spaces share a component representing the shared memory. Among the generators of each monoid, we separate those that potentially conflict with the generators of the other monoid (typically, writes) from the others, and perform a sum over those generators.

Definition 1 (Conflicted sum).

Let M = ⟨G | R⟩ and N = ⟨H | S⟩ be two monoids and I a relation between the generators of M and N, called the conflict relation. We define the conflicted sum of M and N over I, noted M ∗_I N, as the monoid ⟨G ⊔ H | R ∪ S ∪ C⟩, where C is the set of commutations

C = { gh = hg | g ∈ G, h ∈ H, (g, h) ∉ I },

and the neutral elements of the two components are identified.

In the particular case where I = A × B, with A and B respectively subsets of G and H, we will write the sum M ∗_{A,B} N.

Remark.

When the conflict relation is empty, this defines the usual direct product of monoids. This was to be expected: one should think of this relation as specifying the elements that do not commute because they interact with the shared memory. As a consequence, when it is empty, no conflicts can arise w.r.t. the shared memory; in other words, the direct product of monoids corresponds to the parallelisation of processes without shared memory.

Dually, when the relation is full (I = G × H), it defines the free product of the monoids: the free product corresponds to the parallelisation of processes where all instructions interact with the shared memory.

Definition 2 ().

Let be a monoid action. We say that an element is central relatively to (or just central) if the action of commutes with the first projection , i.e. ; in other words acts as the identity on .

Intuitively, central elements are those that do not affect the shared memory. As such, they raise no issues when the processes are put in parallel. Non-central elements, on the other hand, must be handled with care.

Definition 3 ().

Let be an . We note the set of central elements and the set .

Definition 4 (The crew operation).

Let and be . We define the

by letting on elements of , where is defined as:

with .

We now need to check that the operation on monoids and the action are defined coherently, in other words that the previous operation is compatible with the quotient by the adequate relations, i.e. that it does define a monoid action.

Lemma 5 ().

The crew operation on is well-defined.

5. Entropy and Cells

5.1. Topological Entropy

Topological Entropy was introduced in the context of dynamical systems in an attempt to classify the latter w.r.t. conjugacy. The topological entropy of a dynamical system is a value representing the average exponential growth rate of the number of orbit segments distinguishable with a finite (but arbitrarily fine) precision. The definition is based on the notion of open covers.

Open covers.

Given a topological space , an open cover of is a family of open subsets of such that . A finite cover is a cover whose indexing set is finite. A subcover of a cover is a sub-family for such that is a cover, i.e. such that .

We will denote by (resp. ) the set of all open covers (resp. all finite open covers) of the space .

We now define two operations on open covers that are essential to the definition of entropy. An open cover , together with a continuous function , defines the inverse image open cover . Note that if is finite, is finite as well. Given two open covers and , we define their join as the family . Once again, if both initial covers are finite, their join is finite.
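These two operations are easy to experiment with on finite spaces. The following Python sketch (the function names are ours) represents a finite cover as a set of frozensets, and computes joins and inverse-image covers.

```python
def join(U, V):
    """Join of two covers: all pairwise intersections (empty ones dropped)."""
    return {u & v for u in U for v in V if u & v}

def preimage_cover(f, U, domain):
    """Inverse-image cover f^{-1}(U) of a finite cover U under the map f."""
    return {frozenset(x for x in domain if f(x) in u) for u in U} - {frozenset()}

# A four-point space, two finite covers, and the rotation x -> x + 1 mod 4.
X = set(range(4))
U = {frozenset({0, 1}), frozenset({1, 2, 3})}
V = {frozenset({0, 1, 2}), frozenset({3})}
rotate = lambda x: (x + 1) % 4
```

As expected, both operations preserve finiteness: the join of U and V has at most |U| x |V| elements, and the inverse-image cover has at most |U| elements.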

Entropy.

Usually, entropy is defined for continuous maps on a compact set, following the original definition by Adler, Konheim and McAndrew (AKmA). Since every open cover of a compact space admits a finite subcover, the smallest subcover of any cover is finite: given an arbitrary cover , one can consider the smallest (in terms of cardinality) subcover and associate to the finite quantity . This quantity obviously need not be finite in the general case of an arbitrary cover of a non-compact set.

However, a generalisation of entropy to non-compact sets can easily be defined by restricting the usual definition to finite covers. This is discussed by Hofer (Hofer75), together with another generalisation based on the Stone-Čech compactification of the underlying space. The restriction to finite covers is the definition we will use here.

Definition 1 ().

Let be a topological space, and be a finite cover of . We define the quantity as

In other words, if is the cardinality of the smallest subcover of , .
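On a finite space this quantity can be computed by brute force. A Python sketch (exponential search, purely illustrative):

```python
from itertools import combinations

def smallest_subcover_size(cover, X):
    """Cardinality of the smallest subcover of a finite cover of X
    (brute-force search over all sub-families)."""
    cover = list(cover)
    for k in range(1, len(cover) + 1):
        for sub in combinations(cover, k):
            if set().union(*sub) >= X:
                return k
    raise ValueError("not a cover of X")

# The associated entropy-like quantity would be
# log2(smallest_subcover_size(cover, X)).
```

For instance, on X = {0, 1, 2, 3}, the cover {{0,1}, {1,2}, {2,3}} has smallest subcover {{0,1}, {2,3}} of size 2, while adding the whole space as an element drops the minimum to 1.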

Definition 2 ().

Let be a topological space and be a continuous map. For any finite open cover of , we define:

One can show that the limit exists and is finite; it will be noted . The topological entropy of is then defined as the supremum of these values, when ranges over the set of all finite covers .

Definition 3 ().

Let be a topological space and be a continuous map. The topological entropy of is defined as .

5.2. Graphings and Entropy

We now need to define the entropy of a deterministic graphing. As briefly mentioned already, deterministic graphings on a space are in one-to-one correspondence with partial dynamical systems on . To convince oneself of this, it suffices to notice that any partial dynamical system can be represented as a graphing with a single edge, and that if a graphing is deterministic, its edges can be glued together to define a partial continuous function . Thus, we only need to extend the notion of entropy to partial maps; we can then define the entropy of a graphing as the entropy of its corresponding map .

Given a finite cover , the only issue with partial continuous maps is that is not in general a cover. Indeed, is a family of open sets by continuity of but the union is a strict subspace of (namely, the domain of ). It turns out the solution to this problem is quite simple: we notice that is a cover of and now work with covers of subspaces of . Indeed, is itself a cover of and therefore the quantity can be defined as .

We now generalise this definition to arbitrary iterations of by extending Definitions 2 and 3 to partial maps as follows.

Definition 4 ().

Let be a topological space and be a continuous partial map. For any finite open cover of , we define:

The entropy of is then defined as , where is again defined as the limit .

Now, let us consider the special case of a graphing with set of control states . For an intuitive understanding, one can think of as the representation of a pram machine. We focus on the specific open cover indexed by the set of control states, i.e. , and call it the states cover. We will now show how the partial entropy is related to the set of admissible sequences of states. Let us define those first.

Definition 5 ().

Let be a graphing, with set of control states . An admissible sequence of states is a sequence of elements of such that for all there exists a subset of – i.e. a set of configurations – such that contains an edge from to a subspace of .

Example 6.

As an example, let us consider the very simple graphing with four control states and edges from to , from to , from to and from to . Then the sequences and are admissible, but the sequences , , and are not.
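Admissible sequences are easy to enumerate mechanically. A Python sketch (the cycle below mimics the four-state example; the state names a, b, c, d are ours):

```python
def admissible_sequences(edges, states, n):
    """All state sequences (s_1, ..., s_n) such that there is an
    edge from s_i to s_{i+1} for every i."""
    seqs = [(s,) for s in states]
    for _ in range(n - 1):
        seqs = [seq + (t,) for seq in seqs
                for t in states if (seq[-1], t) in edges]
    return seqs

# A four-state cycle a -> b -> c -> d -> a: each state has exactly one
# successor, so there are exactly four admissible sequences per length.
cycle = {('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'a')}
```

This brute-force enumeration is exponential in general, but it makes the definition concrete on small examples.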

Lemma 7 ().

Let be a graphing, and its states cover. Then for every integer , the set of admissible sequences of states of length is of cardinality .

Proof.

We show that the set of admissible sequences of states of length has the same cardinality as the smallest subcover of . Hence , which implies the result.

The proof is done by induction. As a base case, let us consider the set of admissible sequences of states of length and the open cover of . An element of is an intersection , and it is therefore equal to where is the set . This set is empty if and only if the sequence belongs to . Moreover, given another sequence of states (not necessarily admissible), the sets and are disjoint. Hence a set is removable from the cover if and only if the sequence is not admissible. This implies the result for .

The induction step is similar to the base case. It suffices to consider the partition as . By the same argument, one can show that elements of are of the form where is the set . Again, these sets are pairwise disjoint, and empty if and only if the sequence is not admissible. ∎

A tractable bound on the number of admissible sequences of states can be obtained by noticing that the sequence is sub-additive, i.e. . A consequence of this is that . Thus the number of admissible sequences of states of length is bounded by . We now study how the cardinality of admissible sequences can be related to the entropy of .
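The sub-multiplicativity of these counts (equivalently, the sub-additivity of their logarithms) can be checked numerically. A Python sketch on a hypothetical two-state graphing whose counts follow the Fibonacci sequence:

```python
def count_paths(adj, n):
    """Number of admissible state sequences of length n (n states,
    n - 1 edges) in the finite graph adj: state -> set of successors."""
    counts = {s: 1 for s in adj}
    for _ in range(n - 1):
        counts = {s: sum(counts[t] for t in adj[s]) for s in adj}
    return sum(counts.values())

# Two states with a -> {a, b} and b -> {a}: the counts follow the
# Fibonacci sequence, and a_{n+m} <= a_n * a_m, i.e. the logarithms
# form a sub-additive sequence (splitting a path of length n + m into
# a prefix of length n and a suffix of length m is injective).
adj = {'a': {'a', 'b'}, 'b': {'a'}}
```

Fekete's lemma then guarantees that (1/n) log of these counts converges, here to the logarithm of the golden ratio.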

Lemma 8 ().

For all , there exists an integer such that for all , .

Proof.

Let us fix some . Notice that if we let , the sequence satisfies . By Fekete’s lemma on subadditive sequences, this implies that exists and is equal to . Thus .

Now, the entropy is defined as . This then rewrites as . We can conclude that for every finite open cover .

Since is the limit of the sequence , there exists an integer such that for all the following inequality holds: , which rewrites as . From this we deduce . ∎

Lemma 9 ().

Let be a graphing, and let . Then as goes to infinity.

5.3. Cells Decomposition

Now, let us consider a deterministic graphing , with its state cover . We fix a length and reconsider the sets (for a sequence of states ) that appear in the proof of Lemma 7. The set is a partition of the space .

This decomposition splits the set of initial configurations into cells satisfying the following property: for any two initial configurations contained in the same cell , the first iterations of go through the same admissible sequence of states .
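This decomposition can be computed directly for a toy machine. A Python sketch (the step function and state names are invented for illustration): inputs are grouped by the length-n trajectory of control states they induce.

```python
def cell_decomposition(step, inputs, n):
    """Group inputs by the sequence of control states visited during
    their first n steps: two inputs land in the same cell exactly
    when they traverse the same admissible sequence of states."""
    cells = {}
    for x in inputs:
        state, trajectory = 'init', []
        for _ in range(n):
            state = step(state, x)
            trajectory.append(state)
        cells.setdefault(tuple(trajectory), []).append(x)
    return cells

# A toy machine that branches once on the sign of its input, then idles:
# the decomposition has exactly two cells, one per sign.
def step(state, x):
    return ('pos' if x > 0 else 'neg') if state == 'init' else state
```

The number of cells is exactly the number of admissible sequences of states realised by some input, which is what the entropic bounds below control.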

Definition 10 ().

Let be a deterministic graphing, with its state cover . Given an integer , we define the -fold decomposition of along as the partition .

Then Lemma 7 provides a bound on the cardinality of the -th cell decomposition. Using the results in the previous section, we can then obtain the following proposition.

Proposition 11 ().

Let be a deterministic graphing, with entropy . The cardinality of the -th cell decomposition of w.r.t. , as a function of , is asymptotically bounded by , i.e. .

We also state another bound on the number of cells of the -th cell decomposition, based on the state cover entropy, i.e. the entropy computed with respect to the state cover rather than as the supremum of cover entropies over all finite covers of the space. This result is a simple consequence of Lemma 7.

Proposition 12 ().

Let be a deterministic graphing. We consider the state cover entropy where is the state cover. The cardinality of the -th cell decomposition of w.r.t. , as a function of , is asymptotically bounded by , i.e. .

6. Algebraic Computation Trees and Ben-Or’s technique

We will now explain how to obtain lower bounds for algebraic models of computation based on the interpretation of programs as graphings and entropic bounds. These results make use of the Milnor-Thom theorem which bounds the sum of the Betti numbers of algebraic varieties. In fact, we will use a version due to Ben-Or of this theorem.

6.1. Milnor-Thom theorem

Let us first recall the classic Milnor-Thom theorem.

Theorem 1 ((Milnor:1964, Theorem 3)).

If is defined by polynomial identities of the form

with total degree , then

We will use in the proof the following variant of the Milnor-Thom bounds, stated and proved by Ben-Or.

Theorem 2 ().

Let .

Let be the maximal number of connected components of a set defined by the following polynomial equations:

for of degree at most .

If , we have:

In what follows, we will write composition of functions as instead of .

6.2. Algebraic decision trees

One lower-bound result related to Mulmuley's techniques is the bound obtained by Steele and Yao (SteeleYao82) on algebraic decision trees. Algebraic decision trees are finite ternary trees describing a program deciding a subset of : each internal node tests whether a chosen polynomial, say , takes a positive, negative, or null value at the point considered.

Definition 3 ((SteeleYao82)).

Let .

A -th order algebraic decision tree for is a ternary tree where

  • each internal node contains a test of the form , where is a polynomial of degree at most ;

  • each leaf is labelled by or .

We say that the son of an internal node labeled by a polynomial is consistent for if it is the right son and , the middle son and , or the left son and . A branch is consistent for if all the sons of the internal nodes in the branch are consistent for .

An algebraic decision tree decides a set if, for all , if and only if the unique maximal branch consistent with ends on a leaf labelled by .
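The traversal just described is straightforward to implement. A Python sketch (the encoding is our own: an internal node is a tuple of a polynomial and three sons for the negative, null and positive cases; a leaf is a boolean):

```python
def decide(tree, x):
    """Follow the unique branch of an algebraic decision tree that is
    consistent with x.  An internal node is (p, left, middle, right),
    taken when p(x) < 0, p(x) == 0, p(x) > 0 respectively; a leaf is
    a boolean giving membership in the decided set."""
    node = tree
    while not isinstance(node, bool):
        p, left, middle, right = node
        value = p(x)
        node = left if value < 0 else middle if value == 0 else right
    return node

# A height-1 tree deciding the unit circle {x : x_1^2 + x_2^2 = 1}.
circle = (lambda x: x[0] ** 2 + x[1] ** 2 - 1, False, True, False)
```

Since the tree is ternary and each input follows exactly one consistent branch, a tree of height h partitions the input space into at most 3^h sign-condition cells, which is the starting point of the bound below.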

We now define an of algebraic decision trees. Somewhat degenerately, the underlying space of algebraic decision trees is , and the sets of generators and relations of the monoid are empty (which means that the monoid is ), so the is , where denotes the trivial action. Intuitively, this is to be expected, as algebraic decision trees do not act on the space of configurations.

Let be an algebraic decision tree. It can be described as a finite set where the are polynomials on , together with a relation between the elements of the control states.

Definition 4 ().

Let be an algebraic decision tree. We define as the graphing with set of control states where the are the polynomials of , and each internal node with label and sons defines three edges:

  • one of source realized by ;

  • one of source realized by ;

  • one of source realized by .

From Proposition 11, one easily obtains the following theorem.

Theorem 5 ().

Let be a -th order algebraic decision tree deciding a subset . Then the number of connected components of is bounded by , where is the height of .

Proof.

We let be the height of , and be the maximal degree of the polynomials appearing in . Then the -th cell decomposition of defines a family of semi-algebraic sets defined by polynomial equalities and inequalities of degree at most . Moreover, Proposition 12 states that this family has cardinality bounded by ; since each state has at most one antecedent state, this bound becomes . Thus, the -th cell decomposition defines at most algebraic sets, which have at most connected components. Since the set decided by is obtained as a union of the semi-algebraic sets in the -th cell decomposition, it has at most connected components. ∎

Corollary 6 (Steele and Yao (SteeleYao82)).

A -th order algebraic decision tree deciding a subset with connected components has height .
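The counting behind this corollary can be summarised as follows (a reconstruction of the standard Steele-Yao argument; the exact constants may differ slightly from their statement):

```latex
\#\{\text{branches of } T\} \;\le\; 3^h,
\qquad
\#\mathrm{c.c.}(W) \;\le\; 3^h \cdot d\,(2d-1)^{\,n+h-1},
```

so a subset of $\mathbb{R}^n$ with $N$ connected components forces $h = \Omega(\log N)$ once the dimension $n$ and the degree bound $d$ are fixed.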

This result of Steele and Yao adapts in a straightforward manner to a notion of algebraic computation trees describing the construction of the polynomials to be tested by means of multiplications and additions of the coordinates. The authors remarked that this result uses techniques quite similar to those of Mulmuley's lower bounds for the model of prams without bit operations. It is also strongly similar to the techniques used by Cucker in proving that (Cucker92).

However, a refinement of Steele and Yao's method was quickly obtained by Ben-Or, allowing for divisions and square roots in this notion of algebraic computation trees. We will now explain Ben-Or's technique from within the framework of graphings. We will later adapt this refinement of Steele and Yao's method to Mulmuley's prams without bit operations, in order to obtain the main theorem of this paper.

6.3. Algebraic Computational Trees

Algebraic computational trees follow the same principles as algebraic decision trees, but they additionally allow for the representation of computations as part of the tree, i.e. one considers a node for every algebraic operation on the set of polynomials.

More formally, an algebraic computational tree is defined from the nodes , , , , and a test node with three sons corresponding to , and as in the algebraic decision trees case.

The difference is thus that algebraic computation trees only perform tests on expressions that are first constructed by means of algebraic operations. If one restricts to the fragment without division and square root, the overall computational power, i.e. the sets decided, of computational trees and decision trees is the same. However, while testing whether a given polynomial is greater than needs only one node in an algebraic decision tree, it in general requires more nodes in an algebraic computational tree, since the polynomial must be computed explicitly from basic algebraic operations.

It is then no surprise that bounds similar to those for algebraic decision trees can be obtained by similar methods in the restricted fragment without division and square roots. An improvement on this is Ben-Or's result generalising the technique to algebraic computational trees with division and square root nodes. The principle is quite simple: one adds fresh variables to avoid using the square root or division, obtaining in this way a system of polynomial equations. For instance, instead of writing the equation , one defines a fresh variable and considers the system
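Concretely, the rewriting replaces each division or square-root node by polynomial constraints on a fresh variable (a generic instance; the variable names are illustrative):

```latex
z = \frac{x}{y} \;\rightsquigarrow\; \{\, z\,y - x = 0,\ y \neq 0 \,\},
\qquad
z = \sqrt{x} \;\rightsquigarrow\; \{\, z^2 - x = 0,\ z \geq 0 \,\}.
```

Each branch of the tree thus defines a semi-algebraic set in a space enlarged by the fresh variables, and the Milnor-Thom bounds apply to that system.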

This method seems different from the direct entropy bound obtained in the case of algebraic decision trees. However, we will see how it can be adapted directly to graphings.

Given an integer , we define the following subspaces of :

  • ;

  • ;

  • ;

  • ;

  • ;

  • .

Figure 2. An algebraic computation tree
Definition 7 (treeings).

A treeing is an acyclic and finite graphing, i.e. a graphing for which there exists a finite graphing representative with set of control states and such that every edge of is state-increasing, i.e. for each edge of source , for all ,

where denotes the projection onto the control states space.

A computational graphing is a graphing with distinguished states , which admits a finite representative such that each edge has its source equal to one among , , , , , , and .

A computational treeing is a treeing which is a computational graphing with the distinguished states , being incomparable maximal elements of the state space.

Definition 8 (Algebraic computation trees, (Ben-Or83)).

An algebraic computation tree on is a binary tree with a function that assigns:

  • to any vertex with only one son (simple vertex) an operational instruction of the form

    where , are ancestors of and is a constant;

  • to any vertex with two sons a test instruction of the form

    where is an ancestor of or ;

  • to any leaf an output YES or NO.

Let be any set and be an algebraic computation tree. We say that computes the membership problem for if for all , the traversal of following ends on a leaf labelled YES if and only if .
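Such a traversal can be interpreted mechanically. A Python sketch with an encoding of our own devising (constants are passed as extra input coordinates; the tree below decides whether the product of the two inputs exceeds 1):

```python
def run_act(tree, x):
    """Interpret an algebraic computation tree in a hypothetical
    encoding.  vals holds the input followed by computed values; a
    node is ('op', op, i, j, next), ('test', i, yes, no), or a leaf
    'YES' / 'NO'; tests branch on vals[i] > 0."""
    vals = list(x)
    node = tree['root']
    while node not in ('YES', 'NO'):
        if node[0] == 'op':
            _, op, i, j, nxt = node
            a, b = vals[i], vals[j]
            vals.append(a + b if op == '+' else a - b if op == '-'
                        else a * b if op == '*' else a / b)
            node = tree[nxt]
        else:
            _, i, yes, no = node
            node = tree[yes] if vals[i] > 0 else tree[no]
    return node == 'YES'

# Decides x1 * x2 > 1; the constant 1 is the third input coordinate.
tree = {'root': ('op', '*', 0, 1, 'n1'),   # v3 = x1 * x2
        'n1':   ('op', '-', 3, 2, 'n2'),   # v4 = v3 - 1
        'n2':   ('test', 4, 'yes', 'no'),
        'yes': 'YES', 'no': 'NO'}
```

Note that the operational nodes build values from ancestors before any test is made on them, which is exactly the difference with algebraic decision trees discussed above.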

We can define the of algebraic computation trees. The underlying space is and the acting monoid is generated by , , , , , , , , , for and acting on as follows:

  • ;

  • ;

  • ;

  • if ;

  • ;

  • ;

  • ;

  • if ;

  • if .

Definition 9 ().

Let be a computational treeing on the of algebraic computational trees. The set of inputs (resp. outputs ) is the set of integers (resp. ) such that there exists an edge in :

  • either is realised by one of , , , , , , , , , , , ;

  • or the source of is one among , , , , , and .

The input space of a treeing on the of algebraic computational trees is defined as the set of indices belonging to but not to .

Definition 10 ().

Let be a treeing on the of computational trees, and let be an integer larger than the maximal element in . We say that computes the membership problem for if for all , the successful iterations of on the subspace reach the state if and only if .

Remark.

Consider two elements in . One can easily check that if and only if , where is the projection onto the state space and represents the -th iteration of on . It is therefore possible to consider only a standard representative of , for instance , to decide whether is accepted by .

Definition 11 ().

Let be an algebraic computation tree on , and be the associated directed acyclic graph, built from by merging all the leaves tagged YES in one leaf and all the leaves tagged NO in one leaf . Suppose the internal vertices are numbered ; the numbers being reserved for the input.

We define as the graphing with control states and where each internal vertex of defines either:

  • a single edge of source realized by:

    • if is associated to and is the son of ;

    • if is associated to and is the son of ;