Parametrised second-order complexity theory with applications to the study of interval computation

11/28/2017
by   Eike Neumann, et al.

We extend the framework for complexity of operators in analysis devised by Kawamura and Cook (2012) to allow for the treatment of a wider class of representations. The main novelty is to endow represented spaces of interest with an additional function on names, called a parameter, which measures the complexity of a given name. This parameter generalises the size function which is usually used in second-order complexity theory and therefore also central to the framework of Kawamura and Cook. The complexity of an algorithm is measured in terms of its running time as a second-order function in the parameter, as well as in terms of how much it increases the complexity of a given name, as measured by the parameters on the input and output side. As an application we develop a rigorous computational complexity theory for interval computation. In the framework of Kawamura and Cook the representation of real numbers based on nested interval enclosures does not yield a reasonable complexity theory. In our new framework this representation is polytime equivalent to the usual Cauchy representation based on dyadic rational approximation. By contrast, the representation of continuous real functions based on interval enclosures is strictly smaller in the polytime reducibility lattice than the usual representation, which encodes a modulus of continuity. Furthermore, the function space representation based on interval enclosures is optimal in the sense that it contains the minimal amount of information amongst those representations which render evaluation polytime computable.




1 Introduction

Computable analysis is an extension of the theory of computation over the natural numbers to continuous data, such as real numbers and real functions, based on the Turing machine model of computation. Computability of real numbers is studied already in Turing's paper [Tur36] on the halting problem. Computability of real functions was introduced by Grzegorczyk [Grz57, Grz55a, Grz55b] and Lacombe [Lac55]. Kreitz and Weihrauch [KW85, Wei00] introduced a general theory of computation on second-countable $T_0$-spaces. This was further generalised by Schröder [Sch02a, Sch02b] to quotients of countably based spaces, which constitute in a certain sense the largest class of topological spaces which can be endowed with a reasonable computability structure [Sch02b, Theorem 13].

One of the goals of computable analysis is to provide mathematically rigorous semantics for computation over continuous data structures. Algorithms in numerical analysis are usually described in the real number model, where real numbers are regarded as atomic entities. A widely used mathematically rigorous formalisation of this idea is the Blum-Shub-Smale machine [BSS89, BCSS98]. Such algorithms cannot be implemented directly on a physical machine, as real numbers cannot be encoded with a finite number of bits. The usual substitute for real numbers are floating point numbers, which behave quite differently. For instance, addition and multiplication on floating point numbers are not even associative. Thus, the behaviour of floating point algorithms depends on phenomena that are absent in the real number model, such as numerical stability and error propagation. These issues have to be studied separately, which usually requires a substantial amount of additional effort. Even then, the precise contract that an implementation fulfils is usually not fully specified and the semantics of the implementation remain vague. As a consequence the semantics of an algorithm can differ considerably from the semantics of its implementation, and different implementations may well have different semantics.

By contrast, any algorithm based on computable analysis can be implemented directly on a physical computer. Such an algorithm comes with a rigorous specification of its input and output and precisely describes the steps that have to be taken to obtain the desired result to a given accuracy. Software packages based on computable analysis include iRRAM [Mül17], Ariadne [BBC08], AERN [Kon], and RealLib [Lam06].

For the study of practical algorithms it is clear that computational complexity should play a central role. Whilst the notion of computability over continuous data is robust, well understood, and universally accepted in the computable analysis community, computational complexity in analysis is far less developed and even some basic definitions are the subject of ongoing debate. The study of computational complexity in analysis was initiated by Friedman and Ko [KF82]. They defined the computational complexity of real numbers and of real functions on compact intervals and proved some famous hardness results for problems such as integration, maximisation, or solving ordinary differential equations (see [Fri84, Ko83], cf. also [Kaw10]). This line of research is summarised nicely in Ko's book [Ko91].

The main gap in the work of Friedman and Ko is that, while their definition of computational complexity for real functions carries over to functions on compact metric spaces, it does not generalise easily to functions on non-compact spaces. In practice one is most interested in the study of operators on infinite-dimensional vector spaces, such as spaces of continuous functions, $L^p$-spaces or Sobolev spaces. The aforementioned hardness results concern operators of this kind, but they are non-uniform in the sense that they establish that the operators in question map certain feasibly computable functions to functions of conjecturally high complexity (assuming standard conjectures in computational complexity such as P ≠ NP). While such non-uniform lower bounds are sufficient to show that an operator is infeasible, it remained unclear which operators should be considered feasible.

One of the main reasons for such a notion not being available was the lack of an accepted notion of feasibility for second-order functionals. A candidate solution had been proposed by Mehlhorn already in 1975 [Meh76]: the class of basic feasible functionals, which he defined by means of a generalisation of a limited recursion scheme that leads to a well-known characterisation of the polytime computable functions on the natural numbers. However, it remained a point of debate for a long time to which extent this class fully captures the intuitive notion of feasibility [Coo92]. Further investigations into this topic revealed the type-two basic feasible functionals to be a very stable class that became established as the foundation of second-order complexity theory [Pez98, IRK01]. An important step in this process, and something that opened the field up for applications, was the characterisation of the basic feasible functionals by means of resource bounded oracle Turing machines due to Kapron and Cook [KC96]. Based on this characterisation, Kawamura and Cook introduced a framework for complexity of operators in analysis [KC12] that generalises the definition of feasibly computable functions of Friedman and Ko to a wider class of spaces, including the aforementioned examples. This kicked off a series of investigations [FGH14, FZ15, Ste17, SS17, and many more].

However, there remains a gap between theory and practice. Within the framework of Kawamura and Cook it is impossible to model the behaviour of software based on computable analysis such as the libraries mentioned above. The reason for this is that all these implementations are based on interval arithmetic or extensions thereof, such as Taylor models. The representations which underlie these approaches are known to exhibit highly pathological behaviour within the framework of Kawamura and Cook [KP14a, Ste16] and those representations which can be used within their framework do not always seem to be an appropriate substitute. For instance, in the Kawamura-Cook model any representation of the space of continuous functions which renders evaluation polytime computable also allows for the computation of some modulus of continuity of a given function in polynomial time. In iRRAM this requires an exponential exhaustive search, see [BS17].

The present work is an attempt to bridge this gap by extending the framework of Kawamura and Cook in order to develop a meaningful complexity theory for a broader class of representations.

We do so by endowing a represented space $(X, \delta)$ with an additional function $\mu$, called the parameter, which is intended to measure the complexity of the names of elements of $X$. These parameters are a generalisation of the size function which is used to measure the complexity of string functions in second-order complexity theory. The pair $(\delta, \mu)$ is called a parametrised representation and the triple $(X, \delta, \mu)$ is called a preparametrised space. The complexity of an algorithm for computing a function between preparametrised spaces is measured by the dependence of the running time of the algorithm on the parameter of the input name and by the growth of the parameter of the output name compared with the parameter of the input name. As in the Kawamura-Cook model, polytime computability is defined using second-order polynomials: An algorithm runs in polynomial time if and only if both its running time and the parameter of the output name are bounded by a second-order polynomial in the parameter of the input name. A preparametrised space is called a parametrised space if its identity function is polynomial time computable.

The pathological behaviour of the representation of real numbers based on interval enclosures is eliminated by the natural choice of a parameter for this representation. The resulting parametrised representation is polytime equivalent to the usual Cauchy representation with the size function as parameter (Proposition ). We define a parametrised space of continuous functions based on interval enclosures, and show that this space most precisely reflects the behaviour of functions expected from implementations: On the one hand, it leads to the right notion of polytime computability of functions (Corollary ), function evaluation is polytime computable (Proposition ) and the same is true for many of the usual operations one wants to compute quickly (Theorem , Proposition ). On the other hand, finding a modulus of continuity, which is notoriously slow in practice, is provably not polytime computable (Corollary ).

We investigate the parametrised space of interval functions further and prove that any other parametrised representation of this space such that evaluation is polytime computable can be translated to it (Theorem ). There are two reasons why we consider this result to be especially important: Firstly it resembles a result that Kawamura and Cook proved about a representation they introduced, which is currently considered to be the standard representation for the continuous functions on the unit interval for this reason. We compare their representation to the representation using interval enclosures and show that it sits strictly higher in the lattice of polytime translatability (Corollary ). This reflects the fact that the minimality result by Kawamura and Cook ranges over a restricted class of parametrised spaces. We characterise the spaces they consider as essentially those that have a polytime computable parameter (Theorem ). The second reason why we consider the minimality result for interval functions to be important is that it includes a quantification over all parametrised representations. This demonstrates that the definition of a parametrised space is not chosen too general to allow for meaningful results. Throughout the paper we provide more support for our belief that parametrised spaces are a good general framework for complexity considerations in computable analysis (for instance Theorem ).

Related work.

Parameters and parametrised complexity in our sense are present in the work of Rettinger [Ret13] and Lambov [Lam05]. Rettinger works in a different setting, avoiding second-order polynomials; we significantly add to and modify his ideas. Lambov's work includes a good part of our results on interval representations in a different language. However, Lambov does not attempt to build a general framework of parametrised complexity in analysis, which requires further ideas that are not present in his work. We hence believe that the present work extends his results considerably.

For a restricted case some of the core definitions proposed in this paper are also present in the work of Kawamura, Müller, Rösnick and Ziegler [KMRZ12]. The authors introduce parameter functions with integer values. This covers only a very special case of preparametrised spaces, namely those where the value of the parameter on a name is a constant function. More significantly, their applications all make the value of the parameter accessible to the algorithm and therefore allow for formulation in the framework of Kawamura and Cook by modification of the representations involved. For their applications this is unavoidable, as the functions they consider fail to be feasibly computable unless the parameter is provided as extra advice. Most of the content of the preprint [KMRZ12] was published in [KMRZ15]. Unfortunately, in the published version the authors decided to further restrict the definition of parameters, by making it a function on the represented space, rather than a function on the domain of the representation. They have to recover the desired behaviour by first introducing suitable covering spaces of the spaces of interest. This makes it a lot more difficult to see the connection between their work and ours.

We also consider work by Schröder to be related [Sch04]. There are two ways in which his results relate to the contents of this paper: The first connection is that he equips a representation with an additional integer-valued size function that can, just like in the previous paragraph, be considered a special case of a parameter. The second connection is more interesting but also more difficult to make: Schröder provides conditions on represented spaces under which every machine that computes a function between these spaces has a well-defined first-order running time. This can be interpreted as devising a pair of a representation and a parameter for the computable functions between two spaces such that evaluation is polytime computable: The representation takes an index of a machine computing a realiser of a function to that function, and the parameter assigns to such an index the running time of the machine it encodes. The running time of the evaluation operation is the overhead needed for simulating a machine. However, one should note that these observations are not explicitly stated in [Sch04]. Also, Schröder does not consider polytime computability but only reasons about the existence of time bounds in general.

Kawamura and Pauly [KP14a] study exponential objects in the category of polytime computable mappings. They consider one of the standard function space constructions from computable analysis, which is obtained by encoding a continuous function by an index of a Turing machine together with an oracle, such that the machine computes the function relative to the oracle. They show that well-behaved function spaces can be constructed for a class of spaces which they call “effectively polynomially bounded”. This class of spaces is very similar to the one considered by Schröder. Their function space construction can be viewed as an extension of the construction sketched in the previous paragraph by adding arbitrary oracles. Like Schröder they measure the size of objects in effectively polynomially bounded spaces by means of a parameter function with values in the natural numbers. Their work is also the only example in the literature that we are aware of that discusses the issue of polytime computability of the identity function on a parametrised space. Curiously, in the published version of the paper the connections to parametrised complexity are significantly obscured. The connection to our work is much more visible in an early preprint [KP14b].

Outline of the paper.

Section 1 recalls the most important concepts from second-order complexity theory. The approach via resource bounded oracle Turing machines is chosen, and this is essential for the rest of the paper. The discussion of the framework of Kawamura and Cook, which imposes an additional condition on the names, is postponed to the later Section 3. The second part, Section 1.2, recalls the Cauchy and the interval representation of the real numbers and discusses how second-order complexity theory can be applied to find a well behaved notion of complexity for the former and why the same approach fails for the latter.

Section 2 introduces the general concept of parametrised space. Polytime computable functions are introduced using second-order polynomials. It is shown that they are closed under composition. Section 2.1 applies this to the representation of real numbers based on sequences of nested intervals. It is shown that the parametrised space of interval reals is polytime isomorphic to the parametrised space of Cauchy reals with the size function as parameter. In particular, the polytime computable points of the interval representation are the usual polytime computable real numbers.

Section 2.2 introduces a parameter for the space of continuous functions on the unit interval, where a function is represented by an interval enclosure. It is shown that this choice of representation and parameter is optimal in the sense that the resulting structure of parametrised space on the space of continuous functions is minimal amongst those structures which render evaluation polytime computable. In particular, the polytime computable points of this representation are the usual polytime computable functions.

Section 3 compares the approach presented here to the one of Kawamura and Cook. Theorem characterises those parametrised spaces which admit a polytime equivalent length-monotone representation and thus can be treated within the framework of Kawamura and Cook. Section 3.2 shows that the interval representation of continuous functions is not of this kind. Hence, the interval representation of continuous functions contains strictly less information than the function space representation which is used by Kawamura and Cook. This result mainly relies on an auxiliary result due to Brauße and Steinberg [BS17].

Appendix A discusses some alternative choices of representations to demonstrate the robustness of our definitions. The representation for real numbers used in Appendix A also coincides with the one used by Müller to model the behaviour of iRRAM [Mül01].

1.1 Second-order complexity theory

Fix a non-empty alphabet $\Sigma$. Let $M$ be an oracle machine, that is, a Turing machine with two designated tapes called the 'oracle query' and the 'oracle answer' tape and a special state called the 'oracle state'. An oracle for such an oracle machine is an element of Baire space $\mathcal{B} := (\Sigma^*)^{\Sigma^*}$. The oracle machine can be executed on a given string $a$ with a given oracle $\varphi$. It is executed like a regular Turing machine, except that whenever it enters the oracle state, the current content of the oracle answer tape is replaced with $\varphi(q)$, where $q$ is the current content of the oracle query tape. If the oracle machine terminates on input $a$ with oracle $\varphi$, the string that is written on its output tape after termination is denoted by $M^\varphi(a)$. This defines a partial function $M^\varphi \colon \subseteq \Sigma^* \to \Sigma^*$ which maps a string $a$ such that $M$ terminates on $a$ with oracle $\varphi$ to the string $M^\varphi(a)$. Every oracle machine computes a partial operator $F_M \colon \subseteq \mathcal{B} \to \mathcal{B}$ on Baire space: The domain of $F_M$ consists of all oracles $\varphi$ such that $M^\varphi$ is total. If $\varphi$ is an element of the domain of $F_M$ then the value of $F_M$ in $\varphi$ is given by $F_M(\varphi) := M^\varphi$.

This is slightly different from the definition of oracle Turing machines in classical computability theory, where the oracles are subsets of $\mathbb{N}$. While this is not important for computability considerations, it does make a difference when it comes to computational complexity. To be able to reason about complexity in this extended setting it is necessary to fix a convention for how to count oracle calls towards the time consumption of a machine. We adopt the most common convention: An oracle call is considered to take one time step, in which the answer magically appears on the oracle answer tape. Essentially this means that the machine is not forced to read the whole oracle answer. Another detail is the position of the read/write heads. We assume that the head position does not change during the oracle interaction. Denote the number of steps the oracle machine $M$ takes on input $a$ and with oracle $\varphi$ by $\operatorname{time}_M(\varphi, a)$. Of course, one of the sanity checks for any reasonable computational model is that the details fixed above should be irrelevant: One should end up with the same computational complexity classes when, for instance, the machine returns its head to the beginning of the oracle query tape during the oracle interaction.

Since the oracle is considered an input of the computation, a function bounding the time consumption of an oracle machine should be allowed to depend on the size of the oracle in addition to the size of the input string. The most common way to measure the size of a string function is to use the worst-case length-increase from input to output. That is, for a string function $\varphi \in \mathcal{B}$ let its size $|\varphi| \colon \mathbb{N} \to \mathbb{N}$ be defined by

$|\varphi|(n) := \max\{|\varphi(a)| : a \in \Sigma^*, |a| \le n\}.$   (s)
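As a concrete illustration (ours, not from the paper), the size function can be evaluated by brute force when a name is modelled as a Python function on bit-strings; the loop over all inputs of length at most n makes this exponential and it is only meant to make the definition tangible.

    # Sketch: the size |phi| of a string function phi as defined in (s),
    # computed by enumerating all inputs of length <= n.
    from itertools import product

    def size(phi, n, alphabet="01"):
        inputs = ("".join(w) for k in range(n + 1)
                  for w in product(alphabet, repeat=k))
        return max(len(phi(a)) for a in inputs)

    # A name that appends one bit to its input has size n + 1:
    assert size(lambda a: a + "0", 3) == 4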

Since a time bound for an oracle machine should produce a bound on the number of steps the execution takes from a size of the oracle and a size of the input, it should be of type $(\mathbb{N} \to \mathbb{N}) \times \mathbb{N} \to \mathbb{N}$. To talk about polytime computability it is necessary to find a subclass of functions of this type that is considered to have "polynomial" growth.

Definition

The class $\mathcal{P}$ of second-order polynomials is the smallest class of functions $P \colon (\mathbb{N} \to \mathbb{N}) \times \mathbb{N} \to \mathbb{N}$ such that the following conditions hold:

  • For all $c \in \mathbb{N}$ we have $(l, n) \mapsto c \in \mathcal{P}$ and $(l, n) \mapsto n \in \mathcal{P}$.

  • Whenever $P \in \mathcal{P}$ then also $(l, n) \mapsto l(P(l, n)) \in \mathcal{P}$.

  • Whenever $P, Q \in \mathcal{P}$ then both the point-wise sum $P + Q$ and the point-wise product $P \cdot Q$ are contained in $\mathcal{P}$.
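As an illustration (our example, not from the paper), a second-order polynomial is naturally modelled as a higher-order function; the following instance is built from the base cases by the closure rules above.

    # Sketch: P(l, n) = l(l(n)^2 + n) + l(n)*n + 3 as a higher-order function.
    def P(l, n):
        return l(l(n)**2 + n) + l(n) * n + 3

    # Evaluated at the monotone function l(n) = 2n and n = 3:
    assert P(lambda n: 2 * n, 3) == 99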

Since second-order polynomials are used as running time bounds, only the values on functions that turn up as sizes of string functions are relevant. These are exactly the monotone functions, i.e., functions $l$ satisfying $n \le m \Rightarrow l(n) \le l(m)$. This restriction is important, as the following lemma fails for more general arguments:

Lemma (Monotonicity)

Let $P$ be a second-order polynomial and let $l, l' \colon \mathbb{N} \to \mathbb{N}$ be monotone functions such that $l'$ is point-wise bigger than $l$. Then for all $n \le m$ we have $P(l, n) \le P(l', m)$.

The proof is a straightforward induction.

Definition

We call a partial operator $F \colon \subseteq \mathcal{B} \to \mathcal{B}$ polytime computable if there is an oracle machine $M$ that computes $F$ and a second-order polynomial $P$ such that

$\operatorname{time}_M(\varphi, a) \le P(|\varphi|, |a|)$ for all $\varphi \in \operatorname{dom}(F)$ and $a \in \Sigma^*$.

It was proved by Kapron and Cook [KC96] that a total operator is polytime computable if and only if the corresponding functional is basic feasible in the sense of Mehlhorn [Meh76]. The generalisation to partial operators adds an additional choice: By our choice the machine is only required to comply with the time bound in the case where the oracle is in the domain of the operator. Another approach would be to require the existence of a total extension that runs in polynomial time. This corresponds to replacing the quantification over $\operatorname{dom}(F)$ by a quantification over all of Baire space. It is possible to prove that this does indeed lead to a more restrictive notion of polytime computability [KS17].

To show closure of the class of polytime computable operators under composition, one needs monotonicity from Lemma and the following closure property of the second-order polynomials:

Proposition

Whenever $P$ and $Q$ are second-order polynomials, then so are the following mappings:

$(l, n) \mapsto P(l, Q(l, n))$ and $(l, n) \mapsto P(Q(l, \cdot), n)$,

where $Q(l, \cdot)$ denotes the function $m \mapsto Q(l, m)$.

Just like Lemma , Proposition can be proven by a straightforward induction on the structure of second-order polynomials.

Theorem (Composition)

Whenever $F, G \colon \subseteq \mathcal{B} \to \mathcal{B}$ are polytime computable, then so is $G \circ F$.

This is proven in a more general setting in Theorem and we refrain from restating the proof here.

To compute on more general spaces, representations are used:

Definition

Let $X$ be a set. A representation of $X$ is a partial surjective function $\delta \colon \subseteq \mathcal{B} \to X$. A represented space is a pair $(X, \delta)$ of a set and a representation of that set.

The elements of $\delta^{-1}(x)$ are called the names of $x$. An element of a represented space is called computable if it has a computable name.

Computability of functions between represented spaces can be defined via realisers.

Definition

Let $f \colon (X, \delta_X) \to (Y, \delta_Y)$ be a function between represented spaces. A function $F \colon \subseteq \mathcal{B} \to \mathcal{B}$ is called a realiser of $f$ if it translates names of the input to names of the output, that is, if

$\delta_Y(F(\varphi)) = f(\delta_X(\varphi))$ for all $\varphi \in \operatorname{dom}(f \circ \delta_X)$.

A function between represented spaces is called computable if it has a computable realiser. It is called polytime computable if it has a polytime computable realiser.

The later parts of this paper need a slight generalisation of the above to multi-valued functions. Recall that a multifunction $f \colon X \rightrightarrows Y$ assigns to each element $x \in X$ a subset $f(x) \subseteq Y$. Its domain $\operatorname{dom}(f)$ consists of all $x$ whose image under $f$ is nonempty. A partial single-valued function can be identified with the multifunction which sends elements $x$ of its domain to the singleton $\{f(x)\}$ and elements outside of the domain to the empty set. The definition of computability using realisers generalises to multifunctions in a straightforward way: A function $F$ is called a realiser of $f$ if $\delta_Y(F(\varphi)) \in f(\delta_X(\varphi))$ for all $\varphi \in \delta_X^{-1}(\operatorname{dom}(f))$. The elements of $f(x)$ are thus interpreted as "acceptable return values" for an algorithm which computes $f$. If $f \colon X \rightrightarrows Y$ and $g \colon Y \rightrightarrows Z$ are multifunctions then their composition $g \circ f$ is the multifunction with

$(g \circ f)(x) := \{z \in Z : \exists y \in f(x) \text{ with } z \in g(y)\}$

and

$\operatorname{dom}(g \circ f) := \{x \in \operatorname{dom}(f) : f(x) \subseteq \operatorname{dom}(g)\}.$

This definition ensures that the composition of two realisers is a realiser of the composition. Note that while multivalued functions can formally be identified with relations, from a conceptual point of view it is better not to do so. For instance, the composition defined above is different from the natural notion of composition for relations. Multifunctions are a standard tool in computable analysis to avoid certain kinds of continuity issues and are needed in Section 3.2 for this exact reason.

1.2 Notations and complexity on the reals

A name of an element of a represented space should be understood as a black box that provides on-demand information about the object it encodes. For real numbers, for instance, a reasonable query to such a black box could be 'provide me with a $2^{-n}$-approximation to the real number', and the answer that the name provides should be such an approximation. The input and the output of the name are finite binary strings, and questions like the above can be formulated by encoding elements of discrete structures like the integers and the rational numbers.

A notation of a space $X$ is a partial surjective mapping $\nu \colon \subseteq \Sigma^* \to X$. Fix the following standard notations: Let $\nu_{\mathbb{Z}}$ be the mapping defined on a string $a_0 a_1 \ldots a_k$ whose second digit is a 1 by

$\nu_{\mathbb{Z}}(a_0 a_1 \ldots a_k) := (-1)^{a_0} \cdot \sum_{i=1}^{k} a_i 2^{k-i}$

and zero on strings that do not have a second digit. A dyadic rational is a rational number of the form $m \cdot 2^{-k}$ for some $m \in \mathbb{Z}$ and $k \in \mathbb{N}$. The set of these numbers is denoted by $\mathbb{D}$. The reason for their use is that they are a good model for machine numbers, as they are precisely those rational numbers which have a finite binary expansion. Encode a dyadic number as its unique finite binary expansion starting in a code of an integer followed by a separator symbol that is used to mark the position of the decimal point. To avoid confusion with unary and binary notations, we do not specify a notation for the natural numbers. Instead of working on $\mathbb{N}$ directly, we use the string $1^n$ to stand for the natural number $n$. This means that we implicitly use the unary encoding of natural numbers while we use the binary encoding for the integers.

To also be able to accept or return pairs of integers or dyadic numbers we use a pairing function for strings. For technical reasons that become apparent in Section 2.1, we choose to use a very specific pairing function. For two strings $a$ and $b$ let the pairing $\langle a, b \rangle$ be the string which is constructed as follows: Let $c$ be the string that starts in the digit 1, repeated as many times as the shorter of the two strings is long, followed by a 0, then a bit indicating which of $a$ and $b$ is longer, and finally enough 0's to make it as long as the longer of the two strings. Then pad the strings $a$ and $b$ to the length of the longer of the two strings by adding zeros to the end. $\langle a, b \rangle$ is the string whose bits alternate between the digits of $a$, $b$ and $c$. It is important for this paper that the initial segments of $a$ and $b$ can be read from an initial segment of $\langle a, b \rangle$.
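For illustration, the following Python sketch realises one such pairing with the stated prefix property; the exact bookkeeping of the control string is our own choice and may differ from the paper's in detail.

    # Sketch: a pairing of bit-strings by interleaving a, b and a control
    # string c; initial segments of a and b can be read off an initial
    # segment of the pair because the digits alternate.
    def pair(a, b):
        n, m = max(len(a), len(b)), min(len(a), len(b))
        L = n + 2                           # room for the 0 and the length bit
        c = "1" * m + "0" + ("1" if len(a) >= len(b) else "0")
        c += "0" * (L - len(c))
        a_p, b_p = a + "0" * (L - len(a)), b + "0" * (L - len(b))
        return "".join(x + y + z for x, y, z in zip(a_p, b_p, c))

    def unpair(s):
        a_p, b_p, c = s[0::3], s[1::3], s[2::3]
        m = c.index("0")                    # length of the shorter string
        n = len(c) - 2                      # length of the longer string
        la, lb = (n, m) if c[m + 1] == "1" else (m, n)
        return a_p[:la], b_p[:lb]

    assert unpair(pair("101", "1")) == ("101", "1")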

In the following we use these encodings to identify Baire space with the space of functions between the encoded structures. For instance, the statement 'the function $n \mapsto x_n$ is a name of an element' is used as an abbreviation of the statement 'any string function that, on any encoding of $n$, returns an encoding of $x_n$ is a name of the element'.

Definition

Define the Cauchy representation $\delta_{\mathbb{R}}$ of $\mathbb{R}$ as follows: A function $\varphi$ is a name of a real number $x$ if and only if $\varphi(1^n)$ encodes a dyadic number with $|x - \varphi(1^n)| \le 2^{-n}$ for all $n \in \mathbb{N}$.

This adopts the widespread convention used in real complexity theory to provide accuracy requirements as natural numbers in unary. It would have equivalently been possible to provide a natural number $n$ in binary and require the return value to be a $1/n$-approximation, or to provide a dyadic rational $\varepsilon > 0$ and require the return value to be an $\varepsilon$-approximation. We refer to the space $(\mathbb{R}, \delta_{\mathbb{R}})$ as the represented space of Cauchy reals. The representation is used throughout the literature with great confidence that it induces the right notion of complexity for real numbers and there are many results supporting this: The functions that have a polytime computable realiser are exactly those that are polytime computable in the sense of Ko, as proved by Lambov [Lam06]. It is well known that Ko's notion can be reproduced in Weihrauch's type two theory of effectivity [Wei00].
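For illustration, here is a sketch of a Cauchy name of the square root of 2 (our simplification: the accuracy requirement arrives in unary and the dyadic answer is returned as a numerator/exponent pair rather than an encoded string).

    # Sketch: a Cauchy name of sqrt(2); on query "1"*n it returns the
    # dyadic number num/2**n with |sqrt(2) - num/2**n| <= 2**-n.
    from math import isqrt

    def sqrt2_name(query):
        n = len(query)                  # "1"*n encodes the precision 2**-n
        num = isqrt(2 * 4**n)           # floor(sqrt(2) * 2**n)
        return (num, n)                 # the dyadic number num / 2**n

    num, n = sqrt2_name("1" * 10)
    assert abs(num / 2**n - 2**0.5) <= 2**-10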

While the Cauchy representation is in principle straightforward to realise on a physical computer, a naive implementation can suffer exponential overhead in many calculations if aliased branches of a computation tree need to be evaluated twice to the same accuracy. Any serious implementation has to rely on caching and memoisation techniques to circumvent this problem, which not only introduce a great amount of conceptual complexity, but also tend to lead to high memory consumption. Therefore, algorithms in rigorous numerical analysis and exact real computation are usually based on interval methods, which are conceptually simpler and empirically much more memory efficient.

This suggests a different choice of representation: Let $\mathbb{ID}$ denote the set of finite dyadic intervals together with the infinite interval $(-\infty, \infty)$. We use the abbreviation

$[a \pm \varepsilon] := [a - \varepsilon, a + \varepsilon].$

Any finite dyadic interval can be written in this form and we encode such an interval as the pair of a code of $a$ and a code of $\varepsilon$ as dyadic numbers. Denote the diameter of a dyadic interval $I$ by $\operatorname{diam}(I)$.

Definition

A function $\varphi \colon \mathbb{N} \to \mathbb{ID}$ is a $\delta_{\mathbb{IR}}$-name of $x \in \mathbb{R}$ if $(\varphi(n))_{n \in \mathbb{N}}$ is a nested sequence of intervals with $\bigcap_{n \in \mathbb{N}} \varphi(n) = \{x\}$.

In certain implementations, notably in iRRAM [Mül17], the monotone convergence assumption of Definition is relaxed to convergence in the Hausdorff metric. A more thorough discussion of this can be found in Appendix A, where a proof is given that this choice makes no difference up to polytime equivalence. From a theoretical point of view it is much more convenient to work with monotone sequences of intervals.
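Continuing the example, an interval name of the square root of 2 can be sketched as follows (intervals as pairs of fractions; the floor/ceiling brackets at precision 2^-n are automatically nested because sqrt(2) is irrational).

    # Sketch: an interval name of sqrt(2), i.e. a nested sequence of
    # dyadic intervals with diameters 2**-n shrinking to zero.
    from fractions import Fraction
    from math import isqrt

    def sqrt2_interval_name(n):
        lo = Fraction(isqrt(2 * 4**n), 2**n)    # floor(sqrt(2)*2**n)/2**n
        return (lo, lo + Fraction(1, 2**n))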

The use of the interval representation is avoided in real complexity theory since it does not seem to lead to a good notion of complexity: Every real number has names that keep the sequence of intervals constant for an arbitrarily long time before decreasing the size of the next interval, and these names are of slowly increasing size. As a consequence, the represented space $(\mathbb{R}, \delta_{\mathbb{IR}})$ has very pathological complexity theoretic properties: On the one hand, a function operating on names of this kind may need to read a very long initial segment before having any information about the encoded object available, while not being granted any time due to the small size of the input. As a consequence there are usually very few polytime computable functions whose domain is the space of real numbers endowed with the interval representation. On the other hand, a function that has to produce an interval name of a real number may delay the time until it returns information about the function value indefinitely. Consequently, all computable functions with values in the real numbers with the interval representation are computable in linear time [Sch04, KP14a, Ste16].

The goal of the next section is to give a definition of computational complexity for spaces like the space of real numbers equipped with the interval representation which avoids such pathological behaviour.

2 Parametrised spaces

Definition

Let $\delta$ be a representation. A parameter for $\delta$ is a single-valued total map $\mu$ from $\operatorname{dom}(\delta)$ to $\mathbb{N}^{\mathbb{N}}$ such that $\mu(\varphi)$ is monotone for every $\varphi \in \operatorname{dom}(\delta)$.

The pair $(\delta, \mu)$ is called a parametrised representation of $X$. The triple $(X, \delta, \mu)$ is called a preparametrised space.

The monotonicity assumption guarantees that the second-order polynomials behave as expected, i.e., it makes it possible to use the monotonicity of second-order polynomials from Lemma .

We do not make any assumptions about the computability or even continuity of the parameter here. This is for a few reasons: The first is that no assumptions of this kind are needed for the content of this paper. Results like the minimality result from Theorem provide a construction that works without this assumption. Being more restrictive in the definitions would make these results less general. Another reason is that we do encounter discontinuous parameters in situations we consider to be of practical relevance. An example is discussed in more detail in Appendix A. While in this example an isomorphic space with continuous parameter can be found, it is easy to construct examples where this is not the case. Allowing discontinuous parameters hence allows us to investigate spaces that could otherwise not be equipped with a meaningful complexity notion.

A familiar class of parameters are restrictions of the size function defined in (s):

$\varphi \mapsto |\varphi|.$

Since the size function is total and all its values are monotone, it can be used as a parameter for any representation. Its restriction to the domain of a representation is called the standard parameter for the representation. In principle, any represented space can be made into a preparametrised space by equipping it with the standard parameter of its representation.

Definition

Let $(X, \delta_X, \mu_X)$ and $(Y, \delta_Y, \mu_Y)$ be preparametrised spaces. A function $f \colon X \to Y$ is called computable in polynomial time if there is a machine $M$ that computes a realiser of $f$ and two second-order polynomials $P$ and $Q$ such that the following conditions are satisfied:

  • $P$ bounds the running time of $M$ in terms of the parameter, i.e., for all oracles $\varphi \in \operatorname{dom}(\delta_X)$ and strings $a$ we have $\operatorname{time}_M(\varphi, a) \le P(\mu_X(\varphi), |a|)$.

  • $Q$ bounds the parameter blowup of $M$, that is, for all $\varphi \in \operatorname{dom}(\delta_X)$ and all $n \in \mathbb{N}$ we have $\mu_Y(M^\varphi)(n) \le Q(\mu_X(\varphi), n)$.

We say that $M$ has polynomial running time and polynomially bounded parameter blowup with respect to the parameters.

We often conflate $P$ and $Q$ into a single polynomial which bounds both the running time and the parameter blowup. Let us call a realiser computed by a machine as in Definition a witness for the polytime computability of $f$. In the case where both spaces come with the standard parameter, a realiser is a witness for the polytime computability of the function if and only if it is polytime computable in the usual sense: The first condition coincides with the usual running time restriction and the second condition is automatic, as writing the output counts towards the total time consumption of the machine.

As we make no assumption on the relation between the parameter of a preparametrised space and the size function, Definition 2 does not guarantee that the identity on a preparametrised space is polytime computable. If it is, the identity on Baire space need not be a witness for the polytime computability of the identity on the space. We hence need the following additional definition:

Definition

A preparametrised space $(X, \delta, \mu)$ is called a parametrised space if the identity function on it is polytime computable.

In the case where the parameter is the standard parameter polytime computability of the identity is automatic as the identity on Baire space is a witness for its polytime computability. In general the parameter might not provide enough time to read all of an oracle answer and proving polytime computability of the identity function usually boils down to proving that limited information can be read from a beginning segment of the result of an oracle query. An example of this is discussed in Proposition .

While Definition might look innocent, its implications should not be underestimated. It implicitly connects the parameter of the space to the size function: While for an arbitrary name there need not be any relation, the time constraint imposed on a witness of polytime computability of the identity function forces that the size of the name it returns is bounded by a second-order polynomial in the parameter of the input name. The application of such a witness hence constitutes a normalisation procedure which reduces the size of excessively large names. This connection is in particular important as it guarantees the stability under small changes in the model of computation as discussed in Section 1.1. One could alternatively require from the beginning that the parameter be point-wise bigger than the size function, or that the identity on Baire space be a witness of polytime computability of the identity. While these alternatives are slightly more restrictive than our chosen definition, all three choices are essentially equivalent.

Why we chose the above definition over these alternatives is a subtle point. An obvious drawback of our choice is that the stability under changes to the model of computation is only true once a normalisation procedure has been applied. The proof that a preparametrised space is a parametrised space usually relies on the details of the computational model and the details of the encodings of the discrete structures and pairs. Once this fact has been established, a space can be specified that is stable under changes of the model and isomorphic with respect to the present model. Despite this somewhat peculiar property, our chosen approach has the advantage that it allows for the most natural definition of both the representations and the parameters that we are interested in. The other definitions usually force that either the size function shows up in the definition of the parameter or the normalisation procedure is hard-coded into the definition of the representation. The former can sometimes lead to a waste of resources, as it allows for wasteful encodings, while the latter usually involves a non-canonical choice. The proof of Proposition is an instructive illustration of this.

Theorem (Composition)

Let $X$, $Y$, and $Z$ be parametrised spaces. If $f \colon X \to Y$ and $g \colon Y \to Z$ are computable in polynomial time, then their composition $g \circ f$ is also computable in polynomial time.

Proof

Let $\mu_X$, $\mu_Y$ and $\mu_Z$ be the parameters of $X$, $Y$ and $Z$. Let $M$ be a machine that computes a realiser of $f$ in time $P$ and with parameter blow-up bounded by $P'$. Let $N$ be a machine that computes a realiser of $g$ in time $Q$ and with blow-up bounded by $Q'$. A machine $K$ for computing $g \circ f$ can be obtained by replacing each oracle call of the machine $N$ with a subroutine that carries out the operations that $M$ would perform. To estimate the time this machine takes to run on input $a$ with oracle $\varphi$, first note that the steps the machine takes can be divided into the steps it takes when executing the commands from $N$ and the ones it takes when executing commands from $M$.

The number of steps that are taken while executing the commands from $N$ is the same as the number of steps that $N$ would take on input $a$ with oracle $M^\varphi$ and therefore bounded by $Q(\mu_Y(M^\varphi), |a|)$. By the second condition of Definition we have $\mu_Y(M^\varphi) \le P'(\mu_X(\varphi), \cdot)$. Therefore, by the monotonicity of second-order polynomials from Lemma we have

$Q(\mu_Y(M^\varphi), |a|) \le Q(P'(\mu_X(\varphi), \cdot), |a|).$

The number of steps $M$ takes with each execution of the subroutine is bounded by $P(\mu_X(\varphi), |q|)$, where $q$ is the content of the tape that replaces the oracle query tape of $N$. Due to the limited time available to $N$ to write this query, we have $|q| \le Q(P'(\mu_X(\varphi), \cdot), |a|)$. Thus,

$P(\mu_X(\varphi), |q|) \le P(\mu_X(\varphi), Q(P'(\mu_X(\varphi), \cdot), |a|)).$

The number of times the oracle is called in the computation of $N$ with oracle $M^\varphi$ and on input $a$ is also bounded by the number of steps the machine may take. Thus, a bound on the total number of steps that $K$ takes on input $a$ with oracle $\varphi$ can be obtained by multiplying the two time bounds above. This can be seen to be a second-order polynomial in $\mu_X(\varphi)$ and $|a|$ using the closure properties of second-order polynomials from Proposition .

Finally, to obtain the bound on the output parameter, note that

$\mu_Z(K^\varphi) = \mu_Z(N^{M^\varphi}) \le Q'(\mu_Y(M^\varphi), \cdot) \le Q'(P'(\mu_X(\varphi), \cdot), \cdot).$

This completes the proof that $K$ runs in polynomial time. Since $K^\varphi = N^{M^\varphi}$ is a realiser of $g \circ f$ it follows that $g \circ f$ is polytime computable.

Theorem shows that parametrised spaces form a category with polytime computable mappings as morphisms. It includes the closure of second-order polytime computable operators under composition as a special case. The proof of Theorem is considerably more uniform than its statement: a polytime algorithm for computing $g \circ f$ is obtained by composing any two polytime algorithms for $f$ and $g$ in the natural way.
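In a toy functional model that abstracts away the machines and the time accounting, the construction in the proof is simply composition of realisers (a sketch):

    # Sketch: a realiser maps a name to a name; the realiser of g o f
    # plugs the output name of F in as the oracle of G, which is what
    # inlining the machine for f at every oracle call of the machine
    # for g achieves.
    def compose_realisers(G, F):
        return lambda phi: G(F(phi))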

The rest of this section introduces some basic notions and constructions that are needed for reasoning about parametrised spaces throughout the paper. We use straightforward adaptations from the theory of represented spaces. The correctness of our choices is supported by category theory in the sense that they are the ‘usual’ ones in the category of parametrised spaces with polytime computable functions as morphisms.

Real complexity theory has a history of non-uniformity and as a result the point-wise complexity structure is often known in more detail than the uniform structure. This makes it desirable to be able to reason about points of parametrised spaces. We arrive at the following notion:

Definition

An element of a preparametrised space is called computable in polynomial time if it has a polytime computable name whose parameter is bounded by a polynomial.

Another way of thinking about a polytime computable point in a parametrised space is as a polytime computable map from the one-point space. Here, the one-point space is equipped with the unique total representation and the constant zero parameter and is the terminal object of the category of parametrised spaces. A polytime computable point is therefore what is referred to as ‘global element’ in category theory. We obtain the following corollary:

Corollary

Polytime computable functions between parametrised spaces take polytime computable points to polytime computable points.

The usual construction of the product of two represented spaces can be extended to define a product of parametrised spaces. Define the pairing function $\langle \cdot, \cdot \rangle \colon \mathcal{B} \times \mathcal{B} \to \mathcal{B}$ by

$\langle \varphi, \psi \rangle(a) := \langle \varphi(a), \psi(a) \rangle.$

Definition

Let $(X, \delta_X, \mu_X)$ and $(Y, \delta_Y, \mu_Y)$ be parametrised spaces. Equip the product $X \times Y$ with the representation

$(\delta_X \times \delta_Y)(\langle \varphi, \psi \rangle) := (\delta_X(\varphi), \delta_Y(\psi)),$

i.e. a name of a pair is a pair of names of the components. Furthermore, equip this space with the parameter defined by

$\mu_{X \times Y}(\langle \varphi, \psi \rangle) := \max\{\mu_X(\varphi), \mu_Y(\psi)\}.$

The triple $(X \times Y, \delta_X \times \delta_Y, \mu_{X \times Y})$ is denoted by $X \times Y$. It is straightforward to see that $X \times Y$ is indeed the product of $X$ and $Y$ in the category of parametrised spaces and polytime computable functions.

In the theory of representations the notion of reduction plays a central role. It generalises easily to parametrised representations:

Definition

Let $X$ be a set and let $(\delta, \mu)$ and $(\delta', \mu')$ be parametrised representations of $X$. We say that $(\delta, \mu)$ is polytime translatable to $(\delta', \mu')$ if the identity $\operatorname{id}_X \colon (X, \delta, \mu) \to (X, \delta', \mu')$ is polytime computable. If $(\delta, \mu)$ is polytime translatable to $(\delta', \mu')$ and $(\delta', \mu')$ is polytime translatable to $(\delta, \mu)$, we say that $(\delta, \mu)$ and $(\delta', \mu')$ are polytime equivalent.

We chose the word “translatable” over the more common term “reducible” as this leads to less confusion about the direction of the translations. If a parametrised representation $(\delta, \mu)$ is polytime translatable to a parametrised representation $(\delta', \mu')$, we also say that $(\delta', \mu')$ contains less information than $(\delta, \mu)$. This is a slight abuse of language as “information content” is more appropriately measured by topological or computable translatability.

2.1 A parametrised space of real numbers

Recall the interval representation of the real numbers from Definition :

A function $\varphi \colon \mathbb{N} \to \mathbb{ID}$ is a $\delta_{\mathbb{IR}}$-name of $x \in \mathbb{R}$ if $(\varphi(n))_{n \in \mathbb{N}}$ is a nested sequence of intervals with $\bigcap_{n \in \mathbb{N}} \varphi(n) = \{x\}$.

Also recall that the represented space $(\mathbb{R}, \delta_{\mathbb{IR}})$ has very pathological complexity properties. This rules out the standard parameter for making this space into a parametrised space. It is possible to endow the space with a different parameter which yields a sensible complexity theory. For a real number $x$, let $\lceil x \rceil$ denote the least integer bigger than or equal to $x$.

Definition

For a $\delta_{\mathbb{IR}}$-name $\varphi$ of $x$, define the parameter $\mu_{\mathbb{IR}}(\varphi)$ as

$\mu_{\mathbb{IR}}(\varphi)(n) := \min\{m \in \mathbb{N} : \operatorname{diam}(\varphi(m)) \le 2^{-n}\} + \lceil \log_2(|x| + 1) \rceil.$

The parametrised space of interval reals is the triple

$\mathbb{IR} := (\mathbb{R}, \delta_{\mathbb{IR}}, \mu_{\mathbb{IR}}).$

The parameter mainly encodes the rate of convergence of a sequence of intervals. Small parameter blowup for a realiser of a function hence means that the rate of convergence of the output sequence is similar to the rate of convergence of the input sequence. It remains to show that this really defines a parametrised space, i.e., that the parameter is well-defined on the domain of the representation and that the identity is polytime computable.

Proposition

The space $\mathbb{IR}$ is a parametrised space.

Proof

That the parameter is well-defined follows directly from the definitions.

A family of witnesses of the polytime computability of the identity on the space can be specified as follows: For a fixed non-constant polynomial $p$, let $M_p$ be the oracle machine that on input $n$ (as usual encoded in unary) and with oracle $\varphi$ queries the oracle for the interval $\varphi(n) = [a \pm \varepsilon]$. If the interval is infinite, it returns the infinite interval. Otherwise it reads approximations to $a$ and $\varepsilon$ to precision $2^{-p(n)}$ from initial segments of their encodings to compute numbers $c$ and $d$ such that $c$ is the largest dyadic number with denominator $2^{p(n)}$ with $c \le a - \varepsilon$ and $d$ is the smallest dyadic number with denominator $2^{p(n)}$ with $d \ge a + \varepsilon$. It then returns the interval $[c, d]$. To see that each of the machines $M_p$ computes a witness of the polytime computability of the identity on $\mathbb{IR}$, first note that it computes the identity: By construction we have $\varphi(n) \subseteq [c, d]$. Also by construction, the resulting sequence of intervals is nested. Thus any of the intervals returned by $M_p$ contains the real number that $\varphi$ encodes. To see that the diameter of the intervals still goes to zero let some $\varepsilon' > 0$ be given. Since $\varphi$ is a name, there exists an $N$ such that for all $n$ bigger than $N$ the diameter of the corresponding $\varphi(n)$ is smaller than $\varepsilon'/2$. The polynomial $p$ is non-constant and therefore $p(n) \to \infty$ holds for $n \to \infty$. Thus we may choose $n$ so big that additionally $2^{-p(n)+1} \le \varepsilon'/2$.

To obtain a polynomial bound on the running time of $M_p$, first note that the time that $M_p$ takes on input of length $n$ can be bounded in terms of $p(n)$ and a bound on the absolute value of the encoded number $x$. Furthermore,

$\operatorname{diam}([c, d]) \le \operatorname{diam}(\varphi(n)) + 2^{-p(n)+1}$

and

$\varphi(n) \subseteq [c, d].$

This, together with the definition of $\mu_{\mathbb{IR}}$, implies

$\mu_{\mathbb{IR}}(M_p^\varphi)(n) \le \max\{\mu_{\mathbb{IR}}(\varphi)(n+1),\ \min\{m : p(m) \ge n + 2\}\} + \lceil \log_2(|x| + 1) \rceil.$

Since the right hand side is bounded by a second-order polynomial in $\mu_{\mathbb{IR}}(\varphi)$ and $n$, this proves that $M_p$ has polynomially bounded parameter blowup.

Choosing a witness of polytime computability of the identity corresponds to restricting the maximal precision that may be present in the $n$-th component of a name. However, the way in which the precision is restricted is mostly arbitrary and it may be beneficial in practice to use different cut-off precisions in different computations.
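The rounding step at the heart of these witnesses can be sketched as follows (our simplification: intervals as endpoint pairs, and the cut-off polynomial p(n) = 2n + 1 is an arbitrary choice).

    # Sketch: round an interval outward to the dyadic grid with
    # denominator 2**p_n, so that the length of the returned encoding is
    # controlled by n (via the cut-off polynomial) and the magnitude of
    # the endpoints.
    from fractions import Fraction
    from math import floor, ceil

    def round_outward(lo, hi, p_n):
        scale = 2**p_n
        c = Fraction(floor(lo * scale), scale)   # largest grid point <= lo
        d = Fraction(ceil(hi * scale), scale)    # smallest grid point >= hi
        return (c, d)

    # A witness of the identity with cut-off polynomial p(n) = 2n + 1:
    normalise = lambda phi: (lambda n: round_outward(*phi(n), 2 * n + 1))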

We often indirectly use the polytime computability of the identity by assuming that the value of the parameter on all names of real numbers is linear. A machine that works correctly on names of linear parameter can be transformed into one that works correctly on all names by precomposing it with one of the realisers from the previous proof, where the corresponding polynomial is chosen linear. In the following we use the big-O notation: For integer functions $f$ and $g$ say that $f = O(g)$ if there exists a constant $C > 0$ such that for all $n$ we have $f(n) \le C \cdot g(n) + C$.

Proposition

The space of Cauchy reals and the space of interval reals are polytime isomorphic as parametrised spaces.

Proof

First construct the translation from the Cauchy reals to the interval reals: Let $M$ return, on oracle $\varphi$ and input $n$ (in unary), the interval $[\varphi(1^n) \pm 2^{-n}]$, intersected with the previously returned intervals to ensure that the sequence is nested. Then $M^\varphi$ is a $\delta_{\mathbb{IR}}$-name of $x := \delta_{\mathbb{R}}(\varphi)$. To produce this result, the machine needs to make $O(n)$ steps for copying $n$ (which is given in unary) and $O(n + |\varphi|(n))$ steps to produce the return value from $n$ and the dyadic approximations returned by $\varphi$. It remains to check that the machine has appropriate parameter blow-up. By definition of the parameter of the interval reals we have

$\mu_{\mathbb{IR}}(M^\varphi)(n) = \min\{m : \operatorname{diam}(M^\varphi(m)) \le 2^{-n}\} + \lceil \log_2(|x| + 1) \rceil.$

By the construction of $M$ we have $\operatorname{diam}(M^\varphi(n+1)) \le 2^{-n}$, and therefore the first summand in the above equation is always bounded by $n + 1$. The absolute value of $x$ on the other hand is bounded by the size of the encoding of the dyadic number returned by $\varphi$. That is

$\lceil \log_2(|x| + 1) \rceil \le k + 1,$

where $k$ is the length of the encoding of $\varphi(1^0)$. This proves that $M$ computes a witness of polytime computability of the translation.

For the other direction first note that, by applying an appropriate witness of the polytime computability of the identity from Proposition , it may be assumed that the size of any $\delta_{\mathbb{IR}}$-name $\varphi$ of some $x$ satisfies $|\varphi|(n) = O(n + \log_2(|x| + 2))$. Define a machine that on such an oracle and on input $1^n$ proceeds as follows: It searches for an $m$ such that $\operatorname{diam}(\varphi(m)) \le 2^{-n}$. This condition can be checked in time $O(n + m)$ since the name is short. The search halts as soon as the value of $m$ is equal to the first summand of the parameter $\mu_{\mathbb{IR}}(\varphi)(n)$. Let the machine return the midpoint of the interval $\varphi(m)$. Since all names are short and the encodings reasonable, obtaining the midpoint takes at most time $O(n + \mu_{\mathbb{IR}}(\varphi)(n))$. By construction the machine does at most $\mu_{\mathbb{IR}}(\varphi)(n)$ loops of a computation that takes $O(n + \mu_{\mathbb{IR}}(\varphi)(n))$ steps to carry out and thus runs in polynomial time. Since the size of the return value is in $O(n + \mu_{\mathbb{IR}}(\varphi)(n))$, the machine has polynomially bounded parameter blowup.
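Both translations can be sketched in the same toy model (ours; the running intersection in the first direction makes the nestedness explicit).

    # Sketch: translations between Cauchy names and interval names,
    # with dyadic numbers modelled as Fractions.
    from fractions import Fraction

    def cauchy_to_interval(phi):
        # phi(n) is a 2**-n-approximation of x; intersect the crude
        # enclosures phi(k) +/- 2**-k to obtain a nested interval name.
        def psi(n):
            los = [phi(k) - Fraction(1, 2**k) for k in range(n + 1)]
            his = [phi(k) + Fraction(1, 2**k) for k in range(n + 1)]
            return (max(los), min(his))
        return psi

    def interval_to_cauchy(psi):
        # Search for the first interval of diameter <= 2**-n and return
        # its midpoint, which is a 2**-n-approximation of x.
        def phi(n):
            m = 0
            while psi(m)[1] - psi(m)[0] > Fraction(1, 2**n):
                m += 1
            lo, hi = psi(m)
            return (lo + hi) / 2
        return phi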

Since polytime computable functions preserve polytime computable points by Corollary we obtain:

Corollary

A real number is polytime computable if and only if it is polytime computable as an element of the parametrised space $\mathbb{IR}$.

A final property of the space $\mathbb{IR}$ worth mentioning, one that distinguishes it from the Cauchy reals and is therefore not preserved under isomorphism, is that any polytime computable function on the interval reals can be computed by a machine whose running time is bounded by a first-order function.

Theorem

Whenever $f \colon \mathbb{IR} \to \mathbb{IR}$ is polytime computable, then $f$ can be computed by an oracle machine $M$ such that there exists a constant $C$ and a second-order polynomial $Q$ satisfying

$\operatorname{time}_M(\varphi, 1^n) \le C \cdot (n + 1)$ and $\mu_{\mathbb{IR}}(M^\varphi)(n) \le Q(\mu_{\mathbb{IR}}(\varphi), n).$

Proof

Since $f$ is polytime computable, there exists a machine $N$ and a second-order polynomial $P$ that bounds both the running time and the parameter blowup. Without loss of generality assume that $P$ is monotone in both arguments. Let the machine $M$ with oracle $\varphi$ and input $1^n$ spend $n$ steps on simulating what $N$ does on oracle $\varphi$ and binary encodings of numbers $m$ as input, for $m$ starting from zero and counting up. Return the return value of the simulated machine on the biggest input where the simulation finished in time. In case none of the computations have terminated, return the infinite interval. Obviously, $M$ takes not more than $O(n)$ steps, thus it is left to specify an appropriate polynomial $Q$. For this, note that for a given $m$, since $P$ bounds the running time of $N$, choosing $n$ bigger than

$(m + 1) \cdot P(\mu_{\mathbb{IR}}(\varphi), m + 1)$

forces all the simulations of $N$ with oracle $\varphi$ and on inputs smaller than $m$ to come to an end. Since the parameter blowup of $N$ is bounded by $P$, the absolute value of the number encoded by $N^\varphi$ is bounded by $2^{P(\mu_{\mathbb{IR}}(\varphi), 0)}$, and choosing $m$ bigger than $P(\mu_{\mathbb{IR}}(\varphi), n)$ forces the diameter of the returned interval to be smaller than $2^{-n}$. This implies that on input of size bigger than

$(P(\mu_{\mathbb{IR}}(\varphi), n) + 2) \cdot P(\mu_{\mathbb{IR}}(\varphi), P(\mu_{\mathbb{IR}}(\varphi), n) + 2)$

the interval that $M$ returns has diameter smaller than $2^{-n}$ and that $Q$ can be picked as

$Q(l, n) := (P(l, n) + 2) \cdot P(l, P(l, n) + 2) + P(l, 0).$

That $Q$ is a second-order polynomial follows from the closure properties of second-order polynomials from Proposition .

Note that the proof follows the construction which is used to show that the computational complexity of the interval representation is ill-behaved with respect to running time bounds in terms of the size function [Sch04, KP14a, Ste16]. However, in contrast to the size of a name, the parameter of a name contains meaningful information and as a consequence delaying the time until a meaningful output is produced leads to an increase of the parameter blow-up. That is, instead of removing computational cost, the construction trades time needed to produce approximations for a worse convergence behaviour. While the extent to which this is done in Theorem may not be appropriate, this construction can be viewed as a means of separating the reasoning about an algorithm into two parts. The first part is the computation of approximations to the function value from approximations to the input. The complexity of these computations can be expressed using first-order bounds. The second part is the convergence analysis, which does require second-order bounds, as it relates the rate of convergence of the input sequence to the rate of convergence of the output sequence. This seems to be more in line with how the complexity of algorithms is studied in the real number model and related models. Most “natural” algorithms for computing functions do indeed have first-order time bounds.

2.2 A parametrised space of continuous functions

Let $[0, 1]$ denote the closed unit interval. Let $\mathbb{I}[0,1]$ be the parametrised space that is obtained by considering the unit interval as a subspace of the parametrised space $\mathbb{IR}$ of interval reals. By this we mean that the representation of $\mathbb{I}[0,1]$ is the range restriction of $\delta_{\mathbb{IR}}$ to $[0, 1]$ and the parameter is the restriction of $\mu_{\mathbb{IR}}$ to the domain of this representation.

Consider the space $\mathcal{C}[0,1]$ of continuous functions from the unit interval to the real numbers. Having made $\mathbb{R}$ and $[0, 1]$ into parametrised spaces which are closely tied to the complexity of interval methods, it is natural to ask whether the function space admits a similar structure.

Definition

Define the interval function representation $\delta_\square$ of $\mathcal{C}[0,1]$ as follows: A function $\varphi \colon \mathbb{ID}_{[0,1]} \to \mathbb{ID}$, where $\mathbb{ID}_{[0,1]}$ denotes the dyadic intervals contained in $[0, 1]$, is a name of $f \in \mathcal{C}[0,1]$ if and only if for every $\delta_{\mathbb{I}[0,1]}$-name $\psi$ of a point $x \in [0, 1]$ the composition $\varphi \circ \psi$ is a $\delta_{\mathbb{IR}}$-name of $f(x)$.

Note that if $\varphi$ is a name of a continuous function $f$, then $\varphi$ is necessarily monotone as an interval function, i.e. $I \subseteq J$ implies that $\varphi(I) \subseteq \varphi(J)$. Unlike the Kawamura-Cook representation of continuous functions, which is recalled in Section 3, the present definition employs a canonical exponential construction to represent the function space. The restriction to compact intervals of reals is mainly necessary in order to ensure the well-definedness of the parameter, which essentially encodes a modulus of uniform continuity of the represented function:

Definition

Let the parameter $\mu_\square$ be defined by

$\mu_\square(\varphi)(n) := \min\{m \in \mathbb{N} : \forall I \in \mathbb{ID}_{[0,1]} (\operatorname{diam}(I) \le 2^{-m} \Rightarrow \operatorname{diam}(\varphi(I)) \le 2^{-n})\} + \lceil \log_2(\|f\|_\infty + 1) \rceil$

whenever $\varphi$ is a $\delta_\square$-name of a function $f$.

While the time taken by a polytime machine operating on a parametrised space may depend on the parameter of the input name, the machine does not have direct access to the parameter and hence can in general not use it in its computation. The best way to compute a modulus of continuity from a name of a function in the interval function representation seems to be a search that may in the worst case take time exponential in the value of the parameter defined above. Indeed, in Section 3.2 we show that the modulus function cannot be computed in polynomial time with respect to the above parameter. This is in contrast to the representation used by Kawamura and Cook, where a name comes with explicit information about the modulus of continuity.
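To make the cost of this search concrete, here is a brute-force sketch (our conventions: a name is queried on endpoint pairs, and only dyadic windows are inspected).

    # Sketch: exhaustive search for a modulus of continuity from an
    # interval-function name phi. Every interval of diameter <= 2**-m
    # lies in some window [k/2**m, (k+2)/2**m], so by monotonicity of
    # names checking all windows certifies the modulus; there are ~2**m
    # windows per candidate m, whence the exponential running time.
    from fractions import Fraction

    def modulus(phi, n):
        m = 0
        while True:
            windows = [phi((Fraction(k, 2**m),
                            Fraction(min(k + 2, 2**m), 2**m)))
                       for k in range(2**m)]
            if all(hi - lo <= Fraction(1, 2**n) for lo, hi in windows):
                return m
            m += 1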

Lemma

The parameter $\mu_\square$ is well-defined on the domain of $\delta_\square$.

Proof

Assume that $\varphi$ is a name of a function $f$. Our goal is to show that for every $n$ the minimum

$\min\{m \in \mathbb{N} : \forall I \in \mathbb{ID}_{[0,1]} (\operatorname{diam}(I) \le 2^{-m} \Rightarrow \operatorname{diam}(\varphi(I)) \le 2^{-n})\}$

exists. For a fixed $x \in [0, 1]$ denote the smallest dyadic number with denominator $2^m$ that is bigger than $x$ by $x_m$. Then $m \mapsto [x_m - 2^{-m}, x_m]$ is a $\delta_{\mathbb{I}[0,1]}$-name of $x$. Since we assumed $\varphi$ to be a name, it follows that $\operatorname{diam}(\varphi([x_m - 2^{-m}, x_m])) \to 0$ as $m \to \infty$. Hence, for each $x$ there exists $m_x$ with $\operatorname{diam}(\varphi([x_{m_x} - 2^{-m_x}, x_{m_x}])) \le 2^{-n}$. The family of interiors of the intervals $[x_{m_x} - 2^{-m_x}, x_{m_x}]$ is an open cover of $[0, 1]$. Since the unit interval is compact, there exists a finite subcover. As a finite family of open intervals, this family has a smallest overlap. Let $m$ be big enough such that $2^{-m}$ is smaller than this overlap. It follows that every interval whose diameter is smaller than $2^{-m}$ is contained in an interval $J$ from this finite family. By the monotonicity of $\varphi$, for each such interval $I$ it follows that $\varphi(I) \subseteq \varphi(J)$ and since the diameter of $\varphi(J)$ is smaller than $2^{-n}$, the same holds for $\varphi(I)$.

Consider the space $(\mathcal{C}[0,1], \delta_\square, \mu_\square)$. We call this the parametrised space of interval functions. Just like the corresponding result for the interval reals, proving that the interval functions form a parametrised space mainly consists of the introduction of a rounding procedure, which includes the non-canonical choice of a polynomial that controls the cut-off precision.

Proposition

The space $(\mathcal{C}[0,1], \delta_\square, \mu_\square)$ is a parametrised space.

Proof

First note that, while copying the input interval to the oracle query tape is possible in polynomial time, it may not be possible to copy the answer to the output tape. This is because a name of a function may return intervals $[a \pm \varepsilon]$ such that $\varepsilon$ is fairly big and $|a|$ is fairly small but still $a$ and $\varepsilon$ have large encodings. The time the machine is granted, however, only increases with the diameter of the interval getting smaller or the midpoint getting bigger. Instead of copying $a$ and $\varepsilon$ directly, rounded versions of these numbers can be read from a beginning segment of the oracle answer. The details of this rounding procedure are given in Proposition : The machines $M_p$ introduced there make a single oracle query at the very beginning of the computation and may therefore instead be considered as machines that do not have an oracle and take an interval as input. Fix such a machine $M$. A witness of polytime computability for the identity is computed by the machine which maps a given name $\varphi$ of a function to the composition $I \mapsto M(\varphi(I))$.

Consider the evaluation operator defined as follows:

$\operatorname{eval} \colon \mathcal{C}[0,1] \times [0, 1] \to \mathbb{R}, \quad (f, x) \mapsto f(x).$

A sanity check for the definition of $\delta_\square$ is that evaluation should be polytime computable. For this statement to make sense, it is necessary to use the product of parametrised spaces given at the end of the introduction of Section 2.

Proposition (Evaluation)

Evaluation, as an operator from $\mathcal{C}[0,1] \times \mathbb{I}[0,1]$ to $\mathbb{IR}$, is polytime computable.

Proof

Consider the machine that, when given an oracle $\langle \varphi, \psi \rangle$ with $\delta_\square(\varphi) = f$ and $\delta_{\mathbb{I}[0,1]}(\psi) = x$ such that $\psi$ has linear size, returns a string function of linear size obtained from the composition $\varphi \circ \psi$ by applying an appropriate realiser of the identity constructed in Proposition . This machine computes a realiser of the evaluation operator by the definition of the representation from Definition . Due to the appropriate truncations, the time the machine takes is bounded polynomially. To see that the machine does not inflate the parameter too much, note that

$\mu_{\mathbb{IR}}(\varphi \circ \psi)(n) \le \mu_{\mathbb{I}[0,1]}(\psi)(\mu_\square(\varphi)(n)) + \mu_\square(\varphi)(n).$

Thus the machine computes a witness of the polytime computability of the evaluation operator.
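In the toy model used above, the realiser of evaluation is literally composition of names (a sketch; encodings and the truncations from the proof are elided).

    # Sketch: a name of (f, x) is a pair (phi, psi); a name of f(x) is
    # the composition n |-> phi(psi(n)).
    from fractions import Fraction

    def evaluate(phi, psi):
        return lambda n: phi(psi(n))

    square = lambda I: (I[0] * I[0], I[1] * I[1])      # f(t) = t^2 on [0, 1]
    third = lambda n: (Fraction(2**n // 3, 2**n),      # dyadic enclosures
                       Fraction(2**n // 3 + 1, 2**n))  # of the point x = 1/3
    fx = evaluate(square, third)                       # a name of 1/9
    assert all(lo <= Fraction(1, 9) <= hi for lo, hi in map(fx, range(8)))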

Recall the notion of polytime translatability from Definition . The parametrised representation $(\delta_\square, \mu_\square)$ is the "correct" representation for $\mathcal{C}[0,1]$, viewed as a function space, as it contains the least amount of information (in the sense of Definition ) among those representations which render evaluation polytime computable.

Theorem (Minimality)

For a parametrised representation $(\delta, \mu)$ of $\mathcal{C}[0,1]$ the following are equivalent:

  1. Evaluation is polytime computable with respect to $(\delta, \mu)$.

  2. The pair $(\delta, \mu)$ is polytime translatable to $(\delta_\square, \mu_\square)$.

Proof

The implication 2. ⇒ 1. directly follows from the closure of the polytime computable operators under composition from Theorem and the polytime computability of the evaluation operator on the interval functions from Proposition .

For the other implication assume that the evaluation operator is polytime computable with respect to $(\delta, \mu)$. Note that due to the equivalence of the Cauchy reals and the interval reals from Proposition , the evaluation operator is then also polytime computable if the real numbers are equipped with the Cauchy representation. Let $M$ be a machine that computes evaluation in this sense with time consumption and parameter blowup bounded by a second-order polynomial $P$.

Define a machine $N$ that computes a translation of $(\delta, \mu)$ into $(\delta_\square, \mu_\square)$ in polytime as follows: Fix some $\delta$-name $\varphi$ of some function $f$ as oracle and an interval $I \subseteq [0, 1]$ as input. Let $k$ be the largest natural number such that $2^{-k} \ge \operatorname{diam}(I)$ (if the input interval has diameter bigger than one, return the infinite interval). Let $c$ be the midpoint of $I$ rounded to precision $2^{-k}$, and let $\psi_c$ be the canonical Cauchy name of $c$. The machine follows the steps that $M$ would take on oracle $\langle \varphi, \psi_c \rangle$ and inputs $1^j$ for $j$ going from $k$ to zero. In each of the runs it saves the maximal query that is posed to the oracle and when the computation on $1^j$ ends, it compares the precision requested in this maximal query to $k$. If it is bigger than $k$ the machine decreases $j$ and starts over, unless $j$ is already zero, in which case it returns the infinite interval. If it is smaller than $k$, and $M$ returns the dyadic number $d$, then let $N$ return the interval $[d \pm 2^{-j+1}]$.

To see that this produces a $\delta_\square$-name of $f$ from the $\delta$-name $\varphi$, let $P$ be the polynomial time-bound of $M$. Let $(I_n)_{n \in \mathbb{N}}$ be a sequence of intervals that converges to a point $x$. Let $(J_n)_{n \in \mathbb{N}}$ be the sequence of intervals which are returned by $N$ on input $I_n$. We may assume without loss of generality that $(I_n)$ is a nested sequence. First note that each of the intervals $J_n$ contains $f(x)$, so that it remains to show that the sequence converges to a point. There exists an integer constant $C$ such that for any input interval the oracle $\langle \varphi, \psi_c \rangle$ used in the simulations has parameter at most $\max\{\mu(\varphi), \operatorname{id} + C\}$. Thus, the number of steps of $M$ in any of the simulations, and in particular the value of the maximal oracle query in each such simulation, is bounded by

$K(k) := P(\max\{\mu(\varphi), \operatorname{id} + C\}, k).$   (k)

Since this value is independent of the input interval, the computation on all intervals of diameter smaller or equal to $2^{-K(k)}$ results in return values of diameter smaller than $2^{-k+1}$. In particular the sequence $(J_n)$ converges to a point.

Finally, the machine runs in polytime: To bound the number of steps $N$ takes by a second-order polynomial in the length of the input and the parameter of the oracle, note that the number of steps taken in each simulation of $M$ is bounded by (k) as above and that at most $k + 1$ of these simulations need to be carried out. Let us now show that $N$ has polynomial parameter blowup. We have

$\mu_\square(N^\varphi)(n) \le K(n + 2) + \lceil \log_2(\|f\|_\infty + 1) \rceil.$

Since $K$ is polynomially bounded in $\mu(\varphi)$ by (k), it remains to provide a bound on the supremum norm of the function $f$. Cover $[0, 1]$ with finitely many intervals of the form $[d \pm 2^{-2}]$ where $d$ is a dyadic rational number with at most two bits. When these are fed into the machine $N$, it will produce approximations to the range of $f$ over these intervals to error at most $2$. Hence a bound on the output size of the machine over these intervals yields a bound on the supremum norm of $f$. By our previous considerations the running time (and hence the output size) of the machine on each interval is bounded by $3 \cdot K(2)$,

so that the supremum norm of $f$ is bounded polynomially in $\mu(\varphi)$.

As is usually the case for minimality results of this kind, the proof generalises to the slightly stronger statement that $(\mathcal{C}[0,1], \delta_\square, \mu_\square)$ is an exponential object in the category of parametrised spaces. More explicitly:

Corollary

For any parametrised space $Z$ and any polytime computable $F \colon Z \times \mathbb{I}[0,1] \to \mathbb{IR}$ there exists a unique polytime computable mapping $\hat{F} \colon Z \to \mathcal{C}[0,1]$ such that for all $z \in Z$ and all $x \in [0, 1]$ we have

$F(z, x) = \hat{F}(z)(x).$