On the Complexity of Completing Binary Predicates

07/10/2019 ∙ by Samuel Epstein, et al. ∙ Apple, Inc.

Given a binary predicate P, the length of the smallest program that computes a complete extension of P is less than the size of the domain of P plus the amount of information that P has with the halting sequence. This result is derived from a theorem in this paper which says a prefix free set with large M measure will have small monotone complexity, Km.

1 Introduction

A central concept of Algorithmic Information Theory is Kolmogorov complexity, a measure on strings equal to the size of the shortest program that produces the string. A binary predicate $P$ is a set of pairs $(n, b)$, where $n \in \mathbb{N}$ and $b \in \{0,1\}$. Binary predicates are used in learning theory to represent samples of a target concept, which a learning algorithm must approximate with a hypothesis. A complete extension of a binary predicate $P$ is another binary predicate, defined over all of $\mathbb{N}$, that is consistent with $P$ wherever $P$ is defined.

In this paper, we prove upper bounds on the size of the smallest program that computes a complete extension of a given binary predicate $P$. We prove that for non-exotic predicates, this size is not more than the number of elements of $P$. Exotic predicates have high mutual information with the halting sequence, and thus no algorithm can generate such predicates. To prove this, we first show new properties of the universal lower-semicomputable continuous semi-measure, $\mathbf{M}$. In particular, for a non-exotic prefix free set of strings $G$, the monotone complexity of $G$, $\mathbf{Km}(G)$, is less than the negative logarithm of $\mathbf{M}(G)$. See Section 3 for a formal definition of $\mathbf{Km}$ and $\mathbf{M}$.

2 Related Work

For information relating to the history of Algorithmic Information Theory and Kolmogorov complexity, we refer the readers to the textbooks [LV08] and [DH10]. A survey about the shared information between strings and the halting sequence is in the work [VV04]. Work on the deficiency of randomness can be found in [She83, KU87, V’Y87, She99]. Stochasticity of objects can be found in the works [She83, She99, V’Y87, V’Y99]. More information on stochasticity and algorithmic statistics is in the works [GTV01, VS17, VS15]. In Section 5, lemmas and theorems from [EL11] and [Eps13] are described, for invocation in the proof of the main theorem of this paper.

3 Conventions and Context

We use $\mathbb{Q}$, $\mathbb{N}$, $\mathbb{W}$, $\mathbb{R}$, $\{0,1\}$, $\{0,1\}^*$, and $\{0,1\}^\infty$ to denote rationals, natural numbers, whole numbers, reals, bits, finite strings, and infinite strings. The notation $X_{>0}$ and $X_{\geq 0}$ is used to denote the positive and nonnegative members of $X$. If mathematical statement $A$ is true, then $[A] = 1$, otherwise $[A] = 0$. Natural numbers and other elementary objects will be used interchangeably with finite strings. The empty string is denoted by $\emptyset$. For a string $x$, $x^-$ is equal to $x$ with the last bit removed. For (finite or infinite) strings $x$, $y$, we say $x \sqsubseteq y$ iff $x = y$ or $x$ is a prefix of $y$. We say $x \sqsubset y$ if $x \sqsubseteq y$ and $x \neq y$. The bit length of a string $x$ is $\|x\|$. The $i$th bit of $x$ is represented with $x[i]$. The first $n$ bits of $x$ are represented by $x[0..n]$.

We use , to represent a self delimiting code for , such as . The self delimiting code for a finite set of strings is . For , . The number of elements of a set is denoted to be .
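As an illustration of what a self-delimiting code provides, the following Python sketch assumes one common scheme: the length of the string written in unary, then a 0 separator, then the string itself. The scheme and the names `encode` and `decode` are assumptions made for this example, not a code confirmed by the text above. The point is that concatenated codewords can be split apart again without any extra separator.

```python
from typing import Tuple


def encode(x: str) -> str:
    """Assumed scheme: len(x) ones, a single zero, then x itself."""
    return "1" * len(x) + "0" + x


def decode(stream: str) -> Tuple[str, str]:
    """Read one codeword from the front of stream.

    Returns the decoded string and the unread remainder.
    """
    n = stream.index("0")          # number of leading ones = payload length
    start, end = n + 1, n + 1 + n
    if end > len(stream):
        raise ValueError("truncated codeword")
    return stream[start:end], stream[end:]


# Two codewords written back to back can be separated again.
first, rest = decode(encode("101") + encode("0"))
second, rest = decode(rest)
```

The cost of this particular scheme is $2\|x\| + 1$ bits per string; more economical self-delimiting codes exist, but this one keeps the decoding logic to a few lines.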

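A measure over natural numbers is a nonnegative function $Q : \mathbb{N} \to \mathbb{R}_{\geq 0}$. The support of a measure $Q$ is denoted by $\mathrm{Supp}(Q)$, and it is equal to $\{a : Q(a) > 0\}$. An elementary measure is a discrete measure with finite support and a range of $\mathbb{Q}_{\geq 0}$. Elementary measures are elementary objects and can be encoded by finite strings. We say a measure $Q$ is a semimeasure iff $\sum_a Q(a) \leq 1$. We say $Q$ is a probability measure iff $\sum_a Q(a) = 1$. For a set of natural numbers $D$, its measure with respect to $Q$ is equal to $Q(D) = \sum_{a \in D} Q(a)$. For semimeasure $Q$, the function $t : \mathbb{N} \to \mathbb{R}_{\geq 0}$ is a $Q$-test if $\sum_a Q(a) t(a) \leq 1$.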
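The definitions in this paragraph can be checked directly for elementary measures, which have finite support and rational values. The sketch below is a minimal illustration under the standard reading of those definitions (total mass at most 1 for a semimeasure, exactly 1 for a probability measure, and expected value at most 1 for a test). The dictionary representation and all function names are assumptions of this example.

```python
from fractions import Fraction
from typing import Callable, Dict, Iterable, Set

Measure = Dict[int, Fraction]  # finite-support measure: number -> weight


def support(q: Measure) -> Set[int]:
    """Points that carry strictly positive weight."""
    return {a for a, w in q.items() if w > 0}


def measure_of_set(q: Measure, d: Iterable[int]) -> Fraction:
    """Sum of the weights of the members of d."""
    return sum((q.get(a, Fraction(0)) for a in set(d)), Fraction(0))


def is_semimeasure(q: Measure) -> bool:
    return all(w >= 0 for w in q.values()) and sum(q.values(), Fraction(0)) <= 1


def is_probability(q: Measure) -> bool:
    return all(w >= 0 for w in q.values()) and sum(q.values(), Fraction(0)) == 1


def is_test(q: Measure, t: Callable[[int], Fraction]) -> bool:
    """t is a test for q when its q-weighted sum is at most 1."""
    return sum((w * t(a) for a, w in q.items()), Fraction(0)) <= 1


q = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(0)}
```

Exact rationals are used instead of floats so that the boundary cases (total mass exactly 1) are decided without rounding error.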

For positive real functions , we denote , , with the notation , , . Furthermore, we denote , , , by , , , respectively.

An algorithm is prefix-free if for all auxiliary inputs , there are no two strings , such that halts and halts. There is a universal prefix free algorithm , such that for all algorithms , there is a string , where for all and , . We define Kolmogorov complexity with respect to this universal machine, where .
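The prefix-free condition says that no halting program is a proper prefix of another halting program. For a finite set of strings this can be checked mechanically. The sketch below is a toy: the finite set stands in for the halting programs of an algorithm under one fixed auxiliary input (the true halting set of a universal machine is not computable), and the function name is an assumption of this example.

```python
from typing import Iterable


def is_prefix_free(programs: Iterable[str]) -> bool:
    """True iff no string in the set is a proper prefix of another.

    After lexicographic sorting, any string that is a prefix of some
    other member is also a prefix of its immediate successor, so it is
    enough to compare adjacent pairs.
    """
    ordered = sorted(set(programs))
    return not any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))
```

For example, `{"0", "10", "110"}` passes the check, while `{"0", "01"}` fails because `"0"` is a prefix of `"01"`.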

Let be an enumeration of partial computable functions . For a partial computable function , let be the indices of in . Then the complexity of is defined to be . We say that a function is lower computable if there exists an enumeration for the set . Let be an enumeration of all enumerations that output elements of . For lower computable function , let be the indices of the enumerations of in the list . Then the complexity of is .

A binary predicate is defined to be a function of the form $P : D \to \{0,1\}$, where $D \subseteq \mathbb{N}$. We say that binary predicate $P'$ is an extension of $P$ if $\mathrm{Dom}(P) \subseteq \mathrm{Dom}(P')$ and, for all $i \in \mathrm{Dom}(P)$, $P'(i) = P(i)$. If binary predicate $P'$ has a domain of $\mathbb{N}$ and is an extension of binary predicate $P$, then we say it is a complete extension of $P$. The self-delimiting code for a binary predicate with a finite domain is . The Kolmogorov complexity of a binary predicate $P$ with an infinite sized domain is $\mathbf{K}(P) = \mathbf{K}(f)$, where $f$ is a partial computable function where $f(i) = P(i)$ if $i \in \mathrm{Dom}(P)$ and is undefined otherwise. If there is no such partial computable function, then $\mathbf{K}(P) = \infty$.
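To make the extension relation concrete, the following Python sketch represents a binary predicate with a finite domain as a dictionary from natural numbers to bits, and a candidate complete extension as a function defined on every natural number. The representation and the names `is_extension` and `is_complete_extension` are assumptions of this example.

```python
from typing import Callable, Dict

Predicate = Dict[int, int]  # finite-domain binary predicate: index -> bit


def is_extension(candidate: Predicate, base: Predicate) -> bool:
    """candidate extends base: its domain contains base's domain and
    the two agree wherever base is defined."""
    return all(i in candidate and candidate[i] == b for i, b in base.items())


def is_complete_extension(total: Callable[[int], int], base: Predicate) -> bool:
    """total is assumed to be defined on every natural number; it is a
    complete extension of base when it agrees with base on base's domain."""
    return all(total(i) == b for i, b in base.items())


base = {1: 0, 4: 1}
```

Here `{1: 0, 4: 1, 7: 1}` extends `base`, while `{1: 1, 4: 1}` does not, because it disagrees with `base` at index 1.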

The halting sequence is the characteristic sequence of the domain of , where . We use to denote the amount of information that has about string . For strings and , the chain rule states that . The universal probability of a set is . The universal probability of a string is . By the coding theorem, we have that .

In addition to the standard definition of Kolmogorov complexity, we introduce a monotonic variant. The monotone complexity of a finite prefix-free set of finite strings is . This is larger than the usual definition of monotone complexity, see for example [LV08]. This is due to the requirement of halting and being a standard universal program (instead of a monotone operator). However since the results in this paper are an upper bound on , they apply to smaller definitions of monotonic complexity. For , we use shorthand to mean .

A continuous semi-measure is a function , such that and for all , . The function is the uniform measure, with . For continuous semi-measure , prefix free set , . For an open set of the Cantor space, . Let be a largest, up to a multiplicative factor, lower semi-computable continuous semi-measure. Note that may differ from . is used to denote . The notation is used to denote .
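As a small numerical illustration, the sketch below assumes the standard reading of this paragraph: the uniform measure assigns $2^{-\|x\|}$ to a string $x$, the measure of a prefix-free set is the sum over its members, and a continuous semi-measure never assigns a string less than the combined value of its two one-bit extensions. Only that splitting condition is checked here, and only up to a finite depth. All function names are assumptions of this example.

```python
from fractions import Fraction
from itertools import product
from typing import Callable, Iterable


def uniform(x: str) -> Fraction:
    """Uniform measure of the set of sequences that start with x."""
    return Fraction(1, 2 ** len(x))


def measure_of_set(sigma: Callable[[str], Fraction], g: Iterable[str]) -> Fraction:
    """Measure of a prefix-free set: the sum over its members."""
    return sum((sigma(x) for x in set(g)), Fraction(0))


def satisfies_split_condition(sigma: Callable[[str], Fraction], depth: int) -> bool:
    """Check sigma(x) >= sigma(x0) + sigma(x1) for all x shorter than depth."""
    for n in range(depth):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            if sigma(x) < sigma(x + "0") + sigma(x + "1"):
                return False
    return True
```

For the prefix-free set `{"0", "10", "110"}` the uniform measure is 1/2 + 1/4 + 1/8 = 7/8, which is consistent with the Kraft inequality bound of 1.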

4 Left-Total Machines

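A string $x$ is total with respect to algorithm $T$ iff $T$ will halt on all expansions of $x$ that are long enough. Another way to define the concept is that a string $x$ is total with respect to $T$ iff there exists a finite set of strings $Z$ such that $\sum_{z \in Z} 2^{-\|z\|} = 1$ and $T$ halts on each element in the set $\{xz : z \in Z\}$. For sequences $\alpha$, $\beta$, $\alpha$ is to the left of $\beta$, denoted by $\alpha \lhd \beta$, if there is a string $x$ such that $x0 \sqsubseteq \alpha$ and $x1 \sqsubseteq \beta$. We say that a machine $T$ is left total if for auxiliary inputs $\alpha$ and all $x, y \in \{0,1\}^*$, if $T_\alpha(y)$ halts, and $x \lhd y$, then $x$ is total for $T_\alpha$.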
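The two notions in this paragraph can be illustrated on finite strings. The sketch below is a toy under two assumptions: "to the left of" is read as "the two strings first differ at a position where the left one has 0 and the right one has 1", and the halting behaviour of a prefix-free machine is given as an explicit finite set of halting inputs (for a real machine this set is not computable). The names `is_left_of` and `is_total` are hypothetical.

```python
from typing import Set


def is_left_of(a: str, b: str) -> bool:
    """True iff some string x satisfies: x0 is a prefix of a and x1 is a prefix of b."""
    for u, v in zip(a, b):
        if u != v:
            return u == "0" and v == "1"
    return False  # one string is a prefix of the other, so neither is to the left


def is_total(x: str, halting: Set[str], max_depth: int) -> bool:
    """True iff the machine halts on a complete finite family of extensions of x.

    x is total if the machine halts on x itself, or if both one-bit
    extensions of x are total. max_depth bounds the search.
    """
    if x in halting:
        return True
    if max_depth == 0:
        return False
    return (is_total(x + "0", halting, max_depth - 1)
            and is_total(x + "1", halting, max_depth - 1))


halting = {"00", "1"}  # toy halting set; the machine diverges on every extension of "01"
```

With this toy halting set, `"1"` is total, while `"0"` and the empty string are not, because no extension of `"01"` ever halts.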

For the remaining sections of this paper, we assume that the universal Turing machine is left-total. We refer readers to [Eps13], Section 5, for an explanation on how to construct a left-total universal machine. The complexity terms, including , , etc, are defined without loss of generality with respect to a left-total universal Turing machine. Let be the border sequence, defined as the unique sequence where if is a prefix of , , then has total and non-total expansions. If for , , then is total. If , then will diverge on all expansions of . This is why was given the terminology “border”. For total string , let , the length of the longest output of a string from a program to the left of or that extends . is 0 if is not total.

5 Stochasticity

We use notions from algorithmic statistics, most notably the deficiency of randomness of a string with respect to probability measure and string , denoted by . By definition, the function is a -test. In addition, for any elementary probability measure , for any lower computable -test , and for any string , over all , we have that . For more information about , we refer the readers to [Gác13]. The stochasticity of string , conditional to is denoted

A total computable function cannot increase the stochasticity of a sequence by more than a constant factor of its complexity. This notion is captured in Proposition 5 of [VS17]. Another expression of this idea can be found in the following lemma.

Lemma 1

Given total recursive function , .

Proof.

Let and be the program and elementary probability measure that realize , where and . The image probability measure of with respect to is denoted by , where . The function is a -test, because

With access to , the function is lower computable and it has complexity (conditioned on ) . Since is a universal lower computable -test, we have the inequality . Let be a program for , that contains and a shortest program for . Thus and . Because , we have . This gives us

So we have that

Lemma 2 is taken from [Eps13]. It is easy to see that if is total and is not, then . This is due to the fact that has both total and non-total extensions. The following lemma states that if a prefix of border is simple relative to a string (and its own length), then it will be the common information shared between and the halting sequence .

Lemma 2

If string is total and is not, then for all strings , .

Lemma 3, from [EL11], states that the mutual information of a string with the halting sequence is an upper bound for the string’s stochasticity value. Another proof to Lemma 3 can be seen in [Eps13].

Lemma 3

For , .

Theorem 1, also from [EL11], states that sets with low mutual information with the halting sequence will contain members that have a big fraction of the probability of the sets.

Theorem 1

For finite set ,
.

6 String-Monotonic Machines

In this section, we relate string-monotonic programs with continuous semi-measures. Informally speaking, a string-monotonic program is a Turing machine with an input tape, a work tape, and an output tape, where the tape heads of the input tape and the output tape can only move in one direction. A total computable function $\nu : \{0,1\}^* \to \{0,1\}^*$ is string-monotonic iff for all strings $x$ and $y$, $\nu(x) \sqsubseteq \nu(xy)$. Let $\overline{\nu}$ be used to represent the unique extension of $\nu$ to infinite sequences. Its definition for all $\alpha \in \{0,1\}^\infty$ is $\overline{\nu}(\alpha) = \sup \{\nu(x) : x \sqsubset \alpha\}$, where the supremum is with respect to the partial order derived with the $\sqsubseteq$ relation. The following theorem relates prefix monotone machines and continuous semi-measures. It is similar to Theorem 4.5.2 in [LV08], with the additional property that the string-monotonic machine be total computable.
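The string-monotonic property says that reading more input can only append to the output already produced. The sketch below tests this property for a Python function over all inputs up to a given length; by transitivity it is enough to compare each string with its two one-bit extensions. This is a finite sanity check rather than a proof of monotonicity, and the names `is_string_monotonic` and `double_bits` are assumptions of this example.

```python
from itertools import product
from typing import Callable


def is_string_monotonic(f: Callable[[str], str], max_len: int) -> bool:
    """Check that f(x) is a prefix of f(x0) and of f(x1) for every x
    with fewer than max_len bits."""
    for n in range(max_len):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            fx = f(x)
            if not (f(x + "0").startswith(fx) and f(x + "1").startswith(fx)):
                return False
    return True


def double_bits(x: str) -> str:
    """Example monotonic map: write every input bit twice."""
    return "".join(b + b for b in x)
```

Doubling each bit passes the check, whereas reversing the input does not: the output for `"0"` is `"0"`, but the output for `"01"` is `"10"`, which does not start with `"0"`.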

Theorem 2

For each lower-semicomputable continuous semi-measure over , there is a string-monotonic function such that for all , .

Proof.

We prove this theorem by an explicit construction of . Since is lower-semicomputable, there exists a total computable function , such that and . Without loss of generality, we can assume, for all , and also for all , .

For a finite set of strings , such that for all , , we define . If contains a string of length not less than , then is undefined. For each string and , we define the finite prefix-free sets and . For each , , we define .

For each , we will use natural numbers , to be defined later. starts by setting equal to some constant , , and . Also for , . The variable starts at 0.

The algorithm for iterates in a loop, where at the beginning of the loop, is incremented by 1. Next, the variable is set to . Starting with , we perform the following operation on each string where , with the operation being performed on before and . We set and . This operation is defined because and . The string may have received a finite number of strings from its parent . The string adds these strings to . For , if , then the string will gift enough strings from into such that . The gifted strings are removed from and also put into . After this step is completed, the algorithm for restarts the loop, starting with the incrementing of again.

On input of , is defined to be , where is equal to first occurrence of a string in the looping algorithm described above, i.e. smallest , with one of the following properties:

  • there exists a , , with

  • there exists a , , , with .

From the construction, it can be seen that the algorithm for is total computable. This construction satisfies the properties of the theorem. This is because for any , if for , there exists an , and such that , then . This combined with the fact that for all , , ensures the theorem.

Corollary 1

For finite prefix free set , .

7 Complexity of Completing Predicates

The following proposition says that is a monotonically increasing function. It is used in the proof of Theorem 3.

Proposition 1

For , if , then .

Proof.

We use the fact that as can be computed from and . So . Let and and assume not. Then , causing a contradiction for large enough .

Theorem 3

For any finite prefix-free set of strings,
.

Proof.

Let . By Corollary 1, . Let be the smallest number with fraction of inputs such that . Thus . Let be the shortest total string with . It must be that because there is a program that when given , and , enumerates total strings of length and returns the first string such that , which we call satisfying property . This string is unique, otherwise there is a , that satisfies property . This implies the existence of , that satisfies property , since and . This contradicts the definition of being the shortest string with property . It also implies that is not total. Let . So . Applying Theorem 1, conditional to we get , where

(1)

Each string has . So . So applying Proposition 1 to Equation 1, we get . There exists a function , total computable relative to , that when given a set , computes and returns the set . The function is total computable (relative to ) because is total computable. Thus and . Using Lemma 1, conditioned on gives us

Due to Lemma 3, . So . By Lemma 2, . So . because . So .

Corollary 2

For binary predicate and the set of complete extensions of ,

Proof.

The corollary is meaningless if , so we can assume is finite. Let . Let be a set of strings such that . Thus . Theorem 3, applied to , gives where . Since we have that and , we have . In addition, since is a universal semi-computable continuous semi-measure, . So . Thus there exists complete extension of that equals up to index , and equals 0 everywhere else. Thus .
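The counting step behind this proof can be checked on a small case. The sketch below rests on one reading of the argument, stated here as an assumption: take the set of fixed-length strings that agree with the predicate on its domain, observe that its uniform measure is $2^{-d}$ where $d$ is the number of elements of the domain, and turn any member into a complete extension by padding with zeros. Indices are taken to be 0-based, the predicate must have a non-empty finite domain, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from typing import Callable, Dict, Iterable, Set


def consistent_strings(predicate: Dict[int, int]) -> Set[str]:
    """All strings of length max(domain) + 1 that agree with the predicate."""
    n = max(predicate) + 1
    return {
        "".join(bits)
        for bits in product("01", repeat=n)
        if all(int(bits[i]) == b for i, b in predicate.items())
    }


def uniform_measure(strings: Iterable[str]) -> Fraction:
    """Sum of 2^(-length) over the given strings."""
    return sum((Fraction(1, 2 ** len(s)) for s in strings), Fraction(0))


def completion_from_bits(x: str) -> Callable[[int], int]:
    """Total predicate equal to x on the first len(x) indices and 0 elsewhere."""
    def total(i: int) -> int:
        return int(x[i]) if i < len(x) else 0
    return total


predicate = {0: 1, 3: 0}
candidates = consistent_strings(predicate)
```

For this predicate there are four consistent strings of length 4, so the set has uniform measure 4/16 = 1/4, which equals $2^{-2}$ for a domain of two elements. The enumeration is exponential in the largest index and is meant only for small examples.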

References

  • [DH10] R. G. Downey and D.R. Hirschfeldt. Algorithmic Randomness and Complexity. Theory and Applications of Computability. Springer New York, 2010.
  • [EL11] Samuel Epstein and Leonid Levin. On sets of high complexity strings. CoRR, abs/1107.1458, 2011.
  • [Eps13] Samuel Epstein. All sampling methods produce outliers. CoRR, abs/1304.3872, 2013.
  • [Gác13] P. Gács. Lecture notes on descriptional complexity and randomness, 2013.
  • [GTV01] P. Gács, J. Tromp, and P. Vitányi. Algorithmic Statistics. IEEE Transactions on Information Theory, 47(6):2443–2463, 2001.
  • [KU87] A. N. Kolmogorov and V. A. Uspensky. Algorithms and Randomness. SIAM Theory of Probability and Its Applications, 32(3):389–412, 1987.
  • [LV08] M. Li and P. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications. Springer Publishing Company, Incorporated, 3 edition, 2008.
  • [She83] A. Shen. The concept of (alpha,beta)-stochasticity in the Kolmogorov sense, and its properties. Soviet Mathematics Doklady, 28(1):295–299, 1983.
  • [She99] A. Shen. Discussion on Kolmogorov Complexity and Statistical Analysis. The Computer Journal, 42(4):340–342, 1999.
  • [VS15] Nikolai K. Vereshchagin and Alexander Shen. Algorithmic statistics revisited. CoRR, abs/1504.04950, 2015.
  • [VS17] Nikolay K. Vereshchagin and Alexander Shen. Algorithmic statistics: Forty years later. In Computability and Complexity, pages 669–737, 2017.
  • [VV04] N. Vereshchagin and P. Vitányi. Kolmogorov’s Structure Functions and Model Selection. IEEE Transactions on Information Theory, 50(12):3265–3290, 2004.
  • [V’Y87] V.V. V’Yugin. On Randomness Defect of a Finite Object Relative to Measures with Given Complexity Bounds. SIAM Theory of Probability and Its Applications, 32:558–563, 1987.
  • [V’Y99] V.V. V’Yugin. Algorithmic complexity and stochastic properties of finite binary sequences, 1999.