The notion of VC-dimension was introduced by Vapnik and Červonenkis in [VC71]. Although originally motivated by applications in probability and statistics, it was quickly adapted to computer science, learning theory, combinatorics, logic and other areas. We refer to [Vap98] for an extensive review of the subject, and to [Che16] for an accessible introduction to combinatorial and logical aspects.
1.1. Definitions of VC-dimension and VC-density
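Let $X$ be a set and $\mathcal{S}$ be a family of subsets of $X$. A subset $A \subseteq X$ is shattered by $\mathcal{S}$ if for every subset $B \subseteq A$, there is $S \in \mathcal{S}$ with $B = S \cap A$. In other words, every subset of $A$ can be cut out by $\mathcal{S}$. The largest size $|A|$ of a subset $A$ shattered by $\mathcal{S}$ is called the VC-dimension of $\mathcal{S}$, denoted by $\mathrm{VC}(\mathcal{S})$. If no such largest size exists, we write $\mathrm{VC}(\mathcal{S}) = \infty$.
The shatter function is defined as follows:
$$\pi_{\mathcal{S}}(n) = \max\bigl\{\, |\mathcal{S} \cap A| \;:\; A \subseteq X,\ |A| = n \,\bigr\},$$
where $|\mathcal{S} \cap A|$ denotes the number of subsets $B \subseteq A$ which can be cut out by $\mathcal{S}$. The VC-density of $\mathcal{S}$, denoted by $\mathrm{vc}(\mathcal{S})$, is defined as
$$\mathrm{vc}(\mathcal{S}) = \inf\Bigl\{\, r \ge 0 \;:\; \limsup_{n \to \infty} \frac{\pi_{\mathcal{S}}(n)}{n^{r}} < \infty \,\Bigr\}.$$
By the Sauer–Shelah lemma [Sa72, Sh72], $\pi_{\mathcal{S}}(n) = O(n^{d})$, and hence $\mathrm{vc}(\mathcal{S}) \le d$, in case $\mathcal{S}$ has finite VC-dimension $d$. In general, VC-density can be much smaller than VC-dimension, and also behaves a lot better under various operations on $\mathcal{S}$.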
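For a finite ground set, the definitions above can be checked directly by brute force. The following is a minimal sketch, not taken from the paper: the function names `trace`, `is_shattered`, `shatter_function` and `vc_dimension` are illustrative, the family is assumed to be non-empty, and the family of integer intervals is chosen only as a small test case.

```python
from itertools import combinations

def trace(family, subset):
    """Distinct sets S ∩ subset cut out by the members S of the family."""
    return {s & subset for s in family}

def is_shattered(family, subset):
    """A subset is shattered when all 2^|subset| of its subsets are cut out."""
    return len(trace(family, subset)) == 2 ** len(subset)

def shatter_function(ground, family, n):
    """Largest number of subsets cut out on any n-element subset (n <= |ground|)."""
    return max(len(trace(family, frozenset(a))) for a in combinations(ground, n))

def vc_dimension(ground, family):
    """Brute-force VC-dimension of a non-empty finite family on a finite ground set."""
    best = 0
    for n in range(1, len(ground) + 1):
        # Subsets of shattered sets are shattered, so we may stop at the first failure.
        if not any(is_shattered(family, frozenset(a)) for a in combinations(ground, n)):
            break
        best = n
    return best

# Illustrative family: all integer intervals [a, b] inside {1, ..., 5}, plus the empty set.
ground = list(range(1, 6))
intervals = {frozenset(range(a, b + 1)) for a in ground for b in ground}

print(vc_dimension(ground, intervals))         # 2
print(shatter_function(ground, intervals, 3))  # 7
```

Intervals shatter any two points but never three, since no interval contains the two outer points while missing the middle one. This is why the shatter function at 3 is 7 rather than 8.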
1.2. NIP theories and bounds on VC-dimension/density
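It is of interest to distinguish the first-order theories in which VC-dimension and VC-density behave nicely. Let $\mathcal{L}$ be a first-order language and $\mathcal{M}$ be an $\mathcal{L}$-structure. Consider a partitioned $\mathcal{L}$-formula $F(\mathbf{x};\mathbf{y})$ whose free variables are separated into two groups $\mathbf{x} \in \mathcal{M}^{m}$ (objects) and $\mathbf{y} \in \mathcal{M}^{n}$ (parameters). For each parameter tuple $\mathbf{b} \in \mathcal{M}^{n}$, let
$$S_{\mathbf{b}} = \bigl\{\, \mathbf{x} \in \mathcal{M}^{m} \;:\; \mathcal{M} \models F(\mathbf{x};\mathbf{b}) \,\bigr\}.$$
Associated to $F$ is the family $\mathcal{S}_F = \{ S_{\mathbf{b}} : \mathbf{b} \in \mathcal{M}^{n} \}$. We say that $F$ is NIP, short for “$F$ does not have the independence property”, if $\mathcal{S}_F$ has finite VC-dimension. The structure $\mathcal{M}$ is called NIP if every partitioned $\mathcal{L}$-formula $F$ is NIP in $\mathcal{M}$.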
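To make the family attached to a partitioned formula concrete, here is a small sketch over the integers restricted to a finite window. The helper names `family_of` and `vc_lower_bound`, the window, and the two sample formulas are assumptions made for illustration. Because both objects and parameters are truncated to the window, the computed value is only a lower bound on the VC-dimension of the full family.

```python
from itertools import combinations, product

def family_of(formula, objects, params):
    """Distinct sets {x in objects : formula(x, b)} over all parameter tuples b."""
    return {frozenset(x for x in objects if formula(x, b)) for b in params}

def vc_lower_bound(formula, objects, params):
    """Largest size of a subset of `objects` shattered using parameters from `params`."""
    fam = family_of(formula, objects, params)
    best = 0
    for size in range(1, len(objects) + 1):
        shattered = any(
            len({s & frozenset(c) for s in fam}) == 2 ** size
            for c in combinations(objects, size)
        )
        if not shattered:
            break
        best = size
    return best

window = list(range(-4, 5))

# F(x; y) := x <= y, with one parameter.
threshold = lambda x, b: x <= b[0]
# F(x; y1, y2) := y1 <= x <= y2, with two parameters.
interval = lambda x, b: b[0] <= x <= b[1]

print(vc_lower_bound(threshold, window, [(b,) for b in window]))         # 1
print(vc_lower_bound(interval, window, list(product(window, repeat=2))))  # 2
```

Both sample formulas use only addition-free linear inequalities, so they are quantifier-free formulas in the language of ordered additive integers. Formulas with quantifiers would need a separate evaluation procedure that this sketch does not attempt.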
One prominent example of an NIP structure is Presburger Arithmetic $\mathrm{PA} = (\mathbb{Z}; <, +)$, which is the first-order structure on $\mathbb{Z}$ with only addition and inequalities. The main results of this paper are lower and upper bounds on the VC-dimensions of PA-formulas. These are contrasted with the following notable bounds on the VC-density:
Theorem 1 ([A+16]).
Given a PA-formula $F(\mathbf{x};\mathbf{y})$ with $\mathbf{y} \in \mathbb{Z}^{n}$, the bound $\mathrm{vc}(F) \le n$ holds.
In other words, VC-density in the setting of PA depends only on the dimension $n$ of the parameter variables $\mathbf{y}$, and is thus completely independent of the object variables $\mathbf{x}$, let alone other quantified variables or the description of $F$. This follows from a more general result in [A+16], which says that every quasi-o-minimal structure satisfies a similar bound on the VC-density. We refer to [A+16] for the precise statement of this result and for the powerful techniques used to bound the VC-density.
Karpinski and Macintyre raised the natural question of whether similar bounds hold for the VC-dimension. In [KM97], they gave upper bounds for the VC-dimension in some o-minimal structures (PA is not one), which are polynomial in the parameter dimension $n$. Later, they extended their arguments in [KM00] to obtain upper bounds on the VC-density, this time linear in $n$. To our knowledge, no effective upper bounds on the VC-dimensions of general PA-formulas exist in the literature. (Footnote 1: We were informed by Matthias Aschenbrenner that in [KM00], the authors claimed to have an effective bound on the VC-dimensions of PA-formulas. However, we cannot locate such an explicit bound in any paper.)
1.3. Main results
We consider PA-formulas $F(\mathbf{x};\mathbf{y})$ with a fixed number $k$ of variables (both quantified and free). Clearly, this also restricts the number of quantifier alternations in $F$. The atoms in $F$ are linear inequalities in these variables with some integer constants and coefficients (in binary). Given such a formula $F$, denote by $\ell(F)$ the length of $F$, i.e., the total bit length of all symbols, operations, integer coefficients and constants in $F$.
We can further restrict the form of a PA-formula by requiring that it does not contain too many inequalities. For fixed $k$ and $t$, denote by $\text{Short-PA}_{k,t}$ the family of PA-formulas with at most $k$ variables (both free and quantified) and $t$ inequalities. When $k$ and $t$ are clear, a formula $F \in \text{Short-PA}_{k,t}$ is simply called a short Presburger formula. In this case, $\ell(F)$ is essentially the total length of a bounded number of integer coefficients and constants. Our main result is a lower bound on the VC-dimension of short Presburger formulas:
Theorem 2.
For every , there is a short Presburger formula in the class with
Moreover, can be computed in probabilistic polynomial time in . Here are singletons and .
So in contrast with VC-density, the VC-dimension of a PA-formula crucially depends on the actual length . For the formulas in the theorem, we have:
where the last inequality follows by Theorem 1. Note that if one is allowed an unrestricted number of inequalities in , a similar lower bound to Theorem 2 can be easily established by an elementary combinatorial argument. However, since the formula is short, we can only work with a few integer coefficients and constants.
The construction in Theorem 2 uses a number-theoretic technique that employs continued fractions to encode a union of many arithmetic progressions. This technique was explored earlier in [NP17b] to show that various decision problems with short Presburger sentences are intractable. In this construction we need to pick a prime roughly larger than , which can be done in probabilistic polynomial time in . This can be modified to a deterministic algorithm with run-time polynomial in , at the cost of increasing :
Theorem 3.
For every , there is a short Presburger formula in the class with
Moreover, can be computed in deterministic polynomial time in .
We conclude with the following polynomial upper bound for the VC-dimension of all (not necessarily short) Presburger formulas in a fixed number of variables:
Theorem 4.
For a Presburger formula with at most variables (both free and quantified), we have:
where and the constant depend only on .
This upper bound implies that Theorem 2 is tight up to a polynomial factor. The proof of Theorem 4 uses an algorithm from [NP17a] for decomposing a semilinear set, i.e., one defined by a PA-formula, into polynomially many simpler pieces. Each such piece is a polyhedron intersecting a periodic set, whose VC-dimensions can be bounded by elementary arguments.
We note that the number of quantified variables is vital in Theorem 4. In Section 3.3, we construct PA-formulas with singletons and many quantified variables, for which grows doubly exponentially compared to .
Proof of Theorem 3
Let and . Since contains all of the subsets of , we have . We order the sets in lexicographically. In other words, for , we have if . Thus, the sets in can be listed as , where and . Next, define:
We show in Lemma 5 below that the set is definable by a short Presburger formula with only quantified variables and inequalities. Using this, it is clear that the parametrized formula
describes the family (with as the parameter), and thus has VC-dimension . We remark that has only quantifier alternation (see below).
Lemma 5.
The set is definable by a short Presburger formula with and a combination of inequalities with length .
Our strategy is to represent the set as a union of arithmetic progressions (APs). In [NP17b], we already gave a method to define any union of APs by a short Presburger formula of polynomial size. For each , let . From (2.1), we have:
From the lexicographic ordering of the sets , we can easily describe each set as:
So each set is not simply an AP, but the Minkowski sum of two APs. However, we can easily modify each into an AP by defining:
It is clear that is an AP that starts at with step size , and ends at . Let
which is a union of APs. Using the construction from [NP17b], we can define by a short Presburger formula:
where and is a Boolean combination of at most inequalities. This construction works by finding a single continued fraction whose successive convergents encode the starting and ending points of our APs. We refer to Section 4 in [NP17b] for the details. Each term in that construction is at most the product of the largest terms in the APs we want to encode. For each , the largest term in has length . Thus, the product has length , and so does each term . Therefore, the final continued fraction is a rational number , with length . This implies that as well.
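For readers unfamiliar with the objects mentioned above, the following sketch shows how the partial quotients and successive convergents of a rational number are computed with the standard recurrence. It does not reproduce the encoding of arithmetic progressions from [NP17b]; the function names and the sample rational 415/93 are purely illustrative.

```python
from fractions import Fraction

def continued_fraction(x):
    """Partial quotients [a0; a1, a2, ...] of a rational number x."""
    x = Fraction(x)
    quotients = []
    while True:
        a = x.numerator // x.denominator  # floor of x
        quotients.append(a)
        x -= a
        if x == 0:
            return quotients
        x = 1 / x

def convergents(quotients):
    """Successive convergents p_k/q_k, using p_k = a_k p_{k-1} + p_{k-2} (same for q)."""
    p_prev, q_prev = 1, 0
    p, q = quotients[0], 1
    result = [Fraction(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append(Fraction(p, q))
    return result

cf = continued_fraction(Fraction(415, 93))
print(cf)               # [4, 2, 6, 7]
print(convergents(cf))  # [Fraction(4, 1), Fraction(9, 2), Fraction(58, 13), Fraction(415, 93)]
```

The numerators and denominators of the convergents grow at most as fast as the product of the partial quotients, which is consistent with the way the length of the final rational number is bounded in the argument above.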
Proof of Theorem 2
Note that the construction of and in the proof above is deterministic with run-time polynomial in , again as a consequence of the construction in [NP17b]. Since in Theorem 2 we need only the existence of a short Presburger formula with high VC-dimension, our lower bound can be improved to , for some , as follows. Recall that is the largest element in the arithmetic progression in (2.5). Pick the smallest prime larger than . This prime can substitute for the large number in Section 4.1 of [NP17b], which was (deterministically) chosen as , so that it is larger and coprime to all ’s. The rest of the construction follows through. Note that by Chebyshev’s theorem, which implies that the final continued fraction has length . This completes the proof.
Proof of Theorem 4
Let be a Presburger formula in , with other quantified variables, where is fixed. In [NP17a] (Theorem 5.2), we gave the following polynomial decomposition on the semilinear set defined by :
Here each is a polyhedron in , and each is a periodic set, i.e., a union of several cosets of some lattice . In other words, the set defined by is a union of pieces, each of which is a polyhedron intersecting a periodic set. Our decomposition is algorithmic, in the sense that the pieces and lattices can be found in time , with and depending only on . The algorithm describes each piece by a system of inequalities and each lattice by a basis. Denote by and
the total binary lengths of these systems and basis vectors, respectively. These also satisfy:
Each can be written as the intersection , where each is a half-space in , and is the number of facets of . Note that . We rewrite (2.6) as:
Therefore, the set is a Boolean combination of half-spaces and periodic sets. In total, there are
of those basic sets.
For a set and , denote by the subset and by the family . For a half-space , it is easy to see that . For each periodic set with period lattice , the family has cardinality at most . Thus, we have
Applying this to (2.8), we get , where
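The counting idea used for periodic sets can be illustrated in the simplest one-dimensional setting. This is a sketch under stated assumptions and not the bound from the proof: the periodic set is taken to be $\{(x,y) : x + y \bmod m \in R\}$ for a hypothetical modulus `m` and residue set `residues`, and the fibers are restricted to a finite window. Since a fiber depends only on $y \bmod m$, there are at most `m` distinct fibers, and a family of $N$ sets has VC-dimension at most $\lfloor \log_2 N \rfloor$.

```python
def fibers(residues, m, window):
    """Distinct fibers T_y = {x in window : (x + y) mod m in residues}, for y in window."""
    return {
        frozenset(x for x in window if (x + y) % m in residues)
        for y in window
    }

m = 6                 # hypothetical period
residues = {0, 1, 3}  # hypothetical residue classes defining the periodic set
window = range(0, 24)

fam = fibers(residues, m, window)
# A family with N members can shatter at most floor(log2 N) points.
vc_upper_bound = len(fam).bit_length() - 1

print(len(fam))        # 6
print(vc_upper_bound)  # 2
```

In higher dimensions the role of `m` is played by the number of cosets of the period lattice, which is an assumption of this illustration rather than a statement of the exact bound in the text.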
3. Final remarks and open problems
Our constructed short formula is of the form . It would be interesting to see whether similar polynomial lower bounds are obtainable with existential short formulas. For such a formula , the expression captures the set of integer points lying in a union of some polyhedra ’s. Note that the total number of polyhedra and their facets should be bounded, since we are working with short formulas. Therefore, simply capture the pairs in the projection of along the direction. Denote this set by . The work of Barvinok and Woods [BW03] shows that has a short generating function, and can even be counted efficiently in polynomial time. In our construction, the set that yields high VC-dimension is a union of arithmetic progressions, which cannot be counted efficiently unless (see [SM73]). This difference indicates that has a much simpler combinatorial structure, and may not attain a high VC-dimension.
One can ask about the VC-dimension of a general PA-formula with no restriction on the number of variables, quantifier alternations or atoms. Fischer and Rabin famously showed in [FR74] that PA has decision complexity at least doubly exponential in the general setting. For every , they constructed a formula of length so that for every triple
we have if and only if . Using this “partial multiplication” relation, one can easily construct a formula of length and VC-dimension at least . This can be done by constructing a set similar to in (2.1) with replaced by using . We leave the details to the reader.
Regarding upper bounds, Oppen showed in [Opp78] that any PA-formula of length is equivalent to a quantifier-free formula of length for some universal constant . This implies that , and thus , is at most triply exponential in . We conjecture that a doubly exponential upper bound on holds in the general setting. It is unlikely that such an upper bound could be established by straightforward quantifier elimination, which generally results in a triply exponential blow-up (see [Wei97, Thm 3.1]).
We are grateful to Matthias Aschenbrenner and Artëm Chernikov for many interesting conversations and helpful remarks. This paper was finished while both authors were visitors at MSRI; we are thankful for the hospitality, great work environment and its busy schedule. The second author was partially supported by the NSF.
- [A+16] M. Aschenbrenner, A. Dolich, D. Haskell, D. Macpherson and S. Starchenko, Vapnik-Chervonenkis density in some theories without the independence property, I, Trans. AMS 368 (2016), 5889–5949.
- [BW03] A. Barvinok and K. Woods, Short rational generating functions for lattice point problems, Jour. AMS 16 (2003), 957–979.
- [Che16] A. Chernikov, Model theory and combinatorics, course notes, UCLA; available electronically at https://tinyurl.com/y8ob6uyv.
- [FR74] M. J. Fischer and M. O. Rabin, Super-Exponential Complexity of Presburger Arithmetic, in Proc. SIAM-AMS Symposium in Applied Mathematics, AMS, Providence, RI, 1974, 27–41.
- [KM97] M. Karpinski and A. Macintyre, Polynomial bounds for VC dimension of sigmoidal and general Pfaffian neural networks, J. Comput. System Sci. 54 (1997), 169–176.
- [KM00] M. Karpinski and A. Macintyre, Approximating volumes and integrals in o-minimal and p-minimal theories, in Connections between model theory and algebraic and analytic geometry, Seconda Univ. Napoli, Caserta, 2000, 149–177.
- [NP17a] D. Nguyen and I. Pak, Enumeration of integer points in projections of unbounded polyhedra, in Proc. 19th IPCO, Springer, Cham, 2017, 417–429; arXiv:1612.08030.
- [NP17b] D. Nguyen and I. Pak, Short Presburger arithmetic is hard, to appear in Proc. 58th FOCS (2017); arXiv:1708.08179.
- [LO87] J. C. Lagarias and A. M. Odlyzko, Computing $\pi(x)$: an analytic method, J. Algorithms 8 (1987), 173–191.
- [Opp78] D. C. Oppen, A $2^{2^{2^{pn}}}$ upper bound on the complexity of Presburger arithmetic, J. Comput. System Sci. 16 (1978), 323–332.
- [Sa72] N. Sauer, On the density of families of sets, J. Combin. Theory, Ser. A 13 (1972), 145–147.
- [Sh72] S. Shelah, A combinatorial problem; stability and order for models and theories in infinitary languages, Pacific J. Math. 41 (1972), 247–261.
- [SM73] L. J. Stockmeyer and A. R. Meyer, Word problems requiring exponential time: preliminary report, in Proc. Fifth STOC, ACM, New York, 1973, 1–9.
- [TCH12] T. Tao, E. Croot and H. Helfgott, Deterministic methods to find primes, Math. Comp. 81 (2012), 1233–1246.
- [VC71] V. N. Vapnik and A. Ja. Červonenkis, The uniform convergence of frequencies of the appearance of events to their probabilities, Theor. Probability Appl. 16 (1971), 264–280.
- [Vap98] V. N. Vapnik, Statistical learning theory, John Wiley, New York, 1998.
- [Wei97] V. D. Weispfenning, Complexity and uniformity of elimination in Presburger arithmetic, in Proc. 1997 ISSAC, ACM, New York, 1997, 48–53.