A function is convex if for all . Convexity of functions is a natural and interesting property. Given oracle access to a function , an
-tester for convexity has to decide with high constant probability whether is a convex function or whether every convex function evaluates differently from on at least domain points, where . Parnas, Ron, and Rubinfeld [PRR06] gave an -tester for convexity that has query complexity . Blais, Raskhodnikova, and Yaroslavtsev [BRY14] showed that this bound is tight for constant for nonadaptive algorithms (the queries of a nonadaptive algorithm do not depend on the answers to previous queries; an algorithm is adaptive otherwise). An improved upper bound of was shown by Ben-Eliezer [BEN19] in a work on the more general question of testing local properties. Recently, Belovs, Blais and Bommireddi [BBB20] complemented this result by showing a tight lower bound of .
In this work, we further investigate this well-studied problem and develop new insights into it, showing that there is still some way to go towards a full understanding of testing convexity of functions . We show that the number of distinct discrete derivatives , as opposed to the input size , is the right input parameter to express the complexity of convexity testing, where a discrete derivative is a value of the form for . Specifically, we design a nonadaptive convexity tester with query complexity , and complement it with a nearly matching lower bound of . Our work is motivated by the work of Pallavoor, Raskhodnikova and Varma [PRV18], who introduced the notion of parameterization in the setting of sublinear algorithms.
Our results bring out the fine-grained complexity of the problem of convexity testing. In particular, always and therefore, our tester is at least as efficient as the state-of-the-art convexity testers. Furthermore, the parameterization that we introduce enables us to circumvent the worst-case lower bounds expressed in terms of the input size and obtain more efficient algorithms when .
1.1 Our Results
We begin our investigation with the simple and highly restricted case of testing convexity of functions having at most two distinct discrete derivatives. We design an adaptive algorithm that exactly decides convexity by making queries and a nonadaptive algorithm that -tests convexity by making queries. The highlight is that both these algorithms are deterministic.
Theorem 1.1. There exists a deterministic algorithm that, given oracle access to a function having at most two distinct discrete derivatives, exactly decides convexity by making at most adaptive queries.
Theorem 1.1 is significant because one can construct simple examples of two distributions, both over functions having at most two distinct discrete derivatives, one over convex functions and the other over non-convex functions, such that no nonadaptive deterministic algorithm making queries can distinguish between them. Therefore, the above result shows the power of adaptivity even in this restricted setting.
We also design a constant-query deterministic nonadaptive testing algorithm for convexity of functions having at most two distinct discrete derivatives.
Theorem 1.2. Let . There exists a deterministic nonadaptive 1-sided error -tester for convexity of functions having at most two distinct discrete derivatives with query complexity .
Next, we consider the general case of functions having at most distinct discrete derivatives and design the following nonadaptive tester.
Theorem 1.3. Let . There exists a nonadaptive -sided error -tester with query complexity for convexity of real-valued functions having at most distinct discrete derivatives.
We complement Theorem 1.3 with the following lower bound that is tight for constant . The bound holds even for adaptive testers, thereby showing that one cannot hope for a separation between adaptive and nonadaptive testers for this general setting.
Theorem 1.4. For every sufficiently large , every , and every sufficiently large , every -tester for convexity of functions having at most distinct derivatives has query complexity .
1.2 Related Work
The study of property testing was initiated by Rubinfeld and Sudan [RS96] and Goldreich, Goldwasser and Ron [GGR98]. The first example where parameterization has helped in the design of efficient testers is the work of Jha and Raskhodnikova [JR13] on testing the Lipschitz property. A systematic study of parameterization in sublinear-time algorithms was initiated by Pallavoor, Raskhodnikova and Varma [PRV18] and studied further by [BEL18, CS19, SUR20, NV21].
In this work, we are concerned only with convexity of real-valued functions over a domain. We would like to note that not much is known about testing convexity of functions over higher dimensional domains. One possible reason is that there is no single definition of discrete convexity for real-valued functions of multiple variables. For a good overview of this topic, we refer interested readers to the textbook by Murota on discrete convex analysis [MUR03]. Ben-Eliezer [BEN19], in his work on local properties, studied the problem of testing convexity of functions of the form and designed a nonadaptive tester with query complexity . Later, Belovs, Blais and Bommireddi [BBB20] showed a nonadaptive query lower bound of for testing convexity of real-valued functions over . For functions of the form , they designed an adaptive tester with query complexity and showed that the complexity of nonadaptive testing is .
For a natural number , we denote by , the set . Let . For , the is the number of points such that . Points are consecutive if . A function is convex if and only if
for all such that . A set of points is said to violate convexity if is not convex.
For a function , and , we use the term ‘discrete derivative’ at for . We denote by , the derivative function . The cardinality of the range of is referred to as the number of distinct discrete derivatives of . A function is convex if and only if is monotone non-decreasing.
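The characterization above can be illustrated with a short sketch. It assumes that the function is given as a list of its values at the points 1 through n and that the discrete derivative at a point x is the difference between the value at x and the value at x - 1; both the representation and the helper names are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def discrete_derivatives(values: Sequence[int]) -> List[int]:
    """Differences of consecutive values, for values = [f(1), ..., f(n)].

    Assumption: the derivative at x is taken to be f(x) - f(x-1).
    """
    return [values[i] - values[i - 1] for i in range(1, len(values))]


def is_convex(values: Sequence[int]) -> bool:
    """A function is convex iff its derivative sequence is non-decreasing."""
    d = discrete_derivatives(values)
    return all(d[i] <= d[i + 1] for i in range(len(d) - 1))


def num_distinct_derivatives(values: Sequence[int]) -> int:
    """Cardinality of the range of the derivative function."""
    return len(set(discrete_derivatives(values)))
```

For instance, the values 0, 1, 2, 4, 6 have derivatives 1, 1, 2, 2, so the function is convex and has two distinct discrete derivatives.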
If is not convex, then there exist three consecutive points that violate eq. 1.
We note that although convexity of is equivalent to monotonicity of , it is not true that if is -far from being convex then is -far from monotonicity for some positive constant . For example, consider that is defined by , and . Then is nearly -far from being convex for , while is almost the -constant function.
Let . A function is -far from convex if every convex function evaluates differently from on at least points. A basic -tester for convexity gets oracle access to a function and a parameter ; it accepts if is convex, and rejects with probability at least if is -far from convex.
3 Deterministic Convexity Testers for Functions having at most Two Distinct Discrete Derivatives
Let the range of be . We do not assume that the algorithm knows the values but we will refer to them in our proofs and reasoning below. As it turns out, in this case there is a deterministic adaptive algorithm that can precisely decide if is convex, making only queries. This is based on the fact that the class of convex functions having at most two distinct derivatives is very restricted.
Observation 3.1. If is convex and takes at most two values, then is of the following form: there is such that and for some and some . We denote such as .
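As an illustration of the restricted shape described in the observation, the following sketch builds a convex function with at most two slopes. The parameter names (`n`, `t`, `base`, `a`, `b`) and the exact placement of the breakpoint are assumptions made for this example, not notation taken from the text.

```python
from typing import List


def two_slope_function(n: int, t: int, base: int, a: int, b: int) -> List[int]:
    """Return [g(1), ..., g(n)] with g(1) = base, slope a up to point t, slope b after.

    Hypothetical helper: requires a <= b so that the result is convex.
    """
    if not 1 <= t <= n:
        raise ValueError("breakpoint t must lie in {1, ..., n}")
    if a > b:
        raise ValueError("need a <= b for a convex function")
    # Steps into points 2..t contribute a each; steps into points t+1..n contribute b each.
    return [base + a * (min(x, t) - 1) + b * max(0, x - t) for x in range(1, n + 1)]
```

With `n = 5`, `t = 3`, `base = 0`, `a = 1`, `b = 2`, the sketch yields the values 0, 1, 2, 4, 6.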
Algorithm 1 is a deterministic, adaptive algorithm; it can make the last query only after knowing the values of at the first points. Moreover, if is a function having at most two distinct discrete derivatives, then there is always an integer point such that .
Algorithm 1 accepts every convex function having at most two distinct derivatives, and rejects every function having at most two distinct derivatives that is not convex.
We note that the lemma asserts that Algorithm 1 decides convexity correctly on every function that has at most two distinct derivatives, regardless of the distance to convexity.
Proof. If is convex, then the restriction of to every subset of is also convex and Algorithm 1 accepts.
Suppose is not convex. This immediately implies that has two distinct discrete derivatives, which we denote by . Now, it is necessary that and for the restriction of to to be convex, for otherwise Algorithm 1 immediately rejects.
Assuming that the restriction of to the set is convex, Observation 3.1 implies that the only convex function with distinct derivatives that is consistent with on the points in is the function (from Observation 3.1), where the value of is unique and is as determined by Algorithm 1. If , then the restriction of to is not convex and Algorithm 1 rejects. In the rest of the proof, we argue that if , then and have to evaluate to the same value on every point in and that is convex. Specifically, each one of the discrete derivatives of up to the must be , for otherwise, will be larger than . Moreover, each one of the discrete derivatives of from up to must be , for otherwise, will be smaller than . That is, the functions are identical on every point in . ∎
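The argument above can be made concrete with a minimal sketch of an adaptive decision procedure. It is a reconstruction from the proof, not a transcription of Algorithm 1: it assumes integer-valued (or `Fraction`-valued) functions on the points 1 through n, an oracle `f` that is a hypothetical Python callable, and the promise of at most two distinct discrete derivatives. It queries the two leftmost and two rightmost points, solves for the only possible breakpoint, and then queries that one point.

```python
from fractions import Fraction
from typing import Callable


def decide_convex_two_slopes(f: Callable[[int], int], n: int) -> bool:
    """Decide convexity with at most 5 queries, assuming at most two distinct derivatives.

    Behaviour is only guaranteed under that promise; a detected violation raises ValueError.
    """
    if n <= 2:
        return True  # every function on at most two points is convex
    f1, f2, f_prev, f_last = f(1), f(2), f(n - 1), f(n)
    a = f2 - f1          # first discrete derivative
    b = f_last - f_prev  # last discrete derivative
    if a > b:
        return False     # the derivative sequence cannot be non-decreasing
    total = f_last - f1  # sum of all n - 1 derivatives
    if a == b:
        # Convex iff every derivative equals a, i.e. f is linear.
        return total == a * (n - 1)
    # Here a < b, so the derivatives are exactly {a, b}; k counts the a-steps.
    k = Fraction(b * (n - 1) - total, b - a)
    if k.denominator != 1 or not 1 <= k <= n - 2:
        raise ValueError("promise violated: more than two distinct derivatives")
    t = 1 + int(k)
    # Adaptive query: f is convex iff the first k steps all have slope a.
    return f(t) == f1 + a * int(k)
```

The final query depends on the answers to the first four, which is the only place where adaptivity is used in this sketch.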
3.1 A Nonadaptive Deterministic Convexity Tester for Functions having at most Two Distinct Discrete Derivatives
As remarked above, Algorithm 1 is deterministic and decides convexity exactly under the promise that has at most two distinct derivatives. However, it is adaptive. What can be said about nonadaptive algorithms for the same problem? It is easy to see that for any deterministic algorithm that makes nonadaptive queries , there are two functions , both having at most two distinct derivatives, for which but is convex while is not convex. Hence there is no deterministic nonadaptive algorithm that decides convexity exactly while making at most queries. This line of reasoning immediately extends to a lower bound on randomized nonadaptive algorithms that exactly decide convexity.
Here we come back to the property testing scenario. We show that there is a deterministic nonadaptive tester, Algorithm 2, that accepts every convex function and rejects every function that is -far from convex, under the promise that has at most two distinct derivatives. Algorithm 2 makes only nonadaptive deterministic queries.
The claim about the query complexity is clear. Further, by Observation 3.1, if is convex having at most two distinct derivatives, then for some , the function is of the form given in the observation. Let . Then is consistent with in the acceptance criterion of Algorithm 2, and hence will be accepted.
Next, consider a function that is accepted by Algorithm 2. That is, there exists such that for every , and that for every , where are the two distinct discrete derivatives of . Since , the function is a linear function when restricted to the set . Similarly, when restricted to the set , the function is linear with slope . Further, it can be seen that can be corrected to be convex by changing the values for to be consistent with . As this changes at most points, it implies that is -close to convex. ∎
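The acceptance criterion discussed in this proof can be sketched as follows. This is a hedged reconstruction rather than the text of Algorithm 2: the query set (the two leftmost points, the two rightmost points, and a grid whose spacing is at most the proximity parameter times n), the function names, and the use of an integer-valued oracle `f` are all assumptions made for illustration. The sketch accepts exactly when a prefix of the queried points lies on the line through the first point with the first slope and the remaining queried points lie on the line through the last point with the last slope.

```python
import math
from typing import Callable, List


def query_points(n: int, eps: float) -> List[int]:
    """Fixed query set, chosen before seeing any answer (hence nonadaptive)."""
    step = max(1, math.floor(eps * n))
    pts = set(range(1, n + 1, step)) | {1, 2, n - 1, n}
    return sorted(p for p in pts if 1 <= p <= n)


def nonadaptive_two_slope_tester(f: Callable[[int], int], n: int, eps: float) -> bool:
    """Accept iff the answers are consistent with some convex two-slope function."""
    pts = query_points(n, eps)
    vals = {p: f(p) for p in pts}
    if n <= 2:
        return True
    a = vals[2] - vals[1]      # slope at the left end
    b = vals[n] - vals[n - 1]  # slope at the right end
    if a > b:
        return False
    on_left = [vals[p] == vals[1] + a * (p - 1) for p in pts]
    on_right = [vals[p] == vals[n] - b * (n - p) for p in pts]
    # Take the longest prefix on the left line; the rest must be on the right line.
    j = 0
    while j < len(pts) and on_left[j]:
        j += 1
    return all(on_right[j:])
```

Under the two-derivative promise, an accepted function is linear between consecutive queried points on either side of the split, so only the values strictly between two adjacent queried points may need to be changed.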
4 Convexity Tester for Functions having at most Distinct Discrete Derivatives
In this section, we describe our convexity tester for the case that the function has at most distinct discrete derivatives and prove Theorem 1.3. A basic tester is presented in Algorithm 3. For simplicity, we assume throughout this section that is an integer that divides .
The high-level idea is the following: suppose that is convex with at most distinct discrete derivatives, and let be a set of nearly equally spaced consecutive pairs of points in starting with , namely, . Let for . By the assumption on , the function is the constant function on at least of the intervals . Further, if is constant on , then obviously for . Thus, in order to check that is convex, we first check that is convex using the nonadaptive, 1-sided error basic -tester of Belovs et al. [BBB20] by making queries. Afterwards, we test that is close to being a linear function on most intervals . To test “linearity” of on most such , it is enough to pick a random such interval and test the distance to the appropriate linear function, which will result in a large enough success probability. The details follow.
Algorithm 3 invokes Algorithm 4 as a subroutine, where Algorithm 4 is a basic tester for convexity of functions defined over subdomains of . The following theorem can be proven by modifying the analysis of the convexity tester by Belovs et al. [BBB20] in a fairly straightforward manner. We have included its proof in the Appendix.
Theorem 4.1 (Belovs et al. [BBB20]).
Let . There exists a basic -tester for convexity of functions of the form that works for all with query complexity .
Lemma 4.2 shows that Algorithm 3 is indeed a basic -tester for convexity. Our convexity tester with query complexity is obtained by repetitions of Algorithm 3. This completes the proof of Theorem 1.3.
Lemma 4.2. Algorithm 3 rejects, with probability at least , every function having at most distinct discrete derivatives that is -far from convex.
In the rest of the proof, we assume that is -close to convex. In other words, it is possible to modify in at most points in in order to make convex.
Let for be shorthand for the index , and let stand for the index . Let denote the interval of indices for . Let denote the interval of indices . For , the interval is nearly linear if
The interval is nearly linear if
We first prove a lower bound on the number of nearly linear intervals. Recall that there is a way to modify by changing its values on at most bad points. For , the interval is bad if one among is a bad point, and is good otherwise. Likewise, is bad if one among is a bad point, and is good otherwise. The number of bad intervals is, therefore, at most , since a bad point can make at most two intervals bad. Now, is at most , by substituting the value of .
For , if the interval is good, then none of the points in are bad, and hence we have
Similarly, if is good, then
For a good interval (or ) for that is not nearly linear, one of the inequalities in Equation 4 (Equation 5, respectively) must be a strict inequality. Since the number of distinct discrete derivatives in is at most , the number of distinct discrete derivatives among restricted to the good points is also at most . Since the function restricted to the good points is convex, the number of good intervals with strict inequalities (in Equation 4 or Equation 5) is at most . Hence, the number of good intervals that are not nearly linear is at most .
Since is -far from being convex and each interval has at most indices, the restriction of to the set of indices belonging to good nearly linear intervals has distance at least from convexity.
For , for a good nearly linear interval , we use to denote the Hamming distance of to the linear function defined as for , where . Similarly, if is good, we use to denote the Hamming distance of to the linear function defined as , where .
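The distance defined above can be computed directly when the whole interval is available. The sketch below assumes that the linear function passes through the left endpoint of the interval with slope equal to the discrete derivative between the left endpoint and its successor; this reading, the list representation of the function, and the helper name are assumptions made for illustration.

```python
from typing import Sequence


def distance_to_line(values: Sequence[int], left: int, right: int) -> int:
    """Hamming distance, on points left..right, between f and the line through
    (left, f(left)) with slope f(left + 1) - f(left).

    values = [f(1), ..., f(n)]; points are 1-indexed. Requires left < right.
    """
    base = values[left - 1]
    slope = values[left] - values[left - 1]
    return sum(
        1
        for x in range(left, right + 1)
        if values[x - 1] != base + slope * (x - left)
    )
```

A tester cannot afford to compute this quantity exactly, which is why the analysis instead lower-bounds the probability that a single sampled point of the interval lies off the line.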
Consider the restriction of to the set of indices that belong to the good nearly linear intervals. We can make this restriction convex by replacing with for each such that is a good nearly linear interval. Hence,
Consider a good nearly linear interval such that . Consider the (favorable) set consisting of all points such that and . Clearly, both and are in . If Algorithm 3 samples, in Step 6, a point , it rejects. We now show that we can repair the function values at points not in and make be equal to . Consider an interval of points such that none of them are in , and where both and are in . Since and are both in , we have that . We repair the function on the interval by assigning the value for all . We can repair the function on the whole interval and make be equal to by applying the same modification on every such maximal subinterval, where the maximality is in the sense of not belonging to .
Hence, the probability that the tester rejects in Step 6 is at least . This completes the proof. ∎
5 Lower Bound
In this section, we prove Theorem 1.4.
Theorem 1.4. For every sufficiently large , every , and every sufficiently large , every -tester for convexity of functions having at most distinct derivatives has query complexity .
We use Yao’s principle. Let and . Consider the distributions and from Belovs et al. [BBB20] (proof of Theorem 1.3) of functions . Every function sampled from these distributions has at most distinct derivatives. Moreover, every function sampled from is convex, and every function sampled from is -far from convex. They show that every tester distinguishing these distributions, with probability at least , has to make at least queries.
Consider an integer that is an integer multiple of . Let denote . We define distributions and of functions as follows.
For , to sample a function from , first sample a function from . For , let . For , let . Now, for all and for all , set .
By construction, every function sampled from is convex. Additionally, every function sampled from is -far from convex. To see this, consider a function sampled from that is -close to being convex. Let denote the function sampled from from which we constructed . Let denote the set of bad points such that changing the values of on points in makes it convex. Since is piecewise linear, it is clear that for each point in of the form for , either the set of points , or the set of points has to belong to . Thus, the number of points in of the form for is at most . By construction of , we know that for all , it holds that . Thus, the distance of to convexity is at most .
Consider a deterministic algorithm that distinguishes these distributions by making queries. One can use to distinguish, with the same success probability, the distributions and by making at most twice as many queries as , which leads to a contradiction. ∎
- [BBB20] (2020) Testing convexity of functions over finite domains. In Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, Salt Lake City, UT, USA, January 5-8, 2020, pp. 2030–2045.
- [BEL18] (2018) Adaptive lower bound for testing monotonicity on the line. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2018, August 20-22, 2018 - Princeton, NJ, USA, E. Blais, K. Jansen, J. D. P. Rolim, and D. Steurer (Eds.), LIPIcs, Vol. 116, pp. 31:1–31:10.
- [BEN19] (2019) Testing local properties of arrays. In 10th Innovations in Theoretical Computer Science Conference, ITCS 2019, January 10-12, 2019, San Diego, California, USA, LIPIcs, Vol. 124, pp. 11:1–11:20.
- [BRY14] (2014) Lower bounds for testing properties of functions over hypergrid domains. In IEEE 29th Conference on Computational Complexity, CCC 2014, Vancouver, BC, Canada, June 11-13, 2014, pp. 309–320.
- [CS19] (2019) Adaptive boolean monotonicity testing in total influence time. In 10th Innovations in Theoretical Computer Science Conference, ITCS 2019, January 10-12, 2019, San Diego, California, USA, A. Blum (Ed.), LIPIcs, Vol. 124, pp. 20:1–20:7.
- [GGR98] (1998) Property testing and its connection to learning and approximation. J. ACM 45 (4), pp. 653–750.
- [JR13] (2013) Testing and reconstruction of Lipschitz functions with applications to data privacy. SIAM J. Comput. 42 (2), pp. 700–731.
- [MUR03] (2003) Discrete convex analysis. SIAM monographs on discrete mathematics and applications, Vol. 10, SIAM.
- [NV21] (2021) New sublinear algorithms and lower bounds for LIS estimation. In 48th International Colloquium on Automata, Languages, and Programming, ICALP 2021, July 12-16, 2021, Glasgow, Scotland (Virtual Conference), LIPIcs, Vol. 198, pp. 100:1–100:20.
- [PRV18] (2018) Parameterized property testing of functions. ACM Trans. Comput. Theory 9 (4), pp. 17:1–17:19.
- [PRR06] (2006) Tolerant property testing and distance approximation. J. Comput. Syst. Sci. 72 (6), pp. 1012–1042.
- [RS96] (1996) Robust characterizations of polynomials with applications to program testing. SIAM J. Comput. 25 (2), pp. 252–271.
- [SUR20] (2020) Improved algorithms and new models in property testing. Ph.D. Thesis, Boston University.
Appendix A Basic Tester for Convexity over Subdomains of
Definition A.1 (Test-Set, and its Root, Hub, and Scale).
Consider a point and an integer . Let be one among the two points closest to such that is a multiple of . Let be such that , and . We refer to as the test-set with root , hub , and scale .
Consider points such that . Then there exist test-sets with roots and , respectively, having a common hub and scales at most
Consider the smallest integer such that there exists a unique multiple of satisfying . There are at least multiples of in the range . If there were only one multiple of in that range, this would contradict our assumption that is the smallest integer such that there is a unique multiple of in . Now, since there are at least multiples of in the range , we have that .
The lemma follows by setting such that as the common hub of the test-sets with roots and , and as their scale. ∎
Lemma A.3. Consider a set that violates convexity such that . At least one of the test-sets involving or violates convexity. Moreover, its scale does not exceed .
Let be such that , , and .
Similarly, let be the common hub for test-sets of and with scale at most and let be such that . Note that , and are the test-sets being alluded to in this case.
If none of the aforementioned four test-sets violates convexity, then we immediately have the following inequalities:
This contradicts our assumption that violates convexity. ∎
We are now ready to prove Theorem 4.1. The query complexity of Algorithm 4 is clear from its description. Additionally, it always accepts convex functions. Consider a function that is -far from convex. The total number of possible scales for test-sets (see Algorithm 4) is at most . Hence, the total number of test-sets is at most .
We will construct a set such that and for every , one of the test-sets rooted at of scale at most violates convexity. Initialize to be . If , then is not convex. We can find consecutive points , such that violates convexity on these three points. Since contains fewer than points, we have that is at most . Hence, by Lemma A.3, we know that there exists a violating test-set involving either or of scale at most . We add such a point to the set . Hence, we have at least violating test-sets. Therefore, the tester rejects with probability at least in a single iteration.