Optimal Sampling Gaps for Adaptive Submodular Maximization

by Shaojie Tang, et al.

Running machine learning algorithms on large and rapidly growing volumes of data is often computationally expensive. One common trick to reduce the size of a data set, and thus the computational cost of machine learning algorithms, is probability sampling, which creates a sampled data set by including each data point from the original data set with a known probability. Although the benefit of running machine learning algorithms on the reduced data set is obvious, one major concern is that the performance of the solution obtained from samples might be much worse than that of the optimal solution obtained from the full data set. In this paper, we examine the performance loss caused by probability sampling in the context of adaptive submodular maximization. We consider a simple probability sampling method that selects each data point independently with probability r∈[0,1]. We define the sampling gap as the largest ratio, over independence systems, between the value of the optimal solution obtained from the full data set and the value of the optimal solution obtained from the samples. Our main contribution is to show that if the utility function is policywise submodular, then for a given sampling rate r, the sampling gap is both upper bounded and lower bounded by 1/r. One immediate implication of our result is that if we can find an α-approximation solution based on a sampled data set (sampled at rate r), then this solution achieves an αr approximation ratio for the original problem on the full data set. We also show that policywise submodularity arises in a wide range of real-world applications, including pool-based active learning and adaptive viral marketing.
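The sampling step and the 1/r bound can be illustrated with a small sketch. This is not the paper's adaptive setting or its algorithm: it assumes a non-adaptive toy coverage function, a cardinality constraint (one simple independence system), and brute-force optimization. All names here (`bernoulli_sample`, `coverage`, `best_value`, `expected_sample_optimum`, `SETS`) are hypothetical and chosen for illustration only.

```python
import random
from itertools import combinations


def bernoulli_sample(items, r, rng):
    """Keep each item independently with probability r."""
    return [x for x in items if rng.random() < r]


def coverage(sets, chosen):
    """Toy monotone submodular utility: number of elements covered."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)


def best_value(sets, candidates, k):
    """Brute-force optimum over subsets of `candidates` of size at most k."""
    best = 0
    for size in range(min(k, len(candidates)) + 1):
        for combo in combinations(candidates, size):
            best = max(best, coverage(sets, combo))
    return best


def expected_sample_optimum(sets, k, r):
    """Exact expectation of the sampled optimum, enumerating every sample."""
    items = list(sets)
    n = len(items)
    total = 0.0
    for size in range(n + 1):
        for sample in combinations(items, size):
            prob = r ** size * (1 - r) ** (n - size)
            total += prob * best_value(sets, sample, k)
    return total


# Hypothetical toy instance: item id -> elements it covers.
SETS = {
    0: {1, 2, 3},
    1: {3, 4},
    2: {4, 5, 6},
    3: {1, 6},
    4: {7},
    5: {2, 5, 7},
}

r, k = 0.5, 2
full_opt = best_value(SETS, list(SETS), k)
sample_opt = expected_sample_optimum(SETS, k, r)
gap = full_opt / sample_opt  # should not exceed 1/r on this instance

one_draw = bernoulli_sample(list(SETS), r, random.Random(0))
print(f"full optimum: {full_opt}, expected sampled optimum: {sample_opt:.3f}")
print(f"empirical gap: {gap:.3f} (bound 1/r = {1 / r:.1f}); one sample: {one_draw}")
```

On this toy instance the ratio of the full optimum to the expected sampled optimum stays within 1/r, which is consistent with the bound stated above. Under the same reading, an α-approximation computed on the sample would retain at least an αr fraction of the full optimum in expectation.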


Instance Specific Approximations for Submodular Maximization

For many optimization problems in machine learning, finding an optimal s...

Practical Budgeted Submodular Maximization

We consider the Budgeted Submodular Maximization problem, that seeks to ...

A Unified Framework of Robust Submodular Optimization

In this paper, we shall study a unified framework of robust submodular o...

Using Partial Monotonicity in Submodular Maximization

Over the last two decades, submodular function maximization has been the...

Horizontally Scalable Submodular Maximization

A variety of large-scale machine learning problems can be cast as instan...

A Sketching Method for Finding the Closest Point on a Convex Hull

We develop a sketching algorithm to find the point on the convex hull of...

Scaling the Indian Buffet Process via Submodular Maximization

Inference for latent feature models is inherently difficult as the infer...
