On a generalization of iterated and randomized rounding

11/05/2018
by Nikhil Bansal, et al.

We give a general method for rounding linear programs that combines the commonly used iterated rounding and randomized rounding techniques. In particular, we show that whenever iterated rounding can be applied to a problem with some slack, there is a randomized procedure that returns an integral solution that satisfies the guarantees of iterated rounding and also has concentration properties. We use this to give new results for several classic problems where iterated rounding has been useful.


1 Introduction

A powerful approach in approximation algorithms is to formulate the problem at hand as a 0-1 integer program and consider some efficiently solvable relaxation of it. Then, given some fractional solution x to this relaxation, one applies a suitable rounding procedure to x to obtain an integral 0-1 solution. Arguably the two most basic and extensively studied techniques for rounding such relaxations are randomized rounding and iterated rounding.

Randomized Rounding. Here, the fractional values x_i ∈ [0, 1] are interpreted as probabilities and used to round the variables independently to 0 or 1. A key property of this rounding is that each linear constraint is preserved in expectation, and its value is tightly concentrated around its mean as given by Chernoff bounds, or more generally Bernstein’s inequality (definitions in Section 2). Randomized rounding is well-suited to problems where the constraints do not have much structure, or when they are soft and some error can be tolerated. Sometimes these errors can be fixed by applying problem-specific alteration steps. We refer to [34, 33] for various applications of randomized rounding.
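As a concrete illustration, here is a minimal Python sketch of this procedure (our own illustration, not code from the paper); the fractional vector x is arbitrary and the vector a stands for any linear constraint:

    import numpy as np

    def randomized_rounding(x, rng=None):
        # Round each coordinate to 1 independently with probability x_i.
        rng = np.random.default_rng() if rng is None else rng
        x = np.asarray(x, dtype=float)
        return (rng.random(x.shape[0]) < x).astype(int)

    # Any linear constraint <a, x> is preserved in expectation: for
    # X = randomized_rounding(x), E[a @ X] = a @ x, and the deviation
    # obeys Chernoff/Bernstein-type tail bounds.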

Iterated Rounding. This, on the other hand, is based on linear algebra and is useful for problems with hard combinatorial constraints or when the constraints have some interesting structure. Here, the rounding proceeds in several iterations t = 1, 2, …, until all variables are rounded to 0 or 1. Let x^t denote the solution at the beginning of iteration t, and let n_t denote the number of fractional variables in x^t (i.e. those strictly between 0 and 1). Then one (cleverly) chooses some collection of linear constraints on these fractional variables, say specified by the rows of a matrix W^t of rank less than n_t, and updates the solution as x^{t+1} = x^t + y^t by some arbitrary non-zero vector y^t satisfying W^t y^t = 0, scaled so that some fractional variable reaches 0 or 1. The process is then iterated with t + 1 in place of t. Note that once a variable reaches 0 or 1 it stays fixed.
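For concreteness, one iteration of this scheme can be sketched as follows (a minimal Python sketch under the description above; the choice of the constraint matrix W is problem-specific and is taken as given here):

    import numpy as np
    from scipy.linalg import null_space

    def iterated_rounding_step(x, W, tol=1e-9):
        # One update: move along a nullspace direction of W (restricted to the
        # fractional coordinates) until some fractional variable reaches 0 or 1.
        alive = (x > tol) & (x < 1 - tol)
        basis = null_space(W[:, alive])      # requires rank of restricted W < #alive
        y = np.zeros_like(x)
        y[alive] = basis[:, 0]
        with np.errstate(divide="ignore"):
            steps = np.concatenate([(1 - x[alive]) / basis[:, 0], -x[alive] / basis[:, 0]])
        gamma = steps[steps > 0].min()       # first time a coordinate hits 0 or 1
        return x + gamma * y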

Despite its simplicity, this method is extremely powerful and most basic results in combinatorial optimization such as the integrality of matroid, matroid-intersection and non-bipartite matching polytopes follow very cleanly using this approach. Similarly, several breakthrough results for problems such as degree-bounded spanning trees, survivable network design and rounding for column-sparse LPs were obtained by this method. An excellent reference is [21].

1.1 Our Results

Motivated by several problems that we describe in Section 1.3, a natural question is whether the strengths of these two seemingly different techniques can be combined to give a more powerful rounding approach.

Our main result is that such an algorithm exists, and we call it sub-isotropic rounding. In particular, it combines iterated and randomized rounding in a completely generic way and significantly extends the scope of previous dependent rounding techniques. Before describing our result, we need some definitions.

Let X_1, …, X_n be independent 0-1 random variables with means E[X_i] = x_i, and let a_1, …, a_n be arbitrary reals (possibly negative); then the sum S = Σ_i a_i X_i is concentrated about its mean μ = Σ_i a_i x_i and satisfies the following tail bound [13],

Pr[ S ≥ μ + t ] ≤ exp( −t² / (2(σ² + Mt/3)) ),     (1.1)

where σ² = Σ_i a_i² x_i(1 − x_i) and M = max_i |a_i|. The lower tail follows by applying the above to −a, and the standard Chernoff bounds correspond to (1.1) when a_i ∈ {0, 1} for all i (details in Section 2).

The following relaxation of Bernstein’s inequality will be extremely relevant for us.

Definition 1.1 (λ-concentration).

Let λ ∈ (0, 1]. For a vector valued random variable X = (X_1, …, X_n), where the X_i are (possibly) dependent 0-1 random variables, we say that X is λ-concentrated if for every a ∈ R^n, the sum S = Σ_i a_i X_i is well-concentrated and satisfies Bernstein’s inequality up to a factor λ in the exponent (with σ² and M as in (1.1), taking x_i = E[X_i]), i.e.

Pr[ |S − E[S]| ≥ t ] ≤ 2 exp( −λ t² / (2(σ² + Mt/3)) ).     (1.2)

Main result. We show that whenever iterated rounding can be applied to a problem so that in any iteration t there is some slack, in the sense that dim(W^t) ≤ (1 − δ)n_t for some δ > 0, then Ω(δ)-concentration can be achieved for free. More precisely, we show the following.

Theorem 1.2.

Let P be a problem for which there is an iterated rounding algorithm A that at iteration t chooses a subspace W^t with dim(W^t) ≤ (1 − δ)n_t, where δ ∈ (0, 1). Then there is a polynomial time randomized algorithm that, given a starting solution x ∈ [0, 1]^n, returns X ∈ {0, 1}^n such that

  • With probability 1, X satisfies all the guarantees of the iterated rounding algorithm A.

  • E[X_i] = x_i for every variable i, and X is λ-concentrated with λ = Ω(δ).

A simple example in Appendix 7.1 shows that the dependence of λ on δ cannot be improved beyond constant factors.

The generality of Theorem 1.2 directly gives new results for several problems where iterated rounding gives useful guarantees. All one needs to show is that the original iterated rounding argument for the problem can be applied with some slack, which is often straightforward and only worsens the approximation guarantee slightly. Before describing these applications in Section 1.3, we discuss some prior work on dependent rounding to place our results and techniques in the proper context.

1.2 Comparison with Dependent Rounding Approaches

Motivated by problems that involve both soft and hard constraints, there has been extensive work on developing dependent rounding techniques that round the fractional solution in some correlated way to satisfy the hard constraints while ensuring some concentration properties. Such problems arise naturally in many ways. For example, the hard constraints might arise from an underlying combinatorial object, such as a spanning tree or matching, that needs to be produced, and the soft constraints may arise due to multiple budget constraints, or when the object to be output is used as input to another problem and needs to satisfy additional properties; see e.g. [14, 15, 17, 3].

Some examples of such methods include swap rounding [14, 15], randomized pipage [2, 31, 17, 18], maximum-entropy sampling [3, 30, 4], rounding via discrepancy [24, 28, 10] and Gaussian random walks [27]. A key idea here is that the weaker property of negative dependence (instead of independence) also suffices to get concentration. There is a rich and deep theory of negative dependence, and various notions such as negative correlation, negative cylinder dependence, negative association, the strongly Rayleigh property and determinantal measures imply interesting concentration properties [26, 12, 16]. This insight has been extremely useful, and for many general problems, such as those involving assignment or matroid polytopes, one can exploit the underlying combinatorial structure to design rounding approaches that ensure negative dependence between all, or some suitable collection of, random variables.

Limitations. Even though very powerful and ingenious, these methods are also limited by the fact that requiring negative dependence can substantially restrict the kinds of rounding that can be designed and the problems they can be applied to. Moreover, even when such a rounding is possible, it typically requires a lot of creativity and careful understanding of the problem structure to come up with the rounding.

Our approach. In contrast, Theorem 1.2 makes no assumption on the structure of the problem and by working with the more relaxed notion of -concentration, we can get rid of the need for negative dependence. Moreover, our algorithm needs no major ingenuity to apply, and minor tweaks to previous iterated rounding algorithms to create some slack suffice.

1.3 Motivating problems and Applications

We now give several applications and briefly discuss why they seem beyond the reach of current dependent rounding methods.

1.3.1 Rounding Column-sparse LPs

Let x ∈ [0, 1]^n be some fractional solution satisfying Ax = b, where A is an m × n matrix. The celebrated Beck-Fiala algorithm [11] gives an integral solution X so that the error ‖A(x − X)‖_∞ is at most Δ, where Δ is the maximum ℓ1-norm of the columns of A. This guarantee is substantially stronger than that given by randomized rounding if Δ is small.

Many problems, however, involve both some column-sparse constraints that come from the underlying combinatorial question, and some arbitrary constraints which might not have much structure. This motivates the following natural question.

The problem. Let there be a linear system with two sets of constraints given by matrices A and B, where A is column-sparse while B is arbitrary. Given some fractional solution x, can we round it to get O(Δ) error for the rows of A, while doing no worse than randomized rounding for B?

Note that simply applying iterated rounding to the rows of A gives no control on the error for B. Similarly, just doing randomized rounding will not give O(Δ) error for A. Also, as A and B can be completely arbitrary, previous negative-dependence-based techniques do not seem to apply.

Solution. We show that a direct modification of the Beck-Fiala argument gives slack δ, for any δ ∈ (0, 1), while worsening the error bound slightly to Δ/(1 − δ). Setting, say, δ = 1/2 and applying Theorem 1.2 gives a solution X that (i) has error at most 2Δ for the rows of A, and (ii) satisfies E[X] = x and is Ω(1)-concentrated, thus giving similar guarantees as randomized rounding for the rows of B. In fact, the solution X produced by the algorithm will satisfy concentration for all linear constraints and not just for the rows of B.

Komlós Setting. We also describe an extension to the so-called Komlós setting, where the error depends on the maximum ℓ2-norm of the columns of A.

These results are described in Section 5.1.

1.3.2 Makespan minimization on unrelated machines

The classic makespan minimization problem on unrelated machines is the following. Given n jobs and m machines, where each job j has an arbitrary size p_{ij} on machine i, assign the jobs to machines so as to minimize the maximum machine load. In a celebrated result, [22] gave a rounding method with additive error at most p_max, the size of the largest job. In many practical settings, however, there are other soft resource constraints and side constraints that are added to the fractional formulation. So it is useful to find a rounding that satisfies these approximately but still violates the main load constraint by only O(p_max). This motivates the following natural problem.

The Problem. Given a fractional assignment x, find an integral assignment X with additive error O(p_max) that satisfies E[X] = x and concentration for all linear functions of X.
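For reference, the fractional assignment here can be taken to be a feasible point of the standard assignment LP for a target load T (the usual formulation underlying [22]; the notation below is ours):

    Σ_i x_{ij} = 1 for every job j,     Σ_j p_{ij} x_{ij} ≤ T for every machine i,
    x_{ij} ≥ 0,     and x_{ij} = 0 whenever p_{ij} > T.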

Questions related to finding a good assignment with some concentration properties have been studied before [17, 2, 15], and several methods such as randomized pipage and swap rounding have been developed for this. However, these methods crucially rely on the underlying matching structure and round the variables alternately along cycles, which limits them in various ways: either they give partial assignments, or only get concentration for edges incident to a vertex.

Solution. We show that the iterated rounding proof of the result of [22] can be easily modified to work with any slack δ while giving additive error O(p_max). Theorem 1.2 (say with δ = 1/2) thus gives a solution with additive error O(p_max) that satisfies Ω(1)-concentration.

The result also extends naturally to the multi-resource setting, where each job size p_{ij} is a d-dimensional vector. These results are described in Section 5.2.

1.3.3 Degree-bounded Spanning Trees and Thin Trees

In the minimum cost degree-bounded spanning tree problem, we are given an undirected graph G = (V, E) with edge costs c_e for e ∈ E and integer degree bounds b_v for v ∈ V, and the goal is to find a minimum cost spanning tree satisfying the degree bounds. In a breakthrough result, Singh and Lau [29] gave an iterated rounding algorithm that, given any fractional spanning tree z, finds a spanning tree with cost at most that of z and degree violation at most one (i.e. the degree of each vertex v is at most b_v + 1).
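For reference, the fractional spanning trees referred to here are points of the standard LP relaxation (the spanning tree polytope intersected with the degree constraints; the notation below is ours):

    Σ_{e ∈ E} z_e = |V| − 1,     Σ_{e ∈ E(S)} z_e ≤ |S| − 1 for every nonempty S ⊊ V,
    Σ_{e incident to v} z_e ≤ b_v for every v ∈ V,     z ≥ 0,

where E(S) denotes the edges with both endpoints in S.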

The celebrated thin-tree conjecture (details in Appendix 7.2) asks if, given a fractional spanning tree z, there is a spanning tree T satisfying T(S) ≤ α · z(S) for every cut S ⊆ V, where α = O(1). Here T(S) is the number of edges of T crossing S, and z(S) is the z-value crossing S.

The result of [29] implies that the degree of every vertex in the tree exceeds its fractional degree under z by at most an additive constant. However, despite remarkable progress [1], the best known algorithmic results for the thin-tree problem give α = O(log n / log log n) [3, 14, 30, 18]. This motivates the following natural question as a first step towards the thin-tree conjecture.

The Problem. Can we find a spanning tree that is O(1)-thin for single vertex cuts and O(log n / log log n)-thin for general cuts?

The current algorithmic methods for thin trees crucially rely on the negative dependence properties of spanning trees, which do not give anything better for single vertex cuts (e.g. even if every vertex has fractional degree O(1) under z, by a balls-and-bins argument a random tree will have maximum degree Ω(log n / log log n)). On the other hand, if we only care about single vertex cuts, the methods of [29] do not give anything for general cuts.

Solution. We show that the iterated rounding algorithm of [29] can be easily modified to create slack δ while violating the degree bounds by at most an additive constant. Applying Theorem 1.2 with δ = 1/2 thus gives a distribution supported on trees with constant additive degree violation that also has concentration. By a standard cut counting argument [3], the concentration property implies O(log n / log log n)-thinness for every cut.

We describe these results in Section 5.3, where in fact we consider the more general setting of the minimum cost degree-bounded matroid basis problem.

1.3.4 Multi-budgeted bipartite matchings

In the above examples, it was relatively easy to create slack, since the number of hard combinatorial constraints was bounded away from n_t by a constant factor, and the slack was only introduced in the soft constraints (e.g. machine loads or vertex degrees) while worsening the approximation slightly.

As a different type of illustrative example, we now consider the perfect matching problem in bipartite graphs. Here, and more generally in matroid intersection, one needs to maintain tight rank constraints for two matroids, which typically requires about as many linearly independent constraints as there are elements, and it is not immediately clear how to introduce slack.

Problem. Let G = (U ∪ V, E) be a bipartite graph with |U| = |V| = n, and suppose we are given a fractional perfect matching x, defined by x_e ≥ 0 for all e ∈ E and Σ_{e incident to v} x_e = 1 for all vertices v. Can we round it to a perfect or almost perfect matching while satisfying λ-concentration?

Building on the work of [2], [15] designed a beautiful randomized swap rounding procedure that, for any ε > 0, finds an almost perfect matching in which each vertex is matched with probability at least 1 − ε, and which satisfies concentration. They also extend this result to non-bipartite matchings and matroid intersection.

We give an alternate proof of this result using our framework. Our proof is quite different from that in [15] and is more in the spirit of iterated rounding, where we carefully choose the set of constraints to maintain as the rounding proceeds. This is described in Section 5.4.

1.4 Overview of Techniques

We now give a high level overview of our algorithm and analysis. The starting observation is that randomized rounding can be viewed as an iterative algorithm, by doing a standard Brownian motion on the cube [0, 1]^n as follows. Given x as the starting fractional solution, consider a random walk in the cube starting at x, with tiny step size chosen independently for each coordinate, where upon reaching a face of the cube (i.e. some coordinate reaches 0 or 1) the walk stays on that face. The process stops upon reaching some vertex X of the cube. By the martingale property of random walks, the probability that X_i = 1 is exactly x_i, and as the walk in each coordinate is independent, X has the same distribution as under randomized rounding.
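A small simulation of this walk (our own sketch in Python, with a finite step size standing in for the Brownian motion) makes the martingale property concrete: averaged over many runs, the rounded coordinates match x up to O(step) discretization error.

    import numpy as np

    def cube_walk_rounding(x, step=1e-2, rng=None):
        # Independent +-step random walk in [0,1]^n started at x; a coordinate is
        # frozen once it reaches 0 or 1. Returns the resulting 0-1 vertex.
        rng = np.random.default_rng() if rng is None else rng
        x = np.array(x, dtype=float)
        alive = (x > 0) & (x < 1)
        while alive.any():
            x[alive] += step * rng.choice([-1.0, 1.0], size=int(alive.sum()))
            x = np.clip(x, 0.0, 1.0)
            alive = (x > 0) & (x < 1)
        return x.astype(int)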

Now consider iterated rounding, and recall that here the update at step t must lie in the nullspace of W^t. So a natural idea is to do a random walk in the nullspace of W^t until some variable reaches 0 or 1. The slack condition implies that this nullspace has dimension at least δ·n_t, which could potentially give “enough randomness” to the random walk.

It turns out however that doing a standard random walk in the nullspace of W^t does not work. The problem is that as the constraints defining W^t can be completely arbitrary in our setting, the random walk can lead to very high correlations between certain subsets of coordinates, causing the λ-concentration property to fail. For example, suppose δ = 1/2 and W^t consists of the constraints x_1 = x_2 = ⋯ = x_{n/2}. Then the random walk will update the coordinates x_{n/2+1}, …, x_n independently, but for x_1, …, x_{n/2} the update will be completely correlated, and the linear function x_1 + ⋯ + x_{n/2} will have very bad concentration.
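To see the failure quantitatively, suppose for illustration that these first n/2 coordinates all start at a common value p. Their updates are always identical, so they stay equal throughout, the final values satisfy X_1 = ⋯ = X_{n/2}, and

    Σ_{i ≤ n/2} (X_i − x_i) = (n/2) (X_1 − p),

whose standard deviation is (n/2) √(p(1 − p)), a factor √(n/2) larger than the √((n/2) p(1 − p)) obtained under independent randomized rounding. Hence no Bernstein-type bound with a dimension-independent constant in the exponent can hold for this sum.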

To get around this problem, we design a different random walk that looks almost like an independent walk in every direction. More formally, call a random vector Y = (Y_1, …, Y_n) β-almost independent if for every θ ∈ R^n,

E[ ⟨θ, Y⟩² ] ≤ β · Σ_i θ_i² E[Y_i²].

If the Y_i are independent, note that the above holds as equality with β = 1, and hence this can be viewed as a relaxation of independence. We show that whenever dim(W^t) ≤ (1 − δ)n_t, there exist β-almost independent random updates in the nullspace of W^t with β depending only on δ. Moreover these updates can be found by solving a semidefinite program (SDP).
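The precise SDP is described in Section 2; the following cvxpy sketch (our own guess at its general shape, not the exact program from the paper) illustrates the idea: find a covariance matrix supported on the nullspace of W that is β-almost independent, while pushing every coordinate variance up.

    import cvxpy as cp
    import numpy as np

    def almost_independent_covariance(W, beta):
        # Find S >= 0 with (i) W S = 0, so updates stay in the nullspace of W,
        # (ii) S <= beta * diag(S), the beta-almost independence condition,
        # (iii) coordinate variances normalized and as large as possible.
        n = W.shape[1]
        S = cp.Variable((n, n), PSD=True)
        d = cp.diag(S)
        constraints = [W @ S == 0, S << beta * cp.diag(d), d <= 1]
        cp.Problem(cp.Maximize(cp.min(d)), constraints).solve()
        return S.value

    # A sample update is then Y = scipy.linalg.sqrtm(S) @ g for a standard Gaussian g,
    # which satisfies W @ Y = 0 and E[<theta, Y>^2] <= beta * sum_i theta_i^2 E[Y_i^2].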

Next, using a variant of Freedman’s martingale analysis, we show that applying these almost independent random updates until all the variables reach 0-1 values gives an integral solution with λ-concentration.

These techniques are motivated by our recent works on algorithmic discrepancy [7, 8]. While discrepancy is closely related to rounding [23, 28], a key difference is that in discrepancy the error for rounding a linear system depends on the norms of the coefficients of the constraints and not on the fractional values x_i. E.g. if the x_i are all very close to 0 or 1, a linear sum stays very close to its mean upon randomized rounding with high probability, while using discrepancy methods directly gives an error depending only on the coefficients. So our results can be viewed as using techniques from discrepancy to obtain bounds that are sensitive to the fractional values x_i. Recently, this direction was explored in [10], but their method gave much weaker results and applied to very limited settings.

2 Technical Preliminaries

2.1 Probabilistic tail bounds and Martingales

The standard Bernstein probabilistic tail bound for independent random variables is the following.

Theorem 2.1.

(Bernstein’s inequality.) Let Z_1, …, Z_n be independent random variables with |Z_i − E[Z_i]| ≤ M for all i. Let S = Σ_i Z_i, μ = E[S] and σ² = Σ_i Var(Z_i). Then, for all t ≥ 0 we have

Pr[ S ≥ μ + t ] ≤ exp( −t² / (2(σ² + Mt/3)) ).

The lower tail follows by applying the above to −S, so we only consider the upper tail. As we will be interested in bounding Σ_i a_i X_i, where the random variables X_i are 0-1 and the a_i are arbitrary reals (possibly negative), we will use the form given by (1.1).

The well-known Chernoff bounds correspond to the special case of (1.1) when a_i ∈ {0, 1} for all i. In particular, setting μ = Σ_i a_i x_i, and using that σ² ≤ μ and M ≤ 1 in (1.1), we get

Pr[ Σ_i a_i X_i ≥ μ + t ] ≤ exp( −t² / (2(μ + t/3)) ).     (2.3)
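For intuition, two standard consequences of (2.3), obtained by bounding the denominator 2(μ + t/3) in the two regimes (the explicit constants here are ours and not optimized):

    Pr[ Σ_i a_i X_i ≥ μ + t ] ≤ exp( −3t²/(8μ) ) for 0 ≤ t ≤ μ,     and     Pr[ Σ_i a_i X_i ≥ μ + t ] ≤ exp( −3t/8 ) for t ≥ μ.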

Remark: For t ≫ μ, the bound (2.3) can be improved slightly to exp(−Ω(t log(t/μ))) by optimizing the choice of parameters in the proof. In this regime, the analogous version of Theorem 2.1 is called Bennett’s inequality ([13]), and similar calculations also give such a variant of Theorem 1.2. As this is relatively standard, we do not discuss it here.

We will use the following Freedman-type martingale inequality.

Lemma 2.2.

Let be a sequence of random variables with , such that is deterministic and for all . If for all

where denotes . Then for all and , it holds that

Before proving Lemma 2.2, we first give a simple lemma.

Lemma 2.3.

If and , .

Proof.

Let , where we set . It can be verified that is increasing for all , which implies that for any , Taking expectations and using that for all this gives,

Proof.

(Lemma 2.2) By Markov’s inequality,

so it suffices to show that . As is deterministic, this is same as . Now,

As this holds for all , using that gives the result. ∎

2.2 Semidefinite Matrices

Let S^n denote the class of all symmetric n × n matrices with real entries. For two matrices A, B ∈ S^n, the trace inner product of A and B is defined as ⟨A, B⟩ = Tr(AB) = Σ_{i,j} A_{ij} B_{ij}. A matrix A ∈ S^n is positive semidefinite (psd) if all its eigenvalues are non-negative, and we denote this by A ⪰ 0. Equivalently, A ⪰ 0 iff vᵀ A v ≥ 0 for all v ∈ R^n.

For A ⪰ 0, let A^{1/2} = Σ_i λ_i^{1/2} v_i v_iᵀ, where A = Σ_i λ_i v_i v_iᵀ is the spectral decomposition of A with orthonormal eigenvectors v_i. Then A^{1/2} is psd and A^{1/2} A^{1/2} = A. For A, B ∈ S^n, we say that A ⪯ B if B − A ⪰ 0.

2.3 Approximate independence and sub-isotropic random variables

Let Y = (Y_1, …, Y_n) be a random vector with possibly dependent coordinates.

Definition 2.4 ((η, β) sub-isotropic random vector).

For η ∈ (0, 1] and β ≥ 1, we say that Y is (η, β) sub-isotropic if it satisfies the following conditions.

  1. E[Y_i] = 0 for all i, and E[Y_i²] ≥ η for all i.

  2. For all θ ∈ R^n it holds that

    E[ ⟨θ, Y⟩² ] ≤ β · Σ_i θ_i² E[Y_i²].     (2.4)

Note that if the Y_i are independent then (2.4) holds with equality for β = 1.

Let Σ be the covariance matrix of Y. That is, Σ_{ij} = E[Y_i Y_j]. Every covariance matrix is psd, as θᵀ Σ θ = E[⟨θ, Y⟩²] ≥ 0 for all θ. Let D denote the diagonal matrix with entries D_{ii} = Σ_{ii}; then (2.4) can be written as θᵀ Σ θ ≤ β · θᵀ D θ for every θ, and hence equivalently expressed as Σ ⪯ β D.

Some examples. If each Y_i is an independent uniform ±1 random variable then Y is sub-isotropic. The same holds if each Y_i is an independent standard Gaussian. If Y is a random vector in some k-dimensional subspace H of R^n (i.e. Y = Σ_j g_j v_j, where v_1, …, v_k is an orthonormal basis for H and the g_j are independent standard Gaussians) then Y is isotropic. If Y is Gaussian with covariance matrix Σ whose diagonal entries are all 1, then Y is sub-isotropic with β equal to the maximum eigenvalue of Σ.

We will need the following result from [8], about finding sub-isotropic random vectors orthogonal to a subspace.

Theorem 2.5 ([8]).

Let W ⊆ R^n be a subspace with dimension at most (1 − δ)n. Then for any η and β satisfying a suitable trade-off in terms of δ, there exists an (η, β) sub-isotropic random vector Y such that Y is orthogonal to W with probability 1. Moreover, Y can be sampled in polynomial time by solving an SDP.

2.4 Formal Description of Iterated Rounding

By iterated rounding we refer to any procedure that works as follows. Let x be the starting fractional solution. We set x^1 = x, and round it to a 0-1 solution by applying a sequence of updates as follows. Let x^t denote the solution at the beginning of iteration t. We say that a variable i is frozen if x^t_i is equal to 0 or 1, and otherwise it is alive. Frozen variables are never updated again. Let n_t denote the number of alive variables.

Based on x^t, the algorithm picks a set of constraints of rank at most n_t − 1, given by the rows of some matrix W^t. It finds an (arbitrary) non-zero direction y^t such that W^t y^t = 0 and y^t_i = 0 if variable i is frozen. The solution is updated as x^{t+1} = x^t + y^t.

In typical applications of iterated rounding, W^{t+1} is obtained by dropping one or more rows of W^t, and x^{t+1} is obtained in a black-box way by solving the LP given by the constraints W^t x = W^t x^t, restricted to the alive variables, and taking a basic feasible solution. As rank(W^t) < n_t, at least one more variable reaches 0 or 1 and hence the algorithm terminates in at most n steps. However, as we will not work with basic feasible solutions, we will view the process of generating the update as described above.

3 Rounding Algorithm

We assume that the problem to be solved has an iterated rounding procedure, as discussed in Section 2.4, that in any iteration t specifies some subspace W^t with dim(W^t) ≤ (1 − δ)n_t, and the update y^t must satisfy W^t y^t = 0. We now describe the rounding algorithm.

Algorithm. Initialize x^1 = x, where x is the starting fractional solution given as input. For each iteration t = 1, 2, … repeat the following until all variables reach 0 or 1.

Iteration t. Let x^t be the current solution and let A_t be the set of fractional variables in x^t. As the variables not in A_t do not change anymore, we assume for ease of notation that every coordinate of x^t is fractional (i.e. A_t = {1, …, n_t}).

  1. Apply Theorem 2.5, with W = W^t and the parameters η and β, to find the covariance matrix Σ^t.

  2. Let g^t be a random vector with independent entries. Set

    x^{t+1} = x^t + γ_t (Σ^t)^{1/2} g^t,

    where γ_t is the largest number such that x^{t+1} ∈ [0, 1]^n.
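Putting the pieces together, the loop can be sketched as follows (a schematic Python sketch under the description above; choose_constraints is the problem-specific iterated rounding oracle, which by the slack condition must return constraints of rank at most (1 − δ) times the number of alive variables, and sample_covariance stands in for the SDP of Theorem 2.5; both are hypothetical placeholders):

    import numpy as np
    from scipy.linalg import sqrtm

    def sub_isotropic_rounding(x, choose_constraints, sample_covariance, tol=1e-9, rng=None):
        # At each iteration, take a random step Sigma^{1/2} g in the nullspace of the
        # current constraints, scaled by the largest gamma keeping x inside [0,1]^n.
        rng = np.random.default_rng() if rng is None else rng
        x = np.array(x, dtype=float)
        while True:
            alive = (x > tol) & (x < 1 - tol)
            if not alive.any():
                break
            W = choose_constraints(x, alive)
            Sigma = sample_covariance(W, alive)
            y = np.real(sqrtm(Sigma)) @ rng.standard_normal(x.shape[0])
            y[~alive] = 0.0
            pos, neg = y > tol, y < -tol
            caps = np.concatenate([(1 - x[pos]) / y[pos], -x[neg] / y[neg]])
            if caps.size == 0:
                break
            x = np.clip(x + caps.min() * y, 0.0, 1.0)
        return np.rint(x).astype(int)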

4 Algorithm Analysis

Let X denote the final solution. The property that E[X_i] = x_i follows directly, as the update at each time step has mean zero in each coordinate. As the algorithm always moves in the nullspace of W^t, clearly it will also satisfy the iterated rounding guarantee with probability 1.

To analyze the running time, we note that whenever , there is at least probability that some new variable will reach or after the update by . So, in expectation there are at most such steps. So we focus on the iterations where . During these iterations the energy rises in expectation by at least , as,

By a standard argument [6], this implies that the algorithm terminates in time with constant probability. One can also make the energy increase more deterministic by adding an extra constraint that be orthogonal to . This only adds one to the dimension of , which can be subsumed by making slightly smaller as , as long as . When becomes smaller, one can revert to the analysis above. This gives an improved running time of .

It remains to show the -concentration property, which we do next.

4.1 Isotropic updates imply concentration

Let X be the final rounded solution, and fix some linear function given by coefficients a ∈ R^n. We will show that ⟨a, X⟩ satisfies the Bernstein-type tail bound (1.2) for a suitable λ.

Proof.

By scaling of ’s and , we can assume that . Let us define the random variable

where will be optimized later.

Initially, where and .

As , a simple calculation gives that

(4.5)

We now show that satisfies the conditions of Lemma 2.2 for a suitable .

Claim 4.1.

Proof.

As for all , taking expectations in (4.5) gives that

(4.6)

We now upper bound . Using and the expression for in (4.5)

where we use that , as and .

As is sub-isotropic, and , and we can bound as

where we use that for all as , and the expression for in (4.6).

As , this gives that

By Claim 4.1, we can apply Lemma 2.2 with , provided the conditions for Lemma 2.2 are satisfied. Now, as and . To show that we argue as follows. By (4.5) and as (as ), and , we have As , the columns of have length at most , and thus , and thus

By Lemma 2.2, this gives that , or equivalently

(4.7)

Let (note this satisfies our assumption that ). Then and (4.7) gives Setting and the values of gives

5 Applications

5.1 Rounding Column-Sparse LP

Let x ∈ [0, 1]^n be a fractional solution satisfying Ax = b, where A is an arbitrary m × n matrix. Let Δ be the maximum ℓ1-norm of the columns of A. Beck and Fiala [11] gave a rounding method to find X ∈ {0, 1}^n so that the error |⟨a_i, x − X⟩| is at most Δ for every row a_i of A.

Beck-Fiala Rounding. We first recall the iterated rounding algorithm of [11]. Initially x^1 = x. Consider some iteration t, and let A^t denote the matrix A restricted to the alive coordinates. Call a row big if its ℓ1-norm in A^t is strictly more than Δ. The number of big rows is strictly less than n_t, as each column has ℓ1-norm at most Δ and thus the total ℓ1-norm of all entries of A^t is at most Δ·n_t. So the algorithm sets W^t to be the big rows of A^t, and applies the iterated rounding update.

We now analyze the error. Fix some row i. As long as row i is big, its rounding error is zero during the update steps. But when it is no longer big, no matter how the remaining alive variables are rounded in subsequent iterations, the error incurred can be at most its remaining ℓ1-norm, which is at most Δ.

Introducing Slack. To apply Theorem 1.2, we can easily introduce δ-slack for any δ ∈ (0, 1), as follows. In iteration t, call a row big if its ℓ1-norm in A^t exceeds Δ/(1 − δ); by the argument above the number of big rows is then strictly less than (1 − δ)n_t. This gives the following result.
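As a small Python sketch (ours) of this modified selection rule, using the threshold Δ/(1 − δ) described above:

    import numpy as np

    def big_rows_with_slack(A, alive, delta):
        # Rows of A whose l1-norm on the alive columns exceeds Delta/(1 - delta),
        # where Delta is the maximum column l1-norm; the counting argument above
        # shows there are strictly fewer than (1 - delta) * n_t of them.
        Delta = np.abs(A).sum(axis=0).max()
        row_norms = np.abs(A[:, alive]).sum(axis=1)
        return np.where(row_norms > Delta / (1.0 - delta))[0]

    # These rows form W^t for the iterated rounding update; every other row can
    # drift, but by at most Delta/(1 - delta) in total.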

Theorem 5.1.

Given a matrix A with maximum ℓ1-norm of any column at most Δ, then for any δ ∈ (0, 1) the algorithm returns X such that ‖A(x − X)‖_∞ ≤ Δ/(1 − δ), E[X] = x, and X satisfies Ω(δ)-concentration.

This implies the following useful corollary.

Corollary 5.2.

Given a matrix B with some collection of rows A such that the columns restricted to A have ℓ1-norm at most Δ, then, say, setting δ = 1/2, the rounding ensures error at most 2Δ for the rows of A, while the error for the other rows of B is similar to that for randomized rounding.

Komlós Setting.

For a matrix A, let Δ_2 denote the maximum ℓ2-norm of the columns of A. Note that Δ_2 ≤ Δ (and it can be much smaller, e.g. if A is a 0-1 matrix then Δ_2 = √Δ).

The long-standing Komlós conjecture (together with a connection between hereditary discrepancy and rounding due to [23]) states that any x ∈ [0, 1]^n can be rounded to X ∈ {0, 1}^n so that ‖A(x − X)‖_∞ = O(Δ_2). Currently, the best known bound for this problem is O(Δ_2 √(log n)) [5, 7].

An argument similar to that for Theorem 5.1 gives the following result in this setting.

Theorem 5.3.

If A has maximum column ℓ2-norm Δ_2, then given any δ ∈ (0, 1), the algorithm returns X such that the error for every row of A is O(Δ_2 √(log n)), where X also satisfies Ω(δ)-concentration.

Proof.

We will apply Theorem 1.2 with . During any iteration , call row