The set $b$-multicover problem is a fundamental covering problem that has been widely explored in optimization theory. A convenient formulation of this problem is given in terms of hypergraphs, which offer tools for dealing with set systems.
A hypergraph is a pair $\mathcal{H} = (V, \mathcal{E})$, where $V$ is a finite set and $\mathcal{E}$ is a family of subsets of $V$. We call the elements of $V$ vertices and the elements of $\mathcal{E}$ (hyper-)edges. Further let $n := |V|$, $m := |\mathcal{E}|$. W.l.o.g. let the elements of $V$ be enumerated as $v_1, \dots, v_n$. Let $\ell$ be the maximum edge size, $\bar{\ell}$ the average edge size, and let $\Delta$ be the maximum vertex degree, where the degree of a vertex is the number of edges containing that vertex. If $|E| = \ell$ for every $E \in \mathcal{E}$, then the hypergraph is called uniform.
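To make these parameters concrete, the following minimal Python sketch computes the maximum edge size, the average edge size, the maximum vertex degree and a uniformity flag for a small hypergraph. The function name `hypergraph_parameters` and the example instance are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction


def hypergraph_parameters(vertices, edges):
    """Return (max edge size, average edge size, max vertex degree, is_uniform)."""
    sizes = [len(e) for e in edges]
    max_edge_size = max(sizes)
    # Exact rational average, to avoid floating-point noise.
    avg_edge_size = Fraction(sum(sizes), len(edges))
    # Degree of a vertex = number of edges containing it.
    degree = {v: sum(1 for e in edges if v in e) for v in vertices}
    max_degree = max(degree.values())
    # Uniform: every edge has the same (maximum) size.
    is_uniform = all(s == max_edge_size for s in sizes)
    return max_edge_size, avg_edge_size, max_degree, is_uniform


# Hypothetical toy instance: 4 vertices, 4 hyperedges.
vertices = [1, 2, 3, 4]
edges = [{1, 2}, {2, 3}, {3, 4}, {1, 2, 3}]
params = hypergraph_parameters(vertices, edges)
```

In this toy instance one edge has three vertices and the others have two, so the hypergraph is not uniform.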
Let $b = (b_1, \dots, b_n) \in \mathbb{N}^n$ be given. If a vertex $v_i$, $i \in [n]$, is contained in at least $b_i$ edges of some subset $C \subseteq \mathcal{E}$, we say that the vertex $v_i$ is fully covered by edges in $C$. A set $b$-multicover in $\mathcal{H}$ is a set of edges $C \subseteq \mathcal{E}$ such that every vertex in $V$ is fully covered by edges in $C$. The SET $b$-MULTICOVER problem is the task of finding a set $b$-multicover of minimum cardinality. Note that the usual set cover problem, which is known to be NP-hard , is a special case with $b_i = 1$ for all $i \in [n]$. Furthermore, Peleg, Schechtman and Wool conjectured that for any fixed and the problem cannot be approximated by a ratio smaller than unless $\mathcal{P} = \mathcal{NP}$. Hence it remained an open problem whether an approximation ratio of with constant can be proved. We say that an algorithm $\mathcal{A}$ is an approximation algorithm for the problem with performance ratio $\alpha$ if, for each instance $I$ of size $n$ of the problem, $\mathcal{A}$ runs in time polynomial in $n$ and returns a value $\mathcal{A}(I)$ such that $\mathcal{A}(I) \le \alpha \cdot \mathrm{Opt}$, where $\mathrm{Opt}$ is the cardinality of a minimum set multicover.
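The definition of a set multicover can be checked mechanically. The sketch below is a minimal illustration, assuming edges are given as Python sets and the demand vector as a dictionary from vertices to integers; the names `is_multicover` and `minimum_multicover` are hypothetical, and the exhaustive search is exponential and only meant for tiny instances.

```python
from itertools import combinations


def is_multicover(edges, b, cover):
    """True if every vertex v lies in at least b[v] of the chosen edges.

    `cover` is an iterable of edge indices, `b` maps vertex -> demand.
    """
    return all(
        sum(1 for j in cover if v in edges[j]) >= demand
        for v, demand in b.items()
    )


def minimum_multicover(edges, b):
    """Exhaustive search for a minimum-cardinality multicover (tiny inputs only)."""
    for size in range(len(edges) + 1):
        for cover in combinations(range(len(edges)), size):
            if is_multicover(edges, b, cover):
                return set(cover)
    return None  # no feasible multicover exists


# Hypothetical toy instance: vertex 1 must be covered twice.
edges = [{1, 2}, {2, 3}, {1, 3}, {1, 2, 3}]
b = {1: 2, 2: 1, 3: 1}
best = minimum_multicover(edges, b)
```

Setting every demand to 1 turns the same check into the ordinary set cover condition.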
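The problem can also be formulated as an integer linear program as follows:
$$\min \sum_{j=1}^{m} x_j \quad \text{subject to} \quad A x \ge b, \quad x \in \{0,1\}^m,$$
where $A = (a_{ij}) \in \{0,1\}^{n \times m}$ is the vertex-edge incidence matrix of $\mathcal{H}$ and $b = (b_1, \dots, b_n)$ is the given integer vector.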
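The linear programming relaxation LP() of ILP() is obtained by relaxing the integrality constraints to $x_j \in [0,1]$ for all $j \in [m]$. Let $x^*$ be an optimal solution of LP(); then $\mathrm{Opt}^* = \sum_{j=1}^{m} x_j^*$ is the value of an optimal solution to LP(), and we have $\mathrm{Opt}^* \le \mathrm{Opt}$, where $\mathrm{Opt}$ is the optimal value of ILP().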
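The relation between the relaxation and the integer program can be illustrated on a tiny instance without an LP solver. The sketch below assumes a hypothetical triangle instance with all demands equal to 1: it verifies that a hand-picked fractional vector satisfies the covering constraints and compares its value with the brute-force integer optimum. The fractional vector is only a feasible point, so its value is an upper bound on the LP optimum; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product


def is_feasible(edges, b, x):
    """Check the covering constraints A x >= b, with A the vertex-edge incidence matrix."""
    return all(
        sum(x[j] for j, e in enumerate(edges) if v in e) >= demand
        for v, demand in b.items()
    )


def integer_optimum(edges, b):
    """Brute-force optimum of the 0/1 program (assumes a feasible instance)."""
    feasible = (
        x for x in product((0, 1), repeat=len(edges)) if is_feasible(edges, b, x)
    )
    return min(sum(x) for x in feasible)


# Hypothetical instance: the triangle, every vertex must be covered once.
edges = [{1, 2}, {2, 3}, {1, 3}]
b = {1: 1, 2: 1, 3: 1}

x_frac = [Fraction(1, 2)] * 3      # feasible fractional point
frac_value = sum(x_frac)           # objective value of that point
opt = integer_optimum(edges, b)    # optimum of the integer program
```

Here the fractional point is strictly cheaper than any integral cover, which is the gap that rounding schemes have to pay for.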
Related Work. The set cover problem has been intensively explored for decades. Several deterministic approximation algorithms have been presented for this problem [1, 8, 10, 14], all with approximation ratios . On the other hand, Khot and Regev proved that the problem cannot be approximated within a factor of , assuming that the unique games conjecture is true. Furthermore, Johnson and Lovász gave a greedy algorithm with performance ratio , where is the harmonic number. Notice that . For hypergraphs with bounded , Duh and Fürer used a technique called semi-local optimization, improving to . In contrast to the set cover problem, less is known for the case . Let us give a brief summary of the known approximability results. Vazirani, using the dual fitting method, extended the result of Lovász for . Later Fujito et al. improved the algorithm of Vazirani and achieved an approximation ratio of for bounded. Hall and Hochbaum achieved, by a greedy algorithm based on LP duality, an approximation ratio of . By a deterministic threshold algorithm, Peleg, Schechtman and Wool in 1997 [19, 20] improved this result and gave an approximation ratio . They were also the first to propose an approximation algorithm for the problem with approximation ratio below , namely a randomized rounding algorithm with performance ratio for a small constant . However, their ratio depends on and asymptotically tends to . A randomized algorithm of hybrid type was later given by Srivastav et al. Their algorithm achieves for hypergraphs with an approximation ratio of
with constant probability.
Our Results. The main contribution of our paper is the combination of a deterministic threshold-based algorithm with conditioned randomized rounding steps. The idea is to algorithmically discard instances that can be handled deterministically in favor of instances for which we obtain a constant factor approximation less than using a randomized strategy.
In the following we give a brief overview of the method. First we give some fundamental results based on the LP relaxation with threshold that allow us to come up with an approximation ratio strictly less than , and we use these results for the first algorithm. This is an extension of an algorithm by Hochbaum for the set cover problem and the vertex cover problem.
Let be the optimal solution of the LP(). We define , and . It follows that and is a feasible set multicover.
Our first algorithm is designed as a cascade of a deterministic and a randomized rounding step followed by greedy repairing. The threshold-type algorithm first solves the relaxed problem LP() and then adds all edges corresponding to variables with fractional values at least to the output set. Depending on the cardinalities of the sets and , we use LP-rounding with randomization for the hyperedges of the set . Every edge of is independently added to the output set with probability . To guarantee feasibility, we proceed with a repairing step. Our algorithm is an extension of an example given in [4, 5, 6, 8, 9, 20] for the vertex cover, partial vertex cover and problem in graphs and hypergraphs.
The methods used in this paper rely on an application of the Chernoff–Hoeffding bound technique for sums of independent random variables and are based on estimating the variance of the summed random variables for invoking the Chebychev-Cantelli inequality.
We give a detailed analysis of the first algorithm in which we explore the cases by comparing the cardinality of the two sets and and the relative cardinality of with respect to . Our algorithm yields a performance ratio of . This ratio is a constant factor less than for many settings of the parameters , and . Further, it is asymptotically better than the former approximation ratios due to Peleg et al. and Srivastav et al. Furthermore, we consider the problem in hypergraphs with and do not assume that and are constants. We give a polynomial-time approximation algorithm with an approximation ratio of for any fixed . The main progress is that our approximation ratio is at most . Hence we disprove the conjecture of Peleg et al. for a large and important class of hypergraphs. Note that uniform hypergraphs fulfill the condition .
[Table: Fundamental results and approximations for the problem; the last row lists the ratio obtained in this paper.]
Outline of the paper. In Section 2 we give all the definitions and tools needed for the analysis. In Section 3 we present a randomized algorithm of hybrid type and its analysis. In Section 4 we give a deterministic algorithm based on matching/covering duality and its analysis. Finally, we sketch some open questions.
2 Definitions and preliminaries
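Let $\mathcal{H} = (V, \mathcal{E})$ be a hypergraph, where $V$ and $\mathcal{E}$ are the sets of vertices and hyperedges, respectively. For every vertex $v \in V$ we define the vertex degree of $v$ as $d(v) := |\{E \in \mathcal{E} : v \in E\}|$ and denote by $\Gamma(v)$ the set of edges incident to $v$. The maximum vertex degree is $\Delta := \max_{v \in V} d(v)$. Let $\ell$ denote the maximum cardinality of a hyperedge from $\mathcal{E}$. It is convenient to order the vertices and edges, i.e., $V = \{v_1, \dots, v_n\}$ and $\mathcal{E} = \{E_1, \dots, E_m\}$, and to identify the vertices and edges with their indices.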
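Let $\mathcal{H} = (V, \mathcal{E})$ be a hypergraph and $b = (b_1, \dots, b_n) \in \mathbb{N}^n$. We call $C \subseteq \mathcal{E}$ a set $b$-multicover if every vertex $v_i \in V$ is contained in at least $b_i$ hyperedges of $C$. SET $b$-MULTICOVER is the problem of finding a set $b$-multicover of minimum cardinality.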
For the later analysis we will use the following Chernoff–Hoeffding bound for a sum of independent random variables:
Theorem 2.1 (see )
Let be independent -random variables. Let . For every we have
A further useful concentration result is the Chebychev-Cantelli inequality:
Theorem 2.2 (see , page 64)
Let be a non-negative random variable with finite mean and variance Var. Then for any it holds that
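As a numerical sanity check of the two concentration tools, the sketch below compares exact binomial tail probabilities with two standard bounds. The concrete forms used here are assumptions about how the theorems are stated: the lower-tail Chernoff bound $\Pr[X \le (1-\beta)\mathbb{E}(X)] \le \exp(-\beta^2 \mathbb{E}(X)/2)$ for $0 < \beta \le 1$, and the one-sided Cantelli bound $\Pr[X \ge \mathbb{E}(X) + a] \le \mathrm{Var}(X) / (\mathrm{Var}(X) + a^2)$ for $a > 0$. The parameters `n`, `p`, `beta` and `a` are arbitrary illustrative choices.

```python
from math import comb, exp


def binomial_pmf(n, p, k):
    """Pr[X = k] for X ~ Binomial(n, p), a sum of n independent 0/1 variables."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def lower_tail(n, p, t):
    """Pr[X <= t]."""
    return sum(binomial_pmf(n, p, k) for k in range(n + 1) if k <= t)


def upper_tail(n, p, t):
    """Pr[X >= t]."""
    return sum(binomial_pmf(n, p, k) for k in range(n + 1) if k >= t)


n, p = 40, 0.3
mean = n * p
var = n * p * (1 - p)

# Assumed Chernoff form: Pr[X <= (1 - beta) E(X)] <= exp(-beta^2 E(X) / 2).
beta = 0.5
chernoff_exact = lower_tail(n, p, (1 - beta) * mean)
chernoff_bound = exp(-(beta**2) * mean / 2)

# Assumed Cantelli form: Pr[X >= E(X) + a] <= Var(X) / (Var(X) + a^2).
a = 5.0
cantelli_exact = upper_tail(n, p, mean + a)
cantelli_bound = var / (var + a**2)
```

The Chernoff-type bound needs independence of the summands, whereas the Cantelli bound only needs mean and variance, which is why the latter is convenient when the summed variables are dependent.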
Let $b = (b_1, \dots, b_n) \in \mathbb{N}^n$. A $b$-matching in $\mathcal{H}$ is a set $M \subseteq \mathcal{E}$ such that no vertex $v_i \in V$ is incident to more than $b_i$ hyperedges from $M$. The $b$-matching problem is to find a maximum-cardinality $b$-matching. This maximum cardinality is called the $b$-matching number of $\mathcal{H}$.
We need the following duality theorem from combinatorics:
Theorem 2.3 (Ray-Chaudhuri, 1960 )
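Consider a hypergraph $\mathcal{H} = (V, \mathcal{E})$ with vertex degree $d(v)$ for every $v \in V$. Let $b, \tilde{b} \in \mathbb{N}^n$ be such that $\tilde{b}_i = d(v_i) - b_i$ for every $i \in [n]$. A subset of hyperedges $M \subseteq \mathcal{E}$ is a $\tilde{b}$-matching in $\mathcal{H}$ if and only if the subset $\mathcal{E} \setminus M$ is a set $b$-multicover in $\mathcal{H}$. Furthermore, $M$ is of maximum cardinality if and only if $\mathcal{E} \setminus M$ is of minimum cardinality.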
Remark 1. Note that Theorem 2.3 holds also for hypergraphs with multi-sets i.e., hypergraphs with multiple hyperedges.
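The matching/covering duality can be verified exhaustively on a small instance. The sketch below assumes that the matching capacities are the vertex degrees minus the covering demands, which is the reading under which a set of edges is a matching exactly when its complement is a multicover; the helper names and the toy hypergraph are hypothetical.

```python
from itertools import combinations


def degree(edges, v):
    """Number of edges containing vertex v."""
    return sum(1 for e in edges if v in e)


def is_multicover(edges, b, chosen):
    """Every vertex v lies in at least b[v] chosen edges."""
    return all(sum(1 for j in chosen if v in edges[j]) >= d for v, d in b.items())


def is_matching(edges, cap, chosen):
    """Every vertex v lies in at most cap[v] chosen edges."""
    return all(sum(1 for j in chosen if v in edges[j]) <= c for v, c in cap.items())


# Hypothetical toy instance with demands not exceeding the degrees.
edges = [{1, 2}, {2, 3}, {1, 3}, {1, 2, 3}]
b = {1: 2, 2: 1, 3: 2}
cap = {v: degree(edges, v) - b[v] for v in b}  # assumed capacity: degree minus demand

all_idx = set(range(len(edges)))
subsets = [
    set(c) for r in range(len(edges) + 1) for c in combinations(sorted(all_idx), r)
]

# M is a matching  <=>  the complement of M is a multicover.
duality_holds = all(
    is_matching(edges, cap, M) == is_multicover(edges, b, all_idx - M)
    for M in subsets
)
max_matching = max(len(M) for M in subsets if is_matching(edges, cap, M))
min_cover = min(len(C) for C in subsets if is_multicover(edges, b, C))
```

Because complementation is a bijection between matchings and multicovers, the maximum matching size and the minimum multicover size add up to the number of edges.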
3 The randomized rounding algorithm
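Let $\mathcal{H} = (V, \mathcal{E})$ be a hypergraph with maximum vertex degree $\Delta$ and maximum edge size $\ell$. An integer linear programming formulation of the problem is the following:
$$\min \sum_{j=1}^{m} x_j \quad \text{subject to} \quad A x \ge b, \quad x \in \{0,1\}^m,$$
where $A = (a_{ij}) \in \{0,1\}^{n \times m}$ is the vertex-edge incidence matrix of $\mathcal{H}$ and $b = (b_1, \dots, b_n)$ is the given integer vector.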
We define and .
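The linear programming relaxation LP() of ILP() is obtained by relaxing the integrality constraints to $x_j \in [0,1]$ for all $j \in [m]$. Let $\mathrm{Opt}$ resp. $\mathrm{Opt}^*$ be the value of an optimal solution to ILP() resp. LP(). Let $x^*$ be an optimal solution of LP(). So $\mathrm{Opt}^* = \sum_{j=1}^{m} x_j^*$ and $\mathrm{Opt}^* \le \mathrm{Opt}$.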
The next lemma shows that the greatest values of the LP variables corresponding to the incident edges of any vertex are all greater than or equal to .
Lemma 1 (see )
Let with . Let , such that . Then at least of the fulfill the inequality .
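The threshold argument behind a lemma of this kind can be checked exhaustively on a grid. The sketch below assumes the classical form of the statement: if $d$ values in $[0,1]$ sum to at least $b$, then at least $b$ of them are at least $1/(d - b + 1)$. This concrete threshold is an assumption made for illustration; the constant used in the lemma itself may be a weaker bound expressed through the maximum degree. The names `count_large`, `d`, `b` and `steps` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def count_large(xs, b):
    """Count entries of xs that reach the assumed threshold 1 / (d - b + 1)."""
    d = len(xs)
    threshold = Fraction(1, d - b + 1)
    return sum(1 for x in xs if x >= threshold)


# Exhaustive check over all vectors on a grid with step 1/6 (assumed toy sizes).
d, b, steps = 4, 2, 6
grid = [Fraction(k, steps) for k in range(steps + 1)]
lemma_holds = all(
    count_large(xs, b) >= b
    for xs in product(grid, repeat=d)
    if sum(xs) >= b  # only vectors satisfying the covering constraint
)
```

The reason is a simple counting argument: if at most $b - 1$ entries reached the threshold, the remaining $d - b + 1$ entries would each be below $1/(d - b + 1)$, so the total would stay strictly below $b$.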
Our second lemma shows that the greatest values of the LP variables corresponding to the incident edges of any vertex are all greater than or equal to ; together with Lemma 1, it summarizes what is known about the greatest values of the LP variables corresponding to the incident edges of any vertex .
Let with . Let , such that . Then at least of the fulfill the inequality , and there exists a further element, distinct from all of them, that fulfills the inequality .
W.l.o.g. we suppose .
So we have
Since for all , we have for all .
Furthermore, by Lemma and the assumption on the order of the variables , for all we have and in particular .
Let be a hypergraph and be the optimal solution of the LP(). Then is a set multicover such that .
Clearly, by Lemma 1, is a feasible set multicover.
Let and .
Note that and , so we have .
Let Then . Since we have
From this we can immediately deduce
3.1 The algorithm
In this section we present an algorithm with conditioned randomized rounding based on the properties satisfied by the two sets and .
Let us give a brief explanation of the ingredients of the algorithm SET $b$-MULTICOVER.
We start with an empty set C, which will be extended to a feasible set multicover. First we solve the LP relaxation LP() in polynomial time. Let and . The subsequent steps depend on the following two cases.
If or : we add all edges of the two sets and to the cover.
Recall that by Lemma 1 the set is a feasible set multicover.
If and : we apply LP-rounding with randomization to the edges of the set ; every edge of is independently added to the cover with probability . To guarantee a feasible cover, we proceed with a repairing step.
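The overall pipeline (deterministic threshold step, independent randomized rounding, repairing) can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm as specified in the paper: the case distinction on the set cardinalities is omitted, the fractional vector `x` is assumed to be given, and the parameters `threshold` and `scale` as well as the inclusion probability `min(1, scale * x_j)` are hypothetical placeholders for the values fixed in the paper. The repair step assumes that no demand exceeds the vertex degree.

```python
import random


def coverage(edges, cover, v):
    """Number of edges in `cover` (a set of edge indices) that contain v."""
    return sum(1 for j in cover if v in edges[j])


def threshold_round_repair(edges, b, x, threshold, scale, rng):
    """Threshold step, randomized rounding, then repair (illustrative sketch)."""
    # Step 1 (deterministic): keep every edge whose LP value reaches the threshold.
    cover = {j for j, xj in enumerate(x) if xj >= threshold}

    # Step 2 (randomized): add each remaining edge independently.
    # The probability min(1, scale * x_j) is an assumed placeholder.
    for j, xj in enumerate(x):
        if j not in cover and rng.random() < min(1.0, scale * xj):
            cover.add(j)

    # Step 3 (repair): for every vertex that is not fully covered,
    # add unused incident edges until its demand is met.
    for v, demand in b.items():
        for j, e in enumerate(edges):
            if coverage(edges, cover, v) >= demand:
                break
            if v in e and j not in cover:
                cover.add(j)
    return cover


# Hypothetical toy instance and a feasible fractional vector.
edges = [{1, 2}, {2, 3}, {1, 3}, {1, 2, 3}]
b = {1: 2, 2: 1, 3: 1}
x = [0.25, 0.25, 0.75, 1.0]

result = threshold_round_repair(edges, b, x, 0.5, 2.0, random.Random(0))
```

Passing the random generator explicitly keeps runs reproducible; the repair step guarantees feasibility regardless of the random choices, so randomness only affects the size of the returned cover.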
3.2 Analysis of the algorithm
Case or .
Case . With the definition of the sets and we have
Case . We have
Next with 1 we have
Case and .
Let be -random variables defined as follows:
Note that the are independent for a given . For all we define the - random variables as follows:
We denote by and , respectively, the cardinality of the cover and the number of vertices fully covered before the repairing step. At this step, by Lemma 2, at most one more edge per vertex is needed for it to be fully covered. The cardinality of the cover C obtained by Algorithm 1 is bounded by
The goal of the next lemma is to estimate the expectation of the random variable as well as the expectation and variance of the random variable , which are needed for the proof of Theorem 3.2. This is a restriction of Lemma in to the last case of Algorithm 1.
Let $\ell$ and $\Delta$ be the maximum size of an edge and the maximum vertex degree, respectively, not necessarily constants. Let , and as in Algorithm 1. In case and we have
(i) Let , . If , then the vertex is fully covered and . Otherwise we get by Lemma 2 and . Therefore
(ii) We have
(iii) By using the LP relaxation and the definition of the sets and , and since for all , we get
Now we can get a lower bound for the expectation of
Let us consider
the subhypergraph induced by , in which the degree equality gives
Since the minimum vertex degree in the subhypergraph is with , we have
With we obtain
Suppose such that . With we have
Choosing and we obtain
Therefore it holds that
Since we have