# Approximately counting and sampling small witnesses using a colourful decision oracle

In this paper, we prove "black box" results for turning algorithms which decide whether or not a witness exists into algorithms to approximately count the number of witnesses, or to sample from the set of witnesses approximately uniformly, with essentially the same running time. We do so by extending the framework of Dell and Lapinskas (STOC 2018), which covers decision problems that can be expressed as edge detection in bipartite graphs given limited oracle access; our framework covers problems which can be expressed as edge detection in arbitrary k-hypergraphs given limited oracle access. (Simulating this oracle generally corresponds to invoking a decision algorithm.) This includes many key problems in both the fine-grained setting (such as k-SUM, k-OV and weighted k-Clique) and the parameterised setting (such as induced subgraphs of size k or weight-k solutions to CSPs). From an algorithmic standpoint, our results will make the development of new approximate counting algorithms substantially easier; indeed, they already yield a new state-of-the-art algorithm for approximately counting graph motifs, improving on Jerrum and Meeks (JCSS 2015) unless the input graph is very dense and the desired motif very small. Our k-hypergraph reduction framework generalises and strengthens results in the graph oracle literature due to Beame et al. (ITCS 2018) and Bhattacharya et al. (CoRR abs/1808.00691).

## 1. Introduction

Many decision problems reduce to the question: Does a witness exist? Such problems admit a natural counting version: How many witnesses exist? For example, one may ask whether a bipartite graph contains a perfect matching, or how many perfect matchings it contains. As one might expect, the counting version is never easier than the decision version, and is often substantially harder; for example, deciding whether a bipartite graph contains a perfect matching is easy, and counting the number of such matchings is #P-complete [42]. However, even when the counting version of a problem is hard, it is often easy to approximate well. For example, Jerrum, Sinclair and Vigoda [32] gave a polynomial-time approximation algorithm for the number of perfect matchings in a bipartite graph. The study of approximate counting has seen amazing progress over the last two decades, particularly in the realm of trichotomy results for general problem frameworks such as constraint satisfaction problems, and is now a major field of study in its own right [17, 18, 24, 27, 28]. In this paper, we explore the question of when approximating the counting version of a problem is not merely fast, but essentially as fast as solving the decision version.

We first recall the standard notion of approximation in the field: For all real $x, y \ge 0$ and $0 < \varepsilon < 1$, we say that $x$ is an $\varepsilon$-approximation to $y$ if $|x - y| \le \varepsilon y$. Note in particular that any $\varepsilon$-approximation to zero is itself zero, so computing an $\varepsilon$-approximation to $N$ is always at least as hard as deciding whether $N > 0$ holds. For example, it is at least as hard to approximately count the number of satisfying assignments of a CNF formula (i.e. to $\varepsilon$-approximate #Sat) as it is to decide whether it is satisfiable at all (i.e. to solve Sat).

Perhaps surprisingly, in many cases, the converse is also true. For example, Valiant and Vazirani [43] proved that any polynomial-time algorithm to decide Sat can be bootstrapped into a polynomial-time $\varepsilon$-approximation algorithm for #Sat, or, more formally, that a size-$n$ instance of any problem in #P can be $\varepsilon$-approximated in time using an NP-oracle. A similar result holds in the parameterised setting, where Müller [40] proved that a size-$n$ instance of any problem in #W[$i$] with parameter $k$ can be $\varepsilon$-approximated in time using a W[$i$]-oracle for some computable function . Another such result holds in the subexponential setting, where Dell and Lapinskas [14] proved that the (randomised) Exponential Time Hypothesis is equivalent to the statement: There is no $\varepsilon$-approximation algorithm for #3-Sat which runs on an $n$-variable instance in time .

We now consider the fine-grained setting, which is the focus of this paper. Here, we are concerned with the exact running time of an algorithm, rather than broad categories such as polynomial time, FPT time or subexponential time. The above reductions all introduce significant overhead, so they are not fine-grained. Here only one general result is known, again due to Dell and Lapinskas [14]. Informally, if the decision problem reduces “naturally” to deciding whether an $n$-vertex bipartite graph contains an edge, then any algorithm for the decision version can be bootstrapped into an $\varepsilon$-approximation algorithm for the counting version with only overhead. (See Section 1.1 for more details.)

The reduction of [14] is general enough to cover core problems in fine-grained complexity such as Orthogonal Vectors, 3SUM and Negative-Weight Triangle, but it is not universal. In this paper, we substantially generalise it to cover any problem which can be “naturally” formulated as deciding whether a $k$-partite $k$-hypergraph contains an edge; thus we essentially recover the original result on taking $k = 2$. For any problem which satisfies this property, our result implies that any new decision algorithm will automatically lead to a new approximate counting algorithm whose running time is at most a factor of larger. Our framework covers several reduction targets in fine-grained complexity not covered by [14], including $k$-Orthogonal Vectors, $k$-SUM and Exact-Weight $k$-Clique, as well as some key problems in parameterised complexity including weight-$k$ CSPs and size-$k$ induced subgraph problems. (Note that the overhead of can be re-expressed as using a standard trick, so an FPT decision algorithm is transformed into an FPT approximate counting algorithm; see Section 1.3.)

In fact, we get more than fast approximate counting algorithms — we also prove that any problem in this framework has an algorithm for approximately-uniform sampling, again with overhead over decision. There is a well-known reduction between the two for self-reducible problems due to Jerrum, Valiant and Vazirani [33], but it does not apply in our setting since it adds polynomial overhead.

In the parameterised setting, our results have interesting implications. Here, the requirement that the hypergraph be $k$-partite typically corresponds to considering the “colourful” or “multicolour” version of the decision problem, so our result implies that uncoloured approximate counting is essentially equivalent to multicolour decision. We believe that our results motivate considerable further study of the relationship between multicolour parameterised decision problems and their uncoloured counterparts.

Finally, we note that the applications of our results are not just complexity-theoretic in nature, but also algorithmic. They give a “black box” argument that any decision algorithm in our framework, including fast ones, can be converted into an approximate counting or sampling algorithm with minimal overhead. Concretely, we obtain new algorithms for approximately counting and/or sampling zero-weight subgraphs, graph motifs, and satisfying assignments for first-order models, and our framework is sufficiently general that we believe new applications will be forthcoming.

In Section 1.1, we set out our main results in detail as Theorems 1 and 2, and discuss our edge-counting reduction framework (which is of independent interest). We describe the applications of Theorems 1 and 2 to fine-grained complexity in Section 1.2, and their applications to parameterised complexity in Section 1.3.

### 1.1. The k-hypergraph framework

Given a $k$-hypergraph $G = (V, E)$, write $e(G) := |E|$, and let

$$\mathcal{C}(G) := \{(X_1, \dots, X_k) : X_1, \dots, X_k \text{ are disjoint subsets of } V\}.$$

We define the coloured independence oracle of $G$ to be the function $\mathrm{cIND}_G \colon \mathcal{C}(G) \to \{0, 1\}$ such that $\mathrm{cIND}_G(X_1, \dots, X_k) = 1$ if $G[X_1, \dots, X_k]$ has no edges, and $\mathrm{cIND}_G(X_1, \dots, X_k) = 0$ otherwise. Informally, we think of elements of $\mathcal{C}(G)$ as representing $k$-colourings of induced subgraphs of $G$, with $X_i$ being the $i$’th colour class; thus given a vertex colouring of an induced subgraph of $G$, the coloured independence oracle outputs 1 if and only if no colourful edge is present. We consider a computation model where the algorithm is given access to $V$ and $k$, but can only access $E$ via $\mathrm{cIND}_G$. We say that such an algorithm has coloured oracle access to $G$, and for legibility we write it to have $G$ as an input. Our main result is as follows.
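To make the oracle concrete, the following is a minimal sketch of a coloured independence oracle for a hypergraph whose edge list is stored explicitly. The name `make_cind` and the explicit edge list are assumptions made for illustration only; in the computation model above, the algorithm never sees the edges and can only call the oracle.

```python
from itertools import combinations


def make_cind(edges):
    """Return a coloured independence oracle for an explicit k-hypergraph.

    `edges` is an iterable of k-element vertex sets (hypothetical input
    format). The returned function takes k pairwise disjoint vertex sets
    (the colour classes) and returns 1 if no edge has exactly one vertex
    in each class, and 0 otherwise.
    """
    edge_list = [frozenset(e) for e in edges]

    def cind(classes):
        classes = [set(c) for c in classes]
        # The colour classes must be pairwise disjoint.
        assert all(a.isdisjoint(b) for a, b in combinations(classes, 2))
        for e in edge_list:
            if len(e) == len(classes) and all(len(e & c) == 1 for c in classes):
                return 0  # a colourful edge is present
        return 1  # no colourful edge

    return cind
```

For example, with edges `{1, 2, 3}` and `{2, 3, 4}`, the colouring `({1}, {2}, {3})` contains a colourful edge, while `({2, 3}, {1}, {4})` does not.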

###### Theorem 1.

There is a randomised algorithm Count with the following behaviour. Suppose $G$ is an $n$-vertex $k$-hypergraph, and that Count has coloured oracle access to $G$. Suppose and are rational with . Then, writing : in time , and using at most queries to $\mathrm{cIND}_G$, Count outputs a rational number . With probability at least , we have .

As an example of how Theorem 1 applies to approximate counting problems, consider the problem #$k$-Clique of counting the number of cliques of size $k$ in an $n$-vertex graph $H$. We take $G$ to be the $k$-hypergraph on vertex set $V(H)$ whose hyperedges are precisely those size-$k$ sets which span cliques in $H$. Thus $\varepsilon$-approximating the number of $k$-cliques in $H$ corresponds to $\varepsilon$-approximating the number of hyperedges in $G$. We may use a decision algorithm for $k$-Clique with running time to evaluate $\mathrm{cIND}_G$ in time , by applying it to an appropriate subgraph of $H$ (in which we delete all edges within each colour class $X_i$). Thus Theorem 1 gives us an algorithm for $\varepsilon$-approximating the number of $k$-cliques in $H$ in time . Any decision algorithm for $k$-Clique must read a constant proportion of its input, so we have and our overall running time is . It follows that any decision algorithm for $k$-Clique yields an $\varepsilon$-approximation algorithm for #$k$-Clique with overhead only .
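The oracle simulation described in this example can be sketched as follows. This is an illustration under stated assumptions: `has_k_clique` is a brute-force stand-in for an arbitrary clique decision algorithm, the graph is given as a dictionary of symmetric neighbour sets, and both function names are hypothetical. The point is that a single decision call on the graph with all within-class edges deleted answers one oracle query, because any clique that survives must use exactly one vertex from each colour class.

```python
from itertools import combinations


def has_k_clique(adj, k):
    """Brute-force stand-in for any k-Clique decision algorithm."""
    return any(
        all(v in adj[u] for u, v in combinations(cand, 2))
        for cand in combinations(list(adj), k)
    )


def cind_via_clique_decision(adj, classes, decide=has_k_clique):
    """Answer one coloured independence query with one decision call."""
    colour = {v: i for i, cls in enumerate(classes) for v in cls}
    # Keep only coloured vertices, and only edges joining different classes.
    sub = {
        u: {v for v in adj[u] if v in colour and colour[v] != colour[u]}
        for u in colour
    }
    # Output 1 exactly when no colourful clique exists.
    return 0 if decide(sub, len(classes)) else 1
```

Swapping a faster decision procedure in for `decide` changes the cost of each query but not the logic of the simulation.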

The polynomial dependence on $\varepsilon$ in Theorem 1 is not surprising, as by taking and rounding we can obtain the number of edges of $G$ exactly. Thus if the dependence on $\varepsilon$ were subpolynomial, Theorem 1 would essentially imply a fine-grained reduction from exact counting to decision. This is impossible under SETH in our setting; see [14, Theorem 3] for a more detailed discussion.
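The rounding argument can be spelled out as a short calculation. The threshold $\varepsilon < 1/(2n^k)$ used here is an assumption chosen for illustration, not necessarily the value intended in the text; $e(G)$ denotes the number of edges of the $n$-vertex $k$-hypergraph $G$ and $x$ denotes the output of the approximation algorithm.

```latex
\begin{align*}
  e(G) &\le \binom{n}{k} \le n^k, \\
  |x - e(G)| &\le \varepsilon \, e(G) < \frac{1}{2n^k} \cdot n^k = \frac{1}{2}.
\end{align*}
```

Since $e(G)$ is an integer and $x$ lies within distance less than $1/2$ of it, rounding $x$ to the nearest integer recovers $e(G)$ exactly.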

We extend Theorem 1 to approximately-uniform sampling as follows.

###### Theorem 2.

There is a randomised algorithm Sample which, given a rational number $\varepsilon$ with $0 < \varepsilon < 1$ and coloured oracle access to an $n$-vertex $k$-hypergraph $G$ containing at least one edge, outputs either a random edge or Fail. For all , Sample outputs with probability ; in particular, it outputs Fail with probability at most . Moreover, writing , Sample runs in time and uses at most queries to $\mathrm{cIND}_G$.

We call the output of this algorithm an $\varepsilon$-approximate sample. Note that there is a standard trick using rejection sampling which, given an algorithm of the above form, replaces the factor in the running time by a factor; see [33]. Unfortunately, it does not apply to Theorem 2, as we do not have a fast way to compute the true distribution of Sample’s output.

By the same argument as above, Theorem 2 may be used to sample a size-$k$ clique from a distribution with total variation distance at most $\varepsilon$ from uniformity with overhead only over decision. (We also note that it is easy to extend Theorems 1 and 2 to cover the case where the original decision algorithm is randomised, at the cost of an extra factor of in the number of oracle uses; we discuss this further in the full version.)

Theorems 1 and 2 are also of independent interest, generalising known results in the graph oracle literature. Our colourful independence oracles are a natural generalisation of the bipartite independent set (BIS) oracles of Beame et al. [6] to a hypergraph setting, and when $k = 2$ the two notions coincide. Their main result [6, Theorem 4.9] says that given BIS oracle access to an $n$-vertex graph $G$, one can $\varepsilon$-approximate the number of edges of $G$ using BIS queries (which they take as their measure of running time). The $k = 2$ case of Theorem 1 gives a total of queries used, improving their running time for most values of , and Theorem 2 extends their algorithm to approximately-uniform sampling.

When $k = 3$, our colourful independence oracles are similar to the tripartite independent set (TIS) oracles of Bhattacharya et al. [8]. (These oracles ask whether a 3-coloured graph contains a colourful triangle, rather than whether a 3-coloured 3-hypergraph contains a colourful edge. But if $G$ is taken to be the 3-hypergraph whose edges are the triangles of $H$, then the two notions coincide exactly.) Their main result [8, Theorem 1] says that given TIS oracle access to an $n$-vertex graph $H$ of maximum degree at most $d$, one can $\varepsilon$-approximate the number of triangles in $H$ using at most TIS queries. Our Theorem 1 gives an algorithm which requires only TIS queries, with no dependence on $d$, and which also generalises to approximately counting $k$-cliques for all fixed $k$. Again, Theorem 2 extends the result to approximately-uniform sampling.

We note in passing that the main result of [14] does not quite fit into this setting, as it also makes unrestricted use of edge existence queries. It resembles a version of Theorem 1 restricted to $k = 2$ and with slightly lower overhead in .

### 1.2. Corollaries in fine-grained complexity

In [14], fine-grained reductions from approximate counting to decision were shown for the problems Orthogonal Vectors, 3SUM and Negative-Weight Triangle (among others). The approximate counting procedure for $k$-uniform hypergraphs in Theorem 1 allows us to generalize these reductions to $k$-OV, $k$-SUM, Zero-Weight $k$-Clique, and other subgraph isomorphism problems. They also apply to model checking of first-order formulas with variables. In each case, Theorem 2 yields a corresponding result for approximate sampling of witnesses.

#### 1.2.1. First-order Formulas on Sparse Structures and k-Orthogonal Vectors

We consider first-order formulas , that is, formulas of the form: . The variables  are the free variables of , each is a quantifier from , and is a quantifier-free Boolean formula over the variables . We consider first-order formulas in prenex-normal form with free variables and quantifier-rank at most ; let denote the set of all such formulas. The property testing problem for is, given a formula and a structure (e.g., the edge relation of a graph), to decide whether the formula is satisfiable in the structure, that is, whether there is an assignment to the free variables that makes the formula true. Correspondingly, the property counting problem is to count all satisfying assignments.

Model checking and property testing are important problems in logic and database theory, and have recently been studied in the context of fine-grained complexity [15, 25, 45]: Gao et al. [25] devise an algorithm for the property testing problem for that runs in time , where is the number of distinct tuples in the input relations. This improves upon an already slightly non-trivial algorithm. (The notation means .) By using this improved decision algorithm as a black box, we obtain new algorithms for approximate counting (via Theorem 1) and approximate sampling (via Theorem 2). Note that all our approximate counting algorithms work with probability at least ; this can easily be increased to in the usual way, i.e. running them times and taking the median result.

###### Corollary 3.

Fix , suppose an instance of property testing for can be solved in time , where is the size of the universe and is the number of tuples in the structure, and write for the set of satisfying assignments. Then there is a randomised algorithm to $\varepsilon$-approximate , or draw an $\varepsilon$-approximate sample from , in time .

In combination with the algorithm of Gao et al. [25], we can thus $\varepsilon$-approximately sample from the set of satisfying assignments to any -property in time . For example, this algorithm can be used to sample an approximately uniformly random solution tuple to a conjunctive query.

The $k$-Orthogonal Vectors ($k$-OV) problem is a specific example of a property testing problem, and has connections to central conjectures in fine-grained complexity theory [1, 25]. The problem asks, given $k$ sets $A_1, \dots, A_k \subseteq \{0,1\}^d$ of Boolean vectors, whether there exist $u_1 \in A_1$, …, $u_k \in A_k$ such that $\sum_{j=1}^{d} \prod_{i=1}^{k} u_i[j] = 0$. (The sum and product are the usual arithmetic operations over $\mathbb{Z}$.) When $u_1, \dots, u_k$ are viewed as representing subsets of $[d]$ in the canonical manner, this condition is equivalent to requiring they have an empty intersection; when $k = 2$, it is equivalent to $u_1$ and $u_2$ being orthogonal. Any tuple satisfying the condition is called a witness. Clearly, $k$-OV can be solved in time using exhaustive search. Gao et al. [25] stated the Moderate-Dimension $k$-OV Conjecture, which says that $k$-OV cannot be solved in time for any . We show that any reasonable-sized improvement over exhaustive search carries over to approximate counting and sampling.
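As a concrete reading of the witness condition, here is a minimal exhaustive-search sketch that lists every witness of a small instance. The function name `kov_witnesses` and the input format (a list of k lists of 0/1 tuples) are assumptions for illustration; this is the trivial baseline, not the improved algorithm of Gao et al.

```python
from itertools import product
from math import prod


def kov_witnesses(sets):
    """List all tuples (u_1, ..., u_k), one vector per set, whose
    coordinate-wise products sum to zero over the integers."""
    return [
        tup
        for tup in product(*sets)
        # zip(*tup) walks over coordinates; each column holds the j-th
        # entry of every chosen vector.
        if sum(prod(column) for column in zip(*tup)) == 0
    ]
```

In the hypergraph view, the vectors are the vertices, each input set is one part of the partition, and each witness is one hyperedge.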

###### Corollary 4.

Fix , suppose an $n$-vector $d$-dimension instance of $k$-OV can be solved in time , and write for the set of witnesses. Then there is a randomised algorithm to $\varepsilon$-approximate , or draw an $\varepsilon$-approximate sample from , in time .

Note that such an improvement is already known for 2-OV, which has an -time algorithm [3], although Chan and Williams [12] already generalised this to an exact counting algorithm.

#### 1.2.2. k-Sum

The $k$-SUM problem has been studied since the 1990s as it arises naturally in the context of computational geometry (see for example [23]), and it has become an important problem in fine-grained complexity theory [46]. For all integers , the $k$-SUM problem asks, given a set of integers, whether some $k$ of them sum to zero. Each $k$-subset of integers that does sum to zero is called a witness. While Kane, Lovett, and Moran [34] very recently developed almost linear-size linear decision trees for $k$-SUM, the fastest known algorithm for this problem still runs in time , and as is ruled out under the exponential-time hypothesis [41]. We prove that any sufficiently non-trivial improvement over the best known decision algorithm carries over to approximate counting and witness sampling.
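For reference, the witness set of a small instance can be listed by exhaustive search as in the sketch below. The function name `ksum_witnesses` is hypothetical, and this brute-force enumeration only illustrates the definition of a witness; it is not one of the fast algorithms discussed above.

```python
from itertools import combinations


def ksum_witnesses(numbers, k):
    """List all k-subsets of the given set of integers that sum to zero."""
    distinct = sorted(set(numbers))  # the input is a set of integers
    return [subset for subset in combinations(distinct, k) if sum(subset) == 0]
```

Each returned tuple corresponds to one hyperedge of the hypergraph whose vertices are the input integers.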

###### Corollary 5.

Fix , suppose an $n$-integer instance of $k$-SUM can be solved in time , and write for the set of witnesses. Then there is a randomised algorithm to $\varepsilon$-approximate , or draw an $\varepsilon$-approximate sample from , in time .

#### 1.2.3. Exact-Weight k-Clique and Other Subgraph Problems

Recall that Theorem 1 applies to the problem #$k$-Clique. This observation generalizes to other subgraph problems as well. We consider weighted graph problems, where we are given a graph $G$ with an edge-weight function $w$. The weight of a clique $C$ in $G$ is the sum of $w(e)$ over all edges $e$ with $e \subseteq C$. The Exact-Weight $k$-Clique problem is to decide whether there is a $k$-clique $C$ of weight exactly $0$. It has been conjectured [1] that there is no real and integer such that the Exact-Weight $k$-Clique problem on $n$-vertex graphs and with edge-weights in can be solved in time . (For the closely related Min-Weight $k$-Clique problem, a subpolynomial-time improvement over the exhaustive search algorithm is known [1, 44, 12], with running time .) Theorems 1 and 2 imply that any sufficiently non-trivial improvement on the running time of an Exact-Weight $k$-Clique algorithm will carry over to the approximate counting and sampling versions of the problem.
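The definition of clique weight can be illustrated with a small exhaustive-search sketch. The name `zero_weight_cliques` and the input format (a dictionary from two-element frozensets to integer weights) are assumptions made for this example only.

```python
from itertools import combinations


def zero_weight_cliques(weights, k):
    """List the k-cliques whose edge weights sum to zero.

    `weights` maps each edge, given as frozenset({u, v}), to an integer.
    """
    vertices = sorted({v for edge in weights for v in edge})
    found = []
    for cand in combinations(vertices, k):
        pairs = [frozenset(p) for p in combinations(cand, 2)]
        # cand is a clique when every pair is an edge; its weight is the
        # sum of the weights of those edges.
        if all(p in weights for p in pairs) and sum(weights[p] for p in pairs) == 0:
            found.append(cand)
    return found
```

On a complete graph with four vertices, for instance, the function inspects all four triangles and keeps those whose three edge weights cancel.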

###### Corollary 6.

Fix , suppose an $n$-vertex $m$-edge instance of Exact-Weight $k$-Clique with weights in can be solved in time , and write for the set of zero-weight $k$-cliques. Then there is a randomised algorithm to $\varepsilon$-approximate , or draw an $\varepsilon$-approximate sample from , in time .

There is a more general version of Exact-Weight -Clique which takes as input an edge-weighted -hypergraph and asks whether it contains a zero-weight -clique. A similar conjecture exists for this version of the problem [1], and Theorems 1 and 2 yield a result analogous to Corollary 6.

Our framework also applies to subgraphs more general than cliques. The Exact-Weight-$H$ problem asks, given an edge-weighted graph $G$, whether there exists a subgraph of $G$ that has weight zero and is isomorphic to $H$. We say $H$ is a core if every homomorphism from $H$ to $H$ is also an automorphism. Cores are a rich class of graphs, including cliques, odd cycles, and (with high probability) any binomial random graph with edge probability (see [10, Theorem 2]). Corollary 6 generalises to Exact-Weight-$H$ whenever $H$ is a core. In particular, Abboud and Lewi [2, Corollary 5] prove that Exact-Weight-$H$ can be solved in time , where is a graph parameter that is small whenever $H$ has a balanced separator, so we obtain the following result.

###### Corollary 7.

Let $H$ be a core, let $G$ be an $n$-vertex graph, and let be the set of zero-weight $H$-subgraphs in $G$. There is an algorithm to draw an $\varepsilon$-approximate sample from in time .

Our framework also applies to colourful subgraphs. The Colourful-$H$ problem asks, given a graph $G$ and a vertex colouring , whether $G$ contains a colourful copy of $H$, that is, a subgraph isomorphic to $H$ containing one vertex from each colour class.

###### Corollary 8.

Let $H$ be a fixed graph, suppose an $n$-vertex $m$-edge instance of Colourful-$H$ can be solved in time , and write for the set of colourful $H$-subgraphs. Then there is a randomised algorithm to $\varepsilon$-approximate , or draw an $\varepsilon$-approximate sample from , in time .

Díaz, Serna, and Thilikos [16] show using dynamic programming that #Colourful-$H$ can be solved exactly in time , where is the treewidth of $H$. Marx [37] asks whether it is possible to detect colourful subgraphs in time , and proves that is impossible under the exponential-time hypothesis (ETH). Our result shows that any algorithm to detect colourful subgraphs in time would essentially also have to approximately count these subgraphs, which is a more difficult task.

### 1.3. Corollaries in parameterised complexity

When considering approximation algorithms for parameterised counting problems, an “efficient” approximation scheme is an FPTRAS (fixed parameter tractable randomised approximation scheme), as introduced by Arvind and Raman [5]; this is the analogue of an FPRAS in the parameterised setting. An FPTRAS for a parameterised counting problem $\Pi$ with parameter $k$ is an algorithm that takes an instance $I$ of $\Pi$ (with ) and a rational number $\varepsilon$, and in time (where is some computable function) outputs a rational number $z$ such that

$$\mathbb{P}\big[(1-\varepsilon)\,\Pi(I) \le z \le (1+\varepsilon)\,\Pi(I)\big] \ge 2/3.$$

Note that this definition is equivalent to that given in [5], which requires the failure probability to be at most $\delta$, where $\delta$ is part of the input; repeating the process above $O(\log(1/\delta))$ times and returning the median solution allows us to reduce the error probability from $1/3$ to $\delta$.
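The median trick mentioned here can be sketched as follows, assuming an estimator that lands in the target interval with probability at least 2/3 on each independent run. The function names and the constant 18 are choices made for this illustration: if fewer than half of the runs fail then the median lies in the target interval, and a one-sided Hoeffding bound shows that at least half of $t$ runs fail with probability at most $e^{-t/18}$.

```python
import math
import statistics


def num_runs(delta):
    """Odd number of repetitions t satisfying exp(-t / 18) <= delta."""
    t = math.ceil(18 * math.log(1 / delta))
    return t if t % 2 == 1 else t + 1


def amplify(estimator, delta):
    """Median of independent runs of a 2/3-reliable estimator.

    `estimator` is a zero-argument callable (hypothetical interface) that
    returns one fresh estimate per call.
    """
    return statistics.median(estimator() for _ in range(num_runs(delta)))
```

The number of repetitions grows only logarithmically in $1/\delta$; for example, `num_runs(0.01)` is 83.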

As mentioned above, a large number of well-studied problems in parameterised complexity fall within our $k$-hypergraph framework; for standard notions in parameterised (counting) complexity we refer the reader to [21]. Observe that we can rewrite our overhead of in the form : if then , and if then . Thus we can consider this to be a “fine-grained FPT overhead”.

Theorems 1 and 2 can therefore be applied immediately to any self-contained $k$-witness problem (see [39]); that is, any problem with integer parameter $k$ in which we are interested in the existence of witnesses consisting of $k$-element subsets of some given universe, and we have the ability to quickly test whether any given $k$-element set is such a witness. Examples include weight-$k$ solutions to CSPs, size-$k$ solutions to database queries, and sets of $k$ vertices in a (weighted) graph or hypergraph which induce a sub(hyper)graph with specific properties. This last example encompasses many of the best-studied problems in parameterised counting complexity, including the problem #Sub() (with parameter ) which asks for the number of subgraphs of $G$ isomorphic to $H$; the well-studied problems of counting $k$-vertex paths, cycles and cliques are all special cases. More generally, we can consider the problem #Induced Subgraph with Property($\Phi$) (#ISWP($\Phi$)), introduced by Jerrum and Meeks [31], for any property $\Phi$.

However, our coloured independence oracle does not quite correspond to deciding whether a witness exists: it needs to solve a multicolour version of the decision problem. The multicolour decision version of a self-contained $k$-witness problem takes as input a universe $U$ together with a $k$-colouring of the elements of $U$, and asks whether there exists a witness which contains precisely one element of each colour. The following result is immediate from Theorems 1 and 2 on taking the vertex set of the hypergraph to be $U$, the edges to be the $k$-witnesses, and simulating the coloured independence oracle by invoking a multicolour decision algorithm.

###### Theorem 9.

Let be a self-contained $k$-witness decision problem, and suppose that the multicolour version of can be solved in time when the universe has size . Let be a colouring, let be the set of (uncoloured) witnesses of , and let be the set of multicolour witnesses of with respect to . Then given and , in time , there is a randomised algorithm to $\varepsilon$-approximate or , or draw an $\varepsilon$-approximate sample from or .

Such multicolour problems have been studied before in the literature, including #MISWP, the multicolour version of #ISWP; see [38] for a survey of results relating the complexity of multicolour and uncoloured problems in this setting. In many cases, the multicolour decision problem reduces straightforwardly to the original decision problem, for example if our witnesses are $k$-vertex cliques in a graph. But this is not true in general; if our witnesses are $k$-vertex cliques and $k$-vertex independent sets, then the uncoloured decision problem admits a trivial FPT algorithm by Ramsey’s theorem [5], but the W[1]-complete problem $k$-Clique reduces to the multicolour version [38]. In the restricted setting of Sub(,), it is straightforward to verify that the multicoloured and uncoloured versions of the problem are equivalent when the graph $H$ is a core (the reduction appears inside the proof of Lemma 27), but this is not known for general $H$. In fact, a proof of equivalence would imply the long-standing dichotomy conjecture for the parameterised embedding problem (see [13] for recent progress on this conjecture). We believe that Theorem 9 motivates substantial further research into the complexity relationship between multicoloured problems and their uncoloured counterparts.

One consequence of Theorem 9 is that if MISWP($\Phi$) admits an FPT decision algorithm, then we obtain FPTRASes for both #MISWP($\Phi$) and #ISWP($\Phi$) with roughly the same running time as the original decision algorithm. This generalises a previous result of Meeks [38, Corollaries 4.8 and 4.10] which states that subject to standard complexity-theoretic assumptions, if we restrict our attention to properties that are preserved under adding edges, there is an FPTRAS for the counting problems #MISWP($\Phi$) and #ISWP($\Phi$) if and only if there is an FPT decision algorithm for MISWP($\Phi$). Theorem 9 strengthens this result in two ways. Firstly, we no longer need the restriction that the property is preserved under adding edges, as we can now consider an arbitrary property $\Phi$. Secondly, we demonstrate a close relationship between the running-times for decision and approximate counting, meaning that any improvement in a decision algorithm immediately translates to an improved algorithm for approximate counting.

One example where Theorem 9 already gives an improvement (in almost all settings) to the previously best-known algorithm for approximate counting is the Graph Motif problem, introduced by Lacroix, Fernandes and Sagot [36] in the context of metabolic networks. This problem takes as input an $n$-vertex $m$-edge graph $G$ with a (not necessarily proper) vertex-colouring, together with a multiset $M$ of colours, and a solution is a subset $U$ of vertices such that the subgraph induced by $U$ is connected and the colour multiset of $U$ is exactly $M$; $M$ is called a motif, and we call $U$ a motif witness for $M$.
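The two conditions that define a motif witness (connectivity of the induced subgraph and an exact match of colour multisets) can be checked directly, as in this minimal sketch. The function name `is_motif_witness` and the input format (neighbour sets and a colour dictionary) are assumptions for illustration; this only verifies a candidate set and is not a decision algorithm for Graph Motif.

```python
from collections import Counter


def is_motif_witness(adj, colour, subset, motif):
    """Check whether `subset` is a motif witness for the multiset `motif`."""
    subset = set(subset)
    # The colour multiset of the subset must equal the motif exactly.
    if not subset or Counter(colour[v] for v in subset) != Counter(motif):
        return False
    # Depth-first search restricted to the subset tests connectivity of
    # the induced subgraph.
    start = next(iter(subset))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in subset and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == subset
```

Because the test runs in time linear in the size of the induced subgraph, Graph Motif fits the self-contained witness setting in which candidate sets can be verified quickly.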

There has been substantial progress in recent years on improving the running-time of decision algorithms for Graph Motif [7, 9, 19, 26, 35], with the fastest randomised algorithm [9] (based on constrained multilinear detection) running in time . For the counting version, Guillemot and Sikora [26] addressed the related problem of counting $k$-vertex subtrees of a graph whose vertex set has colour multiset $M$ (which counts motif witnesses $U$ for $M$ weighted by the number of trees spanned by $U$). They demonstrated that this problem admits an FPT algorithm for exact counting when $M$ is a set, but is #W[1]-hard otherwise. Subsequently, Jerrum and Meeks [31] addressed the more natural counting analogue of Graph Motif in which the goal is to count motif witnesses for $M$ without weights. They demonstrated that this problem is #W[1]-hard to solve exactly even if $M$ is a set, but gave an FPTRAS to solve it approximately. By using this FPTRAS together with Theorems 1 and 2, we prove the following.

###### Corollary 10.

Given an $n$-vertex instance of Graph Motif with parameter $k$ and , there is a randomised algorithm to $\varepsilon$-approximate the number of motif witnesses or to draw an $\varepsilon$-approximate sample from the set of motif witnesses in time .

Theorem 9 also generalises a known relationship between the complexity of uncoloured approximate counting and multicolour decision in the special case of Sub(,). In this restricted setting, multicolour decision is actually equivalent to multicolour exact counting; there is an FPT algorithm to exactly count the number of multicolour solutions whenever the treewidth of $H$ is bounded by a constant, with essentially the same running time as the best-known decision algorithm [4]. On the other hand, even the multicolour decision problem is W[1]-hard if $H$ is restricted to any class of graphs with unbounded treewidth [38]. Alon et al. [29] essentially give a fine-grained reduction from uncoloured approximate counting to multicolour exact counting, giving an algorithm with running time matching the best-known algorithm for multicolour decision. (Note that their running time is slightly better than that obtained by applying Theorem 9, and that uncoloured exact counting is #W[1]-hard even when $H$ is a path or cycle [22].)

However, in general it is not true that multicolour exact counting is equivalent to multicolour decision; indeed, there are natural examples (such as counting $k$-vertex subsets that induce connected subgraphs) in which the counting is #W[1]-hard but the decision is FPT [31]. Theorem 9 therefore strengthens [29], in the sense that if a faster multicolour decision algorithm is discovered then the improvement to the running time will immediately be carried over to uncoloured approximate counting, whether or not the new algorithm generalises to exact multicolour counting.

In this specific case, the existing decision algorithm turns out to already give an algorithm for exact counting with the same asymptotic complexity; however, there is no theoretical reason why the constant in the exponent could not be improved, and our results mean that any such improvement in a decision algorithm could immediately be translated to a faster algorithm for approximate counting.

Organisation. In the following section, we set out our notation and quote some standard probabilistic results for future reference. We then prove Theorem 1 in Section 3.2, using a weaker approximation algorithm which we set out in Section 4. We then prove Theorem 2 (using Theorem 1) in Section 5. Finally, we prove our assorted corollaries in Section 6; we emphasise that in general, the proofs in this section are easy and use only standard techniques.

## 2. Preliminaries

### 2.1. Notation

Let and let be a -hypergraph, so that each edge in has size exactly . We write . For all , we write for the subgraph induced by . If are disjoint, then we write for the -partite -hypergraph on whose edge set is . For all , we write for the degree of in . If , then we will sometimes write .

For all positive integers , we write . We write $\ln$ for the natural logarithm, and $\log$ for the base-2 logarithm. Given real numbers and , we say that is an -approximation to if , and write . We extend this notation to other operations in the natural way, so that (for example) means that .

When stating quantitative bounds on running times of algorithms, we assume the standard randomised word-RAM machine model with logarithmic-sized words; thus given an input of size , we can perform arithmetic operations on -bit words and generate uniformly random -bit words in time.

Recall the definitions of and the coloured independence oracle of , and coloured oracle access from Section 1.1. Note that for all , is a restriction of cIND. Thus an algorithm with coloured oracle access to can safely call a subroutine that requires coloured oracle access to .

### 2.2. Probabilistic results

We use some standard results from probability theory, which we collate here for reference. The following lemma is commonly known as Hoeffding’s inequality.

###### Lemma 11 ([11, Theorem 2.8]).

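Let $X_1, \dots, X_m$ be independent real random variables, and suppose there exist $a_1, \dots, a_m, b_1, \dots, b_m \in \mathbb{R}$ such that $a_i \le X_i \le b_i$ with probability 1. Let $X = \sum_{i=1}^m X_i$. Then for all $t > 0$, we have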

$$\mathbb{P}\big(|X - \mathbb{E}(X)| \ge t\big) \le 2e^{-2t^2/\sum_{i=1}^m (b_i - a_i)^2}.$$ ∎
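As an illustration of how the bound behaves, the following sketch compares the exact two-sided tail of a fair binomial variable with the right-hand side of Hoeffding's inequality. The choice of a Binomial(m, 1/2) variable (so that every summand lies in [0, 1]) and the function names are assumptions made for this example only.

```python
from math import comb, exp


def binomial_two_sided_tail(m: int, t: float) -> float:
    """Exact P(|X - m/2| >= t) for X ~ Binomial(m, 1/2)."""
    mean = m / 2
    favourable = sum(comb(m, j) for j in range(m + 1) if abs(j - mean) >= t)
    return favourable / 2**m


def hoeffding_bound(m: int, t: float) -> float:
    """Hoeffding's bound when each summand lies in [0, 1], so b_i - a_i = 1."""
    return 2 * exp(-2 * t**2 / m)


# Print the exact tail next to the bound for a few illustrative parameters.
for m, t in [(10, 3), (50, 10), (200, 25)]:
    print(m, t, binomial_two_sided_tail(m, t), hoeffding_bound(m, t))
```

In each case the exact tail should sit below the bound, with the gap widening as t grows relative to the square root of m.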

The next lemma is a form of Bernstein’s inequality.

###### Lemma 12.

Let be independent real random variables. Suppose there exist and such that with probability 1, and for all . Let . Then for all , we have

$$\mathbb{P}\big(|X - \mathbb{E}(X)| \ge z\big) \le 2\exp\left(-\frac{3z^2}{6\nu + 2Mz}\right).$$

###### Proof.

Apply [11, Corollary 2.11] to both and , taking and , then apply a union bound. ∎

The next lemma collates two standard Chernoff bounds.

###### Lemma 13 ([30, Corollaries 2.3-2.4]).

Suppose is a binomial or hypergeometric random variable with mean . Then:

1. for all , ;

2. for all , .∎

Our final lemma is a standard algebraic bound.

###### Lemma 14.

For all positive integers and with , we have .

###### Proof.

We have

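$$\begin{aligned}
\binom{2N-k}{N-k}\Big/\binom{2N}{N} &= \frac{(2N-k)!\,N!}{(2N)!\,(N-k)!} = \prod_{i=0}^{k-1}(N-i)\Big/\prod_{j=0}^{k-1}(2N-j)\\
&\ge \left(\frac{N-k+1}{2N-k+1}\right)^k = \left(\frac{1}{2} - \frac{k-1}{2(2N-k+1)}\right)^k\\
&\ge 2^{-k}\left(1-\frac{k}{N}\right)^k \ge 2^{-k}\left(1-\frac{k^2}{N}\right) \ge 2^{-k-1}.
\end{aligned}$$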
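The chain of inequalities above can be checked numerically with exact rational arithmetic. The sketch below assumes the hypothesis N ≥ 2k², which is what the final step of the chain uses; the function names and the tested ranges are illustrative choices.

```python
from fractions import Fraction
from math import comb


def survival_ratio(N: int, k: int) -> Fraction:
    """The ratio C(2N-k, N-k) / C(2N, N), computed exactly."""
    return Fraction(comb(2 * N - k, N - k), comb(2 * N, N))


def check_lower_bound(max_k: int = 6, extra: int = 40) -> bool:
    """Check ratio >= 2^(-k-1) for k <= max_k and 2k^2 <= N < 2k^2 + extra."""
    for k in range(1, max_k + 1):
        for N in range(2 * k * k, 2 * k * k + extra):
            if survival_ratio(N, k) < Fraction(1, 2 ** (k + 1)):
                return False
    return True


print(check_lower_bound())
```

The ratio is the probability that a fixed k-set lies inside a uniformly random N-subset of a 2N-set, which is how the bound enters the sketch proof in Section 3.1.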

## 3. The main algorithm

In this section we prove our main approximate counting result, Theorem 1. We will make use of an algorithm with a weaker approximation guarantee, whose properties are stated in Lemma 18; we will prove this lemma in Section 4.

### 3.1. Sketch proof

We first sketch a toy argument for the purpose of illustration. Suppose for convenience that our input hypergraph has vertices for some integer . Let be a suitably large integer, and take independent uniformly random subsets subject to for all . It is not hard to show using Lemma 14 that for all . Thus, using Hoeffding’s inequality (Lemma 11), we can show that the total number of edges is concentrated around its mean of roughly . It follows that, with high probability, .
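A single expansion step of this toy argument can be sketched as follows. This is only an illustration under stated assumptions: edges are counted by brute force rather than through the oracle, the rescaling by the exact survival probability is our reading of the sketch, and all function names are hypothetical.

```python
import random
from fractions import Fraction
from itertools import combinations
from math import comb


def induced_edge_count(edges, subset):
    """Number of edges lying entirely inside `subset` (brute force)."""
    inside = set(subset)
    return sum(1 for e in edges if set(e) <= inside)


def survival_probability(n: int, k: int) -> Fraction:
    """P(a fixed k-set lies in a uniform (n/2)-subset); needs n even, n/2 >= k."""
    half = n // 2
    return Fraction(comb(n - k, half - k), comb(n, half))


def expansion_estimate(vertices, edges, k: int, t: int, rng) -> float:
    """Estimate e(G) from t independent uniformly random half-size subsets."""
    n = len(vertices)
    p = survival_probability(n, k)
    total = sum(
        induced_edge_count(edges, rng.sample(vertices, n // 2)) for _ in range(t)
    )
    return float(Fraction(total, t) / p)


def exact_mean_over_all_halves(vertices, edges) -> Fraction:
    """Average of e(G[X]) over every half-size subset X (tiny inputs only)."""
    halves = list(combinations(vertices, len(vertices) // 2))
    return Fraction(sum(induced_edge_count(edges, X) for X in halves), len(halves))


vertices = list(range(8))
edges = [(0, 1, 2), (1, 2, 3), (4, 5, 6), (0, 4, 7), (2, 5, 7)]
print(expansion_estimate(vertices, edges, k=3, t=2000, rng=random.Random(0)))
```

Averaging over all half-size subsets recovers the survival probability times e(G) exactly, which is why dividing by that probability gives an unbiased estimate.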

Repeating this expansion procedure yields the following (bad) algorithm. We maintain a list of pairs , where is positive and , and we preserve the invariant with high probability. (We expect the quality of approximation to degrade as the algorithm runs, but we ignore this subtlety in our sketch.) Initially, we take , which clearly satisfies this invariant. At each stage, for each pair , we independently choose uniformly random subsets subject to for all , as above. We then delete from and replace it by . Thus, as we proceed, grows, but the sets in ’s entries become smaller, and the invariant is maintained. Eventually, the entries of become so small that for all , we can use cIND to count quickly by brute force, and at this point we are done.

The problem with the algorithm described above is that in order to maintain the invariant with high probability, we must take , and to bring the vertex sets in down to a manageable size we require expansion operations. Thus our final list will have length , resulting in an algorithm with superpolynomial running time. We avoid this problem by exploiting a statistical technique called importance sampling, previously applied to the case by Beame et al. [6]. Given a coarse estimate of each , which need only be accurate to within a large multiplicative factor, this technique allows us to prune to a manageable length in time, while maintaining the invariant with high probability. Our algorithm for this, Trim, gives a substantially shorter list than the algorithm used in [6], thereby improving our running time.

To use this technique, we need the ability to find such coarse estimates. Beame et al. [6] gave a method to find these in the case, which we substantially generalise to apply in our setting. Details of our coarse approximation algorithm, Coarse, can be found in Section 4.

Unlike [6], we also use these coarse estimates to improve the efficiency of our expansion procedure. The algorithm described above treats all pairs equally, expanding each one into smaller pairs. Thus grows by a factor of in a single expansion step. Our real algorithm will work differently. For each pair , we will choose the number of replacement pairs according to our coarse estimate of . We will take to be large if accounts for a large proportion of , and small otherwise; thus we only spend a lot of time processing a pair if it is “important” (see Halve in Section 3). This optimisation, together with the improved importance sampling procedure discussed above, drops our running time by a factor of roughly . We therefore improve the results of [6] even in the case.

### 3.2. The main algorithm

We first prove a technical lemma, which should be read as follows. We are given the ability to sample from bounded probability distributions on . We wish to estimate the sum of their means using as few samples as possible, and we are given access to a crude estimate of the mean of each with multiplicative error (for “bias”). Lemma 15 says that we can do so to within relative error , with failure probability at most , by sampling times from for each . We will use this lemma in both Trim and Halve.

###### Lemma 15.

Let , let , and let . For all , let be a probability distribution on with mean . For all , let satisfy , and let

$$t_i = \left\lceil \frac{4bM_i\log(2/\delta)}{\xi^2\sum_j \hat\mu_j} \right\rceil.$$

Let be independent random variables with . Then with probability at least ,

$$\sum_{i=1}^{q}\sum_{j=1}^{t_i}\frac{X_{i,j}}{t_i} \in (1\pm\xi)\sum_{i=1}^{q}\mu_i.$$

Note that while Lemma 15 does not require a lower bound on , without one it is useless as may be arbitrarily large. When we apply Lemma 15, we will do so with for all .

###### Proof.

We will apply a form of Bernstein’s inequality (Lemma 12). Let

$$X = \sum_{i=1}^{q}\sum_{j=1}^{t_i}\frac{X_{i,j}}{t_i}, \qquad x = \sum_{i=1}^{q}\mu_i.$$

Thus we seek to prove . Note that , and that

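$$\sum_{i=1}^{q}\sum_{j=1}^{t_i}\mathbb{E}\big((X_{i,j}/t_i)^2\big) \le \sum_{i=1}^{q}\sum_{j=1}^{t_i}\frac{1}{t_i^2}\,\mathbb{E}(M_iX_{i,j}) = \sum_{i=1}^{q}\frac{M_i\mu_i}{t_i}.$$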

Let , so that for all . Then by Lemma 12, applied to the variables and with , it follows that

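$$\begin{aligned}
\mathbb{P}(|X-x|\ge\xi x) &\le 2\exp\left(-\frac{3\xi^2x^2}{6\sum_i\frac{M_i\mu_i}{t_i} + 2M\xi x}\right) \le 2\exp\left(-\frac{3\xi^2x^2}{2\max\big\{6\sum_i\frac{M_i\mu_i}{t_i},\ 2M\xi x\big\}}\right)\\
&= \max\left\{2\exp\left(-\frac{\xi^2x^2}{4\sum_i\frac{M_i\mu_i}{t_i}}\right),\ 2\exp\left(-\frac{3\xi x}{4M}\right)\right\}.
\end{aligned}\tag{1}$$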

We now bound the exponents of each term in the max. By our choice of ’s, we have

Since for all , we have , so

$$\frac{\xi^2x^2}{4\sum_i\frac{M_i\mu_i}{t_i}} \ge \log(2/\delta) \ge \ln(2/\delta).\tag{2}$$

Moreover, again by our choice of ’s we have

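$$M = \max\left\{\frac{M_i}{t_i} : i\in[q]\right\} \le \max\left\{\frac{\xi^2\sum_j\hat\mu_j}{4b\log(2/\delta)} : i\in[q]\right\} \le \frac{\xi^2x}{4\log(2/\delta)},$$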

so

$$\frac{3\xi x}{4M} \ge \frac{3\log(2/\delta)}{\xi} > \ln(2/\delta).\tag{3}$$

The result therefore follows from (1), (2) and (3). ∎
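The estimator analysed in Lemma 15 can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes each distribution is supported on the interval from 0 to M_i (as the second-moment bound in the proof suggests), takes the logarithm to base 2, and uses hypothetical function and parameter names.

```python
import math
from typing import Callable, List, Sequence


def sample_counts(
    M: Sequence[float], mu_hat: Sequence[float], b: float, xi: float, delta: float
) -> List[int]:
    """Number of samples t_i per distribution, following the formula for t_i."""
    total_hat = sum(mu_hat)  # sum of the crude mean estimates
    log_term = math.log2(2 / delta)  # base-2 logarithm (assumed convention)
    return [math.ceil(4 * b * M_i * log_term / (xi**2 * total_hat)) for M_i in M]


def estimate_sum_of_means(
    samplers: Sequence[Callable[[], float]],
    M: Sequence[float],
    mu_hat: Sequence[float],
    b: float,
    xi: float,
    delta: float,
) -> float:
    """Sum over i of the empirical mean of t_i independent draws from sampler i."""
    counts = sample_counts(M, mu_hat, b, xi, delta)
    estimate = 0.0
    for draw, t_i in zip(samplers, counts):
        estimate += sum(draw() for _ in range(t_i)) / t_i
    return estimate
```

Distributions with a larger upper bound M_i receive proportionally more samples, while the shared denominator means that the total sampling effort shrinks as the crude estimates of the means grow.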

Recall from our sketch proof in Section 3.1 that our algorithm will maintain a weighted list of induced subgraphs of steadily decreasing size. For convenience, we will also include coarse estimates of the edge count of each graph in . Rather than set out the format of this list each time we use it, we define it formally now.

###### Definition 16.

Let be a hypergraph, let be an integer, and let be rational. Then a -list is a list of triples such that and are positive rational numbers, with , and . For any -list , we define

$$Z(L) := \sum_{(w,S,\hat e)\in L} w\,e(G[S]).$$

Initially, we will take where , so that . As the algorithm progresses, will remain a good approximation to , and eventually we will be able to compute it efficiently. We are now ready to set out our importance sampling algorithm, Trim, which we will use to keep the length of low.
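To make the quantity Z(L) concrete, the following sketch evaluates it by brute force on a small hypergraph. The representation of a list as Python tuples (weight, vertex set, coarse estimate), the single-entry initial list with weight 1, and the function names are assumptions made for illustration; the real algorithm never counts edges directly in this way.

```python
from fractions import Fraction


def induced_edge_count(edges, subset) -> int:
    """Number of edges of G lying entirely inside `subset` (brute force)."""
    inside = set(subset)
    return sum(1 for e in edges if set(e) <= inside)


def Z(weighted_list, edges) -> Fraction:
    """Weighted edge count: the sum of w * e(G[S]) over triples (w, S, e_hat)."""
    total = Fraction(0)
    for w, S, _e_hat in weighted_list:  # the coarse estimate e_hat is not used
        total += w * induced_edge_count(edges, S)
    return total


edges = [(0, 1, 2), (1, 2, 3), (4, 5, 6), (0, 4, 7), (2, 5, 7)]
vertices = frozenset(range(8))

# Assumed initial list: one entry of weight 1 on the whole vertex set.
initial = [(Fraction(1), vertices, Fraction(5))]
print(Z(initial, edges))
```

With the single-entry list, Z(L) is simply the number of edges of the hypergraph, which is the invariant that the later steps aim to preserve approximately.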

Algorithm .

Input: is an -vertex -hypergraph, where is a power of 2, to which Trim has (only) coloured oracle access. is a rational number with , and is a positive integer. is a -list with and . is a rational number with , and is a rational number with .

Behaviour: outputs a -list satisfying the following properties. . With probability at least , .

Calculate and

(Every significant entry of will be contained in exactly one , and entries satisfy .)

For each , calculate

For each , calculate a multiset as follows. If , let . Otherwise, sample entries from independently and uniformly at random, let , and let .

Form by concatenating the multisets in arbitrary order, and return .

Note that Trim improves significantly on the importance sampling algorithm of [6, Lemma 2.5], which in this setting outputs a list of length .

###### Lemma 17.

behaves as claimed above, has running time , and does not invoke cIND.

###### Proof.

Running time. It is clear that does not invoke cIND. Recall that we work with the word-RAM model, so we can carry out elementary arithmetic operations on -sized numbers in time. Thus step (T1) takes time , and step (T2) takes time . Since , steps (T3) and (T4) take time . The required bounds follow.

Correctness. Since every entry of is an entry of (perhaps with a different first element), and is a -list, is also a -list. We next prove (a). We have

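$$\begin{aligned}
|L'| &= \sum_{|i|\le a}|L'_i| \le \sum_{|i|\le a}t_i \le \sum_{|i|\le a}\left(1 + \frac{16b^2\,2^i|L_i|\log(2/\delta)}{\xi^2W}\right)\\
&= 2a+1 + \frac{16b^2\log(2/\delta)}{\xi^2W}\sum_{|i|\le a}2^i|L_i|.
\end{aligned}\tag{4}$$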

Recall from the definition of that, for all , we have , so

 ∑|i|≤a2i|Li|≤∑|i