1 Introduction
A prominent class of problems in algorithmic graph theory consists of finding a subgraph with certain properties in an input graph $G$, if one exists. Some variations of this problem can be solved in polynomial time (e.g., detecting a triangle), while the general problem is NP-complete since it generalizes the Clique problem. In recent years, there has been increasing interest in understanding the complexity of such subgraph detection problems in weighted graphs, where either the vertices or the edges are assigned integral weight values, and the goal is either to find a subgraph of a given form which optimizes the total weight of its elements, or alternatively, to find a subgraph whose total weight matches a prescribed value.
Incorporating weights in the problem definition can have a significant effect on computational complexity. For example, determining whether an unweighted $n$-vertex graph has a triangle can be done in time $O(n^{\omega})$ (where $\omega$ is the exponent of matrix multiplication) [16], while for the analogous weighted problem of finding a triangle of minimum edge-weight, no algorithm with running time $O(n^{3-\varepsilon})$ is known for any $\varepsilon > 0$. Some popular conjectures in fine-grained complexity theory even postulate that no such algorithms exist [31]. Weights also affect the best-possible exponential running times of algorithms solving NP-hard problems: the current-fastest algorithm for the NP-complete Hamiltonian Cycle problem in undirected graphs runs in time $O(1.657^n)$ [4], while for its weighted analogue, Traveling Salesperson, no algorithm with running time $O((2-\varepsilon)^n)$ is known for general undirected graphs (cf. [24]).
In this work we investigate how the presence of weights in a problem formulation affects the compressibility and kernelization complexity of NP-hard problems. Kernelization is a subfield of parameterized complexity [7, 10] that investigates how much a polynomial-time preprocessing algorithm can compress an instance of an NP-hard problem, without changing its answer, in terms of a chosen complexity parameter.
For a motivating example of kernelization, we consider the Vertex Cover problem. For the unweighted variant, a kernelization algorithm based on the Nemhauser–Trotter theorem [26] can efficiently reduce an instance of the decision problem, asking whether a graph $G$ has a vertex cover of size at most $k$, to an equivalent one consisting of at most $2k$ vertices, which can therefore be encoded in $O(k^2)$ bits via its adjacency matrix. In the language of parameterized complexity, the unweighted Vertex Cover problem parameterized by the solution size $k$ admits a kernelization (self-reduction) to an equivalent instance on $O(k^2)$ bits. For the weighted variant of the problem, where an input additionally specifies a weight threshold $W$ and a weight function on the vertices, and the question is whether there is a vertex cover of size at most $k$ and weight at most $W$, the guarantee on the encoding size of the reduced instance is weaker. Etscheid et al. [11, Thm. 5] applied a powerful theorem of Frank and Tardos [13] to develop a polynomial-time algorithm that reduces any instance of Weighted Vertex Cover to an equivalent one with $O(k^2)$ edges, which nevertheless needs a larger (though still polynomial in $k$) number of bits to encode, due to the potentially large numbers occurring as vertex weights. The Weighted Vertex Cover problem, parameterized by solution size $k$, therefore has a kernel of $\mathrm{poly}(k)$ bits.
The overhead in the kernel size for the weighted problem is purely due to potentially large weights. This led Etscheid et al. [11] to ask in their conclusion whether this overhead in the kernelization sizes of weighted problems is necessary, or whether it can be avoided. As one of the main results of this paper, we will prove a lower bound showing that the kernelization complexity of some weighted problems is strictly larger than that of their unweighted counterparts.
Our results
We consider an edge-weighted variation of the Clique problem, parameterized by the number of vertices $n$:
Exact-Edge-Weight Clique (EEWC) Input: An undirected graph $G$, a weight function $w \colon E(G) \to \mathbb{N}$, and a target $t \in \mathbb{N}$. Question: Does $G$ have a clique of total edge-weight exactly $t$, i.e., a vertex set $S \subseteq V(G)$ such that $\{u,v\} \in E(G)$ for all distinct $u, v \in S$ and such that $\sum_{e \in \binom{S}{2}} w(e) = t$?
Our formulation of EEWC does not constrain the cardinality of the clique. This formulation will be convenient for our purposes, but we remark that by adjusting the weight function it is possible to enforce that any solution clique has a prescribed cardinality. Through such a cardinality restriction we can obtain a simple reduction from the problem with potentially negative weights to equivalent instances with weights from $\mathbb{N}$, by increasing all weights by a suitably large value and adjusting $t$ according to the prescribed cardinality. Note that an instance of EEWC can be reduced to an equivalent one where $G$ has all possible edges, by simply inserting each non-edge with a weight of $t+1$. Hence the difficulty of the problem stems from achieving the given target weight as the total weight of the edges spanned by $S$, not from the requirement that $S$ must be a clique.
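The completion step above can be sketched in a few lines of Python. The function names and the brute-force checker are ours, for illustration only; assigning weight $t+1$ to non-edges is one natural choice that makes any clique using such an edge overshoot the target.

```python
from itertools import combinations

def brute_force_eewc(n, w, t):
    """Check whether some vertex subset S of {0,...,n-1} is a clique of
    total edge-weight exactly t. Edges are the keys of w (frozenset pairs)."""
    for size in range(n + 1):
        for S in combinations(range(n), size):
            pairs = list(combinations(S, 2))
            if all(frozenset(p) in w for p in pairs):  # S must be a clique
                if sum(w[frozenset(p)] for p in pairs) == t:
                    return True
    return False

def complete_graph_reduction(n, w, t):
    """Insert every non-edge with weight t+1: any clique using such an
    edge exceeds the target, so the answer is preserved."""
    w2 = dict(w)
    for p in combinations(range(n), 2):
        w2.setdefault(frozenset(p), t + 1)
    return n, w2, t
```

On a toy instance, the answer is the same before and after completing the graph, for both a YES and a NO target.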
EEWC is a natural extension of Zero-Weight Triangle [1], which has been studied because it inherits fine-grained hardness from both 3-SUM [33] and All-Pairs Shortest Paths [30, Footnote 3]. EEWC has previously been considered by Abboud et al. [2] as an intermediate problem in their W[1]-membership reduction from $k$-Sum to Clique. Vassilevska Williams and Williams [33] considered a variation of this problem with weights drawn from a finite field. The related problem of detecting a triangle of negative edge weight is central in the field of fine-grained complexity for its subcubic equivalence [32] to All-Pairs Shortest Paths. Another example of an edge-weighted subgraph detection problem with an exact requirement on the weight of the target subgraph is Exact-Edge-Weight Perfect Matching, which can be solved using algebraic techniques [23, §6] and has been used as a subroutine in subgraph isomorphism algorithms [22, Proposition 3.1].
The unweighted version of EEWC, obtained by setting all edge weights to $1$, is NP-complete because it is essentially equivalent to the Clique problem. When using the number of vertices $n$ as the complexity parameter, the problem admits a kernelization of size $O(n^2)$ obtained by simply encoding the instance via its adjacency matrix. We prove the following lower bound, showing that the kernelization complexity of the edge-weighted version is a factor $n$ larger. The lower bound even holds against generalized kernelizations (see Definition 2).
The Exact-Edge-Weight Clique problem parameterized by the number of vertices $n$ does not admit a generalized kernelization of $O(n^{3-\varepsilon})$ bits for any $\varepsilon > 0$, unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$.
Intuitively, the lower bound exploits the fact that the weight value of each of the $\binom{n}{2}$ edges in the instance may be a large integer requiring $\Theta(n)$ bits to encode. We also provide a randomized kernelization which matches this lower bound.
There is a randomized polynomial-time algorithm that, given an $n$-vertex instance of Exact-Edge-Weight Clique, outputs an instance of bitsize $O(n^3)$, in which each number is bounded by $2^{O(n)}$, that is equivalent to the input with high probability. Moreover, if the input is a YES-instance, then the output is always a YES-instance.

The proof is based on the idea that taking the weight function modulo a random prime preserves the answer to the instance with high probability. We adapt the argument by Harnik and Naor [14] to show that it suffices to pick a prime of magnitude $2^{\Theta(n)}$. As a result, each weight can be encoded with just $O(n)$ bits.
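The modular compression step can be sketched as follows. This is a minimal illustration of the one-sided-error behavior, not the paper's exact procedure; the prime magnitude `2**(2*n)` and all identifiers are illustrative choices of ours.

```python
import random

def is_prime(m):
    """Deterministic Miller-Rabin; exact for all m below ~3.3 * 10^24."""
    if m < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if m % p == 0:
            return m == p
    d, s = m - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for a in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        x = pow(a, d, m)
        if x in (1, m - 1):
            continue
        for _ in range(s - 1):
            x = x * x % m
            if x == m - 1:
                break
        else:
            return False
    return True

def compress_weights(w, t, n):
    """Replace every weight by its residue modulo a random prime of
    magnitude 2^Theta(n). Any subset achieving total weight t still
    achieves t mod p, so YES-instances are always preserved."""
    lo = 2 ** (2 * n)  # illustrative magnitude choice
    while True:
        p = random.randrange(lo, 2 * lo)
        if is_prime(p):
            break
    return {e: wt % p for e, wt in w.items()}, t % p, p
```

Since residues are taken of every weight and of the target simultaneously, a certificate summing to $t$ keeps summing to $t \bmod p$; only NO-instances can erroneously turn into YES-instances.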
It is noteworthy that the algorithm above can produce only false positives; therefore, instead of using randomization, we can turn it into a co-nondeterministic algorithm which guesses the correct values of the random bits. The framework of cross-composition excludes not only deterministic kernelizations, but also co-nondeterministic ones [9]; thus the lower bound from Theorem 1 indeed makes the presented algorithm tight.
Together, the two theorems above pin down the kernelization complexity of Exact-Edge-Weight Clique, and prove it to be a factor $n$ larger than for the unit-weight case. For Clique, the kernelization of $O(n^2)$ bits given by the adjacency-matrix encoding cannot be improved to $O(n^{2-\varepsilon})$ bits for any $\varepsilon > 0$, as was shown by Dell and van Melkebeek [9].
We extend our results to the hypergraph setting, which is defined as follows: given a $d$-uniform hypergraph $H$ (for fixed $d \ge 2$) with non-negative integer weights on the hyperedges, and a target value $t$, test if there is a vertex set $S$ for which each size-$d$ subset is a hyperedge (so that $S$ is a hyperclique) such that the sum of the weights of the hyperedges contained in $S$ is exactly $t$. By a bootstrapping reduction using Theorem 1, we prove that Exact-Edge-Weight $d$-Uniform Hyperclique does not admit a generalized kernel of size $O(n^{d+1-\varepsilon})$ for any $\varepsilon > 0$ unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$, while the randomized hashing technique yields a randomized kernelization of size $O(n^{d+1})$.
We can view the edge-weighted (hyper)clique problem on a (hyper)graph $H$ as a weighted constraint satisfaction problem (CSP) with weights from $\mathbb{N}$, by introducing a binary variable for each vertex, and a weighted constraint for each size-$d$ subset $S$ of vertices, which is satisfied precisely when all variables for $S$ are set to true. If $S$ is a (hyper)edge then the weight of the constraint on $S$ equals the weight of $S$; if $S$ is not a hyperedge of $H$, then the weight of the constraint on $S$ is set to $t+1$ to prevent all its vertices from being simultaneously chosen. Under this definition, $H$ has a (hyper)clique of edge-weight $t$ if and only if there is an assignment to the variables for which the total weight of satisfied constraints is $t$. Via this interpretation, the lower bounds for EEWC yield lower bounds on the kernelization complexity of weighted variants of CSP. We employ a recently introduced framework [18] of reductions among different CSPs whose constraint languages have the same maximum degree of their characteristic polynomials, to transfer these lower bounds to other CSPs (see Section 3.3 for definitions). We obtain tight kernel bounds when parameterizing the exact-satisfaction-weight version of CSP by the number of variables $n$, again using random prime numbers to obtain upper bounds. Our lower bounds for Exact-Edge-Weight $d$-Uniform Hyperclique transfer to all CSPs of degree $d \ge 2$. In a degree-1 CSP each constraint depends on exactly one variable; therefore its exact-weighted variant is equivalent to the Subset Sum problem, for which we also provide a tight lower bound.

Subset Sum parameterized by the number of items $n$ does not admit a generalized kernelization of size $O(n^{2-\varepsilon})$ for any $\varepsilon > 0$, unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$.
The lower bound above tightens a result of Etscheid et al. [11, Theorem 14], who ruled out (standard) kernelizations for Subset Sum of size $O(n^{2-\varepsilon})$ assuming the Exponential Time Hypothesis. Our reduction, conditioned on the incomparable assumption $\mathsf{NP} \not\subseteq \mathsf{coNP/poly}$, additionally rules out generalized kernelizations that compress into an instance of a potentially different problem. Note that the new lower bound implies that the input data of Subset Sum cannot be efficiently encoded in a more compact way, whereas the previous lower bound relies on the particular way the input is encoded in the natural formulation of the problem. On the other hand, a randomized kernel of size $O(n^2)$ is known [14].
The results described so far characterize the kernelization complexity of broad classes of weighted constraint satisfaction problems in which the goal is to find a solution for which the total weight of satisfied constraints is exactly equal to a prescribed value. We also broaden our scope and investigate the maximization or minimization setting, in which the question is whether there is a solution whose cost is at least, or at most, a prescribed value. Some of our upper-bound techniques can be adapted to this setting: using a procedure by Nederlof, van Leeuwen and de Zwaan [25], a maximization problem can be reduced to a polynomial number of exact queries. This leads, for example, to a Turing kernelization (cf. [12]) for the weight-maximization version of $d$-Uniform Hyperclique which decides an instance in randomized polynomial time using queries of bounded size to an oracle for an auxiliary problem. We do not have lower bounds in the maximization regime.
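As a minimal illustration of reducing maximization to exact queries, consider the easy regime where the total weight is polynomially bounded; the procedure of Nederlof, van Leeuwen and de Zwaan handles large weights with more refined techniques, so the scan below is only a sketch with names of our choosing.

```python
def max_weight_via_exact_oracle(exact, total_weight_bound):
    """Reduce maximization to exact queries: scan candidate targets from
    the largest conceivable total weight downward, and return the first
    value accepted by the exact-weight oracle (None if none exists)."""
    for t in range(total_weight_bound, -1, -1):
        if exact(t):
            return t
    return None
```

When the total weight is bounded by a polynomial in the instance size, this makes only polynomially many oracle calls, matching the shape of a Turing kernelization.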
In an attempt to understand the relative difficulty of obtaining an exact target weight versus maximizing the target weight, we finally investigate different models of weight reduction for the Weighted Vertex Cover problem studied extensively in earlier works [6, 11, 25]. We consider the problem on bipartite graphs, where an optimal solution can be found in polynomial time, but we investigate whether a weight function can be efficiently compressed while either preserving (a) the collection of minimum-weight vertex covers, or (b) the relative ordering of total weight for all inclusion-minimal vertex covers. We give a polynomial-time algorithm for case (a) which reduces to a weight function with a polynomially bounded range, using a relation to matchings, but show that in general it is impossible to achieve (b) with a weight function of range $2^{o(n)}$, by utilizing lower bounds on the number of distinct threshold functions.
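The impossibility argument for case (b) follows a standard counting pattern, which can be summarized as follows; the displayed bounds show the shape of such arguments rather than exact constants from our proof.

```latex
% Illustrative counting bound: a weight function with range \{1,\dots,R\} on
% n vertices admits at most R^n distinct choices, while there are
% 2^{\Omega(n^2)} pairwise distinct threshold functions on n variables.
\[
  R^{\,n} \;\ge\; 2^{\Omega(n^2)}
  \quad\Longrightarrow\quad
  R \;\ge\; 2^{\Omega(n)},
\]
% so weight functions of range 2^{o(n)} cannot realize all required orderings.
```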
Organization
We begin with short preliminaries containing the crucial definitions. We prove our main theorem in Section 3 by presenting a cross-composition of degree 3 into Exact-Edge-Weight Clique and employing it to obtain kernelization lower bounds for $d$-uniform hypergraphs. This section also contains the kernelization lower bound for Subset Sum, as well as a generalization of these results to Boolean CSPs. Next, in Section 4 we focus on bipartite Weighted Vertex Cover and the difficulty of compressing weight functions. The proofs of statements marked with (★) are located in the appendix. The kernel upper bounds, including the proof of the randomized kernelization theorem, together with the Turing kernelization for maximization problems, are collected in Appendix B.
2 Preliminaries
We denote the set of natural numbers including zero by $\mathbb{N}$, and the set of positive natural numbers by $\mathbb{N}^+$. For a positive integer $n$ we define $[n] = \{1, \dots, n\}$. For a set $S$ and integer $k$ we denote by $\binom{S}{k}$ the collection of all size-$k$ subsets of $S$. All logarithms we employ have base $2$. Given a set $U$ and a weight function $w \colon U \to \mathbb{N}$, for a subset $S \subseteq U$ we denote $w(S) = \sum_{u \in S} w(u)$.
All graphs we consider are undirected and simple. A (standard) graph $G$ has a vertex set $V(G)$ and edge set $E(G)$. For $d \ge 2$, a $d$-uniform hypergraph $H$ consists of a vertex set $V(H)$ and a set of hyperedges $E(H) \subseteq \binom{V(H)}{d}$, that is, each hyperedge is a set of exactly $d$ vertices. Hence a $2$-uniform hypergraph is equivalent to a standard graph. A clique in a $d$-uniform hypergraph $H$ is a vertex set $S \subseteq V(H)$ such that for each $X \in \binom{S}{d}$ we have $X \in E(H)$: each possible hyperedge among the vertices of $S$ is present. A vertex cover of a graph $G$ is a vertex set containing at least one endpoint of each edge. A vertex cover is inclusion-minimal if no proper subset is a vertex cover.
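For concreteness, the hyperclique condition can be checked directly from this definition; the helper below is an illustrative sketch with names of our choosing.

```python
from itertools import combinations

def is_hyperclique(S, hyperedges, d):
    """A vertex set S is a clique in a d-uniform hypergraph iff every
    size-d subset of S occurs among the hyperedges."""
    return all(frozenset(X) in hyperedges for X in combinations(S, d))
```

Note that sets with fewer than $d$ vertices are cliques vacuously, mirroring the definition.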
Parameterized complexity
A parameterized problem $\mathcal{Q}$ is a subset of $\Sigma^* \times \mathbb{N}^+$, where $\Sigma$ is a finite alphabet.
[Generalized kernel] Let $\mathcal{Q}, \mathcal{Q}'$ be parameterized problems and let $f \colon \mathbb{N}^+ \to \mathbb{N}^+$ be a computable function. A generalized kernel for $\mathcal{Q}$ into $\mathcal{Q}'$ of size $f(k)$ is an algorithm that, on input $(x, k) \in \Sigma^* \times \mathbb{N}^+$, takes time polynomial in $|x| + k$ and outputs an instance $(x', k')$ such that:

$|x'|$ and $k'$ are bounded by $f(k)$, and

$(x', k') \in \mathcal{Q}'$ if and only if $(x, k) \in \mathcal{Q}$.

The algorithm is a kernel for $\mathcal{Q}$ if $\mathcal{Q}' = \mathcal{Q}$. It is a polynomial (generalized) kernel if $f$ is a polynomial.
[Linear-parameter transformations] Let $\mathcal{Q}$ and $\mathcal{Q}'$ be parameterized problems. We say that $\mathcal{Q}$ is linear-parameter transformable to $\mathcal{Q}'$ if there exists a polynomial-time computable function $f \colon \Sigma^* \times \mathbb{N}^+ \to \Sigma^* \times \mathbb{N}^+$, such that for all $(x, k) \in \Sigma^* \times \mathbb{N}^+$, with $(x', k') = f(x, k)$: (a) $(x, k) \in \mathcal{Q}$ if and only if $(x', k') \in \mathcal{Q}'$, and (b) $k' \in O(k)$. The function $f$ is called a linear-parameter transformation.
We employ a linear-parameter transformation for proving the lower bound for Subset Sum. For the other lower bounds we use the framework of cross-composition [5] directly.
[Polynomial equivalence relation, [5, Def. 3.1]] Given an alphabet $\Sigma$, an equivalence relation $\mathcal{R}$ on $\Sigma^*$ is called a polynomial equivalence relation if the following conditions hold.

There is an algorithm that, given two strings $x, y \in \Sigma^*$, decides whether $x$ and $y$ belong to the same equivalence class in time polynomial in $|x| + |y|$.

For any finite set $S \subseteq \Sigma^*$, the equivalence relation partitions the elements of $S$ into a number of classes that is polynomially bounded in the size of the largest element of $S$.
[Degree-$d$ cross-composition] Let $L \subseteq \Sigma^*$ be a language, let $\mathcal{R}$ be a polynomial equivalence relation on $\Sigma^*$, and let $\mathcal{Q}$ be a parameterized problem. A degree-$d$ OR-cross-composition of $L$ into $\mathcal{Q}$ with respect to $\mathcal{R}$ is an algorithm that, given $t$ instances $x_1, \dots, x_t \in \Sigma^*$ of $L$ belonging to the same equivalence class of $\mathcal{R}$, takes time polynomial in $\sum_{i=1}^{t} |x_i|$ and outputs an instance $(y, k) \in \Sigma^* \times \mathbb{N}^+$ such that:

the parameter $k$ is bounded by $O(t^{1/d} \cdot (\max_i |x_i|)^c)$, where $c$ is some constant independent of $t$, and

$(y, k) \in \mathcal{Q}$ if and only if there is an $i \in [t]$ such that $x_i \in L$.
[[5, Theorem 3.8]] Let $L \subseteq \Sigma^*$ be a language that is NP-hard under Karp reductions, let $\mathcal{Q}$ be a parameterized problem, and let $\varepsilon > 0$ be a real number. If $L$ has a degree-$d$ OR-cross-composition into $\mathcal{Q}$, and $\mathcal{Q}$ parameterized by $k$ has a polynomial (generalized) kernelization of bitsize $O(k^{d-\varepsilon})$, then $\mathsf{NP} \subseteq \mathsf{coNP/poly}$.
3 Kernel lower bounds
3.1 Exact-Edge-Weight Clique
In this section we show that Exact-Edge-Weight Clique parameterized by the number of vertices $n$ in the given graph does not admit a generalized kernel of size $O(n^{3-\varepsilon})$ for any $\varepsilon > 0$, unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$. We use the framework of cross-composition to establish the kernelization lower bound [5]. We will use the NP-hard Red-Blue Dominating Set (RBDS) problem as the starting problem for the cross-composition. Observe that RBDS is NP-hard because it is equivalent to Set Cover and Hitting Set [20].
Red-Blue Dominating Set (RBDS) Input: A bipartite graph $G$ with a bipartition of $V(G)$ into sets $R$ (red vertices) and $B$ (blue vertices), and a positive integer $k$. Question: Does there exist a set $D \subseteq R$ with $|D| \le k$ such that every vertex in $B$ has at least one neighbor in $D$?
The following lemma forms the heart of the lower bound. It shows that an instance of EEWC on $O(\sqrt[3]{t} \cdot \mathrm{poly}(n))$ vertices can encode the logical OR of a sequence of $t$ RBDS instances of size $n$ each. Roughly speaking, this should be interpreted as follows: when $t$ is large compared to $n$, each of the roughly $t^{2/3}$ edge weights of the constructed graph encodes many useful bits of information, in order to allow the instance on $O(t^{2/3})$ edges to represent all $t$ inputs.
There is a polynomial-time algorithm that, given integers $n, k$ and a set of $t$ instances of RBDS, each with $n$ red vertices, $n$ blue vertices, and solution-size bound $k$, constructs an undirected graph $G'$, an integer target $t'$, and a weight function on the edges of $G'$ such that:

the graph $G'$ contains a clique of total edge-weight exactly $t'$ if and only if at least one of the input instances has a red-blue dominating set of size at most $k$,

the number of vertices in $G'$ is $O(\sqrt[3]{t} \cdot \mathrm{poly}(n + k))$, and

the values of $t'$ and of the edge weights depend only on $t$, $n$, and $k$.
Proof.
We describe the construction of $G'$; it will be easy to see that it can be carried out in polynomial time. Label the vertices in each red set arbitrarily, and similarly label the vertices in each blue set. We construct a graph $G'$ with an edge-weight function and an integer target $t'$ such that $G'$ has a clique of total edge weight exactly $t'$ if and only if some input instance is a YES-instance of RBDS.
In the following construction we interpret edge weights as vectors written in a suitably large base, which will be converted to integers later. Starting from an empty graph, we construct $G'$ as follows; see Figure 1.
For each , create a vertex . The vertices form an independent set, so that any clique in contains at most one vertex .

For each index, create a vertex set and insert edges of weight $0$ between all possible pairs of vertices within it.

For each , create a vertex . The vertices form an independent set, so that any clique in contains at most one vertex .

For each , for each , insert an edge between and of weight .
The next step is to ensure that the neighborhood of a vertex in is captured in the weights of the edges which are incident on in .

For each , for each , insert an edge between and .

The weight of each edge is a vector of length , out of which the least significant positions are divided into blocks of length each, and the most significant position is 1. The numbering of blocks as well as positions within a given block start with the least significant position.
For each such edge, the weight is defined position by position, based on the neighborhood of the corresponding red vertex: the position of a given block is set to $1$ if the corresponding blue vertex of the corresponding instance is adjacent to that red vertex, and to $0$ otherwise. Intuitively, the vector representing the weight of the edge is formed by a $1$ followed by the concatenation of blocks, such that each block is the incidence vector describing which of the blue vertices of the corresponding instance are adjacent to the red vertex.
Note that the blue vertices of an input instance are represented by a single blue vertex in $G'$. The difference between distinct blue vertices is encoded via different positions of the weight vectors. The most significant position of the weight vectors, which is always set to $1$ for these edges, will be used to keep track of the number of red vertices in a solution to RBDS.
The graph constructed so far has a mechanism to select the first index of an instance (by choosing one vertex of the first independent set), to select the second index (by choosing vertices of one of the Step-2 sets), and to select the third index (by choosing one vertex of the second independent set). The next step in the construction adds weighted edges, of which a solution clique in $G'$ will contain exactly one. The weight vector for such an edge is chosen so that the domination requirements from all RBDS instances whose third index differs (and which are therefore not selected) can be satisfied "for free".

For each , insert an edge between and .

As in Step 6, the weight of the edge is a tuple consisting of the most significant position followed by the blocks. There is a $1$ at the most significant position, the block of the selected third index consists of zeros, and the other blocks are filled with ones. Hence the weight of the edge does not depend on the remaining indices.
To be able to ensure that $G'$ has a clique of weight exactly $t'$ if some input instance has a solution, we need to introduce padding numbers which may be used as part of the solution to EEWC.
For each position of a weight vector, add a vertex set of padding vertices to $G'$. Recall that $k$ is the upper bound on the solution size for RBDS.

For each , for each , for each , add an edge . The weight of edge has value 1 at the position and zeros elsewhere.

For each padding vertex, add an edge of weight $0$ to every vertex which was not already adjacent to it.
We define the target weight $t'$ to be the vector with the appropriate value at each position, which satisfies Condition 3. Counting the vertices of $G'$: Steps 1 and 3 contribute the two independent sets, Step 2 contributes the red-vertex sets, and Step 9 contributes the padding vertices. Hence Condition 2 is satisfied. It remains to verify that $G'$ has a clique of total edge weight exactly $t'$ if and only if some input instance has a solution of Red-Blue Dominating Set of size at most $k$. Before proving this property, we show the following claim, which implies that no carries occur when summing up the weights of the edges of a clique in $G'$.
For any clique in $G'$, for any position of a weight vector, there are at most a bounded number of edges of the clique whose weight vector has a $1$ at that position, and all other weight vectors have a $0$ at that position. By construction, the entries of the vector encoding an edge weight are either $0$ or $1$.
By Steps 1 and 3, a clique in $G'$ contains at most one vertex from each of the two independent sets constructed there. Since $G'$ does not have edges between vertices in distinct Step-2 sets, any clique in $G'$ consists of at most one vertex from each independent set, a subset of a single Step-2 set, and a subset of the padding vertices. For any fixed position, the only edge-weight vectors which can have a $1$ at that position come from the edges incident on these vertices. As this yields a bounded number of edges that possibly have a $1$ at the position, the claim follows.
The preceding claim shows that when we convert each edge-weight vector to an integer by interpreting the vector as its representation in the chosen base, no carries occur when computing the sum of the edge-weights of a clique. Hence the integer edge-weights of a clique sum to the integer represented by the target vector if and only if the edge-weight vectors of the edges of the clique sum to the target vector. In the remainder, it therefore suffices to prove that there is a YES-instance of RBDS among the inputs if and only if $G'$ has a clique whose edge-weight vectors sum to the target vector. We prove these two implications.
If some input graph has a red-blue dominating set of size at most $k$, then $G'$ has a clique of edge-weight exactly $t'$. Let $D$, of size at most $k$, be a dominating set of that input graph. We define a vertex set as follows. Initialize it, and for each vertex of $D$, add the corresponding vertex to it.
We claim that is a clique in . To see this, note that is a clique by Step 2. Vertex is adjacent to all vertices of by Step 4. Vertex is adjacent to all vertices of by Step 5. By Step 8 there is an edge between and .
Let us consider the weight of clique . Since is a dominating set of , if we sum up the weight vectors of the edges for , then by Step 6 we get a value of at least one at each position of block . The most significant position of the resulting sum vector has value . By Step 8 the weight vector of the edge consists of all ones, except for block and the most significant position, where the value is zero. Thus adding the edge weight of to the previous sum ensures that each block has value at least everywhere, whereas the most significant position has value . All other edges spanned by have weight . Letting denote the vector obtained by summing the weights of the edges of clique , we therefore find that has value as its most significant position and value at least everywhere else.
Next we add some additional vertices to the set to get a clique of weight exactly $t'$. By Step 11, vertices from the padding sets are adjacent to all other vertices in the graph and can be added to any clique. All edges incident on a padding vertex have weight $0$, except the edges to the vertices from Step 10, whose weight vector has a $1$ at the corresponding position and $0$ elsewhere. Since the clique contains exactly one such vertex, for any position we can add padding vertices to increase the weight sum at that position from its value of at least one to the exact target value. Hence $G'$ has a clique of edge-weight exactly $t'$.
If $G'$ has a clique of edge-weight exactly $t'$, then some input graph has a red-blue dominating set of size at most $k$. Suppose we are given a clique whose total edge weight is exactly $t'$. Note that only edges with an endpoint of the appropriate form have positive edge weights; the remaining edges all have weight $0$. Also, by Step 1 there is at most one vertex of the first independent set in the clique; hence there is exactly one such vertex in it. By Steps 9 and 10, the padding edges contribute at most a bounded amount to the value of each position of the sum. Hence for each position there is a non-padding edge in the clique which has a $1$ at that position. We use this to show there is an input instance with a red-blue dominating set of size at most $k$.
By Step 3, there is at most one vertex of the second independent set in the clique; fix a default choice if it is absent, and otherwise let it be the unique such vertex. Since the weight of the corresponding Step-8 edge has zeros in the selected block, our previous argument implies that for each of the positions of that block, there is an edge in the clique of the Step-5 form whose weight has a $1$ at that position. Hence the clique contains at least one vertex of the red-vertex sets, and by Step 2 all such vertices in the clique are contained in a single set. We show that the selected input instance has a red-blue dominating set of size at most $k$. Let $D$ be the set of red vertices corresponding to these clique vertices. Since for each position of the selected block there is an edge in the clique with a $1$ at that position, by Step 5 each blue vertex of the instance has a neighbor in $D$. Hence $D$ is a red-blue dominating set. By Step 5, the most significant position of each such edge has value $1$. As the most significant position of the target bounds the number of such edges, it follows that $|D| \le k$, which proves that the instance has a red-blue dominating set of size at most $k$. This completes the proof of Lemma 3.1. ∎
Lemma 3.1 forms the main ingredient in a cross-composition that proves kernelization lower bounds for Exact-Edge-Weight Clique and its generalization to hypergraphs. For completeness, we formally define the hypergraph version as follows.
Exact-Edge-Weight $d$-Uniform Hyperclique (EEWHC) Input: A $d$-uniform hypergraph $H$, a weight function $w \colon E(H) \to \mathbb{N}$, and a positive integer $t$. Question: Does $H$ have a hyperclique of total edge-weight exactly $t$?
The following theorem generalizes Theorem 1. The case $d = 2$ of the theorem follows almost directly from Lemma 3.1 and Theorem 2, as the construction in the lemma gives the crucial ingredient for a degree-$3$ cross-composition. For larger $d$, we essentially exploit the fact that increasing the size of hyperedges by one allows one additional dimension of freedom, as has previously been exploited for other kernelization lower bounds for Hitting Set and Set Cover [8, 9]. The proof is given in Appendix A.1.
[(★)] For each fixed $d \ge 2$, Exact-Edge-Weight $d$-Uniform Hyperclique parameterized by the number of vertices $n$ does not admit a generalized kernel of size $O(n^{d+1-\varepsilon})$ for any $\varepsilon > 0$, unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$.
3.2 Subset Sum
We show that Subset Sum parameterized by the number of items $n$ does not have a generalized kernel of bitsize $O(n^{2-\varepsilon})$ for any $\varepsilon > 0$, unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$. We prove the lower bound by giving a linear-parameter transformation from Exact Red-Blue Dominating Set. We use Exact Red-Blue Dominating Set rather than Red-Blue Dominating Set as our starting problem for this lower bound because it will simplify the construction: it will avoid the need for 'padding' to cope with the fact that vertices may be dominated multiple times.
The Subset Sum problem is formally defined as follows.
Subset Sum (SS) Parameter: The number of items $n$. Input: A multiset $A$ of $n$ positive integers and a positive integer $t$. Question: Does there exist a subset $X \subseteq A$ with $\sum_{a \in X} a = t$?
We use the following problem as the starting point of the reduction.
Exact Red-Blue Dominating Set (ERBDS) Parameter: The number of vertices $n$. Input: A bipartite graph $G$ with a bipartition of $V(G)$ into sets $R$ (red vertices) and $B$ (blue vertices), and a positive integer $k$. Question: Does there exist a set $D \subseteq R$ of size exactly $k$ such that every vertex in $B$ has exactly one neighbor in $D$?
Jansen and Pieterse proved the following lower bound for ERBDS. [[17, Thm. 4.9]] Exact Red-Blue Dominating Set parameterized by the number of vertices $n$ does not admit a generalized kernel of size $O(n^{2-\varepsilon})$ for any $\varepsilon > 0$, unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$.
Actually, the lower bound they proved is for a slightly different variant of ERBDS where the solution is required to have size at most $k$, instead of exactly $k$. Observe that the variant where we demand a solution of size exactly $k$ is at least as hard as the "at most" version: the latter reduces to the former by inserting $k$ isolated red vertices. Therefore the lower bound by Jansen and Pieterse also works for the version we use here, which will simplify the presentation.
We are now ready to prove the kernelization lower bound for Subset Sum, restated from the introduction.
Proof.
Given a graph $G$ with a bipartition of $V(G)$ into $R$ and $B$, and a target value $k$ for ERBDS, we transform it to an equivalent instance $(A, t)$ of SS such that $|A| = |R|$. We start by defining numbers in a base $b > |R|$. For each $r_i \in R$, the number $a_i$ consists of $|B| + 1$ digits. We denote the digits of the number $a_i$ by $a_i[1], \dots, a_i[|B|+1]$, where $a_i[1]$ is the least significant and $a_i[|B|+1]$ is the most significant digit. Intuitively, the number $a_i$ corresponds to the red vertex $r_i$. See Figure 2 for an illustration.
For each $i \in [|R|]$, for each $j \in [|B|+1]$, digit $a_i[j]$ of number $a_i$ is defined as follows:
$$a_i[j] = \begin{cases} 1 & \text{if } j = |B|+1, \\ 1 & \text{if } j \le |B| \text{ and } \{r_i, b_j\} \in E(G), \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$
Hence the most significant digit of each number is $1$, and the remaining digits of number $a_i$ form the vector indicating to which of the blue vertices $r_i$ is adjacent in $G$.
To complete the construction we set $A = \{a_1, \dots, a_{|R|}\}$, and we define $t$ as the number whose most significant digit is $k$ and whose remaining $|B|$ digits are all $1$:
$$t = k \cdot b^{|B|} + \sum_{j=1}^{|B|} b^{\,j-1}. \tag{3}$$
Observe that under these definitions, there are no carries when adding up a subset of the numbers in $A$, as each digit of each of the numbers is either $0$ or $1$ and we work in base $b > |R|$.
The number of items in the constructed instance of SS is $|A| = |R|$, linear in the parameter of ERBDS. It is easy to see that the construction can be carried out in polynomial time. To complete the linear-parameter transformation from ERBDS to SS, it remains to prove that $G$ has a set $D \subseteq R$ of size exactly $k$ such that every vertex in $B$ has exactly one neighbor in $D$, if and only if there exists a set $X \subseteq A$ with $\sum_{a \in X} a = t$.
In the forward direction, suppose that there exists a set $D \subseteq R$ of size exactly $k$ such that every vertex in $B$ has exactly one neighbor in $D$. We claim that $X = \{a_i : r_i \in D\}$ is a solution to SS. The resulting sum has value $k$ at the most significant digit since $|X| = |D| = k$. All other digits correspond to vertices in $B$. Since each blue vertex is adjacent to exactly one vertex from $D$, it is easy to verify that all remaining digits of the sum are exactly one, implying that the numbers in $X$ sum to exactly $t$.
For the reverse direction, suppose there is a set $X \subseteq A$ with $\sum_{a \in X} a = t$. Since the most significant digit of $t$ is set to $k$ and each number in $A$ has a $1$ as its most significant digit, we have $|X| = k$ since there are no carries during addition. Define $D = \{r_i : a_i \in X\}$ as the set of the red vertices corresponding to the numbers in $X$. As $t$ has a $1$ at every remaining digit and no carries occur in the summation, we have $\sum_{a_i \in X} a_i[j] = 1$ for each $j \in [|B|]$. As the $j$-th digit of all numbers is either $0$ or $1$ by definition, there is a unique $r_i \in D$ with $a_i[j] = 1$, so that $r_i$ is the unique neighbor of $b_j$ in $D$. This shows that $D$ is an exact red-blue dominating set of size $k$, concluding the linear-parameter transformation.
If there were a generalized kernelization for SS of size $\mathcal{O}(n^{2-\varepsilon})$, then we would obtain a generalized kernelization for ERBDS of the same size by first transforming the instance to SS, incurring only a constant-factor increase in the parameter, and then applying the generalized kernelization for the latter. Hence by contraposition and Theorem 3.2, the claim follows. ∎
3.3 Constraint Satisfaction Problems
In this section we extend our lower bounds to cover Boolean Constraint Satisfaction Problems (CSPs). We employ the recently introduced framework [18] of reductions among different CSPs to make a connection with EEWHC. We begin by introducing the terminology necessary to identify the crucial properties of CSPs.
Preliminaries on CSPs
A $k$-ary constraint is a function $f : \{0,1\}^k \to \{0,1\}$. We refer to $k$ as the arity of $f$, denoted $\mathrm{ar}(f)$. We always assume that the domain is Boolean. A constraint $f$ is satisfied by an input $x \in \{0,1\}^k$ if $f(x) = 1$. A constraint language $\Gamma$ is a finite collection of constraints $f_1, \ldots, f_q$, potentially with different arities. A constraint application, of a $k$-ary constraint $f \in \Gamma$ to a set of $n$ Boolean variables, is a triple $(f, (i_1, \ldots, i_k), w)$, where the indices $i_1, \ldots, i_k \in [n]$ select $k$ of the $n$ Boolean variables to which the constraint is applied, and $w \in \mathbb{Z}$ is an integer weight. The variables can repeat in a single application.
A formula $\Phi$ of CSP($\Gamma$) is a set of constraint applications from $\Gamma$ over a common set of variables. For an assignment $x$, that is, a mapping from the set of variables to $\{0,1\}$, the integer $w(\Phi, x)$ is the sum of the weights of the constraint applications satisfied by $x$. The considered decision problems are defined as follows.
ExactWeight CSP($\Gamma$) Parameter: $n$
Input: A formula $\Phi$ of CSP($\Gamma$) over $n$ variables, an integer $t$.
Question: Is there an assignment $x$ for which $w(\Phi, x) = t$?
MaxWeight CSP($\Gamma$) Parameter: $n$
Input: A formula $\Phi$ of CSP($\Gamma$) over $n$ variables, an integer $t$.
Question: Is there an assignment $x$ for which $w(\Phi, x) \ge t$?
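The quantity $w(\Phi, x)$ used in both problems above can be computed directly from its definition. The following minimal sketch models a constraint application as a triple; the representation and the function name are ours.

```python
def weight(formula, assignment):
    """w(Phi, x): the sum of the weights of the constraint applications
    satisfied by the assignment. An application is modelled as a triple
    (f, indices, w): f is a k-ary 0/1-valued constraint, indices picks
    k variables (repeats allowed), and w is an integer weight.
    """
    return sum(w for f, idx, w in formula
               if f(*(assignment[i] for i in idx)))
```

For instance, with the binary AND constraint applied as (x0 AND x1, weight 5) and (x1 AND x1, weight -3), the all-ones assignment has weight 5 - 3 = 2.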
The compressibility of MaxWeight CSP($\Gamma$) has been studied by Jansen and Włodarczyk [18], who obtained essentially optimal kernel sizes for every $\Gamma$ in the case where the weights are polynomial with respect to the number of variables $n$. Even though the upper and lower bounds in [18] are formulated for MaxWeight CSP($\Gamma$), they could be adapted to work with ExactWeight CSP($\Gamma$). The crucial idea which allows one to determine the compressibility of CSP($\Gamma$) is the representation of constraints via multilinear polynomials.
For a $k$-ary constraint $f$, its characteristic polynomial $P_f$ is the unique $k$-ary multilinear polynomial over the reals satisfying $P_f(x) = f(x)$ for every $x \in \{0,1\}^k$.
It is known that such a polynomial always exists and it is unique [27].
The degree of a constraint language $\Gamma$, denoted $\deg(\Gamma)$, is the maximal degree of a characteristic polynomial $P_f$ over all $f \in \Gamma$.
The main result of Jansen and Włodarczyk [18] states that MaxWeight CSP($\Gamma$) with polynomial weights admits a kernel of $\mathcal{O}(n^{\deg(\Gamma)} \log n)$ bits and, as long as the problem is NP-hard, it does not admit a kernel of size $\mathcal{O}(n^{\deg(\Gamma)-\varepsilon})$, for any $\varepsilon > 0$, unless NP ⊆ coNP/poly. It turns out that in the variant where we allow both positive and negative weights, the problem is NP-hard whenever $\deg(\Gamma) \ge 2$ [19]. The lower bounds are obtained via linear-parameter transformations, where the parameter is the number of variables $n$. We shall take advantage of the fact that these transformations still work for an unbounded range of weights.
[[18], Lemma 5.4] For constraint languages $\Gamma_1, \Gamma_2$ such that $\deg(\Gamma_1) \le \deg(\Gamma_2)$, there is a polynomial-time algorithm that, given a formula $\Phi_1$ of CSP($\Gamma_1$) on $n_1$ variables and an integer $t_1$, returns a formula $\Phi_2$ of CSP($\Gamma_2$) on $n_2$ variables and an integer $t_2$, such that

- $(\Phi_1, t_1)$ is a yes-instance of ExactWeight CSP($\Gamma_1$) if and only if $(\Phi_2, t_2)$ is a yes-instance of ExactWeight CSP($\Gamma_2$),

- $n_2 = \mathcal{O}(n_1)$,

- the weights of $\Phi_2$ and the target $t_2$ have absolute value polynomially bounded in the largest absolute weight of $\Phi_1$ and in $t_1$.
Kernel lower bounds for CSP
The lower bound of $\mathcal{O}(n^{\deg(\Gamma)-\varepsilon})$ has been obtained via a reduction from Max $d$-SAT (with $d = \deg(\Gamma)$) to MaxWeight CSP($\Gamma$), combined with the fact that Max $d$-SAT does not admit a kernel of size $\mathcal{O}(n^{d-\varepsilon})$ for any $\varepsilon > 0$ [9, 18]. We are going to show that when the weights are arbitrarily large, the optimal compression size for ExactWeight CSP($\Gamma$) becomes essentially $n^{\deg(\Gamma)+1}$, so the exponent is always larger by one compared to the case with polynomial weights. To this end, we are going to combine the aforementioned reduction framework with our lower bound for ExactEdgeWeight Uniform Hyperclique.
Consider a constraint language consisting of a single $d$-ary constraint $Q_d$, which is satisfied only if all the arguments equal 1. The characteristic polynomial of $Q_d$ is simply $x_1 x_2 \cdots x_d$, hence the degree of $\{Q_d\}$ equals $d$. We first translate our lower bounds for the hyperclique problems into a lower bound for ExactWeight CSP($\{Q_d\}$) for all $d \ge 2$, and then extend it to other CSPs.
[] For all $d \ge 2$, ExactWeight CSP($\{Q_d\}$) does not admit a generalized kernel of size $\mathcal{O}(n^{d+1-\varepsilon})$, for any $\varepsilon > 0$, unless NP ⊆ coNP/poly.
The lower bound for ExactWeight CSP($\{Q_d\}$) given by Lemma 3.3 yields a lower bound for general ExactWeight CSP($\Gamma$) using the reduction framework described above.
For any $\Gamma$ with $\deg(\Gamma) = d \ge 2$, ExactWeight CSP($\Gamma$) does not admit a generalized kernel of size $\mathcal{O}(n^{d+1-\varepsilon})$, for any $\varepsilon > 0$, unless NP ⊆ coNP/poly.
Proof.
Consider an $n$-variable instance $(\Phi_1, t_1)$ of ExactWeight CSP($\{Q_d\}$), where $d = \deg(\Gamma)$. It holds that $\deg(\{Q_d\}) = \deg(\Gamma)$. By the transformation of [18, Lemma 5.4] restated above, there is a linear-parameter transformation that translates $(\Phi_1, t_1)$ into an equivalent instance $(\Phi_2, t_2)$ of ExactWeight CSP($\Gamma$) on $\mathcal{O}(n)$ variables. If we could compress $(\Phi_2, t_2)$ into $\mathcal{O}(n^{d+1-\varepsilon})$ bits, this would entail the same compression for $(\Phi_1, t_1)$. The claim follows from Lemma 3.3. ∎
This concludes the discussion of kernelization lower bounds. The kernelization upper bounds discussed in the introduction can be found in Appendix B.
4 Node-weighted Vertex Cover in bipartite graphs
Preserving all minimum solutions
For a graph $G$ with node-weight function $w : V(G) \to \mathbb{N}$, we denote by $\mathcal{M}(G, w)$ the collection of subsets of $V(G)$ which are minimum-weight vertex covers of $(G, w)$. For $n$-vertex bipartite graphs there exists a weight function with polynomially bounded range that preserves the set of minimum-weight vertex covers, which can be computed efficiently.
[] There is an algorithm that, given an $n$-vertex bipartite graph $G$ and a node-weight function $w$, outputs a weight function $w'$ with polynomially bounded values such that $G$ has the same collection of minimum-weight vertex covers under $w$ and $w'$. The running time of the algorithm is polynomial in $n$ and the binary encoding size of $w$.
The proof of the theorem is given in Appendix C. It relies on the fact that a maximum matching (the linear-programming dual to Vertex Cover) can be computed in strongly polynomial time in bipartite graphs by a reduction to Max Flow. The structure of a maximum matching allows two weight-reduction rules to be formulated whose exhaustive application yields the desired weight function. The bound on the largest weight in Theorem 4 is best possible, which we prove in Lemma C.1 in Appendix C.

Preserving the relative weight of solutions
For a graph $G$, we say that two node-weight functions $w_1, w_2 : V(G) \to \mathbb{N}$ are vertex-cover equivalent if the ordering of inclusion-minimal vertex covers by total weight is identical under the two weight functions, i.e., for all pairs of inclusion-minimal vertex covers $X, Y$ we have $w_1(X) \le w_1(Y)$ if and only if $w_2(X) \le w_2(Y)$. While a minimum-weight vertex cover of a bipartite graph can be found efficiently, the following theorem shows that nevertheless weight functions with exponentially large coefficients may be needed to preserve the ordering of minimal vertex covers by weight.
[] For each $n \in \mathbb{N}$, there exists a node-weighted bipartite graph $G$ on $n$ vertices with weight function $w$ such that for all weight functions $w'$ which are vertex-cover equivalent to $w$, we have $\max_{v \in V(G)} w'(v) \ge 2^{\Omega(n)}$.
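The notion of vertex-cover equivalence can be checked by brute force on small graphs. The sketch below only illustrates the definition (function names and the edge-list representation are ours); it runs in exponential time and is not part of the paper's machinery.

```python
from itertools import combinations

def minimal_vertex_covers(vertices, edges):
    """All inclusion-minimal vertex covers, by exhaustive enumeration."""
    covers = [set(c) for r in range(len(vertices) + 1)
              for c in combinations(vertices, r)
              if all(u in c or v in c for u, v in edges)]
    return [c for c in covers if not any(d < c for d in covers)]

def vc_equivalent(vertices, edges, w1, w2):
    """Check that w1 and w2 order the inclusion-minimal vertex covers by
    total weight identically, as in the definition above."""
    total = lambda c, w: sum(w[v] for v in c)
    mins = minimal_vertex_covers(vertices, edges)
    return all((total(c, w1) <= total(d, w1)) == (total(c, w2) <= total(d, w2))
               for c in mins for d in mins)
```

On the path a-b-c the minimal covers are {b} and {a, c}; scaling all weights up roughly preserves their order, while the all-ones weight function reverses it and is therefore not vertex-cover equivalent.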
5 Conclusions
We have established kernelization lower bounds for Subset Sum, ExactEdgeWeight Uniform Hyperclique, and a family of ExactWeight CSP problems, which make it unlikely that there exists an efficient algorithm to compress a single weight into significantly fewer than $n$ bits. This gives a clear separation between the setting involving arbitrarily large weights and the case with polynomially bounded weights, which can be encoded with $\mathcal{O}(\log n)$ bits each. The matching kernel upper bounds are randomized, and we leave it as an open question to derandomize them. For Subset Sum parameterized by the number of items $n$, a deterministic kernel of polynomial size is known [11].
Kernelization of minimization/maximization problems is so far less understood. We are able to match the same kernel size as for the exact-weight problems, but only through Turing kernels. Using techniques from [11] one can obtain, e.g., a kernel of polynomial size for MaxEdgeWeight Clique. Improving upon this bound possibly requires a better understanding of threshold functions. Our study of weighted Vertex Cover on bipartite graphs indicates that preserving the order between all the solutions might be overly demanding, and that it could be easier to keep track only of the structure of the optimal solutions. Can we extend the theory of threshold functions so that better bounds are feasible when we just want to maintain a separation between optimal and non-optimal solutions?
References
 [1] Amir Abboud, Shon Feller, and Oren Weimann. On the fine-grained complexity of parity problems. In Artur Czumaj, Anuj Dawar, and Emanuela Merelli, editors, 47th International Colloquium on Automata, Languages, and Programming, ICALP 2020, July 8-11, 2020, Saarbrücken, Germany (Virtual Conference), volume 168 of LIPIcs, pages 5:1–5:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPIcs.ICALP.2020.5.
 [2] Amir Abboud, Kevin Lewi, and Ryan Williams. Losing weight by gaining edges. In Andreas S. Schulz and Dorothea Wagner, editors, Algorithms - ESA 2014 - 22nd Annual European Symposium, Wroclaw, Poland, September 8-10, 2014. Proceedings, volume 8737 of Lecture Notes in Computer Science, pages 1–12. Springer, 2014. doi:10.1007/978-3-662-44777-2_1.
 [3] László Babai, Kristoffer Arnsfelt Hansen, Vladimir V. Podolskii, and Xiaoming Sun. Weights of exact threshold functions. In Petr Hlinený and Antonín Kucera, editors, Mathematical Foundations of Computer Science 2010, 35th International Symposium, MFCS 2010, Brno, Czech Republic, August 23-27, 2010. Proceedings, volume 6281 of Lecture Notes in Computer Science, pages 66–77. Springer, 2010. doi:10.1007/978-3-642-15155-2_8.
 [4] Andreas Björklund. Determinant sums for undirected Hamiltonicity. SIAM J. Comput., 43(1):280–299, 2014. doi:10.1137/110839229.
 [5] Hans L. Bodlaender, Bart M. P. Jansen, and Stefan Kratsch. Kernelization lower bounds by crosscomposition. SIAM J. Discrete Math., 28(1):277–305, 2014. doi:10.1137/120880240.
 [6] Miroslav Chlebík and Janka Chlebíková. Crown reductions for the minimum weighted vertex cover problem. Discret. Appl. Math., 156(3):292–312, 2008. doi:10.1016/j.dam.2007.03.026.
 [7] Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh. Parameterized Algorithms. Springer, 2015. doi:10.1007/978-3-319-21275-3.
 [8] Holger Dell and Dániel Marx. Kernelization of packing problems. In Yuval Rabani, editor, Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2012, Kyoto, Japan, January 17-19, 2012, pages 68–81. SIAM, 2012. doi:10.1137/1.9781611973099.6.
 [9] Holger Dell and Dieter van Melkebeek. Satisfiability allows no nontrivial sparsification unless the polynomialtime hierarchy collapses. J. ACM, 61(4):23:1–23:27, 2014. doi:10.1145/2629620.
 [10] Rodney G. Downey and Michael R. Fellows. Fundamentals of Parameterized Complexity. Texts in Computer Science. Springer, 2013. doi:10.1007/978-1-4471-5559-1.
 [11] Michael Etscheid, Stefan Kratsch, Matthias Mnich, and Heiko Röglin. Polynomial kernels for weighted problems. J. Comput. Syst. Sci., 84:1–10, 2017. doi:10.1016/j.jcss.2016.06.004.
 [12] Henning Fernau. Kernelization, Turing kernels. In Encyclopedia of Algorithms, pages 1043–1045. Springer, 2016. doi:10.1007/978-1-4939-2864-4_528.

 [13] András Frank and Éva Tardos. An application of simultaneous diophantine approximation in combinatorial optimization. Combinatorica, 7(1):49–65, 1987.
 [14] Danny Harnik and Moni Naor. On the compressibility of NP instances and cryptographic applications. SIAM Journal on Computing, 39(5):1667–1713, 2010. doi:10.1137/060668092.
 [15] Anwar A. Irmatov. Asymptotics of the number of threshold functions and the singularity probability of random matrices. Doklady Mathematics, 101:247–249, 2020. doi:10.1134/S1064562420030096.
 [16] Alon Itai and Michael Rodeh. Finding a minimum circuit in a graph. SIAM J. Comput., 7(4):413–423, 1978. doi:10.1137/0207033.
 [17] Bart M. P. Jansen and Astrid Pieterse. Optimal sparsification for some binary CSPs using lowdegree polynomials. TOCT, 11(4):28:1–28:26, 2019. doi:10.1145/3349618.
 [18] Bart M. P. Jansen and Michał Włodarczyk. Optimal polynomial-time compression for Boolean Max CSP. In Fabrizio Grandoni, Grzegorz Herman, and Peter Sanders, editors, 28th Annual European Symposium on Algorithms, ESA 2020, September 7-9, 2020, Pisa, Italy (Virtual Conference), volume 173 of LIPIcs, pages 63:1–63:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPIcs.ESA.2020.63.
 [19] Peter Jonsson and Andrei Krokhin. Maximum H-colourable subdigraphs and constraint optimization with arbitrary weights. Journal of Computer and System Sciences, 73(5):691–702, 2007. doi:10.1016/j.jcss.2007.02.001.
 [20] Richard M. Karp. Reducibility among combinatorial problems. In Proceedings of a symposium on the Complexity of Computer Computations, held March 20-22, 1972, at the IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA, The IBM Research Symposia Series, pages 85–103. Plenum Press, New York, 1972. doi:10.1007/978-1-4684-2001-2_9.

 [21] Richard M. Karp and Michael O. Rabin. Efficient randomized pattern-matching algorithms. IBM Journal of Research and Development, 31(2):249–260, 1987.
 [22] Dániel Marx and Michal Pilipczuk. Everything you always wanted to know about the parameterized complexity of subgraph isomorphism (but were afraid to ask). CoRR, abs/1307.2187, 2013. arXiv:1307.2187v3.
 [23] Ketan Mulmuley, Umesh V. Vazirani, and Vijay V. Vazirani. Matching is as easy as matrix inversion. Comb., 7(1):105–113, 1987. doi:10.1007/BF02579206.
 [24] Jesper Nederlof. Bipartite TSP in O(1.9999^n) time, assuming quadratic time matrix multiplication. In Konstantin Makarychev, Yury Makarychev, Madhur Tulsiani, Gautam Kamath, and Julia Chuzhoy, editors, Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, Chicago, IL, USA, June 22-26, 2020, pages 40–53. ACM, 2020. doi:10.1145/3357713.3384264.
 [25] Jesper Nederlof, Erik Jan van Leeuwen, and Ruben van der Zwaan. Reducing a target interval to a few exact queries. In Branislav Rovan, Vladimiro Sassone, and Peter Widmayer, editors, Mathematical Foundations of Computer Science 2012 - 37th International Symposium, MFCS 2012, Bratislava, Slovakia, August 27-31, 2012. Proceedings, volume 7464 of Lecture Notes in Computer Science, pages 718–727. Springer, 2012. doi:10.1007/978-3-642-32589-2_62.
 [26] G. L. Nemhauser and L. E. Trotter, Jr. Vertex packings: structural properties and algorithms. Math. Program., 8:232–248, 1975. doi:10.1007/BF01580444.
 [27] Noam Nisan and Mario Szegedy. On the degree of Boolean functions as real polynomials. Computational Complexity, 4:301–313, 1994. doi:10.1007/BF01263419.
 [28] James B. Orlin. Max flows in O(nm) time, or better. In Symposium on Theory of Computing Conference, STOC'13, pages 765–774. ACM, 2013. doi:10.1145/2488608.2488705.
 [29] Alexander Schrijver. Combinatorial Optimization: Polyhedra and Efficiency, volume 24. SpringerVerlag, Berlin, 2003.
 [30] Virginia Vassilevska and Ryan Williams. Finding, minimizing, and counting weighted subgraphs. In Michael Mitzenmacher, editor, Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, pages 455–464. ACM, 2009. doi:10.1145/1536414.1536477.
 [31] Virginia Vassilevska Williams. Hardness of easy problems: Basing hardness on popular conjectures such as the strong exponential time hypothesis (invited talk). In Thore Husfeldt and Iyad A. Kanj, editors, 10th International Symposium on Parameterized and Exact Computation, IPEC 2015, September 16-18, 2015, Patras, Greece, volume 43 of LIPIcs, pages 17–29. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2015. doi:10.4230/LIPIcs.IPEC.2015.17.
 [32] Virginia Vassilevska Williams and R. Ryan Williams. Subcubic equivalences between path, matrix, and triangle problems. J. ACM, 65(5):27:1–27:38, 2018. doi:10.1145/3186893.
 [33] Virginia Vassilevska Williams and Ryan Williams. Finding, minimizing, and counting weighted subgraphs. SIAM J. Comput., 42(3):831–854, 2013. doi:10.1137/09076619X.
Appendix A Kernel lower bounds
A.1 Omitted proofs for ExactEdgeWeight Clique
See 3.1
Proof.
We give a bounded-degree cross-composition (Definition 2) from RBDS to the weighted hyperclique problem using Lemma 3.1. We start by giving a polynomial equivalence relation $\mathcal{R}$ on inputs of RBDS. Let two instances of RBDS be equivalent under $\mathcal{R}$ if they have the same number of red vertices, the same number of blue vertices, and the same target value $k$. It is easy to check that $\mathcal{R}$ is a polynomial equivalence relation.
Consider inputs $I_1, \ldots, I_t$ of RBDS from the same equivalence class of $\mathcal{R}$. If $t$ is not a power of an integer, then we duplicate one of the input instances until we reach the first number of the form $s^d$, which is trivially such a power. This increases the number of instances by at most a constant factor and does not change whether there is a YES-instance among the instances. As all requirements on a cross-composition are oblivious to constant factors, from now on we may assume without loss of generality that $t = s^d$ for some integer $s$. By definition of $\mathcal{R}$, all instances have the same number of red vertices, the same number of blue vertices, and have the same maximum size of a solution.
For , we can simply invoke Lemma 3.1 for the