A tractable class of binary VCSPs via M-convex intersection

A binary VCSP is a general framework for the minimization problem of a function represented as the sum of unary and binary cost functions. An important line of VCSP research is to investigate which functions can be minimized in polynomial time. Cooper–Živný classified the tractability of binary VCSP instances according to the concept of "triangle," and showed that the only interesting tractable case is the one induced by the joint winner property (JWP). Recently, Iwamasa–Murota–Živný made a link between VCSP and discrete convex analysis, showing that a function satisfying the JWP can be transformed into a function represented as the sum of two M-convex functions, which can be minimized in polynomial time via an M-convex intersection algorithm if the value oracle of each M-convex function is given. In this paper, we give an algorithmic answer to a natural question: What binary VCSP instances can be solved in polynomial time via an M-convex intersection algorithm? Under a natural condition, we solve this problem by devising a polynomial-time algorithm that obtains a concrete form of the representation whenever one exists. Our result presents a larger tractable class of binary VCSPs, which properly contains the JWP class. We also show the co-NP-hardness of testing the representability of a given binary VCSP instance as the sum of two M-convex functions.


1 Introduction

The valued constraint satisfaction problem (VCSP) provides a general framework for discrete optimization (see [36] for details). Informally, the VCSP framework deals with the minimization problem of a function represented as the sum of “small” arity functions, which are called cost functions. It is known that various kinds of combinatorial optimization problems can be formulated in the VCSP framework. In general, the VCSP is NP-hard. An important line of research is to investigate what restrictions on classes of VCSP instances ensure polynomial time solvability. Two main types of VCSPs with restrictions are structure-based VCSPs and language-based VCSPs (see e.g., [6, 36]). Structure-based VCSPs deal with restrictions on the hypergraph structure representing the appearance of variables in a given instance. For example, Gottlob–Greco–Scarcello [12] showed that, if the hypergraph corresponding to a VCSP instance has a bounded hypertree-width, then the instance can be solved in polynomial time. Language-based VCSPs deal with restrictions on cost functions that appear in a VCSP instance. Kolmogorov–Thapper–Živný [20] gave a precise characterization of tractable valued constraint languages via the basic LP relaxation. Kolmogorov–Krokhin–Rolínek [19] gave a dichotomy for all language-based VCSPs (see also [2, 35] for a dichotomy for all language-based CSPs).

Hybrid VCSPs, which deal with a combination of structure-based and language-based restrictions, have emerged recently [6]. Among many kinds of hybrid restrictions, a binary VCSP, that is, a VCSP with only unary and binary cost functions, is a representative hybrid restriction that includes numerous fundamental optimization problems. Cooper–Živný [4] showed that if a given binary VCSP instance satisfies the joint winner property (JWP), then it can be minimized in polynomial time. The same authors classified in [5] the tractability of binary VCSP instances according to the concept of “triangle,” and showed that the only interesting tractable case is the one induced by the JWP (see also [6]). Furthermore, they introduced cross-free convexity as a generalization of JWP, and devised a polynomial-time minimization algorithm for cross-free convex instances $F$, provided a “cross-free representation” of $F$ is given (see Section 2.3 for details).

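In this paper, we introduce a novel tractability principle going beyond triangle and cross-free representation for binary VCSPs. A binary VCSP is formulated as follows, where $D_1, D_2, \dots, D_r$ ($r \geq 2$) are finite sets.

Given:

Unary cost functions $F_p : D_p \to \mathbb{R} \cup \{+\infty\}$ for $p \in [r]$ and binary cost functions $F_{pq} : D_p \times D_q \to \mathbb{R} \cup \{+\infty\}$ for $1 \leq p < q \leq r$.

Problem:

Find a minimizer of $F : D_1 \times D_2 \times \cdots \times D_r \to \mathbb{R} \cup \{+\infty\}$ defined by

 $F(X_1, X_2, \dots, X_r) := \sum_{1 \leq p \leq r} F_p(X_p) + \sum_{1 \leq p < q \leq r} F_{pq}(X_p, X_q).$ (1)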
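As a concrete reading of the objective (1), the following minimal sketch evaluates $F$ and minimizes it by exhaustive search. It is only a reference for tiny instances, not the algorithm of this paper. The names `vcsp_cost` and `brute_force_min`, the 0-indexed domain values, and the storage of each $F_{pq}$ as a nested list keyed by `(p, q)` with `p < q` are assumptions made for illustration.

```python
from itertools import product

INF = float("inf")

def vcsp_cost(assignment, unary, binary):
    """Evaluate F(X_1..X_r) = sum_p F_p(X_p) + sum_{p<q} F_pq(X_p, X_q).

    unary[p][a]       : value F_p(a)          (hypothetical storage layout)
    binary[(p,q)][a][b]: value F_pq(a, b), p < q
    """
    r = len(assignment)
    total = sum(unary[p][assignment[p]] for p in range(r))
    for p in range(r):
        for q in range(p + 1, r):
            total += binary[(p, q)][assignment[p]][assignment[q]]
    return total

def brute_force_min(domain_sizes, unary, binary):
    """Exponential-time reference minimizer; domain values are 0-indexed."""
    candidates = product(*[range(n) for n in domain_sizes])
    best = min(candidates, key=lambda X: vcsp_cost(X, unary, binary))
    return best, vcsp_cost(best, unary, binary)
```

Infinite costs are represented by `float("inf")`, so infeasible assignments are never preferred over feasible ones.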

Our tractability principle is built on discrete convex analysis (DCA) [25, 27], which is a theory of convex functions on discrete structures. In DCA, L-convexity and M-convexity play primary roles; the former is a generalization of submodularity, and the latter is a generalization of matroids. A variety of polynomially solvable problems in discrete optimization can be understood within the framework of L-convexity/M-convexity (see e.g., [27, 28, 29]). Recently, it has also turned out that discrete convexity is deeply linked to tractable classes of VCSPs. L-convexity is closely related to the tractability of language-based VCSPs. Various kinds of submodularity induce tractable classes of language-based VCSP instances [20], and a larger class of such submodularity can be understood as L-convexity on certain graph structures [14]. On the other hand, Iwamasa–Murota–Živný [18] have pointed out that M-convexity plays a role in hybrid VCSPs. They revealed the reason for the tractability of a VCSP instance satisfying the JWP from a viewpoint of M-convexity. We here continue this line of research, and explore further applications of M-convexity in hybrid VCSPs.

A function $f : \{0,1\}^n \to \mathbb{R} \cup \{+\infty\}$ is called M-convex [22, 27] if it satisfies the following generalization of the matroid exchange axiom: for $x, y \in \operatorname{dom} f$ and $i \in [n]$ with $x_i > y_i$, there exists $j \in [n]$ with $x_j < y_j$ such that

 $f(x) + f(y) \geq f(x - \chi_i + \chi_j) + f(y + \chi_i - \chi_j),$

where, for a function $f$, the effective domain is denoted as $\operatorname{dom} f := \{x \mid f(x) < +\infty\}$, and $\chi_i$ is the $i$th unit vector.

(Although M-convex functions are defined on $\mathbb{Z}^n$ in general, we only need functions on $\{0,1\}^n$ here.) M-convex functions on $\{0,1\}^n$ are equivalent to the negative of valuated matroids introduced by Dress–Wenzel [9, 10]. An M-convex function can be minimized in a greedy fashion similarly to the greedy algorithm for matroids. Furthermore, a function that is representable as the sum of two M-convex functions is called $M_2$-convex. As a generalization of matroid intersection, the problem of minimizing an $M_2$-convex function, called the M-convex intersection problem, can also be solved in polynomial time if the value oracle of each constituent M-convex function is given [23, 24]; see also [26, Section 5.2]. Our proposed tractable class of VCSPs is based on this result.
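The exchange axiom can be checked directly on small examples. The sketch below enumerates all pairs of points of the effective domain of a function on $\{0,1\}^n$ and looks for a valid exchange index; it takes exponential time and is meant only to illustrate the definition. The name `is_m_convex` and the representation of a function as a Python callable on 0/1 tuples are assumptions.

```python
from itertools import product

INF = float("inf")

def is_m_convex(f, n):
    """Brute-force test of the exchange axiom for f on {0,1}^n.

    f maps a 0/1 tuple of length n to a number or INF (hypothetical interface).
    """
    dom = [x for x in product((0, 1), repeat=n) if f(x) < INF]
    for x, y in product(dom, repeat=2):
        for i in range(n):
            if x[i] <= y[i]:
                continue  # only indices with x_i > y_i need an exchange
            exchanged = False
            for j in range(n):
                if x[j] < y[j]:
                    x2 = list(x); x2[i], x2[j] = 0, 1   # x - chi_i + chi_j
                    y2 = list(y); y2[i], y2[j] = 1, 0   # y + chi_i - chi_j
                    if f(x) + f(y) >= f(tuple(x2)) + f(tuple(y2)):
                        exchanged = True
                        break
            if not exchanged:
                return False
    return True
```

For instance, a linear function restricted to the 0/1 vectors with exactly two ones passes the test, whereas a function whose effective domain consists only of (1,1,0,0) and (0,0,1,1) fails it, since no single exchange stays inside the domain.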

Let us return to binary VCSPs. The starting observation for relating VCSP to DCA is that the objective function $F$ on $D_1 \times D_2 \times \cdots \times D_r$ can be regarded as a function $f$ on $\{0,1\}^n$ with $n := \sum_{1 \leq p \leq r} n_p$ by the following correspondence between the domains:

 $D_p := \{1, 2, \dots, n_p\} \ni i \longleftrightarrow (0, \dots, 0, \overset{i}{\check{1}}, 0, \dots, 0) \in \{0,1\}^{n_p} \quad (p \in \{1, 2, \dots, r\}).$ (2)

With this correspondence, the minimization of $F$ can be transformed to that of $f$. A binary VCSP instance $F$ is said to be $M_2$-representable if the function $f$ obtained from $F$ via the correspondence (2) is $M_2$-convex.
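The correspondence (2) is a one-hot encoding of each variable, block by block. A minimal sketch, assuming 1-indexed domain values as in $D_p = \{1, \dots, n_p\}$ and the hypothetical helper names `one_hot_encode` and `one_hot_decode`:

```python
def one_hot_encode(assignment, domain_sizes):
    """Map (X_1..X_r) with X_p in {1..n_p} to a 0/1 vector of length sum(n_p)."""
    x = []
    for value, n_p in zip(assignment, domain_sizes):
        block = [0] * n_p
        block[value - 1] = 1  # the single 1 sits in position X_p of block p
        x.extend(block)
    return tuple(x)

def one_hot_decode(x, domain_sizes):
    """Inverse map; raises ValueError if a block does not contain exactly one 1."""
    assignment, offset = [], 0
    for n_p in domain_sizes:
        block = list(x[offset:offset + n_p])
        if sum(block) != 1:
            raise ValueError("block is not a unit vector")
        assignment.append(block.index(1) + 1)
        offset += n_p
    return tuple(assignment)
```

Only 0/1 vectors with exactly one 1 in every block correspond to assignments; all other vectors lie outside the image of the encoding.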

It is shown in [18] that a binary VCSP instance satisfying the JWP can be transformed to an $M_2$-representable instance, and two M-convex summands can be obtained in polynomial time. (In [18], a binary VCSP instance satisfying the JWP was transformed into the sum of two M$^\natural$-convex functions. It can be easily seen that this function can also be transformed into the sum of two M-convex functions.) Here the following natural question arises: What binary VCSP instances are $M_2$-representable? In this paper, we give an algorithmic answer to this question by considering the following problem:

Testing $M_2$-Representability
Given:

A binary VCSP instance $F$.

Problem:

Determine whether $F$ is $M_2$-representable or not. If $F$ is $M_2$-representable, obtain a decomposition $f = f_1 + f_2$ of the function $f$ into two M-convex functions $f_1$ and $f_2$, where $f$ is the function transformed from $F$ via (2).

We assume the following condition for an instance .

():

For all and , there is with .

This assumption is fairly reasonable, though checking whether a binary VCSP instance satisfies () is NP-hard in general (since it is equivalent to solving a general binary CSP). If all binary cost functions appearing in a given instance only take finite values, then the instance obviously satisfies (). Furthermore, every binary VCSP instance can be reduced to an instance satisfying () by redefining for and such that for all .

Our main result is the following:

Theorem 1.1.

For binary VCSP instances satisfying (), Testing M-Representability can be solved in time.

An $M_2$-convex function can be minimized in polynomial time once a decomposition into two M-convex functions can be obtained in polynomial time. Thus we obtain the following corollary of Theorem 1.1.

Corollary 1.2.

An M-representable binary VCSP instance satisfying () can be minimized in polynomial time.

Our result provides us with cross-free representations, and presents a new tractable class of binary VCSPs that goes beyond JWP. A nice feature of our contribution is that the tractability based on $M_2$-representability is independent of a particular representation (1) of a given instance, while the tractability based on JWP or cross-free convexity depends on a representation; see Section 2.3.

We also show the following theorem, which implies the NP-hardness of checking () when combined with Theorem 1.1.

Theorem 1.3.

Testing $M_2$-Representability is co-NP-hard.

Our approach to a polynomial-time algorithm for Testing $M_2$-Representability is outlined as follows:

• We establish a unique representation theorem of $M_2$-convex functions arising from binary VCSP instances (Theorem 2.2).

• With this result, our problem can be separated into two subproblems named Decomposition and Laminarization. The former is the problem of obtaining the unique representation of a given $M_2$-convex function, and the latter is the problem of making a laminar family from a given family of subsets by means of certain transformations.

• We devise a polynomial-time algorithm for each problem, Decomposition and Laminarization (Theorems 3.7 and 4.11).

A unique representation theorem (Theorem 2.2), a polynomial-time algorithm for Decomposition (Theorem 3.7), and that for Laminarization (Theorem 4.11) are the major results of this paper. In particular, Laminarization seems to be an interesting problem of combinatorial nature in its own right.

Organization.

In Section 2, we introduce the representation theorem (Theorem 2.2) of quadratic $M_2$-convex functions arising from VCSP instances as well as the subproblems, Decomposition and Laminarization. We also prove Theorem 1.3. In Sections 3 and 4, we present polynomial-time algorithms for Decomposition and Laminarization, respectively.

Notation.

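Let $\mathbb{Z}$, $\mathbb{R}$, $\mathbb{R}_+$, and $\mathbb{R}_{++}$ denote the sets of integers, reals, nonnegative reals, and positive reals, respectively. In this paper, functions can take the infinite value $+\infty$, where $a < +\infty$, $a + \infty = +\infty$ for $a \in \mathbb{R}$, and $0 \cdot (+\infty) = 0$. Let $\overline{\mathbb{R}} := \mathbb{R} \cup \{+\infty\}$. For a positive integer $k$, we define $[k] := \{1, 2, \dots, k\}$.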

Remark 1.4.

This is a full version of an extended abstract [15], which did not include all proofs and only dealt with finite-valued CSPs, in which cost functions take only finite real values. In this paper, we deal with general-valued CSPs, where the cost functions can take both finite real values and the infinite value ($+\infty$).

2 Towards testing $M_2$-representability

2.1 Representation theorem

We introduce a class of quadratic functions on that has a bijective correspondence to binary VCSP instances. Let be a partition of with for . We say that is a VCSP-quadratic function of type if is represented as

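 $f(x_1, x_2, \dots, x_n) := \begin{cases} \sum_{i \in [n]} a_i x_i + \sum_{1 \leq i < j \leq n} a_{ij} x_i x_j & \text{if } \sum_{i \in [n]} x_i = r, \\ +\infty & \text{otherwise}, \end{cases}$ (3)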

where and with for all for some . We assume for distinct .

Suppose that a binary VCSP instance of the form (1) is given, where we assume for distinct . The transformation of to based on (2) in Section 1 is formalized as follows. Choose a partition of with and a bijective correspondence . Define if corresponds to , if and correspond to and , respectively, and otherwise. Then the function in (3) is a VCSP-quadratic function of type . Note that condition () for is equivalent to the following condition for :

():

For all , there is with .

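The class of $M_2$-convex VCSP-quadratic functions admits a decomposition into simpler functions $\ell_X$. For $X \subseteq [n]$, let $\ell_X$ be defined by

 $\ell_X(x) := \sum_{k^-(X) < k < k^+(X)} |x(X) - k|,$

where $x(X) := \sum_{i \in X} x_i$ and

 $k^-(X) :=$ the number of indices $p \in [r]$ with $X \supseteq A_p$,
 $k^+(X) :=$ the number of indices $p \in [r]$ with $X \cap A_p \neq \emptyset$.

That is, $\ell_X(x)$ is the sum of the distances from $x(X)$ to $k$ for $k^-(X) < k < k^+(X)$. In the following, we consider subsets $X \subseteq [n]$ with $k^-(X) + 2 \leq k^+(X)$, and denote the family of such subsets by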

 Π=ΠA:={X⊆[n]∣k−(X)+2≤k+(X)}.

In other words, $X \in \Pi$ if and only if $\emptyset \neq X \cap A_p \neq A_p$ for more than one $p \in [r]$.
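The quantities $k^-(X)$ and $k^+(X)$ and the membership test for $\Pi$ translate directly into code. The following is a minimal sketch; the function names and the representation of the partition as a list of Python sets are assumptions.

```python
def k_minus(X, partition):
    """Number of parts A_p that are entirely contained in X."""
    X = set(X)
    return sum(1 for A in partition if set(A) <= X)

def k_plus(X, partition):
    """Number of parts A_p that intersect X."""
    X = set(X)
    return sum(1 for A in partition if set(A) & X)

def in_Pi(X, partition):
    """X is in Pi iff at least two parts meet X without being contained in it."""
    return k_minus(X, partition) + 2 <= k_plus(X, partition)
```

With the partition {1,5}, {2,6}, {3,7}, {4,8}, the set {1,2} belongs to $\Pi$, while {1,2,5} does not, because only the part {2,6} is cut properly.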

A family is said to be laminar if , , or holds for all . For a subpartition of (which often corresponds to quadratic coefficients with the infinite value), a family is said to be -laminar if is laminar and all elements in are minimal in . Define by if and for each , and otherwise. Note that is the indicator function of . Then the following holds.
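Plain laminarity (any two members are nested or disjoint) is easy to test pairwise. The sketch below covers only this basic condition, not the additional requirements involving a subpartition; the name `is_laminar` is an assumption.

```python
from itertools import combinations

def is_laminar(family):
    """True iff every two sets in the family are nested or disjoint."""
    sets = [frozenset(S) for S in family]
    return all(X <= Y or Y <= X or not (X & Y)
               for X, Y in combinations(sets, 2))
```

For example, the family {1,2}, {1,2,3}, {4} is laminar, whereas {1,2}, {2,3} is not.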

Lemma 2.1.

For any -laminar family and any positive weight , the function is M-convex.

Proof.

We use the well-known fact that laminar convex functions are M-convex (see [27, Section 6.3]). Let the unary discrete convex function defined by for , and for . Then coincides with the function on . On the other hand, can be regarded as for a unary discrete convex function , where . Since is laminar, the function on is M-convex. Furthermore it is known [30, Theorem 3.1] that the restriction of an M-convex function to is M-convex if the effective domain is nonempty after the restriction. Hence is M-convex.

Our representation theorem (Theorem 2.2) says that an M-convex VCSP-quadratic function is always represented as the sum of and a linear function on . To state it precisely, there are substantial complications to be resolved. In our setting, we are given a VCSP-quadratic function of type , which is defined only on . It can happen that functions and are identical on (i.e., ) even when . Thus we have to make a judicious choice between them to demonstrate M-representability of .

To cope with such complications, we define an equivalence relation by: . For , let be the set of representatives (in ) of all elements in . The equivalence relation is extended to subsets of by: . A subset of is said to be -laminar if and there is a -laminar family with . A family is said to be -laminarizable if is -laminar. For simplicity, the equivalence class of is also denoted by , and a member of is also denoted by its representative .

Our first result is a representation theorem of M-convex functions. For a VCSP-quadratic function of type , let be a multipartite graph on with partition such that edge (, , ) exists if and only if . Define as the number of connected components of . A connected component with at least one edge is said to be non-isolated.

Theorem 2.2.

Let be a VCSP-quadratic function of type satisfying condition (), and be the set of non-isolated connected components of . Then is M-convex if and only if one of the following conditions (I) and (II) holds:

(I)

or , and .

(II)

, and there exist a -laminar family and a positive weight such that

 f=⎛⎝∑X∈Pfcf(X)ℓX+δB⎞⎠+δA+(linear function), (4)

where “” means a function for some and .

In the case of (II), and in (4) are uniquely determined.

By Theorem 2.2, an M-convex function has the summand with , where is a -laminar family with . Since satisfies condition (), so does . This implies and for every . The proof of Theorem 2.2 is given in Sections 2.4 and 2.5.

2.2 Decomposition and Laminarization

To test for M-representability by Theorem 2.2 for the case of , we first solve the following problem Decomposition, which detects non-M-convexity of or obtains decomposition (4).

Decomposition
Given:

A VCSP-quadratic function of type satisfying condition ().

Problem:

Either detect the non-M-convexity of , or obtain some and satisfying

 f=(∑X∈Pc(X)ℓX+δB)+δA+(linear function), (5)

where is not required to be -laminar in general, but in case of M-convex , and should coincide, respectively, with and in (4).

We emphasize that Decomposition may output the decomposition (5) even when the input is not $M_2$-convex; however, whenever Decomposition reports non-$M_2$-convexity, the input is indeed not $M_2$-convex.

Suppose that decomposition (5) is obtained after solving Decomposition. In this case we have at hand. Then we have to check for the -laminarizability of an arbitrarily chosen family with . This motivates us to consider the following problem.

Laminarization
Given:

and a subpartition of satisfying for each .

Problem:

Determine whether there exists a -laminar family with . If it exists, obtain a -laminar family with .

Laminarization is a purely combinatorial problem on a set system. Indeed, the equivalence relation can be rephrased in a combinatorial way as follows. For , define

 ⟨X⟩:=⋃{Ap∈A∣∅≠X∩Ap≠Ap}, (6)

which is the union of contributing to nonlinearly. One can see the following.

Lemma 2.3.

For , if and only if .
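The set $\langle X \rangle$ in (6) collects exactly those parts that $X$ cuts properly. A minimal sketch of its computation, with the hypothetical name `angle` and the partition given as a list of sets:

```python
def angle(X, partition):
    """Union of the parts A_p with X ∩ A_p nonempty and different from A_p."""
    X = set(X)
    result = set()
    for A in partition:
        A = set(A)
        if X & A and not A <= X:
            result |= A
    return result
```

With the partition {1,5}, {2,6}, {3,7}, {4,8} and X = {1,2,3,7}, the parts {1,5} and {2,6} are cut properly, the part {3,7} is contained in X, and the part {4,8} is disjoint from X, so the result is {1,2,5,6}.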

Laminarization can be regarded as the problem of transforming a given family to a laminar family by repeating the following operation: replace with , , or with some satisfying . Figure 1 illustrates an example of the input (left) and an output (right) of Laminarization.

A decomposition into two M-convex functions and can be constructed from and found by Decomposition and Laminarization as and . By Lemma 2.1, is an M-convex function, and is a linear function on .

For the case of or , we devise an -time algorithm for checking the linearity of in Section 3.1. For the case of , we devise an -time algorithm for Decomposition in Section 3.2 and an -time algorithm for Laminarization in Section 4. Thus we obtain Theorem 1.1.

Remark 2.4.

Our representation theorem (Theorem 2.2) and decomposition algorithm (in Section 3) are inspired by the polyhedral split decomposition due to Hirai [13]. This general decomposition principle decomposes, by means of polyhedral geometry, a function on a finite set of points of into a sum of simpler functions, called split functions, and a residue term. Actually, (5) can be viewed as a specialization of the polyhedral split decomposition, where , and is a sum of split functions. We refer the reader to [13] for details.

2.3 Relation to other problems

Relation to JWP.

A binary VCSP instance $F$ of the form (1) is said to satisfy the JWP (joint winner property) [4] if

 Fij(a,b)≥min{Fjk(b,c),Fik(a,c)}

for all distinct and all . It is shown in [4] that if satisfies the JWP, then can be transformed, in polynomial time, into a function satisfying the JWP, , and

 $|\operatorname{argmin}\{F'_{ij}(a,c), F'_{ij}(a,d), F'_{ij}(b,c), F'_{ij}(b,d)\}| \geq 2$ (7)

for any distinct , distinct , and distinct . A function with the JWP satisfying (7) is said to be Z-free. Z-free functions can be minimized in polynomial time. Thus, if satisfies the JWP, then can be minimized in polynomial time. It is shown in [18] that Z-free instances are M-representable.
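The JWP inequality can be verified by brute force over all triples of distinct variables and all value choices. The sketch below assumes that each binary cost function is stored once per unordered pair, as `binary[(i, j)][a][b]` with `i < j` and 0-indexed values; this layout and the names `pair_cost` and `satisfies_jwp` are illustrative assumptions.

```python
from itertools import permutations, product

def pair_cost(binary, i, a, j, b):
    """F_ij(a, b), looked up symmetrically from the stored pair with i < j."""
    return binary[(i, j)][a][b] if i < j else binary[(j, i)][b][a]

def satisfies_jwp(domain_sizes, binary):
    """Check F_ij(a,b) >= min{F_jk(b,c), F_ik(a,c)} for all distinct i, j, k."""
    r = len(domain_sizes)
    for i, j, k in permutations(range(r), 3):
        values = product(range(domain_sizes[i]),
                         range(domain_sizes[j]),
                         range(domain_sizes[k]))
        for a, b, c in values:
            lhs = pair_cost(binary, i, a, j, b)
            rhs = min(pair_cost(binary, j, b, k, c),
                      pair_cost(binary, i, a, k, c))
            if lhs < rhs:
                return False
    return True
```

Since the inequality is required for every ordering of the three variables, it states that in each triangle of values the smallest of the three binary costs is attained at least twice.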

The tractability based on M-representability depends solely on the function values, and is independent of how the function is given. Indeed, an M-representable instance can be characterized by the existence of a Z-free instance that satisfies for all . This stands in sharp contrast with the tractability based on the JWP, which depends heavily on the representation of . For example, let be a binary VCSP instance satisfying the JWP. By choosing a pair of distinct , , and arbitrarily, replace and by and , respectively. Then does not change but violates the JWP in general. Our result can explore such hidden M-convexity.

Relation to cross-free convexity.

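A pair $X, Y \subseteq [n]$ is said to be crossing if $X \cap Y$, $X \setminus Y$, $Y \setminus X$, and $[n] \setminus (X \cup Y)$ are all nonempty. A family $\mathcal{F} \subseteq 2^{[n]}$ is said to be cross-free if there is no crossing pair in $\mathcal{F}$. A (not necessarily binary) VCSP instance $F$ is said to be cross-free convex [5] if the function $f$ obtained from $F$ via the correspondence (2) can be represented as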

 f(x)=δA(x)+∑X∈FgX(∑i∈Xxi), (8)

where is a partition of with , is cross-free, and is a unary discrete convex function for each . Note that the polynomial-time minimization algorithm proposed in [5] for cross-free convex instances does not work unless the cross-free representation (8) is given.
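Cross-freeness of a family is a pairwise condition and can be tested directly, using the standard notion of a crossing pair (intersection, both differences, and the complement of the union all nonempty). The names `is_crossing` and `is_cross_free` are assumptions.

```python
from itertools import combinations

def is_crossing(X, Y, ground):
    """X and Y cross if X∩Y, X\\Y, Y\\X and ground\\(X∪Y) are all nonempty."""
    X, Y, ground = set(X), set(Y), set(ground)
    return bool(X & Y) and bool(X - Y) and bool(Y - X) and bool(ground - (X | Y))

def is_cross_free(family, ground):
    """True iff no two members of the family form a crossing pair."""
    return not any(is_crossing(X, Y, ground)
                   for X, Y in combinations(family, 2))
```

Every laminar family is cross-free, but not conversely: on the ground set {1,2,3}, the sets {1,2} and {2,3} overlap without crossing because their union is the whole ground set.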

Cross-free convexity is a special class of M-representability, where a (not necessarily binary) VCSP instance is M-representable if the function obtained from via the correspondence (2) is M-convex. Indeed, it is clear that is M-convex. Furthermore, by a similar argument to Lemma 2.1, the function is also M-convex on . Thus is M-convex, and hence, is M-representable.

In the case of binary VCSPs, cross-free convexity and $M_2$-representability are equivalent by Lemma 2.1 and Theorem 2.2. Hence our result provides, for binary VCSPs, a polynomial-time minimization algorithm for cross-free convex instances even when the expression (8) is not given.

Application to quadratic pseudo-Boolean function minimization.

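Consider a quadratic pseudo-Boolean function $f$ on $\{0,1\}^n$ represented as $f(x_1, \dots, x_n) = \sum_{i \in [n]} a_i x_i + \sum_{1 \leq i < j \leq n} a_{ij} x_i x_j$. For such $f$, we define $\hat{f}$ on $\{0,1\}^{2n}$ by

 $\hat{f}(x_1, \dots, x_n, x_{n+1}, \dots, x_{2n}) := \begin{cases} \sum_{i \in [n]} a_i x_i + \sum_{1 \leq i < j \leq n} a_{ij} x_i x_j + \infty \cdot \sum_{i \in [n]} x_i x_{i+n} & \text{if } \sum_{i \in [2n]} x_i = n, \\ +\infty & \text{otherwise}. \end{cases}$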

Then we have and for any . Hence minimizing is equivalent to minimizing .
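The doubling construction can be checked numerically on small cases. The sketch below assumes that $\hat{f}$ is finite exactly on the vectors with $n$ ones in which $x_i$ and $x_{i+n}$ are never both 1, matching the form of the 8-variable example given later in this subsection. The names `f_quadratic` and `f_hat`, and the storage of the coefficients $a_{ij}$ in the upper triangle of a matrix `A`, are assumptions.

```python
INF = float("inf")

def f_quadratic(x, a, A):
    """f(x) = sum_i a_i x_i + sum_{i<j} A[i][j] x_i x_j on {0,1}^n."""
    n = len(a)
    linear = sum(a[i] * x[i] for i in range(n))
    quadratic = sum(A[i][j] * x[i] * x[j]
                    for i in range(n) for j in range(i + 1, n))
    return linear + quadratic

def f_hat(z, a, A):
    """Doubled function on {0,1}^(2n); INF unless z = (x, 1 - x) for some x."""
    n = len(a)
    # The infinite penalty on x_i * x_{i+n} together with sum(z) = n forces
    # z[i + n] = 1 - z[i] for every i.
    if sum(z) != n or any(z[i] and z[i + n] for i in range(n)):
        return INF
    return f_quadratic(z[:n], a, A)
```

Under these assumptions, `f_hat` agrees with `f_quadratic` on every vector of the form (x, 1 − x) and is infinite elsewhere, so both functions have the same minimum value.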

Our result provides a new tractable class of quadratic pseudo-Boolean functions for minimization (see e.g., [1, 7]). We can regard as a function of the form (3) with the partition of given by for . Therefore, if is M-convex, then we can obtain two M-convex functions and satisfying by our proposed algorithm, and can minimize (and hence ) in polynomial time.

We give an example of such a minimizable function. Define by

 f(x1,x2,x3,x4):=4x1x2+x1x3+3x2x3+2x2x4+(linear function).

Then with is represented as

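 $\hat{f}(x) = \begin{cases} 4x_1x_2 + x_1x_3 + 3x_2x_3 + 2x_2x_4 + \infty \cdot \sum_{i \in [4]} x_i x_{i+4} + (\text{linear function}) & \text{if } \sum_{i \in [8]} x_i = 4, \\ +\infty & \text{otherwise}. \end{cases}$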

By solving Decomposition, we obtain and , , , and , where we denote by for distinct . By solving Laminarization, we obtain a laminar family (see also Figure 1). Note and by Lemma 2.3. Thus is M-convex, and we obtain and . The two M-convex functions for are given by and a linear function on .

2.4 Proof of the characterization

In this subsection, we prove the characterization part of Theorem 2.2. That is, a VCSP-quadratic function of type satisfying condition () is M-convex if and only if is a linear function on (the case of or ), or (4) holds for some and (the case of ).

We first review fundamental facts about a general quadratic (not necessarily VCSP-quadratic) function represented as

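 $g(x_1, x_2, \dots, x_n) = \begin{cases} \sum_{i \in [n]} a_i x_i + \sum_{1 \leq i < j \leq n} a_{ij} x_i x_j & \text{if } \sum_{i \in [n]} x_i = r, \\ +\infty & \text{otherwise}, \end{cases}$ (9)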

where with , , and . We assume condition () for . Let be a graph on node set such that edge () exists if and only if . Note that is not a multipartite graph in general (unlike for a VCSP-quadratic function ). Define as the number of connected components of . Let be the node sets of the non-isolated connected components of . Let be the indicator function of , which is defined as for and for . Then the M-convexity of is characterized by the following lemma, which is a refinement of the result of [16] and [31].

Lemma 2.5 ([17, Theorem 3.1]).

A function of the form (9) satisfying condition () is M-convex if and only if each connected component of is a complete graph and one of the following conditions (I), (II), and (III) holds:

(I):

and

 aij+akl≥min{aik+ajl,ail+ajk} (10)

holds for every distinct .

(II):

and

 aij+akl=ail+ajk (11)

holds for every , distinct , and distinct .

(III):

and

 aij+akl=ail+ajk (12)

holds for every distinct , distinct , and distinct .

In particular, (II) or (III) holds if and only if is represented as .

An M-convex function of the form (9) is said to be non-trivial if . We say that satisfies the anti-tree metric property if (10) holds, and that satisfies the anti-ultrametric property if

 aij≥min{aik,ajk} (13)

holds for all distinct $i, j, k \in [n]$. It is known [8] that the anti-ultrametric property is stronger than the anti-tree metric property (10). The anti-ultrametric property has a graphical interpretation, as follows.

Lemma 2.6 ([18, Lemma 8]).

satisfies the anti-ultrametric property if and only if there are a subpartition of , a -laminar family , and a positive weight such that

 aij={+∞if i,j∈B for some B∈B,∑{c(L)∣L∈L with i,j∈L}−α∗otherwise,

where .
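Conditions (10) and (13) can both be tested by brute force on a symmetric coefficient matrix. The following sketch is for small examples only; the function names and the use of a 0-indexed symmetric matrix `a` (diagonal entries are ignored) are assumptions.

```python
from itertools import permutations

def is_anti_ultrametric(a):
    """Check a[i][j] >= min(a[i][k], a[j][k]) for all distinct i, j, k."""
    n = len(a)
    return all(a[i][j] >= min(a[i][k], a[j][k])
               for i, j, k in permutations(range(n), 3))

def is_anti_tree_metric(a):
    """Check a_ij + a_kl >= min(a_ik + a_jl, a_il + a_jk) for distinct i,j,k,l."""
    n = len(a)
    return all(a[i][j] + a[k][l] >= min(a[i][k] + a[j][l], a[i][l] + a[j][k])
               for i, j, k, l in permutations(range(n), 4))
```

As an illustration in the spirit of Lemma 2.6, take the laminar family consisting of {0,1} with weight 2 and {0,1,2} with weight 1 on four elements, and let each coefficient be the total weight of the sets containing both indices. The resulting matrix passes both tests.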

Quadratic coefficients are called -coefficients if [ for some ]. The following is a variation of a well-known technique (Farris transform) in phylogenetics [33] to transform a tree metric to a ultrametric, and follows from the validity of Algorithm I in [17].

Lemma 2.7 ([17]).

Suppose that satisfies the anti-tree metric property. Let and for . Then satisfies the anti-ultrametric property and holds for any .

We now return to VCSP-quadratic functions. The following is a key proposition.

Proposition 2.8.

Let be a VCSP-quadratic function of type satisfying condition (), and be the set of non-isolated connected components (as a set of nodes) of . Then is M-convex if and only if one of the following conditions (I) and (II) holds:

(I)

or , and .

(II)

, and can be represented as

 f(x)=∑i,j∈[n]aijxixj+δA(x)+(linear function),

where are -coefficients satisfying the anti-ultrametric property.

Proof.

Note that, for any subpartition of , is an M-convex function that can be represented as on