Outer approximations of core points for integer programming

07/21/2020 ∙ by David Bremner, et al. ∙ 0

For several decades the dominant techniques for integer linear programming have been branching and cutting planes. Recently, several authors have developed core point methods for solving symmetric integer linear programs (ILPs). An integer point is called a core point if its orbit polytope is lattice-free. It has been shown that for symmetric ILPs, optimizing over the set of core points gives the same answer as considering the entire space. Existing core point techniques rely on the number of core points (or equivalence classes) being finite, which requires special symmetry groups. In this paper we develop some new methods for solving symmetric ILPs (based on outer approximations of core points) that do not depend on finiteness but are more efficient if the group has large disjoint cycles in its set of generators.


1 Introduction

Formulation symmetries occur in practice when relabellings yield an equivalent problem structure; this causes repeated work for branching solvers, and state-of-the-art commercial and research solvers make efforts to break symmetries [9]. Let $G \le S_n$ be a permutation group acting on $\mathbb{R}^n$ by permuting coordinates. For any integer point $z \in \mathbb{Z}^n$, the orbit polytope of $z$ is the convex hull of the $G$-orbit of $z$. If the vertices of an orbit polytope are the only integer points in the polytope, we call it lattice-free and call $z$ a core point. Instead of seeing symmetry as a problem, core point techniques seek to exploit it to solve integer linear programs (ILPs) faster. In the most direct approach, when the number of core points is finite (which only holds for certain special groups), core points are enumerated and tested individually [2]. It should be noted that core point techniques are not useful for binary problems since every $\{0,1\}$-point is a core point; [5] considers an alternative approach based on lexicographical order.

The symmetries of the MIPLIB 2010 and 2017 instances were computed in [8]; that study shows that many instances are affected by symmetry.

A (not necessarily polyhedral) outer approximation is a set of constraints that is satisfied by all of the points in the set one wishes to approximate. A well-known example of an outer approximation is an ILP, where the (initial) linear constraints define an outer approximation of the integer points inside. Outer approximations lead naturally to a hybrid approach where synthesized constraints are added to an existing formulation, which is then solved with a traditional solver. Outer approximations are implicit in previous results bounding the distance of core points to certain linear subspaces (see e.g. Theorem 3.2.4 in [10]). The distance bounds do not themselves seem to be tight enough to provide a practical improvement for solving ILPs. In this paper we develop some new constraints for problems with formulation symmetries. While these constraints are nonlinear and non-convex, initial experiments with nonlinear solvers seem promising.

In Section 2 we give some basic definitions. In Section 3 we consider integer linear programs with cyclic symmetry groups and provide some new constraints to determine outer approximations of their core points. We also provide an algorithm that uses these constraints to solve an ILP. In Section 4 we generalize this algorithm to ILPs in which only some of the variables have cyclic symmetry. In Section 5 we generalize the algorithm to direct products of cyclic groups. In Section 6 we classify permutation groups based on their generators and explain how the algorithms of the previous sections can be applied to ILPs having arbitrary permutation groups as symmetry groups. Finally, in Section 7 we use our algorithms to solve some hard symmetric integer linear feasibility problems.

2 Basic Definitions

Definition 2.1.

Let $P$ be a convex polytope with integral vertices. We call $P$ lattice-free if $P \cap \mathbb{Z}^n = \operatorname{vert}(P)$, where $\operatorname{vert}(P)$ is the set of vertices of $P$.

In this paper we assume that all groups considered are permutation groups which act by permuting coordinates.

Definition 2.2.

Let $G$ be a finite group of isometries acting on Euclidean space $\mathbb{R}^n$. Let $Gz$ be the $G$-orbit of some point $z \in \mathbb{R}^n$. We call the convex hull of this orbit an orbit polytope and denote it by $\operatorname{conv}(Gz)$.

Definition 2.3.

Let $G$ be a finite group of unimodular matrices. A point $z \in \mathbb{Z}^n$ is called a core point for $G$ if and only if the orbit polytope $\operatorname{conv}(Gz)$ is lattice-free.

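Let $\mathcal{C}_n = \langle \sigma \rangle$ be the cyclic group generated by $\sigma$, a cyclic permutation of coordinates. In other words, $\sigma$ is given by

$$\sigma(x_1, x_2, \dots, x_n) = (x_n, x_1, \dots, x_{n-1}).$$

We represent points in $\mathbb{R}^n$ by column vectors, but for convenience we write such vectors here in transposed form. The map $\sigma$ is easily iterated:

$$\sigma^k(x_1, \dots, x_n) = (x_{n-k+1}, \dots, x_n, x_1, \dots, x_{n-k}).$$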
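As a small illustration of the cyclic action, the following sketch implements a coordinate shift and its iterates in plain Python. The function names `sigma` and `orbit` are hypothetical, and the choice of moving every entry one position to the right is an assumption made for this example; shifting in the opposite direction generates the same group and therefore the same orbits.

```python
def sigma(x, k=1):
    """Apply the cyclic shift k times: entry i moves to position (i + k) mod n.

    The shift direction is an assumption of this sketch.
    """
    n = len(x)
    k %= n
    return tuple(x[(i - k) % n] for i in range(n))


def orbit(x):
    """Return the distinct images of x under all powers of the shift."""
    seen = []
    for k in range(len(x)):
        y = sigma(x, k)
        if y not in seen:
            seen.append(y)
    return seen


z = (2, 0, 1, 0)
print(sigma(z))              # one shift
print(sigma(z, len(z)) == z) # n shifts return the original point
print(orbit((1, 0, 1, 0)))   # a point whose orbit has fewer than n elements
```

Points with a repeating pattern, such as the last example, have orbits with fewer than $n$ elements, which is one way a circulant matrix can become singular.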

Definition 2.4.

Let act on a set . The subset of preserved by all elements of is called the fixed space

In particular, we denote by , the intersection

Definition 2.5.

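We define the $k$-th layer to be the set

$$\mathbb{Z}^n_{(k)} = \Big\{ z \in \mathbb{Z}^n : \sum_{i=1}^{n} z_i = k \Big\}.$$

Note that the set $\mathbb{Z}^n_{(k)}$ is $G$-invariant because $G$ acts by permuting coordinates.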
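The invariance of layers can be checked directly on a small example. The sketch below assumes that the layer index of an integer point is the sum of its coordinates; the helper name `layer` is hypothetical.

```python
from itertools import permutations


def layer(z):
    """Layer index of z, taken here to be the sum of its coordinates."""
    return sum(z)


z = (3, -1, 0, 2)
# Every coordinate permutation of z, not only the cyclic ones, keeps the sum.
layers = {layer(p) for p in permutations(z)}
print(layers)
```

Because the check runs over all coordinate permutations, it covers any permutation group acting on the coordinates, not only cyclic groups.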

Definition 2.6.

Two points $x, y \in \mathbb{Z}^n$ are called equivalent if there exists $g \in G$ such that $y = gx$. It follows from the group axioms that this is an equivalence relation.

Definition 2.7.

Two points are called isomorphic if there exists such that . This is an equivalence relation because is a lattice.

Potentially larger equivalence classes (based on normalizers) of core points are studied in [6].

Definition 2.8.

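Two integer points $x$ and $y$ in $\mathbb{Z}^n$ are called co-projective if there exists an integer $k$ such that $x - y = k\mathbf{1}$, where $\mathbf{1}$ is the all-ones vector. Equivalently, $x$ is a translation of $y$ through the fixed space.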

Definition 2.9.

A point $z \in \mathbb{Z}^n$ is called a universal core point if it is isomorphic to a $\{0,1\}$-vector.

Definition 2.10.

An integer point is called an atom if there is a universal core point in the layer containing such that the distance between and is .

For example if , then the fixed space is spanned by and is a universal core point. And is an atom since its distance to the universal core point is .

Definition 2.11.

Suppose the cyclic group acts on each point by permuting coordinates. Then is called active if there is such that

For example if acts on then are active but is non-active.

3 Circulant Matrices

Circulant matrices play an important role in finding our constraints because any point in the orbit polytope of the integer point $z$ under the cyclic group can be written as $C_z \lambda$, where $\lambda \ge 0$, $\sum_i \lambda_i = 1$, and $C_z$ is the circulant matrix of $z$.

Definition 3.1.

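A circulant matrix is a matrix where each column vector is rotated one element down relative to the preceding column vector. An $n \times n$ circulant matrix $C$ takes the form

$$C = \begin{pmatrix} c_0 & c_{n-1} & \cdots & c_1 \\ c_1 & c_0 & \cdots & c_2 \\ \vdots & \vdots & \ddots & \vdots \\ c_{n-1} & c_{n-2} & \cdots & c_0 \end{pmatrix}.$$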
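To make the link between circulant matrices and orbit polytopes concrete, the sketch below builds the circulant matrix of a point and evaluates one convex combination of its columns. The helper names `circulant` and `combine`, the zero-based indexing and the sample weights are assumptions of this example.

```python
from fractions import Fraction


def circulant(z):
    """Matrix whose k-th column is z rotated k positions down."""
    n = len(z)
    return [[z[(i - k) % n] for k in range(n)] for i in range(n)]


def combine(C, lam):
    """Matrix-vector product C * lam."""
    return [sum(row[k] * lam[k] for k in range(len(lam))) for row in C]


z = (2, 0, 1)
C = circulant(z)
for row in C:
    print(row)

# Nonnegative weights summing to one give a point of the orbit polytope.
lam = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
p = combine(C, lam)
print(p, sum(p) == sum(z))
```

The columns of `C` are exactly the cyclic shifts of `z`, so the product is a convex combination of orbit points and stays in the same layer as `z`.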

One notable property of circulant matrices is that all $n \times n$ circulant matrices share the same eigenvectors. The eigenvalues differ from matrix to matrix, but since the eigenvectors are known, a circulant matrix can be diagonalized easily. For more detailed background on circulant matrices see [4].

The -th eigenvector for any circulant matrix is given by:

(1)

where Suppose (usually for us will be an integer point in ). By Euler’s formula we have , where

(2)
(3)

The eigenvalue of the -th eigenvector of the circulant matrix is given by

(4)
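The eigenvector and eigenvalue formulas can be verified numerically. The sketch below uses one common convention, which is an assumption here and may differ in sign or indexing from the one used in the text: with $\omega = e^{2\pi i/n}$, the $j$-th eigenvector has entries $\omega^{jk}/\sqrt{n}$ and the matching eigenvalue of the circulant matrix with first column $z$ is $\sum_m z_m \omega^{-jm}$. The function names are hypothetical.

```python
import cmath


def circulant(z):
    """Matrix whose k-th column is z rotated k positions down."""
    n = len(z)
    return [[z[(i - k) % n] for k in range(n)] for i in range(n)]


def eigenpair(z, j):
    """Return (eigenvalue, eigenvector) number j for the circulant matrix of z."""
    n = len(z)
    w = cmath.exp(2j * cmath.pi / n)  # primitive n-th root of unity
    vec = [w ** (j * k) / n ** 0.5 for k in range(n)]
    val = sum(z[m] * w ** (-j * m) for m in range(n))
    return val, vec


def residual(z, j):
    """Largest absolute entry of C v - lambda v for eigenpair number j."""
    n = len(z)
    C = circulant(z)
    val, vec = eigenpair(z, j)
    Cv = [sum(C[i][k] * vec[k] for k in range(n)) for i in range(n)]
    return max(abs(Cv[i] - val * vec[i]) for i in range(n))


z = (2, 0, 1, 0)
for j in range(len(z)):
    val, _ = eigenpair(z, j)
    print(j, round(val.real, 9), round(val.imag, 9), residual(z, j) < 1e-9)
```

The eigenvectors depend only on $n$, while the eigenvalues are the discrete Fourier transform of the first column, which is why a circulant matrix can be diagonalized without any numerical eigenvalue routine.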

So we have

(5)

where

is the orthogonal matrix composed of the eigenvectors as columns, and

is the diagonal matrix with diagonal elements .

The inverse of a non-singular circulant matrix is again circulant [7], and it is given by

(6)

Since is a diagonal matrix its inverse is also a diagonal matrix with diagonal elements , where

(7)
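The following sketch checks numerically that inverting the eigenvalues yields a circulant inverse. It assumes the same convention as a standard discrete Fourier transform (eigenvalues $\sum_m z_m \omega^{-jm}$ with $\omega = e^{2\pi i/n}$), which may differ from the paper's notation; `inverse_first_column` and `matmul` are hypothetical helper names.

```python
import cmath


def circulant(z):
    """Matrix whose k-th column is z rotated k positions down."""
    n = len(z)
    return [[z[(i - k) % n] for k in range(n)] for i in range(n)]


def inverse_first_column(z, tol=1e-12):
    """First column of the inverse of the circulant matrix of z."""
    n = len(z)
    w = cmath.exp(2j * cmath.pi / n)
    eig = [sum(z[m] * w ** (-j * m) for m in range(n)) for j in range(n)]
    if min(abs(e) for e in eig) < tol:
        raise ValueError("singular circulant matrix")
    # Entry i of the first column of U diag(1/eig) U^*.
    col = [sum(w ** (j * i) / eig[j] for j in range(n)) / n for i in range(n)]
    return [c.real for c in col]  # imaginary parts vanish for real z


def matmul(A, B):
    """Plain matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


z = (2, 0, 1)
b = inverse_first_column(z)
P = matmul(circulant(z), circulant(b))  # should be close to the identity
print([round(v, 6) for v in b])
print([[round(v, 6) for v in row] for row in P])
```

Since the inverse is determined by its first column, only $n$ numbers need to be computed instead of $n^2$.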
Remark 3.1.

Note that the length of the projection of a vector onto a complex vector is defined as

(8)

Furthermore, the term in (7) is the length of the projection of onto invariant subspaces .

Theorem 3.1.

Let the orbit polytope of be full dimensional. Then the inverse of the circulant matrix is where is defined as follows

where

For the proof of Theorem 3.1 we use the following Lemma.

Lemma 3.2.

Let the orbit polytope of be full dimensional. Then the inverse of the circulant matrix is where is defined as follows:

(9)

Note that

Proof.

By (6) we have . Now suppose is the -th component of , we have

So , the -th row of , is equal to

Now since is a circulant matrix it is enough to find the first column of (the other columns can be found by permutation of the first column). Notice that the first row and column of and is so multiplying each row of with vector gives us the first column of which is

where

Proof of Theorem 3.1: As shown in Lemma 3.2, we have

Since and also and are complex conjugates of each other, for odd we have

Recall that so the last equality holds because and is in terms of and is in terms of .

If is even then for we have and so does not have the complex conjugate pair. But since the imaginary part of it is zero we have

Lemma 3.3.

Let and be non-singular then .

Proof.

Recall that the orbit polytope of can be described as follows

Suppose is non-singular and is a point in , from Theorem 3.1 we have

Since and are in the same layer we have

hence

Since , and (since is non-singular, and are not in layer 0), must be zero. ∎

Remark 3.4.

Note that if and then

This implies that Now suppose is invertible (then ) and so . To check if or not, we only need to check if and the constraint is redundant.

Let us denote by the first row of which is

This will simplify the notation in Section 3.2.

3.1 New constraints for singular circulant matrices

As was mentioned before, the circulant matrix corresponding to an integer point is not invertible if and only if the determinant of is zero. Since the determinant of a square matrix is equal to the product of its eigenvalues we have

(10)

Furthermore, since for , is complex conjugate of we have

if is odd.

if is even.
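The determinant criterion can be illustrated on small examples. The sketch below compares an exact determinant, computed with rational arithmetic, against the product of the eigenvalues, assuming the eigenvalue convention $\sum_m z_m \omega^{-jm}$ with $\omega = e^{2\pi i/n}$. The function names and the sample points are assumptions of this example.

```python
import cmath
from fractions import Fraction


def circulant(z):
    """Matrix whose k-th column is z rotated k positions down."""
    n = len(z)
    return [[z[(i - k) % n] for k in range(n)] for i in range(n)]


def exact_det(M):
    """Determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(v) for v in row] for row in M]
    n, det = len(A), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            det = -det  # a row swap flips the sign
        det *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return det


def eigen_product(z):
    """Product of the eigenvalues of the circulant matrix of z."""
    n = len(z)
    w = cmath.exp(2j * cmath.pi / n)
    prod = 1
    for j in range(n):
        prod *= sum(z[m] * w ** (-j * m) for m in range(n))
    return prod


for z in [(1, 2, 3), (1, 0, 1, 0)]:
    print(z, exact_det(circulant(z)), abs(eigen_product(z)) < 1e-9)
```

The second sample point has a vanishing eigenvalue, so its circulant matrix is singular; such points are the ones targeted by the singular-case constraints.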

So in order to check if an ILP has an optimal feasible integer solution whose circulant matrix is singular, we could add the following constraints to the problem

(11)
(12)

Equations (11) and (12) are polynomials in of degree and hard to solve. We can reformulate them in a way that makes them easier to solve in practice by replacing constraints (11) and (12) with

(13)
(14)

where for ,

is a binary variable and

is a term in the multiplication formula. Note that constraint (14) forces at least one of the to be zero and so the determinant will be zero.

3.2 New constraints for non-singular circulant matrices

In this section we develop some new constraints to find an outer approximation of core points. These new constraints depend only on the symmetry group and not the ILP.

Suppose is an arbitrary integer point in . By Remark 3.4, a point is in iff there exists such that

(15)

If is non-singular we have

(16)

and by Theorem 3.1

(17)

The orbit polytope of a core point is a lattice-free polytope, so is a core point iff for at least one of the permutations of any integer point we have

(18)

or equivalently since and are in the same layer

(19)

It should be mentioned that, for each the constraint is nonlinear and non-convex with respect to .

Note that

is the equation of a hyperplane in

with normal vector which is perpendicular to the fixed space (Lemma 3.3). Furthermore, is a core point if for all integer points there is an index such that .
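For very small instances the membership test behind these constraints can be carried out by brute force. The sketch below is not the method of this paper; it only illustrates the underlying characterization: when the circulant matrix of $z$ is non-singular, a point $y$ lies in the orbit polytope exactly when the unique solution $\lambda$ of $C_z \lambda = y$ is nonnegative and sums to one. The function names are hypothetical, and the search box relies on the fact that every coordinate of a point in the orbit polytope lies between the smallest and largest entry of $z$.

```python
from fractions import Fraction
from itertools import product


def circulant(z):
    """Matrix whose k-th column is z rotated k positions down."""
    n = len(z)
    return [[z[(i - k) % n] for k in range(n)] for i in range(n)]


def solve(M, rhs):
    """Solve M x = rhs exactly; return None if M is singular."""
    n = len(M)
    A = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return None
        A[c], A[pivot] = A[pivot], A[c]
        piv = A[c][c]
        A[c] = [a / piv for a in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [row[n] for row in A]


def in_orbit_polytope(y, z):
    """Membership test, valid only for a non-singular circulant matrix."""
    lam = solve(circulant(z), y)
    if lam is None:
        raise ValueError("singular circulant matrix: this test does not apply")
    return all(l >= 0 for l in lam) and sum(lam) == 1


def is_core_point(z):
    """Brute-force check that the orbit polytope of z is lattice-free."""
    n = len(z)
    orbit = {tuple(z[(i - k) % n] for i in range(n)) for k in range(n)}
    box = range(min(z), max(z) + 1)
    for y in product(box, repeat=n):
        if sum(y) == sum(z) and y not in orbit and in_orbit_polytope(y, z):
            return False
    return True


for z in [(1, 0, 0), (1, 1, 0), (2, 0, 0), (2, 1, 0)]:
    print(z, is_core_point(z))
```

The enumeration grows exponentially with $n$, which is one reason to restrict attention to a small, well-chosen set of test points rather than all integer points of a layer.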

Suppose is the orbit polytope of a non core point and is the orbit polytope of a universal core point in the same layer. Many integer points whose orbit polytope contains , also have orbit polytopes containing . The idea for making new constraints is to remove from the feasible region all integer points whose orbit polytope contains atoms or universal core points. This process can be done by searching layer by layer in the feasible region.

Lemma 3.5.

For two co-projective integer points , we have

for all .

Moreover, constraints (19) are invariant under translation in the fixed space; that is, if and are two co-projective integer points in the same layer as and respectively, we have

Proof.

Let . Since lies in the fixed space for any , and the fixed space is orthogonal to other invariant subspaces , , we have

So by Theorem 3.1 they have the same Now let and then by Lemma 3.3 we have

(20)
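The orthogonality step in this proof can be checked numerically. The sketch below assumes that a translation through the fixed space adds the same integer to every coordinate, and it uses the eigenvalue convention $\sum_m z_m \omega^{-jm}$ with $\omega = e^{2\pi i/n}$; both are assumptions of this example, as are the function names.

```python
import cmath


def eigenvalues(z):
    """Eigenvalues of the circulant matrix of z (a discrete Fourier transform)."""
    n = len(z)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(z[m] * w ** (-j * m) for m in range(n)) for j in range(n)]


def eigenvalue_gaps(z, k):
    """Absolute change of each eigenvalue when k is added to every coordinate."""
    shifted = tuple(v + k for v in z)
    return [abs(a - b) for a, b in zip(eigenvalues(z), eigenvalues(shifted))]


gaps = eigenvalue_gaps((3, 0, 1, 2), 5)
print([round(g, 9) for g in gaps])
```

Only the eigenvalue with index zero, which corresponds to the fixed space, changes; the remaining ones are unaffected by the translation.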

The barycenter of the orbit polytope of $z \in \mathbb{Z}^n_{(k)}$ is given by

$$\frac{1}{n} \sum_{i=0}^{n-1} \sigma^i(z) = \frac{k}{n}\,\mathbf{1}. \tag{21}$$

Note that this point lies in the fixed space. Since (21) is a convex combination, core points in layers $k = mn$, $m \in \mathbb{Z}$, are easy to compute. In other words, in these layers the barycenter is an integer point and so all core points lie in the fixed space. Furthermore, these layers whose index is a multiple of $n$ provide a stopping criterion as follows. Suppose is an optimal solution of a maximization ILP and ; one of the layers between and is a multiple of $n$. Let be this layer. Since the feasible region is convex, if the intersection with layer is lattice-free then the problem is infeasible. So for solving an ILP we need to search in at most $n$ layers.
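A short sketch of the barycenter computation and of the layer arithmetic behind the stopping criterion follows. It assumes the cyclic group acts on all $n$ coordinates; the function names are hypothetical, and `nearest_multiple_at_or_below` only illustrates that any $n$ consecutive layers contain exactly one whose index is a multiple of $n$.

```python
from fractions import Fraction


def barycenter(z):
    """Average of the n cyclic shifts of z, computed exactly."""
    n = len(z)
    return [Fraction(sum(z[(i - k) % n] for k in range(n)), n)
            for i in range(n)]


def barycenter_is_integral(z):
    """True when the layer index of z is a multiple of n."""
    return sum(z) % len(z) == 0


def nearest_multiple_at_or_below(k, n):
    """Largest multiple of n not exceeding the layer index k."""
    return k - (k % n)


print(barycenter((2, 1, 0)), barycenter_is_integral((2, 1, 0)))
print(barycenter((2, 0, 0)), barycenter_is_integral((2, 0, 0)))
print(nearest_multiple_at_or_below(11, 4))
```

Every coordinate of the barycenter equals the layer index divided by $n$, so integrality depends only on the layer and not on the particular point.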

Note that in order to check whether an integer point is a core point, it is impossible to add constraints (19) for all because there are infinitely many integer points in each layer. But we can do that for a finite subset of integer points in . As was described above, a good choice is atoms and universal core points since they are the closest points to the barycenter. For this finite set we have the following definition.

Definition 3.2.

In each layer , a chosen subset of atoms and universal core points for making inequalities (19) is called the essential set of the layer and denoted by .

Remark 3.6.

Lemma 3.3 plays an important role in defining the specific essential set for any layer. Actually, we can translate each integer point through the fixed space to get an integer point with entries and use inequality (19) which can be written as

(22)

Consider the following equivalence relation between layers of

(23)

where is an integer. Since the sum of is zero, inequality (22) is the same for all layers that are in the same equivalence class.

For example, for an ILP in , we can define the essential set in layer as:

So the corresponding constraint for is

(24)

Since by Lemma 3.3 we have , inequality (24) can be written as

Furthermore, we can define the essential set as follows and use constraint (22).

If we consider layers as representations of each class, we can define the essential set for layers and use universal or atom points with entries .

Definition 3.3.

The essential set in layers is called the projected essential set and it is denoted by for .

The idea is that first we search for integer points whose circulant matrix is singular. If no integer point is found, by adding new constraints in each layer we search for an integer point whose orbit polytope does not contain the integer points of the essential set . Notice that since in this step is non-singular we can make new constraints non-singular by adding the following inequalities.

(25)

We will present variations on this idea for different kinds of symmetry group. In all of these variations there are three different types of subproblems, as follows:

  1. Adding constraints (19) and (25) for each point in the projected essential set.

  2. Adding constraints (13) and (14) for the case of singular circulant matrices.

  3. Adding constraints described to check the feasibility of an integer point in one of the essential sets.

Algorithm 1

Suppose is the objective function of the ILP. The algorithm for solving a maximization ILP has the following steps:

  1. Construct a subproblem of type by adding constraints (13) and (14) to .

  2. If is integer feasible, denote the optimal solution by ; if not set .

  3. Solve the LP relaxation of to determine layers and upper bound of the solution.

  4. We start from the last layer which is the maximum layer determined in the previous step.

  5. Construct subproblems , of type for checking the feasibility of each integer point in .

  6. If one of the subproblems is integer feasible, denote the feasible point by and go to step 10. If not set .

  7. Construct a subproblem of type by adding constraints (25) for all and constraints (19) for each to .

  8. If is integer feasible, denote the optimal solution by and go to step 11. If not set .

  9. If for some integer , go to step 10. If not set and go to step 5.

  10. Stop, if max is finite then it is the optimal objective value, otherwise the ILP is infeasible.

  11. Stop, if max is finite then it is the optimal objective value, otherwise the ILP is infeasible.

4 Partial-Circulant Matrices

In the previous section we assumed that the order of the cyclic permutation group of an ILP is equal to the dimension of the problem. In this section we generalize the algorithm of the previous section to some ILPs where not all variables are active.

For the remainder of this section, without loss of generality, we assume that the cyclic permutation group acts on the first variables. For any vector we denote by the first coordinates that are active in the cyclic group.

Notice that in this case the symmetry group is not transitive. Also the dimension of the fixed space is and it is spanned by the following orthogonal vectors

Moreover, invariant subspaces of the action of on are the same as on except that we have zero in non-active coordinates.

The following definition is a generalization of the circulant matrix to define the orbit polytope of an integer point in this case.

Definition 4.1.

For , a Partial-Circulant Matrix is a matrix where the first rows form a circulant matrix and the remaining rows are scalar multiples of . A partial circulant matrix takes the form