Visualizing Multidimensional Linear Programming Problems

The article proposes an n-dimensional mathematical model of the visual representation of a linear programming problem. This model makes it possible to use artificial neural networks to solve multidimensional linear optimization problems whose feasible region is a bounded non-empty set. To visualize the linear programming problem, an objective hyperplane is introduced, the orientation of which is determined by the gradient of the linear objective function: the gradient is the normal to the objective hyperplane. In the case of searching for a maximum, the objective hyperplane is positioned in such a way that the value of the objective function at all of its points exceeds the value of the objective function at all points of the feasible region, which is a bounded convex polytope. For an arbitrary point of the objective hyperplane, the objective projection onto the polytope is determined: the closer the objective projection point is to the objective hyperplane, the greater the value of the objective function at this point. Based on the objective hyperplane, a finite regular set of points, called the receptive field, is constructed. Using objective projections, an image of the polytope is built. This image includes the distances from the receptive points to the corresponding points of the polytope surface. Based on the proposed model, a parallel algorithm for visualizing a linear programming problem is constructed, and an analytical estimation of its scalability is performed. Information about the software implementation and the results of large-scale computational experiments confirming the efficiency of the proposed approaches is presented.


1 Introduction

The rapid development of big data technologies [12, 11] has led to the emergence of mathematical optimization models in the form of large-scale linear programming (LP) problems [24]. Such problems arise in industry, economics, logistics, statistics, quantum physics, and other fields [4, 8, 22, 3, 25]. In many cases, conventional software is not able to handle such large-scale LP problems in an acceptable time [2]. At the same time, in the near future, exascale supercomputers will appear [6], which are potentially capable of solving such problems. Accordingly, the issue of developing new effective methods for solving large-scale LP problems using exascale supercomputing systems is urgent.

Until now, the class of algorithms proposed and developed by Dantzig on the basis of the simplex method [5] has remained one of the most common ways to solve LP problems. The simplex method is effective for a large class of LP problems. However, it has some fundamental features that limit its applicability to large LP problems. First, in the worst case, the simplex method traverses all the vertices of the simplex, which results in exponential time complexity [35]. Second, in most cases, the simplex method successfully solves LP problems containing up to 50,000 variables, but a loss of precision is observed when it is applied to larger LP problems. Such a loss of precision cannot be compensated even by applying such computationally intensive procedures as "affine scaling" or "iterative refinement" [34]. Third, the simplex method does not scale well on multiprocessor systems with distributed memory: many attempts to parallelize it have been made, but they all failed [19]. In [14], Karmarkar proposed the interior point method, which has polynomial time complexity in all cases. This method effectively solves problems with millions of variables and millions of constraints. Unlike the simplex method, the interior point method is self-correcting and is therefore robust to the loss of precision in computations. The drawbacks of the interior point method are as follows. First, it requires careful tuning of its parameters. Second, it needs a known point belonging to the feasible region of the LP problem to start the calculations; finding such an interior point can be reduced to solving an additional LP problem. An alternative is the iterative projection-type methods [23, 26, 31], which are also self-correcting. Third, like the simplex method, the interior point method does not scale well on multiprocessor systems with distributed memory. Several attempts at efficient parallelization for particular cases have been made (see, for example, [10, 15]), but efficient parallelization for the general case has not been achieved. Accordingly, research directions related to the development of new scalable methods for solving LP problems are urgent.

A possible efficient alternative to the conventional LP methods is optimization methods based on neural network models. Artificial neural networks [20, 21] are one of the most promising and rapidly developing areas of modern information technology. Neural networks are a universal tool capable of solving problems in almost all areas. The most impressive successes have been achieved in image recognition and analysis using convolutional neural networks [18]. However, in scientific periodicals, there are almost no works devoted to the use of convolutional neural networks for solving linear optimization problems [17]. The reason is that convolutional neural networks are focused on image processing, and there have been no works on the visual representation of multidimensional linear programming problems in the scientific literature. Thus, the issue of developing new neural network models and methods focused on linear optimization remains open.

In this paper, we have tried to develop an n-dimensional mathematical model of the visual representation of the LP problem. This model allows us to employ the technique of artificial neural networks to solve multidimensional linear optimization problems whose feasible region is a bounded nonempty set. The visualization method based on the described model has a high computational complexity. For this reason, we propose its implementation as a parallel algorithm designed for cluster computing systems. The rest of the paper is organized as follows. Section 2 is devoted to the design of a mathematical model of the visual representation of multidimensional LP problems. Section 3 describes the implementation of the proposed visualization method as a parallel algorithm and provides an analytical estimation of its scalability. Section 4 presents information about the software implementation of the described parallel algorithm and discusses the results of large-scale computational experiments on a cluster computing system. The Conclusion summarizes the obtained results and outlines directions for further research.

2 Mathematical Model of the LP Visual Representation

The linear optimization problem can be stated as follows:

$\bar{x} = \arg\max\{\langle c, x \rangle \mid Ax \le b,\ x \in \mathbb{R}^n\}, \qquad (1)$

where $c \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, and $A \in \mathbb{R}^{m \times n}$. Here and below, $\langle \cdot, \cdot \rangle$ stands for the dot product of vectors. We assume that the constraint $x \ge 0$ is also included in the system $Ax \le b$ in the form of the inequalities $-x_j \le 0$ ($j = 1, \dots, n$).

The vector $c$ is the gradient of the linear objective function

$f(x) = \langle c, x \rangle. \qquad (2)$

Let $M$ denote the feasible region of problem (1):

$M = \{x \in \mathbb{R}^n \mid Ax \le b\}. \qquad (3)$

We assume from now on that $M$ is a nonempty bounded set. This means that $M$ is a convex closed polytope in the space $\mathbb{R}^n$, and the solution set of problem (1) is not empty.

Let $a_i$ be the vector formed by the elements of the $i$th row of the matrix $A$. Then the matrix inequality $Ax \le b$ is represented as the system of inequalities

$\langle a_i, x \rangle \le b_i,\ i = 1, \dots, m. \qquad (4)$

We assume from now on that

$a_i \ne 0 \qquad (5)$

for all $i = 1, \dots, m$. Let us denote by $H_i$ the hyperplane defined by the equation

$\langle a_i, x \rangle = b_i. \qquad (6)$

Thus,

$H_i = \{x \in \mathbb{R}^n \mid \langle a_i, x \rangle = b_i\}. \qquad (7)$

The half-space $H_i^-$ generated by the hyperplane $H_i$ is the half-space defined by the equation

$H_i^- = \{x \in \mathbb{R}^n \mid \langle a_i, x \rangle \le b_i\}. \qquad (8)$

From now on, we assume that problem (1) is non-degenerate, that is

(9)

The half-space $H_i^-$ generated by the hyperplane $H_i$ is recessive with respect to the vector $c$ if

$\forall x \in H_i,\ \forall \lambda \in \mathbb{R}_{>0}:\ x - \lambda c \in H_i^- \wedge x - \lambda c \notin H_i. \qquad (10)$

In other words, any ray coming from the hyperplane $H_i$ in the direction opposite to the vector $c$ lies completely in $H_i^-$, but not in $H_i$. The necessary and sufficient condition for the recessivity of the half-space (8) with respect to the vector $c$ is the condition

$\langle a_i, c \rangle > 0. \qquad (11)$

Let us prove the necessity first. Let condition (10) hold. Equation (7) implies that, for any $x \in H_i$,

$\langle a_i, x \rangle = b_i. \qquad (12)$

By virtue of (5), such a point exists, since the point

$\hat{x} = \frac{b_i}{\|a_i\|^2}\, a_i \qquad (13)$

belongs to $H_i$. Comparing (10) with (12) and (13), we obtain that $\hat{x} - \lambda c \in H_i^-$ and $\hat{x} - \lambda c \notin H_i$ for every $\lambda > 0$. In view of (7) and (8), this implies

$\langle a_i, \hat{x} - \lambda c \rangle < b_i. \qquad (14)$

Using simple algebraic transformations of inequality (14), we obtain $b_i - \lambda \langle a_i, c \rangle < b_i$, whence $\lambda \langle a_i, c \rangle > 0$; since $\lambda > 0$, this gives (11). Thus, the necessity is proved.

Let us prove the sufficiency by contradiction. Assume that (11) holds, and that there are $x \in H_i$ and $\lambda > 0$ such that

$x - \lambda c \notin H_i^- \ \vee\ x - \lambda c \in H_i.$

In accordance with (7) and (8), this implies

$\langle a_i, x - \lambda c \rangle \ge b_i,$

which is equivalent to

$\langle a_i, x \rangle - \lambda \langle a_i, c \rangle \ge b_i.$

Since $x \in H_i$, that is, $\langle a_i, x \rangle = b_i$, it follows that $\lambda \langle a_i, c \rangle \le 0$. However, since $\lambda > 0$, it follows from (11) that $\lambda \langle a_i, c \rangle > 0$, but this contradicts our assumption. Fix a point $z \in \mathbb{R}^n$ such that the half-space

$\widehat{H}_c = \{x \in \mathbb{R}^n \mid \langle c, x - z \rangle \le 0\} \qquad (15)$

includes the polytope $M$:

$M \subset \widehat{H}_c.$

In this case, we will call the half-space $\widehat{H}_c$ the objective half-space, and the hyperplane $H_c$, defined by the equation

$\langle c, x \rangle = \langle c, z \rangle, \qquad (16)$

the objective hyperplane. Denote by $\gamma_c(x)$ the orthogonal projection of a point $x$ onto the objective hyperplane $H_c$:

$\gamma_c(x) = x + \frac{\langle c, z - x \rangle}{\|c\|^2}\, c. \qquad (17)$

Here, $\|\cdot\|$ stands for the Euclidean norm. Define the distance $\rho_c(x)$ from a point $x \in \widehat{H}_c$ to the objective hyperplane $H_c$ as follows:

$\rho_c(x) = \|\gamma_c(x) - x\|. \qquad (18)$

Comparing (15), (17) and (18), we find that, in this case, the distance can be calculated as follows:

$\rho_c(x) = \frac{\langle c, z - x \rangle}{\|c\|}. \qquad (19)$

The following proposition holds: for all $x, y \in \widehat{H}_c$,

$\langle c, x \rangle > \langle c, y \rangle \Leftrightarrow \rho_c(x) < \rho_c(y).$

Indeed, equation (19) implies that

$\rho_c(x) < \rho_c(y) \Leftrightarrow \langle c, z - x \rangle < \langle c, z - y \rangle \Leftrightarrow \langle c, x \rangle > \langle c, y \rangle.$

This proposition says that problem (1) is equivalent to the following problem:

$\bar{x} = \arg\min\{\rho_c(x) \mid x \in M\}. \qquad (20)$
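For illustration, the recessivity criterion (11) and the distance function (19) are straightforward to express in code. The following NumPy sketch is ours, not part of the paper's software; the function names recessive and rho_c and the toy data are assumptions made for the example.

import numpy as np

def recessive(a_i, c):
    # Criterion (11): the half-space <a_i, x> <= b_i is recessive
    # with respect to c if and only if <a_i, c> > 0.
    return np.dot(a_i, c) > 0

def rho_c(x, c, z):
    # Distance (19) from a point x of the objective half-space to the
    # objective hyperplane H_c = {x : <c, x> = <c, z>}.
    return np.dot(c, z - x) / np.linalg.norm(c)

# Toy example in R^2: c = (1, 1), z = (10, 10).
c = np.array([1.0, 1.0])
z = np.array([10.0, 10.0])
print(recessive(np.array([1.0, 0.0]), c))   # True: x_1 <= b_1 is recessive
print(rho_c(np.array([1.0, 2.0]), c, z))    # 17 / sqrt(2), approx. 12.02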

Let the half-space $H_i^-$ be recessive with respect to the vector $c$. The objective projection $\gamma_i(x)$ of a point $x \in \mathbb{R}^n$ onto the recessive half-space $H_i^-$ is the point defined by the equation

$\gamma_i(x) = x - \sigma c, \qquad (21)$

where

$\sigma = \min\{\lambda \in \mathbb{R}_{\ge 0} \mid x - \lambda c \in H_i^-\}.$

Examples of objective projections in the space $\mathbb{R}^2$ are shown in Figure 1.

Figure 1: Objective projections in the space $\mathbb{R}^2$.

The following proposition provides an equation for calculating the objective projection onto a half-space that is recessive with respect to the vector $c$. Let the half-space $H_i^-$ defined by the inequality

$\langle a_i, x \rangle \le b_i \qquad (22)$

be recessive with respect to the vector $c$. Let

$x \notin H_i^-. \qquad (23)$

Then,

$\gamma_i(x) = x - \frac{\langle a_i, x \rangle - b_i}{\langle a_i, c \rangle}\, c. \qquad (24)$

According to the definition of the objective projection, we have $\gamma_i(x) = x - \sigma c$, where

$\sigma = \min\{\lambda \in \mathbb{R}_{\ge 0} \mid x - \lambda c \in H_i^-\}.$

Thus, we need to prove that

$\sigma = \frac{\langle a_i, x \rangle - b_i}{\langle a_i, c \rangle}. \qquad (25)$

Consider the straight line $L$ defined by the parametric equation

$L = \{x - \lambda c \mid \lambda \in \mathbb{R}\}.$

Let the point $x'$ be the intersection of the line $L$ with the hyperplane $H_i$:

$x' = L \cap H_i. \qquad (26)$

Then, $x'$ must satisfy the equation

$x' = x - \lambda' c \qquad (27)$

for some $\lambda' \in \mathbb{R}$. Substitute the right side of equation (27) into equation (6) instead of $x$:

$\langle a_i, x - \lambda' c \rangle = b_i.$

It follows that

$\lambda' = \frac{\langle a_i, x \rangle - b_i}{\langle a_i, c \rangle}. \qquad (28)$

Substituting the right side of equation (28) into equation (27) instead of $\lambda'$, we obtain

$x' = x - \frac{\langle a_i, x \rangle - b_i}{\langle a_i, c \rangle}\, c. \qquad (29)$

Since, according to (26), $x' \in H_i \subset H_i^-$, equation (25) will hold if

$\forall \lambda < \lambda':\ x - \lambda c \notin H_i^- \qquad (30)$

holds. Assume the opposite, that is, there exists $\lambda''$ such that

$\lambda'' < \lambda' \qquad (31)$

and

$x - \lambda'' c \in H_i^-. \qquad (32)$

Then, it follows from (22) and (32) that

$\langle a_i, x - \lambda'' c \rangle \le b_i.$

This is equivalent to

$\lambda'' \langle a_i, c \rangle \ge \langle a_i, x \rangle - b_i. \qquad (33)$

The recessivity criterion (11) implies that $\langle a_i, c \rangle > 0$. Hence, in view of (28), inequality (33) is equivalent to $\lambda'' \ge \lambda'$. Thus, we have a contradiction with (31).

Let $z \in \mathbb{R}^n$. The objective projection $\gamma_M(z)$ of the point $z$ onto the polytope $M$ is the point defined by the following equation:

$\gamma_M(z) = z - \sigma c, \qquad (34)$

where

$\sigma = \min\{\lambda \in \mathbb{R}_{\ge 0} \mid z - \lambda c \in M\}.$

If

$\{\lambda \in \mathbb{R}_{\ge 0} \mid z - \lambda c \in M\} = \emptyset,$

then we set $\gamma_M(z) = \tilde{\infty}$, where $\tilde{\infty}$ stands for a point that is infinitely far from the polytope $M$. Examples of objective projections onto the polytope $M$ in the space $\mathbb{R}^2$ are shown in Figure 2.

Figure 2: Objective projections onto the polytope $M$ in the space $\mathbb{R}^2$.
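A small sketch may help here. The code below computes the objective projection onto a single recessive half-space by formula (24) and onto the polytope M by scanning the constraints of system (4); it is our illustration under the stated assumptions (NumPy; None standing for the infinitely distant point), not the authors' implementation.

import numpy as np

def gamma_i(x, a_i, b_i, c):
    # Formula (24): objective projection of x onto the recessive
    # half-space <a_i, y> <= b_i (assumes <a_i, c> > 0 and x outside it).
    return x - ((np.dot(a_i, x) - b_i) / np.dot(a_i, c)) * c

def gamma_M(z, A, b, c, eps=1e-12):
    # Formula (34): gamma_M(z) = z - sigma*c, where sigma is the smallest
    # lam >= 0 such that z - lam*c lies in M = {x : Ax <= b}.
    # Returns None when no such lam exists (the "infinitely far" point).
    lo, hi = 0.0, np.inf
    for a_i, b_i in zip(A, b):
        s = np.dot(a_i, c)          # slope of the constraint along -c
        r = np.dot(a_i, z) - b_i    # violation of the constraint at z
        if s > eps:
            lo = max(lo, r / s)     # lam must be at least r/s
        elif s < -eps:
            hi = min(hi, r / s)     # lam must be at most r/s
        elif r > eps:
            return None             # violated for every lam
    return z - lo * c if lo <= hi + eps else None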

A receptive field $G(z, \eta, \delta)$ of density $\delta \in \mathbb{R}_{>0}$ with center $z \in \mathbb{R}^n$ and rank $\eta \in \mathbb{N}$ is a finite ordered set of points satisfying the following conditions:

$z \in G(z, \eta, \delta); \qquad (35)$
$\forall g \in G(z, \eta, \delta):\ \|z - g\| \le \eta \delta \sqrt{n-1}; \qquad (36)$
$\forall g, g' \in G(z, \eta, \delta):\ g \ne g' \Rightarrow \|g - g'\| \ge \delta; \qquad (37)$
$\forall g \in G(z, \eta, \delta)\ \exists g' \in G(z, \eta, \delta):\ \|g - g'\| = \delta; \qquad (38)$
$\forall x \in \mathrm{conv}(G(z, \eta, \delta))\ \exists g \in G(z, \eta, \delta):\ \|g - x\| \le \tfrac{1}{2}\delta\sqrt{n-1}. \qquad (39)$

The points of the receptive field will be called receptive points. Here, $\mathrm{conv}(X)$ stands for the convex hull of a finite point set $X = \{x^{(1)}, \dots, x^{(K)}\} \subset \mathbb{R}^n$:

$\mathrm{conv}(X) = \left\{\sum_{i=1}^{K} \lambda_i x^{(i)} \,\middle|\, \lambda_i \in \mathbb{R}_{\ge 0},\ \sum_{i=1}^{K} \lambda_i = 1\right\}.$

In this definition, condition (35) means that the center of the receptive field belongs to this field. Condition (36) implies that the distance from the central point $z$ to each point $g$ of the receptive field does not exceed $\eta\delta\sqrt{n-1}$. According to (37), for any two different points $g, g'$ of the receptive field, the distance between them cannot be less than $\delta$. Condition (38) says that for any point $g$ of the receptive field, there is a point $g'$ in this field such that the distance between $g$ and $g'$ is equal to $\delta$. Condition (39) implies that for any point $x$ belonging to the convex hull of the receptive field, there is a point $g$ in this field such that the distance between $x$ and $g$ does not exceed $\frac{1}{2}\delta\sqrt{n-1}$. An example of a receptive field is presented in Figure 3.

Figure 3: Example of a receptive field.

Let us describe a constructive method for building a receptive field. Without loss of generality, we assume that the center $z$ of the field lies in the objective hyperplane $H_c$. Consider a set of nonzero vectors $\{c, v^{(1)}, \dots, v^{(n-1)}\}$ forming an orthogonal basis in $\mathbb{R}^n$ (such a set can be obtained, for example, by the Gram-Schmidt orthogonalization process). In particular,

$\langle c, v^{(i)} \rangle = 0 \qquad (40)$

for all $i = 1, \dots, n-1$. The following proposition shows that the linear subspace of dimension $n-1$ generated by the orthogonal vectors $v^{(1)}, \dots, v^{(n-1)}$ is a hyperplane parallel to the hyperplane $H_c$. Define the following linear subspace $S_c$ of dimension $n-1$ in $\mathbb{R}^n$:

$S_c = \left\{\sum_{i=1}^{n-1} \lambda_i v^{(i)} \,\middle|\, \lambda_i \in \mathbb{R}\right\}. \qquad (41)$

Then,

$S_c + z = H_c. \qquad (42)$

Let $x \in S_c + z$, that is,

$x = \sum_{i=1}^{n-1} \lambda_i v^{(i)} + z.$

Then,

$\langle c, x - z \rangle = \sum_{i=1}^{n-1} \lambda_i \langle c, v^{(i)} \rangle.$

In view of (40), this implies $\langle c, x \rangle = \langle c, z \rangle$. Comparing this with (16), we obtain $x \in H_c$. Define the following set of vectors:

$e^{(i)} = \frac{v^{(i)}}{\|v^{(i)}\|},\ i = 1, \dots, n-1. \qquad (43)$

It is easy to see that the set $\{e^{(1)}, \dots, e^{(n-1)}\}$ is an orthonormal basis of the subspace $S_c$.
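One convenient way to obtain such an orthonormal basis in practice is a QR decomposition. The sketch below is our suggestion, not necessarily the construction used in the paper.

import numpy as np

def hyperplane_basis(c):
    # Returns an (n-1) x n matrix whose rows e^(1), ..., e^(n-1) form an
    # orthonormal basis of the subspace S_c orthogonal to c (cf. (43)).
    # QR applied to [c | I] yields an orthonormal set whose first vector
    # is collinear with c; the remaining columns span S_c.
    c = np.asarray(c, dtype=float)
    n = c.size
    Q, _ = np.linalg.qr(np.column_stack([c, np.eye(n)]))
    return Q[:, 1:n].T

# Sanity check: every basis vector is orthogonal to c.
E = hyperplane_basis([1.0, 2.0, 2.0])
print(np.round(E @ np.array([1.0, 2.0, 2.0]), 12))   # approx. [0, 0]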

The procedure for constructing a receptive field is presented as Algorithm 1. This algorithm constructs a receptive field $G(z, \eta, \delta)$ consisting of

$K = (2\eta + 1)^{n-1} \qquad (44)$

points. These points are arranged at the nodes of a regular lattice having the form of a hypersquare (a hypercube of dimension $n-1$) with the edge length equal to $2\eta\delta$. The edge length of the unit cell is $\delta$. According to Step 13 of Algorithm 1 and equality (42), this hypersquare lies in the hyperplane $H_c$ and has its center at the point $z$.

1:, ,
2:
3:for  do
4:     
5:     for  do
6:         
7:         …
8:         for  do
9:              
10:              
11:              for  do
12:                  
13:              end for
14:              
15:         end for
16:     end for
17:end for
Algorithm 1 Building receptive field

The drawback of Algorithm 1 is that the number of nested for loops depends on the dimension of the space. This issue can be solved using a function $G(k)$ that calculates a point of the receptive field by its ordinal number $k$ (numbering starts from zero; the order is determined by Algorithm 1). The implementation of this function is represented as Algorithm 2.

1:, ,
2:function G()
3:     for  do
4:         
5:         
6:     end for
7:     
8:     for  do
9:         
10:     end for
11:     
12:end function
Algorithm 2 Function G calculates a receptive point by its number k
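To make the idea concrete, the following Python sketch computes a receptive point from its ordinal number by decomposing the number in base 2η+1, one digit per basis direction of S_c. It mirrors the role of Algorithm 2 but is our reconstruction: the exact ordering of points in the authors' algorithm may differ.

import numpy as np

def receptive_point(k, z, eta, delta, E):
    # Receptive point g_k of the field G(z, eta, delta); E is the
    # (n-1) x n matrix returned by hyperplane_basis(c). The
    # K = (2*eta+1)**(n-1) points form a lattice with cell edge delta
    # centered at z inside the objective hyperplane H_c (cf. (44)).
    side = 2 * eta + 1
    g = np.asarray(z, dtype=float).copy()
    for e_i in E:                    # digits of k in base (2*eta + 1)
        digit = k % side
        k //= side
        g += (digit - eta) * delta * e_i
    return g

With this indexing, each of the n-1 base-(2η+1) digits of k selects one of the 2η+1 lattice offsets -ηδ, ..., ηδ along the corresponding basis vector, so no nesting of loops depending on n is required.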

The following proposition provides an estimation of the time complexity of Algorithm 2: Algorithm 2 allows an implementation that has time complexity

$c_G = O(n^2), \qquad (45)$

where $n$ is the space dimension. (Here, time complexity refers to the number of arithmetic and comparison operations required to execute the algorithm.) To see this, consider Algorithm 3, representing a low-level implementation of Algorithm 2.

1:; ; ;
2:
3:repeat
4:     
5:     
6:     
7:     
8:until 
9:
10:repeat
11:     
12:     
13:     repeat
14:         
15:         
16:     until 
17:     
18:until 
Algorithm 3 Low-level implementation of Algorithm 2

The values calculated in Steps 1-2 of Algorithm 3 do not depend on the receptive point number $k$ and therefore can be considered constants. In Steps 3-8, the repeat/until loop runs $O(n)$ times and requires $O(n)$ operations. In Steps 13-16, the nested repeat/until loop runs $O(n)$ times and requires $O(n)$ operations per pass of the external loop. In Steps 10-18, the external repeat/until loop runs $O(n)$ times and therefore requires $O(n^2)$ operations in total. Summing up, we obtain the estimation (45): the time complexity of Algorithm 2 is $O(n^2)$.

Let $z \in H_c$. Fix $\eta \in \mathbb{N}$, $\delta \in \mathbb{R}_{>0}$, and let $G(z, \eta, \delta) = \{g_0, \dots, g_{K-1}\}$. The image $\mathcal{I}(z, \eta, \delta)$ generated by the receptive field is an ordered set of real numbers defined by the equation

$\mathcal{I}(z, \eta, \delta) = \left(\rho_c(\gamma_M(g_k))\right)_{k=0}^{K-1}, \qquad (46)$

where we set $\rho_c(\tilde{\infty}) = +\infty$. The order of the real numbers in the image is determined by the order of the respective receptive points. The following Algorithm 4 implements the function building the image as a list of real numbers.

1:, ,
2:function ()
3:     
4:     for  do
5:         
6:         
7:     end for
8:end function
Algorithm 4 Building image

Here, $[\,]$ stands for the empty list, and $+\!\!+$ stands for the operation of list concatenation.
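Putting the pieces together, a direct sequential rendering of Algorithm 4 can look as follows. It reuses the sketches above (hyperplane_basis, receptive_point, gamma_M, rho_c) and is, again, our illustration rather than the paper's code.

def build_image(z, eta, delta, A, b, c):
    # Builds the image (46): the list of distances (19) from the objective
    # hyperplane to the objective projections of the receptive points onto M.
    # float('inf') plays the role of the infinitely far point.
    E = hyperplane_basis(c)
    K = (2 * eta + 1) ** (len(c) - 1)
    image = []
    for k in range(K):
        g = receptive_point(k, z, eta, delta, E)
        p = gamma_M(g, A, b, c)
        image.append(float('inf') if p is None else rho_c(p, c, z))
    return image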

Let $\langle a_i, c \rangle > 0$ for all $i = 1, \dots, m$. This means that every half-space $H_i^-$ is recessive with respect to the vector $c$ (see the recessivity criterion (11)). Let there be a point $z \in H_c$. Assume that we managed to create an artificial neural network DNN which receives the image $\mathcal{I}(z, \eta, \delta)$ as an input and outputs a point $u$ that is closer to the solution $\bar{x}$ of problem (20) than $z$. Then, we can build the following Algorithm 5, which solves the linear programming problem (20) using the DNN.

1:
2:
3:repeat
4:     
5:     
6:     
7:until 
8:
9:stop
Algorithm 5 Linear programming using DNN

Only an outline of the forthcoming algorithm is presented here; it needs further formalization, detailing, and refinement.

3 Parallel Algorithm for Building LP Problem Image

When solving LP problems of large dimension with a large number of constraints, Algorithm 4 for building the LP problem image can incur significant runtime overhead. This section presents a parallel version of Algorithm 4 that significantly reduces the runtime of building the image of a large-scale LP problem. The parallel implementation of Algorithm 4 is based on the BSF parallel computation model [28, 27]. The BSF model is intended for cluster computing systems, uses the master/worker paradigm, and requires the representation of the algorithm in the form of operations on lists using the higher-order functions Map and Reduce defined in the Bird-Meertens formalism [1]. The BSF model also provides a cost metric for the analytical evaluation of the scalability of a parallel algorithm that meets the specified requirements. Examples of the BSF model application can be found in [7, 31, 30, 32, 33].

Let us represent Algorithm 4 in the form of operations on lists using the higher-order functions Map and Reduce. As the second parameter of the higher-order function Map, we use the list of ordinal numbers of the inequalities of system (4):

$\mathcal{L}_{map} = [0, 1, \dots, m-1]. \qquad (47)$

Designate $g = G(k)$: the receptive point with the ordinal number $k$. We define the parameterized function

$\varphi_g : \{0, \dots, m-1\} \to \mathbb{R}_{\ge 0} \cup \{+\infty\},$

which is the first parameter of the higher-order function Map, as follows:

$\varphi_g(i) = \begin{cases} \rho_c(\gamma_i(g)), & \text{if } \langle a_i, c \rangle > 0 \text{ and } \gamma_i(g) \in M; \\ +\infty, & \text{otherwise}, \end{cases} \qquad (48)$

where $g = G(k)$ (see Algorithm 2), and $\gamma_i(g)$ is calculated by equation (24). Informally, the function $\varphi_g$ maps the ordinal number $i$ of a half-space to the distance from the objective projection $\gamma_i(g)$ to the objective hyperplane $H_c$ if $H_i^-$ is recessive with respect to $c$ (see the recessivity criterion (11)) and the objective projection belongs to $M$. Otherwise, $\varphi_g$ returns the special value $+\infty$.

The higher-order function Map transforms the list $\mathcal{L}_{map}$ into the list $\mathcal{B}$ by applying the function $\varphi_g$ to each element of the list $\mathcal{L}_{map}$:

$\mathrm{Map}(\varphi_g, \mathcal{L}_{map}) = [\varphi_g(0), \dots, \varphi_g(m-1)] = \mathcal{B}.$

Define the associative binary operation $\oplus$ as follows:

$\alpha \oplus \beta = \min(\alpha, \beta).$

Informally, the operation $\oplus$ calculates the minimum of two numbers.

The higher-order function Reduce folds the list $\mathcal{B}$ to the single value $\rho$ by sequentially applying the operation $\oplus$ to the entire list:

$\mathrm{Reduce}(\oplus, \mathcal{B}) = \varphi_g(0) \oplus \varphi_g(1) \oplus \dots \oplus \varphi_g(m-1) = \rho.$
Algorithm 6 builds the image $\mathcal{I}$ of the LP problem using the higher-order functions Map and Reduce.

1: $\eta$, $\delta$, $K$
2: input $n$, $A$, $b$, $c$, $z$
3: $\mathcal{I} := [\,]$
4: $\mathcal{L}_{map} := [0, \dots, m-1]$
5: for $k = 0, \dots, K-1$ do
6:      $g := G(k)$
7:      $\mathcal{B} := \mathrm{Map}(\varphi_g, \mathcal{L}_{map})$
8:      $\mathcal{I} := \mathcal{I} +\!\!+ [\mathrm{Reduce}(\oplus, \mathcal{B})]$
9: end for
10: output $\mathcal{I}$
11: stop
Algorithm 6 Building the image $\mathcal{I}$ by Map and Reduce

The parallel version of Algorithm 6 is based on algorithmic template 2 in [28]. The result is presented as Algorithm 7.

Master

1: input $n$
2: $\mathcal{I} := [\,]$
3: $k := 0$
4: repeat
5:      SendToWorkers($k$)
6:
7:
8:      RecvFromWorkers($\rho_0, \dots, \rho_{L-1}$)
9:      $\rho := \mathrm{Reduce}(\oplus, [\rho_0, \dots, \rho_{L-1}])$
10:     $\mathcal{I} := \mathcal{I} +\!\!+ [\rho]$
11:     $k := k + 1$
12:     $exit := (k = K)$
13:     SendToWorkers($exit$)
14: until $exit$
15: output $\mathcal{I}$
16: stop

$l$th Worker ($l = 0, \dots, L-1$)

1: input $n$, $A$, $b$, $c$
2:
3: $\mathcal{L}_{map}^{(l)} := [l m / L, \dots, (l+1) m / L - 1]$
4: repeat
5:      RecvFromMaster($k$)
6:      $\mathcal{B}^{(l)} := \mathrm{Map}(\varphi_{G(k)}, \mathcal{L}_{map}^{(l)})$
7:      $\rho_l := \mathrm{Reduce}(\oplus, \mathcal{B}^{(l)})$
8:      SendToMaster($\rho_l$)
9:
10:
11:
12:
13:     RecvFromMaster($exit$)
14: until $exit$
15:
16: stop
Algorithm 7 Parallel algorithm of building the image $\mathcal{I}$

Let us explain the steps of Algorithm 7. For simplicity, we assume that the number of constraints $m$ is a multiple of the number of workers $L$. We also assume that the numbering of the inequalities starts from zero. The parallel algorithm includes $L+1$ processes: one master process and $L$ worker processes. The master manages the computations. In Step 1, the master reads the space dimension $n$. In Step 2 of the master, the image variable $\mathcal{I}$ is initialized to the empty list. Step 3 of the master assigns zero to the iteration counter $k$. In Steps 4-14, the master organizes the repeat/until loop, in which the image $\mathcal{I}$ of the LP problem is built. In Step 5, the master sends the receptive point number $k$ to all workers. In Step 8, the master waits for the partial results from all the workers. These partial results are folded to a single value, which is appended to the image (Steps 9-10 of the master). Step 11 of the master increases the iteration counter $k$ by 1. Step 12 of the master assigns the value of the condition $k = K$ to the Boolean variable $exit$. In Step 13, the master sends the value of the Boolean variable $exit$ to all workers. According to (44), $exit = false$ means that not all the points of the receptive field have been processed yet. In this case, the control is passed to the next iteration of the repeat/until loop (Step 14 of the master). After exiting the repeat/until loop, the master outputs the constructed image (Step 15) and terminates its work (Step 16).

All the workers execute the same program code but with different data. In Step 3, the $l$th worker defines its own sublist $\mathcal{L}_{map}^{(l)}$. In Step 4, the worker enters the repeat/until loop. In Step 5, it receives the number $k$ of the next receptive point. In Step 6, the worker processes its sublist $\mathcal{L}_{map}^{(l)}$ using the higher-order function Map, which applies the parameterized function $\varphi_g$, defined by (48), to each element of the sublist. The result is the sublist $\mathcal{B}^{(l)}$, which includes the distances from the objective hyperplane $H_c$ to the objective projections of the receptive point $g = G(k)$ onto the hyperplanes $H_i$ for all $i$ from the sublist $\mathcal{L}_{map}^{(l)}$. In Step 7, the worker uses the higher-order function Reduce to fold the sublist $\mathcal{B}^{(l)}$ to the single value $\rho_l$, using the associative binary operation $\oplus$, which calculates the minimum. The computed partial result is sent to the master (Step 8 of the worker). In Step 13, the worker waits for the master to send the value of the Boolean variable $exit$. If the received value is false, the worker continues executing the repeat/until loop (Step 14 of the worker). Otherwise, the worker process terminates in Step 16.
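For readers who want to experiment, here is a schematic mpi4py rendering of Algorithm 7 on a toy problem. It is a simplification we made for illustration (the exit-flag exchange is replaced by a fixed loop over all K receptive points, and the data are hard-coded); it assumes the functions hyperplane_basis, receptive_point, and phi from the previous sketches and is not the authors' BSF-skeleton implementation.

import numpy as np
from functools import reduce
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
L = comm.Get_size() - 1                 # number of workers; rank 0 is master

# Toy problem: M = [0, 10]^2, c = (1, 1), z on the objective hyperplane.
A = np.array([[1., 0.], [0., 1.], [-1., 0.], [0., -1.]])
b = np.array([10., 10., 0., 0.])
c = np.array([1., 1.])
z = np.array([12., 12.])
eta, delta = 2, 1.0
m = len(b)
K = (2 * eta + 1) ** (len(c) - 1)

if rank == 0:                           # master (cf. Steps 1-16)
    image = []
    for k in range(K):
        comm.bcast(k, root=0)                       # Step 5: send k
        parts = comm.gather(None, root=0)           # Step 8: collect rho_l
        image.append(min(p for p in parts if p is not None))
    print(image)                                    # Step 15: output
else:                                   # l-th worker, l = rank - 1
    lo, hi = (rank - 1) * m // L, rank * m // L     # Step 3: own sublist
    E = hyperplane_basis(c)
    for _ in range(K):
        k = comm.bcast(None, root=0)                # Step 5: receive k
        g = receptive_point(k, z, eta, delta, E)
        rho = reduce(min,                           # Steps 6-7: Map + Reduce
                     (phi(i, g, A, b, c, z) for i in range(lo, hi)),
                     float('inf'))
        comm.gather(rho, root=0)                    # Step 8: send result

Run, for example, with mpiexec -n 3 python image_mpi.py (one master and two workers); the image of this toy problem has K = 5 elements.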

Let us obtain an analytical estimation of the scalability bound of the parallel Algorithm 7 using the cost metric of the BSF parallel computation model [28]. Here, the scalability bound means the number of workers at which the maximum speedup is achieved. The cost metric of the BSF model includes the following cost parameters for the repeat/until loop (Steps 4-14) of the parallel Algorithm 7:

$m$ : length of the list $\mathcal{L}_{map}$;
$D$ : latency (the time taken by the master to send a one-byte message to a single worker);
$t_s$ : the time taken by the master to send the receptive point number to a single worker and receive the computed value from it (including latency);
$t_{Map}$ : the time taken by a single worker to process the higher-order function Map for the entire list $\mathcal{L}_{map}$;
$t_a$ : the time taken by computing the binary operation $\oplus$.

According to equation (14) from [28], the scalability bound of Algorithm 7 can be estimated as follows:

(49)

Let us calculate estimations for the time parameters in equation (49). To do this, we introduce the following notation for a single iteration of the repeat/until loop (Steps 4-14) of Algorithm 7:

$d$ : the quantity of numbers sent from the master to a worker and back within one iteration;
$c_s$ : the quantity of arithmetic and comparison operations executed in Steps 6-7 of the sequential Algorithm 6;
$c_a$ : the quantity of arithmetic and comparison operations required to compute the binary operation $\oplus$.

At the beginning of every iteration, the master sends each worker the receptive point number $k$. In response, the worker sends the distance from the receptive point $g_k$ to its objective projection. Therefore,

$d = 2. \qquad (50)$

In the context of Algorithm 6,

$c_s = c_G + m\, c_\varphi, \qquad (51)$

where $c_G$ is the number of operations taken to compute the coordinates of the point $g_k = G(k)$, and $c_\varphi$ is the number of operations required to calculate the value of $\varphi_{g_k}(i)$, assuming that the coordinates of the point $g_k$ have already been calculated. The estimation $c_G = O(n^2)$ is provided by (45). Let us estimate $c_\varphi$. According to (24), calculating the objective projection $\gamma_i(g_k)$ takes $O(n)$ arithmetic operations. It follows from (19) that the calculation of $\rho_c(\gamma_i(g_k))$ also takes $O(n)$ arithmetic operations. Inequalities (4) imply that checking the condition $\gamma_i(g_k) \in M$ takes $O(mn)$ arithmetic operations and $O(m)$ comparison operations. Hence, calculating $\varphi_{g_k}(i)$ takes a total of

$c_\varphi = O(mn) \qquad (52)$

operations. Substituting the right-hand sides of equations (45) and (52) into (51), we obtain

$c_s = O(n^2 + m^2 n). \qquad (53)$

To perform the binary operation $\oplus$, one comparison operation must be executed:

$c_a = 1. \qquad (54)$

Let $\tau_{op}$ stand for the average execution time of arithmetic and comparison operations, and let $\tau_{tr}$ stand for the average time of sending a single real number (excluding the latency). Then, using equations (50), (53), and (54), we obtain

$t_s = 2(\tau_{tr} + D); \qquad (55)$
$t_{Map} = O\!\left((n^2 + m^2 n)\,\tau_{op}\right). \qquad (56)$