1 Introduction
Combinatorial optimization seeks one or more optimal solutions in a well-defined discrete problem space. In practice, this means finding efficient allocations of limited resources to achieve desired goals when all the variables take integer values. Because workers, planes or boats (like many other resources) are indivisible, Combinatorial Optimization Problems (COPs) receive intense attention from the scientific community today.
Current real-life COPs are difficult in many ways: the solution space is huge, the parameters are linked, the decomposability is not obvious, the restrictions are hard to test, the local optima are many and hard to locate, and the uncertainty and dynamism of the environment must be taken into account. These characteristics, among others, make algorithm design and implementation challenging tasks. The quest for ever more efficient solving methods is driven by the growing complexity of our world.
The Matrix Bandwidth Minimization Problem (MBMP) is a fundamental mathematical problem: it searches for a simultaneous permutation of the rows and columns of a square matrix that keeps its nonzero entries as close as possible to the main diagonal. This problem is NP-complete in general Papadimitriou (1976), and it remains so even in restricted solution spaces Garey (1978), which is why any attempt to improve its solutions is beneficial.
The main contribution of this paper is to emphasize the effectiveness of soft computing methods for solving the Matrix Bandwidth Minimization Problem. Genetic algorithms and ant-based systems are the natural computing methods used in this paper to solve MBMP instances. Computational experiments confirm that these methods provide robust and low-cost solutions for the MBMP. We also introduce a new theoretical reinforcement learning model for solving the MBMP. So far, such a learning model has not been reported in the MBMP literature.
The rest of the paper is organized as follows. Section 2 briefly presents the matrix bandwidth minimization problem, emphasizing its relevance and presenting existing approaches for solving it. The fundamentals of the soft computing approaches considered in this paper, i.e., genetic algorithms, ant colony systems and reinforcement learning, are given in Section 3. In Section 4 we propose two natural computing methods for solving MBMP instances, namely genetic algorithms and ant colony systems. A theoretical reinforcement learning model for solving the MBMP is introduced in Section 5. Section 6 provides an experimental evaluation of the proposed methods, and Section 7 contains the conclusions of the paper and directions for future work.
2 The Matrix Bandwidth Minimization Problem
This section introduces the Matrix Bandwidth Minimization Problem and reviews the related literature.
2.1 Matrix Bandwidth Minimization Problem description
Given a square positive symmetric matrix $A = (a_{ij})$ of order $n$, the bandwidth is the value $\beta = \max_{a_{ij} \neq 0} |i - j|$. The Matrix Bandwidth Minimization Problem searches for a row (and column) permutation that minimizes the bandwidth of the new matrix.
An equivalent form of the MBMP uses the graph-theory approach, based on the layout notion. Given an undirected, connected graph $G = (V, E)$ with $n = |V|$, a layout is a bijection $f$ between $V$ and $\{1, \dots, n\}$. The bandwidth of $G$ is $\min_{f} \max_{\{u,v\} \in E} |f(u) - f(v)|$. Intuitively, computing the bandwidth of a graph means finding a linear ordering of its vertices that minimizes the maximum distance between two adjacent vertices.
Starting from the given matrix $A$, an equivalent graph $G_A$ can be defined, and the MBMP can be viewed as the problem of minimizing the bandwidth of $G_A$. In this graph, the set of vertices is $\{1, \dots, n\}$ and two vertices $i$ and $j$ are connected through an edge iff $a_{ij} \neq 0$, i.e., iff $a_{ji} \neq 0$.
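The definitions above can be illustrated with a minimal Python sketch. The function names (`bandwidth`, `permute`, `adjacency`) and the 0-based indexing are assumptions of this example, not notation from the paper, and the small matrix is made up for illustration.

```python
from typing import Dict, List, Sequence


def bandwidth(matrix: Sequence[Sequence[float]]) -> int:
    """Largest |i - j| over all nonzero entries (0 if there are none)."""
    return max(
        (abs(i - j)
         for i, row in enumerate(matrix)
         for j, value in enumerate(row)
         if value != 0),
        default=0,
    )


def permute(matrix: Sequence[Sequence[float]], perm: Sequence[int]) -> List[List[float]]:
    """Apply one permutation to rows and columns: B[i][j] = A[perm[i]][perm[j]]."""
    return [[matrix[p][q] for q in perm] for p in perm]


def adjacency(matrix: Sequence[Sequence[float]]) -> Dict[int, List[int]]:
    """Graph view: distinct vertices i and j are adjacent iff A[i][j] != 0."""
    n = len(matrix)
    return {i: [j for j in range(n) if j != i and matrix[i][j] != 0]
            for i in range(n)}


# Hypothetical 4x4 symmetric matrix (0-based indices).
A = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
B = permute(A, [0, 3, 1, 2])  # bring vertices 0 and 3 next to each other
```

In this example the only off-diagonal nonzeros of `A` are at positions (0, 3) and (1, 2), so reordering the vertices as 0, 3, 1, 2 moves every nonzero next to the diagonal.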
The current exact approaches devise algorithms that solve the general MBMP in running time Cygan (2009). Classic results for approximation approaches establish an approximation factor of for general MBMP Feige (1998) and for trees Gupta (2000).
The MBMP arose from solving systems of linear equations: the ordering of the system matrix has a great impact on the resources needed to actually solve the system, and a good ordering may lead to a substantial efficiency increase. Minimizing the bandwidth of a matrix improves the efficiency of certain linear algebra algorithms, such as Gaussian elimination.
Current applications of the MBMP in computer science include VLSI design, network survivability and data storage. Other applications arise in the electromagnetic industry Esposito (1998), large-scale power transmission systems, chemical kinetics and numerical geophysics Marti (2001), and information retrieval in hypertext Berry (1996).
Some generalizations of the MBMP are currently under investigation. For example, the two-dimensional bandwidth problem is to embed a graph into a planar grid such that the maximum distance between adjacent vertices is as small as possible Lin (2010).
2.2 Literature review.
The importance of the bandwidth minimization problem is also reflected by the large number of publications describing algorithms for solving it. Cuthill and McKee proposed in Cuthill (1969) the first stable heuristic method for the MBMP: the CM algorithm, based on Breadth-First Search. Marti et al. used Tabu Search for solving the MBMP in Marti (2001). They used a candidate list strategy to accelerate the selection of moves in the neighborhood of the current solution.
A GRASP with Path Relinking method given by Pinana et al. in Pinana (2004) has been shown to achieve better results than the Tabu Search procedure, but with longer running times. Lim et al. proposed in Lim (2003) a Genetic Algorithm integrated with Hill Climbing to solve the bandwidth minimization problem.
A simulated annealing algorithm for the matrix bandwidth minimization problem is presented in RodriguezTello (2008). The algorithm proposed by Rodriguez-Tello et al. is based on three distinguishing features: an original internal representation of solutions, a highly discriminating evaluation function and an effective neighborhood. More recently, the Ant Colony Optimization metaheuristic has been used in Lim (2006) and Pintea (2010) in order to solve the MBMP.
3 Background
In this section we briefly review the fundamentals of the soft computing approaches used in this paper for solving the MBMP, i.e., genetic algorithms, ant colony optimization and reinforcement learning.
Soft computing is the collection of computing branches that cope with imprecision, uncertainty, partial truth and approximation, which are manifested in nature and handled naturally (and gracefully) by biological entities (cells, organisms, or collections of individuals). The goal of soft computing approaches is to achieve tractable, robust and low-cost solutions when facing real-life, complex, high-dimensional problems.
Genetic algorithms (GAs), invented by John Holland in the 1960s, are the most widely used approaches to computational evolution. Genetic algorithms provide an approach to machine learning Mitchell (1998) motivated by analogy to biological evolution. Hypotheses are often described by bit strings whose interpretation depends on the application, though hypotheses may also be described by symbolic expressions or even computer programs Goldberg (1989).
Ant Colony Optimization (ACO) studies artificial systems that are inspired by the behavior of real ant colonies and are used to solve COPs Dorigo (2004). ACO methods use a set of cooperative artificial ants, each constructing a solution based on the expected quality of the available moves and on the good solutions found by the community. ACO has demonstrated high flexibility and strength, solving with very good results both academic instances of many COPs and real-life problems. To improve efficiency, ant-based algorithms are designed using problem-specific information and involve local search methods.
The goal of building systems that can adapt to their environments and learn from their experiences has attracted researchers from many fields, including computer science, mathematics and cognitive sciences Sutton (1998).
Reinforcement learning (RL) is learning what to do (how to map situations to actions) so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the highest reward by trying them. In RL, the computer is simply given a goal to achieve. The computer then learns how to achieve that goal by trial-and-error interactions with its environment.
4 Natural computing models for the MBMP
In this section we propose two natural computing methods for solving MBMP instances: genetic algorithms and ant colony systems. The computational experiments from Section 6 confirm that these methods provide robust and low-cost solutions for the MBMP.
4.1 Genetic Algorithm.
In the following, a hybrid genetic algorithm (HGA) is proposed for solving the MBMP. The algorithm proposed in this section is a slight modification of the Genetic Algorithm integrated with Hill Climbing proposed by Lim et al. in Lim (2003).
Let us consider that $A$ is the square symmetric matrix of order $n$ whose bandwidth has to be minimized.
In the HGA we use, a chromosome is an $n$-dimensional sequence representing a permutation of $\{1, \dots, n\}$.
Thus, a matrix $A_c$ can be associated to a chromosome $c$, i.e., the matrix obtained from $A$ by permuting its rows (and columns) in the order given by the permutation $c$. The fitness function associated to a chromosome $c$ is defined as the bandwidth of the corresponding matrix $A_c$. The problem consists of minimizing the fitness function, i.e., finding the individual with the minimum associated fitness value.
We have used the traditional structure of a genetic algorithm, adding a Hill Climbing step in order to quickly tune solutions towards a local optimum Lim (2003). The HGA operates as follows:
i. At the beginning, an initial group of chromosomes is constructed, as detailed further below.
ii. Then, middle-point crossover and a swap mutation are performed on this group of chromosomes to generate new chromosomes Lim (2003). Hill Climbing is then applied to each newly generated chromosome, as proposed in Lim (2003). As the number of individuals within a population remains fixed, only the fittest chromosomes remain in the next generation. After the new generation is formed, a swap mutation is applied to all chromosomes within the new generation, except the best one. Then, Hill Climbing is applied again to each newly generated chromosome.
iii. Step ii. is repeated for a given number of generations; the algorithm then stops and the best result is reported as the solution.
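The operators named in the steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not spell out how middle-point crossover keeps the child a valid permutation, so the repair used here (keep the first half of one parent, then append the missing genes in the other parent's order) is one common choice and not necessarily the one of Lim (2003). All function names are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def middle_point_crossover(parent_a: Sequence[int], parent_b: Sequence[int]) -> List[int]:
    """First half from parent_a, remaining genes in the order they appear in parent_b."""
    cut = len(parent_a) // 2
    head = list(parent_a[:cut])
    used = set(head)
    return head + [gene for gene in parent_b if gene not in used]


def swap_mutation(chromosome: Sequence[int], rng: random.Random) -> List[int]:
    """Exchange two distinct, randomly chosen positions."""
    child = list(chromosome)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


def select_fittest(population: Sequence[List[int]],
                   fitness: Callable[[List[int]], int],
                   size: int) -> List[List[int]]:
    """Keep the `size` chromosomes with the smallest fitness (bandwidth is minimized)."""
    return sorted(population, key=fitness)[:size]


rng = random.Random(0)
child = middle_point_crossover([0, 1, 2, 3], [3, 2, 1, 0])
mutant = swap_mutation(child, rng)
```

Both operators return a new list, so the parents stay available for the selection step; the Hill Climbing step is deliberately left out because its details are not given in the text.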
The initial population for the HGA is constructed as follows. Starting from the matrix whose bandwidth has to be minimized, we construct the corresponding graph. Then, the initial chromosomes are built by performing BFS on the graph, starting from each node. This way, one initial individual is constructed for each node. Applying Hill Climbing Lim (2003) to the obtained individuals yields the initial population for the HGA. This construction of the initial population is slightly different from the method used in Lim (2003).
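The BFS-based construction described above might look like the following Python sketch. The tie-breaking rule (neighbors visited in increasing index order) and the handling of unreachable vertices are assumptions of this example; the text does not specify them, and the Hill Climbing refinement is omitted.

```python
from collections import deque
from typing import Dict, List


def bfs_order(adj: Dict[int, List[int]], start: int) -> List[int]:
    """Vertices in the order a breadth-first search from `start` discovers them."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):  # assumption: ties broken by vertex index
            if v not in seen:
                seen.add(v)
                order.append(v)
                queue.append(v)
    # Assumption: vertices unreachable from `start` are appended at the end.
    order.extend(v for v in range(len(adj)) if v not in seen)
    return order


def initial_orderings(adj: Dict[int, List[int]]) -> List[List[int]]:
    """One candidate chromosome per start vertex."""
    return [bfs_order(adj, start) for start in range(len(adj))]


# Hypothetical path graph 0 - 2 - 1 - 3 (0-based vertex labels).
path = {0: [2], 1: [2, 3], 2: [0, 1], 3: [1]}
population = initial_orderings(path)
```

Each ordering is read as a permutation of the vertices, so a graph with n vertices yields n starting chromosomes.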
As further work we will investigate the appropriateness of replacing the Hill Climbing step from the HGA with other local search mechanisms, such as PSwap or MPSwap procedures that will be described in Subsection 4.2.
4.2 Antbased system.
A hybridized ACO approach using a local search procedure is proposed in this section for solving the MBMP. This local search method is designed to reduce the bandwidth of the current solution and is executed during the local search stage of the ACO framework.
In Pintea (2010), the Ant Colony System (ACS) Dorigo (2004) is hybridized with a local search mechanism. The ACS model is based on the level structure used by the Cuthill-McKee algorithm Cuthill (1969). The local search procedure aims at improving ACS solutions by reducing the maximal bandwidth. The integration of a local search phase within the proposed ACS approach to the MBMP facilitates the refinement of the ants' solutions.
The main stages of the proposed hybrid ACS are as follows.
i. First, the current matrix bandwidth is computed, the pheromone trails are initialized and the parameter values are established.
ii. The construction stage consists of executing the following steps for a given number of iterations. At first, all the ants are placed in the node from the first level; then the local search mechanism is applied. Each ant builds a feasible solution by repeatedly making pseudo-random choices from the available neighbors. While constructing its solution, an ant also modifies the amount of pheromone on the visited edges by applying the local updating rule Dorigo (2004). After each partial solution is built, the local search mechanism is applied in order to improve each ant's solution. Finally, once all ants have finished their tour, the amount of pheromone on the edges is modified again by applying the global updating rule Dorigo (2004).
iii. The best current solution is reported.
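The pseudo-random choice and the two pheromone updating rules mentioned in the stages above follow the general Ant Colony System scheme. The sketch below shows that generic scheme in Python; the parameter names (`q0`, `beta`, `rho`, `alpha`, `tau0`), the dictionary-based pheromone table and the use of the solution cost in the global update are assumptions of this example, not details taken from the hybrid model.

```python
import random
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]


def choose_next(current: int, candidates: Sequence[int],
                pheromone: Dict[Edge, float], heuristic: Dict[Edge, float],
                q0: float, beta: float, rng: random.Random) -> int:
    """ACS pseudo-random proportional rule (scores are assumed to be positive)."""
    scores = [pheromone[(current, c)] * heuristic[(current, c)] ** beta
              for c in candidates]
    if rng.random() < q0:  # exploitation: take the best-scored move
        return candidates[scores.index(max(scores))]
    return rng.choices(list(candidates), weights=scores, k=1)[0]  # biased exploration


def local_update(pheromone: Dict[Edge, float], edge: Edge,
                 rho: float, tau0: float) -> None:
    """Applied by an ant right after it crosses `edge`."""
    pheromone[edge] = (1 - rho) * pheromone[edge] + rho * tau0


def global_update(pheromone: Dict[Edge, float], best_edges: List[Edge],
                  alpha: float, best_cost: float) -> None:
    """Applied once per iteration to the edges of the best solution found."""
    for edge in best_edges:
        pheromone[edge] = (1 - alpha) * pheromone[edge] + alpha / best_cost


pheromone = {(0, 1): 1.0, (0, 2): 2.0}
heuristic = {(0, 1): 1.0, (0, 2): 1.0}
greedy_pick = choose_next(0, [1, 2], pheromone, heuristic,
                          q0=1.0, beta=2.0, rng=random.Random(0))
local_update(pheromone, (0, 2), rho=0.1, tau0=1.0)
global_update(pheromone, [(0, 1)], alpha=0.5, best_cost=2.0)
```

With `q0` close to 1 the ants mostly exploit the current pheromone and heuristic information; smaller values leave more room for exploration.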
As illustrated above, the local search procedure is used twice within the proposed hybrid model: at the beginning of each iteration and after each partial solution is built, in order to improve each ant’s solution.
In Pintea (2010) two local search mechanisms are introduced: PSwap and MPSwap. The corresponding hybrid algorithms are denoted hACS and hMACS, respectively.
PSwap first finds the maximum and minimum degrees. Then, for every index with the maximum degree, it randomly selects an unvisited node with a minimum degree and swaps the two nodes.
MPSwap, used in hMACS, was introduced in order to avoid stagnation. It first finds the maximum and minimum degrees. Then, for every index with the maximum degree, it randomly selects an unvisited node with a minimum degree such that the matrix bandwidth decreases, and swaps the two nodes.
The experimental results reported in Pintea (2010) show that the MPSwap procedure performs better on small instances, while PSwap is better on larger ones.
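A rough Python sketch of the two swap procedures is given below. The description above leaves several details open, so this example makes explicit assumptions: "swapping two nodes" is read as exchanging their positions in the current ordering, "unvisited" is read as "not yet used as a swap partner in this pass", and a single flag `must_improve` switches between the PSwap-like and the MPSwap-like behavior. All names are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def ordering_bandwidth(order: Sequence[int], adj: Dict[int, List[int]]) -> int:
    """Maximum position distance between adjacent vertices under `order`."""
    position = {v: k for k, v in enumerate(order)}
    return max((abs(position[u] - position[v]) for u in adj for v in adj[u]),
               default=0)


def swapped(order: Sequence[int], x: int, y: int) -> List[int]:
    """Copy of `order` with the positions of vertices x and y exchanged."""
    out = list(order)
    i, j = out.index(x), out.index(y)
    out[i], out[j] = out[j], out[i]
    return out


def degree_swap(order: Sequence[int], adj: Dict[int, List[int]],
                rng: random.Random, must_improve: bool) -> List[int]:
    """must_improve=False: PSwap-like pass; must_improve=True: MPSwap-like pass."""
    order = list(order)
    degree = {v: len(adj[v]) for v in order}
    max_deg, min_deg = max(degree.values()), min(degree.values())
    unvisited = [v for v in order if degree[v] == min_deg]
    for x in [v for v in order if degree[v] == max_deg]:
        candidates = [y for y in unvisited if y != x]
        if must_improve:
            current = ordering_bandwidth(order, adj)
            # Keep only partners whose swap strictly decreases the bandwidth.
            candidates = [y for y in candidates
                          if ordering_bandwidth(swapped(order, x, y), adj) < current]
        if candidates:
            y = rng.choice(candidates)
            unvisited.remove(y)
            order = swapped(order, x, y)
    return order


# Hypothetical star graph: vertex 0 is adjacent to every other vertex.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
start = [0, 1, 2, 3, 4]
pswap_like = degree_swap(start, star, random.Random(1), must_improve=False)
mpswap_like = degree_swap(start, star, random.Random(1), must_improve=True)
```

On the star graph the initial ordering places the hub at one end, so any improving swap moves it towards the middle of the ordering.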
5 A theoretical reinforcement learning model for solving MBMP
In this section we investigate a reinforcement learning approach for solving the MBMP and introduce our RL model.
Let us assume, in the following, that $A$ is the symmetric matrix of order $n$ whose bandwidth has to be minimized.
5.1 Problem definition.
We define the RL problem associated to MBMP as in Czibula (2010):
The environment consists of the set of states extended with a state that is connected to all other states, i.e. .
The initial state of the agent in the environment is .
A state reached by the agent at a given moment after it has visited states is a terminal (final) state if the number of states visited by the agent in the current sequence is , i.e. .
The transition function between the states is defined as , where . This means that, at a given moment, from the state i the agent can move to any state from , excepting the initial state. We say that a state j that is accessible from state i () is the neighbor (successor) state of i.
The transitions between the states are equiprobable, the transition probability between a state i and each neighbor state j of i is .
The RL problem consists in training the MBMP agent to find a path from the initial to a final state, i.e a permutation of that minimizes the corresponding matrix bandwidth Czibula (2010). Let us consider that, at a given moment, the agent has visited states , where , , ; and , . Starting from the path , we construct a permutation of , denoted by . An element () is computed as follows:

If , then .

If and , then
. 
If and , then
, where , and and such that , , .
Based on the definition of given above, it can be proved that is a permutation of . Now, a matrix can be obtained from the initial matrix by permuting its rows (and columns) in the order given by permutation .
Consequently, a path of the agent in the environment corresponds to the matrix obtained as we have described above.
5.2 Reinforcement function.
As we aim at obtaining a permutation of that minimizes the matrix bandwidth, we define the reinforcement function as indicated in Equation (1). We mention that an alternative method to define the reinforcement function was considered in Czibula (2010).

The reward received in state after states were visited, denoted by , is computed as the bandwidth of matrix minus the bandwidth of matrix .
(1) 
Considering the reward defined in Equation (1), as the learning goal is to maximize the total amount of rewards received on a path from the initial to a final state, it can be easily proved that the agent is trained to find a permutation of that minimizes the bandwidth of the corresponding matrix .
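A short worked derivation can make this claim concrete. The notation below is an assumption of this sketch rather than the paper's own: $A_k$ stands for the matrix associated with the agent's path after $k$ states have been visited, $\beta(\cdot)$ for the bandwidth, and the step reward is read as $r_k = \beta(A_{k-1}) - \beta(A_k)$, which is one plausible reading of the verbal definition of Equation (1).

```latex
\sum_{k=1}^{n} r_k
  = \sum_{k=1}^{n} \bigl( \beta(A_{k-1}) - \beta(A_k) \bigr)
  = \beta(A_0) - \beta(A_n)
```

Under this reading the sum telescopes: since $\beta(A_0)$ does not depend on the agent's choices, maximizing the undiscounted total reward is the same as minimizing the bandwidth $\beta(A_n)$ of the final matrix.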
5.3 The learning process.
During the training step of the learning process, the agent will determine its optimal policy in the environment, i.e., the policy that maximizes the sum of the received rewards. During the training process, the estimates of the states' utilities converge to their exact values; thus, at the end of the training process, the estimates will be in the vicinity of the exact values.
It has been proved that an RL algorithm such as SARSA Sutton (1998) converges with probability 1 to a utility function as long as all state-action pairs are visited an infinite number of times and the policy converges in the limit to the greedy policy.
Consequently, after the training step of the agent has been completed, the solution learned by the agent will be constructed by starting from the initial state and following the policy until a solution is reached. From given state , using the policy, the agent moves to an unvisited neighbor of having the maximum utility value. The solution of the MBMP reported by the RL agent is a permutation of such that , being the utility function learned by the agent during its training. Considering the general goal of a RL agent, it can be proved that the permutation of learned by the agent converges to the permutation that corresponds to the matrix with the minimum bandwidth.
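For readers unfamiliar with SARSA, the sketch below shows its standard tabular update together with an epsilon-greedy action choice, written in Python. It is a generic illustration, not the training algorithm of the proposed model (which the paper leaves as further work); the learning rate `alpha`, the discount factor `gamma`, the exploration rate `epsilon` and the toy states and actions are assumptions of this example.

```python
import random
from collections import defaultdict
from typing import DefaultDict, Hashable, Sequence, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def epsilon_greedy(q: QTable, state: Hashable, actions: Sequence[Hashable],
                   epsilon: float, rng: random.Random) -> Hashable:
    """Explore with probability epsilon, otherwise take the highest-valued action."""
    if rng.random() < epsilon:
        return rng.choice(list(actions))
    return max(actions, key=lambda a: q[(state, a)])


def sarsa_update(q: QTable, state: Hashable, action: Hashable, reward: float,
                 next_state: Hashable, next_action: Hashable,
                 alpha: float, gamma: float) -> None:
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])


q: QTable = defaultdict(float)
q[(1, "b")] = 2.0
sarsa_update(q, 0, "a", reward=1.0, next_state=1, next_action="b",
             alpha=0.5, gamma=0.9)
greedy_action = epsilon_greedy(q, 1, ["a", "b"], epsilon=0.0, rng=random.Random(0))
```

Letting `epsilon` decay towards zero during training is one way to satisfy the requirement that the policy converges in the limit to the greedy policy.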
6 Computational experiments
This section presents a comparative evaluation of the techniques proposed in Section 4 for solving the MBMP. The results are compared with those reported by the CM algorithm Cuthill (1969).
Nine benchmark instances from the Harwell-Boeing sparse matrix collection of the National Institute of Standards and Technology Matrix Market HarwellBoeing (2010) were used in the computational experiments. Table 1 lists, for each considered instance, the following characteristics: number of rows, number of columns and number of nonzero entries.
Table 1: Characteristics of the benchmark instances.
No.  Instance  Rows  Columns  Nonzero entries
1    can_24    24    24       92
2    can_61    61    61       309
3    can_62    62    62       140
4    can_73    73    73       225
5    can_96    96    96       432
6    can_187   187   187      839
7    can_229   229   229      1003
8    can_256   256   256      1586
9    can_268   268   268      1675
The hybrid genetic algorithm HGA (Subsection 4.1) and the hybrid ant systems hACS and hMACS (Subsection 4.2) were implemented and applied to the instances described in Table 1. Some details regarding the implementations of HGA, hACS and hMACS follow.
The hybrid GA is based on a Delphi implementation Zavoianu (2009) and is tested with mutation rate, and , respectively generations. GA1 denotes the hybrid genetic algorithm with generations and GA2 the hybrid genetic algorithm with generations.
The hybrid ant algorithms Pintea (2010) hACS and hMACS are implemented in Java. For each instance, both algorithms are executed times.
The parameter values for both implementations are: ants, iterations, , , , . The algorithms were compiled on an AMD 2600 computer with 1024 MB memory and 1.9 GHz CPU clock.
Table 2 compares the best solutions (the bandwidth of the matrix) obtained by the CM, hACS, hMACS, GA1 and GA2 algorithms for the instances given in Table 1.
Table 2: Best bandwidth obtained by each algorithm.
Instance no.  CM   hACS  hMACS  GA1  GA2
1             8    14    11     6    6
2             26   43    42     19   19
3             9    20    12     8    8
4             27   28    22     22   23
5             23   17    17     25   25
6             23   63    33     53   51
7             49   120   120    63   63
8             116  148   189    91   91
9             134  165   210    90   90
A graphical representation of the results is given in Figure 1. Based on Figure 1, some conclusions follow.
Except for two instances (can_187 and can_229), the hybrid nature-based algorithms provide better results than the CM algorithm. The hMACS algorithm performs better than the hACS algorithm on small instances, while hACS is better than hMACS on larger ones. For six instances the hybrid genetic algorithm performed better than the ant-based algorithms. The number of generations considered for GA1 and GA2 has no significant influence on the results.
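The counts in the paragraph above can be re-derived from the values of Table 2 with a few lines of Python. The dictionary below simply transcribes the table (instance names taken from Table 1); the variable names and the strict-inequality reading of "better" are assumptions of this check.

```python
# Best bandwidths from Table 2: (CM, hACS, hMACS, GA1, GA2) per instance.
results = {
    "can_24": (8, 14, 11, 6, 6),
    "can_61": (26, 43, 42, 19, 19),
    "can_62": (9, 20, 12, 8, 8),
    "can_73": (27, 28, 22, 22, 23),
    "can_96": (23, 17, 17, 25, 25),
    "can_187": (23, 63, 33, 53, 51),
    "can_229": (49, 120, 120, 63, 63),
    "can_256": (116, 148, 189, 91, 91),
    "can_268": (134, 165, 210, 90, 90),
}

# Instances where CM is strictly better than every hybrid algorithm.
cm_best = [name for name, (cm, hacs, hmacs, ga1, ga2) in results.items()
           if cm < min(hacs, hmacs, ga1, ga2)]

# Instances where the better GA run strictly beats the better ant-based run.
ga_beats_ants = [name for name, (cm, hacs, hmacs, ga1, ga2) in results.items()
                 if min(ga1, ga2) < min(hacs, hmacs)]

print(cm_best)
print(len(ga_beats_ants))
```

The two instances on which CM remains best and the six instances on which the genetic algorithm beats the ant-based systems both follow directly from this comparison.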
In order to ensure better convergence to the solution, the ant-based hybrid models should be given an "ideal" set of parameters and also a good strategy for placing the agents in the environment.
The results for the Matrix Bandwidth Minimization Problem could be improved by using reinforcement learning in new hybrid nature-based computing techniques.
7 Conclusions and further work
The Matrix Bandwidth Minimization Problem (MBMP) is a classic mathematical problem, relevant to a wide range of complex real-life applications. The problem is NP-complete, and a lot of research has been conducted in order to find appropriate solutions.
Nowadays, bio-inspired heuristics are successfully used to solve difficult problems. On this basis, the paper describes several soft computing approaches for solving the MBMP. The proposed heuristics are hybrid algorithms: genetic algorithms and ant colony algorithms. Some standard MBMP instances are tested using the hybrid bio-inspired algorithms and compared with the existing literature. The results are encouraging.
A new theoretical reinforcement learning model for solving the considered problem is also introduced. Computational experiments confirmed the good performance of the proposed algorithms, emphasizing the effectiveness of soft computing methods for solving the MBMP.
Further work will detail the proposed reinforcement learning model. More exactly, we propose to develop an RL algorithm for training the MBMP agent and to experimentally validate the RL model. We will also investigate new local search procedures in order to improve the performance of the ant system and of the genetic algorithm proposed for solving the Matrix Bandwidth Minimization Problem.
References
 Berry (1996) M. Berry, B. Hendrickson and P. Raghavan. Sparse matrix reordering schemes for browsing hypertext. Lectures in Appl. Math. 32: The Mathematics of Numerical Analysis, 99–123, 1996.
 Cuthill (1969) E. Cuthill and J. McKee. Reducing the bandwidth of sparse symmetric matrices. In Proceedings of the 24th National Conference ACM, 157–172, 1969.
 Cygan (2009) M. Cygan and M. Pilipczuk. Even faster exact bandwidth. ACM Trans. Algorithms (in press). Also Technical Report abs/0902.1661, arXiv, CoRR, 2009.
 Czibula (2010) G. Czibula, I.-G. Czibula and C.-M. Pintea. A reinforcement learning approach for solving the matrix bandwidth minimization problem. Studia Informatica, LV(4):9–17, 2010.
 Dorigo (2004) M. Dorigo and T. Stützle. Ant Colony Optimization. MIT Press, Cambridge, 2004.
 Esposito (1998) A. Esposito, M. S. Catalano, F. Malucelli and L. Tarricone. Sparse Matrix Bandwidth Reduction: Algorithms, applications and real industrial cases in electromagnetics. High Performance Algorithms for Structured Matrix Problems, Advances in the Theory of Computation and Computational Mathematics, 2:27–45, 1998.
 Feige (1998) U. Feige. Approximating the bandwidth via volume respecting embeddings. Journal of Computer and System Sciences, 60(3):510–539, 2000.
 Garey (1978) M. R. Garey, R. L. Graham, D. S. Johnson and D. E. Knuth. Complexity results for the bandwidth minimization. SIAM Journal on Applied Mathematics, 34(3):477–495, 1978.
 Goldberg (1989) D. E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. 1st edition, Addison-Wesley Longman Publishing Co., Inc., 1989.
 Gupta (2000) A. Gupta. Improved bandwidth approximation for trees. In Proceedings of the 11th ACM-SIAM Symposium on Discrete Algorithms (SODA), 788–793, 2000.
 Lim (2003) A. Lim, B. Rodrigues and F. Xiao. Integrated genetic algorithm with hill climbing for bandwidth minimization problem. Proceedings of the 2003 International Conference on Genetic and Evolutionary Computation, Lecture Notes in Computer Science. Springer-Verlag, Berlin, Heidelberg, 1594–1595, 2003.
 Lim (2004) A. Lim, B. Rodrigues and F. Xiao. A Centroid-based approach to solve the Bandwidth Minimization Problem. In Proceedings of the 37th Hawaii International Conference on System Sciences, Hawaii, USA, 2004.
 Lim (2006) A. Lim, J. Lin, B. Rodrigues and F. Xiao. Ant Colony Optimization with hill climbing for the bandwidth minimization problem. Appl. Soft Comput., 6(2):180–188, 2006.
 Lin (2010) L. Lin and Y. Lin. Two models of two-dimensional bandwidth problems. Information Processing Letters, 110(11):469–473, 2010.
 Marti (2001) R. Marti, M. Laguna, F. Glover and V. Campos. Reducing the Bandwidth of a Sparse Matrix with Tabu Search. European Journal of Operational Research, 135(2):211–220, 2001.
 Mitchell (1998) M. Mitchell. An Introduction to Genetic Algorithms. MIT Press, 1998.
 HarwellBoeing (2010) National Institute of Standards and Technology, Matrix Market, Harwell-Boeing sparse matrix collection.
 Papadimitriou (1976) C. H. Papadimitriou. The NP-Completeness of the bandwidth minimization problem. Computing, 16(3):263–270, 1976.
 Pinana (2004) E. Pinana, I. Plana, V. Campos and R. Marti. GRASP and Path Relinking for the Matrix Bandwidth Minimization. European Journal of Operational Research, 153:200–210, 2004.
 Pintea (2010) C.-M. Pintea, G.-C. Crisan and C. Chira. A Hybrid ACO Approach to the Matrix Bandwidth Minimization Problem. HAIS 2010, LNCS 6076, 405–412, 2010.
 RodriguezTello (2008) E. Rodriguez-Tello, J.-K. Hao and J. Torres-Jimenez. An improved Simulated Annealing Algorithm for the matrix bandwidth minimization. European J. of Oper. Res., 185(3):1319–1335, 2008.
 Sutton (1998) R. Sutton and A. Barto. Reinforcement Learning. The MIT Press, Cambridge, London, 1998.
 Zavoianu (2009) C. Zavoianu. Tutorial: Bandwidth reduction - The Cuthill-McKee Algorithm, 2009.