
# Cooperative Optimization for Energy Minimization: A Case Study of Stereo Matching

Individuals working together as a team can often solve hard problems beyond the capability of any single member. Cooperative optimization is a newly proposed general method for attacking hard optimization problems, inspired by the cooperation principles of team playing. It has an established theoretical foundation and has demonstrated outstanding performance in solving real-world optimization problems. Under some general settings, a cooperative optimization algorithm has a unique equilibrium and converges to it at an exponential rate regardless of initial conditions, and it is insensitive to perturbations. It also possesses a number of global optimality conditions for identifying global optima, so that it can terminate its search process efficiently. This paper offers a general description of cooperative optimization, addresses a number of design issues, and presents a case study to demonstrate its power.


## I Introduction

Optimization is a core problem in both mathematics and computer science. It is a very active research area with many international conferences every year, a large body of literature, and many researchers and users across many fields working on a wide range of applications. Combinatorial optimization [1, 2] is a branch of optimization where the set of feasible solutions of a problem is discrete, countable, and of finite size. The general methods for combinatorial optimization include 1) local search [3], 2) simulated annealing [4, 5][6, 7, 8], 3) ant colony optimization [9], 4) tabu search [10], 5) branch-and-bound [11, 12], and 6) dynamic programming [12]. Successful applications of these methods have been reported for a large variety of practical optimization problems.

Optimization is important in the areas of computer vision, pattern recognition, and image processing. For example, stereo matching is one of the most active research problems in computer vision [13, 14, 15, 16]. The goal of stereo matching is to recover the depth image of a scene from a pair of 2-D images of the same scene taken from two different locations. Like many other problems from these areas, it can be formulated as a combinatorial optimization problem, which is NP-hard [17] in computational complexity in general.

Researchers in computer vision have developed a number of search techniques that have proven effective in practice for finding good solutions to combinatorial optimization problems. Two well-known ones are the cooperative algorithm proposed by D. Marr and T. Poggio [16] for stereo matching and the probabilistic relaxation proposed by A. Rosenfeld et al. [18] for scene labeling.

Recently, there has been remarkable progress in discovering new optimization methods for solving computer vision problems. Graph cuts [14, 19, 13, 20] is a powerful specialized optimization technique popular in computer vision. It has the best known results in energy minimization in the two recent evaluations of stereo algorithms [13, 21], and is more powerful than the classic simulated annealing method. However, graph cuts is limited in scope because it is only applicable when the energy minimization of a vision problem can be reduced to the problem of finding a minimum cut in a graph [20].

The second optimization method is the so-called sum-product algorithm [22], a generalized belief propagation algorithm developed in AI [23]. The sum-product algorithm is the most powerful optimization method found so far for attacking the hard optimization problems arising from channel decoding in communications. The min-sum algorithm and the max-product algorithm [24, 25] are its variations. It has also been successfully applied to several computer vision problems with promising experimental results [26].

The third method proposed recently is the so-called max-product tree-reweighted message passing [27]. It is based on a lower bounding technique called linear programming relaxation. An improvement has been proposed recently, and its successful applications in computer vision have been reported [28].

Cooperative optimization is a newly discovered general optimization method for attacking hard optimization problems [29, 30, 31, 32]. Experiments [33, 34, 35, 36, 37] have found that cooperative optimization achieves remarkable performance in solving a number of real-world NP-hard problems with the number of variables ranging from thousands to hundreds of thousands. The problems span several areas, which demonstrates its generality and power.

For example, cooperative optimization algorithms have been proposed for DNA image analysis [33], shape from shading [32], stereo matching [30, 34], and image segmentation [38]. In the second case, it significantly outperformed classic simulated annealing in finding global optimal solutions. In the third case, its performance is comparable with graph cuts in terms of solution quality, and it is twice as fast as graph cuts in software simulation using the common evaluation framework for stereo matching [13]. In the fourth case, it is ten times faster than graph cuts and has reduced the error rate by a factor of two to three. In all these cases, its memory usage is efficient and fixed, and its operations are simple, regular, and fully scalable. All these features make it suitable for parallel hardware implementations.

This paper is organized around three major themes: 1) a formal presentation of cooperative optimization, 2) design issues, and 3) a case study. They generalize and extend the previous papers on cooperative optimization. In the case study, another cooperative optimization algorithm for stereo matching, besides the one proposed before [30, 34], is offered to demonstrate the power and flexibility of cooperative optimization. Compared with the previous one, the new algorithm lowers the energy levels of solutions further and is more than ten times faster. Like the previous one, it is simple in computation and fully parallel in operation, making it suitable for hardware implementations.

## II Cooperative Multi-Agent System for Distributed Optimization

Different forms of cooperative optimization can be derived from different cooperation schemes. The basic form defines an important collection of cooperative optimization algorithms. There are two different ways to derive it: 1) as a cooperative multi-agent system for distributed optimization, and 2) as a lower bounding technique for finding global optima. Each way offers its own inspirations and insights for understanding the algorithms. This section describes the first way, and the following section describes the second. Readers who are not interested in them can jump directly to Section V for a general description of cooperative optimization. These three sections are relatively independent of each other.

### II-A Inspiration and Basic Ideas

Team playing is a common social behavior among individuals of the same species (or of different species), where team members working together can achieve goals or solve hard problems that are beyond the capability of any single member. Often, team playing is achieved through competition and cooperation among the members of a team. Usually, competition or cooperation alone can hardly lead to good solutions, either for the team or for the individuals in it. Without competition, individuals in a team may lose motivation to pursue better solutions. Without cooperation, they might directly conflict with each other, and poor solutions might be reached both for the team and for themselves. Through properly balanced competition and cooperation, individuals in a team can find the best solutions for the team and possibly good solutions for themselves at the same time.

In the terms of computer science, we can view a team of this kind as a cooperative system with multiple agents. In the system, each agent has its own objective. The collection of all the agents' objectives forms the objective of the system. We can use a cooperative system to solve a hard optimization problem following the divide-and-conquer principle. We first break up the objective function of the optimization problem into a number of sub-objective functions of manageable size and complexity. Following that, we assign each sub-objective function to an agent in the system as the agent's own objective function and ask the agents to optimize their own objective functions through competition and cooperation. (Throughout this paper, we use the terms "objective" and "objective function" interchangeably, since the objective of an optimization problem is defined by an objective function and this paper focuses only on optimizing objective functions.)

Specifically, the competition is achieved by asking each agent to optimize its own objective function by applying problem-specific optimization methods or heuristics. However, the objectives of the agents may not always be aligned with each other. In other words, the best solutions of the agents for optimizing their own objective functions may conflict with each other. To resolve the conflicts, each agent passes its solution to its neighbors through local message passing. After receiving its neighbors' solutions, each agent compromises its solution with the solutions of its neighbors. The solution compromising is achieved by modifying the objective function of each agent to take into account its neighbors' solutions. It is important to note that solution compromising among agents is a key concept for understanding the cooperation strategy introduced by cooperative optimization.

Let the objective of individual $i$ be $\mathrm{Objective}(i)$. Let the solution of individual $i$ at time $t$ be $\mathrm{Solution}(i, t)$. Let the collection of solutions of the neighbors of individual $i$ at time $t$ be $\mathrm{Solution}(N(i), t)$. The basic operations of a cooperative system are organized as a process shown in Figure 1.

The process of a cooperative system of this kind is iterative and self-organized, and each agent in the system is autonomous. The system is also inherently distributed and parallel, making the entire system highly scalable and less vulnerable to perturbations and disruptions on individuals than a centralized system. Despite its simplicity, it has many interesting emergent behaviors and can attack many challenging optimization problems.

### II-B Basic Form of Cooperative Optimization

In light of the cooperative multi-agent system for distributed optimization described in Fig. 1, we can derive the basic form of cooperative optimization now. It is based on a direct way for defining the solution of each agent and a simple way to modify the objective of each agent. The derivation can be generalized further in a straightforward way to any other definitions of solutions and modifications of objectives.

Consider a multivariate objective function $E(x_1, x_2, \ldots, x_n)$ of $n$ variables, or simply $E(x)$, where each variable $x_i$ has a finite domain $D_i$ of size $m_i$. Assume that $E(x)$ can be decomposed into $n$ sub-objective functions $E_i(x)$, denoted as $\{E_i(x)\}$, satisfying

1. $E(x) = E_1(x) + E_2(x) + \ldots + E_n(x)$,

2. $E_i(x)$, for $i = 1, 2, \ldots, n$, contains at least variable $x_i$,

3. the minimization of $E_i(x)$, for $i = 1, 2, \ldots, n$, is computationally manageable in complexity.

Let us assign $E_i(x)$ as the objective of agent $i$,

$$\mathrm{Objective}(i) = E_i(x), \quad \text{for } i = 1, 2, \ldots, n .$$

There are $n$ agents in the system, one agent for each sub-objective function.

Let the initial solution of agent $i$ be the minimization result of $E_i(x)$ defined as follows,

$$\mathrm{Solution}(i, t=0) = \min_{X_i \setminus x_i} E_i(x) ,$$

where $X_i$ is the set of variables contained in $E_i(x)$, and $\min_{X_i \setminus x_i}$ stands for minimizing with respect to all variables in $X_i$ excluding $x_i$. The solution is a unary function on variable $x_i$, denoted as $\Psi_i^{(0)}(x_i)$.

Assume that the system takes discrete time with iteration step $k = 1, 2, 3, \ldots$. To simplify notation, let $\tilde{E}_i^{(k)}(x)$ be the modified objective function of agent $i$ at iteration $k$, i.e.,

$$\tilde{E}_i^{(k)}(x) = \mathrm{Objective}(i, t=k) .$$

It is also referred to as the $i$-th modified sub-objective of the system. The agent's solution at iteration $k$ is defined as

$$\mathrm{Solution}(i, t=k) = \min_{X_i \setminus x_i} \tilde{E}_i^{(k)}(x) . \tag{1}$$

The solution is a unary function on variable $x_i$, denoted as $\Psi_i^{(k)}(x_i)$. It is the state of agent $i$ at iteration $k$. It can be represented as a vector of real values of size $m_i$, the domain size of variable $x_i$. The $i$-th equation in (1) defines the dynamics of agent $i$. All the $n$ equations together define the dynamics of the system.

As described in the previous subsection, the cooperation among the agents in the system is introduced by solution compromising via modifying the objective of each agent. Let agent $i$ define its modified objective function $\tilde{E}_i^{(k)}(x)$ at iteration $k$ as a linear combination of its original objective $E_i(x)$ and the solutions of its neighbors at the previous iteration as follows,

$$\tilde{E}_i^{(k)}(x) = (1 - \lambda_k) E_i(x) + \lambda_k \sum_{j \in \mathrm{Neighbors}(i)} w_{ij} \Psi_j^{(k-1)}(x_j) , \tag{2}$$

where $\lambda_k$ and $w_{ij}$ are the coefficients of the linear combination.

Agent $j$ is a neighbor of agent $i$ if variable $x_j$ of the same index is contained in agent $i$'s objective function $E_i(x)$. (Based on this definition, agent $i$ is also a neighbor of itself. Such a generalization is necessary because nothing prevents agent $i$ from modifying its objective using its own solution.) The set of neighbors of agent $i$ is denoted as $N(i)$, i.e., $N(i) = \mathrm{Neighbors}(i)$. Specifically, it is defined as the set of indices

$$N(i) = \{ j \mid x_j \in X_i \} .$$

Substituting Eq. (2) into Eq. (1) and letting $w_{ij} = 0$ if $j \notin N(i)$, the dynamics of the cooperative system can be written as the following difference equations,

$$\Psi_i^{(k)}(x_i) = \min_{X_i \setminus x_i} \Big( (1 - \lambda_k) E_i(x) + \lambda_k \sum_j w_{ij} \Psi_j^{(k-1)}(x_j) \Big), \quad \text{for } i = 1, 2, \ldots, n . \tag{3}$$

Such a set of difference equations defines a basic cooperative optimization system (algorithm) for minimizing an objective function of the form $\sum_i E_i(x)$.

At iteration $k$, variable $x_i$, for $i = 1, 2, \ldots, n$, has a value in the solution for minimizing the $i$-th modified sub-objective function $\tilde{E}_i^{(k)}(x)$. It is denoted as $\tilde{x}_i^{(k)}$, i.e.,

$$\tilde{x}_i^{(k)} = \arg\min_{x_i} \min_{X_i \setminus x_i} \tilde{E}_i^{(k)}(x) .$$

From (1), we have

$$\tilde{x}_i^{(k)} = \arg\min_{x_i} \Psi_i^{(k)}(x_i) . \tag{4}$$

Agent $i$ is responsible for assigning that value to variable $x_i$. The assignments of the other variables are taken care of by the other agents. All these values together form a solution of the system at iteration $k$, denoted as $\tilde{x}^{(k)}$.

Putting everything together, the pseudo code of the algorithm is given in Figure 2. The global optimality condition mentioned in the pseudo code will be discussed in detail later in this paper.
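The iteration defined by (3) and (4) can also be sketched in Python. This is a minimal brute-force sketch under stated assumptions, not the author's implementation: the function names `minimize_out` and `cooperative_optimization`, the dictionary representation of $\Psi_i$, the three-variable chain problem, and its cost values are all hypothetical choices made for illustration. Each agent enumerates the other variables in $X_i$ exhaustively, which is only practical for tiny scopes.

```python
import itertools

def minimize_out(i, E_i, scope, domains, prev_psi, W, lam):
    """Return Psi_i as a dict {value of x_i: minimum of the modified
    objective (2) over all other variables in X_i}.
    With prev_psi=None this gives the initial solution (plain E_i)."""
    others = [j for j in scope if j != i]
    psi_i = {}
    for v in domains[i]:
        best = float("inf")
        for combo in itertools.product(*(domains[j] for j in others)):
            x = dict(zip(others, combo))
            x[i] = v
            cost = (1.0 - lam) * E_i(x)
            if prev_psi is not None:
                cost += lam * sum(W[i][j] * prev_psi[j][x[j]] for j in scope)
            best = min(best, cost)
        psi_i[v] = best
    return psi_i

def cooperative_optimization(E, scopes, domains, W, lam, iterations):
    n = len(E)
    # Initial solution: Psi_i^(0)(x_i) = min over X_i \ x_i of E_i(x).
    psi = [minimize_out(i, E[i], scopes[i], domains, None, W, 0.0)
           for i in range(n)]
    for _ in range(iterations):
        # Every agent updates from the previous iterate, as in (3).
        psi = [minimize_out(i, E[i], scopes[i], domains, psi, W, lam)
               for i in range(n)]
    # Each agent assigns its own variable, as in (4).
    x_tilde = [min(domains[i], key=psi[i].get) for i in range(n)]
    return psi, x_tilde

# Hypothetical chain x0 - x1 - x2 with binary labels (illustrative costs).
unary = [{0: 0.0, 1: 2.0}, {0: 1.0, 1: 0.0}, {0: 3.0, 1: 0.0}]

def pair(a, b):
    return 0.0 if a == b else 1.5

# Each pairwise term is split evenly between the two agents it touches.
E = [
    lambda x: unary[0][x[0]] + pair(x[0], x[1]) / 2,
    lambda x: unary[1][x[1]] + pair(x[0], x[1]) / 2 + pair(x[1], x[2]) / 2,
    lambda x: unary[2][x[2]] + pair(x[1], x[2]) / 2,
]
scopes = [[0, 1], [0, 1, 2], [1, 2]]
domains = [[0, 1], [0, 1], [0, 1]]
# Non-negative weights, each column summing to one.
W = [[0.0, 0.5, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 0.5, 0.0]]

def total_energy(x):
    return sum(E_i(x) for E_i in E)

psi, x_tilde = cooperative_optimization(E, scopes, domains, W, lam=0.5, iterations=20)
lower_bound = sum(psi[i][x_tilde[i]] for i in range(3))
print(x_tilde, lower_bound, total_energy(dict(enumerate(x_tilde))))
```

In this sketch the sum of the unary functions $\Psi_i^{(k)}$ never exceeds $E(x)$, so comparing `lower_bound` with the energy of `x_tilde` gives a simple stand-in for a global optimality test: if the two values coincide, the returned assignment is a global minimum.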

### II-C Cooperation Strength and Propagation Matrix

The coefficient $\lambda_k$ in (3) controls the level of cooperation among the agents at iteration $k$. It is called the cooperation strength and satisfies $0 \le \lambda_k < 1$. From (3) we can see that, for each agent, a high value of $\lambda_k$ weighs the solutions $\Psi_j^{(k-1)}(x_j)$ of the other agents more than its own objective $E_i(x)$. In other words, the agents in the system tend to compromise more on their solutions. As a consequence, a strong level of cooperation is reached in this case. If the cooperation strength $\lambda_k$ is small, the cooperation among the agents is weak. In particular, if it is equal to zero, there is no cooperation among the agents and each agent minimizes its own objective function independently (see (3)).

The coefficients $w_{ij}$ control the propagation of the solutions $\Psi_j^{(k-1)}(x_j)$, for $j = 1, 2, \ldots, n$, as messages among the agents in the system. All the $w_{ij}$s together form an $n \times n$ matrix called the propagation matrix. To have $\sum_i E_i(x)$ as the objective function to be minimized, it is required [33] that the propagation matrix $W = (w_{ij})_{n \times n}$ is non-negative and

$$\sum_{i=1}^{n} w_{ij} = 1, \quad \text{for } j = 1, 2, \ldots, n .$$

To have solutions uniformly propagated among all the agents, it is required [33] that the propagation matrix $W$ is irreducible. A matrix $W$ is called reducible if there exists a permutation matrix $P$ such that $P W P^{T}$ has the block form

$$\begin{pmatrix} A & B \\ O & C \end{pmatrix} .$$

The role of propagation matrices in basic cooperative optimization algorithms is exactly the same as that of transition matrices in Markov chains (or random walks over directed graphs). In a Markov chain, a transition matrix governs the distribution of states over time. In a basic cooperative optimization algorithm, a propagation matrix governs the distribution of solutions among agents. The mathematical foundation for analyzing Markov chains is well established, and it can be directly applied to analyze the message propagation of cooperative optimization.
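The two requirements on the propagation matrix can be checked mechanically. The sketch below is illustrative only: the function names and the two small test matrices are hypothetical. It relies on the standard fact that a non-negative square matrix is irreducible exactly when the directed graph with an edge $u \to v$ for every positive entry $w_{uv}$ is strongly connected, and it uses exact fractions so that the column sums can be compared with one without rounding.

```python
from fractions import Fraction

def is_propagation_matrix(W):
    """Non-negative entries and every column summing to one."""
    n = len(W)
    nonnegative = all(W[i][j] >= 0 for i in range(n) for j in range(n))
    columns_ok = all(sum(W[i][j] for i in range(n)) == 1 for j in range(n))
    return nonnegative and columns_ok

def is_irreducible(W):
    """Strong connectivity of the graph with an edge u -> v when W[u][v] > 0."""
    n = len(W)

    def reaches_all(edge):
        # Depth-first search from node 0 using the given edge predicate.
        seen, stack = {0}, [0]
        while stack:
            u = stack.pop()
            for v in range(n):
                if v not in seen and edge(u, v):
                    seen.add(v)
                    stack.append(v)
        return len(seen) == n

    forward = reaches_all(lambda u, v: W[u][v] > 0)
    backward = reaches_all(lambda u, v: W[v][u] > 0)  # reversed edges
    return forward and backward

half = Fraction(1, 2)
# Hypothetical 3-agent ring: every agent listens to the other two.
W_ring = [[0, half, half],
          [half, 0, half],
          [half, half, 0]]
# Already in the block form (A B; O C), hence reducible.
W_block = [[1, half],
           [0, half]]

print(is_propagation_matrix(W_ring), is_irreducible(W_ring))
print(is_propagation_matrix(W_block), is_irreducible(W_block))
```

In `W_block` the second agent never receives the first agent's solution, which is the kind of one-way propagation that irreducibility rules out.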

### II-D Soft Decisions as Messages Passed Among Agents

As mentioned before, the solution of agent $i$ at iteration $k$ is a unary function $\Psi_i^{(k)}(x_i)$ on $x_i$ storing the solution of minimizing the $i$-th modified sub-objective function $\tilde{E}_i^{(k)}(x)$ (see (1)). Given a value of $x_i$, $\Psi_i^{(k)}(x_i)$ is the minimal value of $\tilde{E}_i^{(k)}(x)$ with the variable $x_i$ fixed to that value. To minimize $\tilde{E}_i^{(k)}(x)$, the values of $x_i$ with smaller function values $\Psi_i^{(k)}(x_i)$ are preferred over those with higher function values. The best value for assigning the variable $x_i$ is the one with the minimal function value (see (4)). Therefore, $\Psi_i^{(k)}(x_i)$ is inversely related to the preferences over the different values of $x_i$ for minimizing $\tilde{E}_i^{(k)}(x)$. It is called the assignment constraint on variable $x_i$, a constraint on the variable introduced by the algorithm. It can also be viewed as a soft decision made by the agent for assigning the variable $x_i$ at iteration $k$.

In particular, a soft decision of agent $i$ falls back to a hard decision for assigning the variable $x_i$ when the agent accepts only one value and rejects all the others. Such a hard decision can be represented by the assignment constraint $\Psi_i^{(k)}(x_i)$ as $\Psi_i^{(k)}(\tilde{x}_i) = 0$, for some $\tilde{x}_i \in D_i$, and $\Psi_i^{(k)}(x_i) = \infty$ for any $x_i \neq \tilde{x}_i$.

With that insight, it can now be understood that the messages propagated among the agents in a basic cooperative optimization system are the soft decisions for assigning variables. An agent can make a better decision using soft decisions propagated from its neighbors than using hard ones. It is important to note that soft decision making is a critical feature of cooperative optimization, which makes it fundamentally different from many classic optimization methods where hard decisions are made for assigning variables.

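### II-E A Simple Example

Given an objective function of the following form

$$\begin{aligned} E(x_1, x_2, x_3, x_4, x_5) = {} & f_1(x_1) + f_2(x_2) + f_3(x_3) + f_4(x_4) + f_5(x_5) \\ & + f_{1,2}(x_1, x_2) + f_{2,3}(x_2, x_3) + f_{3,4}(x_3, x_4) \\ & + f_{4,5}(x_4, x_5) + f_{1,5}(x_1, x_5) + f_{2,5}(x_2, x_5) , \end{aligned} \tag{5}$$

where each variable is of a finite domain. The goal is to seek values (labels) of the five variables such that the objective function is minimized.

Let us simply denote the function as

$$E(x) = f_1 + f_2 + f_3 + f_4 + f_5 + f_{1,2} + f_{2,3} + f_{3,4} + f_{4,5} + f_{1,5} + f_{2,5} .$$

To design a basic cooperative optimization algorithm to minimize the objective function, we first decompose it into the following five sub-objective functions,

$$\begin{aligned} E_1(x_1, x_2, x_5) &= f_1 + f_{1,2}/2 + f_{1,5}/2 ; \\ E_2(x_1, x_2, x_3, x_5) &= f_2 + f_{1,2}/2 + f_{2,3}/2 + f_{2,5}/2 ; \\ E_3(x_2, x_3, x_4) &= f_3 + f_{2,3}/2 + f_{3,4}/2 ; \\ E_4(x_3, x_4, x_5) &= f_4 + f_{3,4}/2 + f_{4,5}/2 ; \\ E_5(x_1, x_2, x_4, x_5) &= f_5 + f_{1,5}/2 + f_{2,5}/2 + f_{4,5}/2 . \end{aligned}$$

A propagation matrix $W$ of dimensions $5 \times 5$ can be chosen as

$$W = \begin{pmatrix} 0 & \frac{1}{3} & 0 & 0 & \frac{1}{3} \\ \frac{1}{2} & 0 & \frac{1}{2} & 0 & \frac{1}{3} \\ 0 & \frac{1}{3} & 0 & \frac{1}{2} & 0 \\ 0 & 0 & \frac{1}{2} & 0 & \frac{1}{3} \\ \frac{1}{2} & \frac{1}{3} & 0 & \frac{1}{2} & 0 \end{pmatrix} \tag{6}$$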

With the decomposition and the propagation matrix, substituting them into (3) we have a basic cooperative optimization algorithm with five difference equations for minimizing the five sub-objective functions in an iterative and cooperative way.
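The variable sets $X_i$ of the five sub-objective functions and a propagation matrix for them can be derived from the list of pairwise terms in (5). The sketch below is an illustration under one assumed rule, which the text does not state explicitly: each $\Psi_j$ is spread evenly over the other agents whose sub-objective contains $x_j$. For this example that rule produces a non-negative matrix with unit column sums whose entries match (6); the variable names are hypothetical.

```python
from fractions import Fraction

# Index pairs of the pairwise terms f_{a,b} in (5).
pairwise_terms = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5), (2, 5)]
variables = [1, 2, 3, 4, 5]

# neighbours[i]: variables sharing a pairwise term with x_i.
neighbours = {i: set() for i in variables}
for a, b in pairwise_terms:
    neighbours[a].add(b)
    neighbours[b].add(a)

# X_i: variables contained in E_i = f_i + (1/2) * (pairwise terms touching x_i).
scopes = {i: sorted(neighbours[i] | {i}) for i in variables}

# Assumed rule: w_ij = 1 / (number of other agents containing x_j)
# when x_j is in X_i and j != i, and 0 otherwise.
W = [[Fraction(1, len(neighbours[j])) if j in neighbours[i] else Fraction(0)
      for j in variables]
     for i in variables]

for i in variables:
    print(f"X_{i} =", scopes[i])
for row in W:
    print([str(w) for w in row])
```

Because every pairwise term is shared by exactly two agents, each column $j$ has one equal entry per agent adjacent to $x_j$, which is why the column sums equal one by construction.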

### II-F Basic Canonical Form as Generalization

Replacing $\Psi_i^{(k)}$ by $(1 - \lambda_k) \Psi_i^{(k)}$ in the difference equations (3), we have the basic canonical form of cooperative optimization as

$$\Psi_i^{(k)}(x_i) = \min_{X_i \setminus x_i} \Big( E_i(x) + \lambda_k \sum_j w_{ij} \Psi_j^{(k-1)}(x_j) \Big), \quad \text{for } i = 1, 2, \ldots, n . \tag{7}$$

The basic form of cooperative optimization (3) has its cooperation strength restricted to $0 \le \lambda_k < 1$, because its difference equations (3) do not make sense when $\lambda_k \ge 1$. However, such a restriction can be relaxed to $\lambda_k \ge 0$ for the basic canonical form (7). In practice, the basic canonical form is often preferred over the basic one because its cooperation strength has a broader range to choose from to maximize performance.
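For a constant cooperation strength $\lambda_k = \lambda$ with $0 \le \lambda < 1$, the relation between (3) and (7) can be checked in one step. The rescaled function $\Phi_i^{(k)}$ below is notation introduced only for this check. Writing $\Psi_i^{(k)} = (1 - \lambda) \Phi_i^{(k)}$ in (3) and dividing both sides by the positive factor $1 - \lambda$ gives

$$\begin{aligned} (1 - \lambda)\, \Phi_i^{(k)}(x_i) &= \min_{X_i \setminus x_i} \Big( (1 - \lambda) E_i(x) + \lambda \sum_j w_{ij} (1 - \lambda)\, \Phi_j^{(k-1)}(x_j) \Big) \\ \Longrightarrow \quad \Phi_i^{(k)}(x_i) &= \min_{X_i \setminus x_i} \Big( E_i(x) + \lambda \sum_j w_{ij}\, \Phi_j^{(k-1)}(x_j) \Big) , \end{aligned}$$

which is (7) with $\Phi$ in the role of $\Psi$. Since scaling by a positive constant does not change an $\arg\min$, the assignments obtained from (4) are the same in both forms in this constant-strength case.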

## III Cooperative Optimization as Lower Bounding Technique

### III-A Bound Function Tightening Technique for Optimization

In principle, a basic cooperative optimization algorithm can be understood as a lower bounding technique for finding global minima. It first initializes a function of some form as a lower bound function to an objective function. One may intentionally choose a form for the lower bound function such that the minimization of the function is simple in computation. Following that, the algorithm progressively tightens the lower bound function until its global minimum touches the global minimum of the original objective function. The latter is then found by searching the former instead (see the illustration in Fig. 3).

Specifically, let the objective function to be minimized be $E(x)$. Assume that the initial lower bound function is $E_-^{(0)}(x)$, with $E_-^{(0)}(x) \le E(x)$. Starting from $E_-^{(0)}(x)$, assume that the algorithm progressively tightens the function in an iterative way such that

$$E_-^{(0)}(x) \le E_-^{(1)}(x) \le \ldots \le E_-^{(k)}(x) \le E(x) ,$$

where $k$ is the iteration number.

Let the global minimum of the lower bound function $E_-^{(k)}(x)$ at iteration $k$ be $\tilde{x}^{(k)}$. Finding $\tilde{x}^{(k)}$ is simple in computation due to the simple form of the lower bound function $E_-^{(k)}(x)$. Suppose that, at iteration $k$, the algorithm finds that the lower bound function $E_-^{(k)}(x)$ at the solution $\tilde{x}^{(k)}$ has the same function value as the original objective function $E(x)$, i.e.,

$$E_-^{(k)}(\tilde{x}^{(k)}) = E(\tilde{x}^{(k)}) .$$

In other words, the two functions touch each other at the point $x = \tilde{x}^{(k)}$ in the search space. Then $\tilde{x}^{(k)}$ must also be the global minimum of $E(x)$ simply because

$$E(\tilde{x}^{(k)}) = E_-^{(k)}(\tilde{x}^{(k)}) \le E_-^{(k)}(x) \le E(x), \quad \text{for any } x . \tag{8}$$

Such a condition implies that the lower bound function $E_-^{(k)}(x)$ has been tightened enough that its global minimum touches the global minimum of the original objective function $E(x)$. The latter is thus found by searching the former instead.

Such a lower bounding technique is called the bound function tightening technique for optimization. There are other lower bounding techniques based on principles different from this one. Examples are Lagrangian relaxation techniques, cutting plane techniques, branch-and-bound algorithms, and branch-and-cut algorithms.

### III-B Basic Form as Lower Bounding Technique

In light of the bound function tightening technique described in the previous subsection, we can derive the basic form of cooperative optimization based on a simple form of lower bound functions. The derivation can be generalized further in a straightforward way to any other forms of lower bound functions.

Consider an objective function of $n$ variables, $E(x_1, x_2, \ldots, x_n)$, or simply $E(x)$. Assume that $E_-(x)$ is a lower bound function of $E(x)$ defined on the same set of variables. Obviously, the linear combination of the two functions,

$$(1 - \lambda) E(x) + \lambda E_-(x) , \tag{9}$$

defines a new lower bound function of $E(x)$ if the parameter $\lambda$ satisfies $0 \le \lambda < 1$.

Let us choose a simple form for the lower bound function as

$$E_-(x) = \Psi_1(x_1) + \Psi_2(x_2) + \ldots + \Psi_n(x_n) , \tag{10}$$

where $\Psi_i(x_i)$ is a unary component function defined on variable $x_i$, for $i = 1, 2, \ldots, n$. Its global minimum, denoted as $\tilde{x}$, can be easily found by minimizing the unary component functions independently as

$$\tilde{x}_i = \arg\min_{x_i} \Psi_i(x_i), \quad \text{for } i = 1, 2, \ldots, n .$$

Assume that the objective function $E(x)$ can be decomposed into $n$ sub-objective functions,

$$E(x) = E_1(x) + E_2(x) + \ldots + E_n(x) .$$

The lower bound function $E_-(x)$ can also be easily decomposed into $n$ sub-functions as follows

$$E_-(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \Psi_j(x_j), \quad \text{where } w_{ij} \ge 0 \text{ and } \sum_i w_{ij} = 1, \text{ for } 1 \le i, j \le n .$$

Based on the two decompositions, the new lower bound function (9) can be rewritten as

$$\sum_{i=1}^{n} \Big( (1 - \lambda) E_i(x) + \lambda \sum_j w_{ij} \Psi_j(x_j) \Big) . \tag{11}$$

To put the above function in a simple form, let

$$\tilde{E}_i(x) = (1 - \lambda) E_i(x) + \lambda \sum_j w_{ij} \Psi_j(x_j) .$$

Then it can be rewritten simply as

$$\sum_{i=1}^{n} \tilde{E}_i(x) .$$

In the above sum, let $\tilde{X}_i$ be the set of variables contained in the $i$-th component function $\tilde{E}_i(x)$. If we minimize the function with respect to all variables in $\tilde{X}_i$ except for $x_i$, we obtain a unary function defined on $x_i$, denoted as $\Psi'_i(x_i)$, i.e.,

$$\Psi'_i(x_i) = \min_{\tilde{X}_i \setminus x_i} \tilde{E}_i(x) . \tag{12}$$

The sum of those unary functions defines another lower bound function of $E(x)$, denoted as $E'_-(x)$, i.e.,

$$E'_-(x) = \sum_{i=1}^{n} \Psi'_i(x_i) \le E(x) .$$

This new lower bound function has exactly the same form as the original one, $E_-(x)$. Therefore, from a lower bound function of the form $\sum_i \Psi_i(x_i)$, we can compute another lower bound function of the same form. Such a process can be repeated, which gives an iterative algorithm for computing new lower bound functions.

Rewriting Eq. (12) in an iterative format, we have

$$\Psi_i^{(k)}(x_i) = \min_{\tilde{X}_i \setminus x_i} \Big( (1 - \lambda_k) E_i(x) + \lambda_k \sum_j w_{ij} \Psi_j^{(k-1)}(x_j) \Big) , \tag{13}$$

where $k$ is the iteration step, $k = 1, 2, 3, \ldots$. The above difference equations define a basic cooperative optimization algorithm for minimizing an objective function of the form $\sum_i E_i(x)$.

The solution at iteration $k$, denoted as $\tilde{x}^{(k)}$, is defined as the global minimal solution of the lower bound function $E_-^{(k)}(x) = \sum_i \Psi_i^{(k)}(x_i)$ at the iteration, i.e.,

$$\tilde{x}^{(k)} = \arg\min_{x} E_-^{(k)}(x) ,$$

which can be easily obtained as

$$\tilde{x}_i^{(k)} = \arg\min_{x_i} \Psi_i^{(k)}(x_i), \quad \text{for } i = 1, 2, \ldots, n . \tag{14}$$

If $E_-^{(k)}(\tilde{x}^{(k)}) = E(\tilde{x}^{(k)})$ at some iteration $k$, then the solution $\tilde{x}^{(k)}$ must be the global minimum of the original objective function $E(x)$.

Without loss of generality, we assume in the following discussions that all sub-objective functions $E_i(x)$ are nonnegative. One may choose the initial condition as $\Psi_i^{(0)}(x_i) = 0$, for any value of $x_i$ and $i = 1, 2, \ldots, n$. The parameter $\lambda_k$ can be varied from one iteration to another. If it is of a constant value and the above initial condition has been chosen, cooperative optimization theory [33] tells us that the lower bound function $E_-^{(k)}(x)$ is monotonically non-decreasing as shown in (8).
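The monotone tightening of the lower bound can be observed numerically. The following is a small sketch under stated assumptions, not a reproduction of any experiment from the paper: the three-variable ring problem, its cost values, the symmetric propagation matrix, and the names `E_i`, `step`, and `bounds` are hypothetical. It starts from $\Psi_i^{(0)} = 0$ with nonnegative sub-objectives and a constant $\lambda$, and records the minimum of $\sum_i \Psi_i^{(k)}(x_i)$ at each iteration.

```python
import itertools

# Hypothetical ring x0 - x1 - x2 - x0 with three labels per variable.
unary = [[0.0, 2.0, 4.0], [3.0, 0.0, 1.0], [2.0, 2.0, 0.0]]
edges = [(0, 1), (1, 2), (0, 2)]
labels = [0, 1, 2]
n = 3
W = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]  # non-negative, columns sum to one

def E_i(i, x):
    """Sub-objective of agent i: its unary cost plus half of each pairwise
    cost |x_a - x_b| on an edge touching x_i (all values are nonnegative)."""
    return unary[i][x[i]] + 0.5 * sum(abs(x[a] - x[b]) for a, b in edges if i in (a, b))

def total_energy(x):
    return sum(E_i(i, x) for i in range(n))

def step(psi, lam):
    """One application of the difference equations (13) to every agent."""
    new_psi = []
    for i in range(n):
        row = []
        for v in labels:
            best = float("inf")
            for x in itertools.product(labels, repeat=n):
                if x[i] != v:
                    continue
                cost = (1 - lam) * E_i(i, x) + lam * sum(
                    W[i][j] * psi[j][x[j]] for j in range(n))
                best = min(best, cost)
            row.append(best)
        new_psi.append(row)
    return new_psi

lam = 0.6
psi = [[0.0] * len(labels) for _ in range(n)]  # initial condition Psi^(0) = 0
bounds = []
for k in range(15):
    psi = step(psi, lam)
    # Minimum of the lower bound function: sum of the unary minima.
    bounds.append(sum(min(row) for row in psi))

optimum = min(total_energy(x) for x in itertools.product(labels, repeat=n))
print(bounds)
print(optimum)
```

The recorded values form a non-decreasing sequence that never exceeds the brute-force optimum, which is the behavior the bound function tightening view predicts for this setting.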

## IV Computational Properties

### IV-A General Convergence Properties of Cooperative Optimization

It has been shown that a basic cooperative optimization algorithm (3) has some important computational properties [33]. Given a constant cooperation strength $\lambda$, i.e., $\lambda_k = \lambda$ for all $k$, the algorithm has one and only one equilibrium. It always converges to the unique equilibrium at an exponential rate regardless of initial conditions and perturbations. The two convergence theorems proved in [33] are very important, so they are listed here again. One formally describes the existence and uniqueness of the equilibrium of the algorithm, and the other reveals the convergence property of the algorithm.

###### Theorem IV.1

A basic cooperative optimization algorithm with a constant cooperation strength $\lambda$ ($0 \le \lambda < 1$) has one and only one equilibrium. That is, the difference equations (3) of the algorithm have one and only one solution (equilibrium), denoted as a vector $(\Psi_1^{(\infty)}, \Psi_2^{(\infty)}, \ldots, \Psi_n^{(\infty)})^{T}$, or simply $\Psi^{(\infty)}$.

###### Theorem IV.2

A basic cooperative optimization algorithm with a constant cooperation strength $\lambda$ ($0 \le \lambda < 1$) converges exponentially to its unique equilibrium $\Psi^{(\infty)}$ with the rate $\lambda$ for any choice of the initial condition $\Psi^{(0)}$. That is,

$$\| \Psi^{(k)} - \Psi^{(\infty)} \|_\infty \le \lambda^{k} \| \Psi^{(0)} - \Psi^{(\infty)} \|_\infty , \tag{15}$$

where $\| x \|_\infty$ is the maximum norm of the vector $x$ defined as

$$\| x \|_\infty = \max_i | x_i | .$$

The two theorems indicate that every basic cooperative optimization algorithm (3) is stable and has a unique attractor, $\Psi^{(\infty)}$. Hence, the evolution of the algorithms is robust and insensitive to perturbations. The final solution of the algorithms is independent of their initial conditions. In contrast, conventional algorithms based on iterative local improvement of solutions may have many local attractors due to the local minima problem. The evolution of those local optimization algorithms is sensitive to perturbations, and their final solution depends on their initial conditions.

Furthermore, the basic cooperative optimization algorithms (3) possess a number of global optimality conditions for identifying global optima. They know whether a solution they found is a global optimum, so that they can terminate their search process efficiently. However, this statement does not imply that NP=P, because a basic cooperative optimization algorithm can only verify within polynomial time whether a solution it found is a global optimum or not. It cannot decide the global optimality of any given solution other than those it found.

It is important to note that a basic canonical cooperative optimization algorithm (7) may no longer possess the unique equilibrium property when its cooperation strengths at some iterations are greater than one, i.e., $\lambda_k > 1$ for some $k$. In this case, the algorithm may have multiple equilibria. It can evolve into any one of them depending on its initial settings of the assignment constraints $\Psi_i^{(0)}(x_i)$.
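The insensitivity to initial conditions can be illustrated numerically. The sketch below is an assumption-laden illustration rather than a proof or the author's experiment: the three-variable ring problem, its costs, and all names are hypothetical. The propagation matrix is chosen symmetric, so its rows as well as its columns sum to one, and in that case the gap between two runs in the maximum norm shrinks by at least a factor $\lambda$ per iteration, which the code records.

```python
import itertools
import random

# Hypothetical ring x0 - x1 - x2 - x0 with three labels per variable.
unary = [[0.0, 2.0, 4.0], [3.0, 0.0, 1.0], [2.0, 2.0, 0.0]]
edges = [(0, 1), (1, 2), (0, 2)]
labels = [0, 1, 2]
n = 3
# Symmetric propagation matrix: rows and columns both sum to one.
W = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]

def E_i(i, x):
    return unary[i][x[i]] + 0.5 * sum(abs(x[a] - x[b]) for a, b in edges if i in (a, b))

def step(psi, lam):
    """One application of the difference equations (3) to every agent."""
    new_psi = []
    for i in range(n):
        row = []
        for v in labels:
            row.append(min(
                (1 - lam) * E_i(i, x) + lam * sum(W[i][j] * psi[j][x[j]] for j in range(n))
                for x in itertools.product(labels, repeat=n) if x[i] == v))
        new_psi.append(row)
    return new_psi

def max_norm_gap(psi_a, psi_b):
    return max(abs(a - b) for ra, rb in zip(psi_a, psi_b) for a, b in zip(ra, rb))

random.seed(0)  # two arbitrary, reproducible initial conditions
psi_a = [[random.uniform(-5, 5) for _ in labels] for _ in range(n)]
psi_b = [[random.uniform(-5, 5) for _ in labels] for _ in range(n)]

lam = 0.5
gaps = [max_norm_gap(psi_a, psi_b)]
for k in range(40):
    psi_a, psi_b = step(psi_a, lam), step(psi_b, lam)
    gaps.append(max_norm_gap(psi_a, psi_b))

print(gaps[0], gaps[10], gaps[-1])
```

Both runs end up at numerically the same state, consistent with a single attractor; a local search started from the same two points would carry no such guarantee.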

### IV-B Consensus Solution and Solution Consensus in Distributed Optimization

As described before, a basic cooperative optimization algorithm is defined by the difference equations (3). The $i$-th equation defines the minimization of the $i$-th modified sub-objective function $\tilde{E}_i$ (defined in (2)). Any given variable, say $x_i$, may be contained in several modified sub-objective functions. At each iteration, $x_i$ has a value in the optimal solution for minimizing each of the modified sub-objective functions containing the variable. Those values may not be the same. If all of them are the same at some iteration, we say that the cooperative optimization algorithm reaches a consensus assignment for that variable. Moreover, if a consensus assignment is reached for every variable of the problem at hand at some iteration, we say that the minimization of the $n$ modified sub-objective functions reaches a solution consensus. That is, there is no conflict among the solutions in terms of variable assignments for minimizing those functions. In this case, those consensus assignments form a solution, called a consensus solution, and the algorithm is said to reach a consensus solution.

To be more specific, consider the $n$ modified sub-objective functions $\tilde{E}_i(x)$, for $i = 1, 2, \ldots, n$ (to simplify notation, let us drop the superscript $k$ temporarily). Let the optimal solution of the $i$-th modified sub-objective function be $\tilde{x}(\tilde{E}_i)$, i.e.,

$$\tilde{x}(\tilde{E}_i) = \arg\min_{x} \tilde{E}_i(x) .$$

Assume that variable $x_i$ is contained in both the $j$-th and the $k$-th modified sub-objective functions, $\tilde{E}_j$ and $\tilde{E}_k$. However, it is not necessary that

$$\tilde{x}_i(\tilde{E}_j) = \tilde{x}_i(\tilde{E}_k) .$$

Given a variable $x_i$, if the above equality holds for any $j$ and $k$ where $\tilde{E}_j$ and $\tilde{E}_k$ contain $x_i$, then a consensus assignment is reached for that variable, with the assignment value denoted as $\tilde{x}_i$. Moreover, if the above statement is true for every variable, we say that the minimization of all the $\tilde{E}_i$s reaches a solution consensus. The solution $\tilde{x}$ with $\tilde{x}_i$ as the value of variable $x_i$ is called a consensus solution.

As defined before, $\tilde{X}_i$ stands for the set of variables contained in the function $\tilde{E}_i(x)$. $\tilde{X}_i$ is a subset of the variables, i.e., $\tilde{X}_i \subseteq \{x_1, x_2, \ldots, x_n\}$. Let $\tilde{x}(\tilde{X}_i)$ stand for the restriction of a solution $\tilde{x}$ on $\tilde{X}_i$. Another way to recognize a consensus solution $\tilde{x}$ is to check if $\tilde{x}(\tilde{X}_i)$, for any $i$, is the global minimum of $\tilde{E}_i(x)$, i.e.,

$$\tilde{x}(\tilde{X}_i) = \arg\min_{x} \tilde{E}_i(x), \quad \text{for any } i .$$

Simply put, a solution is a consensus one if it is the global minimum of every modified sub-objective function.
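Checking for a solution consensus amounts to merging the per-agent minimizers and looking for conflicts. The helper below is a minimal sketch; its name, the dictionary representation, and the sample minimizers are hypothetical, and it assumes that each modified sub-objective function has a unique minimizer so that ties do not need to be resolved.

```python
def consensus_solution(minimizers):
    """minimizers[i] maps each variable index in the i-th modified
    sub-objective to its value in that function's optimal solution.
    Returns the merged consensus solution, or None if two agents disagree."""
    merged = {}
    for solution in minimizers:
        for var, value in solution.items():
            # setdefault stores the first value seen and returns the stored one.
            if merged.setdefault(var, value) != value:
                return None
    return merged

# Hypothetical minimizers of three modified sub-objective functions.
agree = [{1: 0, 2: 1, 5: 2}, {2: 1, 3: 0}, {3: 0, 5: 2}]
conflict = [{1: 0, 2: 1}, {2: 0, 3: 0}]  # the two agents disagree on x_2

print(consensus_solution(agree))
print(consensus_solution(conflict))
```

In a full implementation the minimizers would come from the minimization already performed in (3), so the check adds little cost per iteration.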

### IV-C Consensus Solution in Cooperative Optimization

The consensus solution is an important concept in cooperative optimization. If a consensus solution is found at some iteration or iterations, then we can bound how close its cost is to the cost of the global optimal solution. The following theorem from [33] makes this point clearer.

###### Theorem IV.3

Let

 E∗(k)−=n∑i=1Ψ(k)i(~x(k)i),% where ~x(k)i=argminxiΨ(k)i(xi) .

Given any propagation matrix , and the general initial condition , for each , or . If a consensus solution is found at iteration and remains the same from iteration to iteration , then the closeness between the cost of , , and the optimal cost, , satisfies the following two inequalities,

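$$0 \le E(\tilde{x}) - E^* \le \left( \prod_{k=k_1}^{k_2} \lambda_k \right) \left( E(\tilde{x}) - E_-^{*(k_1-1)} \right), \tag{16}$$

$$0 \le E(\tilde{x}) - E^* \le \frac{\prod_{k=k_1}^{k_2} \lambda_k}{1 - \prod_{k=k_1}^{k_2} \lambda_k} \left( E^* - E_-^{*(k_1-1)} \right), \tag{17}$$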

where $E^* - E_-^{*(k_1-1)}$ is the difference between the optimal cost $E^*$ and the lower bound on the optimal cost, $E_-^{*(k_1-1)}$, obtained at iteration $k_1 - 1$.

In particular, if $1 - \lambda_k \ge \epsilon > 0$, for $k \ge k_1$, when $k_2 - k_1 \to \infty$,

$$E(\tilde{x}) \to E^*.$$

That is, the consensus solution $\tilde{x}$ must be the global minimum of $E(x)$, i.e., $\tilde{x} = x^*$.
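As a rough numerical illustration of inequality (16), the sketch below evaluates its right-hand side for a given cooperation strength series. The function name `gap_bound_16` and the sample numbers are hypothetical; the only inputs assumed are the quantities that appear in (16): the values $\lambda_{k_1}, \ldots, \lambda_{k_2}$, the cost of the consensus solution, and the lower bound obtained at iteration $k_1 - 1$.

```python
from math import prod
from typing import Sequence


def gap_bound_16(lambdas: Sequence[float], consensus_cost: float, lower_bound: float) -> float:
    """Right-hand side of inequality (16): an upper bound on E(x~) - E*.

    lambdas        -- cooperation strengths for iterations k1..k2
    consensus_cost -- E(x~), the cost of the consensus solution
    lower_bound    -- the lower bound on E* obtained at iteration k1 - 1
    """
    if not all(0.0 <= lam < 1.0 for lam in lambdas):
        raise ValueError("each cooperation strength must lie in [0, 1)")
    return prod(lambdas) * (consensus_cost - lower_bound)


# Hypothetical numbers: the consensus solution costs 10, the earlier lower
# bound is 2, and the solution stays unchanged while lambda_k = 0.5.
print(gap_bound_16([0.5] * 2, consensus_cost=10.0, lower_bound=2.0))
print(gap_bound_16([0.5] * 10, consensus_cost=10.0, lower_bound=2.0))
```

Only (16) is evaluated here because its right-hand side uses quantities the algorithm already has; (17) involves the unknown optimal cost $E^*$.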

Consensus solution is also an important concept of cooperative optimization for defining global optimality conditions. The cooperative optimization theory tells us that a consensus solution can be the global minimal solution. As mentioned in the previous subsection, a basic cooperative optimization algorithm has one and only one equilibrium given a constant cooperation strength. If a cooperative optimization algorithm reaches an equilibrium after some number of iterations and a consensus solution is found at the same time, then the consensus solution must be the global minimal solution, guaranteed by theory. The following theorem (with its proof in the appendix) establishes the connection between a consensus solution and a global optimal solution.

###### Theorem IV.4

Assume that a basic cooperative optimization algorithm (3) reaches its equilibrium at some iteration, denoted as $\Psi^{(\infty)}$. That is, $\Psi^{(\infty)}$ is a solution to the difference equations (3). If a consensus solution $\tilde{x}$ is found at the same iteration, then it must be the global minimum of $E(x)$, i.e., $\tilde{x} = x^*$.

Besides the basic global optimality condition given in the above theorem, several more are offered in [33] for identifying global optimal solutions. The capability of recognizing global optima is a critical property for any optimization algorithm. Without a global optimality condition, it is hard for an optimization algorithm to know where to find global optimal solutions and whether a solution it has found is a global optimum. Finding ways of identifying global optima for any optimization algorithm is of both practical interest and theoretical importance.

### IV-D Further Generalization of Convergence Properties

The convergence theorem IV.3 can be generalized further to any initial conditions for $\Psi^{(0)}$ and $\lambda_1$, and to any cooperation strength series $\{\lambda_k\}_{k \ge 1}$. Dropping the restriction on the initial conditions $\Psi^{(0)}$ and $\lambda_1$ in the theorem, from the difference equations (3), we have

$$E^* - E_-^{*(k_2)} = \left( \prod_{k=k_1}^{k_2} \lambda_k \right) \left( E^* - E_-^{*(k_1-1)} \right). \tag{18}$$

It is obvious from the above equation that $E_-^{*(k)}$ still approaches $E^*$ exponentially with the rate $\lambda$ when the cooperation strength $\lambda_k$ is of a constant value $\lambda$ ($0 \le \lambda < 1$).
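To make the exponential rate explicit, here is a short worked instance of Eq. (18), assuming a constant cooperation strength $\lambda_k = \lambda$ for $k_1 \le k \le k_2$. The product then collapses to a single power:

$$E^* - E_-^{*(k_2)} = \lambda^{\,k_2 - k_1 + 1} \left( E^* - E_-^{*(k_1-1)} \right).$$

For example, with the illustrative value $\lambda = 1/2$ and ten iterations ($k_2 - k_1 + 1 = 10$), the gap between the optimal cost and its lower bound shrinks by a factor of $2^{-10} = 1/1024 \approx 9.8 \times 10^{-4}$.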

When the cooperation strength $\lambda_k$ is not of a constant value $\lambda$, the convergence to the global optimum is still guaranteed as long as the series $\sum_{k \ge 1} (1 - \lambda_k)$ is divergent.

###### Lemma IV.1 (Infinite Products)

Let $\{\lambda_k\}_{k \ge 1}$ be a sequence of numbers of the interval $(0, 1)$.

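1. If $\sum_{k=1}^{\infty} (1 - \lambda_k) < \infty$, then

   $$\lim_{n \to \infty} \prod_{k=1}^{n} \lambda_k > 0.$$
2. If $\sum_{k=1}^{\infty} (1 - \lambda_k) = \infty$, then

   $$\lim_{n \to \infty} \prod_{k=1}^{n} \lambda_k = 0.$$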

The proof of the lemma is offered in the Appendix.
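The two cases of the lemma can be illustrated numerically. The sketch below uses two sequences chosen purely for illustration (they do not come from the paper): $1 - \lambda_k = 1/(k+1)^2$, whose series converges, and $1 - \lambda_k = 1/(k+1)$, whose series diverges. Exact rational arithmetic is used so the partial products can be compared with their closed forms, $(n+2)/(2(n+1))$ and $1/(n+1)$ respectively.

```python
from fractions import Fraction
from typing import Callable


def partial_product(lam: Callable[[int], Fraction], n: int) -> Fraction:
    """Exact product of lam(k) for k = 1..n."""
    result = Fraction(1)
    for k in range(1, n + 1):
        result *= lam(k)
    return result


def lam_convergent(k: int) -> Fraction:
    # 1 - lambda_k = 1/(k+1)^2: the series of (1 - lambda_k) converges.
    return 1 - Fraction(1, (k + 1) ** 2)


def lam_divergent(k: int) -> Fraction:
    # 1 - lambda_k = 1/(k+1): the series of (1 - lambda_k) diverges.
    return 1 - Fraction(1, k + 1)


for n in (10, 100, 1000):
    # The first column stays above 1/2; the second decays toward zero.
    print(n, float(partial_product(lam_convergent, n)), float(partial_product(lam_divergent, n)))
```

In the first case the partial products decrease toward the positive limit $1/2$; in the second they vanish, which is the situation in which a consensus solution is guaranteed to be globally optimal.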

From the above lemma and Eq. (18), the convergence theorem IV.3 can be generalized further as follows.

###### Theorem IV.5

Given any initial conditions, assume that a consensus solution is found by a basic cooperative optimization algorithm at some iteration and remains the same in the following iterations. If the series

$$(1 - \lambda_1) + (1 - \lambda_2) + \ldots + (1 - \lambda_k) + \ldots \tag{19}$$

is divergent, then

$$E(\tilde{x}) = E^*.$$

That is, the consensus solution $\tilde{x}$ must be the global minimal solution $x^*$, $\tilde{x} = x^*$.

If $\lambda_k = 1 - 1/k$, for instance, the series (19) is the harmonic series,

$$1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{k} + \ldots$$

The harmonic series is divergent. Hence, with the choice of $\lambda_k = 1 - 1/k$, if a consensus solution is found at some iteration and it remains the same in the following iterations, it must be the global minimal solution $x^*$.
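As a worked check of the harmonic case, assume $\lambda_k = 1 - 1/k$ and a consensus solution that persists from iteration $k_1 \ge 2$ to iteration $k_2$. The product of cooperation strengths telescopes:

$$\prod_{k=k_1}^{k_2} \lambda_k = \prod_{k=k_1}^{k_2} \frac{k-1}{k} = \frac{k_1 - 1}{k_2} \;\longrightarrow\; 0 \quad \text{as } k_2 \to \infty.$$

The product therefore vanishes, in agreement with the second case of Lemma IV.1, but only at the rate $1/k_2$ rather than exponentially.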

If $\{1 - \lambda_k\}_{k \ge 1}$, as another example, is a convergent sequence with a positive limit, then the series $\sum_{k \ge 1} (1 - \lambda_k)$ is divergent. In this case, a consensus solution is also the global minimal solution. This statement can be generalized further to Cauchy sequences. Every convergent sequence is a Cauchy sequence, and every Cauchy sequence is bounded. Thus, if $\{1 - \lambda_k\}_{k \ge 1}$ is a Cauchy sequence with a positive lower bound, a consensus solution is the global minimal solution.

To maximize the performance of a cooperative optimization algorithm, it is common in experiments to progressively increase the cooperation strength as the iterations proceed. A weak cooperation level at the beginning leads to a fast convergence rate (see Theorem IV.2). A strong cooperation level at a later stage of the iterations increases the chance of finding a consensus solution. Theorem IV.5 offers some general guidance and justification for choosing a variable cooperation strength. It tells us that the cooperation strength should not be increased too fast if we want the guarantee that a consensus solution is the global optimal one.

## V General Canonical Form of Cooperative Optimization

By combining different forms of lower bound functions and different ways of decomposing objective functions, we can design cooperative optimization algorithms of different complexities and powers for attacking different optimization problems. The basic canonical form of cooperative optimization (7) can be generalized further in a straightforward way to the general canonical one as follows.

Consider a multivariate objective function $E(x_1, x_2, \ldots, x_n)$ of $n$ variables, or simply denoted as $E(x)$, where each variable is of a finite domain. Assume that $E(x)$ can be decomposed into $m$ sub-objective functions $E_i(x)$ which may satisfy the condition

$$E(x) = \sum_{i=1}^{m} E_i(x).$$

One may define another function $E_-(x)$, on the same set of variables as $E(x)$, as the composition of $m$ component functions as follows,

$$E_-(x) = \sum_{i=1}^{m} \Psi_i(x_i),$$

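where $\Psi_i(x_i)$ is a component function defined on a subset of variables $X'_i$, $X'_i \subseteq \{x_1, x_2, \ldots, x_n\}$, for $i = 1, 2, \ldots, m$. $x_i$ is the restriction of $x$ on $X'_i$, denoted as $x_i = x(X'_i)$.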

A cooperative optimization algorithm of the general canonical form is defined as minimizing the sub-objective functions in the following iterative and cooperative way,

$$\Psi_i^{(k)}(x_i) = \min_{X_i \setminus X'_i} \left( E_i(x) + \lambda_k \sum_{j=1}^{m} w_{ij} \Psi_j^{(k-1)}(x_j) \right), \tag{20}$$

for $i = 1, 2, \ldots, m$. In the equations, $k$ is the iteration step; $X_i$ is the set of variables contained in the functions at the right side of the $i$-th equation; $\lambda_k$ is a real value parameter at iteration $k$ satisfying $\lambda_k \ge 0$; and $w_{ij}$ are also real value parameters satisfying $w_{ij} \ge 0$.

The solution at iteration $k$ is defined as

$$\tilde{x}^{(k)} = \arg\min_{x} E_-^{(k)}(x).$$

Moreover, $\tilde{x}^{(k)}$ is called a consensus solution if it is the conditional optimum of all the minimization problems defined in (20). That is,

$$\tilde{x}^{(k)}(X_i) = \arg\min_{x(X_i)} \left( E_i(x) + \lambda_k \sum_{j=1}^{m} w_{ij} \Psi_j^{(k-1)}(x_j) \right),$$

when and .

One may choose the parameters $\lambda_k$ and $w_{ij}$ in such a way that they further satisfy the conditions of $\sum_{i} w_{ij} = 1$, for all $j$s, and all $\lambda_k$s are less than one ($\lambda_k < 1$). With these settings, if the algorithm reaches its equilibrium at some iteration and the solution of the iteration is also a consensus one, then it must be the global minimal solution (this global optimality condition can be proved in exactly the same way as that of Theorem IV.4).

The general canonical form can be further generalized to variable propagation matrices, variable forms of lower bound functions, and variable ways of decomposing objective functions.

## VI Design Issues

A basic cooperative optimization algorithm (3) (or a basic canonical one (7)) is uniquely defined by the objective function decomposition $\{E_i\}$, the cooperation strength series $\{\lambda_k\}_{k \ge 1}$, and the propagation matrix $W = (w_{ij})$. Some general guidelines for designing the cooperation strength series have been discussed in the previous section. This section focuses on the remaining two.

### VI-A Objective Function Decomposition

#### VI-A1 Constraint Optimization Problems

A large class of optimization problems have objective functions of the following form,

$$E(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} f_i(x_i) + \sum_{(i,j) \in \mathcal{N}} f_{ij}(x_i, x_j). \tag{21}$$

The function $f_i$ is a unary function on variable $x_i$, for