Towards time-varying proximal dynamics in Multi-Agent Network Games

11/11/2018 · by Carlo Cenedese et al. · University of Groningen and Delft University of Technology

Distributed decision making in multi-agent networks has recently attracted significant research attention thanks to its wide applicability, e.g. in the management and optimization of computer networks, power systems, robotic teams, sensor networks and consumer markets. Distributed decision-making problems can be modeled as inter-dependent optimization problems, i.e., multi-agent game-equilibrium seeking problems, where noncooperative agents seek an equilibrium by communicating over a network. To achieve a network equilibrium, the agents may decide to update their decision variables via proximal dynamics, driven by the decision variables of the neighboring agents. In this paper, we provide an operator-theoretic characterization of convergence with a time-invariant communication network. For the time-varying case, we consider adjacency matrices that may switch subject to a dwell time. We illustrate our investigations using a distributed robotic exploration example.

I Introduction

I-A Motivation: Multi-agent decision making over networks

Multi-agent decision making over networks is currently a vibrant research area in the systems-and-control community, with applications in several relevant domains, such as smart grids [1, 2], traffic and information networks [3, 4], social networks [5, 6], consensus and flocking groups [7, 8], and robotic and sensor networks [9, 10].

The main advantage of distributed computation and communication is that each decision maker, in short, agent, can keep its own data private and exchange information with selected agents only. Essentially, in networked multi-agent systems, the state (or decision) variables of each agent evolve as a result of local decision making, e.g., local constrained optimization, and distributed communication with some neighboring agents, via a communication graph. Typically, the aim of the agents is to reach a collective equilibrium state, where no agent can benefit from further updating its state variables.

I-B Literature overview: Multi-agent optimization and multi-agent network games

Multi-agent dynamics for solving a set of inter-dependent optimization problems arise naturally from distributed optimization and distributed equilibrium seeking in network games. Multi-agent convex constrained optimization has been widely studied in the last decade: in [11] with uniformly bounded subgradients, and either homogeneous constraint sets or time-invariant, complete communication graphs with uniform weights; in [12] with differentiable cost functions with Lipschitz continuous and uniformly bounded gradients; and, more generally, in [13], where convergence is proven via vanishing step sizes. Network games among agents with convex compact local constraints have been considered before: in [14] with strongly convex quadratic cost functions and a time-invariant communication graph; in [15], [16], with strictly convex, differentiable cost functions with Lipschitz continuous gradients, and undirected, possibly time-varying, communication graphs; and in [2] with general local convex costs and quadratic proximal terms, time-invariant and time-varying communication graphs, subject to technical restrictions. The common feature in multi-agent optimization and games over networks is the presence of a structured, possibly time-varying, communication graph. Therefore, it is interesting to design multi-agent dynamics that involve distributed computation and structured information exchange.

I-C Contribution of the paper

In this paper, we consider proximal dynamics in multi-agent network games with both time-invariant and time-varying communication graphs. In the time-invariant case, we show that global convergence of proximal dynamics holds if the adjacency matrix of the communication graph, assumed strongly connected, is row-stochastic with strictly-positive diagonal elements. Technically, we extend the convergence result in [2, Th. 1]. The use of a row-stochastic matrix is highly relevant in applications: it allows an agent to communicate with its neighbors without requiring an adjustment of the rest of the network. In the time-varying case, we consider switching adjacency matrices subject to a certain dwell time and we show global convergence of the proximal dynamics under switching with sufficiently large dwell time. For testing the derived sufficient conditions, we provide linear matrix inequalities.

I-D Organization of the paper

The paper is organized as follows: Section II presents an illustrative multi-robot exploration scenario; Section III formalizes the problem setup. We introduce the convergence result for time-invariant multi-agent proximal dynamics in Section IV and for time-varying, dwell-time switched, proximal dynamics in Section V. A numerical simulation of the considered dynamics is presented in Section VI. Finally, we conclude the paper in Section VII, where we discuss future research directions.

I-E Basic notation

The sets of real, positive, and non-negative numbers are denoted by $\mathbb{R}$, $\mathbb{R}_{>0}$, $\mathbb{R}_{\geq 0}$, respectively; $\overline{\mathbb{R}} := \mathbb{R} \cup \{\infty\}$. The set of natural numbers is denoted by $\mathbb{N}$, and for $a, b \in \mathbb{N}$ with $a \leq b$, we define $\mathbb{N}[a, b] := \{a, a+1, \ldots, b\}$. For a square matrix $A$, its transpose is denoted by $A^{\top}$. The identity matrix is denoted by $I_n$. For $x_1, \ldots, x_N \in \mathbb{R}^n$, a collective vector $[x_1^{\top}, \ldots, x_N^{\top}]^{\top}$ is simply described as $\boldsymbol{x} := \operatorname{col}(x_1, \ldots, x_N)$. For two matrices $A$ and $B$, $A \otimes B$ denotes their Kronecker product. For vectors $x, y \in \mathbb{R}^n$ and a symmetric and positive definite matrix $P \succ 0$, the weighted inner product and norm are denoted by $\langle x, y \rangle_P := x^{\top} P y$ and $\| x \|_P := \sqrt{x^{\top} P x}$, respectively; the induced matrix norm is denoted by $\| A \|_P$. For $P = I_n$, the standard inner product, Euclidean norm, and Frobenius norm are obtained. A real $n$-dimensional Hilbert space obtained by endowing $\mathbb{R}^n$ with the product $\langle \cdot, \cdot \rangle_P$ is denoted by $\mathcal{H}_P$.

I-F Operator-theoretic notation

For a function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$, define $\operatorname{dom}(f) := \{ x \in \mathbb{R}^n \mid f(x) < \infty \}$. The subdifferential is defined by $\partial f(x) := \{ v \in \mathbb{R}^n \mid f(z) \geq f(x) + \langle v, z - x \rangle, \ \forall z \in \operatorname{dom}(f) \}$. The proximal operator in $\mathcal{H}_P$ is defined by $\operatorname{prox}^{P}_{f}(x) := \operatorname{argmin}_{y \in \mathbb{R}^n} f(y) + \tfrac{1}{2} \| x - y \|_P^2$; for $P = I_n$ we simply write $\operatorname{prox}_f$. The resolvent of an operator $\mathcal{A}$ is $J_{\mathcal{A}} := (\operatorname{Id} + \mathcal{A})^{-1}$. The indicator function $\iota_C$ of $C \subseteq \mathbb{R}^n$ is defined as $\iota_C(x) := 0$ if $x \in C$; $\iota_C(x) := \infty$ otherwise. The identity operator is defined by $\operatorname{Id}(x) := x$. The Euclidean distance and the Euclidean distance weighted by $P$ of a point $x$ to a set $C$ are, respectively, $\operatorname{dist}(x, C) := \inf_{y \in C} \| x - y \|$ and $\operatorname{dist}_P(x, C) := \inf_{y \in C} \| x - y \|_P$.
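As a complement to these definitions, the following is a minimal numerical sketch (our illustration, not part of the paper) of how the proximal operator of a convex function plus a box indicator can be evaluated with an off-the-shelf solver; the function `f` and the box bounds are hypothetical placeholders.

```python
import numpy as np
from scipy.optimize import minimize

def prox(f, x, bounds, P=None):
    """Numerically evaluate prox_{f + iota_Omega}(x) in the P-weighted norm,
    where Omega is the box described by `bounds` (list of (lo, hi) pairs)."""
    n = x.size
    P = np.eye(n) if P is None else P
    objective = lambda y: f(y) + 0.5 * (y - x) @ P @ (y - x)
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    # Warm-start from the projection of x onto the box.
    res = minimize(objective, x0=np.clip(x, lo, hi), bounds=bounds)
    return res.x

# Example with a hypothetical quadratic f and the unit box around the origin.
target = np.array([2.0, 1.0])
print(prox(lambda y: np.sum((y - target) ** 2), np.zeros(2), [(-1.0, 1.0), (-1.0, 1.0)]))
```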

II Motivating, illustrative scenario: Multi-robot exploration

We motivate the paper by the problem of distributed exploration performed by mobile robots. Let the two-dimensional position of each robot $i$ at time $k$ be $p_i(k) \in \mathbb{R}^2$ and denote its neighbors indexed by $j \in \mathcal{N}_i$. To each robot $i$, we associate a local cost function composed of two separate terms: the local target function $f_i$ and the aggregation term $\tfrac{1}{2} \| p_i - \sigma_i \|^2$, where $\sigma_i$ denotes the weighted average of the positions of its neighbors $j \in \mathcal{N}_i$. The first term penalizes the distance of the robot from its target position and, by construction, attains its minimum at the target position, see Fig. 1(a), 1(c). The second term penalizes the distance of the position $p_i$ from $\sigma_i$, hence it induces the robots to stay together during their motion, see Fig. 1(b), 1(d). Each robot is assumed to be rational, namely, willing to determine its motion with the aim of minimizing its cost function. Overall, the robots shall reach a collective equilibrium state, which we call network equilibrium.

Fig. 1: (a) Agent $i$ (blue disk), its target position (red star), and other robots (light blue circles); (b) the robots that are neighbors of robot $i$ and the weighted average $\sigma_i$ (red square); (c) level sets of the local target function $f_i$; (d) level sets of the aggregation term.

The resulting motion-planning problem can be interpreted as a game between all the robots involved in the exploration. In fact, the equilibrium points correspond to trade-offs between reaching the target positions and preserving closeness among robots. For simplicity, in this illustrative example, collision avoidance between robots is not taken into account. One possible simple structure for the (discrete-time) dynamics of each robot in the above setup is then

$p_i(k+1) \in \operatorname{argmin}_{v \in \Omega_i} \; f_i(v) + \tfrac{1}{2} \, \| v - \sigma_i(k) \|^2, \qquad (1)$

where the set $\Omega_i$ represents the motion constraints of robot $i$. Whether or not the dynamics in (1) converge to an equilibrium is unclear a priori, especially if the set of neighbors, $\mathcal{N}_i$, is time-varying. With this distributed robotic setup in mind, in the following, we address the convergence problem via an operator-theoretic perspective.
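For concreteness, here is a minimal sketch of one synchronous round of the update in (1), under the illustrative assumptions (not from the paper) that $f_i(v) = c_i \| v - \bar{p}_i \|^2$ for a target $\bar{p}_i$ and that $\Omega_i$ is a square centered at the current position; with these choices the minimizer has a closed form, and the box constraint reduces to a componentwise clamp.

```python
import numpy as np

def robot_update(p, targets, A, c, half_edge):
    """One synchronous round of the proximal update (1) for all robots.

    p         : (N, 2) current positions
    targets   : (N, 2) target positions (hypothetical quadratic targets)
    A         : (N, N) row-stochastic adjacency matrix
    c         : (N,)  weights of the target term
    half_edge : half edge length of the square motion constraint around p_i
    """
    sigma = A @ p                                    # weighted neighbor averages
    # Unconstrained argmin of c_i ||v - target||^2 + 0.5 ||v - sigma||^2.
    v = (2 * c[:, None] * targets + sigma) / (2 * c[:, None] + 1)
    lo, hi = p - half_edge, p + half_edge            # square constraint sets
    return np.clip(v, lo, hi)                        # exact projection (separable quadratic)
```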

III Technical setup and problem formulation

We consider a network of $N$ agents, where the state of each agent $i \in \mathbb{N}[1, N]$ is denoted by $x_i \in \Omega_i \subseteq \mathbb{R}^n$, and the set $\Omega_i$ coincides with the feasible states of agent $i$. To compute its next state variable, each agent relies on the states of some neighboring agents. In turn, a network structure arises, described by a weighted digraph. Let us represent the communication between agents by the following weighted adjacency matrix:

$A := \begin{bmatrix} a_{1,1} & \cdots & a_{1,N} \\ \vdots & \ddots & \vdots \\ a_{N,1} & \cdots & a_{N,N} \end{bmatrix}, \qquad (2)$

where $a_{i,j} \geq 0$ is the weight that agent $i$ assigns to the state of agent $j$. If $a_{i,j} = 0$, then the state of agent $i$ is independent of that of agent $j$. Furthermore, we assume that each agent $i$ aims at minimizing a cost function $J_i(x_i, \boldsymbol{x}_{-i})$.

Throughout the paper, we assume compactness and convexity of the local constraint set $\Omega_i$ and convexity (not necessarily strict convexity) of the local cost function $J_i$.

Assumption 1 (Local constraints)

For each agent $i \in \mathbb{N}[1, N]$, the set $\Omega_i \subset \mathbb{R}^n$ is non-empty, compact and convex.

Assumption 2 (Local cost functions)

For each agent $i \in \mathbb{N}[1, N]$, the local cost function $J_i$ is defined by

$J_i(x_i, \boldsymbol{x}_{-i}) := f_i(x_i) + \tfrac{1}{2} \, \big\| x_i - \textstyle\sum_{j=1}^{N} a_{i,j} \, x_j \big\|_{Q_i}^2 \qquad (3)$

for some symmetric matrix $Q_i \succ 0$, where $f_i : \mathbb{R}^n \to \overline{\mathbb{R}}$ is a lower semi-continuous and convex function.

In (3), the function $f_i$ is local to agent $i$. For example, it can represent the distance from a desired state. The quadratic term penalizes the distance between the state of agent $i$ and the weighted average among the states of its neighbors. We emphasize that Assumption 2 requires neither the differentiability of the local cost function, nor the Lipschitz continuity or boundedness of its gradient.

Given the above problem setup, and since we assume the agents are rational, i.e., willing to minimize their individual cost functions, we consider the following notion of collective equilibrium state, called network equilibrium.

Definition 1 (Network equilibrium)

A collective vector $\bar{\boldsymbol{x}} := \operatorname{col}(\bar{x}_1, \ldots, \bar{x}_N)$ is a network equilibrium (NWE) if, for all $i \in \mathbb{N}[1, N]$,

$\bar{x}_i \in \operatorname{argmin}_{v \in \Omega_i} \; f_i(v) + \tfrac{1}{2} \, \big\| v - \textstyle\sum_{j=1}^{N} a_{i,j} \, \bar{x}_j \big\|_{Q_i}^2. \qquad (4)$

We recall that if there are no self-loops in the adjacency matrix, i.e., $a_{i,i} = 0$ for all $i \in \mathbb{N}[1, N]$, then an NWE corresponds to a Nash equilibrium [2, Remark 1]. Under Assumptions 1 and 2, an NWE always exists.

The problem studied in this paper is then that of seeking an NWE (Definition 1), namely, of ensuring convergence to an NWE from any initial condition. Clearly, in the time-varying case, the definition of NWE is more involved; let us postpone it to Section V.

IV Time-invariant proximal dynamics

As mentioned in the previous section, we assume that each agent is rational and noncooperative. Therefore, it is natural to consider the following proximal dynamics for each agent $i \in \mathbb{N}[1, N]$:

$x_i(k+1) = \operatorname{prox}^{Q_i}_{f_i + \iota_{\Omega_i}} \Big( \textstyle\sum_{j=1}^{N} a_{i,j} \, x_j(k) \Big). \qquad (5)$

In collective vector form, namely, for the collective vector $\boldsymbol{x} := \operatorname{col}(x_1, \ldots, x_N)$, the dynamics from (5) read as

$\boldsymbol{x}(k+1) = \operatorname{prox}^{\boldsymbol{Q}}_{\boldsymbol{g}} \big( \boldsymbol{A} \, \boldsymbol{x}(k) \big), \qquad (6)$

where $\boldsymbol{Q}$ stands for the block-diagonal matrix

$\boldsymbol{Q} := \operatorname{diag}(Q_1, \ldots, Q_N), \qquad (7)$

the matrix $\boldsymbol{A} := A \otimes I_n$ represents the interactions among agents, and the mapping $\operatorname{prox}^{\boldsymbol{Q}}_{\boldsymbol{g}}$ is a block-diagonal proximal operator, i.e.,

$\operatorname{prox}^{\boldsymbol{Q}}_{\boldsymbol{g}}(\boldsymbol{z}) := \operatorname{col}\big( \operatorname{prox}^{Q_1}_{g_1}(z_1), \ldots, \operatorname{prox}^{Q_N}_{g_N}(z_N) \big), \qquad (8)$

with $g_i := f_i + \iota_{\Omega_i}$ for each $i \in \mathbb{N}[1, N]$.

With the introduced notation, a collective vector $\bar{\boldsymbol{x}}$ is an NWE if and only if $\bar{\boldsymbol{x}} \in \operatorname{fix}\big( \operatorname{prox}^{\boldsymbol{Q}}_{\boldsymbol{g}} \circ \boldsymbol{A} \big)$, where $\operatorname{fix}(\cdot)$ stands for the set of fixed points of the operator in its argument. Under Assumptions 1 and 2, $\operatorname{fix}\big( \operatorname{prox}^{\boldsymbol{Q}}_{\boldsymbol{g}} \circ \boldsymbol{A} \big)$ is non-empty [17, Th. 4.1.5(b)], and the convergence problem is well posed.

Therefore, from an operator-theoretic perspective, the proximal dynamics in (6) are the so-called Banach–Picard iteration for the mapping $\operatorname{prox}^{\boldsymbol{Q}}_{\boldsymbol{g}} \circ \boldsymbol{A}$ [18, Eq. 1.69].
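As a minimal computational sketch (our illustration, with generic per-agent prox oracles standing in for $\operatorname{prox}^{Q_i}_{g_i}$), the Banach–Picard iteration for (6) can be written as follows.

```python
import numpy as np

def banach_picard(prox_blocks, A_coll, x0, num_iters=500, tol=1e-8):
    """Iterate x(k+1) = prox(A x(k)) until the fixed-point residual is small.

    prox_blocks : list of per-agent prox oracles, prox_blocks[i](z_i) -> x_i
    A_coll      : (N*n, N*n) collective matrix A (Kronecker) I_n
    x0          : (N*n,) stacked initial condition col(x_1, ..., x_N)
    """
    N = len(prox_blocks)
    x = x0.copy()
    n = x.size // N
    for _ in range(num_iters):
        z = A_coll @ x                                # structured information exchange
        x_next = np.concatenate([prox_blocks[i](z[i * n:(i + 1) * n]) for i in range(N)])
        if np.linalg.norm(x_next - x) < tol:          # fixed-point residual
            return x_next
        x = x_next
    return x
```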

We assume that the adjacency matrix is row-stochastic with self-loops, and marginally stable, as formalized next.

Assumption 3 (Row-stochasticity and self-loops)

The communication graph is strongly connected. The matrix $A$ in (2) is row-stochastic, i.e., $a_{i,j} \geq 0$ for all $i, j \in \mathbb{N}[1, N]$, and $\sum_{j=1}^{N} a_{i,j} = 1$ for any $i \in \mathbb{N}[1, N]$. Moreover, $A$ has strictly-positive diagonal elements, i.e., $a_{i,i} > 0$ for all $i \in \mathbb{N}[1, N]$.
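The conditions of Assumption 3 are straightforward to verify numerically; the following sketch (our illustration) checks row-stochasticity, strictly-positive diagonal elements, and strong connectivity of the digraph induced by the nonzero pattern of $A$.

```python
import numpy as np

def check_assumption_3(A, tol=1e-9):
    """Check row-stochasticity, strictly-positive diagonal, strong connectivity."""
    N = A.shape[0]
    row_stochastic = np.all(A >= -tol) and np.allclose(A.sum(axis=1), 1.0)
    positive_diag = np.all(np.diag(A) > tol)
    # (I + M)^(N-1) has no zero entry iff the digraph with nonzero pattern M
    # is strongly connected.
    reach = np.linalg.matrix_power(np.eye(N) + (A != 0), N - 1)
    strongly_connected = np.all(reach > 0)
    return row_stochastic and positive_diag and strongly_connected
```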

Now, we are ready to introduce the first result of this paper about the convergence of the proximal dynamics in (6).

Lemma 1 (Global convergence)

Suppose that Assumptions 1–3 hold. There always exists a matrix $\boldsymbol{Q} \succ 0$ such that, for any initial condition $\boldsymbol{x}(0) \in \Omega_1 \times \cdots \times \Omega_N$, the sequence generated by (6) with such $\boldsymbol{Q}$ converges to an NWE.

Remark 1

Lemma 1 extends [2, Th. 1], since $A$ is only assumed to be row-stochastic. Note that if the matrix $\boldsymbol{Q}$ can be chosen block-diagonal, then the operator $\operatorname{prox}^{\boldsymbol{Q}}_{\boldsymbol{g}} \circ \boldsymbol{A}$ defines fully distributed dynamics.

From the practical point of view, the matrix $\boldsymbol{Q}$ in Lemma 1 can be computed as $\boldsymbol{Q} = P \otimes I_n$, where the matrix $P \succ 0$ solves the following LMI:

(9)

for some $\epsilon > 0$. We have in fact the following result.

Proposition 1

Let $\epsilon > 0$. If the LMI in (9) holds, then the sequence generated by (6), with $\boldsymbol{Q} = P \otimes I_n$ and $P$ a solution to (9), converges to an NWE.

It follows from Remark 1 and Proposition 1 that, in order to obtain fully distributed dynamics in (6), one shall solve the LMI in (9) with a diagonal matrix $P$.

V Towards time-varying proximal dynamics

V-A Time-varying setup

In the previous section, we have assumed that the communication network of the agents is the same at all time instants $k \in \mathbb{N}$. In practical situations, however, not all agents can update their strategies at the same time instants. More generally, the communication network can change over time. To address time-varying scenarios, in this subsection, we consider a time-varying communication matrix, i.e.,

$A(k) := \begin{bmatrix} a_{1,1}(k) & \cdots & a_{1,N}(k) \\ \vdots & \ddots & \vdots \\ a_{N,1}(k) & \cdots & a_{N,N}(k) \end{bmatrix}, \qquad (10)$

hence the collective adjacency matrix $\boldsymbol{A}(k) := A(k) \otimes I_n$.

For simplicity, in the remainder of the paper, we assume that the set of available communication networks is finite.

Assumption 4 (Finite number of adjacency matrices)

There exists a finite set $\{A^1, \ldots, A^M\}$ such that $A(k) \in \{A^1, \ldots, A^M\}$ for all $k \in \mathbb{N}$, where each matrix $A^m$ satisfies Assumption 3.

To describe the corresponding dynamics, we introduce a switching signal $\sigma : \mathbb{N} \to \mathbb{N}[1, M]$ that at each time step selects an adjacency matrix. Thus, in compact form, we have the following switching dynamics:

$\boldsymbol{x}(k+1) = T_{\sigma(k)}\big( \boldsymbol{x}(k) \big), \qquad (11)$

where $T_m := \operatorname{prox}^{\boldsymbol{Q}^m}_{\boldsymbol{g}} \circ \boldsymbol{A}^m$, with $\boldsymbol{Q}^m := P^m \otimes I_n$ and $\boldsymbol{A}^m := A^m \otimes I_n$,
for all $m \in \mathbb{N}[1, M]$. Moreover, since we are interested in distributed dynamics, we assume that the matrices $P^m$ are diagonal.

Assumption 5

For each $m \in \mathbb{N}[1, M]$, there exists a diagonal matrix $P^m$ that satisfies Lemma 1.

As anticipated in the previous section, for the time-varying case, we need to generalize the concept of an NWE to what we call a persistent network equilibrium.

Definition 2 (Persistent network equilibrium)

A persistent network equilibrium (PNWE) is a collective vector in the set

$\bigcap_{m=1}^{M} \operatorname{fix}(T_m). \qquad (12)$
Assumption 6 (Existence of PNWE)

The set of PNWE is assumed to be non-empty, i.e., $\bigcap_{m=1}^{M} \operatorname{fix}(T_m) \neq \varnothing$.

We recall that, in general, even if all the $T_m$'s are stable, i.e., correspond to averaged mappings, the switching system in (11) can be unstable for some switching sequences. Therefore, we shall use tools from switching systems to claim convergence to a PNWE. In the next subsection, we focus on the dwell-time approach to establish global convergence.

V-B Proximal dynamics with dwell time

With the aim of studying global convergence of time-varying proximal dynamics, let us introduce the concept of dwell time [19, 20, 21].

Definition 3

A natural number $\tau \in \mathbb{N}$ is called a dwell time if the switching times $\{ k_\ell \}_{\ell \in \mathbb{N}}$ of the signal $\sigma$ satisfy $k_{\ell+1} - k_\ell \geq \tau$, for all $\ell \in \mathbb{N}$.
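For concreteness, the following sketch (our illustration; the mode-holding rule is an arbitrary choice) generates a switching signal that respects a given dwell time $\tau$.

```python
import numpy as np

def dwell_time_signal(num_steps, num_modes, tau, seed=0):
    """Generate sigma(0), ..., sigma(num_steps-1) in {0, ..., num_modes-1}
    whose switching times are separated by at least tau >= 1 steps."""
    rng = np.random.default_rng(seed)
    sigma, k = [], 0
    while k < num_steps:
        mode = int(rng.integers(num_modes))   # pick the next active mode
        hold = tau + int(rng.integers(tau))   # hold it for at least tau steps
        sigma.extend([mode] * min(hold, num_steps - k))
        k += hold
    return np.array(sigma[:num_steps])
```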

Before we can establish convergence of the proximal dynamics with switching, we impose two technical assumptions, namely that all the operators $T_m$ are linearly regular [22, Def. 2.1] and that every mode is activated infinitely often.

Assumption 7

Assumptions 1, 2 and 4 hold and, for each $m \in \mathbb{N}[1, M]$, the operator $T_m$ is linearly regular on $\Omega_1 \times \cdots \times \Omega_N$.

Assumption 8 (Infinite switching)

For all $m \in \mathbb{N}[1, M]$, the switching signal $\sigma$ is such that $\sigma(k) = m$ infinitely many times as $k \to \infty$.

We are now ready to present a global convergence result for the switching proximal dynamics in (11) to a PNWE, provided that the dwell time is chosen large enough.

Theorem 1 (Global convergence under dwell-time)

Let Assumptions 6–8 hold. For any initial condition $\boldsymbol{x}(0) \in \Omega_1 \times \cdots \times \Omega_N$, the sequence generated by (11) converges to a PNWE if the dwell time $\tau$ is chosen large enough.
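A minimal sketch (our illustration) of running the switched dynamics (11) along a given switching signal; the mode operators are passed in as generic callables, e.g., built from the per-agent prox oracles above.

```python
import numpy as np

def switched_proximal(T_modes, sigma, x0):
    """Run x(k+1) = T_{sigma(k)}(x(k)) along a switching signal.

    T_modes : list of mode operators T_m, each mapping R^{N*n} -> R^{N*n}
    sigma   : integer array; sigma[k] selects the active mode at step k
    x0      : stacked initial condition col(x_1, ..., x_N)
    """
    x = x0.copy()
    trajectory = [x.copy()]
    for k in range(len(sigma)):
        x = T_modes[sigma[k]](x)      # proximal step through the active network
        trajectory.append(x.copy())
    return np.array(trajectory)
```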

Remark 2

A lower bound for the dwell time $\tau$ can be obtained in the form

(13)

where $\lambda_{\min}$ and $\lambda_{\max}$ are, respectively, the minimum and the maximum eigenvalues of the matrices $P^m$, $m \in \mathbb{N}[1, M]$. The remaining parameters are described in the Appendix.

VI Numerical simulations

We return to the setting described in Section II, namely, the problem of distributed exploration performed by a network of mobile robots. In the following, we apply the results of Section IV to solve this task in the time-invariant case.

VI-1 Simulation setup

We have $N = 4$ agents in the game, where each agent is a mobile robot and its state $x_i(k) \in \mathbb{R}^2$ is its position in the plane at time $k$, hence $n = 2$. The robots are able to move in all directions from their current positions, but the maximum range of movement is limited to a square $\Omega_i$, centred at the current position and of fixed edge length. The weighted adjacency matrix $A$ in (2) and its collective counterpart $\boldsymbol{A} = A \otimes I_2$ are as follows:

The matrix $A$ satisfies Assumption 3. One can compute the matrix $P$ by solving the LMI in (9) and imposing a diagonal structure. The solution is

(14)

The cost function of each agent is defined according to Assumption 2, as

$J_i(x_i, \boldsymbol{x}_{-i}) := c_i \, \| x_i - \bar{p}_i \|^2 + \tfrac{1}{2} \, \| x_i - \sigma_i \|^2_{Q_i}, \qquad (15)$

where $Q_i$ is the block present in the diagonal of $\boldsymbol{Q}$, $\bar{p}_i$ the target position of the agent, $c_i$ the “discover” parameter, and $\sigma_i := \sum_{j=1}^{N} a_{i,j} \, x_j$ the aggregative term. The parameter $c_i$ defines how much a robot prefers to achieve its goal position instead of staying close to the others.

From the collective dynamics in (6), the explicit update rule is obtained by applying a forward-backward splitting to the operator, obtaining

$x_i(k+1) = \operatorname{proj}_{\Omega_i}\Big( x_i(k) - \gamma \big( 2 c_i \, ( x_i(k) - \bar{p}_i ) + q_i \, ( x_i(k) - \sigma_i(k) ) \big) \Big), \qquad (16)$

where $\sigma_i(k) := \sum_{j=1}^{N} a_{i,j} \, x_j(k)$, the $q_i$'s are the diagonal elements of $P$, $\gamma > 0$ is the step size of the dynamics, and $\operatorname{proj}_{\Omega_i}$ is the projection operator onto the set $\Omega_i$.
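A minimal sketch (our illustration, consistent with the reconstruction of (16) above; the exact constants in the gradient should be taken with care) of one forward-backward step for all robots at once:

```python
import numpy as np

def fb_step(x, targets, A, c, q, gamma, half_edge):
    """One forward-backward step of (16): gradient of the local costs,
    then projection onto the square constraint sets.

    x, targets : (N, 2) current and target positions
    A          : (N, N) row-stochastic adjacency matrix
    c, q       : (N,) discover parameters and diagonal elements of P
    gamma      : step size
    half_edge  : half edge length of the squares Omega_i around x_i
    """
    sigma = A @ x                                        # aggregative terms
    grad = 2 * c[:, None] * (x - targets) + q[:, None] * (x - sigma)
    x_new = x - gamma * grad                             # forward (gradient) step
    return np.clip(x_new, x - half_edge, x + half_edge)  # backward (projection) step
```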

VI-2 Simulation results

In the numerical simulation, we use the following values:

  • initial positions $x_i(0)$, $i \in \mathbb{N}[1, 4]$;

  • desired final positions $\bar{p}_i$, $i \in \mathbb{N}[1, 4]$;

  • edge length of the local constraint sets $\Omega_i$;

  • the same “discover” parameter for all the agents, i.e., $c_i = c$ for all $i \in \mathbb{N}[1, 4]$;

  • a small step size $\gamma$.

The trajectories of the four robots are shown in Fig. 2, where the desired position of each robot is represented by dashed circles with the same color as the robot.

Fig. 2: Trajectories of the robots generated by the dynamics in (16). The initial and desired positions are $x_i(0)$ and $\bar{p}_i$, respectively; the latter are represented by concentric dashed circles.

A closer look at the matrix $A$ clarifies the behavior of each robot. The first two robots weight mostly their relative positions and, since their targets are relatively close to each other, they converge very close to their desired locations. The remaining two robots must instead adapt their motion also with respect to that of robots 1 and 2, in order to reduce the effect of the proximity term in their cost functions. This leaves the final positions of these robots distant from their targets.

VI-3 Obstacle Avoidance

Next, we deal with an obstacle avoidance problem, which can be handled via the same setup, with the only difference that the local constraint sets $\Omega_i$ shall be modified. When agent $i$ approaches the obstacle, its local constraint set is formulated such that the future robot position cannot collide with the obstacle. More precisely, we define $\mathcal{O}$ as the set of points covered by the obstacle, and $\hat{\Omega}_i$ as the local constraint set that the agent would have without the obstacle. Finally, we define the set $\Omega_i$ as the largest convex subset of $\hat{\Omega}_i \setminus \mathcal{O}$.
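Computing the largest convex subset exactly can be involved; the following sketch (our illustration, a heuristic rather than the paper's construction) keeps the largest axis-aligned sub-box of $\hat{\Omega}_i$ that avoids a rectangular obstacle, which is one convenient convex subset.

```python
import numpy as np

def box_minus_obstacle(box, obs):
    """Largest of four axis-aligned sub-boxes of `box` that avoid rectangle `obs`.

    box, obs : (lo, hi) pairs of length-2 numpy arrays (axis-aligned rectangles).
    """
    (blo, bhi), (olo, ohi) = box, obs
    if np.any(ohi <= blo) or np.any(olo >= bhi):
        return box                                 # the obstacle misses the box
    candidates = []
    for ax in range(2):
        below_hi = bhi.copy()
        below_hi[ax] = max(blo[ax], olo[ax])       # slab below the obstacle
        candidates.append((blo.copy(), below_hi))
        above_lo = blo.copy()
        above_lo[ax] = min(bhi[ax], ohi[ax])       # slab above the obstacle
        candidates.append((above_lo, bhi.copy()))
    area = lambda b: max(b[1][0] - b[0][0], 0.0) * max(b[1][1] - b[0][1], 0.0)
    return max(candidates, key=area)
```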

In Fig. 3, we show a resulting trajectory, run with the same parameters as the previous simulation. Each agent successfully avoids the obstacle, and the final positions almost coincide with those of the previous simulation.

Fig. 3: Trajectories of the robots generated by the dynamics in (16), performing obstacle avoidance. The obstacle is represented by the gray rectangle. The desired positions are represented by concentric dashed circles.

In our experience, especially in the scenarios with obstacles, the tuning of the $c_i$'s is crucial to obtain satisfactory trajectories.

VII Conclusion and Outlook

In this paper, we have studied the problem of a group of robots performing a distributed exploration task. We have shown that, under weak conditions on the communication structure, global convergence to an NWE or to a PNWE can still be achieved, in the static and the time-varying case respectively, even though in the latter case the imposition of a dwell time is necessary. Moreover, we presented a practical implementation of the algorithm and studied its performance in different setups. This can be seen as a step towards asynchronous proximal dynamics, which aims at highlighting the potential and the possible applications of this research topic. Future research will investigate milder conditions and different algorithms under which proximal dynamics are guaranteed to be fully distributed.

Proof of Lemma 1

From Assumption 3, $A$ is marginally stable, with no eigenvalues on the boundary of the unit disk other than semi-simple eigenvalues at $1$. From [23, Lem. 4], the linear operator $A$ is averaged in some Hilbert space with norm $\| \cdot \|_P$. From [18, Def. 4.33] and [23, Lem. 4], $\boldsymbol{A} = A \otimes I_n$ is averaged in the Hilbert space with norm $\| \cdot \|_{P \otimes I_n}$. Then, the collective proximal operator $\operatorname{prox}^{\boldsymbol{Q}}_{\boldsymbol{g}}$ is averaged in the same Hilbert space [18, Prop. 23.34(i)]. Thus, by [18, Prop. 4.44], the composition $\operatorname{prox}^{\boldsymbol{Q}}_{\boldsymbol{g}} \circ \boldsymbol{A}$ is also averaged. The proof then follows from [18, Prop. 5.15].

Proof of Proposition 1

Follows directly from [18, Def. 4.33], [23, Lem. 4] and Lemma 1.

Proof of Theorem 1

For each $m \in \mathbb{N}[1, M]$, consider the operator $T_m$ in (11). From [18, Prop. 4.44], we know that each $T_m$ is averaged as well, say $\alpha_m$-averaged. Thus, without any switching, the sequence generated by the Banach–Picard iteration of $T_m$ would converge to some vector in $\operatorname{fix}(T_m)$.

Due to Assumptions 7 and 8 and [22, Lemma 3.8, Fact 5.3(i)], we have that, for all $m \in \mathbb{N}[1, M]$,

(17)

where we defined the parameters

Then, the lower bound on the dwell time in (13) can be computed via arguments similar to [24, Sec. 3.2.1]. Since the weighted distance functions are induced by norms, we have that

(18)

Now, suppose that the switching signal remains constant between consecutive switching times, which are separated by at least the dwell time $\tau$. By (17) and (18), we obtain

(19)

and consequently

(20)

By [24, Th. 3.1] and some manipulation, we conclude convergence to a PNWE if the dwell time satisfies

The proof for the general case is analogous and leads to the lower bound on the dwell time in (13).

References

  • [1] F. Dörfler, J. Simpson-Porco, and F. Bullo, “Breaking the hierarchy: Distributed control and economic optimality in microgrids,” IEEE Trans. on Control of Network Systems, vol. 3, no. 3, pp. 241–253, 2016.
  • [2] S. Grammatico, “Proximal dynamics in multi-agent network games (in press),” IEEE Trans. on Control of Network Systems doi.org/10.1109/TCNS.2017.2754358, 2018.
  • [3] R. Jain and J. Walrand, “An efficient Nash-implementation mechanism for network resource allocation,” Automatica, vol. 46, pp. 1276–1283, 2010.
  • [4] J. Barrera and A. Garcia, “Dynamic incentives for congestion control,” IEEE Trans. on Automatic Control, vol. 60, no. 2, pp. 299–310, 2015.
  • [5] J. Ghaderi and R. Srikant, “Opinion dynamics in social networks with stubborn agents: Equilibrium and convergence rate,” Automatica, vol. 50, pp. 3209–3215, 2014.
  • [6] S. R. Etesami and T. Başar, “Game-theoretic analysis of the Hegselmann–Krause model for opinion dynamics in finite dimensions,” IEEE Trans. on Automatic Control, vol. 60, no. 7, pp. 1886–1897, 2015.
  • [7] R. Olfati-Saber and R. Murray, “Consensus problems in networks of agents with switching topology and time-delays,” IEEE Trans. on Automatic Control, vol. 49, no. 9, pp. 1520–1533, 2004.
  • [8] R. Olfati-Saber, “Flocking for multi-agent dynamic systems: Algorithms and theory,” IEEE Trans. on Automatic Control, vol. 51, no. 3, pp. 401–420, 2006.
  • [9] S. Martínez, F. Bullo, J. Cortés, and E. Frazzoli, “On synchronous robotic networks – Part I: Models, tasks, and complexity,” IEEE Trans. on Automatic Control, vol. 52, pp. 2199–2213, 2007.
  • [10] M. Stanković, K. Johansson, and D. Stipanović, “Distributed seeking of Nash equilibria with applications to mobile sensor networks,” IEEE Trans. on Automatic Control, vol. 57, no. 4, pp. 904–919, 2012.
  • [11] A. Nedić, A. Ozdaglar, and P. Parrillo, “Constrained consensus and optimization in multi-agent networks,” IEEE Trans. on Automatic Control, vol. 55, no. 4, pp. 922–938, 2010.
  • [12] S. Lee and A. Nedić, “Distributed random projection algorithm for convex optimization,” IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 2, pp. 221–229, 2013.
  • [13] A. Falsone, K. Margellos, S. Garatti, and M. Prandini, “Dual decomposition for multi-agent distributed optimization with coupling constraints,” Automatica, vol. 84, pp. 149–158, 2017.
  • [14] F. Parise, B. Gentile, S. Grammatico, and J. Lygeros, “Network aggregative games: Distributed convergence to Nash equilibria,” in Proc. of the IEEE Conference on Decision and Control, Osaka, Japan, 2015, pp. 2295–2300.
  • [15] J. Koshal, A. Nedić, and U. Shanbhag, “Distributed algorithms for aggregative games on graphs,” Operations Research, vol. 64, no. 3, pp. 680–704, 2016.
  • [16] F. Salehisadaghiani and L. Pavel, “Distributed Nash equilibrium seeking: A gossip-based algorithm,” Automatica, vol. 72, pp. 209–216, 2016.
  • [17] D. Smart, Fixed Point Theorems, ser. Cambridge Tracts in Mathematics.   Cambridge University Press, 1980.
  • [18] H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2010.
  • [19] D. Liberzon and A. S. Morse, “Basic problems in stability and design of switched systems,” IEEE Control Systems, vol. 19, no. 5, pp. 59–70, Oct 1999.
  • [20] Z. She, J. Lu, Q. Liang, and S. S. Ge, “Dwell time based stabilisability criteria for discrete-time switched systems,” International Journal of Systems Science, vol. 48, no. 14, pp. 3087–3097, 2017.
  • [21] M. Cao and A. S. Morse, “Dwell-time switching,” Systems & Control Letters, vol. 59, no. 1, pp. 57–65, 2010.
  • [22] H. H. Bauschke, D. Noll, and H. M. Phan, “Linear and strong convergence of algorithms involving averaged nonexpansive operators,” arXiv e-prints, Feb. 2014.
  • [23] G. Belgioioso, F. Fabiani, F. Blanchini, and S. Grammatico, “On the convergence of discrete-time linear systems: A linear time-varying Mann iteration converges iff the operator is strictly pseudocontractive,” IEEE Control Systems Letters, vol. 2, no. 3, pp. 453–458, July 2018.
  • [24] D. Liberzon, Switching in systems and control.   Springer Science & Business Media, 2003.