# Distributed and Efficient Resource Balancing Among Many Suppliers and Consumers

Balancing supply and demand in a multi-agent system with many self-interested, rational agents acting as suppliers and consumers is a natural problem in a variety of real-life domains, including smart power grids and data centers. In this paper, we address the profit-maximization problem for a group of distributed supplier and consumer agents, with no inter-agent communication. We simulate a market with S suppliers and C consumers in which, at every instant, each supplier agent supplies a certain quantity and, simultaneously, each consumer agent consumes a certain quantity. Information about the total amounts supplied and consumed is kept only by the center. The proposed algorithm combines the classical additive-increase multiplicative-decrease (AIMD) algorithm with a probabilistic rule by which agents respond to a capacity signal. This leads to a nonhomogeneous Markov chain, and we show almost sure convergence of this chain to the social optimum for our market of distributed supplier and consumer agents. Under this AIMD-type algorithm, the center sends a feedback message to the supplier agents if there is excess supply, or to the consumer agents if there is excess consumption. Each agent has a concave utility function whose derivative tends to 0 when an optimum quantity is supplied or consumed. Hence, once convergence is reached, each agent supplies or consumes a quantity that yields its individual maximum profit, without communicating with any other agent. Our simulations show the efficacy of this approach.


## I Introduction

Many modern problems involve a single resource that is generated over time by a number of suppliers and simultaneously consumed by a number of consumers. For instance, consider electrical power generated or injected into the grid by utility companies, windmills, solar panels, batteries, and electric vehicles, while other devices consume the same power. Other examples include cryptocurrency markets with large numbers of producers and consumers.

These problems are complex and involve a huge number of decision-makers. Each producer has a different cost for producing a unit of the resource, and each consumer derives a different benefit from consuming the resource. One desirable outcome of the interaction between producers and consumers is to maximize the social welfare or minimize the social cost. If we can represent each agent by a utility function, then we aim to maximize the sum of utilities of producers and consumers subject to capacity constraints on the quantities. These constraints ensure that quantities are nonnegative and that the total consumption is at most the total production.

It is well-known that in the presence of money and price signals, an equilibrium can be reached through a tatonnement process where social welfare is maximized. However, we propose a mechanism where neither money, nor price signals, are required to maximize social welfare: we only require binary feedback signals on whether or not total consumption exceeds total production. Our mechanism is decentralized: each agent only requires knowledge of its own utility function. We show that if every agent follows a specified algorithm, then over time, the profile of decisions (production and consumption quantities) converges to a socially optimal outcome.

This problem can be related to the TCP congestion control problem, in which data sources controlling their own rates interact to achieve an optimal network-wide rate allocation: supplier agents act as sources of the allocated resource, while consumer agents act as its consumers. A detailed analysis [1] of the increase/decrease algorithms used predominantly in the past for TCP congestion control, which include Multiplicative Increase/Multiplicative Decrease (MIMD) [2], Additive Increase/Additive Decrease (AIAD) [3], Additive Increase/Multiplicative Decrease (AIMD) [4], and Multiplicative Increase/Additive Decrease (MIAD), shows that AIMD leads to the best resource allocation among these schemes.
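To illustrate why AIMD is attractive, the following minimal Python sketch simulates two agents sharing one capacity under synchronous binary feedback. It is an illustration only, not part of the algorithm proposed in this paper: the function name `aimd_two_agents`, the capacity of 100, and the constants `alpha=1.0` and `beta=0.5` are assumptions chosen for the example.

```python
def aimd_two_agents(x1, x2, capacity, alpha=1.0, beta=0.5, steps=200):
    """Synchronous AIMD for two agents sharing one capacity.

    Returns the list of (x1, x2) allocations over time.
    """
    history = [(x1, x2)]
    for _ in range(steps):
        if x1 + x2 > capacity:
            # Multiplicative decrease: both agents back off by the factor beta.
            x1, x2 = beta * x1, beta * x2
        else:
            # Additive increase: both agents probe for more capacity.
            x1, x2 = x1 + alpha, x2 + alpha
        history.append((x1, x2))
    return history


history = aimd_two_agents(x1=80.0, x2=5.0, capacity=100.0)
initial_gap = abs(history[0][0] - history[0][1])
final_gap = abs(history[-1][0] - history[-1][1])
print(initial_gap, final_gap)
```

Additive increases leave the gap between the two allocations unchanged, while every multiplicative decrease shrinks it by the factor `beta`, so the allocations drift towards equal shares.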

AIMD as a feedback control algorithm has been investigated extensively in the literature [5, 6, 7, 8, 9]. In this setting, each agent determines its own allocation as per the AIMD algorithm. Similar distributed optimization algorithms have been shown to converge iteratively to an optimal allocation of resources [10, 11, 12], but these algorithms rely on inter-agent communication to achieve optimality. In contrast, a version of the classical AIMD algorithm combined with a probabilistic rule [13] defines the behavior of multiple agents in response to a capacity signal. The resulting non-homogeneous Markov chain is shown to converge almost surely to the social optimum. Here, each agent responds to a capacity event according to its own probability function, known only to that agent, without any need to communicate with another agent. In this sense, [13] goes beyond traditional AIMD and emulates RED-like congestion control [14]. Also, very limited actuation is assumed for an agent: the agent only decides whether or not to respond to a capacity event, in an asynchronous manner. There is no need for a common clock and the setting is completely stochastic.

More generally, problems concerning multiple independent producers and consumers occur in many different domains of application, for instance electrical power [15], solar energy and microgrids [16], data centers [17], and air cargo [18]. Such problems can be considered as specific instances of machine learning [19], or as problems of collective control over distributed dynamical systems [20].

A simple strategy of rational learning, in which each producer and each consumer acts on the basis of the previous instance alone, is unsatisfactory [21]. Moreover, many classes of equilibrium problems are computationally hard [22].

This paper is organized as follows. First, we describe the mathematical model in Section II. We describe our proposed solution approach in Section III. We show the optimality of our approach in Section IV and describe the iterative convergence to optimality through simulations in Section V. Finally, we sum up our contributions and conclude in Section VI.

## II Setting

We consider a market with a single good (arbitrarily divisible), with multiple suppliers and multiple consumers. Three types of convergence are routinely observed. First, the total demand and total supply converge through a dynamic process that requires the exchange of information on the price of a good, the total demand, or the total supply (e.g., through a process of tatonnement). In the presence of multiple suppliers, these suppliers may compete via their production quantities, which is known as Cournot competition. In Cournot competition, the production quantity of each supplier may converge as well through a tatonnement process. Similarly, in the presence of multiple consumers, these consumers may also compete for the limited quantity of good available through bidding and proportional allocation (e.g., competition among newsvendors). Here again, we observe convergence of the allocated quantities of each consumer. In these situations, the convergence relies on an exchange of information in the form of prices, bids, and quantities of goods. In this work, we consider a situation where communication, e.g., the exchange of information, is limited.

The notation that we use throughout the paper is as follows. There is a group of suppliers $S$ and a group of consumers $C$. Time is discrete: $t = 0, 1, 2, \ldots$. For each $i \in S$, the amount of good supplied at time $t$ is $x_i(t)$. For each $j \in C$, the amount of good consumed at time $t$ is $y_j(t)$. The total supply of good at time $t$ is $x(t) = \sum_{i \in S} x_i(t)$, and the total consumption is $y(t) = \sum_{j \in C} y_j(t)$. For each $i \in S$, the utility function $f_i$ represents the profit from producing an amount $x_i$. For each $j \in C$, the utility function $g_j$ represents the profit from consuming an amount $y_j$. Our objective is twofold. On one hand, we want the individual consumption and production quantities to converge using limited inter-agent communication. On the other hand, we want the limit to be efficient: the total supply and total demand are balanced, the total supply is allocated optimally among the suppliers, and the total demand is allocated optimally among the consumers.

###### Remark 1 (Choice of utility function).

Whether an agent responds to the capacity signal depends strongly on the choice of the utility function for that agent. The utility function must attain a maximum, so that the agent can reach an optimal point that maximizes its profit. Therefore, we consider concave functions, which, as demonstrated in later sections, reach a global maximum at the optimum quantity supplied or consumed.

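More precisely, the first objective requires that there exist constants $x_i^*$ and $y_j^*$ such that for every $i \in S$ and $j \in C$, we have

$$x_i(t) \to x_i^*, \tag{1}$$

$$y_j(t) \to y_j^*, \tag{2}$$

as $t \to \infty$. The second objective requires that $x^* = (x_i^*)_{i \in S}$ and $y^* = (y_j^*)_{j \in C}$ satisfy the following:

$$(x^*, y^*) \in \arg\max_{x, y} \; \sum_{i \in S} f_i(x_i) + \sum_{j \in C} g_j(y_j) \tag{3}$$

$$\text{such that } \sum_{i \in S} x_i = \sum_{j \in C} y_j. \tag{4}$$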
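As a hedged illustration of what problem (3)-(4) asks for, the first-order conditions can be written out under two assumptions that are not stated in the text: the utilities are differentiable, and the nonnegativity constraints are inactive at the optimum. The multiplier $\mu$ is notation introduced only for this sketch.

```latex
\begin{aligned}
\mathcal{L}(x, y, \mu) &= \sum_{i \in S} f_i(x_i) + \sum_{j \in C} g_j(y_j)
  - \mu \Big( \sum_{i \in S} x_i - \sum_{j \in C} y_j \Big), \\
\frac{\partial \mathcal{L}}{\partial x_i} = 0
  &\;\Longrightarrow\; f_i'(x_i^*) = \mu \quad \text{for all } i \in S, \\
\frac{\partial \mathcal{L}}{\partial y_j} = 0
  &\;\Longrightarrow\; g_j'(y_j^*) = -\mu \quad \text{for all } j \in C.
\end{aligned}
```

Because the utilities are concave, these conditions together with the balance constraint (4) are sufficient for optimality. In the special case where the unconstrained maximizers of the individual utilities already balance total supply and total consumption, the conditions hold with $\mu = 0$, and every agent sits at the point where its own utility derivative vanishes.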

The limited communication property will be presented after the algorithms themselves.

For the sake of discussion, and as a baseline, we also define the vectors $u^*$ and $w^*$ as solutions of the following unconstrained optimization problems:

$$u^* \in \arg\max_{x \in \mathbb{R}^S} \; \sum_{i} f_i(x_i), \tag{5}$$

and

$$w^* \in \arg\max_{y \in \mathbb{R}^C} \; \sum_{j} g_j(y_j). \tag{6}$$

Next, we present the proposed distributed algorithm that specifies how agents update their $x_i(t)$ and $y_j(t)$ over time.

## III Algorithm

We consider a distributed environment with a total of S suppliers and C consumers, such that at each time instant $t$, each supplier $i$ supplies an amount $x_i(t)$ and, after that, each consumer $j$ consumes an amount $y_j(t)$.

The procedures for updating $x_i(t)$ and $y_j(t)$ are presented in Algorithm 1 and Algorithm 2.

At the time instant $t$, we check whether the total amount supplied at $t-1$ was equal to the total amount consumed. If the supply was more than the consumption, a capacity signal is sent to all the supplier agents to reduce the supply amount for $t$. Otherwise, if the consumption was more than the supply, a capacity signal is sent to the consumer agents to reduce the consumption for $t$. This ensures that the system is steered back towards equilibrium throughout the execution, whenever the supply or consumption fluctuates.

If the agent responds to the capacity signal, it reduces the amount in the next time instant by a multiplicative factor $\beta$; otherwise, it keeps increasing the amount after every time instant by adding an increment $\alpha$ to it. When the agent receives a capacity signal, whether it reduces its amount by the factor $\beta$ is decided by a Bernoulli random variable whose parameter is a probability computed from the derivative of the agent's utility function and its long-term average.

An agent also keeps track of its individual long term average, which is given by:

$$\bar{x}_i(t) = \left( \frac{1}{t+1} \right) \sum_{T=0}^{t} x_i(T).$$
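The long-term average does not require storing the whole history. As a minimal sketch (the function name `update_average` is ours, not from the paper), an agent can update it incrementally from the previous average and the newest value:

```python
def update_average(prev_avg, t, x_t):
    """Return the average of x(0..t) given the average of x(0..t-1).

    prev_avg is ignored when t == 0.
    """
    if t == 0:
        return float(x_t)
    # (t * prev_avg) recovers the sum of the first t values x(0..t-1).
    return (t * prev_avg + x_t) / (t + 1)


values = [4.0, 8.0, 6.0, 2.0]
avg = 0.0
for t, x_t in enumerate(values):
    avg = update_average(avg, t, x_t)
print(avg)  # 5.0
```

The incremental form gives the same result as summing all past values and dividing by $t+1$, while using constant memory per agent.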

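We want to consider limited inter-agent communication. The only communication available at time $t$ consists of the signals

$$s(t) \triangleq \mathbf{1}\left[ x(t-1) > y(t-1) \right], \qquad c(t) \triangleq \mathbf{1}\left[ x(t-1) < y(t-1) \right],$$

where $s(t)$ is sent to the supplier agents and $c(t)$ is sent to the consumer agents.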
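The center's role can be sketched in a few lines of Python. This follows the prose description of the capacity signals (suppliers are signalled on excess supply, consumers on excess consumption); the function name `capacity_signals` and the choice to send no signal when the two totals are exactly equal are assumptions made for illustration.

```python
def capacity_signals(total_supply_prev, total_consumption_prev):
    """Return the pair (s, c) of binary signals for the current step.

    s == 1 tells suppliers that supply exceeded consumption at the previous step;
    c == 1 tells consumers that consumption exceeded supply at the previous step.
    """
    s = 1 if total_supply_prev > total_consumption_prev else 0
    c = 1 if total_consumption_prev > total_supply_prev else 0
    return s, c


print(capacity_signals(120.0, 100.0))  # (1, 0)
print(capacity_signals(90.0, 100.0))   # (0, 1)
print(capacity_signals(100.0, 100.0))  # (0, 0)
```

Only these two bits leave the center; individual quantities and utility functions are never shared.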

The pseudocode for the procedure describing what happens at the supplier side is shown in Algorithm 1. Each agent $i$ maintains the last value sent and the long-term average of the values sent in the past, $\bar{x}_i$, and this, along with the concave nature of its utility function, helps the agent drive the quantity sent towards its optimal point.

At the $t$-th iteration of the algorithm, each supplier agent first checks whether the total quantity sent at $t-1$ exceeded the total quantity consumed, which is the condition for a capacity signal being sent to the agent (line 4). If the condition is satisfied, the response probability of agent $i$ is calculated (line 5) as a function of its utility derivative and long-term average. On the basis of this probability, it is determined whether the agent responds to the capacity signal or not. The agent responds to the signal by reducing the quantity sent at $t$ by a multiplicative factor $\beta$, in comparison to what it sent at $t-1$ (line 9). Whether it does so depends on an independent Bernoulli random variable whose parameter is this probability (line 6).

Otherwise, it checks whether the value supplied at $t-1$ was less than the optimum point of the agent's utility function (line 10). If so, the agent adds the increment $\alpha$ to the quantity supplied at $t-1$ and sends the result at $t$ (line 11). Otherwise, it subtracts $\alpha$ from the quantity sent at $t-1$ and sends the result (line 13). This condition is included because we do not want the agent to over-supply; it also ensures that once this agent has reached its optimum supply quantity, other agents follow suit and are driven towards their individual optimum supply quantities. We will prove this statement later.

Then, the long-term average is updated (line 15), and the quantity $x_i(t)$ is supplied by agent $i$ at $t$ (line 16). The same algorithm runs at the consumer side (Algorithm 2), and these two concurrently running algorithms ensure that the system reaches the state that leads to maximum combined profit.
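The supplier-side update described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' Algorithm 1: the function name `supplier_step`, the callable `response_prob` (standing in for the probability computed from the utility derivative and long-term average, whose exact form is not reproduced here), the choice to fall through to the additive branch when the agent does not respond to a signal, and the clamp at zero are all illustrative. The defaults 5 and 0.75 are the additive and multiplicative constants reported in Section V-A.

```python
import random


def supplier_step(x_prev, avg_prev, t, signal, optimum, response_prob,
                  alpha=5.0, beta=0.75, rng=random):
    """One update of a single supplier; returns (x_t, avg_t).

    x_prev, avg_prev: quantity and long-term average at time t - 1 (t >= 1).
    signal: True if total supply exceeded total consumption at t - 1.
    optimum: maximizer of this agent's utility (known only to the agent).
    response_prob: hypothetical callable mapping avg_prev to a probability.
    """
    responded = False
    if signal:
        # Clip so that the Bernoulli parameter is a valid probability.
        p = min(1.0, max(0.0, response_prob(avg_prev)))
        responded = rng.random() < p
    if responded:
        x_t = beta * x_prev                 # multiplicative decrease
    elif x_prev < optimum:
        x_t = x_prev + alpha                # additive increase
    else:
        # Step back; the clamp at zero is an assumption of this sketch.
        x_t = max(0.0, x_prev - alpha)
    avg_t = (t * avg_prev + x_t) / (t + 1)  # running average over times 0..t
    return x_t, avg_t


print(supplier_step(40.0, 40.0, 1, True, 100.0, lambda avg: 1.0))   # (30.0, 35.0)
print(supplier_step(40.0, 40.0, 1, False, 100.0, lambda avg: 1.0))  # (45.0, 42.5)
```

A consumer-side step would have the same shape, with the consumer signal and the consumer's own utility in place of the supplier's.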

## IV Convergence

In this section, we show the convergence over time of the sequences of produced quantities for all producers, as well as the sequences of consumed quantities for all consumers.

###### Theorem 1 (Convergence Theorem).

Suppose that every function $f_i$ and $g_j$ is concave and achieves its maximum at a finite point. For every producer $i \in S$ and every consumer $j \in C$, we have

$$x_i(t) \to u_i^*, \qquad y_j(t) \to w_j^*.$$

###### Proof Sketch.

Recall and are defined in (5,6). The proof proceeds by three steps. First, we show that converges as . Secondly, we show that converges for all . Lastly, we show that converges for all .

Step 0. Observe that since each achieves its maximum at the point , hence, there exists a finite number such that . Observe that, if converges, the limit must be .

Step 1. Consider the sequence

 ^s(t)=1[∑ixi(t−1)

By continuity, the AIMD algorithm with input $s(t)$ and the AIMD algorithm with input $\hat{s}(t)$ converge to the same limit point. Therefore, by [13, Theorem 1], we have $x_i(t) \to u_i^*$ for all $i \in S$.

Step 2. Consider the sequence

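$$\hat{c}(t) = \mathbf{1}\left[ \sum_{i} u_i^* < \sum_{j} y_j(t-1) \right].$$

Since $x_i(t) \to u_i^*$ for all $i \in S$, by continuity, the AIMD algorithm with input $c(t)$ and the AIMD algorithm with input $\hat{c}(t)$ converge to the same limit point. Therefore, by [13, Theorem 1], we have $y_j(t) \to w_j^*$ for all $j \in C$. ∎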

## V Simulations

In this section, we simulate the interactions of suppliers and consumers in two settings: first, when their utility functions are non-monotonic concave, as in Figure 0(a); second, when the supplier utility functions are monotonic concave, as in Figure 4(a).

### V-A Non-monotonic Utility Functions

We simulate a total of 9 supplier agents, and 18 consumer agents. We generate random utility functions for the agents, while ensuring that the following sums on the supplier and consumer sides are deterministic and equal:

$$\sum_{i \in S} \max_{z_i} f_i(z_i) = \sum_{j \in C} \max_{z_j} g_j(z_j) = 900. \tag{8}$$

The values of $\alpha$ and $\beta$ for both supplier and consumer agents are 5 and 0.75, respectively. The network constant is kept at 2.0 to ensure that the probability remains in the interval $[0, 1]$.

The total supply and consumption hover around the peak total utility given in (8). As seen in Figure 1(b), the total supply from all supplier agents is balanced by the total consumption by all consumer agents.

From Figures 2(a) and 3(a), we can see that the supplier and consumer long-term averages saturate around their respective optimum points. Figures 2(b) and 3(b) show the utility-function derivatives converging towards 0 for both supplier and consumer agents, which corresponds to the maximum profit for each agent, while Figure 1(c) shows the 95% confidence on the supplier utility derivative. Figure 1(a) likewise shows that the sum of all the utilities converges towards the optimum, i.e., 900, which was fixed as a constant at the beginning of our simulation.
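One way to generate random utilities whose peak values sum to a fixed constant, as required by (8), is sketched below. This is an assumption-laden illustration rather than the procedure used for the reported simulations: the concave quadratic family, the helper names `random_peak_heights` and `make_quadratic_utility`, the sampling ranges, and the seed are all hypothetical.

```python
import random


def random_peak_heights(n_agents, total=900.0, rng=random):
    """Draw n_agents positive peak heights that sum to `total`."""
    raw = [rng.uniform(0.5, 1.5) for _ in range(n_agents)]
    scale = total / sum(raw)
    return [scale * r for r in raw]


def make_quadratic_utility(peak_location, peak_height, curvature=1.0):
    """Concave quadratic with maximum value peak_height at peak_location."""
    def utility(z):
        return peak_height - curvature * (z - peak_location) ** 2
    return utility


rng = random.Random(0)  # fixed seed so that the sketch is reproducible
supplier_heights = random_peak_heights(9, rng=rng)
consumer_heights = random_peak_heights(18, rng=rng)
supplier_utilities = [
    make_quadratic_utility(rng.uniform(50.0, 150.0), h) for h in supplier_heights
]
print(sum(supplier_heights), sum(consumer_heights))
```

Rescaling the randomly drawn heights keeps the individual utilities random while making both sums in (8) deterministic, up to floating-point rounding.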

### V-B Monotonic Supplier Utility Functions

Now we consider a scenario where suppliers have a monotonic non-decreasing utility function, while the consumers have concave utility. So, the supplier utility (see Figure 4(a)) is of the form

$$f_i(\bar{x}_i(t)) = \ell_i \sqrt{\bar{x}_i(t)} \tag{9}$$

while consumer utilities (Figure 0(b)) are of the form

 gj(¯yj(t))=−(¯yj(t)−yj∗)2ℏj+(1.5)ℏj (10)

On simulating the same number of supplier and consumer agents, with the same constants as in the previous simulation, we observe that the sums of quantities supplied and consumed tend to saturate around the optimal sum (see Figure 4(c)), even though the supplier utility has no particular maximum. The utility sum for consumers converges towards the optimal sum (see Figure 4(b)), unlike the supplier utility sum, as there is no particular maximum for the monotonic non-decreasing functions.
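The absence of a maximum for the supplier utility (9) can be checked directly. Assuming $\ell_i > 0$ (an assumption; the sign of $\ell_i$ is not stated in the text) and writing $\bar{x}$ for $\bar{x}_i(t) > 0$:

```latex
\begin{aligned}
f_i(\bar{x}) &= \ell_i \sqrt{\bar{x}}, \\
f_i'(\bar{x}) &= \frac{\ell_i}{2 \sqrt{\bar{x}}} > 0, \\
f_i''(\bar{x}) &= -\frac{\ell_i}{4\, \bar{x}^{3/2}} < 0.
\end{aligned}
```

The negative second derivative shows that the utility is concave, while the strictly positive first derivative shows that it is increasing everywhere; the derivative tends to 0 only as $\bar{x} \to \infty$, so no finite quantity maximizes the supplier's utility.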

## VI Conclusion

In this paper, we have utilized the convergence properties of the AIMD algorithm [13] to solve the problem of maintaining an equilibrium of supply and demand for a group of distributed agents [21], while also maximizing their respective profits.

As shown in our simulations (see Section V-A), for concave utility functions with clear maxima on both the supplier and consumer sides, profit maximization and equality of total supply and total demand both hold. Further simulations (see Section V-B) show that even when one side (supply or demand) has a monotonic utility with no finite maximum, the equilibrium of supply and demand is still reached, along with profit maximization for the agents whose utilities have a maximum.

Considering the slew of applications requiring a global, dynamic balance between supply and demand, such as the management of data centers, energy producers and consumers connected to a smart grid, and the like, we surmise that the work presented here can be put to profitable use in several domains of application.

## Acknowledgment

Jia Yuan Yu was supported by the Natural Sciences and Engineering Research Council of Canada (RGPIN-2018-05096).

## References

• [1] D.-M. Chiu and R. Jain, “Analysis of the increase and decrease algorithms for congestion avoidance in computer networks,” Computer Networks and ISDN systems, vol. 17, no. 1, pp. 1–14, 1989.
• [2] E. Altman, K. Avrachenkov, C. Barakat, A. A. Kherani, and B. Prabhu, “Analysis of MIMD congestion control algorithm for high speed networks,” Computer Networks, vol. 48, no. 6, pp. 972–989, 2005.
• [3] K. Xu and N. Ansari, “Stability and fairness of rate estimation-based AIAD congestion control in TCP,” IEEE Commun. Lett., vol. 9, no. 4, pp. 378–380, 2005.
• [4] Y. R. Yang and S. S. Lam, “General AIMD congestion control,” in 8th International Conference on Network Protocols (ICNP 2000), pp. 187–198, IEEE, 2000.
• [5] S. Jacobs and A. Eleftheriadis, “Providing video services over networks without quality of service guarantees,” in World Wide Web Consortium Workshop on Real-Time Multimedia and the Web, 1996.
• [6] R. Rejaie, M. Handley, and D. Estrin, “Rap: An end-to-end rate-based congestion control mechanism for realtime streams in the internet,” in INFOCOM’99. Eighteenth Annual Joint Conference of the IEEE Computer and Communications Societies, vol. 3, pp. 1337–1345, IEEE, 1999.
• [7] D. Sisalem and H. Schulzrinne, “The loss-delay based adjustment algorithm: A TCP-friendly adaptation scheme,” in Proceedings of NOSSDAV, vol. 98, pp. 215–226, Citeseer, 1998.
• [8] S. Cen, J. Walpole, and C. Pu, “Flow and congestion control for internet media streaming applications,” in Multimedia Computing and Networking 1998, vol. 3310, pp. 250–265, International Society for Optics and Photonics, 1997.
• [9] M. Corless, C. King, R. Shorten, and F. Wirth, AIMD dynamics and distributed resource allocation, vol. 29. SIAM, 2016.
• [10] S. S. Ram, A. Nedić, and V. V. Veeravalli, “Distributed stochastic subgradient projection algorithms for convex optimization,” Journal of optimization theory and applications, vol. 147, no. 3, pp. 516–545, 2010.
• [11] A. Nedic and A. Ozdaglar, “Distributed subgradient methods for multi-agent optimization,” IEEE Trans. Autom. Control, vol. 54, no. 1, pp. 48–61, 2009.
• [12] J. C. Duchi, A. Agarwal, and M. J. Wainwright, “Dual averaging for distributed optimization: Convergence analysis and network scaling,” IEEE Trans. Autom. Control, vol. 57, no. 3, pp. 592–606, 2012.
• [13] F. Wirth, S. Stuedli, J. Y. Yu, M. Corless, and R. Shorten, “Nonhomogeneous Place-Dependent Markov Chains, Unsynchronised AIMD, and Network Utility Maximization,” arXiv 1404.5064 [math.OC], Apr. 2014.
• [14] R. Srikant, “Internet congestion control, vol. 14 of control theory,” 2004.
• [15] A. Muralidharan, H. A. Maior, and S. Rao, A Self-Governing and Decentralized Network of Smart Objects to Share Electrical Power Autonomously, pp. 25–48. Cham: Springer International Publishing, 2018.
• [16] I. Maity and S. Rao, “Simulation and pricing mechanism analysis of a solar-powered electrical microgrid,” IEEE Syst. J., vol. 4, pp. 275–284, Sept. 2010.
• [17] N. Singh and S. Rao, “Ensemble learning for large-scale workload prediction,” IEEE Trans. Emerg. Topics Comput., vol. 2, pp. 149–165, June 2014.
• [18] R. Totamane, A. Dasgupta, and S. Rao, “Air cargo demand modeling and prediction,” IEEE Syst. J., vol. 8, pp. 52–62, Mar. 2014. doi:10.1109/JSYST.2012.2218511.
• [19] T. M. Mitchell, Machine Learning. McGraw Hill, 1997.
• [20] D. H. Wolpert, K. R. Wheeler, and K. Tumer, “Collective intelligence for control of distributed dynamical systems,” Europhysics Letters, vol. 49, no. 6, p. 708, 2000.
• [21] P. K. Enumula and S. Rao, “The Potluck Problem,” Economics Letters, vol. 107, pp. 10–12, Apr. 2010.
• [22] B. Codenotti, B. Mccune, S. Pemmaraju, R. Raman, and K. Varadarajan, “An experimental study of different approaches to solve the market equilibrium problem,” ACM Journal on Experimental Algorithmics, vol. 12, pp. 3.3:1–3.3:21, Aug. 2008.