Many modern problems involve a single resource that is generated over time by many suppliers and simultaneously consumed by many consumers. For instance, consider electrical power generated or injected into the grid by utility companies, wind turbines, solar panels, batteries, and electric vehicles, while other devices consume the same power. Other examples include cryptocurrency markets with large numbers of producers and consumers.
These problems are complex and involve a huge number of decision-makers. Each producer has a different cost for producing a unit of the resource, and each consumer derives a different benefit from consuming the resource. One desirable outcome of the interaction between producers and consumers is to maximize the social welfare or, equivalently, to minimize the social cost. If we can represent each agent by a utility function, then we aim to maximize the sum of the utilities of producers and consumers subject to capacity constraints on the quantities. These constraints ensure that quantities are nonnegative and that the total consumption is at most the total production.
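As a hedged illustration, the welfare problem described above can be written as follows. The symbols are assumptions introduced only for this sketch: $x_s$ is the quantity produced by supplier $s$, $y_c$ is the quantity consumed by consumer $c$, and $u_s$ and $v_c$ are the corresponding utility functions.

```latex
\begin{aligned}
\max_{x,\,y}\quad & \sum_{s} u_s(x_s) + \sum_{c} v_c(y_c) \\
\text{subject to}\quad & \sum_{c} y_c \le \sum_{s} x_s, \\
& x_s \ge 0 \ \text{for all } s, \qquad y_c \ge 0 \ \text{for all } c.
\end{aligned}
```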
It is well known that, in the presence of money and price signals, an equilibrium that maximizes social welfare can be reached through a tatonnement process. However, we propose a mechanism in which neither money nor price signals are required to maximize social welfare: we only require binary feedback signals indicating whether total consumption exceeds total production. Our mechanism is decentralized: each agent only requires knowledge of its own utility function. We show that if every agent follows a specified algorithm, then over time, the profile of decisions (production and consumption quantities) converges to a socially optimal outcome.
This problem is related to the TCP congestion control problem, in which data sources that control their own rates interact to achieve an optimal network-wide rate allocation. In our setting, supplier agents act as the sources of the resource, while consumer agents consume the allocated resource. A detailed analysis [1] of the increase/decrease algorithms predominantly used for TCP congestion control, namely Multiplicative Increase/Multiplicative Decrease (MIMD) [2], Additive Increase/Additive Decrease (AIAD) [3], Additive Increase/Multiplicative Decrease (AIMD) [4], and Multiplicative Increase/Additive Decrease (MIAD), shows that AIMD leads to the best resource allocation among them.
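The difference between these rules can be illustrated with a minimal sketch. It assumes a simplified setting that is not part of our model: two users share a fixed capacity and receive the same binary feedback synchronously. The function names, the capacity, and the step sizes are illustrative assumptions. Under these assumptions, AIMD drives the two rates towards an equal share, while AIAD preserves the initial gap.

```python
def jain_index(rates):
    """Jain's fairness index: 1.0 means all rates are equal."""
    total = sum(rates)
    return total * total / (len(rates) * sum(r * r for r in rates))


def run(rates, capacity, steps, increase, decrease):
    """Apply the same increase/decrease rule to every user, driven by one binary signal."""
    rates = list(rates)
    for _ in range(steps):
        if sum(rates) > capacity:
            # Capacity exceeded: every user decreases its rate.
            rates = [decrease(r) for r in rates]
        else:
            # Capacity available: every user increases its rate.
            rates = [increase(r) for r in rates]
    return rates


start = [1.0, 80.0]  # deliberately unfair starting point (illustrative values)

# Additive increase, multiplicative decrease.
aimd = run(start, 100.0, 2000, lambda r: r + 1.0, lambda r: 0.5 * r)

# Additive increase, additive decrease.
aiad = run(start, 100.0, 2000, lambda r: r + 1.0, lambda r: max(r - 1.0, 0.0))

print("AIMD fairness:", jain_index(aimd))
print("AIAD fairness:", jain_index(aiad))
```

Each multiplicative decrease halves the gap between the two rates, whereas additive steps leave the gap unchanged, which is why only AIMD equalizes the rates in this sketch.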
AIMD as a feedback control algorithm has been investigated extensively in the literature [5, 6, 7, 8, 9]. In this setting, each agent determines its own allocation according to the AIMD algorithm. Similar distributed optimization algorithms have been shown to converge iteratively to an optimal allocation of resources [10, 11, 12], but these algorithms rely on inter-agent communication to achieve optimality. In contrast, a version of the classical AIMD algorithm combined with a probabilistic rule [13] defines the behavior of multiple agents in response to a capacity signal. The resulting non-homogeneous Markov chain is shown to converge to the social optimum. Here, each agent responds to a capacity event according to its own probability function, known only to that agent, without the need for any communication with another agent. In this sense, the approach goes beyond traditional AIMD and emulates RED-like congestion control [14]. Also, very limited actuation is assumed for an agent: the agent only decides whether or not to respond to a capacity event, in an asynchronous manner. There is no need for a common clock, and the setting is completely stochastic.
More generally, problems concerning multiple independent producers and consumers occur in many different domains of application, for instance, electrical power [15], solar energy and microgrids [16], data centers [17], and air cargo [18]. Such problems can be considered as specific instances of machine learning [19], or as problems of collective control over distributed dynamical systems [20].
A simple strategy of rational learning, where each producer and each consumer acts based on the previous instance, is unsatisfactory [21]. Moreover, many classes of equilibrium problems are computationally hard [22].
This paper is organized as follows. First, we describe the mathematical model in Section II. We describe our proposed solution approach in Section III. We show the optimality of our approach in Section IV and describe the iterative convergence to optimality through simulations in Section V. Finally, we sum up our contributions and conclude in Section VI.
We consider a market with a single, arbitrarily divisible good, with multiple suppliers and multiple consumers. Three types of convergence are routinely observed. First, the total demand and total supply converge through a dynamic process that requires the exchange of information on the price of a good, the total demand, or the total supply (e.g., through a process of tatonnement). Second, in the presence of multiple suppliers, these suppliers may compete via their production quantities, which is known as Cournot competition. In Cournot competition, the production quantity of each supplier may also converge through a tatonnement process. Third, in the presence of multiple consumers, these consumers may compete for the limited quantity of good available through bidding and proportional allocation (e.g., competition among newsvendors). Here again, we observe convergence of the allocated quantities of each consumer. In these situations, the convergence relies on an exchange of information in the form of prices, bids, and quantities of goods. In this work, we consider a situation where communication, i.e., the exchange of information, is limited.
The notation that we use throughout the paper is as follows. There is a group of suppliers , and a group of consumers . Time is discrete: For each , the amount of good supplied at time is . For each , the amount of good consumed at time is . The total supply of good at time is , and the total consumption is . For each , the utility function represents the profit from producing an amount . For each , the utility function represents the profit from consuming an amount . Our objective is twofold. On the one hand, we want the individual consumption and production quantities to converge using limited inter-agent communication. On the other hand, we want the limit to be efficient: the total supply and total demand are balanced, the total supply is allocated optimally among the suppliers, and the total demand is allocated optimally among the consumers.
Remark 1 (Choice of utility function).
Whether an agent responds to the capacity signal depends strongly on the choice of the utility function for that agent. The utility function must attain a maximum, so that there is an optimal point that maximizes the agent's profit. We therefore consider concave functions; later sections demonstrate that they reach a global maximum at the optimum quantity supplied or consumed.
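To make the remark concrete, the following minimal sketch builds one concave utility with a finite maximum. The quadratic form, the function name, and the numerical values are assumptions chosen for illustration; they are not the utilities used in our simulations.

```python
def make_quadratic_utility(optimum, peak, curvature=1.0):
    """Return (u, du) for the concave utility u(x) = peak - curvature * (x - optimum)**2."""

    def u(x):
        return peak - curvature * (x - optimum) ** 2

    def du(x):
        # Derivative of u: positive below the optimum, negative above it.
        return -2.0 * curvature * (x - optimum)

    return u, du


# Illustrative agent whose profit peaks when it supplies 100 units.
u, du = make_quadratic_utility(optimum=100.0, peak=50.0)
print(du(80.0), du(100.0), du(120.0))
```

The sign of the derivative tells the agent on which side of its optimum it currently operates, which is the information the response rule relies on.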
More precisely, the first objective requires that there exist constants and such that for every and , we have
as . The second objective requires that and satisfy the following:
The limited communication property will be presented after presenting the algorithms for computing .
For the sake of discussion, and as a baseline, we also define the following vectorsand solving the following unconstrained optimization problems:
Next, we present the proposed distributed algorithm that specifies how agents update their and over time.
We consider a distributed environment with a total of S suppliers and C consumers, such that at each time instant , each supplier supplies an amount and, after that, each consumer consumes an amount .
At the time instant , we check whether the total amount supplied at was equal to the total amount consumed. If the supply exceeded the consumption, a capacity signal is sent to all the supplier agents to reduce the supply amount for . Otherwise, if the consumption exceeded the supply, a capacity signal is sent to the consumer agents to reduce the consumption for . This ensures that an equilibrium is maintained throughout the execution, even when the supply or consumption fluctuates.
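The signalling rule described above can be sketched as a small function. The function name and the tuple layout of the return value are assumptions made for illustration only.

```python
def capacity_signals(total_supply, total_consumption):
    """Return (signal_to_suppliers, signal_to_consumers) for the next time instant."""
    # Suppliers are asked to reduce when supply exceeded consumption.
    signal_to_suppliers = total_supply > total_consumption
    # Consumers are asked to reduce when consumption exceeded supply.
    signal_to_consumers = total_consumption > total_supply
    return signal_to_suppliers, signal_to_consumers


print(capacity_signals(120.0, 100.0))
print(capacity_signals(90.0, 100.0))
```

Only these two binary values are broadcast; no quantities, prices, or utilities are exchanged between agents.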
If the agent responds to the capacity signal, it reduces the amount in the next time instant by a factor of , or else, it keeps on increasing the amount after every time instant by adding the value of to it. When the agent receives a capacity signal, its probability to reduce by a factor of
depends on the Bernoulli random variable, as , and
An agent also keeps track of its individual long term average, which is given by:
We want to consider limited inter-agent communication. The only communication available at time are the signals:
The pseudocode for the procedure describing what happens at the supplier side is shown in Algorithm 1. Each agent maintains the last value sent, and the long term average of the values sent in the past, , and this along with the concave nature of its utility function helps the agent drive the quantity sent towards the optimal point, .
At the iteration of the algorithm, it is first checked, for each supplier agent, whether the total quantity sent at was more than the total quantity consumed, which is the condition for a capacity signal being sent to the agent (line 4). If the condition is satisfied, the probability is calculated (line 5) as a function of the utility derivative and the long-term average, for . On the basis of this probability, it is determined whether or not the agent responds to the capacity signal, using an independent Bernoulli random variable with parameter (line 6). The agent responds to the signal by reducing the quantity sent at by a factor of , in comparison to what it sent at (line 9).
Otherwise, the agent checks whether the value supplied at was less than the optimum point of the agent’s utility function (line 10). If so, the agent adds the value of to the quantity supplied at and sends it at (line 11). If not, it subtracts from the quantity sent at and sends the result (line 13). This condition is included because we do not want the agent to over-supply; it also ensures that if this agent has reached its optimum supply quantity, other agents follow suit and are driven towards their individual optimum supply quantities. We will prove this statement later.
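The supplier-side update described above can be sketched as follows. This is one reading of the description rather than a transcription of Algorithm 1. The names `alpha` (additive step), `beta` (multiplicative factor), `x_opt` (optimum point), and `respond_prob` are assumptions, and the response probability is passed in as a function because its exact form is not restated here. We also assume that an agent that does not respond to a signal falls through to the additive step.

```python
import random


def supplier_step(x_prev, x_avg, signal, x_opt, alpha, beta, respond_prob, rng):
    """Return the quantity a supplier sends next, given what it sent last."""
    if signal:
        # Response probability depends on the agent's long-term average;
        # it is clipped to [0, 1] so it is always a valid Bernoulli parameter.
        p = min(1.0, max(0.0, respond_prob(x_avg)))
        if rng.random() < p:
            # Multiplicative decrease: the agent responds to the capacity signal.
            return beta * x_prev
    # No signal, or no response: additive step towards the agent's optimum.
    if x_prev < x_opt:
        return x_prev + alpha
    return x_prev - alpha


def update_average(avg, count, x_new):
    """Running mean after seeing one more value; `count` values were averaged so far."""
    return avg + (x_new - avg) / (count + 1)


rng = random.Random(0)
x = supplier_step(40.0, 40.0, True, 50.0, 5.0, 0.75, lambda avg: 1.0, rng)
print(x, update_average(40.0, 1, x))
```

Keeping the probability rule as a parameter leaves the sketch independent of any particular utility function; each agent would supply its own rule, known only to itself.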
In this section, we show the convergence over time of the sequences of produced quantities for all producers, as well as the sequences of consumed quantities for all consumers.
Theorem 1 (Convergence Theorem).
Suppose that every function and is concave and achieves its maximum at a finite point. For every producer and every consumer , we have
Step 0. Since each achieves its maximum at the point , there exists a finite number such that . Observe that, if converges, the limit must be .
Step 1. Consider the sequence
By continuity, the AIMD algorithm with input and the AIMD algorithm with input converge to the same limit point. Therefore, by [13, Theorem 1], we have for all .
Step 2. Consider the sequence
Since for all , by continuity, the AIMD algorithm with input and the AIMD algorithm with input converge to the same limit point. Therefore, by [13, Theorem 1], we have for all . ∎
In this section, we simulate the interactions of suppliers and consumers in two settings: first, when their utility functions are non-monotonic concave, as in Figure 0(a); second, when the supplier utility functions are monotonic concave, as in Figure 4(a).
V-A Non-monotonic Utility Functions
We simulate a total of 9 supplier agents, and 18 consumer agents. We generate random utility functions for the agents, while ensuring that the following sums on the supplier and consumer sides are deterministic and equal:
The values of and for supplier and consumer agents are 5 and 0.75, respectively. The network constant is kept at 2.0 to ensure that the probability remains in the interval .
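The setup step, in which random agents are generated while a sum is kept deterministic and equal on both sides, can be sketched as follows. The sketch assumes that the fixed quantity is the sum of the agents' optimum points; the helper name, the sampling range, and the seed are also assumptions. The total of 900 is used only as an illustrative value.

```python
import random


def random_optima(n_agents, total, rng):
    """Draw random positive optimum points and rescale them so they sum to `total`."""
    weights = [rng.uniform(0.5, 1.5) for _ in range(n_agents)]
    scale = total / sum(weights)
    return [w * scale for w in weights]


rng = random.Random(42)  # fixed seed so the sketch is reproducible
supplier_optima = random_optima(9, 900.0, rng)   # 9 supplier agents
consumer_optima = random_optima(18, 900.0, rng)  # 18 consumer agents

print(sum(supplier_optima), sum(consumer_optima))
```

Rescaling random weights keeps the individual agents heterogeneous while making the two sides comparable in aggregate.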
The total supply and consumption linger around the peak total utility, as shown in (8), i.e., . As seen in Figure 1(b), the total supply from all supplier agents is balanced by the total consumption by all consumer agents.
From Figures 2(a) and 3(a), we can see that the long-term averages of the supplier and consumer agents saturate around their respective optimum points. Figures 2(b) and 3(b) show the utility-function derivatives converging towards 0 for both supplier and consumer agents, indicating maximum profit for each agent, while Figure 1(c) shows the 95% confidence interval on the supplier utility derivative. The same can be seen in Figure 1(a): the sum of all the utilities converges towards the optimum, i.e., 900, which was fixed as a constant at the beginning of our simulation.
V-B Monotonic Supplier Utility Functions
Now we consider a scenario where suppliers have a monotonic non-decreasing utility function, while the consumers have concave utility. So, the supplier utility (see Figure 4(a)) is of the form
while consumer utilities (Figure 0(b)) are of the form
Simulating with the same number of supplier and consumer agents and the same constants as in the previous simulation, we observe that the sums of the quantities supplied and consumed tend to saturate around the optimal sum (see Figure 4(c)), even when no particular maximum is present for the supplier utility. The utility sum for consumers does converge towards the optimal sum (see Figure 4(b)), unlike the sum of the supplier utilities, as there is no particular maximum for the monotonic non-decreasing functions.
In this paper, we have used the convergence properties of the AIMD algorithm to solve the problem of maintaining an equilibrium of supply and demand for a group of distributed agents, while also maximizing their respective profits.
As shown in our simulations (see Section V-A), when both supplier and consumer agents have concave utility functions with clear maxima, profit maximization and equality of total supply and total demand both hold. Further simulations (see Section V-B) show that even when one side (supply or demand) has a utility without a clear maximum, the equilibrium of supply and demand is still reached, along with profit maximization for the agents whose utility has a clear maximum.
Considering the slew of applications requiring a global, dynamic balance between supply and demand, such as the management of data centers, energy producers and consumers connected to a smart grid, and the like, we surmise that the work presented here can be put to profitable use in several domains of application.
Jia Yuan Yu was supported by the Natural Sciences and Engineering Research Council of Canada (RGPIN-2018-05096).
- [1] D.-M. Chiu and R. Jain, “Analysis of the increase and decrease algorithms for congestion avoidance in computer networks,” Computer Networks and ISDN Systems, vol. 17, no. 1, pp. 1–14, 1989.
- [2] E. Altman, K. Avrachenkov, C. Barakat, A. A. Kherani, and B. Prabhu, “Analysis of MIMD congestion control algorithm for high speed networks,” Computer Networks, vol. 48, no. 6, pp. 972–989, 2005.
- [3] K. Xu and N. Ansari, “Stability and fairness of rate estimation-based AIAD congestion control in TCP,” IEEE Commun. Lett., vol. 9, no. 4, pp. 378–380, 2005.
- [4] Y. R. Yang and S. S. Lam, “General AIMD congestion control,” in 8th International Conference on Network Protocols (ICNP 2000), pp. 187–198, IEEE, 2000.
- [5] S. Jacobs and A. Eleftheriadis, “Providing video services over networks without quality of service guarantees,” in World Wide Web Consortium Workshop on Real-Time Multimedia and the Web, 1996.
- [6] R. Rejaie, M. Handley, and D. Estrin, “RAP: An end-to-end rate-based congestion control mechanism for realtime streams in the internet,” in INFOCOM’99. Eighteenth Annual Joint Conference of the IEEE Computer and Communications Societies, vol. 3, pp. 1337–1345, IEEE, 1999.
- [7] D. Sisalem and H. Schulzrinne, “The loss-delay based adjustment algorithm: A TCP-friendly adaptation scheme,” in Proceedings of NOSSDAV, vol. 98, pp. 215–226, Citeseer, 1998.
- [8] S. Cen, J. Walpole, and C. Pu, “Flow and congestion control for internet media streaming applications,” in Multimedia Computing and Networking 1998, vol. 3310, pp. 250–265, International Society for Optics and Photonics, 1997.
- [9] M. Corless, C. King, R. Shorten, and F. Wirth, AIMD Dynamics and Distributed Resource Allocation, vol. 29. SIAM, 2016.
- [10] S. S. Ram, A. Nedić, and V. V. Veeravalli, “Distributed stochastic subgradient projection algorithms for convex optimization,” Journal of Optimization Theory and Applications, vol. 147, no. 3, pp. 516–545, 2010.
- [11] A. Nedić and A. Ozdaglar, “Distributed subgradient methods for multi-agent optimization,” IEEE Trans. Autom. Control, vol. 54, no. 1, pp. 48–61, 2009.
- [12] J. C. Duchi, A. Agarwal, and M. J. Wainwright, “Dual averaging for distributed optimization: Convergence analysis and network scaling,” IEEE Trans. Autom. Control, vol. 57, no. 3, pp. 592–606, 2012.
- [13] F. Wirth, S. Stuedli, J. Y. Yu, M. Corless, and R. Shorten, “Nonhomogeneous Place-Dependent Markov Chains, Unsynchronised AIMD, and Network Utility Maximization,” arXiv 1404.5064 [math.OC], Apr. 2014.
- [14] R. Srikant, “Internet congestion control, vol. 14 of control theory,” 2004.
- [15] A. Muralidharan, H. A. Maior, and S. Rao, A Self-Governing and Decentralized Network of Smart Objects to Share Electrical Power Autonomously, pp. 25–48. Cham: Springer International Publishing, 2018.
- [16] I. Maity and S. Rao, “Simulation and pricing mechanism analysis of a solar-powered electrical microgrid,” IEEE Syst. J., vol. 4, pp. 275–284, Sept. 2010.
- [17] N. Singh and S. Rao, “Ensemble learning for large-scale workload prediction,” IEEE Trans. Emerg. Topics Comput., vol. 2, pp. 149–165, June 2014.
- [18] R. Totamane, A. Dasgupta, and S. Rao, “Air cargo demand modeling and prediction,” IEEE Syst. J., vol. 8, pp. 52–62, Mar. 2014. doi:10.1109/JSYST.2012.2218511.
- [19] T. M. Mitchell, Machine Learning. McGraw Hill, 1997.
- [20] D. H. Wolpert, K. R. Wheeler, and K. Tumer, “Collective intelligence for control of distributed dynamical systems,” Europhysics Letters, vol. 49, no. 6, p. 708, 2000.
- [21] P. K. Enumula and S. Rao, “The Potluck Problem,” Economics Letters, vol. 107, pp. 10–12, Apr. 2010.
- [22] B. Codenotti, B. Mccune, S. Pemmaraju, R. Raman, and K. Varadarajan, “An experimental study of different approaches to solve the market equilibrium problem,” ACM Journal on Experimental Algorithmics, vol. 12, pp. 3.3:1–3.3:21, Aug. 2008.