No-boarding buses: Agents allowed to cooperate or defect

06/28/2019
by   Vee-Liem Saw, et al.
Nanyang Technological University

We study a bus system with a no-boarding policy, where a "slow" bus may disallow passengers from boarding if it meets some criteria. When the no-boarding policy is activated, people waiting to board at the bus stop are given the choice of cooperating or defecting. The people's heterogeneous behaviours are modelled by inductive reasoning and bounded rationality, inspired by the El Farol problem and the minority game. For defection from the no-boarding policy, instead of the minority group being the winning group, we investigate several scenarios where defectors win if the number of defectors does not exceed the maximum number of allowed defectors but lose otherwise. Contrary to the classical minority game, which has N agents repeatedly playing amongst themselves, many real-world situations like boarding a bus involve only a subset of agents who "play each round", with different subsets playing at different rounds. We find that for such realistic situations, there is no phase transition and no herding behaviour when the usual control parameter 2^m/N is small. The absence of herding behaviour assures feasible and sustainable implementation of the no-boarding policy with allowance for defections, without leading to bus bunching.


I Introduction

Bus transit systems play a vital role in moving people efficiently. Left on their own, however, buses tend to bunch and form clusters of buses moving together. The formation of such clusters can be intuitively understood as follows: Suppose buses are initially distributed evenly. Due to stochasticity in the number of people waiting at a bus stop as well as traffic conditions, a bus may happen to be slightly delayed at a bus stop. Once it leaves the bus stop, the bus that is trailing it experiences a slightly shorter headway, so that it needs to pick up slightly fewer people than the delayed bus. This speeds up the trailing bus further, allowing it to catch up even more, with fewer and fewer people for it to pick up at the subsequent bus stops due to the diminishing headway. Therefore, these two buses end up bunching together. Bunched buses reduce the efficacy of the system, because a person who misses a group of bunched buses essentially misses not one but multiple buses and has to wait longer for the next bus(es) to arrive.
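The runaway feedback described above can be illustrated with a deliberately simple toy model. This model is an assumption made here for illustration and is not the simulation used in this paper: two buses share one stop on a loop, each needs `loop_time` to travel around (excluding dwell), and a bus dwells for as long as it takes to board everyone who accumulated since the previous bus left. The names `simulate_gaps`, `loop_time` and `load_ratio` (arrival rate divided by boarding rate), as well as the numbers in the usage lines, are hypothetical.

```python
def simulate_gaps(loop_time, load_ratio, initial_gap, max_stops=500):
    """Toy model: successive time gaps at the stop until the two buses bunch.

    A "gap" is the time between the previous bus leaving the stop and the
    current bus arriving. All quantities are in the same time unit.
    """
    if not 0.0 <= load_ratio < 1.0:
        raise ValueError("load_ratio must lie in [0, 1)")
    # A bus arriving after a gap g dwells for tau with tau = load_ratio * (g + tau),
    # i.e. tau = c * g with c defined below.
    c = load_ratio / (1.0 - load_ratio)
    gaps = [initial_gap]
    gap = initial_gap
    for _ in range(max_stops):
        # The other bus returns loop_time after it left, while the current bus
        # leaves (1 + c) * gap after that same moment.
        gap = loop_time - (1.0 + c) * gap
        gaps.append(gap)
        if gap <= 0.0:
            # The other bus arrived before this one left: the buses have bunched.
            break
    return gaps


# Hypothetical numbers: a 720-second loop and a load ratio of 0.1.
even = 720.0 / (2.0 + 0.1 / 0.9)            # gap at which spacing stays even
gaps = simulate_gaps(720.0, 0.1, even + 1.0)  # start one second off even spacing
print(len(gaps), gaps[-1])
```

In this sketch the deviation from the even gap is multiplied by `-(1 + c)` at every stop, so any small perturbation grows until one gap closes completely, which is the bunching mechanism in miniature.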

This problem has long been identified Newell and Potts (1964); Chapman and Michel (1978); Powell and Sheffi (1983); Gershenson and Pineda (2009); Bellei and Gkoumas (2010); Saw et al. (2019), with many studies conducted to propose possible rectifications. Some suggested methods include holding back buses so that they follow a prescribed schedule or to counteract the diverging headways between buses Abkowitz and Engelstein (1984); Rossetti and Turitto (1998); Eberlein et al. (2001); Hickman (2001); Fu and Yang (2002); Bin et al. (2006); Daganzo (2009); Cortés et al. (2010); Cats et al. (2011); Gershenson (2011); Bartholdi and Eisenstein (2012); Moreira-Matias et al. (2016); Wang et al. (2018), stop-skipping to correct for the headways Li et al. (1991); Eberlein (1995); Fu et al. (2003); Sun and Hickman (2005); Cortés et al. (2010); Liu et al. (2013), deadheading (i.e. having an empty bus move directly to a designated bus stop) Furth (1985); Furth and Day (1985); Eberlein (1995); Eberlein et al. (1998); Liu et al. (2013), carefully engineering the bus routes and locations of the bus stops Tirachini (2014), using buses with wide doors to speed up boarding/alighting Jara-Díaz and Tirachini (2013); Stewart and El-Geneidy (2014); El-Geneidy et al. (2017), as well as a no-boarding policy where a “slow” bus only allows people to alight but disallows boarding Saw and Chew (2019).

I.1 No-boarding buses

In the no-boarding policy, the “slow” bus gets to speed up by saving time from otherwise getting stuck at the bus stop to pick up people, and allows the “fast” bus behind it (which would soon bunch into it, if no interference is carried out) to pick up these people instead — effectively slowing it down. Ref. Saw and Chew (2019) has worked out analytically, backed up by extensive simulations based on a real bus system, that the bus system as a whole would experience significant improvement in terms of reducing the overall average waiting time of people at the bus stop for a bus to arrive. The global improvement comes with minor local cost, however, as those denied boarding would doubtlessly have their own waiting times slightly extended. Nevertheless, the overall global gain far outweighs those minor local costs.

The purpose of this paper is to investigate the scenario where unlike in Ref. Saw and Chew (2019) with no-boarding mandatorily enforced by the bus system when a bus is deemed as “too slow”, here the people waiting to board at the bus stop are given the options of whether to cooperate or defect the no-boarding policy. Such a social situation is of immense interest to policymakers and bus operators because sometimes some people necessarily require service urgently and are willing to pay a premium for it — of course, provided that such an option is available in the first place. On top of that, certain weather conditions like thunderstorm, snow (in countries with winter), blazing heat, amongst others, may make waiting in the open bus stop undesirable. Besides, it is arguably less of a pain point to be on board the bus, seated and enjoying the air-conditioner albeit slowly moving, compared to waiting at the bus stop with uncertainty on when a bus would actually arrive.

Whilst allowing defections out of goodwill and compassion for the needy is certainly worthwhile, without a mechanism for checks and balances this may be subject to abuse, since everybody would instinctively like to board immediately instead of spending extra time waiting for the next bus. But if too many people defect, the system as a whole would fail to maintain its optimal configuration of buses, with bus bunching as a repercussion, defeating the original intention of the no-boarding policy. Such a situation is an example of the tragedy of the commons Hardin (1968).

I.2 Inductive reasoning and bounded rationality

To simulate people evaluating choices and making what they would individually perceive as their respective best action Bower and Hilgard (1981); Holland et al. (1989), we adopt the description of humans with inductive reasoning and bounded rationality presented by Ref. Arthur (1994) in studying the El Farol Problem. That description has subsequently been formalised into the minority game Challet and Zhang (1997, 1998); Savit et al. (1999); Cavagna (1999); Manuca et al. (2000); Moro (2004). Our mathematical representation of the decision-making people (or agents) is based on Ref. Challet and Zhang (1997), which we now describe.

When a bus announces that the no-boarding policy is activated, each person who would have been denied boarding in Ref. Saw and Chew (2019) is here given the option to cooperate (i.e. not board, and obediently wait for the next bus) or defect (i.e. defy the no-boarding rule and board anyway). The way they make their choices is determined as follows. We represent them as independent agents, each endowed with a number of different strategies and a memory $m$. These strategies are ideally distinct for different agents, since different people would behave differently with their own beliefs and ideas. Nevertheless, coincidentally similar strategies are allowed. In fact, if the number of agents is far greater than the number of available strategies (which is determined by $m$, i.e. $2^m$ strategies, see Refs. Challet and Zhang (1998); Savit et al. (1999) for discussions on this), then some agents must be sharing at least one strategy. The memory records the $m$ most recent past results of the winning choices, i.e. whether the cooperators or the defectors were the winners. Then, for each of the $2^m$ combinations of binary historical outcomes, a strategy specifies the next action of whether to cooperate or defect. A strategy is thus a set of $2^m$ maps, one for each of these possible combinations, to an element of {cooperate, defect} to be chosen in the next round. A different strategy would map each of those combinations into a possibly different element of {cooperate, defect}. The performance of each strategy is tracked based on its predictions and the actual outcomes, and a strategy gains a “virtual point” for correctly predicting the next outcome. These virtual points track how well each strategy is predicting the next outcome, regardless of whether it is used in making the actual choice. The current best performing strategy (i.e. the one with the most virtual points) is used to make the actual next decision. If multiple strategies are tied on virtual points, the one to use is decided randomly. Thus, this adaptivity allows agents to learn which of their strategies is the “current best” from their individual perspectives.
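The agent description above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `Agent`, the helper `next_history`, and the encoding of a history as an integer whose bits are the most recent winning actions are all hypothetical choices made here.

```python
import random

COOPERATE, DEFECT = 0, 1


class Agent:
    """Agent holding a few fixed lookup-table strategies (illustrative sketch)."""

    def __init__(self, memory, num_strategies, rng):
        self.rng = rng
        # Each strategy assigns an action to every one of the 2**memory histories.
        self.strategies = [
            [rng.choice((COOPERATE, DEFECT)) for _ in range(2 ** memory)]
            for _ in range(num_strategies)
        ]
        # One virtual-point counter per strategy.
        self.points = [0] * num_strategies

    def best_strategy(self):
        """Index of the strategy with the most virtual points, ties broken randomly."""
        top = max(self.points)
        tied = [i for i, p in enumerate(self.points) if p == top]
        return self.rng.choice(tied)

    def act(self, history):
        """Action prescribed by the current best strategy for this history."""
        return self.strategies[self.best_strategy()][history]

    def update(self, history, winning_action):
        """Reward every strategy that would have chosen the winning action."""
        for i, strategy in enumerate(self.strategies):
            if strategy[history] == winning_action:
                self.points[i] += 1


def next_history(history, winning_action, memory):
    """Append the latest winning action and keep only the last `memory` outcomes."""
    return ((history << 1) | winning_action) % (2 ** memory)
```

Note that `update` scores all strategies, including those not used for the actual choice, which mirrors the role of the virtual points described above.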

I.3 Winners are those in the minority group, collective cooperation, herding behaviour

Note that in Refs. Arthur (1994); Challet and Zhang (1997), the winning group (cooperators or defectors) are determined by the minority group, i.e. the side with fewer people. This rule presents itself with a natural feedback mechanism that never settles into any permanently desired group. If cooperators are the winners right now, then more defectors would like to be cooperators. But this would turn cooperators into the majority and defectors would become the winning group.

Such a minority feedback has led physicists to draw comparisons with physical systems possessing quenched disorder and phase transitions Savit et al. (1999); Challet and Marsili (1999); Manuca et al. (2000); Marsili et al. (2000); Challet et al. (2000); Sherrington and Kirkpatrick (1975); Moro (2004); Ghosh et al. (2012); Chakraborti et al. (2015). For instance, the minority game has different phases, where in one such phase the best strategies for a significant number of agents, respectively, are frozen, i.e. these are always their best respective strategies and these agents never switch into a different strategy. Ref. Challet and Marsili (1999) argued that this is akin to symmetry breaking of a spin system when the control parameter exceeds a critical value, analogous to spontaneous magnetisation. Such correspondences with physical systems have fruitfully opened up the use of statistical mechanical techniques being applied to the minority game Marsili et al. (2000); Challet et al. (2000); Sherrington and Kirkpatrick (1975), with applications even to financial markets Zhang (1998); Challet et al. (2001); Marsili (2001); Marsili and Piai (2002) and other problems on resource allocations Ghosh et al. (2012); Chakraborti et al. (2015).

A key revelation from research in the minority game is that there is a regime where the entire community exhibits incredible collectively cooperative behaviour even though each agent is only interested in self-gain. Apart from this surprising positive global behaviour, another astonishing result arises when the number of agents, $N$, is much greater than the number of available independent strategies, $2^m$: herding behaviour emerges, where many agents with similar strategies behave as a crowd and take the same action. This is highly undesirable as it leads to a skewed outcome, with a small minority group containing few winners. In terms of resource utilisation, this means that there are many people who could have benefited but missed out because they joined the “majority bandwagon”. Quantitatively, the variance from optimal utilisation per number of agents varies as a power law with respect to $2^m/N$ in the regime where $2^m/N$ is small [see Fig. 1(a) in Section 2].

The bus problem that we are considering is also a resource allocation optimisation problem, as the bus system strives to enhance the efficiency in serving commuters. The no-boarding policy essentially imposes a limit on the capacity, i.e. it is a bounded resource, which is being competed for by the waiting commuters at the bus stop.

I.4 Who are winners in cooperating or defecting the no-boarding bus?

For the bus system, on the other hand, there is no clear, obvious, or unique generalised meaning for cooperators or defectors being in a “minority group”. Does “fewer defectors compared to cooperators” imply that defectors “win”? Why should a naïve “fewer defectors compared to cooperators” allow defectors to declare victory? After all, a so-called successful implementation of the no-boarding policy aims to minimise defections, i.e. “zero defections is ideal”, from the point of view of the bus system. Hence, a key part of this paper is to define and formalise what it means for the cooperators or defectors to win, instead of just counting the numbers in each group and seeing which has fewer people. One important aspect of the original minority game was to optimise usage of the resource, i.e. whilst the minority group wins, the system as a whole would be considered as “optimal” if the wastage or deviation from the ideal capacity is minimised. Once again, it is not immediately clear what this would be for the bus system, or if such a notion is even applicable here.

We note that Ref. Cavagna (1999) found that the actual historical outcomes of winners are not crucial in cultivating the emergence of collective cooperation. Instead, any exogenous piece of information is sufficient to generate that community-wide learning. Therefore, as long as the bus system systematically decides who the winners are and this information is made clear to all agents (for instance, defectors get away and thus “win” this time, or they are all punished with a fine for defecting and so cooperators are “winners” at another time), this would probably also be true for the no-boarding bus where people can choose whether to cooperate or defect. In other words, winners being decided by the minority rule can be replaced by other winning criteria, whilst maintaining the key feature of the community’s collective cooperation. We will verify this in this paper.

I.5 More total agents than those who are actually playing each round

The classical minority game sets a fixed number of agents who all play against one another in every round. But in the real world, this is definitely not the case. Why should everybody always play each round? Some people may take a break or play only occasionally. In the original El Farol Problem Arthur (1994), it is arguable that a realistic situation may be that there are, say, 200 people, but on average only about 100 of them would actually consider whether or not to go to the bar and compete for the 60 available places, with the remaining people taking a break from playing.

In fact in the bus system, the actual number of people boarding a bus each time is not even fixed! The total number of people using the bus system is overwhelmingly more than the number of people boarding each time, with even fewer who are actually boarding when the no-boarding policy is activated. In view of this, the classical minority game needs to be extended to a situation where there are necessarily more agents in the overall pool of people than those who are actually playing each round; as opposed to a fixed number of agents who always play against each other every round.

The herding behaviour of the agents in the classical minority game is a major cause of concern for the bus system. If in some successive rounds a massive crowd decides to defect, this may slow down the bus too much and nullify the no-boarding policy — leading to bus bunching. But since the bus system is not quite like the classical minority game, we need to study it explicitly and find out what happens.

In the next section, we recapitulate the classical setup and features of the minority game, stating the key results as well as the behaviour of the agents in adapting to the optimal situation and how efficiency of resource usage depends on factors like number of agents and memory Challet and Zhang (1997, 1998); Savit et al. (1999); Manuca et al. (2000); Moro (2004). We also consider an “open” minority game, where only $N$ agents, randomly selected from a larger pool of total agents, actually play each round. There are other models where not every agent in the pool plays every round, for instance some agents only play when they receive enough information in a financial market Challet et al. (2001) or financial agents trade at different time scales Marsili and Piai (2002). Then in Section 3, we investigate several situations for the bus system where a bus allows defectors to go through “victoriously” when there are few of them, but is capable of correcting the situation and “punishes” defectors when the defection level becomes high. Interestingly, the agents with such inductive reasoning and bounded rationality are indeed capable of adapting to the rules and optimising according to different situations at a global scale.

II Classical minority game

Fig. 1(a) shows a classical result of the minority game, viz. a graph of the variance from the ideal scenario [where the minority group size is $(N-1)/2$] per number of agents, $\sigma^2/N$, versus $2^m/N$, where $m$ is the memory and $N$ is the number of agents Challet and Zhang (1997, 1998); Savit et al. (1999); Manuca et al. (2000); Moro (2004). We produce this graph by creating $N$ agents, each endowed with two strategies which map the past $m$ most recent winning outcomes to a next action to be taken, as described in the Introduction. Each of these agents plays for rounds, and measurements are taken in the last of the latter rounds to ensure that the transient part of the simulation has subsided. This entire simulation is repeated 100 times where the agents are randomly initialised with different strategies, and the average over these 100 runs is computed.

A game comprises $N$ agents, typically an odd number, where each agent selects one of two options based on their current best performing strategy. The minority rule determines the winning group (which is well-defined, since $N$ is odd). The so-called best configuration for the whole community is when there are $(N-1)/2$ agents selecting one choice, with $(N+1)/2$ agents selecting the other. The former is the winner, but the point here is that the number of winners is maximised, which is the notion of optimal use of the resource since the capacity of winners is $(N-1)/2$ in each round. These same agents play the game repeatedly, until the system settles into a state where the mean number of agents choosing one of the choices is about $N/2$, with variance that depends on $m$ and $N$, according to Fig. 1(a). In fact, Fig. 1(a) is a universal curve which holds true for any $m$ and $N$.
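The repeated game just described can be sketched as a short pure-Python simulation. This is a hedged illustration rather than the code behind Fig. 1: the function name `minority_game_variance`, the default round counts, and the choice to measure the deviation about $N/2$ are assumptions made here.

```python
import random


def minority_game_variance(n_agents, memory, rounds=2000, burn_in=1000,
                           n_strategies=2, seed=0):
    """Variance per agent of the attendance about n_agents / 2 (sketch).

    n_agents should be odd so that the minority side is always well-defined.
    """
    rng = random.Random(seed)
    n_hist = 2 ** memory
    # strategies[a][i][h] is the action (0 or 1) of agent a's strategy i at history h.
    strategies = [[[rng.randrange(2) for _ in range(n_hist)]
                   for _ in range(n_strategies)] for _ in range(n_agents)]
    points = [[0] * n_strategies for _ in range(n_agents)]
    history = rng.randrange(n_hist)  # shared public history, random start
    sq_dev = 0.0
    for t in range(rounds):
        actions = []
        for a in range(n_agents):
            top = max(points[a])
            # Use the best-scoring strategy, breaking ties at random.
            best = rng.choice([i for i, p in enumerate(points[a]) if p == top])
            actions.append(strategies[a][best][history])
        ones = sum(actions)
        winner = 1 if ones < n_agents - ones else 0  # minority side wins
        for a in range(n_agents):
            for i in range(n_strategies):
                if strategies[a][i][history] == winner:
                    points[a][i] += 1  # virtual point for a correct prediction
        if t >= burn_in:  # skip the transient part
            sq_dev += (ones - n_agents / 2) ** 2
        history = ((history << 1) | winner) % n_hist
    return sq_dev / (rounds - burn_in) / n_agents


# Hypothetical sweep over memory for a fixed, odd number of agents.
for m in (2, 4, 6):
    print(m, 2 ** m / 101, minority_game_variance(101, m))
```

Averaging this quantity over many random initialisations, as described above, would be the natural next step; the sketch runs a single realisation per call.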

Figure 1: (a) The universal curve in the classical minority game. (b) The proportion of agents who stick with one of their two strategies.

We summarise the known results and features of the classical minority game Challet and Zhang (1997, 1998); Savit et al. (1999); Manuca et al. (2000); Moro (2004):

  1. There are two phases separated by a critical value of $2^m/N$.

  2. For the phase where $2^m/N$ is below the critical value, there are relatively fewer available strategies compared to the number of agents. Therefore, it is inevitable that some or many agents share similar strategies. Consequently, clusters of agents with similar strategies behave like herds or crowds, and the community as a whole performs badly. The deviation from optimal resource usage is high. The increase in $\sigma^2/N$ as $2^m/N$ decreases is a power law.

  3. On the other hand, for $2^m/N$ above the critical value, there are relatively more strategies available compared to the number of agents. Thus, herding behaviour is avoided and the community as a whole behaves much better than the random choice game (i.e. where each agent makes a choice with probability $1/2$).

  4. As the memory gets larger, each agent tracks more information and becomes more complicated. This increased complexity turns out to make them behave more randomly. Thus, the universal curve approaches the random choice game asymptotically as $m$ grows, but the community is generally still performing better than the random choice game.

  5. As shown in Fig. 1(b), the proportion of agents who stick to one of their two strategies shows a peak in the region where the entire community behaves most cooperatively. That proportion is substantial, with about half of the total number of agents experiencing a frozen strategy.

  6. Fig. 1 shows the situation where each agent is endowed with two strategies. With more strategies per agent, the added complexity tends to diminish the region where the curve is below the random choice game in Fig. 1(a), and the universal curve approaches the random choice game as the number of strategies per agent increases.
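As a point of reference for the random choice game mentioned in the list above, the benchmark value of the variance per agent can be worked out directly. The short derivation below assumes that the deviation is measured about $N/2$ and writes $A$ for the number of agents choosing one particular option and $\sigma^2$ for the variance; these symbols are introduced here for illustration.

```latex
A \sim \mathrm{Binomial}\!\left(N, \tfrac{1}{2}\right), \qquad
\sigma^2 = \mathbb{E}\!\left[\left(A - \tfrac{N}{2}\right)^{2}\right]
         = N \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} = \frac{N}{4}, \qquad
\frac{\sigma^2}{N} = \frac{1}{4}.
```

Under these assumptions, a community performing "better than the random choice game" corresponds to a variance per agent below $1/4$.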

II.1 Open minority game

Figure 2: (a) The universal curve in the open minority game. (b) The proportion of agents who stick with one of their two strategies.

Let us now consider the situation where there is a pool of agents larger than the number $N$ who play in any one round. Each round, $N$ agents randomly selected from that pool participate in the minority game whilst the remaining agents stay out for that round. The rest of the details are identical to those in the classical minority game presented above. Fig. 2 displays our simulation results. Note the important differences between this open minority game and the classical minority game, where in the latter all agents always participate in the game during each round:

  1. The plots in Fig. 1 for the classical minority game are in log-log and semi-log scales, respectively, with the horizontal axis being with respect to $2^m/N$. The plots in Fig. 2 for the open minority game on the other hand are in the usual linear scale, with the horizontal axis being with respect to $m$ (independent of $N$).

  2. In Fig. 2(a), the variance per number of actual competing agents, $\sigma^2/N$, decreases with decreasing $m$, but not as significantly as that in the classical minority game. Most strikingly, there is no herding behaviour for regions of low $m$ in the open minority game, i.e. no power law increase in $\sigma^2/N$ as $m$ decreases beyond the critical value of $2^m/N$ and no phase transition.

  3. A universal curve in the open minority game appears to be between $\sigma^2/N$ and $m$, instead of $\sigma^2/N$ and $2^m/N$ for the classical minority game.

  4. From Fig. 2(b), the proportion of frozen agents who stick to one of their two available strategies in the open minority game is significantly lower than that in the classical minority game, with no peak comparable to the roughly one-half seen there. Nevertheless, it appears to slightly increase with decreasing $m$ where the efficiency in terms of resource usage is better (i.e. where $\sigma^2/N$ is smaller).

The absence of herding behaviour when $2^m/N$ is small should not be a surprise in the open minority game, in spite of the number of agents playing in each round, $N$, being much more than the available strategies, $2^m$. Each agent is not playing against the identical group of agents during each round but instead faces different agents whenever they play. Hence, there is no reason to expect the collective crowd or herding behaviour that exists in the classical minority game where they always face the same group of agents. Whilst agents “get used to each other” in the classical version, they persistently face random opponents in the open version, which continually alters their best strategies.
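The open variant differs from the classical game only in who plays each round, which the following sketch makes explicit. It is an illustration under assumptions rather than the code behind Fig. 2: the function name `open_minority_game_variance` is hypothetical, each agent is assumed to remember only the outcomes of rounds it played (a shared public history would be the other natural choice), and only the agents who play in a round update their virtual points.

```python
import random


def open_minority_game_variance(pool_size, n_players, memory,
                                rounds=2000, burn_in=1000, seed=0):
    """Variance per playing agent about n_players / 2 (sketch).

    n_players should be odd so that the minority side is always well-defined.
    """
    rng = random.Random(seed)
    n_hist = 2 ** memory
    # Two random lookup-table strategies for every agent in the pool.
    strategies = [[[rng.randrange(2) for _ in range(n_hist)] for _ in range(2)]
                  for _ in range(pool_size)]
    points = [[0, 0] for _ in range(pool_size)]
    # Assumption: every agent keeps a private history of the rounds it played.
    history = [rng.randrange(n_hist) for _ in range(pool_size)]
    sq_dev = 0.0
    for t in range(rounds):
        # A fresh random subset of the pool plays this round.
        players = rng.sample(range(pool_size), n_players)
        actions = {}
        for a in players:
            if points[a][0] == points[a][1]:
                best = rng.randrange(2)  # break ties at random
            else:
                best = 0 if points[a][0] > points[a][1] else 1
            actions[a] = strategies[a][best][history[a]]
        ones = sum(actions.values())
        winner = 1 if ones < n_players - ones else 0  # minority side wins
        for a in players:  # only this round's players learn from it
            for i in range(2):
                if strategies[a][i][history[a]] == winner:
                    points[a][i] += 1
            history[a] = ((history[a] << 1) | winner) % n_hist
        if t >= burn_in:  # skip the transient part
            sq_dev += (ones - n_players / 2) ** 2
    return sq_dev / (rounds - burn_in) / n_players


# Hypothetical sizes: 101 of 1001 pooled agents play each round.
print(open_minority_game_variance(1001, 101, memory=3))
```

Setting `pool_size` equal to `n_players` recovers a game in which everybody plays every round, which makes this a convenient way to compare the two settings within one code path.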

The open minority game is a more realistic description of real-world systems like people boarding a bus, than the classical version. For example, different groups of people would board the bus each time, and each person faces a different group of people each time.

III No-boarding buses with inductive reasoning and bounded rationality

Let us now move on to deal with the problem of interest, viz. a bus system with a no-boarding policy. Our simulation environment for a bus loop system is based on that developed in Ref. Saw and Chew (2019), with parameters tuned from a real university campus bus loop service Saw et al. (2019); Saw and Chew (2019). A simplified setup with realistic parameters is to consider two buses serving one bus stop in a loop. We let both these buses move with a natural period of minutes (excluding time stopped at the bus stop), and people would arrive at the bus stop at a rate of person every seconds. Consequently, each bus would pick up an average of people each time. At the bus stop, a bus would first allow people to alight, before boarding new passengers. The people load/unload at a rate of person per second.

As it is, the two buses would quickly bunch into one single unit. In this case, the average waiting time of people at the bus stop for a bus to arrive is about minutes and seconds. To prevent this, a “slow” bus would implement the no-boarding policy, i.e. it only allows alighting and then leaves, if its phase difference as measured from the bus behind it (or the bus immediately behind it if there are more than two buses) becomes less than some critical value. Note that since the buses go in a loop, one can map this loop isometrically to the unit circle where the notion of a phase on the circle ( to ) is well-defined, and we can speak of the phase difference between two buses on this unit circle. For our simulations in this paper, we set so that if the phase difference between two buses gets smaller than that, the leading bus (which is “too slow”) would disallow boarding at the bus stop, leaving the people there to be picked up by the trailing bus (to slow it down, since it is “too fast”).
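The activation rule above amounts to a comparison of phases on a circle, which can be sketched as follows. The function names `phase_difference` and `no_boarding_active` are hypothetical, phases are assumed here to be measured in radians from $0$ to $2\pi$, and the critical value in the usage lines is an arbitrary placeholder rather than the value used in the simulations.

```python
import math


def phase_difference(leading, trailing):
    """Phase by which `leading` is ahead of `trailing`, wrapped into [0, 2*pi)."""
    return (leading - trailing) % (2 * math.pi)


def no_boarding_active(leading, trailing, critical):
    """True if the leading bus is too close to the bus behind it.

    When this returns True, the leading ("slow") bus only lets people alight.
    """
    return phase_difference(leading, trailing) < critical


# Two evenly spaced buses are half a loop apart, so the policy stays off
# for any critical value below pi (placeholder value used here).
print(no_boarding_active(math.pi, 0.0, critical=2.0))
```

The modulo operation handles the wrap-around case where the leading bus has just passed phase zero while the trailing bus is still approaching it.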

Ref. Saw and Chew (2019) has established significant improvements due to the no-boarding policy in preventing bus bunching and dramatically reducing the waiting time of people at the bus stop for a bus to arrive. With this setup where , the average waiting time is only about minutes and seconds, an improvement of almost from the situation without the no-boarding policy. Instead of mandatorily enforcing the no-boarding policy when the phase difference drops below , here we allow the people waiting to board to choose whether to cooperate or defect the no-boarding policy, when it is activated. Each person who would normally be allowed to board would decide for themselves, based on the inductive reasoning and bounded rationality described above. Then those who decide to defect would proceed to board as usual, whilst those who decide to cooperate would remain at the bus stop and wait for the next bus.

In this setup, the actual number of people who have to “play the no-boarding game” (i.e. who are faced with no-boarding, but given the choices to cooperate or defect) varies each time, from as low as person up to occasionally nearly people. The mean number is around to people (Table 1, below), depending on the actual learning and behaviour of the agents (e.g. their memory $m$). These people who face the no-boarding policy are the excess from the average of people to be picked up by each bus, who are accumulated at the bus stop due to the bus being “slow”. We create a pool of agents, and each round only a subset of these agents “play the no-boarding game”. This mimics the real situation where different subsets of the population board the bus at different times, with an even smaller subset who actually have to face the no-boarding policy. Nevertheless, we keep the overall pool to the relatively small number of agents, to avoid excessively lengthy training runs of the simulations required to weed out the transient part in order to allow the agents to learn their best respective strategies.

For a criterion on determining the winning group, suppose the bus system allows a fixed number of defectors to board when the no-boarding policy is implemented during that stop. This number is arbitrarily decided by the bus system: Perhaps it could set a larger limiting number during lull times when the pressure on bus bunching is weaker and a smaller limiting number during busy times when a bus would have to stop longer to serve more passengers. If the number of defectors is within this limit, then they get away without any punishment. In this sense, the defectors are deemed as winners, whilst the cooperators are losers since they apparently “wasted their time for nothing” when they obeyed and waited for the next bus. On the other hand, if the number of defectors exceeds the prescribed limit, then all those who defected that round are punished and charged a (possibly hefty) fee whilst the cooperators of that round are given a rebate for obediently following the rule. This information on how many defectors are allowed each time, however, is not announced to the agents as it is only meant for the bus system to decide on the winning group. Therefore, each agent has to individually weigh the pros and cons of defecting the no-boarding rule, if it is activated:


Cooperating is incentivised by possibly being rewarded with a rebate in exchange for the extra waiting time for the next bus, if there are too many defectors. But a cooperator may waste their time for nothing, if there are too few defectors.


Defecting is incentivised by saving time and possibly getting away with it, if there are not too many defectors. But a defector risks incurring a fine, if there are too many defectors.
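The winning criterion described above can be written as a small function. This is a minimal sketch: the name `judge_round` and the string labels are hypothetical, and the rule is read literally, so a round with no defectors at all still counts as a round in which defecting would have been the winning choice.

```python
def judge_round(choices, max_defectors):
    """Decide the winning action for one activation of the no-boarding policy.

    choices: list of "cooperate" / "defect" strings, one per waiting agent.
    Returns the winning action and a per-agent list of win flags.
    """
    n_defectors = sum(1 for c in choices if c == "defect")
    # Defectors get away if they stay within the limit; otherwise they are
    # fined and the cooperators receive the rebate.
    winner = "defect" if n_defectors <= max_defectors else "cooperate"
    return winner, [c == winner for c in choices]


print(judge_round(["defect", "cooperate", "cooperate"], max_defectors=2))
print(judge_round(["defect", "defect", "defect", "cooperate"], max_defectors=2))
```

Since `max_defectors` is never revealed to the agents, the only signal they receive from this function is the per-agent win flag.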


We shall set various fixed limits: a maximum of 2, 3, 5, 8, or 10 defectors, and test this in our simulations one at a time, with the bus system running for revolutions around the loop. Furthermore, we also run control simulations where instead of modelling agents with inductive reasoning and bounded rationality, all agents behave randomly, i.e. they flip a fair coin to decide whether to cooperate or defect the no-boarding rule. Each case is repeated times to obtain the average. Fig. 3 summarises our simulation results.

Figure 3: Various fixed limits of 2, 3, 5, 8, 10 allowed defectors, respectively: (a) Mean number of defectors versus memory. (b) The variance from maximum utilisation of the defection capacity versus memory.

Additionally, we also let the limiting number of defectors be variable: The maximum number of defectors the next time the no-boarding policy is activated is taken as the actual number of defectors the previous time the no-boarding policy was activated. This removes the arbitrary fixing of the limiting number of defectors, replacing it with the actual number of defectors in the last round. Fig. 4 shows the plots corresponding to Fig. 3 for this variable case.
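The variable limit can be sketched as a one-line update between rounds. The function name `run_variable_limit` is hypothetical, and `initial_limit` is an assumption introduced here because the very first activation has no previous round to refer to.

```python
def run_variable_limit(defector_counts, initial_limit):
    """Winning action per round when the limit is last round's defector count.

    defector_counts: number of defectors observed at each activation, in order.
    initial_limit: assumed limit for the first activation (not from the paper).
    """
    limit = initial_limit
    winners = []
    for n_defectors in defector_counts:
        winners.append("defect" if n_defectors <= limit else "cooperate")
        # The next activation is judged against what actually happened now.
        limit = n_defectors
    return winners


print(run_variable_limit([3, 2, 4, 4], initial_limit=2))
```

With this rule defectors win whenever the defection count does not rise from one activation to the next, so the limit tracks the community's own recent behaviour.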

Figure 4: The limiting number of defectors is set as the actual number of defectors in the previous round: (a) Mean number of defectors versus memory. (b) The variance from maximum utilisation of the defection capacity versus memory.

In each case where defection is allowed, the bus system reasonably maintains its performance where the two buses do not bunch. The allowance of defection only slightly increases the global average waiting time of people at the bus stop for a bus to arrive. Our simulation results show that agents behaving with inductive reasoning and bounded rationality are capable of adapting to the winning criteria, regardless of what those criteria are. For each maximum number of defectors set by the bus system, the agents are able to co-evolve their best strategies such that the mean number of defectors each round the no-boarding policy is activated approaches the maximum number set by the bus system, in spite of the fact that the agents themselves do not actually know what the limit is! The only information that each agent receives and remembers is the last $m$ outcomes that they experienced, which they are able to track by themselves since they know whether they won or lost (viz. getting a rebate or not when cooperating; paying a fine or not when defecting). Yet the entire community optimises the variable resource usage, with lower memory being more effective (lower variance from maximum utilisation) since larger $m$ increases the complexity of the agents’ strategies and they approach the random choice game. The simulation results also clearly show the character of the open minority game, with no herding behaviour since each agent continually faces different groups of agents, which alters their best strategies.

Allowed defectors No. of defectors No. of agents % defectors Increase in mean waiting time
or seconds
or seconds
or seconds
or seconds
or seconds
Previous number or seconds
Random choice game or seconds
Table 1: The case where the memory of the agents is , for the various numbers of allowed defectors set by the bus system: The mean number of defectors, mean number of agents given the choices to cooperate or defect the no-boarding policy, mean proportion of defectors, and the increase in mean waiting time as compared to the case where the no-boarding policy is mandatory. The last line is where each agent randomly decides to cooperate or defect.

Note also that by adapting to the limiting number of defectors set by the bus system, the entire community is capable of co-evolving its defection rate. For example with (Table 1), when only defectors are allowed, there are people, on average, given the choice to cooperate with or defect from the no-boarding policy, with about defectors on average, i.e. a defection rate of . On the other hand, when the bus system allows defectors, there are people, on average, given the choice to cooperate with or defect from the no-boarding policy, with defectors on average, i.e. a defection rate of . Allowing more defectors slightly increases the global average waiting time, which slightly slows down the “slow” bus and thus slightly increases the number of excess people who face the no-boarding policy. This is why allowing more defectors increases the average number of agents who “play each round”. Table 1 shows that the agents are able to dynamically adjust the community’s defection rate to the various winning criteria, not just to the rate of defection that would arise from the minority rule for deciding the winning group. The adaptability weakens with increasing memory, since the agents’ strategies become more complex and the agents behave more randomly, just as in the classical and open minority games, where the variance from ideal utilisation grows towards that of the random choice game. The increases in mean waiting time are less than seconds in all cases, compared to minutes and seconds where no-boarding is mandatorily implemented with no option to defect. This is only a small cost, but it gives people with an urgent need to board the option to do so.
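The quantities reported in Table 1 and in Figs. 3 and 4 can be computed from per-round counts along the following lines. This is a sketch with made-up numbers, not the paper's data. It assumes that the defection rate is the mean number of defectors divided by the mean number of agents who play, and that the "variance from maximum utilisation" is the mean squared deviation of the number of defectors from the limit. Both definitions are assumptions, as are the function names.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def summarise_rounds(defectors, agents, limit):
    """Summarise per-round defector and participant counts against a fixed limit."""
    mean_defectors = mean(defectors)
    mean_agents = mean(agents)
    return {
        "mean_defectors": mean_defectors,
        "mean_agents": mean_agents,
        # Assumed definition: ratio of the two means.
        "defection_rate": mean_defectors / mean_agents,
        # Assumed definition: mean squared deviation from the limit.
        "variance_from_limit": mean([(d - limit) ** 2 for d in defectors]),
    }


if __name__ == "__main__":
    # Hypothetical counts for three activations of the policy.
    print(summarise_rounds(defectors=[2, 4, 3], agents=[10, 12, 8], limit=3))
```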

IV Concluding remarks

Applying the inductive-reasoning and bounded-rationality description of agents to the no-boarding buses suggests that, given that the no-boarding policy improves the efficacy of the bus system Saw and Chew (2019), allowing a fixed number of defectors each time the policy is activated is sustainable: the agents are able to adapt to making use of the available resource, without too much drawback on the bus system’s performance. It is also a fitting example of an “open” system where only a subset of the overall pool of agents actually plays the “game” in each round, hence there is no emergence of herding behaviour even though many agents may be sharing similar strategies. This is because in such an “open” system, each agent faces different groups of opponents each time and therefore continually co-evolves his best strategies. Furthermore, we see similar features to the open minority game presented in this paper, even though the winning criterion for the bus system is not determined by the group with the smaller number of people.

Crucially, the absence of herding behaviour assures the feasibility of the no-boarding policy with allowance for defections. The variance from the prescribed limiting number of defectors is smaller for a group of agents with inductive reasoning and bounded rationality than for a group of randomly behaving agents (Figs. 3 and 4), and does not blow up [unlike in the classical minority game when , Fig. 1(a)]. This implies that the number of defectors hovers reasonably near the prescribed limit, and that huge crowds of people deciding to defect are unlikely to form. Hence, the bus system is protected from surges of defectors that may slow down the “slow” bus to such an extent that bus bunching occurs, nullifying the intention of the no-boarding policy. This, however, may not be the case if the same group of agents always plays repeatedly with each other as in the classical minority game, which does possess the herding regime.

In terms of a physical system like a spin glass, the classical minority game corresponds to all atoms in the spin glass contributing to the overall state at each time step. In the open minority game that we presented here, the correspondence would be that only a subset of these atoms contributes to the overall state at any time step, with the rest somehow shielded off and momentarily not participating in the interaction. Although this is clearly not how a spin glass behaves, in other kinds of systems like a financial market, it is arguable that only a subset of the overall pool of market players is active, with the others attending to other business and only being active at other times Challet et al. (2001); Marsili and Piai (2002). Thus, the distinction between the classical and open minority games is important, as we have shown that the properties of these games are different.
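The structural difference between the classical and open games can be made concrete with a small sketch of who participates in each round. It assumes, purely for illustration, that the players of an open round are drawn uniformly at random from the pool. In the bus system the subset is instead determined by who happens to be waiting when the policy is activated. The function names are hypothetical.

```python
import random


def participants(pool_size, group_size, rng):
    """Indices of the agents who play this round."""
    if group_size == pool_size:
        # Classical game: every agent plays every round.
        return list(range(pool_size))
    # Open game (assumed uniform sampling): a fresh random subset plays.
    return rng.sample(range(pool_size), group_size)


def overlap_fraction(pool_size, group_size, rounds, seed=0):
    """Mean fraction of players shared between consecutive rounds."""
    rng = random.Random(seed)
    previous = set(participants(pool_size, group_size, rng))
    shared = []
    for _ in range(rounds):
        current = set(participants(pool_size, group_size, rng))
        shared.append(len(previous & current) / group_size)
        previous = current
    return sum(shared) / len(shared)


if __name__ == "__main__":
    print("classical:", overlap_fraction(pool_size=10, group_size=10, rounds=20))
    print("open:     ", overlap_fraction(pool_size=100, group_size=10, rounds=200))
```

In the classical case the overlap between consecutive rounds is complete, whereas in the open case each agent mostly meets new opponents, which is the feature the argument above relies on.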

Acknowledgements.
This work was supported by the Joint WASP/NTU Programme (Project No. M4082189).

References