1 Introduction
In the past 20 years most of the world’s major financial markets have seen a sharp rise in the level of automated trading, with many human traders being replaced by adaptive algorithmic “robot traders” at the point of execution. Although this has been a significant shift, affecting both patterns of employment and the dynamics of the markets concerned, it can plausibly be argued that at a macro-level little has changed: these major markets are still populated by traders working on behalf of major financial institutions such as investment banks or fund-management companies; the difference is just that now those institutions are represented in the markets not by teams of human traders but by teams of robots. To be more precise, it is more often the case that within any one institution entire teams of human traders have been replaced by a single monolithic automated trading system that does the work previously performed by tens of hard-working human traders.
The success or failure of any one automated trading system is determined primarily by how much profit it can generate, but underlying that simple observation is a circularity. In any realistic market scenario, the profitability of a given robot trader $R_1$ will be determined at least in part by the extent to which its actions in the market are well-tuned to the likely reactions of other traders in that market. Thus, in contemporary markets, $R_1$ is likely to be designed to adapt its trading behavior to the current market circumstances, and yet those circumstances are significantly determined by the behavior of other traders in the market, most of which are robots $R_2$, $R_3$, and so on, each of which is itself adapting to the circumstances it experiences in the market, which are to some extent influenced by the actions and reactions of $R_1$.
In the natural world, in the Darwinian survival-of-the-fittest interactions among evolving species of organisms, exactly this kind of circular interaction and dependency is commonplace. Just as the profit-driven adaptive trading behavior of robot $R_1$ can be affected by the profit-driven adaptive trading behavior of $R_2$, and vice versa, so the reproductive fitness of a predator animal (a cheetah, say) is determined to some extent by how well adapted it is to catching its prey, and the reproductive fitness of individuals in its prey species (antelopes, say) is starkly dependent on the extent to which they are adapted to evade being caught by their predators. If a mutation in the predator species gives rise to individuals that can run faster for longer when chasing prey, perhaps that will subsequently be countered by the prey species evolving to turn more sharply or to jump higher or farther than the predator can deal with. Similarly, if the prey species happens to evolve sharper eyesight so they can better see the predator coming, perhaps the predator species will then evolve to exhibit stealthier ways of tracking their food. This circular arms-race dynamic, where evolutionary adaptations in one species $S_1$ are driven by the current distribution of genes in one or more other species $S_2$, $S_3$, and so on, and where in turn evolutionary adaptations in one or more of those other species $S_2$, etc. are driven by the current distribution of genes in species $S_1$, is known technically as coevolution. Theoretical biologists have studied coevolution for many years, and have developed various game-theoretical analyses that give insights on the dynamics of the arms-races between competitively coevolving species (see e.g. [33, 28, 49, 25]).
In this paper we report on empirical simulation studies for which the starting-point draws direct inspiration from those theoretical biology studies of coevolutionary dynamics. Our motivation here is to gain insights into the practical extent to which the various adaptive trading systems in a market are affecting each other, and specifically to investigate whether the population of adaptive traders is ever likely to converge on a situation where all of the traders are well-adapted to each other’s behavior yet each trader is not as profitable as it could otherwise be. That is: could the competitive interactions and adaptations of traders in the market collectively converge on a stable set of trading behaviors that are suboptimal? And, if so, can we recognize when that has happened, or when it is about to happen? Similarly, might we be able to identify when the coevolutionary dynamics are about to lead to a flash-crash? We have commenced a sequence of empirical studies, starting with minimal but realistic simulations that are principled approximations of present-day highly automated financial markets.
Specifically, our ultimate aim has been to create agent-based models (ABMs) involving agents where each agent represents one financial legal entity (i.e. either an individual independent trader or an institution such as a bank or fund-management company) operating a single profit-driven automated trading system that trades in competition with the other agents, in an electronic market operating a continuous double auction (CDA) with a limit order book (LOB: see e.g. [24, 37, 1]), which is the present-day situation in many of the world’s major financial markets. Each entity can in principle be adapting its strategy/behavior in real-time (e.g. using a machine learning mechanism) but is not required to do so. That is, an entity’s trading strategy can be non-adaptive if that is the more profitable option. Furthermore, at any time the entity can elect to totally change the strategy that it is operating, modelling the case where a financial institution switches a trading algorithm that has previously been in development and testing (commonly referred to as a dev algo) into full use, the dev algo replacing the previously-running production algorithm (commonly known as the prod algo). Thus each trading entity in our model internally maintains a minimum of two strategies, each of which could potentially be adaptive: a prod algo and a dev algo. When the agent’s dev algo replaces the prod, a new dev algo is created and is subsequently tested and refined until there is sufficient evidence that it is an improvement on the agent’s current prod algo, at which point the dev again replaces prod and then another new dev is created. The tradeoff between exploiting the prod algo and exploring the dev algo has manifest links to studies of multi-armed bandit problems (see e.g. [36, 31, 42]). We report here on the construction of two simulation models of this kind of system and on results from hundreds of thousands of simulated market sessions.
Simulation modelling of financial markets very often involves populating a market mechanism with some number of trader-agents: autonomous entities that have “agency” in the sense that they are empowered to buy and/or sell items within the particular market mechanism that is being simulated. This approach, known as agent-based computational economics (ACE: see e.g. [26]), has a history stretching back for more than 30 years, and much of the work in ACE studies of trading behaviours in models of financial markets owes a clear intellectual debt to work in experimental economics as pioneered by Vernon Smith (see e.g. [43, 44, 27, 45, 38]).
Over the multi-decade history of ACE, a small number of specific trader-agent algorithms, i.e. precise mathematical and algorithmic specifications of particular trading strategies, have been frequently used for modelling various aspects of financial markets, and the convention that has emerged is to refer to each such strategy via a short sequence of letters, reminiscent of a stock-market ticker-symbol. Notable trading strategies in this literature include (in chronological sequence): SNPR [41], ZIC [23], ZIP [4], GD [22], RE [18], MGD [48], GDX [47], HBL [21], and AA [52]; several of which are explained in more detail later in this paper. Of these, ZIC (invented by the economists Gode & Sunder [23]) is notable for being both highly stochastic and extremely simple, and yet it gives surprisingly human-like market dynamics; GD and ZIP were the first two strategies to be demonstrated as superior to human traders, a fact established in a landmark paper by IBM researchers [12] (see also [14, 15, 16]), which is now commonly pointed to as initiating the rise of algorithmic trading in real financial markets; and until very recently AA was widely considered to be the best-performing strategy in the public domain. ZIC was the first instance of a zero-intelligence trading strategy, a class of strategies that has proven to be surprisingly useful in ACE research: see, e.g., [19, 30]. With the exception of SNPR and ZIC, all later strategies in this sequence are adaptive, using some kind of machine learning (ML) or artificial intelligence (AI) method to modify their responses over time, better-fitting their trading behavior to the specific market circumstances that they find themselves operating in, and details of these algorithms were often published in major AI/ML conferences and journals.
The supposed dominance of AA has recently been questioned in a series of publications [51, 7, 46, 40, 9] which demonstrated AA to be less robust than was previously thought. Most notably, [40, 9] report on trials where AA is tested against two minimally simple algorithms that each involve no AI or ML at all: these two strategies are known as GVWY and SHVR [5, 6], and each shares the pared-back minimalism of Gode & Sunder’s ZIC mechanism. In the studies that have been published thus far, depending on the circumstances, it seems (surprisingly) that GVWY and SHVR can each outperform not only AA but also many of the other AI/ML-based trader-agent strategies in the set listed above. Given this surprising recent result, there is an appetite for further ACE-style market-simulation studies involving GVWY and SHVR. One compelling issue to explore is the coevolutionary dynamics of markets populated by traders that can choose to play one of the three strategies from GVWY, SHVR, and ZIC, in a manner similar to that studied by [53], who employed replicator dynamics modelling techniques borrowed from theoretical evolutionary biology to explore the coevolutionary dynamics of markets populated by traders that could choose between SNPR, ZIP, and GD; each trader playing their chosen strategy for as long as it seems (to that trader) to be the most profitable strategy, and occasionally switching to (or “replicating”) one of the other two strategies in the set if the current strategy appears (to that trader) to be weak. This replicator dynamics approach was also used in [52] to argue that AA was dominant over prior leading strategies, and in [51] to demonstrate that AA could in fact be dominated by other strategies.
Replicator dynamics studies are typically limited to visualising and analysing the coevolutionary dynamics of simple, restricted systems where the restrictions are introduced to constrain the systems in such a way that they can be easily visualised and analysed. For instance, replicator dynamics studies often involve studying a population of agents that can switch between two, three, or at most four distinct pure strategies, and this decision often seems driven by the fact that visualisation of the dynamics, characterising the entire system dynamics, is often best done by reference to the system’s phase space, i.e. to plot some factor of interest for every possible state of the system. Let $S$ be the set of distinct pure strategies that the agents in our system can choose between, let $N_S = |S|$, and let $s_i$ refer to the $i$th of those strategies. Also let $N_A$ be the number of agents in the system, each of which makes a choice of some $s_i \in S$. Such a system can be characterised in full, all possible points in its finite phase space enumerated and plotted, by considering each possible combination of allowable strategy choices or assignments made by the population of agents: if all the agents have the same choice, and each can choose any of the $N_S$ strategies, then the number of possible system states, the number of points in its phase space, is $N_S^{N_A}$, a number that may grow large but will forever be finite.
When $N_S = 2$, the system phase space can be characterised as points on a line, spanning from all agents playing $s_1$, through to a 50:50 mix of $s_1$:$s_2$, to all agents playing $s_2$. When $N_S = 3$, the phase space can be characterised and visualised as points on the 2D unit simplex, an equilateral triangle where a point within or on the perimeter of the triangle represents a particular ratio of $s_1$:$s_2$:$s_3$, plotted in a barycentric coordinate frame. Technically, the one-dimensional (1D) line used for the phase-space of an $N_S = 2$ system is a 2D unit simplex; the 3D unit simplex is a 2D triangle; and then the 4D unit simplex is a 3D object, a tetrahedron, the volume bounded by four planar faces each being an equilateral triangle. Higher-dimensional simplices are mathematically well-formed objects, but they are devils to visualise: try plotting the 40D unit simplex. Although the original authors do not explicitly state their reasons, it seems reasonable to conclude that each of [53, 52] and [51] chose to study replicator dynamics systems in which $N_S = 3$ and not any higher number because of the rapidly escalating difficulty of visualising the phase space for any higher value. Yet real-world markets do not involve all entities each selecting from a choice of two or three pure trading strategies, so there is then a major concern over the extent to which these studies adequately capture the much richer degree of heterogeneity in real-world markets: this brings to mind the old adage about the late-night drunkard looking for his lost house-keys under a streetlamp not because that is where he mislaid them, but because the light is better there.
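As a minimal sketch of the counting argument and of the barycentric plotting convention described above, the following Python fragment computes the number of agent-by-agent strategy assignments and maps a three-strategy population mix onto a point in an equilateral triangle. The function names, the argument names, and the choice of triangle corners are our own assumptions for illustration; they are not taken from [53, 52, 51].

```python
import math

def count_assignments(n_strategies: int, n_agents: int) -> int:
    """Number of ways n_agents can each independently pick one of n_strategies."""
    return n_strategies ** n_agents

def barycentric_to_xy(f1: float, f2: float, f3: float) -> tuple:
    """Map a three-strategy mix f1:f2:f3 to a point in an equilateral triangle.

    Assumed corner layout (illustrative only):
      all-strategy-1 -> (0, 0), all-strategy-2 -> (1, 0),
      all-strategy-3 -> (0.5, sqrt(3)/2).
    """
    total = f1 + f2 + f3
    f1, f2, f3 = f1 / total, f2 / total, f3 / total  # normalise the ratio
    x = f2 + 0.5 * f3
    y = (math.sqrt(3) / 2) * f3
    return (x, y)

if __name__ == "__main__":
    # Three strategies and 50 agents already give an astronomically large count.
    print(count_assignments(3, 50))
    # An equal three-way mix lands at the centroid of the triangle.
    print(barycentric_to_xy(1, 1, 1))
```

The count grows exponentially in the number of agents, which is one reason why replicator-dynamics plots summarise the population by its strategy ratios rather than by individual assignments.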
So, although one way of studying coevolutionary dynamics in markets where the traders can choose to either deploy GVWY, SHVR, or ZIC is to give each trader a discrete choice of one from that set of three strategies, so at any one time any individual trader is either operating according to GVWY or SHVR or ZIC, it is appealing to instead design experiments where the traders can continuously vary their trading strategy, exploring a potentially infinite range of differing trading strategies, where the space of possible strategies includes GVWY, SHVR, and ZIC. This is made possible by the recent introduction of a new minimal-intelligence trading strategy called PRZI [8]. PRZI’s trading behavior is determined by a strategy parameter $s \in [-1, +1] \subset \mathbb{R}$. When $s = 0$, the trader behaves identically to ZIC, and when $s = \pm 1$ it behaves the same as GVWY or SHVR. And, crucially, when a PRZI trader’s $s$ value is some other value, either partway between $-1$ and $0$ or partway between $0$ and $+1$, its trading behavior is a kind of hybrid, partway between that of ZIC and SHVR, or partway between ZIC and GVWY. Because the PRZI strategy-parameter is a real number, and its effect on the trading behavior is smooth and continuous, in principle any one PRZI trader can make microscopically small adjustments and hence the space of possible strategies available to a single PRZI trader is infinite, and the phase-space of a market of $N_A$ agents is a bounded volume within $\mathbb{R}^{N_A}$.
In Section 2 we discuss our experiences in working with populations of coevolving PRZI traders, where we immediately come up against the limits of applicable visualisation techniques for this type of dynamical system. While markets of PRZI traders allow for continuous and infinite heterogeneity in the population of agents, the bounded nature of the PRZI strategy-space is a limitation that reduces the realism of the model. To address this, we have commenced work on an unboundedly infinite system, where each coevolving trader’s strategy can in principle grow to be arbitrarily complex and sophisticated (that is, in principle they can be anything that is expressible as a program in a Turing-complete list-based functional programming language), which we discuss in Section 3. For all our simulation studies reported here, we use the BSE simulator of a CDA market with a LOB (see [5, 6]), a mature open-source platform for ACE studies of electronic markets with automated trading.
2 Coevolution in a Bounded Infinite Space: PRZI
Full details of our initial work with coevolving populations of PRZI traders are given in [2], of which this section is only a very brief summary.
As a first illustration, we set up a minimal coevolutionary system, one in which only two of the traders could change their strategy by altering their PRZI $s$-value. Let’s refer to these two traders as $T_B$ and $T_S$: the two are independent, so $T_B$ can set its strategy value $s_B$ regardless of the value $s_S$ chosen by $T_S$, and vice versa. We set $T_B$ to be a buyer, and we set $T_S$ to be a seller and hence, because any seller in the market needs to find a buyer as a counterparty and vice versa, the profitability of $T_B$’s choice of $s_B$ will be partially dependent on $T_S$’s choice of $s_S$, and vice versa. We take the natural step of treating profitability as ‘fitness’ in the evolutionary sense, and hence this system is as simple as we can get while still being coevolutionary.
For the adaptation process, each adaptive trader operates a simple Adaptive Climber (AC) algorithm defined in [2], which echoes the dev/prod development cycle discussed in the previous section: the trader maintains two separate strategies, two different PRZI $s$-values, referred to as $s_{\text{prod}}$ and $s_{\text{dev}}$. $s_{\text{prod}}$ is initially set to some value, and $s_{\text{dev}}$ is set to a ‘mutated’ version of $s_{\text{prod}}$, by adding a small random value (e.g. a sample from a uniform distribution over a narrow range). The AC method executes some number $n$ of trades using strategy $s_{\text{prod}}$ and then executes $n$ trades using strategy $s_{\text{dev}}$. After that, if the profitability of $s_{\text{prod}}$ is greater than that of $s_{\text{dev}}$ then the trader generates a new $s_{\text{dev}}$; but if the profitability of $s_{\text{dev}}$ exceeds that of $s_{\text{prod}}$ then $s_{\text{dev}}$ is used to replace $s_{\text{prod}}$, and then a new $s_{\text{dev}}$ is generated as a mutant value of the new $s_{\text{prod}}$. That is, AC is a minimally simple two-point stochastic hill-climber algorithm.
Figure 1 shows a quiver plot of the phase space for an instance of this system, in which initial values of $s$ are set at random from a uniform distribution for all traders. One of the two adaptive traders is designated as a buyer with strategy $s_B$, and the other as a seller with strategy $s_S$. Both the buyer and the seller can adjust their strategy value over time, using the AC method just described. The horizontal axis is the buyer’s $s_B$ value and the vertical is the seller’s $s_S$, and these two continuous values define the system’s phase-space, i.e. $(s_B, s_S) \in [-1, +1]^2$. Uniform-length vectors have been plotted at regular intervals giving a discrete grid that indicates the system’s direction of travel in phase-space. The phase-space has a single point-attractor, the point of convergence marked by a red dot, and an obvious plateau area close to the origin: within the plateau area, the system will exhibit random drift, and will eventually step outside the plateau; once outside the plateau, the system evolves toward the attractor.
The 2D quiver plot in Figure 1 is made possible because we constrained our system to only have two adaptive traders. As soon as we relax that constraint and have all $N_A$ agents in our system adapting and coevolving against all the others, we need to make an $N_A$-dimensional plot. Given that we routinely use values of $N_A$ of 50 or more, and that 50-dimensional quiver-plots are not easy to plot or understand, this mode of visualization runs out of steam as soon as $N_A$ gets to plausibly interesting numbers.
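The two-point hill-climbing loop of the Adaptive Climber described earlier in this section can be sketched in a few lines of Python. This is an illustrative reading of the prose description, not the implementation from [2]: the names `adaptive_climber`, `mutate`, and `profit_fn` are hypothetical, the mutation step size of 0.05 is an assumed value, clipping to $[-1, +1]$ is our assumption based on the PRZI parameter range, ties are resolved in favour of the incumbent, and `profit_fn` stands in for running a batch of trades in the market and measuring profit.

```python
import random

def mutate(s, step=0.05, rng=random):
    """Add a small uniform perturbation to s and clip to the assumed range [-1, +1]."""
    return max(-1.0, min(1.0, s + rng.uniform(-step, step)))

def adaptive_climber(profit_fn, s_prod, n_cycles, step=0.05, rng=random):
    """Two-point stochastic hill-climber over a single strategy value.

    profit_fn(s) is a stand-in (assumption) for trading with strategy s
    and returning the profit earned. Returns the sequence of 'prod' values.
    """
    s_dev = mutate(s_prod, step, rng)
    history = [s_prod]
    for _ in range(n_cycles):
        prod_profit = profit_fn(s_prod)   # evaluate the incumbent strategy
        dev_profit = profit_fn(s_dev)     # evaluate the challenger strategy
        if dev_profit > prod_profit:
            s_prod = s_dev                # challenger is promoted to prod
        s_dev = mutate(s_prod, step, rng) # either way, draw a fresh challenger
        history.append(s_prod)
    return history

if __name__ == "__main__":
    # Toy, noise-free profit landscape with a single peak at s = 0.5.
    toy_profit = lambda s: -(s - 0.5) ** 2
    trace = adaptive_climber(toy_profit, s_prod=0.0, n_cycles=200,
                             rng=random.Random(0))
    print(trace[0], trace[-1])
```

In a real market session the profit measurement is noisy and depends on what every other trader is doing, so the monotone improvement seen on this toy landscape should not be expected in the coevolutionary setting.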
An $N_A$-dimensional coevolutionary market system is an instance of an $N_A$-dimensional dynamical system, and a popular method of characterising the dynamics of high-dimensional dynamical systems is the recurrence plot (RP: see e.g. [17, 32]). This purely graphical technique can be extended by various quantitative methods known collectively as recurrence quantification analysis (RQA: see [54]). As is discussed at length in [2], we have explored the use of RPs and RQA for visualising and analysing our coevolutionary PRZI markets.
In brief, for our purposes a RP visualization of an $N_A$-dimensional real-valued dynamical system is a rectangular grid of square binary pixels, i.e. pixels that are in one of two states: often either black or white. Let $\vec{s}(t)$ be the state of the system at time $t$. A pixel is shaded black to represent that $\vec{s}(t_i)$ has recurred at time $t_j$, i.e. the state seen at $t_j$ has previously been seen at some earlier time $t_i$, and is shaded white otherwise. Recurrence can be defined in various ways, but the simplest is to take the $N_A$-dimensional Euclidean distance $d = \lVert \vec{s}(t_i) - \vec{s}(t_j) \rVert$ and to declare recurrence to have occurred if $d$ is less than some threshold value. The coordinates for each pixel, each cell, in a RP are set by its values of $t_i$ and $t_j$. Figure 2 shows a RP for one instance of our coevolving market of PRZI traders: there is nontrivial structure in the plot, which is subjected to further detailed RQA analysis in [2].
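The thresholded-distance definition of recurrence given above can be sketched as follows. This is a generic textbook-style construction using only the Python standard library, not the code used to produce Figure 2; the function names and the toy trajectory are assumptions for illustration, and the recurrence-rate function (the fraction of black pixels, here counting the main diagonal) is included only as the simplest example of an RQA-style summary statistic.

```python
import math

def recurrence_matrix(states, threshold):
    """Binary recurrence matrix for a sequence of state vectors.

    Entry [i][j] is 1 (a 'black' pixel) when the Euclidean distance between
    the states at time-steps i and j is below the threshold, else 0 ('white').
    """
    n = len(states)
    rp = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if math.dist(states[i], states[j]) < threshold:
                rp[i][j] = 1
    return rp

def recurrence_rate(rp):
    """Fraction of pixels that are black (main diagonal included)."""
    n = len(rp)
    return sum(sum(row) for row in rp) / (n * n)

if __name__ == "__main__":
    # Toy 2D trajectory that alternates between two neighbourhoods.
    trajectory = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.05), (1.0, 0.02)]
    rp = recurrence_matrix(trajectory, threshold=0.1)
    for row in rp:
        print("".join("#" if cell else "." for cell in row))
    print(recurrence_rate(rp))
```

Each state vector here would, in the market setting, hold one strategy value per adaptive trader, so the distance is taken over as many dimensions as there are traders.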
In our work with coevolving PRZI traders, merely by allowing each zero-intelligence trader to have adaptive control of its single real-valued strategy parameter, for a market populated by $N_A$ such traders, we have an $N_A$-dimensional phase space, a bounded hypercubic volume $[-1, +1]^{N_A} \subset \mathbb{R}^{N_A}$, and monitoring the system’s temporal evolution within that hypercube becomes immediately problematic. Analysis methods based on RPs and RQAs, an approach currently popular and productive in many fields, get us only so far toward our ultimate aim of being able to understand what the system is doing and where it is going (as documented in [2]); unfortunately, they do not get us far enough. While it is tempting to invest time and effort in developing better RP/RQA methods for analysis of the PRZI market-system’s phase-space trajectories in its subspace of $\mathbb{R}^{N_A}$, the results we present in the next section cast doubt on whether that would actually be a useful thing to do. There, we discuss the consequences of taking a second small step in the direction of greater realism: one in which the space of possible strategies is still infinite, but is also unbounded. Once we get there, RP/RQA analysis totally runs out of steam.
3 Coevolution in an Unbounded Infinite Space: STGP
While the work discussed in the previous section is illuminating, our PRZI-market model can be criticised for its lack of realism in the sense that each adaptive PRZI trader is constrained to play a zero-intelligence strategy that is either ZIC, GVWY, SHVR, or some intermediate hybrid mix: traders in the coevolving PRZI market are never even going to play a more sophisticated minimal-intelligence strategy like AA, GDX, or ZIP. But our work is motivated by the observation that in real-world coevolving markets, the trading entities are not constrained to select between a fixed number of existing pure strategies, nor are they constrained to choose a point in some continuous subspace that includes specific pure strategies as special cases. In real markets, any entity at any time is free to invent its own strategy or to alter/extend an existing one. Our work with PRZI has revealed some of the issues of visualising and analysing such systems, but the bounded nature of its subspace means that it can never show the kind of coevolutionary dynamics of the class of system that we seek to ultimately address in our work. Thus, we need a model in which the space of strategies is not only infinite but also unbounded. In this section, we briefly describe early results from ongoing work in which each entity does have the freedom to adapt by innovating, by creating wholly new strategies, and in which the space of possible strategies is unbounded and hence infinite.
Genetic Programming (GP: see e.g. [29, 39]) is a form of evolutionary computing in which a genetic algorithm operates on ‘genomes’ that are encodings of programs in a list-based functional language such as Lisp [50] or Clojure [34]. Starting with an initial population of programs $P_1$, each of the individuals in $P_1$ is evaluated via a fitness function which assigns a scalar fitness value to that individual. When all individuals in $P_1$ have been evaluated and assigned a value, a new population of individuals $P_2$ is created by a process of breeding where pairs of individuals in $P_1$ are selected with a probability proportionate to their fitness (so fitter individuals are more likely to be selected for breeding) and one or more children are created that have genomes which inherit from the pair of parents in ways inspired by real-world sexual reproduction with mutation. In this way, the population of new children $P_2$ becomes the next generation of the system; the old population $P_1$ is typically discarded, each individual in $P_2$ then has its fitness evaluated, and the next generation $P_3$ is then bred from $P_2$’s fitter members: if this process is repeated for sufficiently many generations, and if hyperparameters such as the mutation rate are set correctly, then useful novel programs can be created by the ‘Blind Watchmaker’ [13] of Darwinian evolution.
To illustrate this, consider a simple functional language that allows for expressions computable by a four-function pocket calculator, where multiplication, division, subtraction, and addition are each denoted by their own symbol (subtraction, for example, is written $S$ in Table 1). An arithmetic expression (the example in Figure 3 evaluates to 11) could be written in a list-based style, as a nested list of operator symbols and operands, and can be visualised as a tree structure, as illustrated in Figure 3, which also illustrates the breeding process. Although we have shown only simple mathematical expressions here, when GP is used with Turing-complete languages such as Lisp or Clojure, complete executable programs of arbitrary complexity and sophistication can in principle be evolved.
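To make the list-based representation and the breeding step concrete, the following Python sketch evaluates four-function expression trees written as nested tuples and performs a basic subtree crossover. Everything here is illustrative: the symbol `S` for subtraction follows Table 1, but `A`, `M`, and `D` are our assumed symbols for addition, multiplication, and division; the example expression is our own (chosen so that it also evaluates to 11) and is not necessarily the one drawn in Figure 3; and the crossover shown is generic untyped GP crossover, without mutation or the type constraints of STGP.

```python
import operator
import random

# Assumed operator symbols: only "S" is confirmed by Table 1.
OPS = {"A": operator.add, "S": operator.sub,
       "M": operator.mul, "D": operator.truediv}

def evaluate(tree):
    """Recursively evaluate a tree of the form (op, left, right) or a number."""
    if not isinstance(tree, tuple):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left), evaluate(right))

def subtree_paths(tree, path=()):
    """Yield the index-path to every node (internal or leaf) in the tree."""
    yield path
    if isinstance(tree, tuple):
        for idx in (1, 2):
            yield from subtree_paths(tree[idx], path + (idx,))

def get_subtree(tree, path):
    """Follow an index-path down from the root and return the node found."""
    for idx in path:
        tree = tree[idx]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the node at path swapped for new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace_subtree(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(mum, dad, rng):
    """Child = mum with one randomly chosen subtree replaced by one from dad."""
    cut = rng.choice(list(subtree_paths(mum)))
    graft = get_subtree(dad, rng.choice(list(subtree_paths(dad))))
    return replace_subtree(mum, cut, graft)

if __name__ == "__main__":
    mum = ("A", ("M", 2, 3), 5)      # (2 * 3) + 5
    dad = ("S", ("D", 8, 2), 1)      # (8 / 2) - 1
    print(evaluate(mum))             # prints 11
    child = crossover(mum, dad, random.Random(1))
    print(child, evaluate(child))
```

Because crossover can graft a subtree of any size at any depth, repeated breeding has no upper limit on program size, which is the sense in which the strategy space is unbounded.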
In our work we are using a variant of GP known as strongly-typed genetic programming (STGP), where data-type constraints are enforced between connecting nodes of a program tree [35]. For example, an AND node that takes two boolean inputs can be guaranteed that it will only connect to two booleans. Now each entity in our model market, rather than using the Adaptive Climber algorithm to optimize a single numeric strategy value, instead uses a STGP process to create new programs that implement trading strategies: we start with a population seeded with minimally simple programs, and then we unleash them, allowing the coevolutionary process to proceed, during which each entity is at liberty to create programs of growing complexity and sophistication, if in doing so they generate greater profits.
Full details of our STGP work are given in [20], to which the reader is referred for further detail; here we present only the briefest of results, from a single successful experiment, to motivate discussion of the problems of visualization and analysis that arise when working in this unbounded infinite space of possible programmatic trading strategies.
As an initial exploration into the dynamics of the STGP traders coevolving in BSE, a simulation was run over 40 generations for 10000 units of time. 100 ZIC sellers were run against 50 ZIC and 50 STGP buyers; both buyers and sellers were regularly replenished with fresh “customer orders” (i.e., an instruction to buy or to sell, and an associated private limit price for that transaction) to execute. The STGP traders were each initialised with a price-improvement expression written in terms of the subtraction operator $S$, the best price on the same side of the LOB as the trader, and the limit price for the customer order to be executed by that trader. This expression represents the zero-intelligence SHVR trader, expressed in STGP tree form.
Summary results, a plot of profit-values in each generation, are shown in Figure 4. As can be seen, the profitability data are biphasic: there is an initial brief phase of rapid growth in profitability, followed by a prolonged phase where profitability steadily declines. The initial rise in profitability is as would be expected, and hoped for: the STGP coevolution is discovering ever more profitable trading strategies over successive early generations. The second phase, where profits are steadily eroded, is perhaps less expected and less desired, but can readily be explained by the competitive coevolutionary process progressively eating away at profits: if one SHVR-like trader is profitable by shaving some amount off the best price on each revision, then it can be beaten to the deal by another SHVR-like trader who instead shaves a slightly larger amount; but that trader could in turn be beaten by a SHVR-style trader who shaves a still larger amount off the best price, and so on: price-competition among the coevolving traders awards higher fitness to those individuals that get more deals by shaving greater amounts off the current best price on the LOB, but in doing so the most successful cut their margins ever smaller, eventually hitting a zero margin at which point they are playing not SHVR but GVWY.
Table 1 shows the genome of the elite (most profitable) trader in a selection of generations from the experiment illustrated in Figure 4. There are two things to note in the genomes shown here. First, STGP (and vanilla GP too) frequently suffers from bloat, creating viable expressions or programs that get the job done, but which are expressed in very verbose form: for example, the elite individual at generation 30 has a genome that translates to $x - 1 - 7 - 7 - 1 - 1 - 7 - 1 - 1 - 7 - 1$ (writing $x$ for the genome’s innermost terminal), which any competent programmer would immediately rewrite as $x - 34$ (i.e., as a shortened genome of $(S, x, 34)$). Second, because the functional languages used in (ST)GP are richly expressive (that is, the same algorithm or expression can be written in many different ways), the use of methods based on recurrence plots (RPs) becomes deeply problematic: the recurrence of any one particular strategy that had occurred earlier in the evolutionary process may be difficult to automatically detect. For instance, if the elite genomes at generations 30, 60, and 90 are three syntactically different expressions that nonetheless compute the same value, then we humans can see by inspection that the same strategy is recurring every 30 generations, but an automated analysis technique would need to go beyond the lexical/syntactic dissimilarity in these expressions and instead reason about the underlying semantics of the functional programming language. For the simple mathematical expressions being discussed here, it is reasonable to operationally reduce them each to some agreed canonical form, but for only slightly more sophisticated (and stateful) algorithms such as AA, GDX, or ZIP, a many-to-one mapping, a reduction of all possible implementations, all possible expressions, of that algorithm down to a single canonical form is unlikely to ever be achievable. And so, RP-based methods cease to have any applicability here too.
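For the simple subtraction-only genomes listed in Table 1, reduction to a canonical form is straightforward, as the following Python sketch suggests. It is an illustration of the idea rather than part of our STGP code: the function names are hypothetical, genomes are written as nested tuples, and the string `"x"` is a placeholder for the innermost terminal of each genome. The constants used in the demonstration are read off the rows for generations 26, 27, and 30 of Table 1.

```python
def chain(terminal, constants):
    """Build a nested genome (S, (S, (S, terminal, c1), c2), c3) ... from a list."""
    genome = terminal
    for c in constants:
        genome = ("S", genome, c)
    return genome

def collapse_subtractions(genome):
    """Fold a chain of constant subtractions into a single S node.

    (S, (S, x, a), b) becomes (S, x, a + b). Folding stops at the first node
    that is not an S node with a numeric right-hand operand, and a genome with
    nothing to fold is returned unchanged.
    """
    total = 0
    node = genome
    while (isinstance(node, tuple) and node[0] == "S"
           and isinstance(node[2], (int, float))):
        total += node[2]
        node = node[1]
    if node is genome:
        return genome
    return ("S", node, total)

if __name__ == "__main__":
    gen26 = chain("x", [1, 7, 1, 1, 7, 1, 7, 7, 1])
    gen27 = chain("x", [1, 7, 1, 7, 7, 1, 1, 7, 1])
    gen30 = chain("x", [1, 7, 7, 1, 1, 7, 1, 1, 7, 1])
    print(gen26 == gen27)                      # syntactically different
    print(collapse_subtractions(gen26))
    print(collapse_subtractions(gen27))
    print(collapse_subtractions(gen30))
```

Comparing canonical forms rather than raw genomes would let a recurrence test treat generations 26 and 27 as the same strategy, but this trick relies entirely on the genomes being stateless arithmetic; it does not extend to stateful strategies of the kind discussed above.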
Once again, we take a small step in the direction of increased realism in our coevolutionary models, and the visualization/analytics toolbox is empty.
Gen  Expression Tree 

1  (S,(S,,1), ) 
2  (S,(S,,1),1) 
3  (S,(S,,1),1) 
4  (S,(S,(S,,1),1),1) 
⋮  ⋮ 
26  (S,(S,(S,(S,(S,(S,(S,(S,(S,,1),7),1),1),7),1),7),7),1) 
27  (S,(S,(S,(S,(S,(S,(S,(S,(S,,1),7),1),7),7),1),1),7),1) 
28  (S,(S,(S,(S,(S,(S,(S,(S,(S,,1),7),1),1),7),1),7),7),1) 
29  (S,(S,(S,(S,(S,(S,(S,(S,(S,(S,,1),7),7),1),1),7),1),1),7),1) 
30  (S,(S,(S,(S,(S,(S,(S,(S,(S,(S,,1),7),7),1),1),7),1),1),7),1) 
4 Discussion and Conclusion
The experiments and results that we have described here have demonstrated that, when we move our ACE-style market models ever so slightly in the direction of being closer to real-world markets, we find that the toolbox for visualisation and analysis of the resultant system dynamics starts to look very empty. While it is relatively easy to make the changes necessary to extend existing models to make them more realistic, it is relatively hard to work out what the extended systems are actually doing, and hence we need new tools to help us do that. Our current work is concentrated on exploring the use of CIAO plots [10, 11] in characterising the coevolutionary dynamics of our STGP system, although as [3] discuss, this is a visualisation technique that is not without its complexities.
While many research papers in science and engineering are written to describe the solution to some problem, this is not one of those papers. Instead, this is a paper that describes a problem in need of a solution. Or, more specifically, a problem that we expect to be tackled from multiple perspectives, one that eventually yields to multiple complementary solutions. In future work, we intend to develop novel visualisation and analysis techniques for coevolutionary market systems with unboundedly infinite continuous strategy spaces, which we will report on in due course; but in writing this paper we hope to encourage other researchers to work on this challenging problem too. To facilitate that, we have made our Python source-code freely available as open-source releases on GitHub, which is where in future we will also release our own visualisation and analysis methods as we develop them.
Footnote 2: The Python code in the main BSE GitHub repository [5] has been extended by addition of a minimally simple adaptive PRZI trader, a $k$-point stochastic hill climber, referred to as PRZI-SHC (pronounced prezzy-shuck), for which the $k=2$ case is a close relative of the AC algorithm described in Section 2 and which can readily be used for studies of coevolutionary dynamics. The source-code for our STGP work is available separately at https://github.com/charliefiguero/stgptrader/.
References
 [1] Abergel, F., Anane, M., Chakraborti, A., Jedidi, A., Toke, I.: Limit Order Books. Cambridge University Press (2016)
 [2] Alexandrov, N.: Competitive arms-races among autonomous trading agents: Exploring the co-adaptive dynamics. Master’s thesis, University of Bristol (2021)
 [3] Cartlidge, J., Bullock, S.: Unpicking tartan CIAO plots: Understanding irregular coevolutionary cycling. Adaptive Behavior 12(2) (2004) 69–92
 [4] Cliff, D.: Minimal-intelligence agents for bargaining behaviours in market-based environments. Technical Report HPL-97-91, HP Labs (1997)
 [5] Cliff, D.: Bristol Stock Exchange: open-source financial exchange simulator. https://github.com/davecliff/BristolStockExchange (2012)
 [6] Cliff, D.: BSE: A Minimal Simulation of a Limit-Order-Book Stock Exchange. In Bruzzone, F., ed.: Proc. 30th Euro. Modeling and Simulation Symposium (EMSS2018). (2018) 194–203
 [7] Cliff, D.: Exhaustive testing of trader-agents in realistically dynamic continuous double auction markets: AA does not dominate. In Rocha, A., Steels, L., van den Herik, J., eds.: Proceedings of the 11th International Conference on Agents and Artificial Intelligence (ICAART 2019). SciTePress (2019) 224–236
 [8] Cliff, D.: Parameterized-Response Zero-Intelligence Traders. SSRN:3823317 (2021)
 [9] Cliff, D., Rollins, M.: Methods matter: A trading algorithm with no intelligence routinely outperforms AI-based traders. In: Proceedings of IEEE Symposium on Computational Intelligence in Financial Engineering (CIFEr2020). (2020)
 [10] Cliff, D., Miller, G.: Tracking the Red Queen: Measurements of adaptive progress in coevolutionary simulations. In Morán, F., Moreno, A., Guervós, J.J.M., Chacón, P., eds.: Advances in Artificial Life, Third European Conference on Artificial Life, Granada, Spain, June 4-6, 1995, Proceedings. Volume 929 of Lecture Notes in Computer Science, Springer (1995) 200–218
 [11] Cliff, D., Miller, G.: Visualizing coevolution with CIAO plots. Artificial Life 12(2) (2006) 199–202
 [12] Das, R., Hanson, J., Kephart, J., Tesauro, G.: Agent-human interactions in the continuous double auction. In: Proc. IJCAI2001. (2001) 1169–1176
 [13] Dawkins, R.: The Blind Watchmaker. W. W. Norton (1986)
 [14] De Luca, M., Cliff, D.: Agent-human interactions in the continuous double auction, redux: Using the OpEx lab-in-a-box to explore ZIP and GDX. In: Proceedings of the 2011 International Conference on Agents and Artificial Intelligence (ICAART2011). (2011)
 [15] De Luca, M., Cliff, D.: Human-agent auction interactions: Adaptive-Aggressive agents dominate. In: Proceedings IJCAI2011. (2011) 178–185
 [16] De Luca, M., Szostek, C., Cartlidge, J., Cliff, D.: Studies of interaction between human traders and algorithmic trading systems. Technical report, UK Government Office for Science, London (September 2011)
 [17] Eckmann, J.P., Oliffson Kamphorst, S., Ruelle, D.: Recurrence plots of dynamical systems. Europhysics Letters 5 (1987) 973–977

 [18] Erev, I., Roth, A.: Predicting how people play games: Reinforcement learning in experimental games with unique, mixed-strategy equilibria. The American Economic Review 88(4) (September 1998) 848–881
 [19] Farmer, J.D., Patelli, P., Zovko, I.: The Predictive Power of Zero Intelligence in Financial Markets. Proceedings of the National Academy of Sciences 102(6) (2005) 2254–2259
 [20] Figuero, C.: Evolving trader-agents via strongly typed genetic programming. Master’s thesis, University of Bristol Department of Computer Science (2021)
 [21] Gjerstad, S.: The impact of pace in double auction bargaining. Technical report, Department of Economics, University of Arizona (2003)
 [22] Gjerstad, S., Dickhaut, J.: Price formation in double auctions. Games and Economic Behavior 22(1) (1998) 1–29
 [23] Gode, D., Sunder, S.: Allocative Efficiency of Markets with Zero-Intelligence Traders: Market as a Partial Substitute for Individual Rationality. Journal of Political Economy 101(1) (1993) 119–137
 [24] Gould, M., Porter, M., Williams, S., McDonald, M., Fenn, D., Howison, S.: Limit order books. Quantitative Finance 13(11) (2013) 1709–1742
 [25] Hebbron, T., Bullock, S., Cliff, D.: NKalpha: Non-uniform epistatic interactions in an extended NK model. In Bullock, S., Noble, J., Watson, R., Bedau, M., eds.: Artificial Life XI: Proceedings of the Eleventh International Conference on the Simulation and Synthesis of Living Systems. MIT Press (2008) 234–241
 [26] Hommes, C., LeBaron, B., eds.: Computational Economics: Heterogeneous Agent Modeling. North-Holland (2018)
 [27] Kagel, J., Roth, A.: The Handbook of Experimental Economics. Princeton University Press (1997)
 [28] Kauffman, S.: The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press (1993)
 [29] Koza, J.: Genetic Programming: On the Programming of Computers by means of Natural Selection. MIT Press (1993)

 [30] Ladley, D.: Zero Intelligence in Economics and Finance. The Knowledge Engineering Review 27(2) (2012) 273–286
 [31] Lattimore, T., Szepesvari, C.: Bandit Algorithms. Cambridge University Press (2020)
 [32] Marwan, N.: How to avoid potential pitfalls in recurrence plot based data analysis. International Journal of Bifurcation and Chaos 21(4) (2011) 1003–1017
 [33] Maynard Smith, J.: Evolution and the Theory of Games. Cambridge University Press (1982)
 [34] Miller, A., Halloway, S., Bedra, A.: Programming Clojure. Third edn. Pragmatic Bookshelf (2018)
 [35] Montana, D.: Strongly typed genetic programming. Evolutionary Computation 3(2) (June 1995) 199–230
 [36] Myles White, J.: Bandit Algorithms for Website Optimization: Developing, Deploying, and Debugging. O’Reilly (2012)
 [37] Nolte, I., Salmon, M., Adcock, C., eds.: High Frequency Trading and Limit Order Book Dynamics. Routledge (2014)
 [38] Plott, C., Smith, V., eds.: Handbook of Experimental Economics Results, Volume 1. North-Holland (2008)
 [39] Poli, R., Langdon, W., McPhee, N.: A Field Guide to Genetic Programming. Lulu (2008)
 [40] Rollins, M., Cliff, D.: Which trading agent is best? Using a threaded parallel simulation of a financial market changes the pecking-order. In: Proceedings of the 32nd European Modeling and Simulation Symposium (EMSS2020). (2020)
 [41] Rust, J., Miller, J., Palmer, R.: Behavior of trading automata in a computerized double auction market. In Friedman, D., Rust, J., eds.: The Double Auction Market: Institutions, Theories, and Evidence. Addison-Wesley (1992) 155–198
 [42] Slivkins, A.: Introduction to Multi-Armed Bandits. arXiv:1904.07272v6 (2021)
 [43] Smith, V.: An Experimental Study of Competitive Market Behaviour. Journal of Political Economy 70(2) (1962) 111–137
 [44] Smith, V.: Papers in Experimental Economics. Cambridge University Press (1991)
 [45] Smith, V., ed.: Bargaining and Market Behavior: Essays in Experimental Economics. Cambridge University Press (2000)
 [46] Snashall, D., Cliff, D.: Adaptive-Aggressive traders don’t dominate. In van den Herik, J., Rocha, A., Steels, L., eds.: Agents and Artificial Intelligence: Selected papers from ICAART2019. Springer (2019)
 [47] Tesauro, G., Bredin, J.: Sequential strategic bidding in auctions using dynamic programming. In: Proceedings AAMAS 2002. (2002)
 [48] Tesauro, G., Das, R.: High-performance bidding agents for the continuous double auction. In: Proc. 3rd ACM Conference on Electronic Commerce. (2001) 206–209
 [49] Thompson, J.: The Coevolutionary Process. University of Chicago Press (1994)
 [50] Touretzky, D.: Common LISP: A Gentle Introduction to Symbolic Computation. Revised edn. Dover Publications Inc (2013)
 [51] Vach, D.: Comparison of double auction bidding strategies for automated trading agents. Master’s thesis, Charles University in Prague (2015)
 [52] Vytelingum, P., Cliff, D., Jennings, N.: Strategic bidding in continuous double auctions. Artificial Intelligence 172(14) (2008) 1700–1729
 [53] Walsh, W., Das, R., Tesauro, G., Kephart, J.: Analyzing complex strategic interactions in multi-agent systems. In: Proc. of the AAAI Workshop on Game-Theoretic and Decision-Theoretic Agents. (2002)
 [54] Webber, C., Marwan, N., eds.: Recurrence Quantification Analysis: Theory and Best Practice. Springer (2015)