1 Introduction
Stochastic Portfolio Theory (SPT) is a relatively new stream in financial mathematics, initiated and largely developed by Robert Fernholz (2002). For surveys of the field, see Fernholz and Karatzas (2009) and Vervuurt (2015). Among many other things, SPT offers an alternative approach to portfolio selection, taking as its selection criterion to outperform the market index (for instance, the S&P 500 index) with probability one. Investment strategies which achieve this are called relative arbitrages, and have been constructed in certain classes of market models. The almost-sure comparison between the performance of certain portfolios and that of the market is facilitated by Fernholz’s ‘master equation’, a pathwise decomposition of this relative performance which is free from stochastic integrals. The foregoing master equation is the main strength of SPT portfolio selection, as it allows one to circumvent the challenges of explicit model postulation and calibration, as well as the (normative) no-arbitrage assumption, that are encountered in the classical approaches to portfolio optimisation. However, there remain several problems in and limitations to the SPT framework as it stands.
First of all, the task of finding relative arbitrages under reasonable assumptions on the market model is difficult, since it is an inverse problem (this has also been noted by Wong (2015)). Namely, given an investment strategy and market assumptions, one can check whether this strategy is a relative arbitrage (although this quickly becomes very hard for more complicated strategies), but the theory itself does not suggest such strategies. As such, the number of relative arbitrages that have been constructed explicitly remains very small. In a practical setting it would be preferable to invert the problem, and learn investment strategies from data using a user-specified performance criterion. In effect, most established investment managers will likely have a strong view on: i) what performance metric to use to evaluate their strategies, and ii) what values for the chosen metric they regard as being exceptional. The chosen performance metric may depart from the excess return relative to the market index, for instance by adjusting for the risk taken. Similarly, outperforming the market index over a certain time horizon with probability one might not be good enough for some practitioners, as investors might pull out following disappointing initial performances, leaving the investment manager unable to realise the long-term optimality. Whence, ideally one should aim at learning from market data what investment strategy is likely to perform exceptionally as per the user’s view.
Secondly, several market imperfections are ignored in SPT; most notably, the possibility of bankruptcy is excluded. Since the constructed investment strategies typically invest heavily in small-capitalisation stocks, this poses a strong limitation on the real-world implementability of these portfolios. However, learning optimal investment strategies from the data copes well with bankruptcies, as strategies investing in stocks that eventually fail will naturally be rejected as suboptimal. It also allows for the incorporation of transaction costs, which is theoretically challenging and has not yet been addressed in SPT.
Lastly, the SPT setup has thus far been developed almost exclusively for investment strategies that are driven only by market capitalisations — there have not yet been any constructions of relative arbitrages driven by other factors. Although this simplification eases theoretical analysis, it is a clear restriction as practitioners do consider many more market characteristics in order to exploit market inefficiencies.
We address all of these issues by adopting a Bayesian nonparametric approach. We consider a broad range of investment strategies driven by a function defined on an arbitrary space of trading characteristics (such as the market capitalisation), on which we place a Gaussian process (GP) prior. For a given strategy, the likelihood of it being ‘exceptional’ is derived from a user-defined performance metric (e.g. excess return to the market index, Sharpe ratio, etc.) and values thereof that the practitioner considers ‘exceptional’. We then sample from the posterior of the GP driving the ‘exceptional’ strategy using Markov chain Monte Carlo (MCMC).
The rest of the paper is structured as follows. In section 2 we provide a background on SPT. In section 3 we present our model, and we illustrate that our approach learns strategies that outperform SPT alternatives in section 4. Finally, we discuss our findings and make suggestions for future research in section 5.
2 Background
We give a brief introduction to SPT, defining the general class of market models within which its results hold, what the portfolio selection criterion is, and how strategies achieving this criterion are constructed.
2.1 The Model
In SPT, the stock capitalisations are modelled as Itô processes (in the recent work of Karatzas and Ruf (2016), it has been shown that this can be weakened to a semimartingale model that even allows for defaults). Namely, the dynamics of the positive stock capitalisation processes are described by the following system of SDEs:
(1) 
for and . Here, are independent standard Brownian motions with , and are the initial capitalisations. We assume all processes to be defined on a probability space , and adapted to a filtration that satisfies the usual conditions and contains the filtration generated by the “driving” Brownian motions. We refer the reader to Karatzas and Shreve (1988) for a reference on stochastic calculus.
The rates of return and volatilities are some unspecified progressively measurable processes and are assumed to satisfy the integrability condition
(2) 
for all and the nondegeneracy condition
(ND) 
for all and , almost surely.
2.2 Relative Arbitrage
In this context, one studies investments in the equity market described by (1) using portfolios. These are valued and progressively measurable processes , where stands for the proportion of wealth invested in stock at time .
We restrict ourselves to long-only portfolios. These invest solely in the stocks, namely, they take values in the closure of the set
(3) 
in particular, there is no money market. Assuming without loss of generality that the number of outstanding shares of each firm is 1, the corresponding wealth process of an investor implementing is seen to evolve as follows (we normalise the initial wealth to 1):
(4) 
In SPT one measures performance, for the most part, with respect to the market index. This is the wealth process
that results from a buy-and-hold portfolio, given by the vector process of market weights
(5) 
Definition 1.
Let . A strong relative arbitrage with respect to the market over the time-horizon is a portfolio such that
(6) 
An equivalent way to express this notion is to say that the portfolio strongly outperforms over the time-horizon . ∎
Contrast the SPT approach to portfolio selection with other methods such as mean-variance optimisation (originally introduced by Markowitz (1952)) and expected utility maximisation (see for instance Rogers (2013)), where the optimisation of a certain performance criterion determines the portfolio. In SPT, any portfolio that outperforms the market in the sense of (6) is a relative arbitrage, and the amount by which it outperforms the market is theoretically irrelevant.
In practice, one clearly desires this relative outperformance to be as large as possible. Attempts at optimisation over the class of strategies that satisfy (6) have been made by Fernholz and Karatzas (2010), Fernholz and Karatzas (2011), Ruf (2011), Ruf (2013), and Wong (2015). However, these results are highly theoretical and very difficult to implement. Our data-driven approach circumvents these theoretical complications by optimising a user-defined criterion over the class of functionally-generated portfolios, which we introduce below.
2.3 Functionally-Generated Portfolios
A particular class of portfolios, called functionally-generated portfolios (or FGPs for short), was introduced and studied by Fernholz (1999).
Consider a function , where is an open neighbourhood of and such that is bounded on for . Then is said to be the generating function of the functionally-generated portfolio , given, for , by
(7) 
Here, we write for the partial derivative with respect to the variable, and we will write for the second partial derivative with respect to the and variables. Theorem 3.1 of Fernholz (1999) asserts that the performance of the wealth process corresponding to , when measured relative to the market, satisfies the almost sure decomposition (often referred to as “Fernholz’s master equation”)
(8) 
where the quantity
(9) 
is called the drift process of the portfolio . Here, we have written for the relative covariances; denoting by $e_i$ the $i$-th unit vector in $\mathbb{R}^n$, these are defined for as
(10) 
Under suitable conditions on the market model (1), the left hand side of master equation (8) can be bounded away from zero for sufficiently large , thus proving that is an arbitrage relative to the market over . Several FGPs have been shown to outperform the market this way — see Fernholz (2002), Fernholz et al. (2005), Fernholz and Karatzas (2005), Banner and Fernholz (2008), Fernholz and Karatzas (2009), Picková (2014), and Vervuurt and Karatzas (2015). In fact, Pal and Wong (2014) prove that any relative arbitrage with respect to the market is necessarily of the form (7), if one restricts to be a functional of the current market weights only.
Strong (2014) proves a generalisation of (8) for portfolios which are deterministic functions not only of the market capitalisations, but also of other observable quantities. Namely, let , with a continuous, valued, progressively measurable process of finite variation, and let . By an application of Theorem 3.1 of Strong (2014), for any portfolio
(11) 
for , the following master equation holds:
(12)  
Here (compare with (8) and (9))
(13) 
Although explicit in its decomposition, the modified master equation (12) has so far not been applied in the literature. It is very difficult and unclear how to postulate in what way such ‘extended generating functions’ should depend on market information, and what additional covariates to use. It is thus of interest to develop a methodology that makes suggestions for what functions to use, and extracts from market data which signals are significant.
2.4 Diversity-Weighted Portfolios
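One of the most-studied FGPs is the diversity-weighted portfolio (DWP) with parameter $p$, defined in (4.4) of Fernholz et al. (2005) as
$$\pi^{(p)}_i(t) = \frac{(\mu_i(t))^p}{\sum_{j=1}^n (\mu_j(t))^p}, \qquad i = 1, \ldots, n. \tag{14}$$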
In Eq. (4.5) of Fernholz et al. (2005) it was shown that this portfolio is a relative arbitrage with respect to over for any and , under the condition (ND), and that of diversity (D), introduced below. The latter says that no single company’s capitalisation can take up more than a certain proportion of the entire market, which can be observed to hold in real markets;
(D) 
In Vervuurt and Karatzas (2015), this result was extended to the DWP with negative parameter, and several variations of this portfolio were shown to outperform the market over sufficiently long time horizons and under suitable market assumptions. A simulation using real market data supported the claim that these portfolios have the potential to outperform the market index, as well as their positive-parameter counterparts. Our results strongly confirm this finding, and additionally yield the optimal parameter (see section 4).
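As a rough illustration of the diversity-weighted portfolio discussed above, the sketch below computes its weights on a single rebalancing date, assuming the standard form in which each weight is proportional to the market weight raised to the power of the parameter. The function name `diversity_weighted_portfolio` and the toy market weights are hypothetical and not taken from the paper.

```python
import numpy as np

def diversity_weighted_portfolio(market_weights, p):
    """Return weights proportional to mu_i ** p for one rebalancing date."""
    mu = np.asarray(market_weights, dtype=float)
    if np.any(mu <= 0):
        raise ValueError("market weights must be strictly positive")
    # Work in log space so that a large |p| does not overflow or underflow.
    log_w = p * np.log(mu)
    log_w -= log_w.max()
    w = np.exp(log_w)
    return w / w.sum()

# Toy market weights (hypothetical): three stocks, sorted from large to small.
mu = np.array([0.5, 0.3, 0.2])
w_market = diversity_weighted_portfolio(mu, 1.0)    # p = 1 recovers the market portfolio
w_equal = diversity_weighted_portfolio(mu, 0.0)     # p = 0 recovers the equally-weighted portfolio
w_negative = diversity_weighted_portfolio(mu, -0.5) # p < 0 tilts towards small stocks
print(w_market, w_equal, w_negative)
```

Under this form, a negative parameter reverses the ranking of the weights, which is why such portfolios invest most heavily in the smallest stocks.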
3 Solving the Inverse Problem
We consider solving the inverse problem of SPT: given some investment objective, how to go about learning a suitable trading strategy from the data? In doing so, we aim for a method that:
1. Learns from a large class of candidate investment strategies to uncover possibly intricate strategies from the data, typically by making use of nonparametric generative models for the generating functions;
2. Leverages additional sources of information beyond market capitalisations to uncover better investment strategies;
3. Works irrespective of the practitioner’s investment objective (e.g. achieving a high Sharpe Ratio, outperforming alternative benchmark indices, etc.).
3.1 Model Specification
Let be a set of trading characteristics, for some . We consider long-only portfolios of the form
(15) 
for some continuous function .
The idea behind this choice of investment portfolios is grounded in the fact that in practice, an investment manager will often have a predefined set of characteristics that he uses to compare stocks, for instance company size, balance sheet variables, credit ratings, sector, momentum, market vs. book value, return on assets, management team, online sentiment, technical indicators, ‘beta’, etc. The investment manager will typically choose trading characteristics so that they are informative enough to unveil market inefficiencies. Moreover, two stocks that have ‘similar’ characteristics will receive ‘similar’ weights.
This approach includes as special cases all functionally-generated portfolios in the SPT framework, and in particular the diversity-weighted, entropy-weighted and equally-weighted portfolios, as well as the market portfolio. Our more general setting allows for any set of trading characteristics.
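The following sketch illustrates how a positive investment map can be turned into long-only weights. It assumes that the portfolio is obtained by normalising the map's values across stocks, so that weights are positive and sum to one; this normalised form, the function name `portfolio_from_map` and the toy characteristics are assumptions made for illustration.

```python
import numpy as np

def portfolio_from_map(investment_map, characteristics):
    """Normalise the positive scores f(x_i) into long-only portfolio weights."""
    scores = np.array([investment_map(x) for x in characteristics], dtype=float)
    if np.any(scores <= 0):
        raise ValueError("the investment map must be strictly positive")
    return scores / scores.sum()

# Toy example (hypothetical): the only characteristic is the market weight.
market_weights = np.array([0.5, 0.3, 0.2])
w_market = portfolio_from_map(lambda x: x, market_weights)        # market portfolio
w_equal = portfolio_from_map(lambda x: 1.0, market_weights)       # equally-weighted portfolio
w_dwp = portfolio_from_map(lambda x: x ** 0.5, market_weights)    # diversity-weighted, exponent 0.5
print(w_market, w_equal, w_dwp)
```

Because the weights depend on each stock only through its characteristics, two stocks with nearby characteristics receive nearby weights whenever the map is continuous.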
The trading opportunities in our framework are revealed through the time evolving trading characteristics , and the investment map fully determines how to go about seizing these opportunities. Whence, learning an investment strategy in our framework is equivalent to learning an investment map . To do so, we consider two families of functions. Firstly, galvanised by the theoretical results of SPT, we consider the case where is the set of market weights, and we take to be of the parametric form
(16) 
for , which corresponds to the diversity-weighted portfolio (DWP, see section 2.4). Secondly, in order to capture more intricate trading patterns, and to allow for a more general set of trading characteristics , we also consider an alternative nonparametric approach in which we take to be a path of a mean-zero Gaussian process with covariance function
(17) 
To learn ‘good’ investment maps, we need to introduce an optimality criterion that encodes the user’s investment objective. To do so, we consider a performance functional that maps the logarithm of a candidate investment map to the historical performance of the portfolio as in Eq. (15) over some finite time horizon, given historical data . An example performance functional is the annualised Sharpe Ratio, defined as
(18) 
where is the return of our portfolio between time and time , is the return of the th asset between time and time , represents the number of units of time in a business year (e.g. if the returns are daily), and (resp. ) denote the sample mean (resp. sample standard deviation). Another example of a performance functional is the excess return relative to a benchmark portfolio
(19) 
The nature of (Sharpe ratio, excess return, etc) depends on the portfolio manager; we impose no theoretical restriction.
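The annualised Sharpe Ratio described above can be sketched as follows. The sketch assumes daily returns with 252 business days per year, an unbiased sample standard deviation, and no risk-free rate adjustment; these choices, along with the function names and the toy data, are assumptions for illustration only.

```python
import numpy as np

def portfolio_returns(weights, asset_returns):
    """Per-period portfolio returns: sum over assets of weight times asset return."""
    weights = np.asarray(weights, dtype=float)
    asset_returns = np.asarray(asset_returns, dtype=float)
    return np.sum(weights * asset_returns, axis=1)

def annualised_sharpe(returns, periods_per_year=252):
    """Square root of the periods per year times sample mean over sample std."""
    r = np.asarray(returns, dtype=float)
    sd = r.std(ddof=1)  # sample standard deviation (assumed unbiased version)
    if sd == 0.0:
        raise ValueError("returns have zero sample standard deviation")
    return np.sqrt(periods_per_year) * r.mean() / sd

# Toy data (hypothetical): four days, two assets, constant 50/50 weights.
weights = np.full((4, 2), 0.5)
asset_returns = np.array([[0.02, 0.00],
                          [-0.01, -0.01],
                          [0.03, 0.01],
                          [0.01, -0.01]])
r = portfolio_returns(weights, asset_returns)
sharpe = annualised_sharpe(r)
print(r, sharpe)
```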
In the parametric case (Eq. (16)), is effectively a function of one single variable , and we can easily learn the optimal using standard optimisation techniques.
In many cases, however, it might be preferable to reason under uncertainty and be Bayesian. To do so, we express the investment manager’s view as to what is a good performance through a likelihood model, which we may choose to be a probability distribution on
(20) 
This is perhaps the most important step of the learning process. Indeed, the Bayesian methods we will develop in the next section aim at learning investment maps that provide an appropriate trade-off between how likely the map is in light of training data, and how consistent it is with prior beliefs. This will only lead to a profitable investment map if ‘likely’ maps satisfy the manager’s investment objective in-sample and vice versa. If one chooses the likelihood model such that likely maps are strategies that lose money, then our learning machines will learn strategies that lose money!
Fortunately, it is very straightforward to express that likely investment maps are the ones that match a desired investment objective. For instance, we may use as likelihood model that, given a candidate investment map , the extent to which it is good, or equivalently the extent to which it is ‘likely’ to be the function driving the strategy we are interested in learning, is the same as the extent to which the Sharpe Ratio it generates in-sample comes from a Gamma distribution with mean and standard deviation . The positive support of the Gamma distribution renders functions leading to negative in-sample Sharpe ratios of no interest, while the concentration of the distribution over the Sharpe Ratio around reflects both our target performance and some tolerance around it. The choice of mean () and standard deviation () of the Gamma reflects the risk appetite of the investment manager, while the vanishing tails properly reflect the fact that too high a performance would likely raise suspicions and too low a performance would not be good enough.
To complete our Bayesian model specification, in the parametric case we place on a uniform prior on .
3.2 Inference
Throughout the rest of this paper we will use as performance functional the total excess — transaction cost adjusted — return (as defined in (3.1)) relative to the equally weighted portfolio (EWP), which has constant weights
(21) 
over the whole training period
(22) 
We assume a bps transaction cost upon rebalancing (i.e. we incur a cost of of the notional for each transaction). It is well known to algorithmic (execution) trading practitioners that a good rule of thumb is to expect to pay bps when executing an order whose size is of the average daily traded volume (ADV) on liquid stocks. Whence, this assumption is reasonable so long as the wealth invested in each stock does not exceed ADV. When needed, we use as likelihood model
(23) 
where we denote the probability density function of the Gamma distribution with mean and standard deviation . As previously discussed, and need not be learned as they reflect the investment manager’s risk appetite. In the experiments of the following section, we use and . In other words, we postulate that the ideal investment strategy should be such that, starting with a unit of wealth, the terminal wealth over the training period should be on average units of wealth higher than the terminal wealth achieved by the equally weighted portfolio over the same trading horizon; this is purposely greedy.
Frequentist parametric: The first method of inference we consider consists of directly learning the optimal parameter of the DWP by maximising for . As a comparison, the typical range of considered in the SPT literature is . To avoid any issue with local maxima, we proceed with brute-force maximisation on the uniform grid with mesh size (this took no longer than a couple of seconds in every experiment that we ran).
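The brute-force search over the DWP exponent can be sketched as below. This is a simplified illustration on synthetic data: the grid bounds and mesh size, the synthetic market weights and returns, and the omission of transaction costs are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def dwp_weights(mu, p):
    """Row-wise diversity-weighted portfolio: weights proportional to mu ** p."""
    log_w = p * np.log(mu)
    log_w -= log_w.max(axis=-1, keepdims=True)  # stabilise the exponentials
    w = np.exp(log_w)
    return w / w.sum(axis=-1, keepdims=True)

def excess_terminal_wealth(p, mu, asset_returns):
    """Terminal wealth of DWP(p) minus that of the EWP, both starting from 1.

    mu[t] holds the weights observed before period t and asset_returns[t]
    the returns over period t. Transaction costs are ignored in this sketch.
    """
    dwp = np.prod(1.0 + np.sum(dwp_weights(mu, p) * asset_returns, axis=1))
    ewp = np.prod(1.0 + asset_returns.mean(axis=1))
    return dwp - ewp

def grid_search(mu, asset_returns, grid):
    """Evaluate the performance functional on every grid point and keep the best."""
    scores = np.array([excess_terminal_wealth(p, mu, asset_returns) for p in grid])
    best = int(np.argmax(scores))
    return grid[best], scores[best]

rng = np.random.default_rng(0)
n_days, n_stocks = 250, 20
mu = rng.dirichlet(np.ones(n_stocks), size=n_days)               # synthetic market weights
asset_returns = rng.normal(0.0005, 0.01, size=(n_days, n_stocks))  # synthetic daily returns
grid = np.linspace(-8.0, 8.0, 321)  # hypothetical bounds, mesh size 0.05
p_star, score = grid_search(mu, asset_returns, grid)
print(p_star, score)
```

An exhaustive search is affordable here because the functional depends on a single scalar, and it sidesteps local maxima that a gradient-based optimiser might get stuck in.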
Bayesian parametric: The second method of inference we consider consists of using the Metropolis-Hastings algorithm (Hastings (1970)) to sample from the posterior distribution over the exponent in the DWP case,
(24) 
where we have rewritten as to make the dependency in explicit. We sample a proposal update from a Gaussian centred at the current exponent and with standard deviation . The acceptance probability is easily found to be
(25) 
We note in particular that so long as is initialised within , the indicator function in Eq. (24) will not cause problems to the Markov chain. We typically run MH iterations and discard the first as ‘burn-in’. We use the posterior mean exponent learned on training data to trade in our testing horizon following the corresponding DWP
(26) 
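The random-walk Metropolis-Hastings scheme described above can be sketched as follows. With a symmetric Gaussian proposal, the acceptance probability reduces to the minimum of one and the ratio of posterior densities. The toy log-posterior (a stand-in for the log-likelihood of the performance functional plus a uniform prior), the prior bounds, the proposal scale, and the iteration counts are all hypothetical.

```python
import math
import numpy as np

def metropolis_hastings(log_post, x0, step, n_iter, rng):
    """Random-walk Metropolis-Hastings with a Gaussian proposal of scale `step`."""
    x, lp = x0, log_post(x0)
    chain = np.empty(n_iter)
    for k in range(n_iter):
        proposal = x + step * rng.standard_normal()
        lp_proposal = log_post(proposal)
        # Symmetric proposal: accept with probability min(1, post(proposal) / post(x)).
        if rng.random() < math.exp(min(0.0, lp_proposal - lp)):
            x, lp = proposal, lp_proposal
        chain[k] = x
    return chain

def log_post(p, lower=-8.0, upper=8.0):
    """Toy log-posterior: uniform prior on [lower, upper] times a stand-in likelihood."""
    if not lower <= p <= upper:
        return -math.inf  # outside the prior support: always rejected
    return -0.5 * p ** 2  # stand-in for the log-likelihood of the performance

rng = np.random.default_rng(1)
chain = metropolis_hastings(log_post, x0=0.0, step=1.0, n_iter=20000, rng=rng)
samples = chain[2000:]  # discard burn-in
posterior_mean = samples.mean()
print(posterior_mean, samples.std())
```

Since proposals outside the prior support have zero posterior density, they are always rejected, so a chain initialised inside the support never leaves it.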
Bayesian nonparametric: The third method of inference we consider is Bayesian and nonparametric. We place a Gaussian process prior on
(27) 
Given the sizes of datasets we consider in our experiments (more than million training inputs — assets over a year period), we approximate the latent function over a Cartesian grid. This approximation fits nicely with the quantised nature of financial data. We use as covariance function a separable product of Rational Quadratic (RQ) kernels
(28) 
where the hyperparameters , on which we place independent log-normal priors, are all to be inferred. We found the RQ kernel to be a better choice than the Gaussian kernel as it allows for ‘varying length scales’. Denoting by the values of the investment map over the input grid, we prefer to work with the equivalent whitened representation
(29) 
where is the identity matrix, is the Gram matrix over all input points, is the Singular Value Decomposition (SVD) of and . We use a Blocked Gibbs sampler (Geman and Geman (1984)) to sample from the posterior
(30) 
where we have rewritten as to emphasise that the likelihood is fully defined by . The whitened representation has two primary advantages. First, it is robust to ill-conditioning as we may always compute , even when is singular. Second, it creates a hard link between function values and hyperparameters, so that updating the latter affects the likelihood , and therefore directly contributes towards improving the training performance : we found this to improve mixing of the Markov chain. Our Blocked Gibbs sampler alternates between updating conditional on hyperparameters, and updating the hyperparameters (and consequently ) conditional on . For both steps we use the elliptical slice sampling algorithm (Murray et al. (2010)). The computational bottleneck of our sampler is the computation of the SVD of , which we may do very efficiently by exploiting the separability of our kernel and the grid structure of the input space using standard Kronecker techniques (see for instance Saatchi (2011)).
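To make the whitening and the Kronecker trick concrete, the sketch below maps white noise on a two-dimensional grid to function values whose covariance is a separable product of Rational Quadratic kernels, using only the SVDs of the two small per-dimension Gram matrices. The kernel parametrisation is the standard textbook form, and the grid sizes, hyperparameter values and function names are hypothetical; this is an illustration of the general technique rather than the authors' implementation.

```python
import numpy as np

def rq_kernel(x, variance, length_scale, alpha):
    """Rational Quadratic Gram matrix on one-dimensional inputs x."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return variance * (1.0 + d2 / (2.0 * alpha * length_scale ** 2)) ** (-alpha)

def whitened_to_function_2d(x_white, K1, K2):
    """Map white noise on an (n1, n2) grid to values with covariance kron(K1, K2).

    Uses the per-dimension SVDs K_d = U_d S_d U_d^T, so the full Gram matrix
    is never formed: (U1 kron U2) S^{1/2} x is computed as U1 (S^{1/2} * X) U2^T.
    """
    U1, s1, _ = np.linalg.svd(K1)
    U2, s2, _ = np.linalg.svd(K2)
    scaled = np.sqrt(np.outer(s1, s2)) * x_white  # apply S^{1/2} entrywise on the grid
    return U1 @ scaled @ U2.T

# Hypothetical grid: 4 points along the first characteristic, 3 along the second.
grid1 = np.linspace(-2.0, 2.0, 4)
grid2 = np.linspace(0.0, 1.0, 3)
K1 = rq_kernel(grid1, variance=1.0, length_scale=1.0, alpha=2.0)
K2 = rq_kernel(grid2, variance=1.0, length_scale=0.5, alpha=1.0)

rng = np.random.default_rng(0)
x_white = rng.standard_normal((4, 3))
f = whitened_to_function_2d(x_white, K1, K2)
print(f)
```

The square roots of the singular values are always well defined, so this map remains usable even when a Gram matrix is numerically singular, which is the robustness property mentioned in the text.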
4 Experiments
The universe of stocks we consider in our experiments are the constituents of the S&P 500 index, accounting for changes in index constituents. We rebalance our portfolios on a daily basis. At the end of each trading day, we determine our target portfolio for the next day, which is acquired at the open of the next trading day. When the constituents of the index are due to change on day , our target portfolio at the end of day relates to the constituents of the index on day (which would indeed be known to the market on day ). As previously discussed, we assume that each transaction incurs a charge of of its notional value. The returns we use account for corporate events such as dividends, defaults, M&A’s, etc. Our data sources are the CRSP and Compustat databases, and we use data from 1 January 1992 to 31 December 2014.
In our first experiment, we aim to illustrate that the approaches we propose in this paper consistently and considerably outperform SPT alternatives over a wide range of market conditions. We consider learning optimal investment strategies as described in the previous section using 10 consecutive years' worth of data and testing on the following 5 years. We begin on 1 January 1992 for the first training dataset, and roll both training and testing datasets by one year, which leads to a total of 9 pairs of training and testing subsets. We compare the equally-weighted portfolio (EWP), the market portfolio, the diversity-weighted portfolio where the exponent is learned by maximising the evaluation functional (DWP*), the diversity-weighted portfolio where the exponent is learned with MCMC (DWP), the Gaussian process approach using as trading characteristic the logarithm of the market weights (CAP), and the Gaussian process approach using as trading characteristics both the logarithm of the market weights and the return-on-assets (CAP+ROA). The return-on-assets (ROA) on day is defined as the ratio between the last net income reported by the company and the last total assets reported by the company known on day (we note that this quantity may not change on a daily basis, but this does not affect our analysis). The rationale behind using the ROA as an additional characteristic is to capture not only how big a company is, but also how well it performs relative to its size.
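The rolling scheme described above (10 training years followed by 5 testing years, rolled forward by one year over 1992 to 2014) can be enumerated with a few lines of code. The function name `rolling_windows` is hypothetical, and windows are represented as inclusive calendar-year ranges for simplicity.

```python
def rolling_windows(first_year, last_year, train_years, test_years):
    """List ((train_start, train_end), (test_start, test_end)) pairs of inclusive years."""
    windows = []
    start = first_year
    # Keep rolling while the testing window still ends within the available data.
    while start + train_years + test_years - 1 <= last_year:
        train = (start, start + train_years - 1)
        test = (start + train_years, start + train_years + test_years - 1)
        windows.append((train, test))
        start += 1
    return windows

windows = rolling_windows(1992, 2014, train_years=10, test_years=5)
for train, test in windows:
    print("train", train, "test", test)
```

The first pair trains on 1992 to 2001 and tests on 2002 to 2006, and the last pair trains on 2000 to 2009 and tests on 2010 to 2014, which gives the 9 pairs mentioned in the text.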
Table 1 summarises the average over the 9 scenarios of the yearly in-sample and out-of-sample returns plus-minus two standard errors. It can be seen that all learned strategies do indeed outperform the benchmark (EWP) in-sample and out-of-sample. Moreover, the performance is greatly improved by considering nonparametric models, even when the only characteristic considered is the market weight. Analysing such families of strategies within the SPT framework would simply be mathematically intractable. Finally, it can be seen that adding more trading characteristics does indeed add value. Crucially, the CAP+ROA portfolio considerably and consistently outperforms the benchmark (EWP), both in-sample and out-of-sample.
Portfolio  IS ret (%)  OOS ret (%) 

Market  8.56 ± 1.62  6.23 ± 2.07 
EWP  10.56 ± 1.67  8.99 ± 1.85 
DWP*  11.94 ± 2.01  12.51 ± 1.12 
DWP  11.91 ± 1.99  12.50 ± 1.11 
CAP  26.54 ± 2.38  22.05 ± 2.89 
CAP+ROA  56.18 ± 7.35  25.14 ± 2.58 
In our second experiment, we aim to illustrate that our approaches are robust to financial crises. To do so, we train our model using data from 1 January 1992 to 31 December 2005, and test the learned strategy between 1 January 2006 and 31 December 2014, which includes the 2008 financial crisis. We compare the same investment strategies as before. The posterior distribution over the exponent in the Bayesian parametric method is illustrated in Figure 1. The learned posterior mean investment maps are illustrated in Figure 3. In Table 2 we provide in-sample and out-of-sample average yearly returns as well as out-of-sample Sharpe ratios. Once again, it can be seen that: i) all learned portfolios do indeed outperform the benchmark (EWP) in-sample and out-of-sample, ii) nonparametric methods outperform parametric methods, and iii) adding the ROA as an additional characteristic does indeed add value. These conclusions hold true not only in absolute terms (returns) but also after adjusting for risk (Sharpe Ratio). A more granular illustration of how our method performs during the 2008 financial crisis can be seen in the time series of total wealth provided in Figure 2. It turns out that the ROA not only improves the return out-of-sample, but also has a ‘stabilising effect’ in that the volatility of the wealth process is considerably reduced.
Finally, it is also worth stressing that the shape of the learned investment map in the two-dimensional case (Figure 3) suggests that the investment strategy uncovered by our Bayesian nonparametric approach can hardly be replicated with a parametric model. Once again, it would be near impossible to derive analytical results pertaining to such a portfolio within the SPT framework.
Portfolio  IS RET (%)  OOS RET (%)  OOS SR 

Market  9.60  7.90  0.47 
EWP  13.46  9.60  0.51 
DWP*  14.62  11.74  0.56 
DWP  14.62  11.38  0.55 
CAP  16.49  18.01  0.60 
CAP+ROA  37.54  18.33  0.72 
5 Conclusion & Discussion
The inverse problem of stochastic portfolio theory (SPT) is the following problem: given a user-defined portfolio selection criterion, how does one go about constructing suitable investment strategies that meet the desired investment objective? This problem is extremely challenging to solve within the SPT framework. We propose the first solution to the inverse SPT problem and we demonstrate empirically that the proposed methods consistently and considerably outperform standard benchmarks, and are robust to financial crises.
Unlike the SPT framework, our methods are based solely on historical data rather than stochastic calculus. This allows us to consider a very broad class of candidate investment strategies that includes all SPT strategies as special cases, but crucially contains many investment strategies that cannot be analysed in the SPT framework. Unlike the SPT framework, which almost exclusively considers outperforming the market portfolio using investment strategies that are solely based on market weights, our proposed approach can cope with virtually any user-defined investment objective and can exploit any arbitrary set of trading characteristics. We empirically demonstrate that this added flexibility allows us to uncover more subtle patterns in financial markets, which results in greatly improved performance.
Although the Gaussian process in our model was approximated to be piecewise constant on a grid, there is no theoretical or practical obstacle in using an alternative approximation such as sparse Gaussian processes (Quiñonero Candela and Rasmussen (2005)) or string Gaussian processes (Kom Samo and Roberts (2015b)). Our method may be extended to learn even subtler patterns using the nonstationary general purpose kernels of Kom Samo and Roberts (2015a). Our work may also be extended to allow for long-short investment strategies (i.e. strategies that allow short-selling). Finally, it would be interesting to develop an online extension of our work that would capture temporal changes in market dynamics.
Acknowledgements
Yves-Laurent would like to acknowledge support from the Oxford-Man Institute of Quantitative Finance. Alexander gratefully acknowledges PhD studentships from the Engineering and Physical Sciences Research Council, Nomura, and the Oxford-Man Institute of Quantitative Finance. Wharton Research Data Services (WRDS) was used in preparing the data for this paper. This service and the data available thereon constitute valuable intellectual property and trade secrets of WRDS and/or its third-party suppliers.
References
 Banner and Fernholz [2008] Adrian D. Banner and Daniel Fernholz. Short-term relative arbitrage in volatility-stabilized markets. Ann. Finance, 4:445–454, 2008.
 Fernholz and Karatzas [2010] Daniel Fernholz and Ioannis Karatzas. On optimal arbitrage. Ann. Appl. Probab., 20(4):1179–1204, 2010.
 Fernholz and Karatzas [2011] Daniel Fernholz and Ioannis Karatzas. Optimal arbitrage under model uncertainty. Ann. Appl. Probab., 21(6):2191–2225, 2011.
 Fernholz [1999] Robert Fernholz. Portfolio generating functions. Quantitative Analysis in Financial Markets, River Edge, NJ. World Scientific, 1999.
 Fernholz [2002] Robert Fernholz. Stochastic Portfolio Theory. Springer, 2002.
 Fernholz and Karatzas [2005] Robert Fernholz and Ioannis Karatzas. Relative arbitrage in volatility-stabilized markets. Ann. Finance, 1:149–177, 2005.
 Fernholz and Karatzas [2009] Robert Fernholz and Ioannis Karatzas. Stochastic portfolio theory: A survey. In Alain Bensoussan and Qiang Zhang, editors, Handbook of Numerical Analysis. Vol. XV. Special volume: mathematical modeling and numerical methods in finance, volume 15 of Handbook of Numerical Analysis. Elsevier/North-Holland, Amsterdam, 2009.
 Fernholz et al. [2005] Robert Fernholz, Ioannis Karatzas, and Constantinos Kardaras. Diversity and relative arbitrage in equity markets. Finance Stoch., 9(1):1–27, 2005.
 Geman and Geman [1984] Stuart Geman and Donald Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721–741, 1984.
 Hastings [1970] W. K. Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1):97–109, 1970.
 Karatzas and Ruf [2016] Ioannis Karatzas and Johannes Ruf. Trading strategies generated by Lyapunov functions. arXiv preprint arXiv:1603.08245, 2016.
 Karatzas and Shreve [1988] Ioannis Karatzas and Steven Shreve. Brownian Motion and Stochastic Calculus. Volume 113 in the series Probability and its Applications (New York). Springer-Verlag, New York, 1988.
 Kom Samo and Roberts [2015a] Yves-Laurent Kom Samo and Stephen Roberts. Generalized spectral kernels. arXiv preprint arXiv:1506.02236, 2015a.
 Kom Samo and Roberts [2015b] Yves-Laurent Kom Samo and Stephen Roberts. String Gaussian processes. arXiv preprint arXiv:1507.06977, 2015b.
 Markowitz [1952] Harry Markowitz. Portfolio selection. The journal of finance, 7(1):77–91, 1952.
 Murray et al. [2010] Iain Murray, Ryan Prescott Adams, and David J. C. MacKay. Elliptical slice sampling. JMLR: W&CP, 9:541–548, 2010.
 Pal and Wong [2014] Soumik Pal and Ting-Kam Leonard Wong. The geometry of relative arbitrage. Mathematics and Financial Economics, pages 1–31, 2014.
 Picková [2014] Radka Picková. Generalized volatility-stabilized processes. Ann. Finance, 10(1):101–125, 2014.
 Quiñonero Candela and Rasmussen [2005] Joaquin Quiñonero Candela and Carl Edward Rasmussen. A unifying view of sparse approximate Gaussian process regression. The Journal of Machine Learning Research, 6:1939–1959, 2005.
 Rogers [2013] L. C. G. Rogers. Optimal Investment. Springer-Verlag, New York, 2013.
 Ruf [2011] Johannes Ruf. Optimal Trading Strategies Under Arbitrage. PhD thesis, Columbia University, 2011.
 Ruf [2013] Johannes Ruf. Hedging under arbitrage. Math. Finance, 23(2):297–317, 2013.
 Saatchi [2011] Yunus Saatchi. Scalable Inference for Structured Gaussian Process Models. PhD thesis, University of Cambridge, 2011.
 Strong [2014] Winslow Strong. Generalizations of functionally generated portfolios with applications to statistical arbitrage. SIAM Journal on Financial Mathematics, 5(1):472–492, 2014.
 Vervuurt [2015] Alexander Vervuurt. Topics in Stochastic Portfolio Theory. arXiv preprint arXiv:1504.02988, 2015.
 Vervuurt and Karatzas [2015] Alexander Vervuurt and Ioannis Karatzas. Diversity-weighted portfolios with negative parameter. Annals of Finance, 11(3–4):411–432, 2015.
 Wong [2015] Ting-Kam Leonard Wong. Optimization of relative arbitrage. Annals of Finance, 11(3–4):345–382, 2015.