# Modeling Risk and Return using Dirichlet Process Prior

In this paper, we show that the no-arbitrage condition holds if the market follows a mixture of geometric Brownian motions (GBM). The mixture of GBMs can incorporate the heavy-tail behavior of the market, and it naturally leads us to model the risk and return of multiple-asset portfolios via nonparametric Bayesian methods. We present a Dirichlet Process (DP) prior via an urn scheme for univariate modeling of a single asset's return. This DP prior is presented in the spirit of the dependent DP. We extend this approach to introduce a multivariate distribution for the return on multiple assets via an elliptical copula, which models the marginal distributions using the DP prior. We compare different risk measures, such as Value at Risk (VaR) and Conditional VaR (CVaR), also known as expected shortfall (ES), for the stock return data of two datasets. The first dataset contains the returns of IBM, Intel, and NASDAQ, and the second dataset contains the return data of 51 stocks that are part of the index "Nifty 50" of the Indian equity markets.


## 1 Introduction

The log-returns of a financial asset are characterized by heavy tails and significant skewness and kurtosis. Parametric modelling of such data, as explained in [13, 19, 20], concentrates mainly on fitting a normal distribution, generalized hyperbolic distributions, or heavy-tailed distributions from the extreme value family. This requires us to obtain maximum likelihood estimates (MLEs) for the parameters of the distribution, which, besides being prone to overfitting, are also inconsistent estimates for the expected return. The usual binomial asset pricing model is a discrete-time analogue of the continuous-time geometric Brownian motion (GBM) [19, 20]. However, empirical studies indicate [16] that financial log-returns are characterized by heavy-tailed distributions. Using the geometric Brownian motion to model stock/asset prices adversely affects the estimation of quantities like VaR, Conditional Value at Risk (CVaR), and other "coherent" risk measures [15, 16]. A single normal distribution fails to account for the heavy tails, where a mixture of normal distributions often performs better. In this paper, we consider a mixture of normal distributions as the starting point for modeling log-returns. Naturally, this motivates a mixture of GBMs for the stock prices.
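To see the heavy-tail effect concretely, the excess kurtosis of a two-component normal scale mixture can be checked numerically; the regime weights and volatilities below are illustrative, not fitted values:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical two-regime mixture: 90% calm days (sd 1%), 10% volatile days (sd 5%).
calm = rng.random(n) < 0.9
r = np.where(calm, rng.normal(0.0, 0.01, n), rng.normal(0.0, 0.05, n))

def excess_kurtosis(x):
    x = x - x.mean()
    return (x**4).mean() / (x**2).mean()**2 - 3.0

print(excess_kurtosis(rng.normal(0.0, 0.01, n)))  # near 0 for a single normal
print(excess_kurtosis(r))                          # clearly positive: heavy tails
```

A single normal has excess kurtosis zero, while the mixture's is markedly positive, which is precisely the tail behaviour the mixture model is meant to capture.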

In this context, a nonparametric Bayesian (NPB) [9] approach to modelling has three advantages. The first advantage is data adaptivity: we present a methodology, accompanied by an algorithm, that takes the data as its only input. Secondly, we use a DP approach to model the data; the DP essentially fits a finite mixture model with regard to the choice of the base measure, and the number of components fitted to the data is learned by augmenting a stochastic process to the algorithm. Finally, the tails are accounted for by the fitted finite mixture model; since the components of the mixture model vary, the tail behaviour is better explained by changes in the precision parameter of the base measure.

We develop a multivariate distribution to model multiple assets' returns using a multivariate copula that models the marginal distributions using the DP prior. In the marginal distributions, the DP prior takes care of the heavy-tail behavior of each single asset's return, and similarly the multivariate copula incorporates the heavy-tail behavior of the joint distribution of the multiple assets' returns.

In section 2, we present the motivation for mixture models; we show that the no-arbitrage condition holds if the market follows a mixture of GBMs, and that the mixture of GBMs can incorporate heavy-tail behavior. In section 3, we establish the framework for the DP prior in subsections (3.1) and (3.2); the first deals with the modelling of a single asset using DP priors, and the approach is then extended to higher dimensions using an elliptical copula for multiple assets' returns. In section 4, a gradual development towards an optimal risk measure is provided, indicating the advantages and disadvantages of the suggested measures. The performance of these risk measures is assessed based on the suggested probability model for the log-returns of assets. In section 5, we elaborate on the computational details needed to implement the modelling in both the univariate and multivariate cases. Finally, section 6 deals with the application aspects of the suggested modelling approach on univariate as well as multivariate datasets. Section 7 concludes the paper.

## 2 Motivation for Mixture Models

Suppose $S_t$ is the price of a stock at time point $t$, which follows a geometric Brownian motion (GBM). The stochastic differential equation (SDE) corresponding to the GBM is:

 dS_t = S_t(\mu\, dt + \sigma\, dB_t), \quad S_0 = 1,

where $B_t$ is a standard Brownian motion, $\mu$ is the drift parameter, and $\sigma$ is the volatility. Then the solution of the SDE, namely the geometric Brownian motion or stock price model, is as follows:

 S_t = S_0\, e^{\mu t + \sigma B_t - \frac{1}{2}\sigma^2 t}.

The log-return is $r_t = \log(S_t/S_0) = (\mu - \frac{1}{2}\sigma^2)t + \sigma B_t$, with expectation $(\mu - \frac{1}{2}\sigma^2)t$ and variance $\sigma^2 t$. So we have $r_t \sim N\big((\mu - \frac{1}{2}\sigma^2)t, \sigma^2 t\big)$. With this background in mind, we consider the log-return as a mixture of Brownian motions, as follows:

 r_t \sim \sum_{i=1}^{n} \pi_i N\Big(\big(\mu - \tfrac{1}{2}\sigma_i^2\big)t,\, \sigma_i^2 t\Big) = \sum_{i=1}^{n} \pi_i X_i,

where $\pi_i \ge 0$, $\sum_{i=1}^{n}\pi_i = 1$, and

 X_i \sim N\Big(\big(\mu - \tfrac{1}{2}\sigma_i^2\big)t,\, \sigma_i^2 t\Big)

with expectation $(\mu - \frac{1}{2}\sigma_i^2)t$ and variance $\sigma_i^2 t$. Note that we can express $X_i$ as

 X_i = \mu t + \sigma_i B^i_t - \tfrac{1}{2}\sigma_i^2 t.

So the log-return can be expressed as,

 r_t = \sum_{i=1}^{n}\pi_i X_i = \mu t + \sum_{i=1}^{n}\pi_i\sigma_i B^i_t - \frac{1}{2}\sum_{i=1}^{n}\pi_i\sigma_i^2 t,

where the $B^i_t$'s are independent Brownian motions for $i = 1, \dots, n$ on the same complete probability space, and the corresponding filtration is denoted by $\mathcal{F}_t$.

###### Remark 2.1.

The discounted price process is a martingale. If $\mu = r$, where $r$ is the risk-free interest rate, then $r_t$ can be interpreted as the discounted log-return.

###### Remark 2.2.

This result implies that if the market follows a finite mixture of GBMs, then there is no arbitrage opportunity in the market. However, the market is incomplete.

## 3 Methodology for Modelling Asset Return

The seminal paper by [18] provided a constructive definition of the DP prior and its mathematical properties. Recent advancements in Monte Carlo techniques make it possible to implement the DP prior for constructing various kinds of generalized mixture models. The DP is obtained as an infinite-dimensional generalization of the finite Dirichlet distribution. Mixture models consider a kernel $K(x, \theta)$ on the space $\mathbb{X} \times \Theta$, where $K$ is a measurable function satisfying the condition that for all $\theta \in \Theta$, $K(\cdot, \theta)$ is a density with respect to some $\sigma$-finite measure on $\mathbb{X}$. Let $\Pi$ denote the class of all priors defined on $\Theta$. Then for some $P \in \Pi$, a prior is induced on the density of $X$. The DP priors are useful in providing infinite-dimensional extensions to finite-dimensional mixture models and consequently assign priors on unknown distributions. The predictive distribution for the problem is then given by,

 \theta_1, \dots, \theta_k \overset{\text{Exch.}}{\sim} P \in \Pi, \qquad X_1, \dots, X_n \mid \theta_1, \dots, \theta_k \overset{\text{iid}}{\sim} \sum_{i=1}^{k \le n} h^{-1} K\Big(\frac{x - \theta_i}{h}\Big), \tag{3.1}

which essentially approximates the density of $X$, given the bandwidth $h$. Furthermore, if we equip $\Pi$ with natural topologies, like the weak or norm topology, issues related to the posterior consistency of estimates obtained using (3.1) can be found in [12]. [26] introduced an approach to model Value at Risk (VaR) by assigning DP priors directly on the log-return of a financial security/asset. According to [11], DPs are not used directly to model data. In this paper, we use a mathematical formulation similar to [12], with the kernel taken as the univariate normal distribution and the prior as the required DP with base measure $P_0$. This is an alternative attempt to model financial log-returns via a nonparametric Bayesian approach. The assumption of normality on the base measure does not affect the composition of the data in terms of modelling around the locations.

Let us consider $\mathbb{X} = \mathbb{R}$; then we have the corresponding measurable space $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$. Let $(\Theta, \mathcal{A})$ denote the corresponding measurable parametric space; then a prior $P$ is a probability measure on $(\Theta, \mathcal{A})$. If $P \sim DP(\alpha P_0)$ is a DP prior on $(\Theta, \mathcal{A})$ and $X_1, \dots, X_n$ is a sample, then,

 P(B) \mid X_1, \dots, X_n \sim \text{Beta}\Big(\alpha P_0(B) + \sum_{i=1}^{n}\delta_{X_i}(B),\; \alpha\{1 - P_0(B)\} + \sum_{i=1}^{n}\delta_{X_i}(B^c)\Big),

which holds for all $B \in \mathcal{B}(\mathbb{R})$ [9]. It follows that,

 E[P(B) \mid X_1, \dots, X_n] = \frac{\alpha}{\alpha + n}P_0(B) + \frac{n}{\alpha + n}\sum_{i=1}^{n}\frac{1}{n}\delta_{X_i}(B) \tag{3.2}
 \to \sum_{i=1}^{n}\frac{1}{n}\delta_{X_i}(B), \quad \text{as } n \to \infty. \tag{3.3}
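The shrinkage in (3.2)-(3.3) is easy to verify numerically; the concentration parameter, base measure, and simulated data below are illustrative choices only:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

alpha = 2.0                          # illustrative concentration parameter
P0 = norm(0.0, 1.0)                  # illustrative standard-normal base measure
x = rng.standard_t(df=3, size=500)   # hypothetical heavy-tailed "returns"

def posterior_mean_P(lo, hi, x, alpha, P0):
    """E[P(B) | X_1..X_n] = a/(a+n) P0(B) + n/(a+n) * empirical mass of B = (lo, hi]."""
    n = len(x)
    prior_mass = P0.cdf(hi) - P0.cdf(lo)
    emp_mass = np.mean((x > lo) & (x <= hi))
    return alpha / (alpha + n) * prior_mass + n / (alpha + n) * emp_mass

# With n = 500 >> alpha, the posterior mean is dominated by the empirical
# measure, matching the limit in (3.3).
print(posterior_mean_P(-1.0, 1.0, x, alpha, P0))
```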

### 3.1 Modelling a Single Asset

Throughout this paper we shall refer to $\{S_t\}$ as the price-path and $\{R_t\}$ as its corresponding log-return. The volatility measure used is the standard deviation of the increment process. Then $R_{t_i} = \log(S_{t_i}/S_{t_{i-1}})$, where $S_{t_i}$ is the price of the security at time $t_i$. The model we consider is,

 R_{t_i} \sim K(\theta), \quad \theta \sim P, \quad P \sim DP(\alpha P_0). \tag{3.4}

According to [9], if $R_t$ is a sample of size 1, then for $\theta \in B$,

 f_{S_t}(s_t, \theta, P \in B) = S_0 \cdot e^{K(R_t \mid \theta)} \cdot P(\theta) \cdot \frac{\alpha P_0(\theta \in B)}{\alpha P_0(\Theta)}.

We refrain from making any assumption on the kernel $K$ in this section, to keep the discussion of the ensuing consequences of (3.1) as generic as possible.

###### Note 3.1.

We can consider our base measure to be a Wiener measure, in particular Brownian motion ([20]), when modelling log-returns. Then the Bayes' risk for our approach is the same as that of the volatility parameter of the geometric Brownian motion. Given the sample, the DP-prior probability structure allows us to adaptively update the estimate of our base measure. The hierarchy then induces a probability structure which models the underlying stock/asset price.

Therefore, to explicitly obtain the path of the asset price, we integrate the above over all possible partitions of $\Theta$. The stick-breaking construction by [18] is then used to provide us with an induced map,

 P(R_{t_i}) = \sum_{h=1}^{\infty}\pi_h K(R_{t_i} \mid \theta^*_h), \tag{3.5}
 \theta^*_h \sim P_0, \tag{3.6}

where $\pi_h = V_h\prod_{k<h}(1 - V_k)$ with $V_h \sim \text{Beta}(1, \alpha)$. [18] showed that (3.5) is also a DP prior.
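A minimal sketch of the truncated stick-breaking construction (3.5)-(3.6), with an assumed N(0,1) base measure and an illustrative normal kernel:

```python
import numpy as np

rng = np.random.default_rng(2)

def stick_breaking(alpha, H):
    """pi_h = V_h * prod_{k<h} (1 - V_k), with V_h ~ Beta(1, alpha), truncated at H."""
    V = rng.beta(1.0, alpha, size=H)
    V[-1] = 1.0  # truncation: forces the weights to sum to one
    return V * np.concatenate([[1.0], np.cumprod(1.0 - V[:-1])])

H = 50
pi = stick_breaking(alpha=1.0, H=H)
theta_star = rng.normal(0.0, 1.0, size=H)  # atoms from the assumed N(0,1) base measure

# One draw from the induced mixture P(R) = sum_h pi_h K(R | theta*_h),
# with an illustrative normal kernel of sd 0.1:
h = rng.choice(H, p=pi)
r = rng.normal(theta_star[h], 0.1)
print(pi.sum())  # 1.0 up to floating point
```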

A desirable property of the DP prior on $(\Theta, \mathcal{A})$ is conjugacy, as mentioned in [9] and [18]. Let $P$ be a Dirichlet process defined on $(\Theta, \mathcal{A})$ with parameter $\alpha P_0$. Then the conjugacy property states, $P \mid \theta_1, \dots, \theta_n \sim DP\big(\alpha P_0 + \sum_{i=1}^{n}\delta_{\theta_i}\big)$. Here $\delta_{\theta_i}$ is the measure assigning probability 1 to $\{\theta_i\}$. In other words, if $\theta_1, \dots, \theta_n$ are sampled parameters corresponding to log-returns of the asset, the posterior map given the parameters is also a DP with suitably altered parameters. We now aim to derive the form of the distribution function of the induced probability map on the log-returns $R_t$.

###### Theorem 3.1.

For a stochastic process $\{R_t\}$ on the measurable space $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ with parameters $\theta \in \Theta$, if $P \sim DP(\alpha P_0)$ is a probability measure assigned on $(\Theta, \mathcal{A})$, then the distribution function induced on $R_t$ is,

 P(R_t \in A \mid \theta \in B) = \frac{\alpha}{\alpha + m}\hat{F}^{0}_{R_t}(A \mid \theta \in B) + \frac{m}{\alpha + m}\hat{F}^{m}_{R_t}(A \mid \theta \in B),

where $\theta_1, \dots, \theta_m$ is a realization of size $m$ from $P$, and

 \hat{F}^{0}_{R_t}(A \mid \theta \in B) = \int_A K(R_t \mid \theta \in B)\, dP_0(B),
 \hat{F}^{m}_{R_t}(A \mid \theta \in B) = \int_A K(R_t \mid \theta \in B)\, d\Big(\sum_{i=1}^{m}\frac{1}{m}\delta_{\theta_i}(B)\Big) = \sum_{i=1}^{m}\frac{1}{m}\int_A K(R_t \mid \theta_i)\, dP_{\theta_i}(B).

The last equality is obtained by noting that the integral commutes with the finite sum.

###### Note 3.2.

Typically, the kernel is location invariant; in that context, clustering mainly affects the volatility parameter. Therefore, the clusters may be interpreted as volatility regimes located in the parameter space. On the other hand, the existence of bull- and bear-market trends affects the location of the process $\{R_t\}$. Altogether, a location-scale kernel mixture would therefore do justice to both of these facts in conjunction for modelling the increment process.

The development of DP priors is fundamentally based on Polya urn processes and Chinese restaurant processes. In this paper we consider $H$ urns, and the $n$ data points as balls that are given at the start of the experiment. The $H$ urns can be thought of as a partition of the parameter space; theoretically, the partition size can be infinite. We have a prior and a base measure $P_0$, which correspond to our prior knowledge regarding urn occupancy and the distribution of the occupied urns. We then perform the random experiment of throwing the balls. Here $\alpha$ serves as the tuning parameter controlling the concentration of balls in the urns, and $P_0$ serves as the probability assigned to the urns/partition. With respect to $P_0$, we should have an idea regarding the furthest expected urn occupancy in our throw. Thus one throw produces $H^* \le H$ urns that are occupied. Note that $H$ is the maximum number of urns that can be occupied by the balls.
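The urn scheme described above can be simulated directly; the values of n and α below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)

def polya_urn_clusters(n, alpha):
    """Seat n balls: ball i joins occupied urn j w.p. n_j/(alpha + i),
    or opens a new urn w.p. alpha/(alpha + i)."""
    counts = []
    for _ in range(n):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        j = rng.choice(len(probs), p=probs)
        if j == len(counts):
            counts.append(1)   # a new urn becomes occupied
        else:
            counts[j] += 1
    return counts

counts = polya_urn_clusters(n=500, alpha=2.0)
print(len(counts), sum(counts))  # H* occupied urns; all 500 balls placed
```

Larger α spreads the balls across more urns, which is exactly its role as the concentration tuning parameter.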

###### Lemma 3.1.

For a base measure $P_0$, a tuning parameter $\alpha$, and the resulting set of occupied urns, for a given $n$, the urn-modelling occurs almost surely in $H^* \le H$ urns across iterations.

### 3.2 Modelling Multiple Assets through Copula

Given a collection of $p$ assets, a portfolio ([7]) is a $p$-vector consisting of the appropriate weights. In this section we assume that we are given a portfolio, that is, a set of appropriate weights. We consider the portfolio in terms of the associated log-returns of the concerned $p$ assets. We also assume that an investor allocates a fixed sum according to the portfolio and that the prices are observable over the chosen time horizon. The associated log-return over this horizon is denoted by $R_{it}$, for asset $i = 1, \dots, p$. We aim to model the marginal distributions of these returns in this section, using the DP priors.

The correlations between the $p$ assets can be modelled using a $p$-dimensional multivariate probability distribution. In the previous section (3.1), we presented the methodology to model the return of a single asset using a DP prior. In this section we use the copula technique, modelling the marginal distributions using the DP prior. The correlation structure is modelled using an elliptical copula, such as the multivariate $t$-copula.

Elliptical copulas correspond to the class of elliptical distributions through Sklar's theorem. Let $F$ denote the multivariate CDF of an elliptical distribution, with $F_i$ the marginal of the $i$-th component and $F_i^{-1}$ its corresponding inverse, for $i = 1, \dots, p$. Then, using Sklar's theorem, the elliptical copula is determined via

 C(u_1, \dots, u_p) = F[F_1^{-1}(u_1), \dots, F_p^{-1}(u_p)].

The uniqueness of the copula obtained using Sklar's theorem ([21]) relies on the assumption that the marginals all have continuous CDFs. This facilitates the application of the probability integral transform on the marginals to formulate the copula. In section (3.1) we showed that the CDF induced on $R_t$ is continuous. Consequently, modelling the $p$ assets using a DP prior and using the induced CDFs as marginals ensures the existence of a unique copula.
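A sketch of this construction, assuming a Gaussian copula with a made-up correlation matrix and Student-t marginals standing in for the DP-induced marginal CDFs:

```python
import numpy as np
from scipy.stats import norm, t, multivariate_normal

# Assumed correlation matrix for p = 3 assets (illustrative).
Sigma = np.array([[1.0, 0.6, 0.3],
                  [0.6, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])

# Sample from the Gaussian copula: correlated normals -> uniforms via the CDF.
z = multivariate_normal(mean=np.zeros(3), cov=Sigma).rvs(size=10_000, random_state=5)
u = norm.cdf(z)              # probability integral transform, u_i = Phi(z_i)
r = t(df=4).ppf(u)           # Sklar: plug u_i into heavy-tailed inverse marginal CDFs

print(np.corrcoef(r.T).round(2))  # dependence structure inherited from the copula
```

The marginals can be swapped for any continuous CDF (here a t with 4 degrees of freedom) without disturbing the copula, which is the point of Sklar's decomposition.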

The copula is defined given a covariance matrix $\Sigma$. For instance, the Gaussian copula [25] has the following density,

 c(u_1, \dots, u_p \mid \Sigma) = |\Sigma|^{-\frac{1}{2}}\exp\Big[\frac{1}{2}q^{T}(I_p - \Sigma^{-1})q\Big],

where $q = (\Phi^{-1}(u_1), \dots, \Phi^{-1}(u_p))^T$. The measure of association between the $p$ assets is denoted by $\Sigma$. In this paper we consider an appropriate measure of concordance [17] to obtain the entries $\Sigma_{ij}$, where $i, j = 1, \dots, p$. An interesting property of the family of concordance measures [5] is consistency: if we have a sequence of continuous random variables with copula $C_n$, then as the copula converges (pointwise), the measure of concordance also converges. In this paper we use Kendall's $\tau$ to model the association between the $p$ assets under consideration. Considering this with respect to the results in section (3.1), we have the following lemma.

###### Lemma 3.2.

Let $C_n$ be the copula associated with the DP-modelled returns $R_{1,n}, \dots, R_{p,n}$. Then, by the uniqueness of the fitted copula, if

 \lim_{n \to +\infty} C_n(u_1, \dots, u_p) = C(u_1, \dots, u_p),

then,

 M_{R_{1,n}, \dots, R_{p,n}} \overset{a.s.}{\to} M_{R_1, \dots, R_p},

where $\overset{a.s.}{\to}$ denotes almost sure convergence and $M$ is the concordance measure.
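Estimating the concordance entries via Kendall's τ can be sketched as follows; the one-factor model generating the two return series is purely illustrative, and ρ = sin(πτ/2) is the standard τ-to-correlation map for elliptical copulas:

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(6)

# Hypothetical daily log-returns of two assets sharing a common market factor.
market = rng.normal(0.0, 0.01, 1000)
r1 = market + rng.normal(0.0, 0.005, 1000)
r2 = market + rng.normal(0.0, 0.005, 1000)

tau, _ = kendalltau(r1, r2)
rho = np.sin(np.pi * tau / 2.0)  # tau -> correlation entry, valid for elliptical copulas
print(round(tau, 2), round(rho, 2))
```

Because τ is rank-based, this estimate is invariant to the (continuous, monotone) marginal transformations, which is what makes it a natural concordance measure here.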

## 4 Coherent Risk Measures

In this section we present coherent risk measures and consider their performance with reference to the log-returns of the marginal components of a $p$-asset portfolio modelled using DP priors. First, we present a discussion regarding the development of coherent risk measures, followed by a discussion of how the induced probability structure can be incorporated to evaluate portfolios.

Let $\Omega$ be the sample space, and let $R^P_t$ denote the map corresponding to the loss or gain for a $p$-asset portfolio over the time horizon $t$. This map is termed the risk associated with the portfolio. For instance,

 R^P_t = \sum_{i=1}^{p}\omega_i R_{it} = \omega^T R_t, \qquad \sum_{i=1}^{p}\omega_i = 1. \tag{4.1}

Then $R^P_t$ can be interpreted as the loss or gain (risk), in terms of return, from the $p$-asset portfolio. In general, if $(\Omega, \mathcal{F}, P)$ is a probability space and $\mathcal{X}$ is the set of all such risk maps, a risk measure is defined as a function $\rho: \mathcal{X} \to \mathbb{R}$.

###### Remark 4.1.

According to [4], $\rho(X) > 0$ implies that a positive value is assigned by the measure to the risk $X$; $\rho(X)$ is then the minimum amount of capital that is to be added to the position, by investing in the risk-free rate, to surpass any level of risk. Conversely, $\rho(X) < 0$ implies that $|\rho(X)|$ can be cashed from the current position without any risk.

Commonly used measures of risk, such as the Value at Risk associated with $X$, suffer from a variety of deficiencies. These have been identified ([23], [3], [2]) in order to formulate a more robust class of measures of risk, given by the "coherent" risk measures. [2] states that coherent measures of risk should satisfy four properties: (i) sub-additivity, (ii) positive homogeneity, (iii) translation invariance, and (iv) monotonicity. [23] changed how portfolio risk was quantified by establishing a generalized theory of coherent measures of risk using distortion functions. Distortion functions are always defined using a probability measure associated with the risk $X$. [23] also established the Choquet integral expressions [8] for the commonly used measures of risk. A distortion function $g$ is defined with respect to a probability measure $P$ over a measure/probability space $(\Omega, \mathcal{F}, P)$. If $g: [0,1] \to [0,1]$ is an increasing concave function with $g(0) = 0$ and $g(1) = 1$, then,

 μ = g∘P.

The dual distortion function being,

 \tilde{g}(u) = 1 - g(1 - u).

The risk measure is defined through the Choquet integral,

 \rho_g(X) = \int X\, d\mu = \int_0^{\infty} g[P(X > x)]\, dx - \int_{-\infty}^{0}\{1 - g[P(X > x)]\}\, dx = E_{\mu}(X). \tag{4.2}

It is evident that the nature of the distortion function affects the "coherence" of the obtained risk measure. Also, if we assume that $X$ is $\mu$-integrable, then the distorted risk measure is the expectation under the re-weighted probabilities. By using this construction we obtain risk measures that are coherent.
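For an empirical sample, the Choquet integral (4.2) reduces to a re-weighting of the order statistics; the sketch below uses an arbitrary concave distortion g(u) = √u for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)

def distorted_expectation(x, g):
    """Empirical Choquet integral: sum_i x_(i) * [g(S_{i-1}) - g(S_i)],
    where S_i = P(X > x_(i)) is the empirical survival function."""
    x = np.sort(x)
    n = len(x)
    S_prev = 1.0 - np.arange(0, n) / n
    S = 1.0 - np.arange(1, n + 1) / n
    w = g(S_prev) - g(S)       # re-weighted probabilities, mu = g o P
    return float(np.sum(x * w))

x = rng.normal(0.0, 1.0, 100_000)
identity = lambda u: u
concave = lambda u: np.sqrt(u)  # increasing, concave, g(0)=0, g(1)=1

print(distorted_expectation(x, identity))  # recovers the plain sample mean
print(distorted_expectation(x, concave))   # larger: tail losses are up-weighted
```

With the identity distortion the weights are all 1/n and the plain mean is recovered; a concave g shifts mass towards the large-loss order statistics, which is the mechanism behind coherence.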

###### Remark 4.2.

The VaR is not a coherent risk measure. The class of coherent distortion risk measures can be further extended to formulate exhaustive distortion risk measures, which are both coherent and complete. Completeness ([4]) of a distortion risk measure relates primarily to the ability of the distortion function to utilise information from the original loss distribution associated with the risk triplet $(\Omega, \mathcal{F}, P)$, where $X$ is the associated risk.

Formally, if $X$ is the associated risk variable over $(\Omega, \mathcal{F}, P)$, then $\rho_g$ is a complete distortion risk measure generated by $g$ if,

 P(X > x_1) = P(X > x_2) \iff \mu((x_1, \infty)) = \mu((x_2, \infty)),

where $x_1, x_2 \in \mathbb{R}$. In conjunction with this definition, it is important to state two theorems from ([4]).

###### Theorem 4.1.

For a distorted probability defined by a distortion function $g$, $\rho_g$ is complete if and only if $g$ is strictly increasing.

The proof immediately follows from the definition of completeness for distorted risk measures.

###### Theorem 4.2.

If $\rho_g$ is a distorted risk measure, then it is an exhaustive distortion risk measure if and only if $g$ is concave and strictly increasing. Equivalently, it is exhaustive if and only if the dual distortion $\tilde{g}$ is convex and strictly increasing.

The proof can be found in [4]. These theorems establish mainly that for a risk measure to be complete, the distortion function should effectively incorporate all the information in the loss distribution of $X$.

###### Remark 4.3.

It is clear that the formulation of a risk measure calls for exercising caution on two fronts, viz., the selection of an appropriate loss distribution to model the risk function, and the choice of an appropriate distortion function $g$ to incorporate all of the information in the associated probability to measure the risk of a position.

With reference to the methodology developed above for modelling the log-returns of a $p$-asset portfolio using DP priors, we now proceed to look at the performance of the aforesaid risk measures. The DP is a hyperprior, as it assigns a DP prior to the mixing distribution. At the beginning of this section we showed how the log-return is a valid risk (a measure of gain or loss in net worth) associated with a portfolio. In section (3.1), we saw that the DP prior induces a probability measure on the log-return $R_t$. Let us consider a collection of such log-returns corresponding to the respective $p$ assets modelled using DP priors, with their covariance accounted for by an appropriate copula as shown in section (3.2). Then we have an induced probability for the probability space through a DP prior, and the associated risk is given by equation (4.1), on which the DP induces a probability structure. Explicitly,

 R^P_t \sim DP(\underbrace{\alpha P_0, \dots, \alpha P_0}_{p\ \text{times}}, C), \qquad P_0 \sim N\text{-Inv-}\chi^2\Big(\mu_0, \frac{\sigma^2_0}{\kappa_0}; \nu_0, \sigma^2_0\Big).

Here the DP is a multivariate dependent Dirichlet process with DP priors as marginals, and $C$ is the associated copula with an appropriate concordance measure.

Now we consider the commonly used measures of risk. It is important to note that, given a filtration $\mathcal{F}_{M^*}$ up to iteration $M^*$ ($M^*$ being the number of iterations until the mixing RPM estimate is obtained), the associated probability of the loss for a single asset is given by the following equation,

 R_t \mid \mathcal{F}_{M^*} \sim \sum_{h=1}^{H(M)^*}\pi^*_h N\big(R_t \mid \mu^*_h, \phi^{*-1}_h\big), \tag{4.3}

where $(\pi^*_h, \mu^*_h, \phi^*_h)$ are Bayes' estimates obtained after successful convergence for the parameters of the DP prior.

### 4.1 Risk Measures

Value at Risk: VaR

For an appropriate risk $X$, $\mathrm{VaR}_\gamma(X)$ is defined as,

 \mathrm{VaR}_\gamma(X) = \sup\{x \in \mathbb{R} \mid P(X \ge x) > 1 - \gamma\}.

It is obtained as a Choquet integral ([8], [4]) by setting,

 g(u) = \begin{cases} 0 & \text{if } 0 \le u < 1-\gamma, \\ 1 & \text{if } 1-\gamma \le u \le 1. \end{cases}

Using (4.2) we have $\rho_g(X) = \mathrm{VaR}_\gamma(X)$; $\mathrm{VaR}_\gamma(X)$ is simply the $\gamma\%$ quantile of the loss distribution associated with $X$. By assigning the DP prior on the parameter space, the induced probability distribution on the log-return variable is given by theorem (3.1), and the Value at Risk for a single-asset portfolio becomes a $\gamma\%$ quantile of the log-return distribution. The problem of estimating $\mathrm{VaR}_\gamma$ is then equivalent to estimating the $\gamma\%$ quantile of $R_t$. Assuming that the market assumptions for the model hold, we have the GBM model for the log-return over an investment horizon $T$,

 R_t \sim N\Big(\Big(\mu - \frac{\sigma^2}{2}\Big)T,\; \sigma^2 T\Big). \tag{4.4}

When comparing the models (4.3) and (4.4) in terms of estimating quantiles, we get a clear picture regarding the importance of the loss distribution. The $\gamma\%$ quantile for (4.4) is obtained by solving,

 \Phi\Big(\frac{q_\gamma - (\mu - \sigma^2/2)T}{\sigma\sqrt{T}}\Big) = \gamma,

whereas, for (4.3), we solve the following equation for $q_\gamma$,

 \int_{-\infty}^{q_\gamma}\sum_{h=1}^{H(M)^*}\pi^*_h N\big(R_t \mid \mu^*_h, \phi^{*-1}_h\big)\, dR_t = \gamma.

Here $H(M)^*$ is finite. Since the sum is finite we have,

 \sum_{h=1}^{H(M)^*}\pi^*_h\int_{-\infty}^{q_\gamma} N\big(R_t \mid \mu^*_h, \phi^{*-1}_h\big)\, dR_t = \gamma, \qquad \sum_{h=1}^{H(M)^*}\pi^*_h\,\Phi\Big(\frac{q_\gamma - \mu^*_h}{\sqrt{\phi^{*-1}_h}}\Big) = \gamma.

Here $\Phi$ denotes the distribution function of the standard normal. However, there does not exist any closed-form expression for the solution of the above equation. It is evident that the estimates of $q_\gamma$ will be different in the two cases; the difference is a direct consequence of (4.2), and results in a significant change in the estimate of $\mathrm{VaR}_\gamma$. The VaR has been known to suffer from numerous deficiencies, the foremost of them being the lack of sub-additivity. It is not a convex measure of risk; therefore, diversification in terms of assets does not provide room for optimization [10]. Despite these discrepancies, it is widely used as a risk measure due to its simplicity of interpretation.
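Since no closed form exists, the mixture quantile can be obtained by root finding; the mixture parameters below are made-up stand-ins for the converged Gibbs estimates:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

# Made-up mixture (weights, means, sds) standing in for the converged
# blocked-Gibbs estimates (pi*_h, mu*_h, phi*_h^{-1}).
pi = np.array([0.85, 0.15])
mu = np.array([0.0005, -0.002])
sd = np.array([0.008, 0.030])

def mixture_cdf(q):
    return float(np.sum(pi * norm.cdf((q - mu) / sd)))

def mixture_quantile(gamma):
    """Solve sum_h pi_h Phi((q - mu_h)/sd_h) = gamma for q by root finding."""
    return brentq(lambda q: mixture_cdf(q) - gamma, -1.0, 1.0)

gamma = 0.01
q_mix = mixture_quantile(gamma)

# Single-normal benchmark with the same overall mean and variance:
m = float(np.sum(pi * mu))
v = float(np.sum(pi * (sd**2 + mu**2)) - m**2)
q_bs = norm.ppf(gamma, loc=m, scale=np.sqrt(v))

print(q_mix < q_bs)  # True: the mixture's 1% quantile sits further out in the left tail
```

Even with matched mean and variance, the mixture places more mass in the tails, so its VaR estimate is more conservative than the single-normal one.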

The reason behind considering quantile estimation for a single asset is to elucidate the significance of the loss distribution. For modelling the risk of a $p$-asset portfolio, we consider univariate modelling of the $p$ assets using equation (4.3), with the covariance structure specified by fitting an appropriate copula. The distortion function for VaR remains constant over the interval $[1-\gamma, 1]$, which results in VaR not being a complete risk measure; this follows from the equivalent condition stated in theorem (4.1). Furthermore, the behaviour of the distortion function over the interval $[0, 1-\gamma)$ does not make it suitable for being an exhaustive distortion risk measure either.

ESF/CVaR: Expected Shortfall/Conditional Value at Risk

This measure of risk was first introduced by [3]. For an appropriate risk $X$, $\mathrm{ESF}_\gamma(X)$, or $\mathrm{CVaR}_\gamma(X)$, is defined as,

 \mathrm{ESF}_\gamma(X) = \frac{1}{1-\gamma}\int_0^{1-\gamma}\mathrm{VaR}_u(X)\, du = \frac{1}{1-\gamma}\int_0^{1-\gamma}\sup\{x \in \mathbb{R} \mid P(X \ge x) > 1 - u\}\, du.

The associated distortion function is given by,

 g(u) = \begin{cases} \dfrac{u}{1-\gamma} & \text{if } 0 \le u < 1-\gamma, \\ 1 & \text{if } 1-\gamma \le u \le 1. \end{cases}

$\mathrm{ESF}_\gamma$ depends on $\mathrm{VaR}_u$, and therefore significant changes are expected when considering variations in the distribution from (4.4) to (4.3). The discussion regarding the improvement in the estimation of risk using equation (4.3) as the associated loss distribution holds true for $\mathrm{ESF}_\gamma$ as well. In this case, the distortion function is better in terms of information content from the loss distribution. Furthermore, it is easy to see that $g$ is non-decreasing and concave; consequently, $\mathrm{ESF}_\gamma$ is a coherent risk measure.
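Under the convention ESF_γ = (1/(1−γ)) ∫_0^{1−γ} VaR_u du used above, the expected shortfall of a fitted mixture can be approximated by averaging tail quantiles; the two-component mixture below is an illustrative stand-in for the converged DP fit:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

# Illustrative two-component mixture standing in for the converged DP fit.
pi = np.array([0.85, 0.15])
mu = np.array([0.0005, -0.002])
sd = np.array([0.008, 0.030])

def mixture_ppf(p):
    f = lambda q: float(np.sum(pi * norm.cdf((q - mu) / sd))) - p
    return brentq(f, -1.0, 1.0)

def expected_shortfall(gamma, k=2000):
    """ESF_gamma ~ (1/(1-gamma)) * int_0^{1-gamma} VaR_u du, approximated by
    averaging k quantiles over the tail interval (0, 1-gamma]."""
    us = np.linspace(1e-6, 1.0 - gamma, k)
    return float(np.mean([mixture_ppf(u) for u in us]))

gamma = 0.95
var_level = mixture_ppf(1.0 - gamma)   # the 5% tail quantile
es = expected_shortfall(gamma)
print(es < var_level)  # True: ES averages the losses beyond VaR, so it is more extreme
```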

WT: Wang's Transform

Despite being coherent, $\mathrm{ESF}_\gamma$ suffers from insensitivity towards the severity of loss in final net worth, that is, higher risk below the tail points of the loss distribution. This serves as the major downside for $\mathrm{ESF}_\gamma$; moreover, $g$ being constant on the interval $[1-\gamma, 1]$ does not qualify the risk measure to be a complete one. The Wang transform is a valid measure belonging to the class of complete risk measures. [24] draws heavily from the general principles established in [23] to suggest a distortion function that concentrates on a symmetric parametric family, viz., the normal class of probability measures.

The advantage of considering a symmetric family is reflected in the dual distortion function $\tilde{g}$. The suggested distortion is given by,

 g_r(u) = \Phi[\Phi^{-1}(u) + r],

where $r$ is the corresponding market price of risk. It follows from the definition of $g_r$ that for $r > 0$, $g_r$ is concave, and for $r < 0$, $g_r$ is convex. Therefore, with reference to (4.2), one can easily derive that the distorted risk measure corresponding to $g_r$ with $r > 0$ is complete and exhaustive. Coupled with the stated properties, using equation (4.3) as an alternative to (4.4) provides better estimates for the risk measure associated with $g_r$.
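The Wang transform can be applied to an empirical loss sample through the same Choquet re-weighting of order statistics; the sample and the value of r below are illustrative:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(8)

def wang_distorted_mean(x, r):
    """Empirical Choquet integral under g_r(u) = Phi(Phi^{-1}(u) + r)."""
    x = np.sort(x)
    n = len(x)
    g = lambda u: norm.cdf(norm.ppf(np.clip(u, 1e-12, 1.0 - 1e-12)) + r)
    S_prev = 1.0 - np.arange(0, n) / n
    S = 1.0 - np.arange(1, n + 1) / n
    return float(np.sum(x * (g(S_prev) - g(S))))

x = rng.normal(0.0, 1.0, 50_000)           # illustrative standardized loss sample
print(wang_distorted_mean(x, r=0.0))        # r = 0 recovers the ordinary mean
print(wang_distorted_mean(x, r=0.5) > 0.0)  # r > 0 loads the loss tail
```

With r = 0 the distortion is the identity and the plain mean is recovered; a positive r shifts the distorted mean upward, reflecting the market price of risk.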

## 5 Computational Issues

In this section we consider computational aspects of applying the suggested approach to model the data. We use the blocked Gibbs sampler as the MCMC algorithm to update the cluster-specific parameters. The advantages of using blocked Gibbs sampling can be summarized in two factors. Firstly, instead of using just a scale prior, we can now use location-scale families of DPs to model the data; the blocked Gibbs sampler is suited specifically for the purpose of simultaneously updating multiple parameters. Secondly, we have a conjugate prior for the kernel parameters for the blocked Gibbs in general. This makes the application more data-adaptive and generalized in nature. This section is divided into two parts. The first discussion is about the alterations proposed in the case of irregular clusters, preceded by a short digression explaining what regular clusters are with respect to the current theoretical setup. The second discussion mainly features an MCMC algorithm to implement the procedure.

We make the following assumptions,

 K(\theta^{(m)}_h) = N\big(\mu^{(m)}_h, \phi^{(m)-1}_h\big), \tag{5.1}
 P_0 = \text{Normal-Inv-}\chi^2\Big(\mu_0, \frac{\sigma^2_0}{\kappa_0}; \nu_0, \sigma^2_0\Big). \tag{5.2}

In light of (5.2), a subtle yet serious issue is the formation of improper clusters. Broadly, we are faced with the following cases for the occupancy $n^{(m)}_h$ of cluster $h$: (i) $n^{(m)}_h = 0$, (ii) $n^{(m)}_h = 1$, and (iii) $n^{(m)}_h \ge 2$. The second case shows the presence of an improper cluster, for which second-order moments lose interpretability.

###### Remark 5.1.

Let $n^{(m)}_h$ denote the number of points allocated to cluster $h$ at a particular iteration $m$. If $n^{(m)}_h = 1$, we say that for the $m$-th iteration the cluster $h$ is irregularly occupied; the cluster then loses its usual interpretability in terms of moments.

Let us assume that, for a particular iteration $m$, $\theta^{(m)}_1, \dots, \theta^{(m)}_H$ are the unique values of the sampled parameters, which characterize the data with $H \le n$ clusters. Then we have the distribution function from (3.1),

 P(\theta \mid \theta^{(m)}) = \frac{\alpha}{\alpha + n}P_0(\theta) + \frac{n}{\alpha + n}\sum_{j=1}^{H \le n}\frac{1}{n}\delta_{\theta^{(m)}_j}(\theta).

For a new observation $R_t$,

 P(R_t \mid \theta, \theta^{(m)}) = \frac{\alpha}{\alpha + n}K(R_t \mid \theta, \theta^{(m)})P_0(\theta \mid \theta^{(m)}) + \sum_{j=1}^{H \le n}\frac{n_j}{\alpha + n}K\big(R_t \mid \theta = \theta^{(m)}_j\big), \tag{5.3}

which clearly shows that the data will tend to cluster in clusters characterized by larger $n_j$. Thus, according to [11], fitting such a prior to the parameter space should favor clustering on the financial log-return data.

### 5.1 The Algorithm

Here we present the MCMC algorithm that is used to implement the approach presented in the previous sections. The algorithm is presented using the assumptions made in (5.1) and (5.2); the steps are as follows:

(i) Setting Hyper-parameters

1. Select an appropriate truncation level $H$; consequently, initialize the weights $\pi_1, \dots, \pi_H$, the atoms $\theta^*_1, \dots, \theta^*_H$, and $\alpha$.

2. Set hyperprior values for $(\mu_0, \kappa_0; \nu_0, \sigma^2_0)$.

3. Set $P_0$ as the Normal-Inverse-$\chi^2$ conjugate prior for the normal kernel (5.1).

(ii) MCMC Posterior Updates

1. For $i = 1, \dots, n$ and $h = 1, \dots, H$, update the parameters for the multinomial sampling by,

 p_h = \frac{\pi_h N(R_{t_i} \mid \mu_h, \phi^{-1}_h)}{\sum_{h=1}^{H}\pi_h N(R_{t_i} \mid \mu_h, \phi^{-1}_h)}.

Then draw a sample of cluster labels from the resulting multinomial distribution, assigning each $R_{t_i}$ to a cluster $h$.

2. Update the precision parameter $\alpha$ using the conjugacy of the blocked Gibbs,

 a^{(m)} = a_\alpha + H^{max}_{m-1}, \quad b^{(m)} = b_\alpha - \sum_{h=1}^{H^{max}_{m-1}}\log(1 - V_h), \quad \alpha \sim \text{Gamma}\big(a^{(m)}, b^{(m)}\big).
3. Calculate the cluster occupancy using,

 n^{(m)}_h = \sum_{i=1}^{n}\delta_{\theta_i = h}.
4. For $h = 1, \dots, H$, update the stick weights from their beta distributions,

 V_h \sim \text{Beta}\Big(1 + n^{(m)}_h,\; \alpha + \sum_{k=h+1}^{H}n^{(m)}_k\Big).
5. For $h = 1, \dots, H$:

• If $n^{(m)}_h = 0$, resample $\theta^*_h$ from the base measure $P_0$.

• If $n^{(m)}_h \ge 1$, the posterior update for the prior is,

 N\text{-Inv-}\chi^2\Big(\mu^{(m-1)}_0, \frac{\sigma^{2(m-1)}_0}{\kappa^{(m-1)}_0}; \nu^{(m-1)}_0, \sigma^{2(m-1)}_0\Big).

These are the posterior Gibbs updates, using the previous iteration's posterior as the prior for the next. Note that $m = 0$ corresponds to the starting values of the parameters.

6. Repeat until the estimate stabilizes, to obtain the mixing RPM.
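The steps above can be sketched as a compact truncated blocked Gibbs sampler; the toy data, truncation level H, and hyperparameters are all illustrative, and the Gamma update for α (step 2) is omitted here, keeping α fixed:

```python
import numpy as np

rng = np.random.default_rng(9)

# Toy log-return data with two volatility regimes (illustrative only).
x = np.concatenate([rng.normal(0.0, 0.01, 400), rng.normal(0.0, 0.05, 100)])
n, H, alpha = len(x), 20, 1.0
mu0, k0, nu0, s20 = 0.0, 0.01, 1.0, 0.02**2  # assumed N-Inv-chi^2 hyperparameters

mu = rng.normal(mu0, 0.02, H)
var = np.full(H, s20)
pi = np.full(H, 1.0 / H)

for m in range(200):  # blocked Gibbs sweeps
    # Step 1: sample labels from p_h proportional to pi_h * N(x_i | mu_h, var_h).
    logp = np.log(pi) - 0.5 * np.log(var) - 0.5 * (x[:, None] - mu) ** 2 / var
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    z = np.array([rng.choice(H, p=row) for row in p])
    nh = np.bincount(z, minlength=H)

    # Step 4: stick weights V_h ~ Beta(1 + n_h, alpha + sum_{k>h} n_k).
    tail = np.concatenate([np.cumsum(nh[::-1])[::-1][1:], [0]])
    V = rng.beta(1 + nh, alpha + tail)
    V[-1] = 1.0
    pi = V * np.concatenate([[1.0], np.cumprod(1.0 - V[:-1])])

    # Step 5: conjugate Normal-Inv-chi^2 updates (empty clusters redrawn from P0).
    for h in range(H):
        if nh[h] == 0:
            var[h] = nu0 * s20 / rng.chisquare(nu0)
            mu[h] = rng.normal(mu0, np.sqrt(var[h] / k0))
        else:
            xh = x[z == h]
            kn, nun = k0 + nh[h], nu0 + nh[h]
            mun = (k0 * mu0 + nh[h] * xh.mean()) / kn
            s2n = (nu0 * s20 + ((xh - xh.mean()) ** 2).sum()
                   + k0 * nh[h] * (xh.mean() - mu0) ** 2 / kn) / nun
            var[h] = nun * s2n / rng.chisquare(nun)
            mu[h] = rng.normal(mun, np.sqrt(var[h] / kn))

print(int((nh > 5).sum()))  # substantially occupied clusters at the last sweep
```

The sampler typically concentrates the occupancy on a handful of clusters, with sparsely occupied extreme clusters absorbing outliers, as discussed in the cluster-regularization section below.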

### 5.2 Cluster Regularization

In the previous section we have seen that the estimate(s) of the RPM and related quantities are simply bootstrap estimates. We can estimate the augmented information for a fixed cluster $h$. For instance, we have

 E\big(n^{(M^*)}_h \mid \mathcal{F}_{M^*}\big) = \frac{1}{M^*}\sum_{m=1}^{M^*}n^{(m)}_h,

as the estimate for the expected cluster occupancy given the filtration $\mathcal{F}_{M^*}$. Thus, for a fixed $h$, if an irregular cluster is located, we do not halt the MCMC procedure, since immediate inference from the posterior in terms of interpretability is not required. Therefore, in a collective manner over all $M^*$ iterations, the process remains regular. This interpretation makes sense when considering the possibility of a sample of size 1 from a cluster, which is in particular the case for extreme observations or outliers. This approach allows us to make room for heavy tails in the RPM, indicated by sparsely occupied extreme clusters.

## 6 Application

In this section, we present the application of the proposed methodology to two different datasets, in two parts. First, we consider univariate modeling of an asset using the DP prior; we present the estimates along with their respective confidence intervals for three different assets, namely IBM, Intel, and NASDAQ. Second, we consider multivariate modeling of a portfolio consisting of IBM, Intel, and NASDAQ. Following this, a multivariate application is carried out on a much larger dataset consisting of an optimized portfolio over 51 assets from the National Stock Exchange of India (NSEI). Note that these 51 stocks make up the index, “Nifty 50” for Indian stock markets. For optimizing the portfolio, we use mean-variance optimization [14] to select a suitable portfolio. We use an appropriately fitted elliptical t-copula to account for the correlation structure amongst the 51 stocks in the portfolio.
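The mean-variance step [14] can be sketched in closed form: minimize portfolio variance subject to a target expected return and full investment (short sales allowed). The expected returns and covariance matrix below are illustrative placeholders, not estimates from the NSEI data:

```python
import numpy as np

def min_variance_weights(mu, Sigma, target_return):
    """Closed-form Markowitz weights: minimize w' Sigma w subject to
    w' mu = target_return and w' 1 = 1 (short sales allowed)."""
    ones = np.ones_like(mu)
    Si = np.linalg.inv(Sigma)
    A = ones @ Si @ ones
    B = ones @ Si @ mu
    C = mu @ Si @ mu
    D = A * C - B**2
    lam = (C - B * target_return) / D   # Lagrange multiplier on w'1 = 1
    gam = (A * target_return - B) / D   # Lagrange multiplier on w'mu = r
    return Si @ (lam * ones + gam * mu)

mu = np.array([0.10, 0.12, 0.08])               # hypothetical annual returns
Sigma = np.array([[0.040, 0.006, 0.004],        # hypothetical covariance
                  [0.006, 0.090, 0.010],
                  [0.004, 0.010, 0.020]])
w = min_variance_weights(mu, Sigma, target_return=0.10)
print(w, w.sum(), w @ mu)
```

In practice the covariance input would be the regularized estimate discussed in the next subsection rather than a raw sample covariance.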

### 6.1 Risk and Return Analysis for a Single Asset

The log-returns of IBM, Intel, and the NASDAQ index are modeled using the DP prior; Figure (1) shows the DP fit against the Black-Scholes fit for the Intel, IBM, and NASDAQ daily log-returns. The data are collected for these three assets for one year starting from 1 July 2015, giving 246 days of log-returns for each asset. Simple visual inspection is enough to conclude that the DP fit models the log-returns much better than Black-Scholes. Table (2) presents the comparative capabilities of different methods for density estimation and return-path modeling of the three assets. A comparison with the default DPdensity and Polya tree (PTdensity) priors from the DPpackage in R [1] is shown in Figure (5). We also report the Highest Posterior Density (HPD) intervals in Table (4). The DP prior results in the posterior intervals of shortest length, with probability 1 of containing the mean return and volatility estimates for the underlying asset. In the comparative fits shown in Figure (5), we see that the m-DP prior is more sensitive to modes and to changes in tail behaviour. The default DPdensity fails to identify modes completely, while the PTdensity shows modes within the 2-σ interval but remains neutral to changes in the tail behaviour.

###### Remark 6.1.

Considering the kernel density estimate as a benchmark, we compare the performance of the Black-Scholes and DP simulated returns based on mean square deviations; Table (2) presents this comparative study. Table (3) compares the estimates of the various risk measures discussed in Section 4, obtained under the empirical method of modelling log-returns using the kernel density estimate and under the multivariate methodology presented here.
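The empirical VaR and CVaR compared in Table (3) can be computed from any simulated return sample. A sketch under stated assumptions: a hypothetical two-component Gaussian mixture stands in for draws from the fitted DP mixture, and the 95% level is chosen for illustration:

```python
import numpy as np

def var_cvar(returns, level=0.95):
    """Empirical VaR and CVaR (expected shortfall) of the loss L = -return.

    VaR is the `level` quantile of the loss; CVaR is the average loss
    at or beyond VaR.
    """
    losses = -np.asarray(returns)
    var = np.quantile(losses, level)
    cvar = losses[losses >= var].mean()
    return var, cvar

rng = np.random.default_rng(2)
n = 100_000
# Hypothetical mixture mimicking heavy-tailed daily returns: a calm
# regime 90% of the time, a volatile regime 10% of the time.
calm = rng.random(n) < 0.9
sim = np.where(calm, rng.normal(0.0005, 0.01, n),
                     rng.normal(-0.002, 0.03, n))
var95, cvar95 = var_cvar(sim, 0.95)
print(var95, cvar95)  # CVaR exceeds VaR at the same level
```

Because CVaR averages the tail beyond the quantile, it is always at least as large as VaR at the same level, which is the pattern the comparative tables should exhibit.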

### 6.2 Risk and Return Analysis for Multiple Assets

Here we present the application of the multivariate modeling of log-returns using the method discussed in Section (3.2). We conducted two exercises: first we apply the methodology to the three assets considered in Section (6.1), and then to the dataset of 51 stocks from the Indian stock market. We estimate the covariance matrix for the copula using the Bayesian approach of [6, 22]. After modelling the three assets using a t-copula, we simulate the individual returns with respect to the modelled correlation structure. Figure (2) shows the scatter plots for the observed and fitted daily log-returns of the three assets. The performance of the t-copula in modelling the observed correlation structure is compared in Figure (3). Visual inspection reveals that the observed correlation structure is preserved in the simulated log-returns.

Next, in modelling the 51 assets we compare the performance of the methodology on the first two principal components of the observed and simulated returns. Figure (4) shows the scatter plots of the first two principal components in the observed and simulated data, respectively. The returns were simulated from a t-copula with 10 degrees of freedom, where the marginals were modeled using the DP prior approach discussed above. Visual inspection indicates that the proposed methodology is able to model the variation in the 51 stocks along the first two major directions of the correlation structure in the data.
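The simulation step above can be sketched as: draw uniforms from a t-copula with the estimated correlation matrix, then push each margin through the asset's marginal quantile function. In this sketch, plain normal quantiles stand in for the DP-prior marginals, and the 3-asset correlation matrix is illustrative rather than estimated:

```python
import numpy as np
from scipy import stats

def sample_t_copula(R, nu, n, rng):
    """Draw n samples of copula uniforms from a t-copula with
    correlation matrix R and nu degrees of freedom."""
    L = np.linalg.cholesky(R)
    z = rng.standard_normal((n, R.shape[0])) @ L.T
    g = rng.chisquare(nu, size=(n, 1))
    t = z / np.sqrt(g / nu)          # correlated multivariate-t draws
    return stats.t.cdf(t, df=nu)     # map each margin to (0, 1)

rng = np.random.default_rng(3)
R = np.array([[1.0, 0.6, 0.4],       # hypothetical correlation matrix
              [0.6, 1.0, 0.5],
              [0.4, 0.5, 1.0]])
U = sample_t_copula(R, nu=10, n=50_000, rng=rng)

# Push the uniforms through each asset's marginal quantile function.
# Normal quantiles stand in for the DP-mixture quantiles of the paper.
returns = stats.norm.ppf(U, loc=0.0005, scale=0.012)
print(np.corrcoef(returns.T).round(2))  # correlation structure preserved
```

Swapping `stats.norm.ppf` for each asset's DP-mixture quantile function leaves the copula step untouched, which is exactly why the univariate machinery carries over unchanged to the multivariate setting.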

## 7 Conclusion

In this paper, we presented the Dirichlet Process (DP) prior for modeling the log-return of a single asset. In comparison to the approach of [26], we proposed the following alterations. First, we assign a DP prior over the parameter space, which helps us avoid inducing a discrete RPM over the log-return; note that [26] induces a discrete RPM over the log-return almost surely. Consequently, one faces the problem of quantile estimation for the log-returns, a problem that [26] avoided. To deal with this problem, we develop the necessary results that assure the fitting of an RPM which is almost surely a continuous finite mixture over the log-return, all the while preserving the hierarchy. The proposed results rely heavily on the urn-scheme interpretation of the DP. For a given set of observations over a fixed time horizon, assigning a DP theoretically involves infinite mixture modeling. We used the conjugate structure provided by the blocked Gibbs sampler to augment stochastic processes unique to each question. This augmentation technique helps us avoid reversible-jump MCMC.

We extend this approach to introduce a multivariate distribution to model the return on multiple assets via a t-copula, which models each marginal using the DP prior. This lets the existing nonparametric univariate approach carry over unchanged to multivariate applications. The application of this methodology comprises fitting RPMs over a single asset and over a collection of univariate marginals joined by a t-copula, in two different datasets. We compare different risk measures, such as Value at Risk (VaR) and Conditional VaR (CVaR), on both datasets.

## References

• [1] Alejandro Jara, Timothy Hanson, Fernando Quintana, Peter Mueller, and Gary Rosner. DPpackage: Bayesian semi- and nonparametric modeling in R. Journal of Statistical Software, 40(5):1–30, 2011.
• [2] P. Artzner, F. Delbaen, J. M. Eber, and D. Heath. Thinking coherently. Risk, 10:68–71, 1997.
• [3] P. Artzner, F. Delbaen, J. M. Eber, and D. Heath. Coherent measures of risk. Mathematical Finance, 9(3):203–228, 1999.
• [4] A. Balbas, J. Garrido, and S. Mayoral. Properties of distortion risk measure. Methodology and Computing in Applied Probability, 11(3):385–403, 2009.
• [5] U. Cherubini, E. Luciano, and W. Vecchiato. Copula Methods in Finance. Wiley Finance Series., 1 edition, 2006.
• [6] Sourish Das and Dipak K. Dey. On Bayesian inference for generalized multivariate gamma distribution. Statistics and Probability Letters, 80:1492–1499, 2010.
• [7] F. Delbaen and W. Schachermayer. The Mathematics of Arbitrage. Springer Finance., 3 edition, 2006.
• [8] D. Denneberg. Non-Additive Measure and Integral. Springer–Netherlands, Theory and Decision Library 27, 1 edition, 1994.
• [9] T. S. Ferguson. A Bayesian analysis of some nonparametric problems. The Annals of Statistics, 1:209–230, 1973.
• [10] H. Föllmer and A. Schied. Convex measures of risk and trading constraints. Finance and Stochastics, 6(4):429–447, 2002.
• [11] A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin. Bayesian Data Analysis. Chapman and Hall/CRC, 2004.
• [12] S. Ghosal, Jayanta K. Ghosh, and R. V. Ramamoorthi. Posterior consistency of Dirichlet mixtures in density estimation. The Annals of Statistics, 27:143–158, 1999.
• [13] A. Habib. Calculus of Finance. University Press, 2011.
• [14] Harry M. Markowitz. Portfolio selection. Journal of Finance, 7(1):77–91, 1952.
• [15] Alexander J. McNeil and Rüdiger Frey. Estimation of tail-related risk measures for heteroscedastic financial time series: An extreme value approach. Journal of Empirical Finance, 7:271–300, 2000.
• [16] S. T. Rachev. Handbook of Heavy Tailed Distributions in Finance, volume 1. Elsevier–North Holland, 2003.
• [17] M. Scarsini. On measures of concordance. Stochastica, 8, 1984.
• [18] J. Sethuraman. A constructive definition of Dirichlet priors. Statistica Sinica, 4:639–650, 1994.
• [19] Steven E. Shreve. Stochastic Calculus for Finance I. Springer, 1 edition, 2004.
• [20] Steven E. Shreve. Stochastic Calculus for Finance II. Springer, 1 edition, 2004.
• [21] A. Sklar. Fonctions de répartition à n dimensions et leurs marges. Publications de l’Institut de Statistique de l’Université de Paris, 8:229–231, 1959.
• [22] Sourish Das, Aritra Halder, and Dipak K. Dey. Regularizing portfolio risk analysis: A Bayesian approach. Methodology and Computing in Applied Probability, 19:865–889, 2017.
• [23] S. Wang. Premium calculation by transforming the layer premium density. ASTIN Bulletin, 26(1):71–92, 1996.
• [24] Shaun S. Wang. A class of distortion operators for financial and insurance risks. Journal of Risk and Insurance, 67:15–36, 2000.
• [25] J. Yan. Enjoy the joy of copulas: With a package copula. Journal of Statistical Software, 21(4), 2007.
• [26] M. Zarepour, T. Bedard, and A. R. Dabrowski. Return and Value at Risk using the Dirichlet process. Applied Mathematical Finance, 15(3):205–218, 2008.

## Appendix: Proof

Proof of Theorem 2.1

###### Proof.
 $$\begin{aligned} \mathbb{E}[r_t \mid \mathcal{F}_{t-1}] &= \mathbb{E}\Big[\mu t + \sum_{i=1}^{n}\pi_i\sigma_i B_{it} - \frac{1}{2}\sum_{i=1}^{n}\pi_i\sigma_i^2 t \,\Big|\, \mathcal{F}_{t-1}\Big] \\ &= \mathbb{E}\Big[\mu t + \sum_{i=1}^{n}\pi_i\sigma_i (B_{it}-B_{i,t-1}) + \sum_{i=1}^{n}\pi_i\sigma_i B_{i,t-1} - \frac{1}{2}\sum_{i=1}^{n}\pi_i\sigma_i^2 t \,\Big|\, \mathcal{F}_{t-1}\Big] \end{aligned}$$