## 1 Introduction

In several practical decision problems, the presence of uncertainty complicates the decision-making process, as decisions typically have to be taken before the uncertainty is resolved. Traditionally, this difficulty is overcome by averaging the costs (or rewards) over all possible realizations of the uncertainty, and then optimizing the averaged cost thus obtained. However, it has been argued that considering averaged outcomes is not appropriate in situations where low-probability events, such as financial crashes and category 4 hurricanes, can cause huge costs. The possibility of occurrence of such tail events has led to the introduction of risk measures such as Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) for the quantification of risk. In financial risk management, the VaR of a risky portfolio at a confidence level $\alpha \in (0,1)$ is a loss threshold such that the probability of the loss exceeding the threshold is no greater than $1-\alpha$. The CVaR of a portfolio at confidence level $\alpha$ is the expected loss on the portfolio, conditioned on the event that the loss exceeds the VaR. Loosely speaking, VaR quantifies the maximum loss that can occur in the absence of a catastrophic tail event, while CVaR gives the expected loss given the occurrence of such a tail event. CVaR has several desirable properties as a risk measure. In particular, it is a convex, coherent risk measure (see the survey paper [4] and references therein). As a result, CVaR continues to receive increasing attention in operations research, mathematical finance, and decision science for problems involving risk quantification or risk minimization.

In most applications involving uncertainty, the distributions characterizing the underlying uncertain factors are not known, and risk measures such as CVaR have to be estimated from sampled values of the random variable of interest. This is true, for instance, in a multi-armed bandit problem [5, 3] in which pulling an arm leads to a random loss, and one seeks to identify the arm whose loss random variable has the lowest CVaR by observing a sample of outcomes that result from multiple arm pulls. An obvious estimator for the CVaR of a distribution is the sample CVaR of an i.i.d. sample drawn from the distribution. Naturally, one seeks error bounds for the estimator that help to understand the trade-off between accuracy and sample size. Previous results on CVaR estimation either provide asymptotic error bounds for a general r.v. [7], or provide non-asymptotic error bounds that hold with high probability, but under the stringent assumption that the underlying r.v. is bounded [2, 8].

In this paper, we consider the problem of estimating the CVaR of an unbounded, albeit sub-Gaussian or sub-exponential, random variable. Sub-Gaussian r.v.s include bounded r.v.s, Gaussian r.v.s, and any other r.v. whose tail decays as fast as a Gaussian. On the other hand, sub-exponential r.v.s include exponential, Poisson and squared-Gaussian r.v.s, and are characterized by a tail heavier than a Gaussian, resembling that of an exponential distribution. To the best of our knowledge, there are no concentration bounds for the CVaR estimator for these two popular classes of unbounded distributions. We believe that imposing a tail-decay assumption (sub-Gaussian or sub-exponential) is not restrictive, as such an imposition is common in concentration results for the sample mean. Moreover, the task of CVaR estimation is more challenging in comparison, as it relates to a tail event. We derive a one-sided concentration bound for the empirical CVaR of an i.i.d. sample. Our bound relies on one of two concentration results (of possibly independent interest) that we provide for a quantile-based VaR estimator.

## 2 Background

Given a r.v. $X$ with cumulative distribution function (CDF) $F$, the VaR $v_\alpha(X)$ and CVaR $c_\alpha(X)$^{1} at level $\alpha \in (0,1)$ are defined as follows:

^{1}For notational brevity, we omit $X$ from $v_\alpha(X)$ and $c_\alpha(X)$ whenever the r.v. can be understood from the context.

$$v_\alpha(X) = \inf\left\{\xi \in \mathbb{R} : \mathbb{P}\left(X \le \xi\right) \ge \alpha\right\}, \tag{1}$$

$$c_\alpha(X) = v_\alpha(X) + \frac{1}{1-\alpha}\,\mathbb{E}\left[X - v_\alpha(X)\right]^{+}, \tag{2}$$

where we have used the notation $[x]^{+} = \max(x, 0)$ for a real number $x$. Typical values of $\alpha$ chosen in practice are $0.95$ and $0.99$. Note that, if $X$ has a continuous and strictly increasing CDF, then $v_\alpha(X)$ is a solution to the equation $F(\xi) = \alpha$, *i.e.,* $v_\alpha(X) = F^{-1}(\alpha)$. CVaR also admits another form under the following assumption:

(A1) The r.v. $X$ is continuous and has a strictly increasing CDF.

If (A1) holds and $X$ has a positive density at $v_\alpha(X)$, then $c_\alpha(X)$ admits the following equivalent form (cf. [7]):

$$c_\alpha(X) = \mathbb{E}\left[X \mid X \ge v_\alpha(X)\right].$$
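As a concrete numerical illustration of the two forms above (our own sketch, not part of the original development): for a standard Gaussian loss, $v_\alpha = \Phi^{-1}(\alpha)$ and the conditional-expectation form gives the well-known closed form $c_\alpha = \varphi(v_\alpha)/(1-\alpha)$, where $\varphi$ and $\Phi$ denote the standard normal density and CDF. This can be checked with Python's standard library:

```python
from statistics import NormalDist

alpha = 0.95
nd = NormalDist()             # standard Gaussian loss: mean 0, sd 1

var = nd.inv_cdf(alpha)       # VaR is the alpha-quantile, about 1.645

# CVaR via E[X | X >= v_alpha] = pdf(v_alpha) / (1 - alpha), about 2.063.
cvar = nd.pdf(var) / (1 - alpha)

print(round(var, 3), round(cvar, 3))
```

The gap between the two numbers shows how CVaR, which averages over the tail, sits strictly above the quantile threshold.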

Let $X_1, \ldots, X_n$ denote $n$ i.i.d. samples from the distribution of $X$. Then, the estimates of VaR and CVaR at level $\alpha$, denoted by $\hat{v}_{n,\alpha}$ and $\hat{c}_{n,\alpha}$, are formed as follows [6]:

$$\hat{v}_{n,\alpha} = \inf\left\{\xi \in \mathbb{R} : \hat{F}_n(\xi) \ge \alpha\right\}, \tag{3}$$

$$\hat{c}_{n,\alpha} = \hat{v}_{n,\alpha} + \frac{1}{n(1-\alpha)}\sum_{i=1}^{n}\left[X_i - \hat{v}_{n,\alpha}\right]^{+}, \tag{4}$$

where $\hat{F}_n(x) = \frac{1}{n}\sum_{i=1}^{n}\mathbb{1}\left\{X_i \le x\right\}$ is the empirical distribution function of $X$. Note that, from the order statistics $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$, the empirical VaR can be computed as follows:

$$\hat{v}_{n,\alpha} = X_{\left(\lceil n\alpha \rceil\right)}.$$
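The estimates in (3) and (4) are straightforward to compute from a sorted sample. A minimal sketch (the function name is ours, for illustration only):

```python
import math

def empirical_var_cvar(samples, alpha):
    """Empirical VaR and CVaR at level alpha, using the order-statistic form:
    v-hat = X_(ceil(n*alpha)), and c-hat = v-hat + sum([X_i - v-hat]^+) / (n*(1-alpha))."""
    x = sorted(samples)
    n = len(x)
    v_hat = x[math.ceil(n * alpha) - 1]              # order statistic X_(ceil(n*alpha))
    excess = sum(max(xi - v_hat, 0.0) for xi in x)   # sum of [X_i - v-hat]^+
    c_hat = v_hat + excess / (n * (1 - alpha))
    return v_hat, c_hat

# Deterministic check: for losses 1, 2, ..., 100 at alpha = 0.9,
# v-hat = 90 and c-hat = 90 + (1 + 2 + ... + 10) / 10 = 95.5.
print(empirical_var_cvar(range(1, 101), 0.9))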

## 3 Concentration bounds

In this section, we present four concentration bounds. The first two bounds are for the VaR estimator given in (3), and these bounds do not impose any restrictions on the underlying distribution. The next two concentration results are for the CVaR estimator given in (4), and for these results, we assume that the underlying distribution is either sub-Gaussian or sub-exponential (see the definitions below).

In each of the results presented below, the estimates are calculated using $n$ i.i.d. samples drawn from the r.v. $X$ with CDF $F$, for a given $\alpha \in (0,1)$.

[VaR concentration bound] Let $\delta_1 \in (0, \alpha)$ and $\delta_2 \in (0, 1-\alpha)$. Define $p = \alpha - \delta_1$ and $q = \alpha + \delta_2$. Further, let $v_p = \inf\{\xi : F(\xi) \ge p\}$ and $v_q = \inf\{\xi : F(\xi) \ge q\}$, and let $\hat{v}_{n,\alpha}$ be defined by (3). Then,

$$\mathbb{P}\left(v_p \le \hat{v}_{n,\alpha} \le v_q\right) \ge 1 - e^{-2n\delta_1^2} - e^{-2n\delta_2^2}.$$

Note that the above concentration bound is free of any distribution-dependent parameters.
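The flavour of this distribution-free guarantee, namely that $\hat{v}_{n,\alpha}$ is sandwiched between the population quantiles at the slightly perturbed levels $\alpha - \delta_1$ and $\alpha + \delta_2$ with high probability, can be checked by simulation. The sketch below is our own illustration, with an Exp(1) loss and hypothetical choices $\alpha = 0.9$, $\delta_1 = \delta_2 = 0.05$:

```python
import math
import random

random.seed(0)

alpha, delta = 0.9, 0.05
n, trials = 1000, 500

# Population quantiles of Exp(1): F^{-1}(u) = -ln(1 - u).
v_lo = -math.log(1 - (alpha - delta))   # quantile at level alpha - delta_1
v_hi = -math.log(1 - (alpha + delta))   # quantile at level alpha + delta_2

hits = 0
for _ in range(trials):
    x = sorted(random.expovariate(1.0) for _ in range(n))
    v_hat = x[math.ceil(n * alpha) - 1]  # empirical VaR, as in (3)
    hits += v_lo <= v_hat <= v_hi

# A bound of the above form guarantees coverage at least
# 1 - 2*exp(-2*n*delta^2) ~ 0.987 here; empirically it is typically higher.
print(hits / trials)
```

With $n = 1000$ the sandwich holds in essentially every run, consistent with the exponentially small failure probability.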

###### Proof.

See Section 4.1. ∎

The following result presents another concentration bound for the VaR estimator, which has distribution-dependent parameters in the bound. However, unlike the previous proposition, the result presented below is symmetric and, more importantly, bounds the estimation error directly. [VaR concentration bound] Suppose that (A1) holds. For any $\epsilon > 0$, we have

$$\mathbb{P}\left(\left|\hat{v}_{n,\alpha} - v_\alpha(X)\right| > \epsilon\right) \le 2\exp\left(-2n\eta^2\epsilon^2\right),$$

where $\eta > 0$ is a constant that depends on the value of the density of the r.v. $X$ in a neighbourhood of $v_\alpha(X)$.

###### Proof.

See Section 4.2. ∎

The bound above implies that, to estimate the VaR to an accuracy of $\epsilon$, one would require an order $O\left(\frac{1}{\epsilon^2}\right)$ number of samples. Notice that no restrictive assumptions on the tail of the underlying distribution are made in arriving at the two VaR concentration bounds above. However, for establishing concentration bounds for the CVaR, which involves conditioning on a tail event, it is necessary to assume that the distribution is not heavy-tailed. In fact, even for the case of estimating the expected value of a r.v., exponential concentration bounds are available only under an assumption that restricts the tail to be light (cf. Chapter 2 of [1]).
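To make the $O(1/\epsilon^2)$ scaling concrete: inverting a bound of the form $2\exp(-2n\eta^2\epsilon^2) \le \delta$ gives $n \ge \frac{1}{2\eta^2\epsilon^2}\ln\frac{2}{\delta}$. The toy calculation below is our own illustration, with the distribution-dependent constant $\eta$ set to $1$ purely for concreteness:

```python
import math

def var_sample_size(eps, delta, eta=1.0):
    """Smallest n with 2*exp(-2*n*eta^2*eps^2) <= delta,
    i.e. n >= ln(2/delta) / (2 * eta^2 * eps^2)."""
    return math.ceil(math.log(2 / delta) / (2 * eta**2 * eps**2))

# Halving the accuracy eps quadruples the required sample size: the O(1/eps^2) scaling.
print(var_sample_size(0.02, 0.05), var_sample_size(0.01, 0.05))  # 4612 18445
```

The logarithmic dependence on the confidence parameter $\delta$ means tightening the confidence is cheap relative to tightening the accuracy.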

In this paper, we present concentration bounds under two popular assumptions on the tail of a r.v. The first restricts the r.v. to be sub-Gaussian, while the second requires it to be sub-exponential. These two classes of r.v.s include bounded r.v.s and, more importantly, several unbounded r.v.s as well. Sub-Gaussian r.v.s include Gaussian r.v.s as well as any r.v. whose moment generating function does not exceed that of a Gaussian, while sub-exponential r.v.s include heavier-tailed r.v.s. These two notions are made precise in the following definitions.

A r.v. $X$ with $\mathbb{E}[X] = \mu$ is said to be *$\sigma$-sub-Gaussian* if

$$\mathbb{E}\left[e^{\lambda(X - \mu)}\right] \le e^{\frac{\lambda^2\sigma^2}{2}} \quad \text{for all } \lambda \in \mathbb{R}.$$

A r.v. $X$ with mean $\mu$ is said to be *$(\sigma, b)$-sub-exponential* if

$$\mathbb{E}\left[e^{\lambda(X - \mu)}\right] \le e^{\frac{\lambda^2\sigma^2}{2}} \quad \text{for all } |\lambda| < \frac{1}{b}.$$
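As a sanity check of the sub-exponential definition (with illustrative parameters of our own choosing): for $X \sim \text{Exp}(1)$, the mean is $\mu = 1$ and the centered MGF is $\mathbb{E}[e^{\lambda(X-\mu)}] = e^{-\lambda}/(1-\lambda)$ for $\lambda < 1$, and the defining inequality holds with $(\sigma, b) = (2, 2)$. The closed-form check below needs no sampling:

```python
import math

sigma, b = 2.0, 2.0  # candidate sub-exponential parameters for Exp(1)

def mgf_centered_exp1(lam):
    """E[exp(lam * (X - 1))] for X ~ Exp(1); valid for lam < 1."""
    return math.exp(-lam) / (1.0 - lam)

# Check E[e^{lam(X - mu)}] <= e^{lam^2 sigma^2 / 2} on a grid of |lam| < 1/b.
ok = all(
    mgf_centered_exp1(lam) <= math.exp(lam**2 * sigma**2 / 2)
    for lam in (k / 1000 for k in range(-499, 500))
)
print(ok)  # True
```

The restriction $|\lambda| < 1/b$ is essential here: the MGF of an exponential r.v. blows up as $\lambda \to 1$, which is exactly the heavier-than-Gaussian behaviour the definition accommodates.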

It is worth noting that all sub-Gaussian r.v.s are sub-exponential, but the converse is not true. The following result presents a one-sided concentration bound for the CVaR estimator in (4), for the case when the underlying r.v. is sub-Gaussian. [CVaR concentration bound: sub-Gaussian case] Suppose that (A1) holds. Let $\alpha \in (0,1)$, and let $X$ be a $\sigma$-sub-Gaussian r.v. with mean $\mu$ whose density is bounded below by $\eta > 0$ in a neighbourhood of $v_\alpha(X)$. Then, for any $\epsilon > 0$, we have

$$\mathbb{P}\left(\hat{c}_{n,\alpha} - c_\alpha(X) > \epsilon\right) \le 2\exp\left(-\frac{n(1-\alpha)^2\epsilon^2}{8\bar{\sigma}^2}\right) + 2\exp\left(-\frac{n\eta^2(1-\alpha)^2\epsilon^2}{2(2-\alpha)^2}\right), \tag{5}$$

where $\bar{\sigma}$ (a constant multiple of the sub-Gaussian parameter $\sigma$) and $\eta$ are constants that depend, respectively, on the tail of $X$ and on the value of the density of the r.v. $X$ in a neighbourhood of $v_\alpha(X)$.

###### Proof.

See Section 4.3. ∎

Both terms on the RHS of (5) decay exponentially in $n(1-\alpha)^2\epsilon^2$. In particular, $\hat{c}_{n,\alpha} - c_\alpha(X) \le \epsilon$ holds with probability (w.p.) at least $1-\delta$ when the number of samples is of the order $O\left(\frac{1}{(1-\alpha)^2\epsilon^2}\log\frac{1}{\delta}\right)$. On the other hand, an order $O\left(\frac{1}{\epsilon^2}\log\frac{1}{\delta}\right)$ number of samples is enough to ensure that $\left|\hat{v}_{n,\alpha} - v_\alpha(X)\right| \le \epsilon$ w.p. at least $1-\delta$. Hence, CVaR estimation requires more samples in comparison to VaR when $\alpha$ is close to one. In the complementary case, i.e., when $\alpha$ is bounded away from one, both the VaR and CVaR estimates can be $\epsilon$-accurate w.p. $1-\delta$ if the number of samples is of the order $O\left(\frac{1}{\epsilon^2}\log\frac{1}{\delta}\right)$.
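The sample-complexity comparison above can be made concrete with a toy calculation (our own illustration; all distribution-dependent constants are suppressed, so only the scaling in $\epsilon$, $\delta$ and $\alpha$ is meaningful):

```python
import math

def order_n_var(eps, delta):
    # O(log(1/delta) / eps^2), constants suppressed
    return math.log(1 / delta) / eps**2

def order_n_cvar(eps, delta, alpha):
    # O(log(1/delta) / ((1 - alpha)^2 * eps^2)), constants suppressed
    return math.log(1 / delta) / ((1 - alpha)**2 * eps**2)

eps, delta, alpha = 0.1, 0.05, 0.95
ratio = order_n_cvar(eps, delta, alpha) / order_n_var(eps, delta)
print(round(ratio))  # the extra factor 1/(1 - alpha)^2 = 400 at alpha = 0.95
```

At extreme quantile levels the extra $1/(1-\alpha)^2$ factor dominates, which is exactly the regime where CVaR estimation is hardest.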

Next, we analyse the concentration of the CVaR estimator in (4) for the case when the underlying r.v. is sub-exponential. [CVaR concentration bound: sub-exponential case] Suppose that (A1) holds. Let $X$ be a $(\sigma, b)$-sub-exponential r.v. with mean $\mu$ whose density is bounded below by $\eta > 0$ in a neighbourhood of $v_\alpha(X)$. Then, for any $\epsilon > 0$, we have

$$\mathbb{P}\left(\hat{c}_{n,\alpha} - c_\alpha(X) > \epsilon\right) \le 2\exp\left(-\frac{n}{2}\min\left(\frac{(1-\alpha)^2\epsilon^2}{4\bar{\sigma}^2}, \frac{(1-\alpha)\epsilon}{2\bar{b}}\right)\right) + 2\exp\left(-\frac{n\eta^2(1-\alpha)^2\epsilon^2}{2(2-\alpha)^2}\right), \tag{6}$$

where $\bar{\sigma}$ and $\bar{b}$ are constant multiples of $\sigma$ and $b$, respectively, and $\eta$ is as in the sub-Gaussian case. From the result above, it is apparent that, for small enough $\epsilon$, the rate of CVaR concentration for sub-exponential r.v.s matches that of sub-Gaussian ones.

###### Proof.

See Section 4.4. ∎

## 4 Proofs

In this section, we present the proofs of the results presented in Section 3.

### 4.1 Proof of Proposition 3

###### Proof.

Recall the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality, which provides a finite-sample bound on the distance between the empirical distribution and the true distribution: for any $\epsilon > 0$,

$$\mathbb{P}\left(\sup_{x \in \mathbb{R}}\left|\hat{F}_n(x) - F(x)\right| > \epsilon\right) \le 2e^{-2n\epsilon^2}.$$
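The DKW inequality can be illustrated numerically (our own sketch; uniform samples, fixed seed). Since Massart's constant is essentially tight, the empirical frequency of $\{\sup_x|\hat{F}_n(x) - F(x)| > \epsilon\}$ lands close to, and up to simulation noise below, the bound $2e^{-2n\epsilon^2}$:

```python
import math
import random

random.seed(1)

n, eps, trials = 100, 0.15, 2000
bound = 2 * math.exp(-2 * n * eps**2)  # DKW bound, about 0.0222 here

def ks_sup(n):
    """sup_x |F_n(x) - F(x)| for n i.i.d. Uniform(0,1) samples."""
    u = sorted(random.random() for _ in range(n))
    # For sorted samples, the supremum is attained at a sample point.
    return max(max((i + 1) / n - u[i], u[i] - i / n) for i in range(n))

freq = sum(ks_sup(n) > eps for _ in range(trials)) / trials
print(round(freq, 3), round(bound, 3))
```

That the simulated frequency nearly saturates the bound is a reminder that the constant $2$ and the exponent $2n\epsilon^2$ cannot be improved in general.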

Consider the following event:

$$A = \left\{F(x) - \delta_2 \le \hat{F}_n(x) \le F(x) + \delta_1 \ \text{ for all } x \in \mathbb{R}\right\},$$

with $\delta_1$ and $\delta_2$ as defined in the proposition statement. By the one-sided version of the DKW inequality, applied once to each of the two inequalities defining $A$, we have

$$\mathbb{P}(A) \ge 1 - e^{-2n\delta_1^2} - e^{-2n\delta_2^2}. \tag{7}$$

On the event $A$, we have

$$\hat{F}_n(v_q) \stackrel{(a)}{\ge} F(v_q) - \delta_2 \stackrel{(b)}{\ge} q - \delta_2 \stackrel{(c)}{=} \alpha, \qquad \hat{F}_n(x) \stackrel{(a)}{\le} F(x) + \delta_1 \stackrel{(b)}{<} p + \delta_1 \stackrel{(c)}{=} \alpha \ \text{ for all } x < v_p,$$

where (a) follows from the definition of the event $A$, (b) follows from the definitions of $v_q$ and $v_p$ (in particular, $F(x) < p$ for $x < v_p$), and (c) follows from the definitions of $p$ and $q$. Thus, on $A$, $v_p \le \hat{v}_{n,\alpha} \le v_q$, and the main claim follows from the lower bound on $\mathbb{P}(A)$ in (7). ∎

### 4.2 Proof of Proposition 3

###### Proof.

$$\mathbb{P}\left(\left|\hat{v}_{n,\alpha} - v_\alpha\right| > \epsilon\right) = \mathbb{P}\left(\hat{v}_{n,\alpha} > v_\alpha + \epsilon\right) + \mathbb{P}\left(\hat{v}_{n,\alpha} < v_\alpha - \epsilon\right) \stackrel{(a)}{\le} e^{-2n\Delta_1^2} + e^{-2n\Delta_2^2},$$

where (a) is due to the DKW inequality, and

$$\Delta_1 = F(v_\alpha + \epsilon) - \alpha, \qquad \Delta_2 = \alpha - F(v_\alpha - \epsilon).$$

Given that the density $f$ of $X$ exists, we have, by the mean value theorem,

$$F(v_\alpha + \epsilon) - F(v_\alpha) = \epsilon f(\xi)$$

for some $\xi \in (v_\alpha, v_\alpha + \epsilon)$. Using the identity above for the two expressions inside $\Delta_1$ and $\Delta_2$, we obtain

$$\Delta_1 = \epsilon f(\xi_1), \qquad \Delta_2 = \epsilon f(\xi_2),$$

for some $\xi_1 \in (v_\alpha, v_\alpha + \epsilon)$ and $\xi_2 \in (v_\alpha - \epsilon, v_\alpha)$. The claim follows, with $\eta$ taken to be a lower bound on the density $f$ in an $\epsilon$-neighbourhood of $v_\alpha$. ∎

### 4.3 Proof of Proposition 3

We first prove a more general result without restricting the sub-Gaussian parameter $\sigma$; the main claim in the proposition then follows in a straightforward fashion.

[General CVaR concentration bound: sub-Gaussian case] Assume (A1). Let $X$ be a $\sigma$-sub-Gaussian r.v. with mean $\mu$, and suppose that the density of $X$ is bounded below by a constant $\eta > 0$ in a neighbourhood of $v_\alpha(X)$. Then, for any $\epsilon > 0$, we have

$$\mathbb{P}\left(\hat{c}_{n,\alpha} - c_\alpha(X) > \epsilon\right) \le 2\exp\left(-\frac{n(1-\alpha)^2\epsilon^2}{8\bar{\sigma}^2}\right) + 2\exp\left(-\frac{n\eta^2(1-\alpha)^2\epsilon^2}{2(2-\alpha)^2}\right), \tag{8}$$

where $\bar{\sigma}$ is a constant multiple of the sub-Gaussian parameter $\sigma$.

###### Proof.

First, we bound the estimation error of $\hat{c}_{n,\alpha}$. Notice that

$$\hat{c}_{n,\alpha} - c_\alpha = \left(\hat{v}_{n,\alpha} - v_\alpha\right) + \frac{1}{1-\alpha}\left(\frac{1}{n}\sum_{i=1}^{n}\left[X_i - \hat{v}_{n,\alpha}\right]^{+} - \mathbb{E}\left[X - v_\alpha\right]^{+}\right). \tag{9}$$

The last term on the RHS of (9) can be re-written as follows:

$$\frac{1}{n}\sum_{i=1}^{n}\left[X_i - \hat{v}_{n,\alpha}\right]^{+} - \mathbb{E}\left[X - v_\alpha\right]^{+} = \frac{1}{n}\sum_{i=1}^{n}\left(\left[X_i - \hat{v}_{n,\alpha}\right]^{+} - \left[X_i - v_\alpha\right]^{+}\right) + \left(\frac{1}{n}\sum_{i=1}^{n}\left[X_i - v_\alpha\right]^{+} - \mathbb{E}\left[X - v_\alpha\right]^{+}\right), \tag{10}$$

and

$$\left|\frac{1}{n}\sum_{i=1}^{n}\left(\left[X_i - \hat{v}_{n,\alpha}\right]^{+} - \left[X_i - v_\alpha\right]^{+}\right)\right| \le \left|\hat{v}_{n,\alpha} - v_\alpha\right|. \tag{11}$$

The inequality above uses the fact that $a \mapsto [x - a]^{+}$ is $1$-Lipschitz, together with the fact that the indicator $\mathbb{1}\{X_i = v_\alpha\}$ takes the value zero w.p. $1$, since $X$ is continuous, for each $i$.

Combining (9), (10) and (11), we obtain

$$\left|\hat{c}_{n,\alpha} - c_\alpha\right| \le \frac{2-\alpha}{1-\alpha}\left|\hat{v}_{n,\alpha} - v_\alpha\right| + \frac{1}{1-\alpha}\left|\Delta_n\right|, \tag{12}$$

where

$$\Delta_n = \frac{1}{n}\sum_{i=1}^{n}\left[X_i - v_\alpha\right]^{+} - \mathbb{E}\left[X - v_\alpha\right]^{+}.$$

From (12), we have

$$\mathbb{P}\left(\hat{c}_{n,\alpha} - c_\alpha > \epsilon\right) \le \mathbb{P}\left(\left|\hat{v}_{n,\alpha} - v_\alpha\right| > \frac{(1-\alpha)\epsilon}{2(2-\alpha)}\right) + \mathbb{P}\left(\left|\Delta_n\right| > \frac{(1-\alpha)\epsilon}{2}\right). \tag{13}$$

For notational convenience, let

$$Y_i = \left[X_i - v_\alpha\right]^{+}, \quad i = 1, \ldots, n.$$

It is easy to see that the $Y_i$'s are i.i.d., non-negative, and $\mathbb{E}[Y_1] = (1-\alpha)\left(c_\alpha - v_\alpha\right)$, so that $\Delta_n = \frac{1}{n}\sum_{i=1}^{n} Y_i - \mathbb{E}[Y_1]$. We now proceed to bound $\mathbb{P}\left(\hat{c}_{n,\alpha} - c_\alpha > \epsilon\right)$, using (13), as follows:

$$\mathbb{P}\left(\hat{c}_{n,\alpha} - c_\alpha > \epsilon\right) \le \underbrace{\mathbb{P}\left(\left|\Delta_n\right| > \frac{(1-\alpha)\epsilon}{2}\right)}_{I} + \underbrace{\mathbb{P}\left(\left|\hat{v}_{n,\alpha} - v_\alpha\right| > \frac{(1-\alpha)\epsilon}{2(2-\alpha)}\right)}_{II}. \tag{14}$$

For handling $I$, we bound the moment generating function of the r.v. $Y_1$ as follows: for $\lambda > 0$,

$$\mathbb{E}\left[e^{\lambda\left(Y_1 - \mathbb{E}[Y_1]\right)}\right] \stackrel{(a)}{\le} e^{\frac{\lambda^2\bar{\sigma}^2}{2}}, \tag{15}$$

where (a) is due to the sub-Gaussianity of $X$ (the map $x \mapsto [x - v_\alpha]^{+}$ is $1$-Lipschitz, so the tails of $Y_1$ are no heavier than those of $X$), and $\bar{\sigma}$ is a constant multiple of $\sigma$. Thus, for any $\epsilon' > 0$,

$$\mathbb{P}\left(\Delta_n > \epsilon'\right) \stackrel{(a)}{\le} e^{-n\lambda\epsilon'}\,\mathbb{E}\left[e^{\lambda\sum_{i=1}^{n}\left(Y_i - \mathbb{E}[Y_1]\right)}\right] \stackrel{(b)}{\le} \exp\left(-n\lambda\epsilon' + \frac{n\lambda^2\bar{\sigma}^2}{2}\right), \tag{16}$$

where (a) uses Markov's inequality and (b) follows from (15) and the independence of the $Y_i$'s. Notice that (16) holds for any $\lambda > 0$. However, for the bound on the RHS above to be meaningful, we require that $\lambda < \frac{2\epsilon'}{\bar{\sigma}^2}$. Now, maximizing the exponent over $\lambda$, we obtain $\lambda = \frac{\epsilon'}{\bar{\sigma}^2}$. Substituting the value of $\lambda$ in (16), we obtain

$$\mathbb{P}\left(\Delta_n > \epsilon'\right) \le \exp\left(-\frac{n\epsilon'^2}{2\bar{\sigma}^2}\right); \tag{17}$$

an analogous argument applied to $-\Delta_n$ yields the same bound, so that, with $\epsilon' = \frac{(1-\alpha)\epsilon}{2}$, we have $I \le 2\exp\left(-\frac{n(1-\alpha)^2\epsilon^2}{8\bar{\sigma}^2}\right)$.

For handling the term $II$ in (14), we bound $\mathbb{P}\left(\hat{v}_{n,\alpha} > v_\alpha + \epsilon''\right)$, where $\epsilon'' = \frac{(1-\alpha)\epsilon}{2(2-\alpha)}$, as follows:

$$\mathbb{P}\left(\hat{v}_{n,\alpha} > v_\alpha + \epsilon''\right) \le \mathbb{P}\left(\sup_{x \in \mathbb{R}}\left(F(x) - \hat{F}_n(x)\right) \ge F(v_\alpha + \epsilon'') - \alpha\right). \tag{18}$$

The inequality above uses the following fact: if $\hat{v}_{n,\alpha} > v_\alpha + \epsilon''$, then $\hat{F}_n(v_\alpha + \epsilon'') < \alpha$, while $F(v_\alpha + \epsilon'') \ge \alpha$. Using (18) and the one-sided DKW inequality, we have

$$\mathbb{P}\left(\hat{v}_{n,\alpha} > v_\alpha + \epsilon''\right) \le \exp\left(-2n\left(F(v_\alpha + \epsilon'') - \alpha\right)^2\right).$$

It is easy to see, by the mean value theorem, that $F(v_\alpha + \epsilon'') - \alpha \ge \eta\,\epsilon''$, where $\eta > 0$ is the lower bound on the density of $X$ in a neighbourhood of $v_\alpha$. Hence,

$$\mathbb{P}\left(\hat{v}_{n,\alpha} > v_\alpha + \epsilon''\right) \le \exp\left(-2n\eta^2\epsilon''^2\right).$$

An entirely analogous argument bounds $\mathbb{P}\left(\hat{v}_{n,\alpha} < v_\alpha - \epsilon''\right)$ by the same quantity. Therefore,

$$II \le 2\exp\left(-2n\eta^2\epsilon''^2\right) = 2\exp\left(-\frac{n\eta^2(1-\alpha)^2\epsilon^2}{2(2-\alpha)^2}\right). \tag{19}$$

Using (14), (17) and (19), we obtain

$$\mathbb{P}\left(\hat{c}_{n,\alpha} - c_\alpha > \epsilon\right) \le 2\exp\left(-\frac{n(1-\alpha)^2\epsilon^2}{8\bar{\sigma}^2}\right) + 2\exp\left(-\frac{n\eta^2(1-\alpha)^2\epsilon^2}{2(2-\alpha)^2}\right),$$

where $\bar{\sigma}$ and $\eta$ are as defined above; this is precisely the bound in (8). ∎

#### Proof of Proposition 3

###### Proof.

To derive the main claim from the general bound in (8), it remains to control the density-dependent constant $\eta$. Given that the density $f$ of $X$ exists, we have, by the mean value theorem,

$$F(v_\alpha + \epsilon) - F(v_\alpha) = \epsilon f(\xi)$$

for some $\xi \in (v_\alpha, v_\alpha + \epsilon)$. Using the identity above for the two one-sided deviations of $\hat{v}_{n,\alpha}$, we obtain

$$F(v_\alpha + \epsilon) - \alpha = \epsilon f(\xi_1) \quad \text{and} \quad \alpha - F(v_\alpha - \epsilon) = \epsilon f(\xi_2)$$

for some $\xi_1 \in (v_\alpha, v_\alpha + \epsilon)$ and $\xi_2 \in (v_\alpha - \epsilon, v_\alpha)$. Along similar lines, it is easy to infer that $\eta$ may be taken to be a lower bound on $f$ in an $\epsilon$-neighbourhood of $v_\alpha$, which is positive whenever $f$ is positive and continuous at $v_\alpha$. The claim follows. ∎