 # Total variation distance for discretely observed Lévy processes: a Gaussian approximation of the small jumps

It is common practice to treat the small jumps of Lévy processes as Wiener noise and thus to approximate their marginals by a Gaussian distribution. However, results that quantify the goodness of this approximation according to a given metric are rare. In this paper, we clarify what happens when the chosen metric is the total variation distance. Such a choice is motivated by its statistical interpretation: if the total variation distance between two statistical models converges to zero, then no test can be constructed to distinguish the two models, which are therefore statistically equivalent. We carry out a fine analysis of a Gaussian approximation for the small jumps of Lévy processes with infinite Lévy measure in total variation distance. Non-asymptotic bounds for the total variation distance between n discrete observations of the small jumps of a Lévy process and the corresponding Gaussian distribution are presented and extensively discussed. As a byproduct, new upper bounds for the total variation distance between discrete observations of Lévy processes are provided. The theory is illustrated by concrete examples.



## 1 Introduction

Although Lévy processes, or equivalently infinitely divisible distributions, are mathematical objects introduced almost a century ago, and even though a good knowledge of their basic properties has long been achieved, they have recently enjoyed renewed interest. This is mainly due to their numerous applications in various fields. To name some examples, Lévy processes or Lévy-type processes (time-changed Lévy processes, Lévy-driven SDEs, etc.) play a central role in mathematical finance, insurance, telecommunications, biology, neurology, seismology, meteorology and extreme value theory. Examples of applications may be found in the textbooks [N] and [CT], whereas the monographs [Bertoin] and [sato] provide a comprehensive presentation of the properties of these processes.

The transition from the purely theoretical study of Lévy processes to the need to understand Lévy-driven models used in real-life applications has led to new challenges. For instance, the questions of how to simulate the trajectories of Lévy processes and how to make inference (prediction, testing, estimation, etc.) for this class of stochastic processes have become key issues. The literature concerning these two aspects is already quite large; without any claim of completeness, we quote [AR], Chapter VI in [N], [BCGM], [CR] and Part II in [CT].

We specifically focus on statistics and simulation for Lévy processes, because our paper aims to give an exhaustive answer to a recurring question in these areas: When are the small jumps of Lévy processes morally Gaussian?

Before entering into the details, let us take a step back and see where this question comes from. Thanks to the Lévy-Itô decomposition, the structure of the paths of any Lévy process is well understood: any Lévy process $X$ can be decomposed into the sum of three independent Lévy processes, namely a Brownian motion with drift, a centered martingale associated with the small jumps of $X$ and a compound Poisson process describing the big jumps of $X$ (see the decomposition (2) in Section 1.1 below). While the properties of continuously or discretely observed compound Poisson processes and of Gaussian processes are well understood, the same cannot be said for the small jumps. As usual in mathematics, when one faces a complex object a natural question is whether the problem can be simplified by replacing the difficult part with an easier but, in a sense, equivalent one. There are of course various notions of equivalence, ranging from the weakest, convergence in law, to the stronger convergence in total variation.

For some time now, many authors have noticed that the marginal laws of small jumps of Lévy processes with infinite Lévy measures resemble Gaussian random variables, see Figures 1 and 2.

Figure 1: Discretized trajectory of a Lévy process $\big(0,0,\frac{\mathbf 1_{(0,\varepsilon]}(x)}{x^{1+\beta}}\big)$ for $(n=10^3,\ \Delta=1,\ \varepsilon=10^{-2},\ \beta=0.9)$.

This observation has led to simulation algorithms for trajectories of Lévy processes based on a Gaussian approximation of the small jumps, see e.g. [CR] or [CT], Chapter 6. Regarding estimation procedures, a Gaussian approximation of the small jumps has, to the best of our knowledge, not been exploited yet. A fine control of the total variation distance between these two quantities could pave the way for new statistical procedures.

In this paper we would like to shed some light on when a Gaussian approximation of the small jumps is adequate. To do so we carefully measure the distance in total variation between the small jumps and their corresponding Gaussian distribution. The choice of this distance is justified by its statistical interpretation: if the total variation distance between the law of the small jumps and the corresponding Gaussian component converges to zero then no statistical test can be built to distinguish between the two models. Rephrased in terms of information theory, this means that the two models are asymptotically equally informative.

Clearly, investigating the goodness of a Gaussian approximation of the small jumps of a Lévy process in total variation distance makes sense only if one deals with discrete observations of the process. Indeed, if one has continuous observations of two Lévy processes, the problem of separating the continuous part from the jump part does not arise: the jumps are observed. It has long been known that the measure corresponding to the continuous observation of a continuous Lévy process is orthogonal to the measure corresponding to the continuous observation of a Lévy process with non-trivial jump part, see e.g. [JS]. However, the situation changes when dealing with discrete observations. In this case, disentangling the continuous and discontinuous parts of the processes is much more complex: intuitively, fine techniques are needed to understand whether, between two consecutive observations, there has been a chaotic continuous behavior, many small jumps, one single bigger jump, or a mixture of these.

A criterion for the weak convergence of marginals of Lévy processes is given by Gnedenko and Kolmogorov [GK]:

###### Theorem 1 (Gnedenko, Kolmogorov).

Marginals of Lévy processes with Lévy triplets $(b_n,\sigma_n^2,\nu_n)$ converge weakly to marginals of a Lévy process with Lévy triplet $(b,\sigma^2,\nu)$ if and only if

$$b_n\to b\quad\text{and}\quad\sigma_n^2\delta_0+(x^2\wedge 1)\nu_n(dx)\xrightarrow{w}\sigma^2\delta_0+(x^2\wedge 1)\nu(dx),$$

where $\delta_0$ is the Dirac measure at $0$ and $\xrightarrow{w}$ denotes weak convergence of finite measures.

A remarkable fact in the previous statement is the non-separation between the continuous and discontinuous parts of the processes: the law at time $t$ of a pure jump Lévy process can weakly converge to that of a continuous Lévy process. In particular, if $X$ is a Lévy process with Lévy measure $\nu$ then, for any $\varepsilon>0$ and $t>0$, the law of the centered jumps of $X_t$ with magnitude less than $\varepsilon$ converges weakly to a centered Gaussian distribution with variance $t\sigma^2(\varepsilon):=t\int_{|x|<\varepsilon}x^2\nu(dx)$ as $\varepsilon\to 0$. We aim to understand this phenomenon in depth, using a notion of closeness stronger than weak convergence, and to provide a quantitative translation of the result of Gnedenko and Kolmogorov in total variation distance.

Several results for distances between Lévy processes already exist. Most of them (see for example [EM], [JS] and [K]) concern distances on the Skorohod space, that is distances between continuous observations of the processes, and are thus outside the scope of this paper. Concerning discretely observed Lévy processes we mention the results in [liese] and [MR]. Liese [liese] proved the following upper bound in total variation for marginals of Lévy processes $X^j$ with Lévy triplets $(b_j,\Sigma_j^2,\nu_j)$, $j=1,2$: for any $t>0$

$$\big\|\mathcal L(X^1_t)-\mathcal L(X^2_t)\big\|_{TV}\le 2\sqrt{1-\Big(1-\frac{H^2\big(\mathcal N(t\tilde b_1,t\Sigma_1^2),\mathcal N(t\tilde b_2,t\Sigma_2^2)\big)}{2}\Big)^2\exp\big(-tH^2(\nu_1,\nu_2)\big)}$$

where $\tilde b_1$ and $\tilde b_2$ are drift terms associated with $b_1$ and $b_2$ (see [liese]) and $H$ denotes the Hellinger distance.

This result is the discrete-time analogue of the result of Mémin and Shiryaev for continuously observed Lévy processes, see [MS]. This is reflected in the fact that there is a clear separation between the continuous and discontinuous parts of the processes, which is unavoidable on the Skorohod space but can be relaxed when dealing with the marginals. Clearly, from this kind of upper bound it is not possible to deduce a Gaussian approximation for the small jumps in total variation: the bound is actually trivial whenever . However, such an approximation may hold even in total variation, as proved in [MR].

More precisely, in [MR] the convolutional structure of Lévy processes with non-zero diffusion coefficients is exploited to transfer results from Wasserstein distances to total variation distance. However, it turns out that the Gaussian approximation of the small jumps thus obtained is suboptimal for large classes of Lévy processes, for instance for Lévy processes with infinite Lévy measures that are absolutely continuous with respect to the Lebesgue measure, see Section 2.1.3.

In the present paper we complete the work started in [MR], providing a comprehensive answer to the following question: Under which asymptotics does a Gaussian approximation capture the behavior of the small jumps of a discretely observed Lévy process adequately, so that the two corresponding statistical models cannot be distinguished, i.e. they are equally informative?

Our analysis is developed for Lévy processes with Lévy measures $\nu$ which are supposed to be infinite and absolutely continuous with respect to the Lebesgue measure. If $\nu$ is finite and non-zero, the corresponding process is the convolution between a compound Poisson process and a Brownian motion plus drift. Several tools are already present in the literature to deal with this kind of process, hence we do not focus on them. The absolute continuity of $\nu$ with respect to the Lebesgue measure is one way to avoid the orthogonality between the law of the small jumps and a Gaussian distribution. Indeed, if $\nu$ is concentrated on a finite set, then the total variation distance between the law of the small jumps and any Gaussian distribution is always $1$.

Differently from what was done in [MR], we do not only consider the total variation distance between marginals of Lévy processes but we also establish sharp bounds for the distance between given increments of the small jumps. This is an essential novelty. If it is true that from a bound in total variation between marginals one can always deduce a bound for the sample by , this kind of control is in general sub-optimal. In many situations indeed, is of order of , as goes to infinity. This faster rate can indeed be obtained using the methods developed in the present work.

More concretely, our main result about the Gaussian approximation of the small jumps in total variation may be stated as follows:

###### Main Result 1.

Let be a Lévy process with Lévy triplet where is an infinite Lévy measure absolutely continuous with respect to the Lebesgue measure with support in . Introducing , and , for any , , it holds

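$$\Big\|\mathcal L\big((M_{i\Delta}(\varepsilon)-M_{(i-1)\Delta}(\varepsilon),\ i=1,\dots,n)\big)-\mathcal N\big(0,\Delta\sigma^2(\varepsilon)\big)^{\otimes n}\Big\|_{TV}\le 3\sqrt{\Big(1+\frac{\mu_4^2(\varepsilon)}{\Delta^2(\sigma^2(\varepsilon))^4}+76\,\frac{\mu_3^2(\varepsilon)}{\Delta(\sigma^2(\varepsilon))^3}\Big)^n-1}.$$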

The non-asymptotic nature of the result above makes it possible to quantify just how “small” the small jumps must be, in terms of the number of observations $n$ and their frequency $\Delta$, for their law to be close in total variation to the corresponding Gaussian distribution. More precisely, provided and as , the rate of convergence in Main Result 1 is of order

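$$r:=\sqrt n\,\Big(\frac{\mu_4(\varepsilon)}{\Delta(\sigma^2(\varepsilon))^2}+76\,\frac{\mu_3(\varepsilon)}{\sqrt\Delta\,(\sigma^2(\varepsilon))^{3/2}}\Big).\tag{1}$$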

A sufficient condition to ensure and is that as . In that case it is straightforward to see that if then the rate of convergence of is of order ; when , this can be further improved to .

To exemplify, consider the small jumps of symmetric -stable processes, that is Lévy processes with Lévy measure given by , . In this case, the rate of convergence of the distance in total variation between the law of given increments of the small jumps collected at a sampling rate and the corresponding Gaussian distribution is of order , as
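As a worked instance of (1), the moments can be computed in closed form under an assumed normalization: the symmetric stable Lévy measure is taken here to be $\nu(dx)=|x|^{-1-\beta}dx$ with $\beta\in(0,2)$, where the symbol $\beta$ is borrowed from the caption of Figure 1 and is an assumption of this sketch. By symmetry the third moment vanishes, so only the fourth-moment term of (1) remains:

$$\sigma^2(\varepsilon)=\int_{|x|\le\varepsilon}\frac{x^2}{|x|^{1+\beta}}\,dx=\frac{2\varepsilon^{2-\beta}}{2-\beta},\qquad \mu_3(\varepsilon)=0,\qquad \mu_4(\varepsilon)=\int_{|x|\le\varepsilon}\frac{x^4}{|x|^{1+\beta}}\,dx=\frac{2\varepsilon^{4-\beta}}{4-\beta},$$

$$r=\sqrt n\,\frac{\mu_4(\varepsilon)}{\Delta(\sigma^2(\varepsilon))^2}=\frac{(2-\beta)^2}{2(4-\beta)}\cdot\frac{\sqrt n\,\varepsilon^{\beta}}{\Delta}.$$

Under this normalization, $r$ vanishes exactly when $\sqrt n\,\varepsilon^{\beta}/\Delta\to 0$.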

The idea behind the proof of Main Result 1 is the following. For any and any , we decompose into the sum of , the marginal of a Lévy process with jumps of size smaller than , and , the marginal of a centered compound Poisson process with jumps of size in . For small enough is negligible (see Proposition 1) and can be approximated by a Gaussian distribution as follows. Firstly we write

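$$R_t(\eta,\varepsilon)=\sum_{i=1}^{N_t(\eta,\varepsilon)}Y_i-\mathbb E\Big[\sum_{i=1}^{N_t(\eta,\varepsilon)}Y_i\Big]\approx\sum_{i=1}^{N_t(\eta,\varepsilon)}\big(Y_i-\mathbb E[Y_i]\big),$$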

where $N(\eta,\varepsilon)$ is a Poisson process with intensity $\lambda_{\eta,\varepsilon}:=\nu(\{\eta<|x|\le\varepsilon\})$ and the $Y_i$’s are i.i.d. bounded random variables. The approximation is justified by Proposition 2. To deal with this random sum we separate the case where $N_t(\eta,\varepsilon)$ is large from the case where it is not; the latter occurs with very small probability, given that $\lambda_{\eta,\varepsilon}\to\infty$ as $\eta\to 0$ by the assumption that $\nu$ is infinite. Conditional on $N_t(\eta,\varepsilon)$ being close enough to its mean (and thus large enough), the sum can be proven to be approximately Gaussian by means of an Edgeworth expansion. The main technical step in our proof of Main Result 1, however, is to derive an unconditional bound with explicit constants (see Proposition 3), in order to obtain a Gaussian approximation of the whole random sum. This involves very precise computations in order to get cancellation of the moments and thus the optimal rate of convergence.

Main Result 1 can be sharpened for Lévy processes with non-zero Gaussian components obtaining a faster rate of convergence, as follows.

###### Main Result 2.

With the same notation as in Main Result 1 the following holds. For any , , , and , we have

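$$\Big\|\big(\mathcal N(b\Delta,\Delta\Sigma^2)*M_\Delta(\varepsilon)\big)^{\otimes n}-\mathcal N\big(b\Delta,\Delta(\Sigma^2+\sigma^2(\varepsilon))\big)^{\otimes n}\Big\|_{TV}\le 3\sqrt{\Big(1+\frac{\mu_4^2(\varepsilon)}{\Delta^2(\Sigma^2+\sigma^2(\varepsilon))^4}+76\,\frac{\mu_3^2(\varepsilon)}{\Delta(\Sigma^2+\sigma^2(\varepsilon))^3}\Big)^n-1}.$$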

It can additionally be sharpened by considering separately large and rare jumps, see Corollary 2 - this has an impact in the case where the jumps of size of order are very rare.

Main Result 1 is optimal, in the sense that whenever the jumps of size of order are not “too rare” and whenever the above quantity in (1) is larger than a constant, then the upper bound in Main Result 1 is trivial, but the total variation distance can be bounded away from 0 as shown in Main Result 3 below, which is in fact more general as it considers also processes with a drift and a Brownian part.

###### Main Result 3.

Let be an infinite Lévy measure absolutely continuous with respect to the Lebesgue measure, let and . For any and such that , there exists an absolute sequence (independent of ) and an absolute constant (independent of ) such that the following holds:

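$$\Big\|\big(\mathcal N(b\Delta,\Delta\Sigma^2)*M_\Delta(\varepsilon)\big)^{\otimes n}-\mathcal N\big(b\Delta,\Delta(\Sigma^2+\sigma^2(\varepsilon))\big)^{\otimes n}\Big\|_{TV}\ge\Big\{1-C\Big[\frac{\Delta(\Sigma^2+\sigma^2(\varepsilon))^3}{n\,\mu_3(\varepsilon)^2}\wedge\frac{\Delta^2(\Sigma^2+\sigma^2(\varepsilon))^4}{n\,\mu_4(\varepsilon)^2}\Big]-\alpha_n\Big\}.$$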

A more technical and general lower bound, which holds without any condition on the rarity of the jumps of size of order , is also available, see Theorem 5. This lower bound matches in order the general upper bound of Corollary 2, which implies the optimality of our results without further conditions. The proof of the lower bound for the total variation distance is based on the construction of a sharp Gaussian test for Lévy processes. This test combines three ideas: (i) the detection of extreme values that are ‘too large’ to be produced by a Brownian motion, (ii) the detection of asymmetries around the drift through the third moment, and (iii) the detection of tails that are too heavy for a Brownian motion through the fourth moment. It may be of independent interest, as it requires no prior knowledge of the tested process and optimally detects the presence of jumps. It uses classical ideas from testing through moments and extreme values [ingster2012nonparametric], and adapts them to the specific structure of Lévy processes. The closest related work is [reiss2013testing]. We improve on the test proposed there, as we go beyond testing based on the fourth moment only and tighten the results regarding rare and large jumps.
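The three ingredients listed above can be illustrated with simple summary statistics of the observed increments. The sketch below only shows the kind of quantities involved and is not the test constructed in the paper: the function name `gaussianity_statistics` and the plain empirical standardization are assumptions made for illustration.

```python
import math


def gaussianity_statistics(increments):
    """Standardized extreme value, skewness and excess kurtosis of a sample.

    Hypothetical helper: it assumes a non-constant sample (positive variance).
    """
    n = len(increments)
    mean = sum(increments) / n
    centered = [x - mean for x in increments]
    var = sum(c * c for c in centered) / n  # empirical variance (1/n version)
    std = math.sqrt(var)
    return {
        # (i) largest standardized deviation from the empirical mean
        "max_abs": max(abs(c) for c in centered) / std,
        # (ii) empirical skewness, sensitive to asymmetry around the drift
        "skewness": sum(c ** 3 for c in centered) / (n * std ** 3),
        # (iii) empirical excess kurtosis, sensitive to heavy tails
        "excess_kurtosis": sum(c ** 4 for c in centered) / (n * var ** 2) - 3.0,
    }
```

For Gaussian increments the skewness and excess kurtosis should be close to zero for large samples, so markedly larger values hint at the presence of jumps; turning such statistics into a calibrated test requires the thresholds derived in the paper.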

To sum up, this paper provides a comprehensive answer to the question of when a Gaussian approximation of the small jumps of discretely observed Lévy processes is appropriate, in the following sense. Given the observation of $n$ increments of a Lévy process $X$, if the number of observations $n$, the sampling rate $\Delta$ and the size $\varepsilon$ of the small jumps of $X$ are such that the upper bounds for the total variation distance we give in Theorems 2 and 3 go to $0$ as $n$ goes to infinity, then a Gaussian approximation of the small jumps is statistically justified. Indeed, under these circumstances no statistical test can be constructed to distinguish between realizations of increments of the small jumps of $X$ (potentially convolved with the Gaussian part of $X$) and realizations of a Gaussian vector with matching mean and variance. Conversely, if $n$, $\Delta$ and $\varepsilon$ are such that the upper bounds we provide in Theorems 2 and 3 do not vanish, then the Gaussian approximation of the small jumps is not statistically justifiable. In this case the total variation distance is in fact bounded from below and a statistical test can be constructed to distinguish the small jumps from Gaussian distributions, see Theorems 4 and 5.

An interesting byproduct of Main Results 1 and 2 is Theorem 6 in Section 3, which provides a new upper bound for the total variation distance between $n$ given increments of any pair of Lévy processes with infinite Lévy measures. A peculiarity of the result is the non-separation between the continuous and discontinuous parts of the processes. In this sense Theorem 6 is close in spirit to Theorem 1, although it is worth noting that it gives a concrete measure of the closeness between discretely observed Lévy processes in total variation, that is, using a much stronger notion of convergence than weak convergence.

The paper is organized as follows. In Section 1.1 we fix the notation. Section 2 is devoted to the analysis of the Gaussian approximation of the small jumps of Lévy processes. More precisely, in Section 2.1 we present upper bounds in total variation distance whereas in Section 2.2 we provide lower bounds proving the optimality of our findings. In Section 3 new upper bounds for the total variation distance between given increments of two general Lévy processes are derived. Most of the proofs are postponed to Section 4. The paper also contains two Appendices: in Appendix A technical results related to the proof of the lower bounds can be found whereas in Appendix B we recall some results about total variation distance and present some general, and probably not new, results about the total variation distance between Gaussian distributions and discrete observations of compound Poisson processes.

### 1.1 Statistical setting and notation

For a Lévy process $X$ with Lévy triplet $(b,\Sigma^2,\nu)$ (we write $X\sim(b,\Sigma^2,\nu)$), where $b\in\mathbb R$, $\Sigma\ge 0$ and $\nu$ satisfies $\int(x^2\wedge 1)\nu(dx)<\infty$, the Lévy-Itô decomposition gives a decomposition of $X$ into the sum of three independent Lévy processes: a Brownian motion with drift, a centered martingale associated with the small jumps of $X$ and a compound Poisson process associated with the large jumps of $X$. More precisely, for any $\varepsilon>0$, $X$ can be decomposed as

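$$\begin{aligned}X_t&=b(\varepsilon)t+\Sigma W_t+\lim_{\eta\to 0}\Big(\sum_{s\le t}\Delta X_s\mathbf 1_{\eta<|\Delta X_s|\le\varepsilon}-t\int_{\eta<|x|\le\varepsilon}x\,\nu(dx)\Big)+\sum_{s\le t}\Delta X_s\mathbf 1_{|\Delta X_s|>\varepsilon}\\&=:b(\varepsilon)t+\Sigma W_t+M_t(\varepsilon)+Z_t(\varepsilon),\qquad\forall t\ge 0,\end{aligned}\tag{2}$$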

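where $\Delta X_r$ denotes the jump at time $r$ of $X$ and

• $W$ is a standard Brownian motion;

• $M(\varepsilon)$ is a centered Lévy process (and a martingale) with a Lévy measure with support in $[-\varepsilon,\varepsilon]$, i.e. it is the Lévy process associated with the jumps of $X$ smaller than $\varepsilon$;

• $Z(\varepsilon)$ is a Lévy process with a Lévy measure concentrated on $\mathbb R\setminus[-\varepsilon,\varepsilon]$, i.e. it is a compound Poisson process with intensity $\nu(\{|x|>\varepsilon\})$ and jump measure $\nu(\,\cdot\cap\{|x|>\varepsilon\})/\nu(\{|x|>\varepsilon\})$;

• $W$, $M(\varepsilon)$ and $Z(\varepsilon)$ are independent.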

Contrary to the process $M(\varepsilon)$, the processes $W$ and $Z(\varepsilon)$ are well understood and have been extensively studied in the literature. In the present paper we focus on the process $M(\varepsilon)$ and our objective is twofold. On the one hand, we would like to understand when a Gaussian approximation of the increments of the process $M(\varepsilon)$ is valid in total variation. On the other hand, we would like to deduce new results to control the total variation distance between $n$ given increments of two Lévy processes. More precisely, in the first case, suppose one is given $n$ increments of the process $M(\varepsilon)$ collected at the sampling rate $\Delta$ over the time interval $[0,n\Delta]$, i.e. one observes $(M_{i\Delta}(\varepsilon)-M_{(i-1)\Delta}(\varepsilon),\ i=1,\dots,n)$. We propose a sharp control of

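$$\Big\|\mathcal L\big((M_{i\Delta}(\varepsilon)-M_{(i-1)\Delta}(\varepsilon),\ i=1,\dots,n)\big)-\mathcal N\Big(0,\Delta\int_{|x|<\varepsilon}x^2\nu(dx)\Big)^{\otimes n}\Big\|_{TV},\tag{3}$$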

where $\mathcal L(X)$ stands for “the law of $X$” and $\mathcal N(m,s^2)$ stands for “the law of a Gaussian random variable with mean $m$ and variance $s^2$”. As $M(\varepsilon)$ is a centered Lévy process, its increments are i.i.d. centered random variables with variance given by $\Delta\int_{|x|<\varepsilon}x^2\nu(dx)$; that is the reason why we compare with the product measure $\mathcal N\big(0,\Delta\int_{|x|<\varepsilon}x^2\nu(dx)\big)^{\otimes n}$. In the second case, suppose one observes $n$ increments of two distinct Lévy processes $X^j$, $j=1,2$, at the sampling rate $\Delta$ over the time interval $[0,n\Delta]$. We provide a control of

$$\Big\|\mathcal L\big((X^1_{i\Delta}-X^1_{(i-1)\Delta},\ i=1,\dots,n)\big)-\mathcal L\big((X^2_{i\Delta}-X^2_{(i-1)\Delta},\ i=1,\dots,n)\big)\Big\|_{TV}.$$

In the sequel, we exploit several approximations for different distances and objects. We introduce the following notation. For any $0<\eta<\varepsilon$ define $M(\eta,\varepsilon)$, the centered compound Poisson process that approximates $M(\varepsilon)$ as $\eta\to 0$, by

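$$M_t(\eta,\varepsilon)=\sum_{s\le t}\Delta X_s\mathbf 1_{\eta<|\Delta X_s|\le\varepsilon}-t\int_{\eta<|x|\le\varepsilon}x\,\nu(dx)=\sum_{i=1}^{N_t}Y_i-t\int_{\eta<|x|\le\varepsilon}x\,\nu(dx)$$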

where $N$ is a Poisson process with intensity $\lambda_{\eta,\varepsilon}:=\int_{\eta<|x|\le\varepsilon}\nu(dx)$ and the $Y_i$ are i.i.d. with jump measure

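$$\mathbb P(Y_1\in B)=\frac{1}{\lambda_{\eta,\varepsilon}}\int_{B\cap\{\eta<|x|\le\varepsilon\}}\nu(dx)\qquad\forall B\in\mathcal B(\mathbb R).$$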

Moreover, we denote the first moments of by

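$$\sigma^2(\eta,\varepsilon)=\int_{\eta<|x|\le\varepsilon}x^2\nu(dx),\qquad\mu_j(\eta,\varepsilon)=\int_{\eta<|x|\le\varepsilon}x^j\nu(dx),\qquad\text{and}\qquad\mu_j(\varepsilon)=\mu_j(0,\varepsilon),\quad j=3,4.$$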
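The compound Poisson representation of $M_t(\eta,\varepsilon)$ lends itself to direct simulation. The following is a minimal sketch under stated assumptions: the Lévy density is taken to be the one-sided density $x^{-1-\beta}\mathbf 1_{(0,\varepsilon]}(x)$ from the caption of Figure 1 with $\beta\neq 1$, and the helper names (`levy_quantities`, `sample_poisson`, `sample_jump`, `sample_increment`) are hypothetical. The Poisson sampler counts unit-rate exponential arrivals and is only intended for moderate intensities.

```python
import math
import random


def levy_quantities(beta, eta, eps):
    """Intensity, first moment and sigma^2(eta, eps) for nu(dx) = x^(-1-beta) dx on (eta, eps]."""
    lam = (eta ** -beta - eps ** -beta) / beta
    mu1 = (eps ** (1 - beta) - eta ** (1 - beta)) / (1 - beta)  # requires beta != 1
    sigma2 = (eps ** (2 - beta) - eta ** (2 - beta)) / (2 - beta)
    return lam, mu1, sigma2


def sample_poisson(mean, rng):
    """Poisson(mean) variate: number of unit-rate exponential arrivals in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def sample_jump(beta, eta, eps, rng):
    """One jump Y_i, drawn by inverting the c.d.f. of nu restricted to (eta, eps]."""
    a, b = eta ** -beta, eps ** -beta
    return (a - rng.random() * (a - b)) ** (-1.0 / beta)


def sample_increment(beta, eta, eps, delta, rng):
    """One increment of M(eta, eps) over a time step delta (sum of jumps minus its mean)."""
    lam, mu1, _ = levy_quantities(beta, eta, eps)
    k = sample_poisson(lam * delta, rng)
    return sum(sample_jump(beta, eta, eps, rng) for _ in range(k)) - delta * mu1
```

With a seeded generator such as `random.Random(0)`, the empirical mean of many increments should be close to $0$ and their empirical variance close to $\Delta\sigma^2(\eta,\varepsilon)$, which is the variance of the Gaussian law the increments are compared with.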

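Finally we recall that the total variation distance between two probability measures $P_1$ and $P_2$ defined on the same $\sigma$-field $\mathcal B$ is defined as

$$\|P_1-P_2\|_{TV}:=\sup_{B\in\mathcal B}|P_1(B)-P_2(B)|=\frac12\int\Big|\frac{dP_1}{d\mu}(x)-\frac{dP_2}{d\mu}(x)\Big|\mu(dx),$$

where $\mu$ is a common dominating measure for $P_1$ and $P_2$. The $\chi^2$-distance is instead defined as follows:

$$\chi^2(P_1,P_2)=\begin{cases}\displaystyle\int\Big(\frac{dP_1}{dP_2}-1\Big)^2dP_2&\text{if }P_1\ll P_2,\\[1ex]+\infty&\text{otherwise,}\end{cases}$$

where the symbol “$\ll$” stands for “absolutely continuous with respect to”.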

In order to ease the reading, if $X$ and $Y$ are random variables with densities $f_X$ and $f_Y$ with respect to the same dominating measure, we sometimes write $\|X-Y\|_{TV}$ or $\|f_X-f_Y\|_{TV}$ instead of $\|\mathcal L(X)-\mathcal L(Y)\|_{TV}$. The same type of abbreviation is used for any other distance.
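The two expressions of the total variation distance recalled above, and the $\chi^2$-distance, can be made concrete on a finite sample space. The sketch below is an illustration only: it assumes probability vectors on a small common support with the counting measure as dominating measure, and the function names are hypothetical.

```python
import math
from itertools import combinations


def tv_half_l1(p, q):
    """Total variation as half the L1 distance between the two probability vectors."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def tv_sup_over_events(p, q):
    """Total variation as a supremum over all events (brute force, small supports only)."""
    indices = range(len(p))
    best = 0.0
    for size in range(len(p) + 1):
        for event in combinations(indices, size):
            gap = abs(sum(p[i] for i in event) - sum(q[i] for i in event))
            best = max(best, gap)
    return best


def chi2(p, q):
    """Chi-square distance of p with respect to q; infinite if p is not dominated by q."""
    if any(qi == 0 and pi > 0 for pi, qi in zip(p, q)):
        return math.inf
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q) if qi > 0)
```

For `p = [0.5, 0.3, 0.2]` and `q = [0.25, 0.25, 0.5]` both total variation functions agree, and the smallest achievable sum of type I and type II errors of a test between `p` and `q` is one minus that value, which is the statistical interpretation invoked in the introduction.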

## 2 Gaussian approximation for $M_\Delta(\varepsilon)$ in total variation distance

Our main goal is to understand under which circumstances a Gaussian approximation for the small jumps of a Lévy process is adequate. Of course, what is meant by “adequate” must be clarified. From a statistical point of view, such an approximation is justified if discrete observations of the increments of the small jumps cannot be distinguished from independent Gaussian random variables. More concretely, we can establish that a Gaussian approximation for $M(\varepsilon)$ is valid if we can provide a sharp control in total variation distance. This is the subject of this section. First, we prove an upper bound for (3) which, using the independence and stationarity of the increments of $M(\varepsilon)$, is equivalent to proving an upper bound for

$$\Big\|M_\Delta(\varepsilon)^{\otimes n}-\mathcal N\big(0,\Delta\sigma^2(\varepsilon)\big)^{\otimes n}\Big\|_{TV}.\tag{4}$$

Several intermediate approximations given in Propositions 1, 2 and 3 are necessary to control (4). The final result is given in Theorem 2. At the beginning of Section 2.1, we outline the main steps needed to establish Theorem 2. Proofs are postponed to Section 4 whereas the optimality of Theorem 2 is confirmed in Section 2.2 where a lower bound result is presented.

### 2.1 Upper bound

#### 2.1.1 Preliminary approximations

We establish an upper bound for (4) in Theorem 2 below. Many auxiliary approximations are used to prove it; they are gathered in this section. The starting point is that the process $M(\varepsilon)$ can be decomposed as follows. For any $t>0$ and for any $0<\eta<\varepsilon$,

$$M_t(\varepsilon)=M_t(\eta)+\sum_{i=1}^{N_t}Y_i-t\int_{\eta<|x|\le\varepsilon}x\,\nu(dx)=:M_t(\eta)+M_t(\eta,\varepsilon),\tag{5}$$

where we recall that is a Poisson process with intensity and the ’s are i.i.d. with density . The interest in writing as in (5) lies in the fact that the process is more manageable than , being a Lévy process with finite Lévy measure. The spirit of the approximation of by is to use the closeness, for small enough, between and on the one hand and between and on the other hand.

First, we try to understand whether, for $\eta$ small enough, working with discrete observations of $M(\eta,\varepsilon)$ is somehow equivalent to working with those of $M(\varepsilon)$. It is well known that $M_t(\eta,\varepsilon)$ converges almost surely and in $L_2$ to $M_t(\varepsilon)$ as $\eta$ goes to $0$, but this is not strong enough for our purpose. In Proposition 1, we establish a control in total variation distance between $n$ i.i.d. increments of these quantities, under the assumption that $\nu$ is an infinite Lévy measure.

###### Proposition 1.

Let be an infinite Lévy measure absolutely continuous with respect to the Lebesgue measure. For any fixed , and , it holds,

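$$\limsup_{\eta\to 0}\big\|M_\Delta(\eta,\varepsilon)^{\otimes n}-M_\Delta(\varepsilon)^{\otimes n}\big\|_{TV}\le 2\liminf_{\varepsilon'\to 0}\big\|M_\Delta(\varepsilon',\varepsilon)^{\otimes n}-\mathcal N\big(0,\Delta\sigma^2(\varepsilon',\varepsilon)\big)^{\otimes n}\big\|_{TV}.$$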

Roughly speaking, Proposition 1 tells us that the distance between i.i.d. observations of and i.i.d. observations of is bounded by the approximation error one would make approximating the discrete observation of by the corresponding Gaussian observations.

Second, always under the assumption that is infinite, we control the distance between and , for that we use an Edgeworth expansion for the distribution of . However, this expansion will be sharp only when , appearing in the definition of , is large. Therefore, we need to localize the values taken by the Poisson process ; when goes to , its intensity increases and gets closer to its mean value This is why we introduce the following events, for a fixed ,

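$$A_i:=\big\{\omega:\ \lambda_{\eta,\varepsilon}\Delta(1-\psi)\le N_{i\Delta}(\omega)-N_{(i-1)\Delta}(\omega)\le\lambda_{\eta,\varepsilon}\Delta(1+\psi)\big\},\qquad i=1,\dots,n.\tag{6}$$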

These events are equiprobable; set $A:=A_1$. For technical reasons, it is easier to work with the compound Poisson process with centered jumps than with the centered compound Poisson process $M(\eta,\varepsilon)$. These two processes are equal if $\nu$ is symmetric and they are close to each other for $\eta$ small enough. Denote by $f_{\eta,\varepsilon,A}$ (resp. $\tilde f_{\eta,\varepsilon,A}$) the density of

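$$\frac{1}{\mathbb P(A)}\Big(\sum_{\substack{k\in\mathbb N\\|k-\lambda_{\eta,\varepsilon}\Delta|\le\psi\lambda_{\eta,\varepsilon}\Delta}}\mathbb P(N_\Delta=k)\sum_{i=1}^{k}Y_i-\lambda_{\eta,\varepsilon}\Delta\,\mathbb E[Y_1]\Big)\qquad\Big(\text{resp. }\sum_{\substack{k\in\mathbb N\\|k-\lambda_{\eta,\varepsilon}\Delta|\le\psi\lambda_{\eta,\varepsilon}\Delta}}\frac{\mathbb P(N_\Delta=k)}{\mathbb P(A)}\sum_{i=1}^{k}\big(Y_i-\mathbb E[Y_1]\big)\Big)\tag{7}$$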

and by (resp. ) the density of a centered Gaussian distribution with variance (resp. ). The densities (resp. ) can be interpreted as the densities of (resp. ) conditioned by the event Given that is a Poisson random variable with mean and that as , the event has exponentially small probability, that is why we focus on .

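Proposition 2 below justifies the substitution of $f_{\eta,\varepsilon,A}$ with $\tilde f_{\eta,\varepsilon,A}$. We prove the convergence in $L_1$, as $\eta\to 0$, between the two corresponding laws restricted to the event $A$, i.e. conditioned on $N_\Delta$ staying within a factor $1\pm\psi$ of its mean $\lambda_{\eta,\varepsilon}\Delta$.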

###### Proposition 2.

Fix and let be an infinite Lévy measure absolutely continuous with respect to the Lebesgue measure. Then,

$$\lim_{\eta\to 0}\int\big|f_{\eta,\varepsilon,A}(x)-\tilde f_{\eta,\varepsilon,A}(x)\big|\,dx=0.$$

Next, we approximate $\tilde f_{\eta,\varepsilon,A}$, the density of the random variable $\sum_{i=1}^{N_\Delta}(Y_i-\mathbb E[Y_1])$ conditional on the event $A$, when $\eta$ is small with respect to $\varepsilon$. Under the assumption that $\nu$ is infinite, $\lambda_{\eta,\varepsilon}\Delta$ is large and, roughly speaking, we can think of $\tilde f_{\eta,\varepsilon,A}$ as the density of a sum of about $\lambda_{\eta,\varepsilon}\Delta$ i.i.d. bounded and centered random variables. In view of the central limit theorem, $\tilde f_{\eta,\varepsilon,A}$ is therefore close to a Gaussian density. However, quantifying this proximity requires choosing a suitable distance between $\tilde f_{\eta,\varepsilon,A}$ and $\tilde\varphi_{\eta,\varepsilon}$. Since our final goal is to control product measures in total variation, we select a metric that behaves well with respect to product measures and that is connected with total variation, such as the Hellinger distance, the Kullback-Leibler divergence or the $\chi^2$ distance. The latter was easier to work with in this context and leads to the following Proposition 3.

###### Proposition 3.

Fix , and let be an infinite Lévy measure absolutely continuous with respect to the Lebesgue measure. Then, for any and there exists independent of and , such that for any

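$$\chi^2\big(\tilde f_{\eta,\varepsilon,A},\tilde\varphi_{\eta,\varepsilon}\big)\le C\,\frac{\mu_3^2(\eta,\varepsilon)}{\Delta(\sigma^2(\eta,\varepsilon))^3}+C'\,\frac{\mu_4^2(\eta,\varepsilon)}{\Delta^2(\sigma^2(\eta,\varepsilon))^4}.\tag{8}$$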

#### 2.1.2 Upper bound result: Gaussian approximation

Collecting all the pieces together, it is finally possible to state under which conditions on $n$, $\Delta$ and $\varepsilon$ a Gaussian approximation of the small jumps is valid, provided that $\nu$ is sufficiently active.

###### Theorem 2.

Let be an infinite Lévy measure absolutely continuous with respect to the Lebesgue measure. Then, for any , and we have

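$$\Big\|M_\Delta(\varepsilon)^{\otimes n}-\mathcal N\big(0,\Delta\sigma^2(\varepsilon)\big)^{\otimes n}\Big\|_{TV}\le 3\sqrt{\Big(1+\frac{\mu_4^2(\varepsilon)}{\Delta^2(\sigma^2(\varepsilon))^4}+76\,\frac{\mu_3^2(\varepsilon)}{\Delta(\sigma^2(\varepsilon))^3}\Big)^n-1}.\tag{9}$$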
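To get a feeling for the size of the right-hand side of (9), it can be evaluated numerically. The sketch below rests on several assumptions: the Lévy density is the one-sided density $x^{-1-\beta}\mathbf 1_{(0,\varepsilon]}(x)$ of Figure 1, the coefficient of the third-moment term is exposed as a parameter `c3` whose default `76.0` simply reproduces the digits printed in the statement, and the result is capped at $1$ because a total variation distance never exceeds $1$. All function names are hypothetical.

```python
import math


def one_sided_moments(beta, eps):
    """sigma^2(eps), mu_3(eps), mu_4(eps) for nu(dx) = x^(-1-beta) dx on (0, eps]."""
    sigma2 = eps ** (2 - beta) / (2 - beta)
    mu3 = eps ** (3 - beta) / (3 - beta)
    mu4 = eps ** (4 - beta) / (4 - beta)
    return sigma2, mu3, mu4


def tv_upper_bound(n, delta, sigma2, mu3, mu4, c3=76.0):
    """Right-hand side of the bound, capped at 1 (c3 is taken as printed in the text)."""
    a = mu4 ** 2 / (delta ** 2 * sigma2 ** 4) + c3 * mu3 ** 2 / (delta * sigma2 ** 3)
    log_growth = n * math.log1p(a)  # log of (1 + a)^n, computed stably
    if log_growth > 50.0:  # the uncapped bound is already far above 1
        return 1.0
    return min(1.0, 3.0 * math.sqrt(math.expm1(log_growth)))
```

Calling `tv_upper_bound(100, 1.0, *one_sided_moments(0.9, 1e-6))` gives an informative value below $1$, whereas the same call with `eps = 1e-2` returns the trivial value $1$: for this asymmetric density the third-moment term dominates, so $\varepsilon$ has to be very small relative to $n$ and $\Delta$ before the bound becomes useful.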

###### Proof of Theorem 2.

Let $0<\eta<\varepsilon$. By the triangle inequality,

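$$\begin{aligned}\big\|M_\Delta(\varepsilon)^{\otimes n}-\mathcal N(0,\Delta\sigma^2(\varepsilon))^{\otimes n}\big\|_{TV}&\le\big\|M_\Delta(\varepsilon)^{\otimes n}-M_\Delta(\eta,\varepsilon)^{\otimes n}\big\|_{TV}+\big\|M_\Delta(\eta,\varepsilon)^{\otimes n}-\mathcal N(0,\Delta\sigma^2(\eta,\varepsilon))^{\otimes n}\big\|_{TV}\\&\quad+\big\|\mathcal N(0,\Delta\sigma^2(\eta,\varepsilon))^{\otimes n}-\mathcal N(0,\Delta\sigma^2(\varepsilon))^{\otimes n}\big\|_{TV}\\&=T_1+T_2+T_3.\end{aligned}$$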

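The first term, $T_1$, is controlled by Proposition 1. For the term $T_2$, recall that $M_\Delta(\eta,\varepsilon)$ is of the form $\sum_{i=1}^{N_\Delta}Y_i-\Delta\int_{\eta<|x|\le\varepsilon}x\,\nu(dx)$; together with the definition of the sets $A_i$ given in (6), it follows that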

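$$T_2\le\mathbb P\Big(\big(N_\Delta,N_{2\Delta}-N_\Delta,\dots,N_{n\Delta}-N_{(n-1)\Delta}\big)\in\Big(\bigotimes_{i=1}^nA_i\Big)^c\Big)+\big\|f_{\eta,\varepsilon,A}^{\otimes n}-\varphi_{\eta,\varepsilon}^{\otimes n}\big\|_{TV}.\tag{10}$$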

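The first term in (10) is bounded using the Chernoff inequality:

$$n\,\mathbb P\big(|N_\Delta-\lambda_{\eta,\varepsilon}\Delta|>\psi\lambda_{\eta,\varepsilon}\Delta\big)\le 2n\,e^{-\frac{\psi^2\lambda_{\eta,\varepsilon}\Delta}{2}}.\tag{11}$$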

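For the second term in (10), observe that by the triangle inequality together with Lemma 13 we have

$$\begin{aligned}\big\|f_{\eta,\varepsilon,A}^{\otimes n}-\varphi_{\eta,\varepsilon}^{\otimes n}\big\|_{TV}&\le\big\|f_{\eta,\varepsilon,A}^{\otimes n}-\tilde f_{\eta,\varepsilon,A}^{\otimes n}\big\|_{TV}+\big\|\tilde f_{\eta,\varepsilon,A}^{\otimes n}-\varphi_{\eta,\varepsilon}^{\otimes n}\big\|_{TV}\\&\le\sqrt{2n\big\|f_{\eta,\varepsilon,A}-\tilde f_{\eta,\varepsilon,A}\big\|_{TV}}+\big\|\tilde f_{\eta,\varepsilon,A}^{\otimes n}-\tilde\varphi_{\eta,\varepsilon}^{\otimes n}\big\|_{TV}+\big\|\varphi_{\eta,\varepsilon}^{\otimes n}-\tilde\varphi_{\eta,\varepsilon}^{\otimes n}\big\|_{TV}.\end{aligned}\tag{12}$$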

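The first term in (12) is controlled by means of Proposition 2, whereas for the second term we use the following well-known relations between the total variation and $\chi^2$ distances (see e.g. Tsybakov [tsybakov2009introduction])

$$\chi^2\big(P_1^{\otimes n},P_2^{\otimes n}\big)=\big(1+\chi^2(P_1,P_2)\big)^n-1\qquad\text{and}\qquad\|P_1-P_2\|_{TV}\le\sqrt{\chi^2(P_1,P_2)}.$$

In particular we get

$$\big\|\tilde f_{\eta,\varepsilon,A}^{\otimes n}-\tilde\varphi_{\eta,\varepsilon}^{\otimes n}\big\|_{TV}\le\sqrt{\big(1+\chi^2(\tilde f_{\eta,\varepsilon,A},\tilde\varphi_{\eta,\varepsilon})\big)^n-1}.$$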

Applying Proposition 3 we therefore deduce that for any and there exists such that

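$$\big\|\tilde f_{\eta,\varepsilon,A}^{\otimes n}-\tilde\varphi_{\eta,\varepsilon}^{\otimes n}\big\|_{TV}\le\sqrt{\Big(1+C\,\frac{\mu_3^2(\eta,\varepsilon)}{\Delta(\sigma^2(\eta,\varepsilon))^3}+C'\,\frac{\mu_4^2(\eta,\varepsilon)}{\Delta^2(\sigma^2(\eta,\varepsilon))^4}\Big)^n-1}.\tag{13}$$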
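The tensorization identity for the $\chi^2$ distance and its link with total variation, which drive the passage from one observation to $n$ observations, can be checked by brute force on a finite sample space. This is a small numerical sanity check under the assumption of discrete distributions given as probability vectors with strictly positive reference weights; the helper names are hypothetical.

```python
import math
from itertools import product


def chi2(p, q):
    """Chi-square distance of p with respect to q (q assumed strictly positive)."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))


def total_variation(p, q):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def product_pmf(p, n):
    """Probability vector of n independent copies of p, listed in lexicographic order."""
    return [math.prod(p[i] for i in idx) for idx in product(range(len(p)), repeat=n)]
```

For instance, with `p = [0.5, 0.3, 0.2]`, `q = [0.25, 0.25, 0.5]` and `n = 3`, `chi2(product_pmf(p, 3), product_pmf(q, 3))` coincides with `(1 + chi2(p, q)) ** 3 - 1` up to rounding, and the total variation distance between the two product laws stays below the square root of that quantity.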

The fact that and vanish as is a consequence of Lemma 10 joined with the fact that if then

$$\lim_{\eta\to 0}\frac{1}{\lambda_{\eta,\varepsilon}}\Big(\int_{\eta<|x|<\varepsilon}x\,\nu(dx)\Big)^2=0,\tag{14}$$

as proved in Lemma 2. Finally, for any , having (11) vanishes and we have:

 ∥M⊗nΔ(ε)−N(0,Δσ2(ε))⊗n∥TV ≤liminfη→0∥M⊗nΔ(ε)−MΔ(η,ε)⊗n∥TV