 # On the entropy numbers of the mixed smoothness function classes

The behavior of the entropy numbers of classes of multivariate functions with mixed smoothness is studied here. This problem has a long history and some fundamental problems in the area are still open. The main goal of this paper is to develop a new method of proving upper bounds for the entropy numbers. The method is based on recent developments in nonlinear approximation, in particular on greedy approximation, and follows a two-step strategy. In the first step we obtain bounds for the best m-term approximations with respect to a dictionary. In the second step we use general inequalities relating the entropy numbers to the best m-term approximations. For the lower bounds we use the volume estimates method, a well-known and powerful method for proving lower bounds for the entropy numbers that was used in a number of previous papers.


## 1 Introduction

The behavior of the entropy numbers of classes of multivariate functions with mixed smoothness is studied here. This problem has a long history and some fundamental problems in the area are still open. The main goal of this paper is to develop a new method of proving upper bounds for the entropy numbers. The method is based on recent developments in nonlinear approximation, in particular on greedy approximation, and follows a two-step strategy. In the first step we obtain bounds for the best $m$-term approximations with respect to a dictionary. In the second step we use general inequalities relating the entropy numbers to the best $m$-term approximations. For the lower bounds we use the volume estimates method, a well-known and powerful method for proving lower bounds for the entropy numbers that was used in a number of previous papers. Since there are fundamental open problems in the area, we give a detailed discussion of known results and of open problems. We also comment on the techniques that were used to obtain the known results. Then we formulate our new results and compare them to the known ones.

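Let $X$ be a Banach space and let $B_X$ denote the unit ball of $X$ with center at $0$. Denote by $B_X(y,r)$ the ball with center $y$ and radius $r$: $B_X(y,r) := \{x\in X : \|x-y\|\le r\}$. For a compact set $A$ and a positive number $\varepsilon$ we define the covering number $N_\varepsilon(A)$ as follows

$$N_\varepsilon(A) := N_\varepsilon(A,X) := \min\{n : \exists y^1,\dots,y^n : A\subseteq \cup_{j=1}^n B_X(y^j,\varepsilon)\}.$$

It is convenient to consider along with the entropy $H_\varepsilon(A,X) := \log_2 N_\varepsilon(A,X)$ the entropy numbers $\varepsilon_k(A,X)$:

$$\varepsilon_k(A,X) := \inf\{\varepsilon : \exists y^1,\dots,y^{2^k}\in X : A\subseteq \cup_{j=1}^{2^k} B_X(y^j,\varepsilon)\}.$$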
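To make the two definitions concrete, here is a minimal Python sketch for a finite point set in the Euclidean norm. It is an illustration only and not part of the paper's argument: the helper names `greedy_cover` and `entropy_number_upper` are hypothetical, the centers are restricted to the points of the set, and the greedy procedure yields only upper bounds for $N_\varepsilon(A)$ and $\varepsilon_k(A,X)$.

```python
import math

def greedy_cover(points, eps):
    """Greedy eps-cover of a finite set; centers are chosen among the points.

    len(result) is an upper bound for the covering number N_eps(points).
    """
    centers = []
    for p in points:
        # open a new ball only if p is farther than eps from every center so far
        if all(math.dist(p, c) > eps for c in centers):
            centers.append(p)
    return centers

def entropy_number_upper(points, k, radii):
    """Smallest candidate radius whose greedy cover uses at most 2**k balls.

    Any such radius is an upper bound for the entropy number eps_k(points).
    Returns None if no candidate radius works.
    """
    good = [eps for eps in radii if len(greedy_cover(points, eps)) <= 2 ** k]
    return min(good) if good else None

# Eleven equally spaced points on the segment [0, 1] embedded in the plane.
segment = [(i / 10, 0.0) for i in range(11)]
cover = greedy_cover(segment, 0.25)
bound = entropy_number_upper(segment, 2, [0.15, 0.25, 0.55])
```

Since $2^k$ balls are allowed in the definition of $\varepsilon_k$, increasing $k$ by one doubles the budget of balls, which is why the entropy numbers are a convenient inverse of the map $\varepsilon\mapsto\log_2 N_\varepsilon(A,X)$.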

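Let $F_r(x,\alpha)$ be the univariate Bernoulli kernels

$$F_r(x,\alpha) := 1+2\sum_{k=1}^{\infty}k^{-r}\cos(kx-\alpha\pi/2).$$

For $x=(x_1,\dots,x_d)$ and $\alpha=(\alpha_1,\dots,\alpha_d)$ we define

$$F_r(x,\alpha) := \prod_{i=1}^{d}F_r(x_i,\alpha_i)$$

and

$$W^r_{q,\alpha} := \{f : f=F_r(\cdot,\alpha)*\varphi,\ \|\varphi\|_q\le 1\}$$

where $*$ means convolution. In the univariate case we use the notation .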
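The kernels above are easy to evaluate numerically by truncating the series. The following Python sketch does this; the function names and the truncation parameter `terms` are assumptions made for illustration, and the truncated sum is only an approximation of the kernel (the tail is of order `terms**(1-r)` for $r>1$).

```python
import math

def bernoulli_kernel(x, alpha, r, terms=2000):
    """Truncated univariate Bernoulli kernel
    F_r(x, alpha) = 1 + 2 * sum_{k>=1} k^(-r) * cos(k*x - alpha*pi/2)."""
    return 1.0 + 2.0 * sum(
        k ** (-r) * math.cos(k * x - alpha * math.pi / 2)
        for k in range(1, terms + 1)
    )

def bernoulli_kernel_multi(xs, alphas, r, terms=2000):
    """Multivariate kernel: the product of univariate kernels over coordinates."""
    return math.prod(
        bernoulli_kernel(x, a, r, terms) for x, a in zip(xs, alphas)
    )
```

For $r=2$ and $\alpha=0$ the classical sums $\sum_k k^{-2}=\pi^2/6$ and $\sum_k (-1)^k k^{-2}=-\pi^2/12$ give $F_2(0,0)=1+\pi^2/3$ and $F_2(\pi,0)=1-\pi^2/6$, which is a convenient sanity check for the truncation.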

It is well known that in the univariate case

$$\varepsilon_k(W^r_{q,\alpha},L_p)\asymp k^{-r} \tag{1.1}$$

holds for all and . We note that condition is a necessary and sufficient condition for compact embedding of $W^r_{q,\alpha}$ into $L_p$. Thus (1.1) provides a complete description of the rate of $\varepsilon_k(W^r_{q,\alpha},L_p)$ in the univariate case. We point out that (1.1) shows that the rate of decay of $\varepsilon_k(W^r_{q,\alpha},L_p)$ depends only on $r$ and does not depend on $q$ and $p$. In this sense the strongest upper bound (for ) is and the strongest lower bound is .

There are different generalizations of classes to the case of multivariate functions. In this section we only discuss known results for classes of functions with bounded mixed derivative. For further discussions see , Chapter 3 and .

The following two theorems are from  and .

###### Theorem 1.1.

For and one has

$$\varepsilon_k(W^r_{q,\alpha},L_p)\ll k^{-r}(\log k)^{r(d-1)}.$$

###### Theorem 1.2.

For and one has

$$\varepsilon_k(W^r_{q,\alpha},L_1)\gg k^{-r}(\log k)^{r(d-1)}.$$

The problem of estimating has a long history. The first result on the right order of was obtained by Smolyak . Later (see ,  and theorems above) it was established that

$$\varepsilon_k(W^r_{q,\alpha},L_p)\asymp k^{-r}(\log k)^{r(d-1)} \tag{1.2}$$

holds for all , . The case , was established by Dinh Dung . Belinskii  extended (1.2) to the case when .

It is known in approximation theory (see ) that investigation of asymptotic characteristics of classes in becomes more difficult when or takes value or than when . It turns out to be the case for too. It was discovered that in some of these extreme cases ( or equals or ) relation (1.2) holds and in other cases it does not hold. We describe the picture in detail. It was proved in  that (1.2) holds for , , . It was also proved that (1.2) holds for , (see  for and  for ). Summarizing, we state that (1.2) holds for and , for all (with appropriate restrictions on ). This easily implies that (1.2) also holds for , . For all other pairs , namely, for , and , the rate of is not known in the case . It is an outstanding open problem.

In the case this problem is essentially solved. We now cite the corresponding results. The first result on the right order of in the case was obtained by Kuelbs and Li  for , . It was proved in  that

$$\varepsilon_k(W^r_{q,\alpha},L_\infty)\asymp k^{-r}(\log k)^{r+1/2} \tag{1.3}$$

holds for , . We note that the upper bound in (1.3) was proved under condition and the lower bound in (1.3) was proved under condition . Belinskii  proved the upper bound in (1.3) for under condition . Relation (1.3) for under assumption was proved in .

The case , was settled by Kashin and Temlyakov . The authors proved that

$$\varepsilon_k(W^r_{1,\alpha},L_p)\asymp k^{-r}(\log k)^{r+1/2} \tag{1.4}$$

holds for , and

$$\varepsilon_k(W^r_{1,0},L_\infty)\asymp k^{-r}(\log k)^{r+1},\quad r>1. \tag{1.5}$$

Let us make an observation on the base of the above discussion. In the univariate case the entropy numbers have the same order of decay with respect to for all pairs , . In the case we have three different orders of decay of which depend on the pair . For instance, in the case it is , in the case , , it is and in the case , it is .

We discussed above results on the right order of decay of the entropy numbers. Clearly, each order relation is a combination of the upper bound and the matching lower bound . We now briefly discuss methods that were used for proving upper and lower bounds. The upper bounds in Theorem 1.1 were proved by the standard method of reduction by discretization to estimates of the entropy numbers of finite-dimensional sets. Here results of ,  or  are applied. It is clear from the above discussion that it was sufficient to prove the lower bound in (1.2) in the case . The proof of this lower bound (see Theorem 1.2) is more difficult and is based on nontrivial estimates of the volumes of the sets of Fourier coefficients of bounded trigonometric polynomials. Theorem 2.4 (see below) plays a key role in this method.

An analogue of the upper bound in (1.3) for any was obtained by Belinskii : for and we have

$$\varepsilon_k(W^r_{q,\alpha},L_\infty)\ll k^{-r}(\log k)^{(d-1)r+1/2}. \tag{1.6}$$

That proof is based on Theorem 2.2 (see below).

Kuelbs and Li  discovered the fact that there is a tight relationship between small ball problem and the behavior of the entropy . Based on results obtained by Livshits and Tsirelson , by Bass , and by Talagrand  for the small ball problem, they proved

$$\varepsilon_k(W^1_{2,\alpha},L_\infty)\asymp k^{-1}(\ln k)^{3/2}. \tag{1.7}$$

Proof of the most difficult part of (1.7) – the lower bound – is based on a special inequality, known now as the Small Ball Inequality, for the Haar polynomials proved by Talagrand  (see  for a simple proof).

We discussed above known results on the rate of decay of . In the case the picture is almost complete. In the case the situation is fundamentally different. The problem of the right order of decay of is still open for , and , . In particular, it is open in the case , , that is related to the small ball problem. We discuss in more detail the case , . We pointed out above that in the case the proof of lower bounds (the most difficult part) was based on the Small Ball Inequalities for the Haar system for and for the trigonometric system for all . The existing conjecture is that

$$\varepsilon_k(W^r_{q,\alpha},L_\infty)\asymp k^{-r}(\ln k)^{(d-1)r+1/2},\quad 1 \tag{1.8}$$

for large enough . The upper bound in (1.8) follows from (1.6). It is known that the corresponding lower bound in (1.8) would follow from the -dimensional version of the Small Ball Inequality for the trigonometric system.

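The main goal of this paper is to develop new techniques for proving upper bounds for the entropy numbers. We consider here slightly more general classes than the classes $W^r_{q,\alpha}$. Let $s=(s_1,\dots,s_d)$ be a vector with nonnegative integer coordinates ($s\in\mathbb{Z}^d_+$) and

$$\rho(s) := \{k=(k_1,\dots,k_d)\in\mathbb{Z}^d_+ : [2^{s_j-1}]\le |k_j|<2^{s_j},\ j=1,\dots,d\}$$

where $[a]$ denotes the integer part of a number $a$. Define for $f\in L_1$

$$\delta_s(f) := \sum_{k\in\rho(s)}\hat f(k)e^{i(k,x)},$$

and

$$f_l := \sum_{\|s\|_1=l}\delta_s(f),\quad l\in\mathbb{N}_0,\quad \mathbb{N}_0 := \mathbb{N}\cup\{0\}.$$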

Consider the class (see )

$$W^{a,b}_q := \{f : \|f_l\|_q\le 2^{-al}(\bar l)^{(d-1)b}\},\quad \bar l := \max(l,1).$$

Define

$$\|f\|_{W^{a,b}_q} := \sup_l \|f_l\|_q 2^{al}(\bar l)^{-(d-1)b}.$$

It is well known that the class $W^r_{q,\alpha}$ is embedded in the class $W^{r,0}_q$ for . Classes $W^{a,b}_q$ provide control of smoothness at two scales: $a$ controls the power type smoothness and $b$ controls the logarithmic scale smoothness. Similar classes with the power and logarithmic scales of smoothness are studied in the recent book of Triebel . Here is one more class, which is equivalent to $W^{a,b}_q$ in the case (see ). Consider a class $\bar W^{a,b}_q$, which consists of functions $f$ with a representation (see Subsection 2.2 below for the definition of $T(Q_n)$)

$$f=\sum_{n=1}^{\infty}t_n,\quad t_n\in T(Q_n),\quad \|t_n\|_q\le 2^{-an}n^{b(d-1)}.$$

In the case classes are wider than .

The main results of the paper are the following theorems in the case for the extreme values of and . First, we formulate two theorems for the case .

###### Theorem 1.3.

Let and . Then for

$$\varepsilon_k(W^{a,b}_1,L_p)\asymp\varepsilon_k(\bar W^{a,b}_1,L_p)\asymp k^{-a}(\log k)^{a+b}. \tag{1.9}$$

###### Theorem 1.4.

Let and . Then

$$\varepsilon_k(W^{a,b}_1,L_\infty)\asymp\varepsilon_k(\bar W^{a,b}_1,L_\infty)\asymp k^{-a}(\log k)^{a+b+1/2}. \tag{1.10}$$

Second, we formulate three theorems for the case .

###### Theorem 1.5.

We have for all

$$\varepsilon_k(W^{a,b}_\infty,L_1)\gg k^{-a}(\log k)^{(d-1)(a+b)-1/2}. \tag{1.11}$$

###### Theorem 1.6.

We have for , ,

$$\varepsilon_k(W^{a,b}_\infty,L_p)\asymp\varepsilon_k(\bar W^{a,b}_\infty,L_p)\asymp k^{-a}(\log k)^{a+b-1/2}. \tag{1.12}$$

###### Theorem 1.7.

We have for all , ,

$$\varepsilon_k(W^{a,b}_q,L_q)\asymp\varepsilon_k(\bar W^{a,b}_q,L_q)\asymp k^{-a}(\log k)^{(d-1)(a+b)}. \tag{1.13}$$

Let us make some comments on Theorem 1.3. As we already mentioned above classes are close to classes but they are different. We show that they are different even in the sense of asymptotic behavior of their entropy numbers. We point out that the right order of is not known for . We confine ourselves to the case . It is proved in  that for

$$\varepsilon_k(W^r_{1,\alpha},L_p)\asymp k^{-r}(\log k)^{r+1/2},\quad 1\le p<\infty. \tag{1.14}$$

Theorem 1.3 gives for

$$\varepsilon_k(W^{r,0}_1,L_p)\asymp k^{-r}(\log k)^{r},\quad 1\le p<\infty. \tag{1.15}$$

This shows that in the sense of the entropy numbers class is smaller than . It is interesting to compare (1.14) and (1.15) with the known estimates in the case

$$\varepsilon_k(W^r_{q,\alpha},L_p)\asymp\varepsilon_k(W^{r,0}_q,L_p)\asymp k^{-r}(\log k)^{r},\quad 1\le p<\infty. \tag{1.16}$$

Relation (1.16) is for the case . The general case of is also known in this case (see (1.2) and its discussion above and also see Section 3.6 of  for the corresponding results and historical comments). Relations (1.15) and (1.16) show that in the sense of entropy numbers the class behaves as a limiting case of classes when .

The proof of the upper bounds in Theorems 1.3 and 1.4 is based on a greedy approximation technique. It is a new and powerful technique. In particular, Theorem 1.4 gives the same upper bound as in (1.6) for the class , which is wider than any of the classes , , from (1.6). In Section 7 we develop the new technique mentioned above, which is based on nonlinear $m$-term approximations, to prove the following result.

###### Theorem 1.8.

Let and . Then

$$\varepsilon_k(W^{a,b}_q,L_\infty)\ll k^{-a}(\log k)^{(d-1)(a+b)+1/2}. \tag{1.17}$$

In particular, Theorem 1.8 implies (1.6).

Theorem 1.6 discovers an interesting new phenomenon. Comparing (1.12) with (1.13), we see that the entropy numbers of the class in the space have different rate of decay in cases and . We note that in the proof of the upper bounds in this new phenomenon we use the Riesz products for the hyperbolic crosses. This technique works well in the case but we do not know how to extend it to the general case . This difficulty is of the same nature as the corresponding difficulty in generalizing the Small Ball Inequality from to (see , Ch. 3, for further discussion). We already mentioned above that in studying the entropy numbers of function classes the discretization technique is useful. Classically, the Marcinkiewicz theorem serves as a powerful tool for discretizing the -norm of a trigonometric polynomial. It works well in the multivariate case for trigonometric polynomials with frequencies from a parallelepiped. However, there is no analog of Marcinkiewicz’ theorem for hyperbolic cross polynomials (see  and , Section 2.5, for a discussion). Thus, in Sections 5–7 we develop a new technique for estimating the entropy numbers of the unit balls of the hyperbolic cross polynomials. The most interesting results are obtained in the dimension . It would be very interesting to extend these results to the case . It is a challenging open problem.

Finally, we emphasize that in the case , when the classes are embedded in the classes , the new technique, developed in this paper, provides all known upper bounds. In the case Theorem 5.1 gives the upper bounds in (1.2), and in the case Theorem 1.8 gives the upper bounds in (1.6).

## 2 Known results

### 2.1 General inequalities

For the reader’s convenience we collect in this section known results, which will be used in this paper. The reader can find results of this subsection, except Theorem 2.3, and their proofs in , Chapter 3.

###### Proposition 2.1.

Let $A\subset Y$, and let $Y$ be a subspace of $X$. Then

$$N_\varepsilon(A,X)\ge N_{2\varepsilon}(A,Y).$$

Let us consider the space $\mathbb{R}^D$ equipped with different norms, say, norms $\|\cdot\|_X$ and $\|\cdot\|_Y$. For a Lebesgue measurable set $E\subset\mathbb{R}^D$ we denote its Lebesgue measure by $\operatorname{vol}(E)$.

###### Theorem 2.1.

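For any two norms $X$ and $Y$ and any $\varepsilon>0$ we have

$$\frac{1}{\varepsilon^{D}}\frac{\operatorname{vol}(B_Y)}{\operatorname{vol}(B_X)}\le N_\varepsilon(B_Y,X)\le\frac{\operatorname{vol}(B_Y(0,2/\varepsilon)\oplus B_X)}{\operatorname{vol}(B_X)}. \tag{2.1}$$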

Let us formulate one immediate corollary of Theorem 2.1.

###### Corollary 2.1.

For any $D$-dimensional real Banach space $X$ we have

$$\varepsilon^{-D}\le N_\varepsilon(B_X,X)\le(1+2/\varepsilon)^{D},$$

and, therefore,

$$\varepsilon_k(B_X,X)\le 3(2^{-k/D}).$$
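The step from the covering bound to the entropy-number bound is short; one way to carry it out is sketched below. This is our reconstruction of a standard argument, not a quotation of the source's proof. If $2^{k/D}\ge 3$, choose $\varepsilon := 2/(2^{k/D}-1)$, so that $(1+2/\varepsilon)^D = 2^k$ and $B_X$ is covered by at most $2^k$ balls of radius $\varepsilon$. If $2^{k/D}<3$, use the trivial cover of $B_X$ by itself.

$$
\begin{aligned}
2^{k/D}\ge 3:&\quad \varepsilon_k(B_X,X)\le\frac{2}{2^{k/D}-1}\le 3\cdot 2^{-k/D}
\quad\text{since}\quad 2\cdot 2^{k/D}\le 3\cdot 2^{k/D}-3,\\
2^{k/D}<3:&\quad \varepsilon_k(B_X,X)\le 1<3\cdot 2^{-k/D}.
\end{aligned}
$$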

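Let $\|\cdot\|_2$ denote the $\ell_2^D$ norm and let $B_2^D$ be a unit ball in $\ell_2^D$. Denote by $S^{D-1}$ the boundary of $B_2^D$. We define by $d\sigma(x)$ the normalized $(D-1)$-dimensional measure on $S^{D-1}$. Consider another norm $\|\cdot\|$ on $\mathbb{R}^D$ and denote by $X$ the space $\mathbb{R}^D$ equipped with $\|\cdot\|$.

###### Theorem 2.2.

Let $X$ be $\mathbb{R}^D$ equipped with $\|\cdot\|$ and

$$M_X := \int_{S^{D-1}}\|x\|\,d\sigma(x).$$

Then we have

$$\varepsilon_k(B_2^D,X)\ll M_X\begin{cases}(D/k)^{1/2}, & k\le D\\ 2^{-k/D}, & k\ge D.\end{cases}$$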

The following Nikol’skii-type inequalities are known (see , Chapter 1, Section 2).

###### Theorem 2.3.

Let . For any $t\in T(Q_n)$ (see Subsection 2.2 below for the definition of $T(Q_n)$) we have

$$\|t\|_p\le C(q,p,d)2^{\beta n}\|t\|_q,\quad \beta := 1/q-1/p.$$

### 2.2 Volume estimates

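Denote for a natural number $n$

$$Q_n := \cup_{\|s\|_1\le n}\rho(s);\qquad \Delta Q_n := Q_n\setminus Q_{n-1}=\cup_{\|s\|_1=n}\rho(s)$$

with for . We call a set $\Delta Q_n$ a hyperbolic layer. For a set $\Lambda\subset\mathbb{Z}^d$ denote

$$T(\Lambda) := \{f\in L_1 : \hat f(k)=0,\ k\in\mathbb{Z}^d\setminus\Lambda\},\qquad T(\Lambda)_p := \{f\in T(\Lambda) : \|f\|_p\le 1\}.$$

For a finite set $\Lambda$ we assign to each $f\in T(\Lambda)$ a vector

$$A(f) := \{(\operatorname{Re}\hat f(k),\operatorname{Im}\hat f(k)),\ k\in\Lambda\}\in\mathbb{R}^{2|\Lambda|}$$

where $|\Lambda|$ denotes the cardinality of $\Lambda$ and define

$$B_\Lambda(L_p) := \{A(f) : f\in T(\Lambda)_p\}.$$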

The volume estimates of the sets and related questions have been studied in a number of papers: the case , in ; the case , in , . In the case , , the following estimates are known.

###### Theorem 2.4.

For any we have

$$(\operatorname{vol}(B_{\Pi(N,d)}(L_p)))^{(2|\Pi(N,d)|)^{-1}}\asymp|\Pi(N,d)|^{-1/2},$$

with constants in that may depend only on .

We note that the most difficult part of Theorem 2.4 is the lower estimate for . The corresponding estimate was proved in the case in  and in the general case in  and  by a method different from the one in . The upper estimate for in Theorem 2.4 can be easily reduced to the volume estimate for an octahedron (see, for instance ). In the case Theorem 2.4 is a direct corollary of the well known estimates of the volume of the Euclidean unit ball.

The case of arbitrary and was studied in . The results of  imply the following estimate.

###### Theorem 2.5.

For any finite set and any we have

$$\operatorname{vol}(B_\Lambda(L_p))^{(2|\Lambda|)^{-1}}\asymp|\Lambda|^{-1/2}.$$

The following result was obtained in .

###### Theorem 2.6.

Let have the form , is a finite set. Then for any we have

$$\operatorname{vol}(B_\Lambda(L_p))^{(2|\Lambda|)^{-1}}\asymp|\Lambda|^{-1/2}.$$

In particular, Theorem 2.6 implies for and that

$$(\operatorname{vol}(B_{\Delta Q_n}(L_p)))^{(2|\Delta Q_n|)^{-1}}\asymp|\Delta Q_n|^{-1/2}\asymp(2^n n)^{-1/2}. \tag{2.2}$$
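The last relation in (2.2) rests on the cardinality of the hyperbolic layer $\Delta Q_n$. The Python sketch below enumerates the dyadic blocks $\rho(s)$ and the layer directly. It assumes that frequencies of both signs are counted (as the condition on $|k_j|$ suggests); the function names are hypothetical. Under this assumption $|\rho(s)|=2^{\|s\|_1}$ and $|\Delta Q_n|=2^n\binom{n+d-1}{d-1}$, which equals $(n+1)2^n\asymp 2^n n$ for $d=2$.

```python
from itertools import product

def dyadic_block_1d(s):
    """Integers k with [2^(s-1)] <= |k| < 2^s; for s = 0 this is {0}."""
    lo = 0 if s == 0 else 2 ** (s - 1)
    hi = 2 ** s
    return [k for k in range(-hi + 1, hi) if lo <= abs(k) < hi]

def rho(s):
    """Dyadic block rho(s): product of the one-dimensional blocks."""
    return list(product(*(dyadic_block_1d(sj) for sj in s)))

def hyperbolic_layer(n, d):
    """Delta Q_n: union of rho(s) over all s >= 0 with s_1 + ... + s_d = n."""
    layer = []
    for s in product(range(n + 1), repeat=d):
        if sum(s) == n:
            layer.extend(rho(s))
    return layer
```

For example, `len(hyperbolic_layer(3, 2))` is $4\cdot 2^3$, in line with the $2^n n$ growth used in (2.2).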

The following result was obtained in . Denote $D := 2|\Delta Q_n|$.

###### Theorem 2.7.

In the case we have

$$(\operatorname{vol}(B_{\Delta Q_n}(L_\infty)))^{1/D}\asymp(2^n n^2)^{-1/2}. \tag{2.3}$$

The following lemma from  is an important ingredient of analysis in this paper. For the reader’s convenience we give a proof of this lemma here.

###### Lemma 2.1.

Let and . Then

$$(\operatorname{vol}(B_\Lambda(L_\infty)))^{1/N}\ge C(d)(Nn)^{-1/2}.$$

###### Proof.

We use the following result of E. Gluskin .

###### Theorem 2.8.

Let , , and

$$W(Y) := \{x\in\mathbb{R}^N : |(x,y_i)|\le 1,\ i=1,\dots,M\}.$$

Then

$$(\operatorname{vol}(W(Y)))^{1/N}\ge C(1+\log(M/N))^{-1/2}.$$

Consider the following lattice on the :

$$G_n := \{x(l)=(l_1,\dots,l_d)\pi 2^{-n-1},\ 1\le l_j\le 2^{n+2},\ l_j\in\mathbb{N},\ j=1,\dots,d\}.$$

It is clear that . It is well known (see , Ch.2, Theorem 2.4) that for any one has

$$\|f\|_\infty\le C_1(d)\max_{x\in G_n}|f(x)|.$$

Thus, for any we have

$$\{A(f) : f\in T(\Lambda),\ |f(x)|\le C_1(d)^{-1},\ x\in G_n\}\subseteq B_\Lambda(L_\infty). \tag{2.4}$$

Further

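$$\begin{aligned}
|f(x)|^2 &= \Big|\sum_{k\in\Lambda}\hat f(k)e^{i(k,x)}\Big|^2\\
&= \Big(\sum_{k\in\Lambda}\operatorname{Re}\hat f(k)\cos(k,x)-\operatorname{Im}\hat f(k)\sin(k,x)\Big)^2\\
&\quad+\Big(\sum_{k\in\Lambda}\operatorname{Re}\hat f(k)\sin(k,x)+\operatorname{Im}\hat f(k)\cos(k,x)\Big)^2.
\end{aligned}$$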

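We associate with each point $x\in G_n$ two vectors $y^1(x)$ and $y^2(x)$ from $\mathbb{R}^N$:

$$y^1(x) := \{(\cos(k,x),-\sin(k,x)),\ k\in\Lambda\},$$
$$y^2(x) := \{(\sin(k,x),\cos(k,x)),\ k\in\Lambda\}.$$

Then

$$\|y^1(x)\|^2=\|y^2(x)\|^2=|\Lambda|$$

and

$$|f(x)|^2=(A(f),y^1(x))^2+(A(f),y^2(x))^2.$$

It is clear that the condition $|f(x)|\le C_1(d)^{-1}$ is satisfied if

$$|(A(f),y^i(x))|\le 2^{-1/2}C_1(d)^{-1},\quad i=1,2.$$

Let now

$$Y := \{y^i(x)/\|y^i(x)\|,\ x\in G_n,\ i=1,2\}.$$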

Then and by Theorem 2.8

$$(\operatorname{vol}(W(Y)))^{1/N}\gg(1+\log(M/N))^{-1/2}\gg n^{-1/2}. \tag{2.5}$$

Using that the condition

$$|(A(f),y^i(x))|\le 1$$

is equivalent to the condition

$$|(A(f),y^i(x)/\|y^i(x)\|)|\le(N/2)^{-1/2}$$

we get from (2.4) and (2.5)

$$(\operatorname{vol}(B_\Lambda(L_\infty)))^{1/N}\gg(Nn)^{-1/2}.$$

This completes the proof of Lemma 2.1.
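The algebraic identity at the heart of this proof, namely that $|f(x)|^2$ is the sum of the squares of two inner products of the coefficient vector $A(f)$ with vectors of squared norm $|\Lambda|$, can be checked numerically. The following Python sketch is an illustration only; the function name and the sample data in the usage lines are hypothetical.

```python
import cmath
import math
import random

def lemma_identity_terms(freqs, coeffs, x):
    """Return (|f(x)|^2, (A,y1)^2 + (A,y2)^2, |y1|^2, |y2|^2).

    freqs  : list of integer d-tuples k (the set Lambda)
    coeffs : list of complex Fourier coefficients, one per frequency
    x      : d-tuple, the evaluation point
    """
    def dot(k):
        return sum(kj * xj for kj, xj in zip(k, x))

    def inner(u, v):
        return sum(a * b for a, b in zip(u, v))

    # f(x) = sum over k of coefficient(k) * exp(i (k, x))
    f_x = sum(c * cmath.exp(1j * dot(k)) for k, c in zip(freqs, coeffs))
    # A(f): real and imaginary parts of the coefficients, interleaved
    A = [t for c in coeffs for t in (c.real, c.imag)]
    y1 = [t for k in freqs for t in (math.cos(dot(k)), -math.sin(dot(k)))]
    y2 = [t for k in freqs for t in (math.sin(dot(k)), math.cos(dot(k)))]
    return (abs(f_x) ** 2,
            inner(A, y1) ** 2 + inner(A, y2) ** 2,
            inner(y1, y1),
            inner(y2, y2))

random.seed(0)
sample_freqs = [(1, 0), (0, 1), (2, -1), (-3, 2)]
sample_coeffs = [complex(random.uniform(-1, 1), random.uniform(-1, 1))
                 for _ in sample_freqs]
lhs, rhs, n1, n2 = lemma_identity_terms(sample_freqs, sample_coeffs, (0.3, 1.1))
```

Here `lhs` and `rhs` agree up to rounding, and `n1`, `n2` both equal the number of frequencies, which is the source of the factor $(N/2)^{-1/2}$ in the normalization step of the proof.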

## 3 New lower bounds. The volumes technique

In this section we prove the lower bounds in Theorems 1.3–1.7 from the Introduction.

Proof of lower bounds in Theorems 1.3 and 1.7. The lower bound in Theorem 1.3 follows from the lower bound in Theorem 1.7 with . We prove the lower bounds for the with and any . This lower bound is derived from the well known simple inequality (see Corollary 2.1 above)

$$N_\varepsilon(B_X,X)\ge\varepsilon^{-D} \tag{3.1}$$

for any $D$-dimensional real Banach space $X$. Consider as a Banach space $X$ the space $T(\Delta Q_n)$ with the $L_q$ norm. Clearly, it can be seen as a $D$-dimensional real Banach space with $D=2|\Delta Q_n|$. It follows from the definition of $W^{a,b}_q$ that

$$2^{-an}n^{b(d-1)}T(\Delta Q_n)_q\subset W^{a,b}_q. \tag{3.2}$$

Take . Then (3.1) implies that

$$\varepsilon_k(T(\Delta Q_n)_q,L_q\cap T(\Delta Q_n))\gg 1. \tag{3.3}$$

We now use one more well known fact from the entropy theory – Proposition 2.1. This and inequality (3.3) imply

$$\varepsilon_k(T(\Delta Q_n)_q,L_q)\gg 1. \tag{3.4}$$

Taking into account (3.2) and the fact we derive from (3.4) the required lower bound for the .

The lower bounds in Theorems 1.3 and 1.7 are proved.
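The chain (3.1)–(3.4) can be spelled out as follows. This is a sketch under two assumptions that are not legible in the text above: that the choice of $k$ is $k=D=2|\Delta Q_n|$, and that $|\Delta Q_n|\asymp 2^n n^{d-1}$ (the standard count for a hyperbolic layer). If $2^D$ balls of radius $\varepsilon$ cover $B_X$, then (3.1) forces $2^D\ge\varepsilon^{-D}$, and Proposition 2.1 costs at most a factor $2$ when passing from the subspace to $L_q$.

$$
\begin{aligned}
&\varepsilon_D(T(\Delta Q_n)_q,\,L_q\cap T(\Delta Q_n))\ge\tfrac12,\qquad
\varepsilon_D(T(\Delta Q_n)_q,\,L_q)\ge\tfrac14,\\
&\varepsilon_D(W^{a,b}_q,L_q)\ge\tfrac14\,2^{-an}n^{b(d-1)}\quad\text{by (3.2)},\\
&k=D\asymp 2^n n^{d-1}\ \Rightarrow\ \log k\asymp n,\quad 2^{-an}\asymp k^{-a}(\log k)^{a(d-1)},\\
&\varepsilon_k(W^{a,b}_q,L_q)\gg k^{-a}(\log k)^{a(d-1)}(\log k)^{b(d-1)}=k^{-a}(\log k)^{(d-1)(a+b)}.
\end{aligned}
$$

The last line matches the lower bound in (1.13) for $k$ of the form $2|\Delta Q_n|$; the monotonicity of $\varepsilon_k$ in $k$ extends it to all $k$.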

Proof of lower bounds in Theorem 1.4. We prove the lower bound for . This proof is somewhat similar to the proof of lower bounds in Theorem 1.3. Instead of (3.1) we now use the inequality (see Theorem 2.1 above)

$$N_\varepsilon(B_Y,X)\ge\varepsilon^{-D}\frac{\operatorname{vol}(B_Y)}{\operatorname{vol}(B_X)} \tag{3.5}$$

with and . It follows from the definition of that

$$2^{-an}n^{b(d-1)}T(\Delta Q_n)_1\subset W^{a,b}_1. \tag{3.6}$$

Take . Then (3.5), Theorem 2.7, and (2.2) imply that

$$\varepsilon_k(T(\Delta Q_n)_1,L_\infty\cap T(\Delta Q_n))\gg n^{1/2}. \tag{3.7}$$

Proposition 2.1 and inequality (3.7) imply

$$\varepsilon_k(T(\Delta Q_n)_1,L_\infty)\gg n^{1/2}. \tag{3.8}$$

Taking into account (3.6) and the fact we derive from (3.8) the required lower bound for the .

The lower bounds in Theorem 1.4 are proved.

Proof of Theorem 1.5. We prove the lower bound for . This proof goes along the lines of the above proof of lower bounds in Theorem 1.4. We use (3.5) with and . It follows from the definition of that

$$2^{-an}n^{b(d-1)}T(\Delta Q_n)_\infty\subset W^{a,b}_\infty. \tag{3.9}$$

Take . Then (3.5), Lemma 2.1 with , and Theorem 2.5 with imply that

$$\varepsilon_k(T(\Delta Q_n)_\infty,L_1\cap T(\Delta Q_n))\gg n^{-1/2}. \tag{3.10}$$

Proposition 2.1 and inequality (3.10) imply

$$\varepsilon_k(T(\Delta Q_n)_\infty,L_1)\gg n^{-1/2}. \tag{3.11}$$

Taking into account (3.9) and the fact we derive from (3.11) the required lower bound for the .

The lower bounds in Theorem 1.5 are proved.

Proof of lower bounds in Theorem 1.6. The required lower bounds follow from Theorem 1.5.

## 4 Upper bounds. A general scheme

From finite dimensional to infinite dimensional. Let $X$ and $Y$ be two Banach spaces. We discuss the problem of estimating the entropy numbers of an approximation class, defined in the space $X$, in the norm of the space $Y$. Suppose a sequence of finite dimensional subspaces $X_n\subset X$, $n=1,2,\dots$, is given. Define the following class

$$\bar W^{a,b}_X := \bar W^{a,b}_X\{X_n\} := \Big\{f\in X : f=\sum_{n=1}^{\infty}f_n,\ f_n\in X_n,\ \|f_n\|_X\le 2^{-an}n^{b},\ n=1,2,\dots\Big\}.$$

In particular,

$$\bar W^{a,b}_q=\bar W^{a,b(d-1)}_{L_q}\{T(Q_n)\}.$$

Denote and assume that for the unit balls we have the following upper bounds for the entropy numbers: there exist real and nonnegative and such that

 εk(B(Xn),Y)≪nα{(Dn/(k+1))β(log(4Dn/(k+1)))