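Let $F \in \mathbb{C}^{D \times D}$ denote the $D \times D$
discrete Fourier transform matrix: its
$(i,j)$th entry is
$$F_{i,j} = \frac{1}{\sqrt{D}}\,\omega^{(i-1)(j-1)},$$
where $\omega := \exp(-2\pi \mathrm{i}/D)$ is a primitive $D$th root of unity.
Let $\mu := F\beta$ for some $\beta \in \mathbb{C}^{D}$.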
Consider the following observation model:

S and T are independent and uniformly random subsets of [D] of cardinalities n and p, respectively.

We observe the $n \times p$ design matrix $F_{S,T}$ and the $n$-dimensional vector of responses $\mu_S$.
Here, $F_{S,T}$ is the submatrix of $F$ with rows from $S$ and columns from $T$, and $\mu_S$ is the subvector of $\mu$ of entries from $S$.
The learner fits regression coefficients $\hat\beta = (\hat\beta_1,\dots,\hat\beta_D)$ with
$$\hat\beta_T := F_{S,T}^{\dagger}\,\mu_S, \qquad \hat\beta_{T^c} := 0.$$
This can be regarded as a one-dimensional version of the random Fourier features model studied by [rahimi2008random] for functions defined on the unit circle.
One important property of the discrete Fourier transform matrix that we use is that the matrix $F_{A,B}$ has rank $\min\{|A|,|B|\}$ for any $A,B \subseteq [D]$.
This is a consequence of the fact that $F$ is Vandermonde.
Thus, for $p \ge n$, we have

$$F_{S,T}^{\dagger} = F_{S,T}^{*}\,(F_{S,T}F_{S,T}^{*})^{-1}.$$
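The rank property and the explicit pseudoinverse formula can be checked numerically. The following is a minimal sketch in NumPy, not part of the original analysis: the helper name `dft_matrix`, the sizes `D`, `n`, `p`, and the seed are illustrative assumptions. `D` is taken prime as a precaution, since that is the case in which every submatrix of the DFT matrix is known to have full rank (Chebotarev's theorem).

```python
import numpy as np

def dft_matrix(D):
    """Unitary D x D DFT matrix; 0-based entry (j, k) is omega^(j*k) / sqrt(D)."""
    idx = np.arange(D)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / D) / np.sqrt(D)

# Illustrative sizes (assumptions): prime D, and p >= n.
rng = np.random.default_rng(0)
D, n, p = 31, 5, 12
F = dft_matrix(D)

# S and T: independent, uniformly random subsets of {0, ..., D-1}.
S = rng.choice(D, size=n, replace=False)
T = rng.choice(D, size=p, replace=False)
F_ST = F[np.ix_(S, T)]

# Rank of the n x p submatrix (expected to be n when p >= n).
rank = np.linalg.matrix_rank(F_ST)

# Explicit formula F^* (F F^*)^{-1} versus NumPy's SVD-based pseudoinverse.
explicit = F_ST.conj().T @ np.linalg.inv(F_ST @ F_ST.conj().T)
pinv_matches = np.allclose(explicit, np.linalg.pinv(F_ST))
```

The comparison against `np.linalg.pinv` is only a consistency check; the explicit formula is valid whenever $F_{S,T}$ has full row rank.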

In the remainder of this section, we analyze the risk of $\hat\beta$ under a random model for $\beta$, where
$$\mathbb{E}[\beta] = 0, \qquad \mathbb{E}[\beta\beta^{*}] = \frac{1}{D}\,I$$
(which implies $\mathbb{E}[\|\beta\|^2] = 1$).
The random choice of $\beta$ is independent of $S$ and $T$.
Considering the risk under this random model for $\beta$ is a form of average-case analysis.
For simplicity, we only consider the regime where $p \ge n$, as it suffices to reveal some key aspects of the risk of $\hat\beta$.
Following the arguments from Section 2.1, we have

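$$\begin{aligned}
\|\beta - \hat\beta\|^2
&= \|\beta_{T^c}\|^2 + \|(I - F_{S,T}^{\dagger}F_{S,T})\beta_T\|^2 + \|F_{S,T}^{\dagger}F_{S,T^c}\beta_{T^c}\|^2 \\
&= \|\beta\|^2 - \|F_{S,T}^{\dagger}F_{S,T}\beta_T\|^2 + \|F_{S,T}^{\dagger}F_{S,T^c}\beta_{T^c}\|^2 .
\end{aligned}$$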
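The decomposition of the squared error can be verified on a single random draw. The sketch below is an illustration under stated assumptions: the function name `fit_min_norm`, the sizes, the seed, and the choice of a complex Gaussian $\beta$ with $\mathbb{E}|\beta_j|^2 = 1/D$ are not taken from the text, and the identity itself holds for any fixed $\beta$.

```python
import numpy as np

def dft_matrix(D):
    """Unitary D x D DFT matrix; 0-based entry (j, k) is omega^(j*k) / sqrt(D)."""
    idx = np.arange(D)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / D) / np.sqrt(D)

def fit_min_norm(F, beta, S, T):
    """beta_hat with beta_hat[T] = pinv(F[S, T]) @ mu[S] and zeros outside T."""
    mu = F @ beta
    beta_hat = np.zeros(F.shape[0], dtype=complex)
    beta_hat[T] = np.linalg.pinv(F[np.ix_(S, T)]) @ mu[S]
    return beta_hat

# Illustrative sizes (assumptions): prime D, and p >= n.
rng = np.random.default_rng(1)
D, n, p = 31, 6, 14
F = dft_matrix(D)
S = rng.choice(D, size=n, replace=False)
T = rng.choice(D, size=p, replace=False)
Tc = np.setdiff1d(np.arange(D), T)  # complement of T

# One draw of beta; complex Gaussian is an assumption made for the demo.
beta = (rng.standard_normal(D) + 1j * rng.standard_normal(D)) / np.sqrt(2 * D)
beta_hat = fit_min_norm(F, beta, S, T)

F_ST, F_STc = F[np.ix_(S, T)], F[np.ix_(S, Tc)]
F_pinv = np.linalg.pinv(F_ST)
P = F_pinv @ F_ST  # orthogonal projection onto the row space of F_ST

lhs = np.linalg.norm(beta - beta_hat) ** 2
rhs = (np.linalg.norm(beta) ** 2
       - np.linalg.norm(P @ beta[T]) ** 2
       + np.linalg.norm(F_pinv @ F_STc @ beta[Tc]) ** 2)
```

Here `lhs` and `rhs` correspond to the left-hand side and the last line of the display; they should agree up to floating-point error.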

Now we take (conditional) expectations with respect to β, given S and T:

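$$\mathbb{E}\bigl[\|\beta - \hat\beta\|^2 \mid S,T\bigr] = 1 - \frac{1}{D}\,\operatorname{tr}\bigl((F_{S,T}^{\dagger}F_{S,T})^{*}(F_{S,T}^{\dagger}F_{S,T})\bigr) + \frac{1}{D}\,\operatorname{tr}\bigl((F_{S,T}^{\dagger}F_{S,T^c})^{*}(F_{S,T}^{\dagger}F_{S,T^c})\bigr). \tag{1}$$
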
Since $F_{S,T}$ has rank $n$, the first trace expression is equal to

$$\operatorname{tr}\bigl((F_{S,T}^{\dagger}F_{S,T})^{*}(F_{S,T}^{\dagger}F_{S,T})\bigr) = n.$$

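For the second trace expression, we use the explicit formula for $F_{S,T}^{\dagger}$ and the fact that $F_{S,T}F_{S,T}^{*} + F_{S,T^c}F_{S,T^c}^{*} = I$ to obtain

$$\begin{aligned}
\operatorname{tr}\bigl((F_{S,T}^{\dagger}F_{S,T^c})^{*}(F_{S,T}^{\dagger}F_{S,T^c})\bigr)
&= \operatorname{tr}\bigl(F_{S,T^c}^{*}(F_{S,T}F_{S,T}^{*})^{-1}F_{S,T^c}\bigr) \\
&= \operatorname{tr}\bigl(F_{S,T^c}^{*}(I - F_{S,T^c}F_{S,T^c}^{*})^{-1}F_{S,T^c}\bigr) \\
&= \operatorname{tr}\bigl((I - F_{S,T^c}F_{S,T^c}^{*})^{-1}F_{S,T^c}F_{S,T^c}^{*}\bigr) \\
&= \sum_{i=1}^{n} \frac{\lambda_i}{1-\lambda_i} \\
&= -n + \sum_{i=1}^{n} \frac{1}{1-\lambda_i},
\end{aligned}$$

where $\lambda_1,\dots,\lambda_n \in [0,1]$ are the eigenvalues of $F_{S,T^c}F_{S,T^c}^{*}$.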
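Both trace identities lend themselves to a direct numerical check. The sketch below computes each trace from its definition and compares it with the closed form in terms of $n$ and the eigenvalues $\lambda_i$. The sizes, seed, and variable names are illustrative assumptions, and `D` is again taken prime as a precaution so that $F_{S,T}$ has rank $n$.

```python
import numpy as np

def dft_matrix(D):
    """Unitary D x D DFT matrix; 0-based entry (j, k) is omega^(j*k) / sqrt(D)."""
    idx = np.arange(D)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / D) / np.sqrt(D)

# Illustrative sizes (assumptions): prime D, and p >= n.
rng = np.random.default_rng(2)
D, n, p = 31, 6, 14
F = dft_matrix(D)
S = rng.choice(D, size=n, replace=False)
T = rng.choice(D, size=p, replace=False)
Tc = np.setdiff1d(np.arange(D), T)

F_ST, F_STc = F[np.ix_(S, T)], F[np.ix_(S, Tc)]
F_pinv = np.linalg.pinv(F_ST)

# First trace: tr(P^* P) with P = pinv(F_ST) @ F_ST, expected to equal n.
P = F_pinv @ F_ST
first_trace = np.real(np.trace(P.conj().T @ P))

# Second trace, computed directly from its definition.
M = F_pinv @ F_STc
second_trace = np.real(np.trace(M.conj().T @ M))

# Eigenvalues lambda_i of the Hermitian matrix F_{S,T^c} F_{S,T^c}^*.
lam = np.linalg.eigvalsh(F_STc @ F_STc.conj().T)
second_trace_from_eigs = -n + np.sum(1.0 / (1.0 - lam))
```

Since the rows of $F$ are orthonormal, the eigenvalues of $F_{S,T}F_{S,T}^{*}$ are exactly $1-\lambda_i$, which is why a single eigendecomposition suffices.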
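Therefore, from Equation 1, we have

$$\mathbb{E}\bigl[\|\beta - \hat\beta\|^2\bigr] = 1 - \frac{2n}{D} + \frac{n}{D}\cdot\underbrace{\mathbb{E}\Biggl[\frac{1}{n}\sum_{i=1}^{n}\frac{1}{1-\lambda_i}\Biggr]}_{(*)}.$$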

A precise characterization of (∗) is difficult to obtain.
Under a slightly different model, in which membership in $S$ (respectively, $T$) is determined by independent Bernoulli variables with mean $n/D$ (respectively, $p/D$), we can use asymptotic arguments to characterize the empirical eigenvalue distribution for $F_{S,T}F_{S,T}^{*}$.
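Assuming the asymptotic equivalence of these random models for $S$ and $T$, we find that the quantity $(*)$ approaches
$$\frac{1-\rho_n}{\rho_p-\rho_n}$$
as $D,n,p \to \infty$, where $\rho_n := n/D$ and $\rho_p := p/D$ are held fixed and $\rho_p > \rho_n$ [farrell2011limiting].
So, in this limit, we have

$$\mathbb{E}\bigl[\|\beta - \hat\beta\|^2\bigr] \to 1 - \rho_n\cdot\left(2 - \frac{1-\rho_n}{\rho_p-\rho_n}\right).$$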

This quantity diverges to +∞ as ρp→ρn, and decreases as ρp→1.
This is the same behavior as in the Gaussian model from Section 2 with random selection; we depict it empirically in Figure 3.
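The qualitative behavior (large risk when $p$ is close to $n$, decreasing as $p$ grows toward $D$) can be explored with a small simulation. The sketch below averages the conditional risk $1 - 2n/D + \frac{1}{D}\sum_i 1/(1-\lambda_i)$ over random draws of $S$ and $T$, using the fact that the eigenvalues of $F_{S,T}F_{S,T}^{*}$ are $1-\lambda_i$. It is not the experiment behind Figure 3: the function names, the prime `D = 101`, `n = 20`, the grid of `p` values, and the number of trials are all illustrative assumptions.

```python
import numpy as np

def dft_matrix(D):
    """Unitary D x D DFT matrix; 0-based entry (j, k) is omega^(j*k) / sqrt(D)."""
    idx = np.arange(D)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / D) / np.sqrt(D)

def conditional_risk(F, S, T):
    """E[||beta - beta_hat||^2 | S, T] for p >= n, assuming F[S, T] has rank n."""
    D, n = F.shape[0], len(S)
    F_ST = F[np.ix_(S, T)]
    # Eigenvalues of F_ST F_ST^* equal 1 - lambda_i.
    one_minus_lam = np.linalg.eigvalsh(F_ST @ F_ST.conj().T)
    return 1.0 - 2.0 * n / D + np.sum(1.0 / one_minus_lam) / D

def average_risk(D, n, p, trials, rng):
    """Monte Carlo average of the conditional risk over random S and T."""
    F = dft_matrix(D)
    risks = []
    for _ in range(trials):
        S = rng.choice(D, size=n, replace=False)
        T = rng.choice(D, size=p, replace=False)
        risks.append(conditional_risk(F, S, T))
    return float(np.mean(risks))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, n = 101, 20  # illustrative values; D is prime
    for p in (25, 40, 70, 101):
        print(p, average_risk(D, n, p, trials=20, rng=rng))
```

At $p = D$ the matrix $F_{S,T}F_{S,T}^{*}$ is the identity, so the risk equals $1 - n/D$ exactly, which gives a convenient endpoint for checking the implementation. Values of $p$ very close to $n$ produce heavy-tailed estimates, so more trials are needed there.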