# Channel Estimation with Reconfigurable Intelligent Surfaces – A General Framework

Optimally extracting the advantages available from reconfigurable intelligent surfaces (RISs) in wireless communications systems requires estimation of the channels to and from the RIS. The process of determining these channels is complicated by the fact that the RIS is typically composed of passive elements without any data processing capabilities, and thus the channels must be estimated indirectly by a non-colocated device, typically a controlling base station. In this article, we examine channel estimation for RIS-based systems from a fundamental viewpoint. We study various possible channel models and the identifiability of the models as a function of the available pilot data and behavior of the RIS during training. In particular, we consider situations with and without line-of-sight propagation, single- and multiple-antenna configurations for the users and base station, correlated and sparse channel models, single-carrier and wideband OFDM scenarios, availability of direct links between the users and base station, exploitation of prior information, as well as a number of other special cases. We further conduct numerical comparisons of achievable performance for various channel models using the relevant Cramer-Rao bounds.

## I Introduction

There has been an explosion of interest in the use of reconfigurable metasurfaces for wireless communication systems in the last few years. Such reconfigurable intelligent surfaces (RIS) provide tunable degrees-of-freedom for adjusting the propagation characteristics of problematic channels (e.g., sparse channels with frequent blockages) that make them a valuable resource for maintaining and enhancing the quality of service (QoS) for users (UEs) in the network. However, most techniques that exploit this ability require channel state information (CSI) to and from the elements of the RIS, which is a challenge since the number of RIS elements may be very large, and more importantly, they are usually constructed only as passive devices without active transceivers or computational resources. Consequently, channel estimation for RIS-based systems has been a subject of intense study.

Because the RIS is passive, the CSI must be estimated by devices – most often a basestation (BS) or access point – that are not co-located with the RIS. For example, training signals transmitted by the UEs are received by the BS after reflection from the RIS, and possibly also over a direct path to the BS, and these known signals are exploited for CSI estimation. In order to estimate the RIS-based channel components, the reflection coefficients of the RIS must be varied as well, at least during a portion of the training period. However, even with variable training from the UEs and RIS, the fact that the impact of the RIS is only indirectly viewed in the data means that the complete structure of the channel is not identifiable. In particular, while the cascaded or composite channel from the UEs to the BS can be determined, the individual components of the channel involving the RIS cannot. Fortunately, this is typically not a problem for designing beamforming algorithms at the BS or optimizing the RIS reflection properties, since ultimately the QoS only depends on the composite channel.

A large amount of published work on CSI estimation for RIS-based systems has appeared recently. Initially, this work focused on estimating unstructured models, where the channels are simply described using complex gains [25, 60, 27, 54, 34, 19, 48, 57, 49, 51, 59]. Such models are simple and lead to straightforward algorithms, but the required training overhead is very large and may render such approaches impractical. Methods for reducing the training overhead, for example based on grouping the RIS elements or exploiting the common BS-RIS channel among the users, have been proposed, but larger reductions are possible when the channels are sparse if parametric or geometric channel models are used instead [6, 9, 45, 46, 35, 32, 61, 16, 50, 2, 31, 8, 30, 28]. In these models, the channels are parameterized by the angles of arrival (AoAs), angles of departure (AoDs) and complex gains of each propagation path. As long as the number of multipaths is not large, then the total number of parameters to be estimated can be 1-2 orders of magnitude smaller than in the unstructured case, and the amount of training can be correspondingly reduced. On the other hand, geometric models require knowledge of the array calibration and RIS element responses, as well as the model order; errors in the modeling assumptions will degrade some of this advantage. In addition, we will see later that the algorithms for estimating the geometric channel parameters can in general be quite complex.

Many CSI estimation techniques have been proposed under a wide array of assumptions, from Rayleigh fading to line-of-sight (LoS) propagation, single- to multi-antenna configurations, single- and multi-carrier modulation, scenarios with and without a direct link between BS and UEs, and a variety of other special cases. In this paper, we take a systematic approach to the problem and organize the various approaches that have been proposed – as well as some that have not – under a common framework. In this way, the advantages and disadvantages of different assumptions and solution approaches become clearer, and avenues for future work are elucidated.

After stating our general assumptions and notational conventions in Section II, we begin with a discussion of CSI estimation for unstructured channels in Section III. We will first consider the narrowband single user MIMO case and the corresponding least-squares (LS) and minimum mean squared error (MMSE) solutions, and then we will examine extensions to the wideband and multi-user cases, as well as special cases involving a single antenna BS and UEs and methods for reducing the training overhead. Then in Section IV we focus on estimation of geometric channel models, and we follow the same format of beginning with the narrowband single user MIMO case and then considering the same generalizations and special cases as in the previous section. Numerical examples involving the Cramér-Rao bound (CRB) will be given in Section V to illustrate the main conclusions. Several additional topics will be briefly considered in Section VI, including the use of some active transceivers at the RIS, scenarios with more than one RIS, machine learning approaches, etc. Finally, some conclusions and suggestions for future research are offered in Section VII.

## II General Assumptions and Notation

In this paper, we primarily consider scenarios with a single basestation (BS), a single RIS, and potentially multiple co-channel UEs. Various assumptions are made about the number of antennas at the BS and UEs, and the number of UEs that are active. We assume the BS and UEs employ fully digital rather than hybrid digital/analog architectures. We also assume a standard time-division duplex protocol in which pilot symbols transmitted by the UEs in the uplink are exploited by the BS to obtain a channel estimate, which is then used for downlink beamforming or multiplexing. This assumes reciprocal uplink and downlink channels between the BS, RIS and users, which in turn typically requires some type of RF transceiver calibration and RIS elements whose behavior is independent of the angle of incidence. Pilots could also be embedded in the downlink for channel estimation at the UEs, but this is similar to the uplink problem and thus is not explicitly considered.

Matrices and vectors are denoted by boldface capital and lowercase letters, respectively. In some cases, the $k$-th column or row of a matrix will be denoted by a correspondingly subscripted symbol. The transpose, conjugate transpose, and conjugate are denoted by $(\cdot)^T$, $(\cdot)^H$ and $(\cdot)^*$, respectively. The Kronecker, Khatri-Rao, and Hadamard products of two matrices are indicated by $\otimes$, $\diamond$ and $\odot$, respectively. An $N \times N$ identity matrix is represented as $\mathbf{I}_N$, and $N \times 1$ vectors composed of all ones or zeros are denoted by $\mathbf{1}_N$ and $\mathbf{0}_N$, respectively. A circular complex multivariate Gaussian distribution with mean $\boldsymbol{\mu}$ and covariance $\mathbf{R}$ is denoted by $\mathcal{CN}(\boldsymbol{\mu}, \mathbf{R})$. The function $\mathrm{vec}(\mathbf{A})$ creates a vector from matrix $\mathbf{A}$ by stacking its columns. The function $\lfloor a \rfloor$ creates an integer from real number $a$ by truncating its decimal part, and $\mathrm{mod}(a, b)$ is the modulo operator that returns the integer remainder of $a/b$. A diagonal matrix with elements of vector $\mathbf{a}$ on the diagonal is indicated by $\mathrm{diag}(\mathbf{a})$, and a block diagonal matrix with block entries $\mathbf{A}_1, \ldots, \mathbf{A}_n$ is written $\mathrm{blkdiag}(\mathbf{A}_1, \ldots, \mathbf{A}_n)$.

The reflective properties of an RIS with $N$ elements are described by the diagonal matrix $\boldsymbol{\Phi} = \mathrm{diag}(\boldsymbol{\phi})$, where the $N \times 1$ vector $\boldsymbol{\phi}$ collects the complex reflection coefficients (gain and phase) of the elements. There are a number of practical issues associated with $\boldsymbol{\Phi}$ that are important for RIS performance optimization, such as the dependence of the element gains on the element phases, the fact that the phases are typically discrete and frequency dependent, etc. For the most part, these issues are not directly relevant to the generic channel estimation problem, which only requires that $\boldsymbol{\Phi}$ be known and sufficiently controllable. However, certain simplifying assumptions about $\boldsymbol{\Phi}$ are made below for performance analyses or purposes of illustration.

## III Estimation of Unstructured Channel Models

We begin with models where the channel between individual network elements is described by a complex coefficient in the case of a narrowband single carrier signal, or a complex-valued impulse response for wideband transmission. Such unstructured or nonparametric channel models are appropriate for situations with rich multipath scattering (e.g., sub-6GHz systems), where it is difficult to describe the aggregate characteristics of the propagation environment. We initially focus on the narrowband single-user scenario, and then examine cases involving wideband signals or multiple users. As will become clear, the limiting factor with unstructured CSI estimation is the large training overhead that is required. Approaches for reducing the training overhead are discussed at the end of the section.

### III-A Narrowband Single User MIMO

The scenario assumed here is as depicted in Fig. 1, with an $M$-antenna BS, an $N$-element RIS, and a single UE with $K$ antennas. The geometries of the RIS and the arrays at the BS and UE are arbitrary. If the UE transmits the vector $\mathbf{x}_t$ at time $t$, the signal received at the BS is given by

$$\mathbf{y}_t = \sqrt{P}\,\left(\mathbf{H}_d + \mathbf{H}\boldsymbol{\Phi}_t\mathbf{G}^H\right)\mathbf{x}_t + \mathbf{n}_t, \tag{1}$$

where $\mathbf{H}_d \in \mathbb{C}^{M \times K}$, $\mathbf{H} \in \mathbb{C}^{M \times N}$ and $\mathbf{G} \in \mathbb{C}^{K \times N}$ are respectively the channels between the BS and UE, the BS and RIS, and the RIS and UE, and $\mathbf{n}_t$ denotes additive noise or interference. Assuming $E\{\mathbf{x}_t\mathbf{x}_t^H\} = \mathbf{I}_K$ and $E\{\mathbf{n}_t\mathbf{n}_t^H\} = \sigma^2\mathbf{I}_M$, $P$ represents the transmit power, and the signal-to-noise ratio (SNR) is defined as $P/\sigma^2$. The channels are all assumed to be block flat fading and constant over a coherence interval sufficiently long to permit channel estimation and subsequent data transmission. On the other hand, the reflection coefficients of the RIS, $\boldsymbol{\Phi}_t$, can vary synchronously with the UE uplink transmission. Some prior work ignores the direct channel component $\mathbf{H}_d$, assuming that it is either not present (e.g., due to a blockage), or that it was estimated in a previous step and its contribution has been removed from the received data, i.e., $\mathbf{H}_d = \mathbf{0}$.

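It is important to note that not all of the components of the channel-related term $\mathbf{H}\boldsymbol{\Phi}_t\mathbf{G}^H$ are individually identifiable. In particular, for any invertible diagonal matrix $\boldsymbol{\Lambda}$, we have

$$\mathbf{H}\boldsymbol{\Phi}_t\mathbf{G}^H = \mathbf{H}\boldsymbol{\Lambda}\boldsymbol{\Phi}_t\boldsymbol{\Lambda}^{-1}\mathbf{G}^H = \tilde{\mathbf{H}}\boldsymbol{\Phi}_t\tilde{\mathbf{G}}^H, \tag{2}$$

where $\tilde{\mathbf{H}} = \mathbf{H}\boldsymbol{\Lambda}$ and $\tilde{\mathbf{G}}^H = \boldsymbol{\Lambda}^{-1}\mathbf{G}^H$. Thus there is a scaling ambiguity between each pair of the columns of $\mathbf{H}$ and $\mathbf{G}$ that cannot be resolved using data obtained as in (1). Most methods for beamforming, precoding or RIS reflection optimization do not require this ambiguity to be resolved, although as briefly discussed later, with certain additional information the individual channel components can be identified. For this reason, channel estimation in the context of RIS-aided communication systems focuses primarily on determination of the composite or cascaded channel $\mathbf{H}_c$, defined using properties of the Khatri-Rao product:

$$\mathrm{vec}\left(\mathbf{H}_d + \mathbf{H}\boldsymbol{\Phi}_t\mathbf{G}^H\right) = \begin{bmatrix}\mathbf{h}_d & \mathbf{G}^* \diamond \mathbf{H}\end{bmatrix}\begin{bmatrix}1 \\ \boldsymbol{\phi}_t\end{bmatrix} \equiv \mathbf{H}_c\tilde{\boldsymbol{\phi}}_t, \tag{3}$$

with $\mathbf{h}_d = \mathrm{vec}(\mathbf{H}_d)$ and $\tilde{\boldsymbol{\phi}}_t = [1 \;\; \boldsymbol{\phi}_t^T]^T$. Eq. (3) together with further use of the Kronecker product allows us to rewrite (1) in a compact form:

$$
\begin{align}
\mathbf{y}_t &= \sqrt{P}\,(\mathbf{x}_t^T \otimes \mathbf{I}_M)\,\mathbf{H}_c\tilde{\boldsymbol{\phi}}_t + \mathbf{n}_t \tag{4a}\\
&= \sqrt{P}\,\left[\tilde{\boldsymbol{\phi}}_t^T \otimes \mathbf{x}_t^T \otimes \mathbf{I}_M\right]\mathbf{h}_c + \mathbf{n}_t \tag{4b}\\
&\equiv \sqrt{P}\,\mathbf{Z}_t\mathbf{h}_c + \mathbf{n}_t, \tag{4c}
\end{align}
$$

where $\mathbf{h}_c = \mathrm{vec}(\mathbf{H}_c)$ and the $M \times MK(N+1)$ matrix $\mathbf{Z}_t$ is implicitly defined.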
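The identities in (2)-(4) can be checked numerically. The following is a minimal NumPy sketch, assuming small illustrative dimensions, randomly drawn channels, and unit-modulus reflection coefficients; the helper names `crandn`, `vec` and `khatri_rao` are hypothetical and not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 4, 6, 2  # BS antennas, RIS elements, UE antennas (illustrative sizes)

def crandn(*shape):
    """Unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def vec(A):
    """Stack the columns of A into one vector."""
    return A.reshape(-1, order="F")

def khatri_rao(A, B):
    """Column-wise Kronecker product of two matrices with equal column counts."""
    return np.stack([np.kron(A[:, n], B[:, n]) for n in range(A.shape[1])], axis=1)

Hd, H, G = crandn(M, K), crandn(M, N), crandn(K, N)
phi = np.exp(1j * rng.uniform(0, 2 * np.pi, N))   # assumed unit-modulus reflections
Phi = np.diag(phi)
x = crandn(K)

# Scaling ambiguity (2): any invertible diagonal Lambda leaves H Phi G^H unchanged.
Lam = np.diag(crandn(N) + 2.0)
H_t = H @ Lam
G_tH = np.linalg.inv(Lam) @ G.conj().T
eq2_error = np.max(np.abs(H @ Phi @ G.conj().T - H_t @ Phi @ G_tH))

# Composite channel (3): Hc = [vec(Hd), G^* (Khatri-Rao) H], size MK x (N+1).
Hc = np.column_stack([vec(Hd), khatri_rao(G.conj(), H)])
phi_tilde = np.concatenate([[1.0], phi])
eq3_error = np.max(np.abs(vec(Hd + H @ Phi @ G.conj().T) - Hc @ phi_tilde))

# Compact form (4): noise-free y_t equals Z_t h_c (with P = 1).
Z_t = np.kron(np.kron(phi_tilde[None, :], x[None, :]), np.eye(M))
y = (Hd + H @ Phi @ G.conj().T) @ x
eq4_error = np.max(np.abs(y - Z_t @ vec(Hc)))
```

All three error values should be at the level of floating-point round-off, which illustrates why only the composite channel, and not the individual factors, can be recovered from data of the form (1).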

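The composite channel is clearly underdetermined in Eq. (4), and thus multiple pilot symbols must be transmitted in order for it to be uniquely estimated. Combining the data from $T$ such pilots together, we have

$$\mathbf{y} = \begin{bmatrix}\mathbf{y}_1 \\ \vdots \\ \mathbf{y}_T\end{bmatrix} = \sqrt{P}\begin{bmatrix}\mathbf{Z}_1 \\ \vdots \\ \mathbf{Z}_T\end{bmatrix}\mathbf{h}_c + \mathbf{n} \equiv \sqrt{P}\,\mathbf{Z}\mathbf{h}_c + \mathbf{n}. \tag{5}$$

Provided that $T \ge K(N+1)$ and $\mathbf{Z}$ is full rank, there are two common ways to estimate $\mathbf{h}_c$, as discussed below.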

#### III-A1 Least Squares

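The simplest approach for estimating $\mathbf{h}_c$ is to use the standard deterministic least-squares (LS) method,

$$\hat{\mathbf{h}}_{c,LS} = \arg\min_{\mathbf{h}_c}\left\|\mathbf{y} - \sqrt{P}\,\mathbf{Z}\mathbf{h}_c\right\|^2, \tag{6}$$

whose solution is given by

$$\hat{\mathbf{h}}_{c,LS} = \frac{1}{\sqrt{P}}\,\mathbf{Z}^{\dagger}\mathbf{y}, \tag{7}$$

where $\mathbf{Z}^{\dagger} = (\mathbf{Z}^H\mathbf{Z})^{-1}\mathbf{Z}^H$. Assuming again that the noise is Gaussian with $E\{\mathbf{n}_t\mathbf{n}_t^H\} = \sigma^2\mathbf{I}_M$ and temporally uncorrelated, the LS channel estimate is unbiased and equivalent to the maximum likelihood (ML) estimate, and its covariance matrix corresponds to the CRB:

$$
\begin{align}
\mathbf{R}_{\hat{h}_c,LS} &= E\left\{(\hat{\mathbf{h}}_{c,LS} - \mathbf{h}_c)(\hat{\mathbf{h}}_{c,LS} - \mathbf{h}_c)^H\right\} \tag{8a}\\
&= \frac{1}{P}E\left\{\mathbf{Z}^{\dagger}\mathbf{n}\mathbf{n}^H(\mathbf{Z}^{\dagger})^H\right\} = \frac{\sigma^2}{P}\left(\mathbf{Z}^H\mathbf{Z}\right)^{-1}. \tag{8b}
\end{align}
$$

Ideally, $\mathbf{x}_t$ and $\boldsymbol{\phi}_t$ should be designed to optimize the CSI estimation performance. While such an optimization is generally intractable, a good choice can be found [25] by noting that for any positive definite matrix $\mathbf{B}$, we have

$$\left[\mathbf{B}^{-1}\right]_{ii} \ge \frac{1}{B_{ii}}, \tag{9}$$

with equality for all $i$ only if $\mathbf{B}$ is diagonal. Thus, a good choice for $\mathbf{Z}$ would make (8) diagonal. Such a choice may not be optimal in general, but a diagonal covariance matrix also greatly simplifies the computation of $\mathbf{Z}^{\dagger}$ in (7).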

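The most common training approach that meets the above design goals breaks the training interval into $T/K$ subblocks of length $K$, where $T/K$ is assumed to be an integer. For each subblock $b = 1, \ldots, T/K$, the RIS reflection vector is held constant at $\bar{\boldsymbol{\phi}}_b$, while the pilots are chosen as an orthogonal sequence that repeats itself for each subblock. For example, the subblock sequence for the UE is $\mathbf{X} = [\mathbf{x}_1 \; \cdots \; \mathbf{x}_K]$, where $\mathbf{X}\mathbf{X}^H = K\mathbf{I}_K$, which is then repeated $T/K$ times:

$$
\begin{align}
\mathbf{x}_t \text{ pilots} &= [\,\underbrace{\mathbf{X} \;\; \mathbf{X} \;\; \cdots \;\; \mathbf{X}}_{\text{repeated } T/K \text{ times}}\,] \tag{10a}\\
\boldsymbol{\phi}_t \text{ pilots} &= [\,\underbrace{\bar{\boldsymbol{\phi}}_1 \; \cdots \; \bar{\boldsymbol{\phi}}_1}_{\text{repeated } K \text{ times}} \;\cdots\; \underbrace{\bar{\boldsymbol{\phi}}_{T/K} \; \cdots \; \bar{\boldsymbol{\phi}}_{T/K}}_{\text{repeated } K \text{ times}}\,] \tag{10b}
\end{align}
$$

Using this approach, we have

$$
\begin{align}
\mathbf{Z}^H\mathbf{Z} &= \sum_{t=1}^{T}\left[\tilde{\boldsymbol{\phi}}_t^*\tilde{\boldsymbol{\phi}}_t^T \otimes \mathbf{x}_t^*\mathbf{x}_t^T \otimes \mathbf{I}_M\right] \tag{11a}\\
&= \sum_{b=1}^{T/K}\sum_{k=1}^{K}\left[\tilde{\bar{\boldsymbol{\phi}}}_b^*\tilde{\bar{\boldsymbol{\phi}}}_b^T \otimes \mathbf{x}_k^*\mathbf{x}_k^T \otimes \mathbf{I}_M\right] \tag{11b}\\
&= \left(\sum_{b=1}^{T/K}\tilde{\bar{\boldsymbol{\phi}}}_b^*\tilde{\bar{\boldsymbol{\phi}}}_b^T\right) \otimes \left(\mathbf{X}\mathbf{X}^H\right)^* \otimes \mathbf{I}_M \tag{11c}\\
&= K\left(\boldsymbol{\Psi}^H\boldsymbol{\Psi}\right)^* \otimes \mathbf{I}_{MK}, \tag{11d}
\end{align}
$$

where $\tilde{\bar{\boldsymbol{\phi}}}_b = [1 \;\; \bar{\boldsymbol{\phi}}_b^T]^T$ and

$$\boldsymbol{\Psi} = \begin{bmatrix}1 & \bar{\boldsymbol{\phi}}_1^H \\ \vdots & \vdots \\ 1 & \bar{\boldsymbol{\phi}}_{T/K}^H\end{bmatrix}. \tag{12}$$

To achieve a diagonal $\mathbf{Z}^H\mathbf{Z}$, the columns of the $(T/K) \times (N+1)$ matrix $\boldsymbol{\Psi}$ must be orthogonal, with $T/K \ge N+1$. If $\boldsymbol{\Psi}^H\boldsymbol{\Psi}$ can be made proportional to an identity matrix, then $\mathbf{Z}^H\mathbf{Z}$ is also a scaled identity matrix.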

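For the above training protocol, the general solution in (7) is implemented by taking data from the $b$-th pilot subblock,

$$\mathbf{Y}_b = \sqrt{P}\,\left(\mathbf{H}_d + \mathbf{H}\bar{\boldsymbol{\Phi}}_b\mathbf{G}^H\right)\mathbf{X} + \mathbf{N}_b, \tag{13}$$

and multiplying on the right by $\mathbf{X}^H$ to obtain

$$\mathbf{y}_b \equiv \frac{1}{K\sqrt{P}}\,\mathrm{vec}\left(\mathbf{Y}_b\mathbf{X}^H\right) = \mathbf{H}_c\tilde{\bar{\boldsymbol{\phi}}}_b + \bar{\mathbf{n}}_b, \tag{14}$$

where $\bar{\boldsymbol{\Phi}}_b = \mathrm{diag}(\bar{\boldsymbol{\phi}}_b)$ and $\bar{\mathbf{n}}_b = \frac{1}{K\sqrt{P}}\mathrm{vec}(\mathbf{N}_b\mathbf{X}^H)$. The result from each of the $T/K$ subblocks then forms a column of the following combined equation:

$$\mathbf{Y}_c = \mathbf{H}_c\left[\tilde{\bar{\boldsymbol{\phi}}}_1 \; \cdots \; \tilde{\bar{\boldsymbol{\phi}}}_{T/K}\right] + \bar{\mathbf{N}} = \mathbf{H}_c\boldsymbol{\Psi}^H + \bar{\mathbf{N}}, \tag{15}$$

where $\mathbf{Y}_c = [\mathbf{y}_1 \; \cdots \; \mathbf{y}_{T/K}]$ and $\bar{\mathbf{N}} = [\bar{\mathbf{n}}_1 \; \cdots \; \bar{\mathbf{n}}_{T/K}]$, from which an estimate of the composite channel is obtained by multiplying by $\boldsymbol{\Psi}(\boldsymbol{\Psi}^H\boldsymbol{\Psi})^{-1}$ on the right, assuming $T/K \ge N+1$.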
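The subblock protocol in (13)-(15) can be sketched as follows. This is only an illustration under stated assumptions: the dimensions are arbitrary, the channels are drawn at random, and both the pilot block and the RIS training matrix are taken to be DFT matrices (one convenient orthogonal choice). The function `estimate` and the helpers are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, K = 3, 5, 2          # BS antennas, RIS elements, UE antennas
B = N + 1                  # number of subblocks T/K (minimum value)
P = 1.0                    # transmit power

def crandn(*shape):
    """Unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def vec(A):
    """Stack the columns of A into one vector."""
    return A.reshape(-1, order="F")

Hd, H, G = crandn(M, K), crandn(M, N), crandn(K, N)
# True composite channel [vec(Hd), G^* (Khatri-Rao) H], used only for comparison.
Hc = np.column_stack([vec(Hd)] + [np.kron(G[:, n].conj(), H[:, n]) for n in range(N)])

X = np.fft.fft(np.eye(K))      # pilot subblock with X X^H = K I
Psi = np.fft.fft(np.eye(B))    # rows are [1, phi_bar_b^H]; Psi^H Psi = B I

def estimate(sigma):
    """Run the subblock protocol with noise standard deviation sigma."""
    cols = []
    for b in range(B):
        phi_bar = Psi[b, 1:].conj()                       # RIS setting for subblock b
        Yb = np.sqrt(P) * (Hd + H @ np.diag(phi_bar) @ G.conj().T) @ X
        Yb = Yb + sigma * crandn(M, K)                    # received block, as in (13)
        cols.append(vec(Yb @ X.conj().T) / (K * np.sqrt(P)))  # despreading, as in (14)
    Yc = np.column_stack(cols)                            # combined data, as in (15)
    return Yc @ Psi @ np.linalg.inv(Psi.conj().T @ Psi)   # right-multiply to undo Psi^H

err_noise_free = np.max(np.abs(estimate(0.0) - Hc))
err_noisy = np.max(np.abs(estimate(0.1) - Hc))
```

In the noise-free case the composite channel is recovered up to round-off, which confirms that $T/K = N+1$ subblocks are sufficient when the rows of the training matrix are linearly independent.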

Several methods have been proposed to choose the RIS training sequence so that $\boldsymbol{\Psi}^H\boldsymbol{\Psi}$ is diagonal:

• When the direct path is absent (the first column of $\boldsymbol{\Psi}$ is removed), a simple approach is to set $T/K = N$ and “turn on” one RIS element at a time for each $K$-sample pilot subblock, with all other elements “turned off” [34, 22]. (“Turning off” an RIS element assumes it becomes a perfect absorber of RF energy, which in practice is not possible. Such elements will still reflect a small amount of energy and thus degrade the orthogonality assumption.) This results in a diagonal $\boldsymbol{\Psi}$. If each (identical) RIS element when active is tuned to the same phase, it is reasonable to assume that $\boldsymbol{\Psi}$ is a scaled identity matrix; for unit reflection gain, this results in $\mathbf{Z}^H\mathbf{Z} = K\mathbf{I}_{MKN}$ and an estimate variance of $\sigma^2/(PK)$ for each element of $\mathbf{h}_c$.

• Better performance is achieved by activating all RIS elements over the entire training interval, in order to benefit from the RIS array gain. One approach for doing so assigns the RIS phase shifts such that the columns of $\boldsymbol{\Psi}$ equal the columns of the matrix that defines the $(T/K)$-point Discrete Fourier Transform (DFT) [25, 34]:

$$[\boldsymbol{\Psi}]_{mn} = e^{j2\pi(m-1)(n-1)/(T/K)} \tag{16}$$

for $m = 1, \ldots, T/K$ and $n = 1, \ldots, N+1$. If the RIS gains are assumed to be phase-independent and equal to one, as in (16), then this leads to $\boldsymbol{\Psi}^H\boldsymbol{\Psi} = (T/K)\mathbf{I}_{N+1}$ and the variance of the channel coefficient estimates is $\sigma^2/(PT)$, a factor of $T/K$ smaller than in the first approach. In addition to the need for phase-independent RIS element gains, which is difficult to achieve in practice, the RIS phase shifts should be tunable with at least $\log_2(T/K)$ bits of resolution, which may be problematic for large $N$.

• An alternative that achieves the same performance is to choose the columns of $\boldsymbol{\Psi}$ from among the columns of a $(T/K)$-dimensional Hadamard matrix, whose entries are constrained to be $\pm 1$ [54, 4]. This achieves orthogonality of the columns of $\boldsymbol{\Psi}$, and has the advantage of requiring only two phase states for each RIS element (one bit of resolution). In addition, a diagonal $\mathbf{Z}^H\mathbf{Z}$ only requires that the RIS gains be equal at these two phase values. In this approach, $T/K$ must be a multiple of 4 for the Hadamard matrix to exist, but this is not a significant issue for large $N$.
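The orthogonality of the DFT and Hadamard training patterns described in the list above can be checked with a short sketch. It assumes unit-gain RIS elements and, for simplicity, a power-of-two number of subblocks so that the Sylvester construction applies; the function names and the numerical values of $K$, $P$ and $\sigma^2$ are illustrative assumptions.

```python
import numpy as np

def dft_training(B, n_cols):
    """First n_cols columns of the B-point DFT matrix, with entries as in (16)."""
    m = np.arange(B)[:, None]
    n = np.arange(n_cols)[None, :]
    return np.exp(2j * np.pi * m * n / B)

def sylvester_hadamard(B):
    """+/-1 Hadamard matrix of order B via the Sylvester construction (B a power of two)."""
    Hm = np.array([[1.0]])
    while Hm.shape[0] < B:
        Hm = np.kron(np.array([[1.0, 1.0], [1.0, -1.0]]), Hm)
    if Hm.shape[0] != B:
        raise ValueError("Sylvester construction needs a power-of-two order")
    return Hm

N = 7                      # RIS elements (illustrative)
B = N + 1                  # subblocks T/K
Psi_dft = dft_training(B, N + 1)
Psi_had = sylvester_hadamard(B)[:, : N + 1]

gram_dft = Psi_dft.conj().T @ Psi_dft   # expected to equal B * I
gram_had = Psi_had.T @ Psi_had          # expected to equal B * I

# Per-coefficient LS variance sigma^2 / (P * K * [Psi^H Psi]_ii), from (8) and (11).
K, P, sigma2 = 2, 1.0, 0.1
var_on_off = sigma2 / (P * K)        # one unit-gain element active per subblock
var_all_on = sigma2 / (P * K * B)    # all elements active, Psi^H Psi = B I
variance_ratio = var_on_off / var_all_on
```

Both Gram matrices are scaled identities, and the variance ratio equals the number of subblocks, which is the array gain obtained by keeping all elements active during training.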

#### III-A2 Linear Minimum Mean Squared Error

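The LS approach assumes a deterministic channel with no prior information. On the other hand, the minimum mean-squared error (MMSE) estimator assumes a stochastic model for $\mathbf{h}_c$, usually in terms of correlated Rayleigh fading with prior information of the second-order statistics. However, the composite channel is composed of products of the Gaussian elements in $\mathbf{H}$ and $\mathbf{G}$, which makes the MMSE estimate difficult to compute, although message-passing algorithms have been proposed for this problem [17, 18, 33]. Instead, the linear MMSE, or LMMSE, estimate given by $\hat{\mathbf{h}}_{c,LM} = \mathbf{W}\mathbf{y}$ can be found by solving [23, 41]

$$\mathbf{W} = \arg\min_{\tilde{\mathbf{W}}} E\left\{\|\tilde{\mathbf{W}}\mathbf{y} - \mathbf{h}_c\|^2\right\}. \tag{17}$$

Assuming spatially and temporally white Gaussian noise uncorrelated with $\mathbf{h}_c$, the LMMSE estimate is given by

$$\hat{\mathbf{h}}_{c,LM} = \sqrt{P}\,\mathbf{R}_{h_c}\mathbf{Z}^H\left(P\mathbf{Z}\mathbf{R}_{h_c}\mathbf{Z}^H + \sigma^2\mathbf{I}_{MT}\right)^{-1}\mathbf{y}, \tag{18}$$

where $\mathbf{R}_{h_c} = E\{\mathbf{h}_c\mathbf{h}_c^H\}$ and we have assumed $E\{\mathbf{h}_c\} = \mathbf{0}$.

Using orthogonal pilot and RIS reflection sequences like those discussed above also simplifies computation of the LMMSE estimate. For example, assume the orthogonal pilots described above together with the Hadamard reflection pattern, so that $\mathbf{Z}^H\mathbf{Z} = T\,\mathbf{I}$. Then the LMMSE estimate simplifies to

$$\hat{\mathbf{h}}_{c,LM} = \frac{1}{\sqrt{P}\,T}\,\mathbf{R}_{h_c}\left(\mathbf{R}_{h_c} + \frac{\sigma^2}{PT}\mathbf{I}\right)^{-1}\mathbf{Z}^H\mathbf{y}. \tag{19}$$

The matrices in (19) involving $\mathbf{R}_{h_c}$ are data independent, and can be computed and stored offline since $\mathbf{R}_{h_c}$ changes relatively slowly. The resulting error covariance is given by

$$\mathbf{R}_{e,LM} = \mathbf{R}_{h_c} - \mathbf{R}_{h_c}\left(\mathbf{R}_{h_c} + \frac{\sigma^2}{PT}\mathbf{I}\right)^{-1}\mathbf{R}_{h_c}. \tag{20}$$

For the above training protocol, $\hat{\mathbf{h}}_{c,LM}$ converges to $\hat{\mathbf{h}}_{c,LS}$ for high SNR (i.e., $P/\sigma^2 \to \infty$) or long training intervals ($T \to \infty$).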
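The equivalence of the general LMMSE expression (18) and the simplified form (19) under orthogonal training can be verified numerically. The sketch below assumes DFT pilots and a DFT reflection pattern (any choice giving $\mathbf{Z}^H\mathbf{Z} = T\mathbf{I}$ would do), a randomly generated positive definite covariance, and small illustrative dimensions; all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N, K = 2, 3, 2          # BS antennas, RIS elements, UE antennas
B = N + 1                  # subblocks T/K
T = K * B                  # total training length
P, sigma2 = 2.0, 0.5
D = M * K * (N + 1)        # length of the composite channel vector h_c

def crandn(*shape):
    """Unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

X = np.fft.fft(np.eye(K))      # X X^H = K I
Psi = np.fft.fft(np.eye(B))    # Psi^H Psi = B I, first column all ones

# Build Z by stacking Z_t = phi_tilde_t^T (kron) x_t^T (kron) I_M, as in (4b) and (5).
rows = []
for b in range(B):
    phi_tilde = Psi[b, :].conj()
    for k in range(K):
        row = np.kron(phi_tilde[None, :], X[:, k][None, :])
        rows.append(np.kron(row, np.eye(M)))
Z = np.vstack(rows)            # MT x D

A = crandn(D, D)
R = A @ A.conj().T / D                     # assumed channel covariance
h = A @ crandn(D) / np.sqrt(D)             # one channel draw with covariance R
y = np.sqrt(P) * Z @ h + np.sqrt(sigma2) * crandn(M * T)

ZH = Z.conj().T
# General LMMSE estimate, as in (18).
h_18 = np.sqrt(P) * R @ ZH @ np.linalg.solve(P * Z @ R @ ZH + sigma2 * np.eye(M * T), y)
# Simplified LMMSE estimate for Z^H Z = T I, as in (19).
h_19 = R @ np.linalg.solve(R + sigma2 / (P * T) * np.eye(D), ZH @ y) / (np.sqrt(P) * T)
# LS estimate (7) for comparison.
h_ls = ZH @ y / (np.sqrt(P) * T)
```

The form in (19) only requires a $D \times D$ solve with a matrix that does not depend on the data, so the product $\mathbf{R}_{h_c}(\mathbf{R}_{h_c} + \tfrac{\sigma^2}{PT}\mathbf{I})^{-1}$ can be precomputed.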

A bigger issue than the computational complexity of (19) is how to determine the composite channel covariance $\mathbf{R}_{h_c}$. In theory, the covariance could be estimated using simulations involving detailed propagation models of the environment, or by taking sample statistics of channel estimates obtained over a long period of time. However, the size of $\mathbf{R}_{h_c}$ means that such procedures would require a large amount of data. Instead, a more reasonable approach is to determine $\mathbf{R}_{h_c}$ based on covariance information about its constituent parts. For MIMO channels, it is commonly assumed that the multipath scattering at the source is uncorrelated with the scattering at the destination, which leads to the following descriptions:

$$
\begin{align}
\mathbf{H} &= \mathbf{R}_{HB}^{1/2}\,\tilde{\mathbf{H}}\,\mathbf{R}_{HR}^{H/2} \tag{21a}\\
\mathbf{G} &= \mathbf{R}_{GU}^{1/2}\,\tilde{\mathbf{G}}\,\mathbf{R}_{GR}^{H/2} \tag{21b}\\
\mathbf{H}_d &= \mathbf{R}_{H_dB}^{1/2}\,\tilde{\mathbf{H}}_d\,\mathbf{R}_{H_dU}^{H/2}, \tag{21c}
\end{align}
$$

where the subscripts $B$, $R$, $U$ respectively correspond to BS, RIS, and UE, and indicate which side of the link the correlation matrix is associated with (e.g., $\mathbf{R}_{HB}$ is the correlation matrix for the BS-side of the channel $\mathbf{H}$). The matrices $\tilde{\mathbf{H}}$, $\tilde{\mathbf{G}}$, $\tilde{\mathbf{H}}_d$ are of the same dimensions as $\mathbf{H}$, $\mathbf{G}$, $\mathbf{H}_d$ respectively, and are composed of uncorrelated elements. Under this model, it can be shown that the composite channel covariance matrix has the following form:

 (22)

where we define .

Estimating the BS-side and UE-side correlation matrices is relatively straightforward since the BS and UEs have active transceivers that can collect and process data. However, determining the RIS-side correlation matrices $\mathbf{R}_{HR}$ and $\mathbf{R}_{GR}$ is problematic since the RIS is typically passive. Various assumptions can be made to further simplify $\mathbf{R}_{h_c}$. With uncorrelated scattering at the RIS, $\mathbf{R}_{HR}$ and $\mathbf{R}_{GR}$ can be taken as identity matrices, and $\mathbf{R}_{h_c}$ is block diagonal with identical block entries except for the block associated with $\mathbf{h}_d$. This greatly simplifies computation of (19). If we go a step further and assume all channels exhibit uncorrelated Rayleigh fading, then the LMMSE estimate and error simplify to

$$
\begin{align}
\hat{\mathbf{h}}_{c,LM} &= \frac{1}{\sqrt{P}\,T}\begin{bmatrix}\nu_{H_d}\mathbf{I}_{MK} & \mathbf{0} \\ \mathbf{0} & \nu_{GH}\mathbf{I}_{MNK}\end{bmatrix}\mathbf{Z}^H\mathbf{y} \tag{23a}\\
\mathbf{R}_{e,LM} &= \frac{\sigma^2}{PT}\begin{bmatrix}\nu_{H_d}\mathbf{I}_{MK} & \mathbf{0} \\ \mathbf{0} & \nu_{GH}\mathbf{I}_{MNK}\end{bmatrix}, \tag{23b}
\end{align}
$$

where

$$
\begin{align}
\nu_{H_d} &= \frac{PT\sigma_{H_d}^2}{PT\sigma_{H_d}^2 + \sigma^2} < 1 \tag{24a}\\
\nu_{GH} &= \frac{PT\sigma_G^2\sigma_H^2}{PT\sigma_G^2\sigma_H^2 + \sigma^2} < 1, \tag{24b}
\end{align}
$$

and $\sigma_{H_d}^2$, $\sigma_G^2$, $\sigma_H^2$ represent the variances of the channels $\mathbf{H}_d$, $\mathbf{G}$, $\mathbf{H}$, respectively. Since $\nu_{H_d}$ and $\nu_{GH}$ are less than one, the LMMSE estimates have a smaller error than for LS, which is due to the exploitation of the prior statistical information. However, assumptions of uncorrelated fading are hard to justify in RIS-aided wireless systems, which are typically motivated by propagation environments with sparse propagation paths and frequent blockages. In these environments, the BS and RIS installations are envisioned to be in elevated positions away from nearby RF scatterers. This leads to low-rank channel correlation matrices and consideration of geometric models, as discussed in Section IV.
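As a small worked example of the shrinkage factors in (24), the following sketch evaluates them for a few training lengths. The function name `shrinkage` and all numerical values are illustrative assumptions, not taken from the original text.

```python
def shrinkage(P, T, channel_var, sigma2):
    """LMMSE shrinkage factor nu = P*T*var / (P*T*var + sigma^2), as in (24)."""
    gain = P * T * channel_var
    return gain / (gain + sigma2)

P, sigma2 = 1.0, 1.0                # assumed 0 dB SNR
var_Hd, var_G, var_H = 1.0, 0.5, 0.5

# Direct-path factor (24a) and cascaded-path factor (24b) for growing T.
training_lengths = [8, 64, 512]
nu_Hd = [shrinkage(P, T, var_Hd, sigma2) for T in training_lengths]
nu_GH = [shrinkage(P, T, var_G * var_H, sigma2) for T in training_lengths]

# Per-coefficient error variance: LS gives sigma^2/(P T), LMMSE scales it by nu (23b).
T = training_lengths[0]
ls_var = sigma2 / (P * T)
lmmse_var_cascaded = ls_var * shrinkage(P, T, var_G * var_H, sigma2)
```

Both factors approach one as $PT/\sigma^2$ grows, which is consistent with the LMMSE estimate converging to the LS estimate at high SNR or for long training intervals. The cascaded-path factor is smaller because the product $\sigma_G^2\sigma_H^2$ represents a weaker channel in this example.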

### III-B Wideband Single User MIMO

In wideband scenarios where the channel is frequency selective, we assume the UE transmits an OFDM signal composed of $N_c$ subcarriers from each of its antennas, and we let $t$ denote the OFDM symbol index. Prior to transmission, the frequency-domain symbols are converted to the time domain using the $N_c$-point inverse DFT, and a cyclic prefix longer than the maximum delay spread of the channel is appended. At the BS, the cyclic prefix is removed, and the data are converted back to the frequency domain through multiplication by the DFT matrix. This generates a model essentially identical to (1) for each subcarrier $n$:

$$\mathbf{y}^F_{t,n} = \sqrt{P}\,\left(\mathbf{H}^F_{d,n} + \mathbf{H}^F_n\boldsymbol{\Phi}_{t,n}\mathbf{G}^{FH}_n\right)\mathbf{x}^F_{t,n} + \mathbf{n}^F_{t,n}, \tag{25}$$

where $\mathbf{H}^F_{d,n}$, $\mathbf{H}^F_n$, $\mathbf{G}^F_n$ represent the DFT at subcarrier $n$ for the UE-BS, RIS-BS, and UE-RIS channel impulse responses, respectively. Thus, one can employ the same estimation methods discussed above on a per-subcarrier basis, although to exploit the channel correlation in frequency and reduce the training overhead, pilot data is normally transmitted only on a subset of the subcarriers, and interpolation is used to construct channel estimates for the others [60]. An alternative approach proposed in [59] is to use shorter OFDM symbols during the training period.

Note that most prior work on RIS channel estimation with OFDM signals has assumed that the RIS reflection properties are frequency independent, i.e., $\boldsymbol{\Phi}_{t,n} = \boldsymbol{\Phi}_t$ for all $n$, but this is generally true only for relatively narrow bandwidths [5, 52]. If one sets $\boldsymbol{\Psi}$ to have desirable properties (e.g., with orthogonal columns) at a particular subcarrier $n$, then in general those properties will not be inherited at other subcarriers. This issue motivates the design of RIS circuit architectures that have invariant properties across wider frequency bands.

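An alternative to estimating the channels in the frequency domain and using interpolation is to directly estimate the channel impulse response. In the time domain, we represent the data received for sample $s$ of OFDM symbol $t$ as

$$\mathbf{y}_{t,s} = \sqrt{P}\sum_{k=0}^{L-1}\left(\mathbf{H}_d(k) + \mathbf{H}(k)\boldsymbol{\Phi}_{t,s-k}\mathbf{G}^H(k)\right)\mathbf{x}_{t,s-k} + \mathbf{n}_{t,s}, \tag{26}$$

where $\mathbf{H}_d(k)$, $\mathbf{H}(k)$, $\mathbf{G}(k)$ represent the channel impulse responses and $L$ is the maximum number of taps. Defining $\mathbf{H}_c(k) = [\mathrm{vec}(\mathbf{H}_d(k)) \;\; \mathbf{G}^*(k) \diamond \mathbf{H}(k)]$ and $\mathbf{h}_c(k) = \mathrm{vec}(\mathbf{H}_c(k))$, after removal of the cyclic prefix we can write

$$
\begin{align}
\mathbf{y}_{t,s} &= \sqrt{P}\sum_{k=0}^{L-1}\left[\tilde{\boldsymbol{\phi}}_{t,s-k}^T \otimes \mathbf{x}_{t,s-k}^T \otimes \mathbf{I}_M\right]\mathbf{h}_c(k) + \mathbf{n}_{t,s} \tag{27a}\\
&= \sqrt{P}\sum_{k=0}^{L-1}\mathbf{Z}_{t,s-k}\mathbf{h}_c(k) + \mathbf{n}_{t,s} \tag{27b}\\
&= \sqrt{P}\left[\mathbf{Z}_{t,s} \;\; \mathbf{Z}_{t,s-1} \;\; \cdots \;\; \mathbf{Z}_{t,s-L+1}\right]\mathbf{h}_c + \mathbf{n}_{t,s} \tag{27c}\\
\mathbf{y}_t &= \begin{bmatrix}\mathbf{y}_{t,1} \\ \vdots \\ \mathbf{y}_{t,N_c}\end{bmatrix} = \sqrt{P}\,\mathbf{Z}_t\mathbf{h}_c + \mathbf{n}_t, \tag{27d}
\end{align}
$$

where $\mathbf{h}_c = [\mathbf{h}_c^T(0) \; \cdots \; \mathbf{h}_c^T(L-1)]^T$ is the $MK(N+1)L \times 1$ vector containing all unknown channel coefficients, and $\mathbf{Z}_t$ is an $MN_c \times MK(N+1)L$ block-circulant matrix with first block row $[\mathbf{Z}_{t,1} \;\; \mathbf{Z}_{t,N_c} \;\; \cdots \;\; \mathbf{Z}_{t,N_c-L+2}]$. Finally, assuming the channel is stationary over $T_o$ total OFDM symbols, we have

$$\mathbf{y} = \sqrt{P}\begin{bmatrix}\mathbf{Z}_1 \\ \vdots \\ \mathbf{Z}_{T_o}\end{bmatrix}\mathbf{h}_c + \mathbf{n} = \sqrt{P}\,\mathbf{Z}\mathbf{h}_c + \mathbf{n}. \tag{28}$$

The time-domain approach assumes only pilot data is transmitted first, followed by payload data. For $\mathbf{Z}$ to have full column rank, the total number of pilot samples must satisfy $N_cT_o \ge K(N+1)L$. While more OFDM symbols are likely required for the frequency domain method to obtain the same channel estimation accuracy, this is offset by the fact that data and pilots can be transmitted together.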

### III-C Single Antenna Scenarios

#### III-C1 Single Antenna UE

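The single-antenna UE case is often considered in the literature, since it simplifies the notation and reduces the algorithm complexity, but there is fundamentally little difference with the general multi-antenna UE case described above. The channel $\mathbf{G}$ becomes a $1 \times N$ row vector that we denote by $\mathbf{g}^T$, while the direct channel is simply an $M \times 1$ vector $\mathbf{h}_d$. The pilot data received at the BS is given by

$$\mathbf{y}_t = \sqrt{P}\,\left(\mathbf{h}_d + \mathbf{H}\,\mathrm{diag}(\mathbf{g}^*)\,\boldsymbol{\phi}_t\right)x_t + \mathbf{n}_t, \tag{29}$$

where the composite channel is now $\mathbf{H}_c = [\mathbf{h}_d \;\; \mathbf{H}\,\mathrm{diag}(\mathbf{g}^*)]$. The training overhead in this case is reduced to $T \ge N+1$ samples.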

#### III-C2 Single Antenna BS and UE

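When both the BS and UE have only a single antenna, we denote the RIS-BS channel as the row vector $\mathbf{h}^T$, and write the BS output and composite channel as

$$
\begin{align}
y_t &= \sqrt{P}\,\mathbf{h}_c^T\tilde{\boldsymbol{\phi}}_t x_t + n_t \tag{30a}\\
\mathbf{h}_c^T &= \begin{bmatrix}h_d & \mathbf{g}^H \odot \mathbf{h}^T\end{bmatrix} = \begin{bmatrix}h_d & \bar{\mathbf{h}}_c^T\end{bmatrix}, \tag{30b}
\end{align}
$$

where only the product $\bar{\mathbf{h}}_c$ is identifiable, and not $\mathbf{g}$ and $\mathbf{h}$ individually.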

### III-D Multiple User Scenarios

The models and approaches discussed above are easily generalized to the multiple UE case. Assuming UE $u$ has $K_u$ antennas, then the model in (1) holds if we simply set $K = \sum_u K_u$ and all $K$ UE antennas transmit orthogonal pilot sequences. Some prior work has proposed that the users take turns transmitting pilots, in which case there is no change to the algorithms described above, but this only makes sense if one exploits the fact that each user’s composite channel shares a common RIS-BS component $\mathbf{H}$. This idea will be explored further in the next subsection. For multicarrier signals, a scheme is required to allocate the pilot subcarriers to the UEs, but otherwise the channel estimation is the same. One implication for the LMMSE approach is that, assuming the channels for different UEs are uncorrelated, the UE-side correlation matrices $\mathbf{R}_{GU}$ and $\mathbf{R}_{H_dU}$ will be block-diagonal.

### III-E Reducing the Complexity and Training Overhead

As noted above, one of the key hurdles to overcome in CSI estimation for RIS-aided systems is the large required training overhead. Consequently, recent work has focused on a variety of methods to reduce this overhead, some of which are described below. The use of geometric channel models to reduce pilot overhead is reserved for Section IV.

#### III-E1 RIS Element Grouping

A simple approach to reduce the number of pilots and estimation complexity is to assign identical phases to RIS elements with highly correlated channels [60, 54]. High channel correlation occurs when adjacent RIS elements are closely spaced; retaining the flexibility of arbitrary phase shifts for such elements provides minimal additional beamforming gains. Suppose $N'$ groups of size $J$ are identified, and assume for simplicity that $N' = N/J$ is an integer and no direct channel is present. Then we define $\boldsymbol{\phi}_t = \boldsymbol{\phi}'_t \otimes \mathbf{1}_J$, where $\boldsymbol{\phi}'_t$ is $N' \times 1$, and write

$$\mathbf{H}_c\boldsymbol{\phi}_t = \mathbf{H}_c\left(\boldsymbol{\phi}'_t \otimes \mathbf{1}_J\right) = \mathbf{H}_c\left(\mathbf{I}_{N'} \otimes \mathbf{1}_J\right)\boldsymbol{\phi}'_t = \mathbf{H}'_c\boldsymbol{\phi}'_t, \tag{31}$$

where the effective composite channel is now $\mathbf{H}'_c = \mathbf{H}_c(\mathbf{I}_{N'} \otimes \mathbf{1}_J)$. Each column of $\mathbf{H}'_c$ is thus a unit-coefficient linear combination of the columns of $\mathbf{H}_c$ corresponding to a given group of RIS elements. The revised model is identical in form to the general case, and thus the methods described above can be implemented to estimate $\mathbf{H}'_c$ with a reduction in the required training overhead by a factor of $J$. A generalization of this idea presented in [54] successively reduces the size of the groups over multiple blocks of pilot and payload data in order to eventually resolve the channels for all of the RIS elements.
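The grouping identity in (31) is easy to confirm numerically. The sketch below assumes no direct path, contiguous groups of equal size, and arbitrary illustrative dimensions; the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
MK, N, J = 4, 12, 3        # rows of Hc (M*K), RIS elements, group size
Ng = N // J                # number of groups N'

# Composite channel without the direct-path column (MK x N).
Hc = rng.standard_normal((MK, N)) + 1j * rng.standard_normal((MK, N))

# Selection matrix I_{N'} (kron) 1_J maps one phase per group to all N elements.
S = np.kron(np.eye(Ng), np.ones((J, 1)))          # N x N'
Hc_grouped = Hc @ S                               # effective channel, MK x N'

phi_group = np.exp(1j * rng.uniform(0, 2 * np.pi, Ng))   # one phase per group
phi_full = np.kron(phi_group, np.ones(J))                # repeated within each group

identity_error = np.max(np.abs(Hc @ phi_full - Hc_grouped @ phi_group))
first_group_error = np.max(np.abs(Hc_grouped[:, 0] - Hc[:, :J].sum(axis=1)))
```

The effective channel has $N/J$ columns instead of $N$, so the number of unknowns, and hence the minimum training length, drops by the factor $J$.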

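#### III-E2 Low-Rank Channel Covariance

We see from the noise-free part of (5), $\sqrt{P}\,\mathbf{Z}\mathbf{h}_c$, that in the general case, the data matrix $\mathbf{Z}$ should be full rank $MK(N+1)$, since otherwise components of $\mathbf{h}_c$ in the nullspace of $\mathbf{Z}$ could not be identified. Like the LS approach, this requires $T \ge K(N+1)$ training samples. However, if $\mathbf{R}_{h_c}$ is rank deficient, then it would be enough for the column span of $\mathbf{Z}^T$ to lie within the column span of $\mathbf{R}_{h_c}$. In particular, suppose $\mathbf{R}_{h_c}$ is rank $r$, and thus can be factored as $\mathbf{R}_{h_c} = \mathbf{U}\mathbf{U}^H$, where $\mathbf{U}$ has $r$ columns. Then in principle it would be sufficient to choose

$$\mathbf{Z}^T = \mathbf{U}\mathbf{V} \tag{32}$$

for some full rank $r \times MT$ matrix $\mathbf{V}$, and thus theoretically it would be sufficient that $MT \ge r$. Unfortunately, due to constraints on the possible values for $\mathbf{Z}$, finding a $\mathbf{Z}$ that exactly satisfies (32) is generally not possible when $T$ is this small. It may however be possible to approximately solve (32) for larger values of $T$ that are still much smaller than $K(N+1)$, provided that $r$ is not too large. In addition to reducing the training overhead, the low rank channel covariance can be exploited to significantly reduce the cost of computing the LMMSE solution in (18), since only an $r \times r$ inverse rather than an $MT \times MT$ inverse is required:

$$\hat{\mathbf{h}}_c = \frac{\sqrt{P}}{\sigma^2}\,\mathbf{U}\left[\mathbf{I}_r - \mathbf{W}\left(\mathbf{W} + \frac{\sigma^2}{P}\mathbf{I}_r\right)^{-1}\right]\mathbf{U}^H\mathbf{Z}^H\mathbf{y}, \tag{33}$$

where $\mathbf{W} = \mathbf{U}^H\mathbf{Z}^H\mathbf{Z}\mathbf{U}$.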
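The low-rank form (33) can be compared against the general LMMSE expression (18). The sketch below assumes the factorization $\mathbf{R}_{h_c} = \mathbf{U}\mathbf{U}^H$ with an $r$-column factor and takes $\mathbf{W} = \mathbf{U}^H\mathbf{Z}^H\mathbf{Z}\mathbf{U}$, which is the definition that follows from applying the matrix inversion lemma to (18); the random training matrix and the dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
D, r, MT = 24, 3, 10       # length of h_c, covariance rank, number of observations
P, sigma2 = 1.5, 0.2

def crandn(*shape):
    """Unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

U = crandn(D, r)                    # assumed low-rank factor
R = U @ U.conj().T                  # rank-r channel covariance
Z = crandn(MT, D)                   # generic training matrix (illustration only)
h = U @ crandn(r)                   # channel draw lying in the span of U
y = np.sqrt(P) * Z @ h + np.sqrt(sigma2) * crandn(MT)

ZH = Z.conj().T
UH = U.conj().T

# General LMMSE (18): requires an MT x MT solve.
h_full = np.sqrt(P) * R @ ZH @ np.linalg.solve(P * Z @ R @ ZH + sigma2 * np.eye(MT), y)

# Low-rank form (33): requires only an r x r inverse.
W = UH @ ZH @ Z @ U
core = np.eye(r) - W @ np.linalg.inv(W + (sigma2 / P) * np.eye(r))
h_low = (np.sqrt(P) / sigma2) * U @ core @ (UH @ ZH @ y)

max_diff = np.max(np.abs(h_full - h_low))
```

The two estimates agree to numerical precision even though far fewer than $D$ observations are used, since the prior restricts the channel to the $r$-dimensional span of the factor.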

#### Iii-E3 Exploiting Common Channels

The LS method in Section III-A1 ignores the Kronecker product structure of the composite channel, which can be exploited to reduce the training overhead. The key observation is that, in the uplink, the composite channel for each user shares the same RIS-BS channel [57]. To explain how this information can be exploited, assume without loss of generality a scenario with single-antenna users. The approach is divided into two steps [48, 51]. In the first, one of the users is selected and the composite channel for this user is estimated in the normal way, while the other users do not transmit. Then, in the second step, the other users transmit and the estimate of the RIS-BS channel obtained in the first step is exploited to reduce the training required for the remaining channels.

Assume the users are ordered such that the user corresponding to the first row of , denoted by , is the one selected for the first step. The LS method is used to estimate the composite channel and the direct channel , which requires at least training samples. Recall that only the product is estimated and not the individual terms and . In fact, we can treat as in (2), so step 1 provides us with an estimate of , and we can set the first row of to . With the estimate , during step 2 the training data model is approximately given by

 yt ≃√P(~Hd+^~HΦt~GH)xt+nt (34a) ≃√PxTt⊗[IM^~HΦt]M×(K−1)(M+N)[~hd~g∗]+nt (34b) ≃√P~Zt~hc+nt, (34c)

where we drop the first column of to create , and we drop the first row of ones in , since UE 1 does not transmit. We also have defined and . Stacking of these training vectors together, we get an equation analogous to (5), where in this case is . Assuming linearly independent pilots and RIS reflection vectors are chosen, we can solve for the remaining channel parameters using provided that , or equivalently, . Given the samples needed for step 1, the minimum required training time is thus

 Tmin=(N+1)+(NM+1)(K−1), (35)

which for large is significantly less than the value required by the standard LS method.
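The step-2 solve in (34) can be sketched numerically as follows. This is a minimal NumPy illustration under assumptions that are not in the text: the step-1 estimate is taken to be perfect, noise is omitted, pilots and RIS phases are drawn at random, and the unknowns are stacked user by user as $[\tilde{\mathbf{h}}_{d,k};\,\tilde{\mathbf{g}}^*_k]$ so that they match the column order of the Kronecker product. The names `step2_ls`, `simulate`, and `min_training` are hypothetical, and `min_training` rounds the step-2 term of (35) up to an integer, which is also an assumption.

```python
import math
import numpy as np

def min_training(M, N, K):
    """Training length from (35); the step-2 term is rounded up (assumption)."""
    return (N + 1) + math.ceil((K - 1) * (M + N) / M)

def step2_ls(H_hat, phis, X, Y, P=1.0):
    """LS solve of the stacked model (34c).

    H_hat : (M, N) RIS-BS channel estimate from step 1
    phis  : (T, N) RIS reflection coefficients, one row per training slot
    X     : (T, K-1) pilots of the remaining users
    Y     : (T, M) received vectors, one row per training slot
    Returns direct channels (M, K-1) and RIS-side channels (N, K-1).
    """
    M, N = H_hat.shape
    T, Km1 = X.shape
    blocks = []
    for t in range(T):
        # [I_M, H_hat diag(phi_t)], size M x (M+N)
        B = np.hstack([np.eye(M), H_hat * phis[t]])
        # x_t^T kron B, size M x (K-1)(M+N)
        blocks.append(np.sqrt(P) * np.kron(X[t][None, :], B))
    Z = np.vstack(blocks)                      # TM x (K-1)(M+N)
    h_c, *_ = np.linalg.lstsq(Z, Y.reshape(-1), rcond=None)
    per_user = h_c.reshape(Km1, M + N)         # row k holds [h_d,k ; g_k]
    return per_user[:, :M].T, per_user[:, M:].T

def simulate(M=4, N=8, K=3, T=18, seed=0):
    """Noise-free toy run with random channels, pilots, and RIS phases."""
    rng = np.random.default_rng(seed)
    cn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
    H, Hd, Gh = cn(M, N), cn(M, K - 1), cn(N, K - 1)
    phis = np.exp(1j * rng.uniform(0, 2 * np.pi, (T, N)))
    X = cn(T, K - 1)
    # y_t = (H_d + H diag(phi_t) G^H) x_t
    Y = np.stack([(Hd + (H * phis[t]) @ Gh) @ X[t] for t in range(T)])
    Hd_est, Gh_est = step2_ls(H, phis, X, Y)
    return Hd, Gh, Hd_est, Gh_est
```

Calling `simulate()` returns the true and estimated step-2 channels for comparison. The default `T` is deliberately larger than the step-2 bound so that the stacked matrix is comfortably full rank for random pilots and phases; approaching the bound itself relies on the linear-independence condition stated above.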

## IV Estimation of Structured Channels

The large training overhead required for unstructured channel estimation motivates the consideration of channel models that are described by fewer parameters. Such models are often used in millimeter wave or higher frequency bands, where multipath scattering is sparse and propagation is often dominated by strong specular components. In such cases, the channels can be described by a small number of propagation paths defined by path gains, angles of arrival (AoAs), and angles of departure (AoDs). (For a very large RIS, where the BS or UEs are in the Fresnel region of the RIS, the channel parameterization must also include range or the 3-D coordinates of the various devices, and the large-scale fading becomes antenna-dependent. However, here we focus on the more common far-field scenario.) The resulting number of parameters is often one to two orders of magnitude smaller than that required in the unstructured case, and the training overhead is correspondingly reduced.

Parametric channels are described by the array response or “steering” vectors associated with the angle of an incoming (AoA) or outgoing (AoD) signal. For example, the response of an $M_x$-element uniform linear array (ULA) to a signal arriving with azimuth angle is described by the Vandermonde vector

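$$
\mathbf{a}_x(\omega_x) = \begin{bmatrix} 1 & e^{j\omega_x} & e^{j2\omega_x} & \cdots & e^{j(M_x-1)\omega_x} \end{bmatrix}^T, \tag{36}
$$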

where the spatial frequency $\omega_x$ is defined by , and is the distance in wavelengths between the antennas. (Note that we assume a narrowband propagation model here, where time delays can be represented by phase shifts. For large arrays, ignoring the frequency dependence of the model leads to the beam-squint effect [31].) For a uniform rectangular array (URA) with antenna separations of and in the and directions, the array response vector can be written as

$$
\mathbf{a}(\boldsymbol{\omega}) = \mathbf{a}_x(\omega_x) \otimes \mathbf{a}_y(\omega_y), \tag{37}
$$

where the vertical array response component is similar to (36),

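$$
\mathbf{a}_y(\omega_y) = \begin{bmatrix} 1 & e^{j\omega_y} & e^{j2\omega_y} & \cdots & e^{j(M_y-1)\omega_y} \end{bmatrix}^T, \tag{38}
$$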

but defined by with elevation AoA . The vector corresponds to a 2D spatial frequency. For either a ULA or URA, there is a one-to-one correspondence between the angles and spatial frequencies as long as the antenna separations are no more than one-half wavelength. This is important for applications involving localization, since the angles provide useful information for locating a signal source. However, from the viewpoint of channel estimation, it is enough to know the spatial frequencies, and any ambiguities in determining the angles need not be resolved.
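The steering vectors in (36) and (37) can be generated directly from the spatial frequencies. The following is a minimal NumPy sketch; the function names `ula_response` and `ura_response` are hypothetical, and the inputs are spatial frequencies in radians rather than physical angles.

```python
import numpy as np

def ula_response(m, omega):
    """Vandermonde response [1, e^{j w}, ..., e^{j (m-1) w}]^T of an m-element ULA."""
    return np.exp(1j * omega * np.arange(m))

def ura_response(mx, my, omega_x, omega_y):
    """URA response as the Kronecker product a_x(omega_x) kron a_y(omega_y), as in (37)."""
    return np.kron(ula_response(mx, omega_x), ula_response(my, omega_y))
```

Element `i * my + j` of `ura_response(mx, my, wx, wy)` equals `exp(1j * (i * wx + j * wy))`. Shifting a spatial frequency by a multiple of $2\pi$ leaves the vector unchanged, which is why the channel is determined by the spatial frequencies alone.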

In this section we focus on estimation of structured or geometric channel models. To simplify the discussion, we assume that the direct UE-BS channel is absent. This assumption is typical for scenarios with low-rank near-specular propagation at high frequencies, where blockages are common. The extra steps and pilot data required to estimate the direct channel when it is present generate minimal additional overhead. We will further assume that the BS and the UEs (when they have multiple antennas) employ ULAs, so that their array response depends on a single angle/spatial frequency, and we assume that the RIS elements are arranged as a URA, so its spatial response depends on two spatial frequencies. Generalizations to arbitrary array geometries are straightforward.

### Iv-a Parametric Estimation

The structured channel estimators that we will consider assume parametric channel models of the following form, which we describe first for the RIS-BS channel:

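$$
\begin{aligned}
\mathbf{H} &= \sum_{k=1}^{d_H} \gamma_{H,k}\,\mathbf{a}_B(\omega_{BH,k})\,\mathbf{a}_R^H(\boldsymbol{\omega}_{RH,k}) && \text{(39a)}\\
&= \mathbf{A}_B(\boldsymbol{\omega}_{BH})\,\boldsymbol{\Gamma}_H\,\mathbf{A}_R^H(\boldsymbol{\omega}_{RH}), && \text{(39b)}
\end{aligned}
$$

where the columns of

$$
\begin{aligned}
\mathbf{A}_B(\boldsymbol{\omega}_{BH}) &= \begin{bmatrix}\mathbf{a}_B(\omega_{BH,1}) & \cdots & \mathbf{a}_B(\omega_{BH,d_H})\end{bmatrix} && \text{(39c)}\\
\mathbf{A}_R(\boldsymbol{\omega}_{RH}) &= \begin{bmatrix}\mathbf{a}_R(\boldsymbol{\omega}_{RH,1}) & \cdots & \mathbf{a}_R(\boldsymbol{\omega}_{RH,d_H})\end{bmatrix} && \text{(39d)}
\end{aligned}
$$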

respectively represent the steering vectors for the propagation paths with AoAs at the BS and AoDs from the RIS. The diagonal matrix $\boldsymbol{\Gamma}_H$ contains the complex path gains $\gamma_{H,k}$. The RIS AoDs for path $k$, denoted by $\boldsymbol{\omega}_{RH,k}$, are written as vectors since the RIS spatial frequencies are two-dimensional:

$$
\boldsymbol{\omega}_{RH,k} = \begin{bmatrix}\omega_{RH,k,x} & \omega_{RH,k,y}\end{bmatrix}. \tag{40}
$$

Parametric models like (39) are usually employed when the number of paths $d_H$ is smaller than the array dimensions $M$ and $N$, and thus the channel is low-rank.
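As an illustration of the low-rank structure, the model (39b) can be assembled from steering matrices and its rank checked numerically. This is a sketch only: the function names, the ULA at the BS and URA at the RIS (as assumed earlier in this section), and the particular spatial frequencies and gains in the usage note are all illustrative choices rather than values from the text.

```python
import numpy as np

def ula_response(m, omega):
    """ULA response [1, e^{j w}, ..., e^{j (m-1) w}]^T."""
    return np.exp(1j * omega * np.arange(m))

def ura_response(mx, my, omega_x, omega_y):
    """URA response a_x(omega_x) kron a_y(omega_y)."""
    return np.kron(ula_response(mx, omega_x), ula_response(my, omega_y))

def parametric_channel(M, Nx, Ny, omega_B, omega_R, gains):
    """Build H = A_B diag(gains) A_R^H for d paths.

    omega_B : length-d sequence of BS spatial frequencies
    omega_R : d pairs (omega_x, omega_y) of RIS spatial frequencies
    gains   : length-d sequence of complex path gains
    """
    A_B = np.stack([ula_response(M, w) for w in omega_B], axis=1)        # M x d
    A_R = np.stack([ura_response(Nx, Ny, wx, wy) for wx, wy in omega_R],
                   axis=1)                                               # (Nx*Ny) x d
    return A_B @ np.diag(gains) @ A_R.conj().T
```

For example, `parametric_channel(8, 4, 4, [0.2, -1.0, 2.1], [(0.5, 0.1), (-0.7, 1.3), (1.9, -2.0)], [1.0, 0.5j, -0.3 + 0.2j])` yields an $8 \times 16$ matrix built from three paths, so its rank is three even though the matrix has 128 complex entries.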

Parametric CSI estimation involves finding the spatial frequencies of signals collected by an array. For example, suppose observations are available from an arbitrary -element array receiving signals from directions:

$$
\mathbf{Y}' = \mathbf{A}(\boldsymbol{\omega}')\,\mathbf{S}' + \mathbf{N}', \tag{41}
$$

where is , is , is noise, is the array response matrix, and . The matrix is not typically assumed to be known. This is the classical model assumed for AoA estimation, and many methods have been developed to estimate from . The simplest method is based on (matched filter) beamforming, which involves searching for peaks in the spectrum

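$$
p_B(\boldsymbol{\omega}) = \mathbf{a}^H(\boldsymbol{\omega})\,\mathbf{R}_{Y'}\,\mathbf{a}(\boldsymbol{\omega}), \tag{42}
$$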

where $\mathbf{R}_{Y'}$ is the sample covariance matrix

$$
\mathbf{R}_{Y'} = \frac{1}{n}\,\mathbf{Y}'\mathbf{Y}'^H. \tag{43}
$$
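The beamforming search in (42) and (43) can be sketched for the one-dimensional case as follows. This is a minimal NumPy illustration that assumes a ULA, a uniform grid of candidate spatial frequencies, and a single source in the toy data; the function names and all numerical values are hypothetical.

```python
import numpy as np

def ula_response(m, omega):
    """ULA response [1, e^{j w}, ..., e^{j (m-1) w}]^T."""
    return np.exp(1j * omega * np.arange(m))

def beamforming_spectrum(Y, grid):
    """Evaluate p_B(w) = a^H(w) R a(w) on a grid, with R = Y Y^H / n as in (43)."""
    m, n = Y.shape
    R = Y @ Y.conj().T / n
    A = np.exp(1j * np.outer(np.arange(m), grid))   # column g is a(grid[g])
    return np.real(np.einsum('ig,ij,jg->g', A.conj(), R, A))

def toy_data(m=8, n=200, omega0=0.7, noise_std=0.1, seed=0):
    """One source at spatial frequency omega0 observed in white noise (toy setup)."""
    rng = np.random.default_rng(seed)
    s = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    noise = noise_std * (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n)))
    return np.outer(ula_response(m, omega0), s) + noise
```

A typical use is `grid = np.linspace(-np.pi, np.pi, 721)` followed by `grid[np.argmax(beamforming_spectrum(toy_data(), grid))]`, which returns the grid point where the spectrum peaks. With several closely spaced sources the peaks of this spectrum merge, since its resolution is limited by the array aperture.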

Alternatively, one can employ higher resolution algorithms such as MUSIC [38] or ESPRIT [37], which require computation of the eigendecomposition of $\mathbf{R}_{Y'}$. The beamforming and MUSIC spectra are either one- or two-dimensional functions, depending on whether the steering vectors depend on one- or two-dimensional spatial frequencies. If $\mathbf{N}'$ is spatially and temporally white, the (deterministic) maximum likelihood (ML) method [40] finds the AoA estimates from the -dimensional (or -dimensional for azimuth/elevation angles) problem

 ^ω′ML=argminω% trace(P⊥