I System Model
We consider a mmWave MIMO link to send data streams using a transmitter with antennas and a receiver with antennas. Both the transmitter and the receiver use a fully-connected hybrid MIMO architecture as shown in Fig. 1, with and RF chains. A hybrid precoder is used, with , where is the analog precoder and the digital one. The RF precoder and combiner are implemented using a fully connected network of phase shifters, as described in .
The MIMO channel between the transmitter and the receiver is modeled as a matrix denoted as , which is assumed to be a sum of the contributions of spatial clusters, each contributing with rays. This matrix is given by 
where denotes the path loss, is the number of scattering clusters, is the number of rays for -th cluster, is the complex gain of the -th ray within -th cluster, and are the angles of arrival and departure (AoA/AoD), respectively of the -th ray within -th cluster, and and are the array steering vectors for the receive and transmit antennas respectively. This matrix can be written in a more compact way as
where is diagonal with non-zero complex entries, and and contain the receive and transmit array steering vectors and , respectively. can be approximated using the extended virtual channel representation [mmWavetutorial] as
where is a sparse matrix which contains the path gains of the quantized spatial frequencies in the non zero elements. The dictionary matrices and contain the transmitter and receiver array response vectors evaluated on grids of sizes and . Assuming that the receiver applies a hybrid combiner , with the analog combiner, and the baseband combiner, the received signal at discrete time instant can be written as
for . The signal , is a training sequence known to the receiver, is the unknown carrier frequency offset (CFO), and
is the circularly symmetric complex Gaussian distributed additive noise vector.
We consider that the training sequence can be expressed as , with a spatial filter consisting of normalized QPSK symbols. Let us define , and , where is the Cholesky factor of , i.e., . Then, if we stack the samples of the received signal in (4) we obtain the signal model
such that . Then, the vector of parameters to be estimated is .
We define the Signal to Noise Ratio (SNR) for the -th signal , as
Furthermore, the average Signal to Noise Ratio at digital level can be defined as well. Let us define as . Therefore, if the SNR at baseband level is denoted as , it can be written as
which is just the average of the post-combining at each RF chain.
II Regularity Condition
For the CRLB to exist, the regularity condition must be fulfilled by the probability density function (pdf) of the data. This condition states
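For reference, the regularity condition as it is commonly stated in estimation theory textbooks (for instance Kay's Estimation Theory) is reproduced below. The symbols $\mathbf{x}$ (data vector) and $\boldsymbol{\theta}$ (parameter vector) are generic placeholders assumed here, and may differ from the notation used in this paper.

```latex
\mathbb{E}\left[ \frac{\partial \ln p(\mathbf{x};\boldsymbol{\theta})}{\partial \boldsymbol{\theta}} \right] = \mathbf{0}
\quad \text{for all } \boldsymbol{\theta},
```

where the expectation is taken with respect to $p(\mathbf{x};\boldsymbol{\theta})$.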
The pdf of the vector is written as
and the log-likelihood function (LLF) as
Before computing the gradient of the LLF, it is important to note that
Therefore, for ,
Now, for ,
Finally, for ,
since the term inside brackets is also purely imaginary.
Since the regularity condition holds, the CRLB exists and is derived in the next section.
III Cramér-Rao Lower Bound
Since the model for the received signal is Gaussian, the Slepian-Bangs formula can be used to find the elements in the Fisher Information Matrix. This formula is given by 
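For reference, the textbook form of the Slepian-Bangs formula for a complex Gaussian vector $\mathbf{x} \sim \mathcal{CN}(\boldsymbol{\mu}(\boldsymbol{\theta}), \mathbf{C}(\boldsymbol{\theta}))$ is reproduced below. The symbols $\boldsymbol{\mu}$, $\mathbf{C}$ and $\theta_i$ are generic placeholders assumed here, not necessarily the notation of this paper.

```latex
[\mathbf{I}(\boldsymbol{\theta})]_{ij}
= \operatorname{tr}\left[ \mathbf{C}^{-1} \frac{\partial \mathbf{C}}{\partial \theta_i}
                          \mathbf{C}^{-1} \frac{\partial \mathbf{C}}{\partial \theta_j} \right]
+ 2\,\operatorname{Re}\left[ \frac{\partial \boldsymbol{\mu}^{H}}{\partial \theta_i}
                             \mathbf{C}^{-1} \frac{\partial \boldsymbol{\mu}}{\partial \theta_j} \right]
```

When the covariance is a scaled identity that depends only on the noise variance, the trace term is non-zero only for the noise-variance entry, and the second term is non-zero only for the parameters that enter through the mean.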
Thus, the elements in the FIM can be found as
The partial derivative of is given by
being a vector of zeros with a single one in the -th position. Therefore,
For the phase offset parameter,
Then, the Fisher Information for yields
For the carrier frequency offset parameter,
For the noise variance,
For the off-diagonal elements in the FIM, it can be checked that all of them are zero-valued except for
Then, the FIM is found to be
Finally, upon inverting the block diagonal matrix , the CRLB for the parameters is given by
Finally, the in the -th RF chain can be distinguished from the average of the individual s of the different signals involved in the problem. If the of the -th signal is denoted by , then
for which, according to the Transformed Parameters Property, the CRLB can be found to be
The last formulas provide a clear insight on how the estimation of actually works. For lower values of , the CRLB is dominated by the term growing linearly with the parameter, whereas for higher values the CRLB is dominated by the second term, making the estimation of the parameter much harder. The same happens for , although it actually depends on the different for the different streams.
Henceforth, our effort is focused on finding suitable estimators of these parameters. Owing to the non-linear dependence between the data and the parameters, an efficient estimator cannot be found in general. The practical approach is to seek the ML estimators for these parameters, to which the next section is devoted.
IV Maximum Likelihood Estimation
The problem of finding the ML Estimator for the parameters in can be formalized as
which involves a joint maximization over scalar variables. This problem can be solved in four different steps by splitting the original problem into four interconnected optimization problems.
IV-A MLE for the Phase Offset
Recall that the LLF is given by
such that the term that depends on is the second term inside brackets. Therefore, the value that maximizes the function above is found from
The first derivative of the function to maximize yields
Now the following result can be applied.
Lemma. Given two complex numbers, , , the imaginary part of their product is defined as
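For reference, the standard identity for the imaginary part of a product is written out below, both in Cartesian and in polar form. The symbols $z_1 = |z_1| e^{j\phi_1}$ and $z_2 = |z_2| e^{j\phi_2}$ are assumed here for illustration and may differ from the notation of the lemma.

```latex
\operatorname{Im}\{z_1 z_2\}
= \operatorname{Re}\{z_1\}\operatorname{Im}\{z_2\} + \operatorname{Im}\{z_1\}\operatorname{Re}\{z_2\}
= |z_1|\,|z_2| \sin(\phi_1 + \phi_2)
```

In particular, the imaginary part vanishes when $\phi_1 + \phi_2$ is an integer multiple of $\pi$, which is the kind of condition that arises when a derivative of this form is set to zero.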
Therefore, setting the previous derivative to zero allows us to obtain
which can be interpreted as a matched-filtering operation with the training sequence after the carrier frequency offset has been corrected. Notice that the ML estimator of requires the knowledge of the true value . Since it is impossible to know the exact value of , the ML estimator of is to be substituted in (47) instead, such that the final estimator can be applied in a practical scenario.
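The matched-filtering interpretation can be sketched in a few lines of Python. The sketch assumes a per-chain signal model of the form y[n] = alpha * exp(j(2*pi*cfo*n + beta)) * s[n] + noise, with the CFO normalized to cycles per sample. The function name `estimate_phase` and all numerical values are hypothetical, chosen only for illustration.

```python
import cmath
import math


def estimate_phase(y, s, cfo):
    """Phase-offset estimate: angle of the matched-filter output
    computed after removing the (normalized) CFO from y."""
    c = sum(
        yn * sn.conjugate() * cmath.exp(-2j * math.pi * cfo * n)
        for n, (yn, sn) in enumerate(zip(y, s))
    )
    return cmath.phase(c)


# Noise-free check with hypothetical values.
QPSK = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
s = [QPSK[n % 4] / math.sqrt(2) for n in range(32)]  # unit-power symbols
alpha, beta, cfo = 0.8, 0.6, 0.01
y = [alpha * cmath.exp(1j * (2 * math.pi * cfo * n + beta)) * sn
     for n, sn in enumerate(s)]
beta_hat = estimate_phase(y, s, cfo)
print(beta_hat)
```

In a practical setting the argument `cfo` would be the CFO estimate rather than the true value, so any CFO estimation error propagates into the phase estimate.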
IV-B MLE for the Amplitude
In this subsection we provide a closed-form expression for the amplitude parameters, which have already been denoted as . By inspection of the LLF
it can be seen that the terms that depend on this parameter are the last two within brackets. Therefore, the problem of finding is formalized as
Taking the first derivative of the objective function yields
such that the value of that maximizes the function above is
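A minimal sketch of a closed-form amplitude estimate is given below. It assumes the per-chain model y[n] = alpha * exp(j(2*pi*cfo*n + beta)) * s[n] + noise with a real, non-negative alpha, for which the least-squares amplitude, once the phase estimate is substituted, reduces to the magnitude of the CFO-corrected matched-filter output divided by the training energy. The function name and values are hypothetical.

```python
import cmath
import math


def estimate_amplitude(y, s, cfo):
    """Amplitude estimate under the assumed model: |matched-filter output|
    divided by the energy of the training sequence."""
    c = sum(
        yn * sn.conjugate() * cmath.exp(-2j * math.pi * cfo * n)
        for n, (yn, sn) in enumerate(zip(y, s))
    )
    energy = sum(abs(sn) ** 2 for sn in s)
    return abs(c) / energy


# Noise-free check with hypothetical values.
QPSK = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
s = [QPSK[n % 4] / math.sqrt(2) for n in range(32)]
alpha, beta, cfo = 0.8, 0.6, 0.01
y = [alpha * cmath.exp(1j * (2 * math.pi * cfo * n + beta)) * sn
     for n, sn in enumerate(s)]
alpha_hat = estimate_amplitude(y, s, cfo)
print(alpha_hat)
```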
IV-C MLE for the Carrier Frequency Offset
In this subsection we will find the ML estimator of the carrier frequency offset. In (48), the only term that depends on is the second one. Therefore, the problem of finding the ML estimator for this parameter can be stated as
We can express the objective function in (52) as
in which the statistics and are to be substituted. These are given by
Accordingly, the ML estimator for the carrier frequency offset can be understood as the frequency bin that maximizes the sum of the squared periodograms of the received signals, after a temporal matched filter. Since the periodogram is defined as the Fourier transform of the autocorrelation, its maximum must be found numerically, using the Fast Fourier Transform (FFT).
The FFT yields a sampled version of the DTFT, so there is no guarantee that the carrier frequency offset falls exactly on an integer bin of the FFT. To refine the estimate, the three largest points in the FFT are located and parabolic interpolation is performed on them. This approach approximates the true maximum of the objective function: the exact maximizer is generally not recovered, but an accurate estimate of it is obtained.
IV-D Quadratic Interpolation of Spectral Peaks
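The grid search described in this subsection can be sketched as follows. The sketch assumes one received sequence per RF chain, each modeled as y_i[n] = alpha_i * exp(j(2*pi*cfo*n + beta_i)) * s[n] + noise, with the CFO normalized to cycles per sample. A direct DFT is used instead of an FFT only to keep the example free of dependencies; in practice a zero-padded FFT such as `numpy.fft.fft(z, n=n_fft)` evaluates the same grid. All names and values are hypothetical.

```python
import cmath
import math


def cfo_metric(ys, s, n_fft):
    """Sum over RF chains of |DFT{y_i[n] s*[n]}|^2 on an n_fft-point grid."""
    metric = [0.0] * n_fft
    for y in ys:
        z = [yn * sn.conjugate() for yn, sn in zip(y, s)]  # matched filter
        for k in range(n_fft):
            acc = sum(zn * cmath.exp(-2j * math.pi * k * n / n_fft)
                      for n, zn in enumerate(z))
            metric[k] += abs(acc) ** 2
    return metric


def coarse_cfo(ys, s, n_fft):
    """Normalized CFO (cycles/sample) at the largest bin of the metric."""
    metric = cfo_metric(ys, s, n_fft)
    k = max(range(n_fft), key=metric.__getitem__)
    if k > n_fft // 2:
        k -= n_fft  # upper half of the grid represents negative offsets
    return k / n_fft


# Noise-free check: two chains, CFO placed exactly on bin 8 of 128.
QPSK = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
s = [QPSK[n % 4] / math.sqrt(2) for n in range(32)]
cfo = 8 / 128
ys = [[a * cmath.exp(1j * (2 * math.pi * cfo * n + b)) * sn
       for n, sn in enumerate(s)]
      for a, b in [(0.8, 0.6), (1.2, -1.0)]]
cfo_hat = coarse_cfo(ys, s, 128)
print(cfo_hat)
```

When the true offset lies between two grid points, the returned value is only accurate to within half a bin, which is what motivates the interpolation step.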
This subsection explains how parabolic interpolation can be applied to obtain a better estimate of the carrier frequency offset. Assume that the shape of the Energy Spectral Density (ESD) of the received stochastic process is the one pictured in Figure 2. Then, the equation of a parabola is given by
where is the interpolated peak location (in FFT bins). At the three nearest samples, the images of the parabola at these points are
Then, expressing the three samples in terms of the interpolating parabola, it is found that
Therefore, the interpolated peak location in bins is given by
such that the amplitude at that bin is
Finally, by using (60), an estimate of the interpolated peak location can be obtained. If denotes the FFT bin that yields the maximum value, then yields the absolute position of the interpolated bin. Then, the analog frequency estimate is expressed as
where is the sampling frequency in Hertz and is the number of points in the FFT.
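The interpolation steps above can be sketched as follows, using the standard three-point parabolic fit. The names `y_m1`, `y_0` and `y_p1` are hypothetical labels for the objective values at bins k-1, k and k+1, where k is the bin with the largest value; they are not the symbols used in this paper.

```python
def parabolic_peak(y_m1, y_0, y_p1):
    """Fit a parabola through samples at bins k-1, k, k+1 (k = largest bin).

    Returns (p, peak): p is the offset of the vertex from bin k, in bins,
    and peak is the value of the parabola at the vertex.
    """
    denom = y_m1 - 2.0 * y_0 + y_p1
    if denom == 0.0:
        # Three collinear samples: no curvature, keep the integer bin.
        return 0.0, y_0
    p = 0.5 * (y_m1 - y_p1) / denom
    peak = y_0 - 0.25 * (y_m1 - y_p1) * p
    return p, peak


def bin_to_hz(k, p, fs, n_fft):
    """Convert the interpolated bin k + p to an analog frequency in Hz."""
    return (k + p) * fs / n_fft


# Samples of the exact parabola 5 - 2 (x - 0.3)^2 at x = -1, 0, 1.
p, peak = parabolic_peak(1.62, 4.82, 4.02)
f_hat = bin_to_hz(10, p, fs=1000.0, n_fft=100)
print(p, peak, f_hat)
```

When `y_0` is strictly the largest of the three samples, the offset `p` lies between -0.5 and 0.5 bins, so the refined estimate never moves past a neighboring bin.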
IV-E MLE for the Noise Variance
This subsection is devoted to finding the MLE for the noise variance parameter, . Recalling the log-likelihood function
the MLE for the noise variance can be easily found from
The condition for the derivative to vanish is that
where is the estimate of the matrix that models how the signal energy is transferred from the transmitter to the receiver. The estimate is computed using the ML estimators of the previous parameters.
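A minimal sketch of this step is shown below. It assumes white circularly symmetric complex Gaussian noise with variance sigma^2 per complex sample and the per-chain model y_i[n] = alpha_i * exp(j(2*pi*cfo*n + beta_i)) * s[n] + noise, in which case the estimate is the average power of the residual left after subtracting the reconstructed signal. The function name and the test values are hypothetical.

```python
import cmath
import math


def noise_variance(ys, s, alphas, betas, cfo):
    """Average residual power over all RF chains and time samples."""
    total, count = 0.0, 0
    for y, a, b in zip(ys, alphas, betas):
        for n, (yn, sn) in enumerate(zip(y, s)):
            model = a * cmath.exp(1j * (2 * math.pi * cfo * n + b)) * sn
            total += abs(yn - model) ** 2
            count += 1
    return total / count


# Deterministic check: a constant residual of 0.1 has power 0.01.
QPSK = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
s = [QPSK[n % 4] / math.sqrt(2) for n in range(16)]
alphas, betas, cfo = [0.8, 1.2], [0.6, -1.0], 0.01
ys = [[a * cmath.exp(1j * (2 * math.pi * cfo * n + b)) * sn + 0.1
       for n, sn in enumerate(s)]
      for a, b in zip(alphas, betas)]
sigma2_hat = noise_variance(ys, s, alphas, betas, cfo)
print(sigma2_hat)
```

In practice the arguments `alphas`, `betas` and `cfo` would be the estimates obtained in the previous steps rather than the true values.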
IV-F MLE for the Signal to Noise Ratio
This last subsection is devoted to providing the ML estimator for the metric. To find this statistic, the following property of ML estimation can be applied .
Invariance Property of the MLE: The MLE of the parameter , where the pdf is parameterized by , is given by
where is the MLE of .
This nice property allows us to find the MLE for , which is simply
where and are the MLEs for and provided in the previous subsections. As a summary, the ML estimators for the parameters are
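As an illustration of the invariance property, the sketch below plugs amplitude and noise-variance estimates into an SNR expression. It assumes a unit-power training sequence, so that the per-chain SNR is alpha^2 / sigma^2, and an average SNR defined as the mean of the per-chain values in linear scale. Both definitions and all names are assumptions made for this example.

```python
import math


def snr_db(alpha_hat, sigma2_hat):
    """Plug-in SNR estimate for one RF chain, in dB."""
    return 10.0 * math.log10(alpha_hat ** 2 / sigma2_hat)


def average_snr_db(alpha_hats, sigma2_hat):
    """Plug-in estimate of the SNR averaged over RF chains, in dB."""
    mean_linear = sum(a ** 2 for a in alpha_hats) / (len(alpha_hats) * sigma2_hat)
    return 10.0 * math.log10(mean_linear)


# Hypothetical estimates.
snr_chain = snr_db(2.0, 0.04)
snr_avg = average_snr_db([1.0, 3.0], 0.5)
print(snr_chain, snr_avg)
```

No separate optimization is needed for the SNR: the function of the parameters is simply evaluated at their ML estimates.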
To assess the performance of the different estimators, some results are presented hereafter. In the simulations, a range from dB to dB has been considered for , which is a typical range in millimeter wave communications. The amplitude parameters are generated from a distribution. The phase-offsets are generated from a and the carrier frequency offset is generated as . The number of antennas is set to and , and the number of RF chains employed at both transmitter and receiver are set to and . The training sequence , contains normalized QPSK symbols, with time-domain samples. The results have been averaged over Monte Carlo realizations.
The normalized sample bias of the different parameters is shown in Figure 4. As predicted by estimation theory, in the low regime the sample bias of the parameters is non-zero, whereas it decreases to zero asymptotically with . The estimators are therefore asymptotically unbiased.
Now, the efficiency of the estimators is evaluated through the normalized sample variance. The normalized sample variances of the different estimators are shown in Figures 5, 6, 7 and 8, along with their corresponding normalized CRLBs (NCRLBs). The normalized sample variance of the estimators does not attain the NCRLB for low values of , as expected. Nevertheless, for high values of , the normalized sample variance attains the NCRLB. Therefore, the estimators are asymptotically efficient.
-  R. Méndez-Rial, C. Rusu, N. González-Prelcic, A. Alkhateeb, and R. W. Heath Jr., “Hybrid MIMO architectures for millimeter wave communications,” in Proc. IEEE Globecom, 2014.
-  S. M. Kay, Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory. Prentice Hall PTR, 1993.