One goal of future wireless communications (the emerging 5G or beyond 5G) is to support reliable communications in high-mobility scenarios, such as on high-speed railways with a speed of up to 500 km/h [1, 2] or on vehicles with a speed of up to 300 km/h [3, 4]. Currently, the dominant modulation technique for 4G and the emerging 5G is orthogonal frequency division multiplexing (OFDM). For the high-mobility scenarios, OFDM may experience significant inter-carrier interference (ICI) due to the Doppler spread of time-variant channels (which are also referred to as doubly selective or doubly dispersive channels). ICI will severely degrade the performance of OFDM systems when the traditional transceivers are used .
To cope with ICI, some modifications of the traditional OFDM were proposed at the cost of more complicated transceiver design. Linear equalization [6, 7, 8, 9] and non-linear equalization [10, 11, 12] were proposed to eliminate the ICI at the receiver. Some transmitter processing methods to mitigate ICI were proposed including polynomial cancellation coding [13, 14] and pulse shaping [15, 16]. Using both transmitter and receiver processing, a channel-independent block spreading based multiple access scheme was proposed for the multi-user scenarios, where both ICI and multiuser interference can be eliminated .
Instead of trying to eliminate ICI, there are some modulation schemes proposed for time-variant channels to enhance the system performance by using the transmit diversity. A frequency-oversampling technique for zero-padded OFDM system was proposed in
, where frequency diversity can be achieved through transmit signal design. Vector OFDM transmits multiple groups of linearly precoded symbols over the channel subcarriers to provide frequency diversity. A Doppler-resilient orthogonal signal division multiplexing technique was proposed. That multiplexes several data vectors and a pilot vector into a data stream to fully exploit the frequency-time diversity in the time-variant channels [20, 21].
Orthogonal time frequency space (OTFS) is an alternative to OFDM to tackle the time-variant channels [22, 23, 24]. Leveraging the basis expansion model (BEM) for the channel [25, 26], OTFS converts the time-variant channels into the time-independent channels in the delay-Doppler domain. Accordingly, the information bearing data is multiplexed into the roughly constant channels in the delay-Doppler domain. OTFS is different from previous work in that it multiplexes data in the delay-Doppler domain.
Like OFDM multi-antenna systems, OTFS with massive multiple-input multiple-output (MIMO) can further increase the spectrum efficiency. Such benefits require that the channel state information (CSI) is known at the transmitter to design the transmit beamforming vectors [27, 28, 29]. When OTFS massive MIMO systems are operated with frequency division duplex (FDD) mode, downlink channel estimation is necessary due to the lack of channel reciprocity. With a large number of antennas equipped at the base station (BS) in OTFS massive MIMO systems, downlink channel estimation is challenging.
The time-variant channel estimation schemes for massive MIMO systems have been proposed in [30, 31, 32]. In , the time-variant MIMO channels are modeled by several jointly sparse time-independent coefficients based on the BEM. These jointly sparse coefficients can be estimated through a distributed compressive sensing algorithm with high accuracy. In , a spatial-domain BEM was developed to further reduce the effective dimensions of massive MIMO time-variant channels, such that the downlink training overhead can be reduced. Moreover, a general framework of compressed channel sensing was provided in . Based on the sparse multipath structure of massive MIMO time-variant channels, compressed channel sensing can achieve a target estimation error using much less overhead. The aforementioned channel estimation techniques [30, 31, 32] were proposed for OFDM massive MIMO systems. They are not directly applicable for OTFS massive MIMO systems. This is because that the information bearing data is multiplexed in the delay-Doppler domain in OTFS systems, not the frequency-time domain as in OFDM systems.
For OTFS systems, an impulse based channel estimation technique was proposed for the OTFS single-input single-output (SISO) architecture in . The BS transmits an impulse in the delay-Doppler domain as the training pilots. The received signals in the delay-Doppler domain can be regarded as a two-dimensional periodic convolution of the transmit impulse with the delay-Doppler channel . The delay-Doppler channel can then be estimated from the received signal. An alternative method using PN sequences as the training pilots in the delay-Doppler domain was proposed for OTFS SISO systems . In that method, channel estimation is done in the discrete domain, where three quantities of interest, namely, delay shift, Doppler shift, and fade coefficient are estimated. Then the delay-Doppler channel can be calculated accordingly. The impulse-based scheme is extended to OTFS MIMO systems by transmitting several impulses with proper guard between two adjacent impulses to distinguish different BS antennas . The existing channel estimation techniques can not be directly extended to OTFS massive MIMO since a large number of antennas are required to be distinguished by transmitting orthogonal pilots, which will lead to high pilot overhead.
To solve this problem, we propose a 3D structured orthogonal matching pursuit (3D-SOMP) algorithm based downlink channel estimation technique for OTFS massive MIMO systems, which can achieve accurate CSI with low pilot overhead. The specific contributions are summarized as follows.
We present the discrete-time formulation of OTFS systems and demonstrate that the OTFS massive MIMO channel exhibits a delay-Doppler-angle 3D structured sparsity. Since the number of dominant propagation paths is limited, the 3D channel is sparse along the delay dimension. As the Doppler frequency of a path is usually much smaller than the system bandwidth, the 3D channel is block-sparse along the Doppler dimension. The only one non-zero block is concentrated around zero, but the length of the non-zero block is unknown. Since the angle-of-departure (AoD) spread of a path at the BS is usually small, the 3D channel is burst-sparse along the angle dimension . The lengths of non-zero bursts can be regarded as constant, but the start position of each non-zero burst is unknown.
Based on the 3D structured sparse channel, we formulate the downlink channel estimation problem in OTFS massive MIMO systems as a sparse signal recovery problem. The estimator makes use of the training pilots that are transmitted in the delay-Doppler domain. We propose that pilots of different antennas are independent complex Gaussian random sequences, which overlap to reduce the overall pilot overhead. By inserting guard intervals between pilots and data, the received pilots can be expressed as a phase compensated two-dimensional periodic convolution of the transmit pilots with the delay-Doppler channel. Decomposing the channel based on its structure, we formulate the downlink channel estimation problem as a sparse signal recovery problem.
We propose a 3D-SOMP algorithm to solve the formulated channel estimation problem. The main idea is summarized as follows. The 3D support of each path is estimated in an one-by-one fashion. For each path, the user first estimates the delay-dimension support. Then, by using the block-sparse property of channels along the Doppler dimension, the user estimates the size of the unique non-zero block to obtain the Doppler-dimension support. Finally, the user transforms the burst-sparsity of channels along the angle dimension into the traditional block-sparsity through a lifting transformation following , so that the angle-dimension support can be estimated accordingly. In this way, the whole 3D channel can be estimated after several iterations by removing the contribution of previous paths in each iteration.
The rest of the paper is organized as follows. In Section II, we present the system model. In Section III, we review the channel estimation in OTFS SISO systems. Then, we propose a 3D-SOMP based channel estimation technique for OTFS massive MIMO systems in Section IV. Simulation results are given in Section V. Our conclusions are finally drawn in Section VI.
Notation: Boldface capital letters stand for matrices and lower-case letters stand for column vectors. The transpose, conjugate, conjugate transpose, and inverse of a matrix are denoted by , , and , respectively. is the Hadamard product operator. is the -norm of the vector . is the Moore-Penrose pseudo-inverse of . Finally,
denotes the identity matrix of size.
Ii System Model
In this section, we review OTFS for SISO systems including a discrete-time formulation of OTFS modulation and OTFS demodulation. Then, we describe an extension of OTFS into massive MIMO systems.
Fig. 1 shows the OTFS SISO architecture as commonly assumed in [22, 23, 24]. OTFS is a modulation/demodulation technique. It can be realized by adding a pre-processing block before a traditional modulator in the frequency-time domain such as OFDM modulator at the transmitter, and a corresponding post-processing block after a traditional demodulator in the frequency-time domain such as OFDM demodulator at the receiver. Through the pre-processing and post-processing blocks, time-variant channels are converted into the time-independent channels in the the delay-Doppler domain. Therefore, the information bearing data can be multiplexed in the roughly constant delay-Doppler channel. At the same time, the transmit data in OTFS systems can take advantage of full diversity in the frequency-time channels. In this way, OTFS improves system performance over OFDM in high-mobility scenarios [22, 23, 24].
Ii-a OTFS SISO Modulation
In this section, we describe the modulation at the transmitter. A quadrature amplitude modulated (QAM) data sequence of length is first rearranged into a 2D data block. This is called a 2D OTFS frame in the delay-Doppler domain , where and are the numbers of resource units along the delay dimension and Doppler dimension. OTFS modulation at the transmitter is composed of a pre-processing block and a traditional frequency-time modulator such as OFDM or filter bank multicarrier (FBMC). The pre-processing block maps the 2D data block in the delay-Doppler domain to a 2D block
in the frequency-time domain. It is realized by using an inverse symplectic finite Fourier transform (ISFFT) and a transmit windowing function. The ISFFT ofis 
where and are discrete Fourier transform (DFT) matrices. A transmit windowing matrix multiplies element-wise to produce the 2D block in the frequency-time domain as
There are several uses of the windowing matrix. For example, the windowing matrix can be designed to randomize the phases of the transmitted symbols to eliminate the inter-cell interference . In this paper, we assume a trivial window at the transmitter for simple expression, i.e., is a matrix of all ones.
Then, the 2D block in the frequency-time domain is transformed to the 1D transmit signal through a traditional frequency-time modulator such as OFDM or FBMC. Assuming an OFDM modulator, the -point inverse DFT (IDFT) is applied on each column of to obtain the 2D transmit signal block , i.e.,
where . Each column vector of can be regarded as an OFDM symbol. Note that OFDM symbols occupy the bandwidth and have the duration , where and are the subcarrier spacing and symbol duration. By combing (1)-(3),
To avoid inter-symbol interference between blocks, the OFDM modulator usually adds cyclic prefix (CP) for each OFDM symbol via a CP addition matrix  with being the length of CP. By reading the 2D transmit signal block column-wise, the 1D transmit signal is
Ii-B OTFS SISO Demodulation
In this section, we describe demodulation at the receiver. The -th element of the received signal after the time-variant channel with length is expressed as
where is the additive noise at the receiver. The OTFS demodulation at the receiver consists of a traditional frequency-time demodulator such as the OFDM or FBMC demodulator and a post-processing block as shown in Fig. 1. The frequency-time demodulator transforms the received signal to a 2D block in the frequency-time domain . Specifically, assuming an OFDM demodulator, the received signal is first rearranged as a matrix of size , i.e.,
where each column vector of can be regarded as a received OFDM symbol including CP. Then, the OFDM demodulator removes the CP by multiplying with a CP removal matrix  to obtain the OFDM symbols without CPs. Applying the -point DFT on each OFDM symbol without CP (i.e., each column vector of ), we obtain the received 2D block in the frequency-time domain as
In the post-processing block, is transformed to the 2D data block in the delay-Doppler domain. It is realized by a receive windowing matrix and the SFFT. The receive windowing matrix multiplies element-wise, i.e.,
Then, the SFFT is applied for to obtain the 2D data block in the delay-Doppler domain as
The received 2D data block in the delay-Doppler domain is given by the phase compensated two-dimensional periodic convolution of the transmit 2D data block in the delay-Doppler domain with the delay-Doppler channel impulse response (CIR) as shown in the following Lemma 1.
Lemma 1: We denote the -th element of and as and , where and . Then can be expressed as
where is the additive noise in the delay-Doppler domain. is the -th element of the delay-Doppler CIR and
where is the remainder after division of by . Note that , thus (12) can be regarded as periodic convolution.
See Appendix I.
We observe from (12) that the transmit data in the delay-Doppler domain experiences roughly constant channel in the delay-Doppler domain, since the delay-Doppler CIR is time-independent ( does not vary with the variable of ). Moreover, since each transmit data in the delay-Doppler domain is expanded onto the whole frequency-time domain as shown in (1) and (2), it can exploit the full diversity of the frequency-time channel. As a result, OTFS has improved performance over the traditional OFDM especially in high-mobility scenarios[22, 23, 24].
Equalization is required to eliminate the inter-symbol interference, since each transmit data in (12) experiences not only the delay-Doppler channel but also the inter-symbol interference . To eliminate such inter-symbol interference through equalization, the delay-Doppler CIR is required, which is obtained through downlink channel estimation.
Ii-C OTFS Massive MIMO
We explain how OTFS work in massive MIMO systems to further increase the spectrum efficiency by using multi-user MIMO in this section. Fig. 2 shows the OTFS massive MIMO architecture. The BS is equipped with antennas to simultaneously serve single-antenna users. Downlink precoding is performed to eliminate the inter-user interference. For example, the zero-forcing Tomlinson-Harashima precoding is adopted in . To perform downlink precoding, downlink CSI is required, which is obtained from uplink channel feedback in FDD systems. After precoding, the transmit data block in the delay-Doppler domain will be modulated through the OTFS modulation and transmitted at antennas. At the user side, the received signal is first demodulated through the OTFS demodulation to obtain the received data block in the delay-Doppler domain. To cancel the inter-symbol interference, equalization is performed based on the downlink CSI. Next, we will focus on the downlink channel estimation in OTFS SISO/massive MIMO systems.
Iii Channel Estimation in OTFS SISO Systems
The goal of channel estimation is to obtain the delay-Doppler CIR from the received delay-Doppler data block in (12). One intuitive method to estimate is to transmit an impulse in the delay-Doppler domain as the training pilots. The transmit impulse is expressed as
Based on (12), the received signal in the delay-Doppler domain can be expressed as
The delay-Doppler CIR can be estimated from in (15) through the least square (LS) estimator or minimum mean square error (MMSE) estimator . Note that only the non-zero part of need to be estimated due to its finite support, which will be explained later.
This impulse based channel estimation technique, however, is not applicable to massive MIMO systems due to the huge required pilot overhead. In OTFS massive MIMO systems, to distinguish the delay-Doppler channels associated with BS antennas at the user side, impulses are required to be transmitted. We assume that the delay-Doppler CIRs of antennas have finite support along the delay dimension and along the Doppler dimension [22, 33, 28]. To avoid the interference among multiple antennas, guard intervals between two adjacent impulses should not be smaller than along the Doppler dimension and no smaller than along the delay dimension . As a result, the length of pilots to transmit impulses in OTFS massive MIMO systems should be . With a large number of BS antennas, the pilot overhead will be overwhelming. To solve this problem, we propose a 3D-SOMP algorithm based channel estimation technique, which can obtain the accurate CSI with considerably reduced pilot overhead.
Iv Proposed 3D-SOMP Based Channel Estimation in OTFS Massive MIMO Systems
In this section, we first demonstrate the 3D structured sparsity of channels in OTFS massive MIMO systems. Then, we formulate the downlink channel estimation problem as a sparse signal recovery problem. To solve this problem, we propose a 3D-SOMP algorithm. Finally, we analyze the required pilot overhead for the proposed 3D-SOMP based channel estimation technique.
Iv-a 3D Structured Sparsity of Delay-Doppler-angle Channel
We consider an OTFS massive MIMO system with antennas at the BS and single-antenna users. Downlink channel estimation is the same for users. Therefore, we focus on a certain user and omit the subscript for the user without loss of generality. We consider the downlink time-variant channel consisting of dominant propagation paths. Each dominant path is composed of subpaths. The -th subpath in the -th dominant path has a complex path gain and Doppler frequency . The delays of all subpaths in the -th dominant path can be regarded as the same . We denote the physical AoD of the -th subpath as . When a typical uniform linear array (ULA) of antennas is considered, the spatial angle associated with is defined as , where is the antenna spacing and is the wavelength of the carrier frequency. Typically, and , thus . The time-variant channel associated with the -th antenna () can be expressed as
where is the band-limited pulse shaping filter response evaluated at and is the system sampling interval. Based on (13), we express the delay-Doppler CIR of the -th antenna (which is referred to as delay-Doppler-space CIR in OTFS massive MIMO systems, where , and correspond to the delay, Doppler and spatial index) as follows
where , and .
To investigate the 3D structured sparsity of channels in OTFS massive MIMO systems, we define the delay-Doppler-angle channel by applying inverse DFT for along the space-dimension as
into a 3D tensor, where is the -th element of (, , and ).
The function has the following characteristic: when . Therefore, has dominant elements only if , , and . As shown in Fig. 3, since the number of dominant paths is small, e.g.,  (the path delays of subpaths of a dominant path are regarded as the same), the delay-Doppler-angle channel is sparse along the delay dimension . Assuming that the largest path delay is , then has finite support along the delay dimension , where .
Additionally, the Doppler frequency of the -th subpath in the -th dominant path can be expressed as , where is the moving velocity of the user and is the angle between the user’s moving direction and the arriving direction of the -th subpath. Therefore, the maximum Doppler of a subpath is . Since is distributed in , is distributed in . Therefore, has finite support along the Doppler dimension , where . For example, for the typical subcarrier spacing kHz and carrier frequency 2.15 GHz, the maximum Doppler of a user with a speed of 180 km/h equals to Hz. Thus, . There are only about 5% dominant elements along the Doppler dimension. That is to say, the delay-Doppler-angle channel is block-sparse along the Doppler dimension , where the unique non-zero block is centered around but the length of the non-zero block is unknown.
Finally, for the angle dimension , since the angle spread of a dominant path is small, is distributed in pieces in . Therefore, the delay-Doppler-angle channel is burst-parse  along the angle dimension . There are non-zero blocks but the start position of each block is unknown, since the path may arrive from any directions. Note that the difference between the burst-sparsity and the traditional block-sparsity is that, the start position of the non-zero burst is not necessarily to be where is the length of non-zero blocks.
To sum up, we decompose the multipaths of time-variant channels to show its structured sparsity along the delay dimension, Doppler dimension, and angle dimension as shown in Fig. 3. The 3D channel tensor is sparse along the delay dimension, block-sparse along the Doppler dimension, and burst-sparse along the angle dimension. This 3D structured sparsity can be used to estimate the CSI with low pilot overhead.
Iv-B Formulation of Downlink Channel Estimation
Fig. 4 shows an OTFS frame of size in the delay-Doppler domain. The length of pilots along the Doppler dimension and the delay dimension are and , satisfying that and . We propose to use complex Gaussian random sequences as the training pilots. To avoid interference between pilots and data caused by the two-dimensional periodic convolution in the delay-Doppler domain, guard intervals are required. Note that the delay-Doppler-angle channel in OTFS massive MIMO systems has finite supports along the Doppler dimension and along the delay dimension. The length of guard intervals should be along the Doppler dimension and along the delay dimension as shown in Fig. 4. To reduce the overall pilot overhead in OTFS massive MIMO systems, we propose the non-orthogonal pilot pattern, i.e., the transmit pilots at different antennas are completely overlapped in the delay-Doppler domain, but the complex Gaussian random sequences (pilots) at different antennas are independent.
The training pilots in the delay-Doppler domain at the -th antenna are denoted as with , , and . The OTFS frames at antennas will be modulated and transmitted simultaneously. After passing the channel, the received signal is demodulated, and then the guard intervals are discarded. According to (12), the received pilots in the delay-Doppler domain at the user side can be expressed as
where is the compensate phase, , . The delay-Doppler-space channel can be expressed as the DFT of the delay-Doppler-angle channel based on (18), i.e.,
To simplify the expression, we rewrite (22) into the vector-matrix form. We arrange (, ) into column vectors , where the -th elements of equal to . We also arrange () into column vector , where the -th elements of equal to . As a result, (22) can be rewritten in the vector-matrix form as
where is the two-dimensional periodic convolution matrix with the -th element of being equal to , where , , , and . is a matrix with the -th element being . By denoting and , (23) can be expressed as
Note that can be inversely vectorized to obtain a truncated delay-Doppler-angle channel , i.e., , which is composed of the non-zero part of with , , and . In this way, we formulate the OTFS channel estimation problem as a sparse signal recovery problem with the sensing matrix
This problem can be solved by traditional CS algorithms such as the OMP algorithm. In the next subsection, we propose a 3D-SOMP algorithm to recover the channel vector (or the truncated 3D channel ) in (25) with improved performance compared with the traditional OMP algorithm.
Iv-C 3D-SOMP Algorithm
The proposed 3D-SOMP algorithm is presented in Algorithm 1. We borrow the main idea of OMP to obtain the correlation vector between the columns of sensing matrix and the residual measurements with being the initial channel estimate
For the traditional OMP algorithm, the support of the sparse channel vector can be identified by finding the columns of that is most correlated to the residual measurement . Different from OMP, to use the 3D structured sparsity of (or ), we rearrange the correlation vector as a tensor in step 6,
For the sake of presentation, we first introduce some notations of a -dimensional () tensor . The mode- fiber is obtained by fixing all indexes but the -th index of , i.e., . The slice is obtained by fixing all but two indexes of , i.e., . Finally, the unfolding operation transforms a -dimensional tensor to a 2D matrix. The mode- unfolding matrix can be obtained by arranging all the mode- fibers as the columns of .
Our proposed 3D-SOMP algorithm identifies the 3D support of each dominant path in an one-by-one fashion. For each dominant path, the algorithm starts by obtaining the mode-1 unfolding matrix . By calculating the -norm of row vectors of , the correlation vector along the delay dimension is obtained with the -th element
Thus, the delay-dimension index of the -th dominant path can be obtain by finding the largest element of , i.e., .
Then, the user fixes the delay-dimension index and focuses on the slice to identify the Doppler- and angle-dimension support. By calculating the -norm of row vectors of the slice , the correlation vector along the Doppler dimension is obtained with the -th element
Since the truncated 3D channel is block-sparse along the Doppler dimension and there is only one non-zero block centered around , only the length of the non-zero block is unknown. It can be estimated by finding a smallest block in the Doppler-dimension correlation vector , where the ratio between the block’s norm and should be larger than a threshold , i.e.,
Thus, the Doppler-dimension support of the -th dominant path is obtained as in step 11.
Finally, we focus on to obtain the angle-dimension support of the -th dominant path. Similarly, by calculating the -norm of column vector of , the angle-dimension correlation vector is obtained with the -th element
As we have discussed in the previous subsection, the truncated 3D channel is burst-sparse along the angle dimension. The length of the non-zero burst is assumed as . The user needs to estimate the start position of the non-zero burst which is correlated with the AoD of the -th dominant path. The user first transforms the burst sparsity into the traditional block sparsity through a lifting transformation method following . In this method, a burst-sparse vector of size is connected to a block-sparse vector with a higher diemnsion via a lifting matrix . The start position of the non-zero burst in the burst-sparse vector is correlated with the support of the non-zero block in the higher-dimensional block-sparse vector. The -th column of ( and ) only has one non-zero element 1 at location where
To transform the burst sparsity of the truncated 3D channel along the angle dimension into the traditional block sparsity, the angle-dimension correlation vector is modified by the lifting matrix as
Then is rearranged as a matrix . By calculating the -norm of the row vectors of , we obtain in step 14. Thus, the start position of the non-zero burst is obtained by finding the largest element of . Therefore, the angle-dimension support correlated to the -th dominant path can be obtained as in step 16.
Up to this point, the delay-Doppler-angle 3D support in the -th iteration can be obtained as