I Introduction
One goal of future wireless communications (the emerging 5G or beyond 5G) is to support reliable communications in highmobility scenarios, such as on highspeed railways with a speed of up to 500 km/h [1, 2] or on vehicles with a speed of up to 300 km/h [3, 4]. Currently, the dominant modulation technique for 4G and the emerging 5G is orthogonal frequency division multiplexing (OFDM). For the highmobility scenarios, OFDM may experience significant intercarrier interference (ICI) due to the Doppler spread of timevariant channels (which are also referred to as doubly selective or doubly dispersive channels). ICI will severely degrade the performance of OFDM systems when the traditional transceivers are used [5].
To cope with ICI, some modifications of the traditional OFDM were proposed at the cost of more complicated transceiver design. Linear equalization [6, 7, 8, 9] and nonlinear equalization [10, 11, 12] were proposed to eliminate the ICI at the receiver. Some transmitter processing methods to mitigate ICI were proposed including polynomial cancellation coding [13, 14] and pulse shaping [15, 16]. Using both transmitter and receiver processing, a channelindependent block spreading based multiple access scheme was proposed for the multiuser scenarios, where both ICI and multiuser interference can be eliminated [17].
Instead of trying to eliminate ICI, there are some modulation schemes proposed for timevariant channels to enhance the system performance by using the transmit diversity. A frequencyoversampling technique for zeropadded OFDM system was proposed in
[18], where frequency diversity can be achieved through transmit signal design. Vector OFDM
[19] transmits multiple groups of linearly precoded symbols over the channel subcarriers to provide frequency diversity. A Dopplerresilient orthogonal signal division multiplexing technique was proposed. That multiplexes several data vectors and a pilot vector into a data stream to fully exploit the frequencytime diversity in the timevariant channels [20, 21].Orthogonal time frequency space (OTFS) is an alternative to OFDM to tackle the timevariant channels [22, 23, 24]. Leveraging the basis expansion model (BEM) for the channel [25, 26], OTFS converts the timevariant channels into the timeindependent channels in the delayDoppler domain. Accordingly, the information bearing data is multiplexed into the roughly constant channels in the delayDoppler domain. OTFS is different from previous work in that it multiplexes data in the delayDoppler domain.
Like OFDM multiantenna systems, OTFS with massive multipleinput multipleoutput (MIMO) can further increase the spectrum efficiency. Such benefits require that the channel state information (CSI) is known at the transmitter to design the transmit beamforming vectors [27, 28, 29]. When OTFS massive MIMO systems are operated with frequency division duplex (FDD) mode, downlink channel estimation is necessary due to the lack of channel reciprocity. With a large number of antennas equipped at the base station (BS) in OTFS massive MIMO systems, downlink channel estimation is challenging.
The timevariant channel estimation schemes for massive MIMO systems have been proposed in [30, 31, 32]. In [30], the timevariant MIMO channels are modeled by several jointly sparse timeindependent coefficients based on the BEM. These jointly sparse coefficients can be estimated through a distributed compressive sensing algorithm with high accuracy. In [31], a spatialdomain BEM was developed to further reduce the effective dimensions of massive MIMO timevariant channels, such that the downlink training overhead can be reduced. Moreover, a general framework of compressed channel sensing was provided in [32]. Based on the sparse multipath structure of massive MIMO timevariant channels, compressed channel sensing can achieve a target estimation error using much less overhead. The aforementioned channel estimation techniques [30, 31, 32] were proposed for OFDM massive MIMO systems. They are not directly applicable for OTFS massive MIMO systems. This is because that the information bearing data is multiplexed in the delayDoppler domain in OTFS systems, not the frequencytime domain as in OFDM systems.
For OTFS systems, an impulse based channel estimation technique was proposed for the OTFS singleinput singleoutput (SISO) architecture in [33]. The BS transmits an impulse in the delayDoppler domain as the training pilots. The received signals in the delayDoppler domain can be regarded as a twodimensional periodic convolution of the transmit impulse with the delayDoppler channel [33]. The delayDoppler channel can then be estimated from the received signal. An alternative method using PN sequences as the training pilots in the delayDoppler domain was proposed for OTFS SISO systems [34]. In that method, channel estimation is done in the discrete domain, where three quantities of interest, namely, delay shift, Doppler shift, and fade coefficient are estimated. Then the delayDoppler channel can be calculated accordingly. The impulsebased scheme is extended to OTFS MIMO systems by transmitting several impulses with proper guard between two adjacent impulses to distinguish different BS antennas [35]. The existing channel estimation techniques can not be directly extended to OTFS massive MIMO since a large number of antennas are required to be distinguished by transmitting orthogonal pilots, which will lead to high pilot overhead.
To solve this problem, we propose a 3D structured orthogonal matching pursuit (3DSOMP) algorithm based downlink channel estimation technique for OTFS massive MIMO systems, which can achieve accurate CSI with low pilot overhead. The specific contributions are summarized as follows.

We present the discretetime formulation of OTFS systems and demonstrate that the OTFS massive MIMO channel exhibits a delayDopplerangle 3D structured sparsity. Since the number of dominant propagation paths is limited, the 3D channel is sparse along the delay dimension. As the Doppler frequency of a path is usually much smaller than the system bandwidth, the 3D channel is blocksparse along the Doppler dimension. The only one nonzero block is concentrated around zero, but the length of the nonzero block is unknown. Since the angleofdeparture (AoD) spread of a path at the BS is usually small, the 3D channel is burstsparse along the angle dimension [36]. The lengths of nonzero bursts can be regarded as constant, but the start position of each nonzero burst is unknown.

Based on the 3D structured sparse channel, we formulate the downlink channel estimation problem in OTFS massive MIMO systems as a sparse signal recovery problem. The estimator makes use of the training pilots that are transmitted in the delayDoppler domain. We propose that pilots of different antennas are independent complex Gaussian random sequences, which overlap to reduce the overall pilot overhead. By inserting guard intervals between pilots and data, the received pilots can be expressed as a phase compensated twodimensional periodic convolution of the transmit pilots with the delayDoppler channel. Decomposing the channel based on its structure, we formulate the downlink channel estimation problem as a sparse signal recovery problem.

We propose a 3DSOMP algorithm to solve the formulated channel estimation problem. The main idea is summarized as follows. The 3D support of each path is estimated in an onebyone fashion. For each path, the user first estimates the delaydimension support. Then, by using the blocksparse property of channels along the Doppler dimension, the user estimates the size of the unique nonzero block to obtain the Dopplerdimension support. Finally, the user transforms the burstsparsity of channels along the angle dimension into the traditional blocksparsity through a lifting transformation following [36], so that the angledimension support can be estimated accordingly. In this way, the whole 3D channel can be estimated after several iterations by removing the contribution of previous paths in each iteration.
The rest of the paper is organized as follows. In Section II, we present the system model. In Section III, we review the channel estimation in OTFS SISO systems. Then, we propose a 3DSOMP based channel estimation technique for OTFS massive MIMO systems in Section IV. Simulation results are given in Section V. Our conclusions are finally drawn in Section VI.
Notation: Boldface capital letters stand for matrices and lowercase letters stand for column vectors. The transpose, conjugate, conjugate transpose, and inverse of a matrix are denoted by , , and , respectively. is the Hadamard product operator. is the norm of the vector . is the MoorePenrose pseudoinverse of . Finally,
denotes the identity matrix of size
.Ii System Model
In this section, we review OTFS for SISO systems including a discretetime formulation of OTFS modulation and OTFS demodulation. Then, we describe an extension of OTFS into massive MIMO systems.
Fig. 1 shows the OTFS SISO architecture as commonly assumed in [22, 23, 24]. OTFS is a modulation/demodulation technique. It can be realized by adding a preprocessing block before a traditional modulator in the frequencytime domain such as OFDM modulator at the transmitter, and a corresponding postprocessing block after a traditional demodulator in the frequencytime domain such as OFDM demodulator at the receiver. Through the preprocessing and postprocessing blocks, timevariant channels are converted into the timeindependent channels in the the delayDoppler domain. Therefore, the information bearing data can be multiplexed in the roughly constant delayDoppler channel. At the same time, the transmit data in OTFS systems can take advantage of full diversity in the frequencytime channels. In this way, OTFS improves system performance over OFDM in highmobility scenarios [22, 23, 24].
Iia OTFS SISO Modulation
In this section, we describe the modulation at the transmitter. A quadrature amplitude modulated (QAM) data sequence of length is first rearranged into a 2D data block. This is called a 2D OTFS frame in the delayDoppler domain , where and are the numbers of resource units along the delay dimension and Doppler dimension. OTFS modulation at the transmitter is composed of a preprocessing block and a traditional frequencytime modulator such as OFDM or filter bank multicarrier (FBMC). The preprocessing block maps the 2D data block in the delayDoppler domain to a 2D block
in the frequencytime domain. It is realized by using an inverse symplectic finite Fourier transform (ISFFT) and a transmit windowing function. The ISFFT of
is [22](1) 
where and are discrete Fourier transform (DFT) matrices. A transmit windowing matrix multiplies elementwise to produce the 2D block in the frequencytime domain as
(2) 
There are several uses of the windowing matrix. For example, the windowing matrix can be designed to randomize the phases of the transmitted symbols to eliminate the intercell interference [33]. In this paper, we assume a trivial window at the transmitter for simple expression, i.e., is a matrix of all ones.
Then, the 2D block in the frequencytime domain is transformed to the 1D transmit signal through a traditional frequencytime modulator such as OFDM or FBMC. Assuming an OFDM modulator, the point inverse DFT (IDFT) is applied on each column of to obtain the 2D transmit signal block , i.e.,
(3) 
where . Each column vector of can be regarded as an OFDM symbol. Note that OFDM symbols occupy the bandwidth and have the duration , where and are the subcarrier spacing and symbol duration. By combing (1)(3),
(4) 
To avoid intersymbol interference between blocks, the OFDM modulator usually adds cyclic prefix (CP) for each OFDM symbol via a CP addition matrix [24] with being the length of CP. By reading the 2D transmit signal block columnwise, the 1D transmit signal is
(5) 
IiB OTFS SISO Demodulation
In this section, we describe demodulation at the receiver. The th element of the received signal after the timevariant channel with length is expressed as
(6) 
where is the additive noise at the receiver. The OTFS demodulation at the receiver consists of a traditional frequencytime demodulator such as the OFDM or FBMC demodulator and a postprocessing block as shown in Fig. 1. The frequencytime demodulator transforms the received signal to a 2D block in the frequencytime domain . Specifically, assuming an OFDM demodulator, the received signal is first rearranged as a matrix of size , i.e.,
(7) 
where each column vector of can be regarded as a received OFDM symbol including CP. Then, the OFDM demodulator removes the CP by multiplying with a CP removal matrix [24] to obtain the OFDM symbols without CPs. Applying the point DFT on each OFDM symbol without CP (i.e., each column vector of ), we obtain the received 2D block in the frequencytime domain as
(8) 
In the postprocessing block, is transformed to the 2D data block in the delayDoppler domain. It is realized by a receive windowing matrix and the SFFT. The receive windowing matrix multiplies elementwise, i.e.,
(9) 
Then, the SFFT is applied for to obtain the 2D data block in the delayDoppler domain as
(10) 
Like the transmitter, we consider a trivial window at the receiver for simple expression, i.e., is a matrix of all ones [33]. By combing (8)(10), we can obtain
(11) 
The received 2D data block in the delayDoppler domain is given by the phase compensated twodimensional periodic convolution of the transmit 2D data block in the delayDoppler domain with the delayDoppler channel impulse response (CIR) as shown in the following Lemma 1.
Lemma 1: We denote the th element of and as and , where and . Then can be expressed as
(12)  
where is the additive noise in the delayDoppler domain. is the th element of the delayDoppler CIR and
(13) 
where is the remainder after division of by . Note that , thus (12) can be regarded as periodic convolution.
Proof:
See Appendix I.
We observe from (12) that the transmit data in the delayDoppler domain experiences roughly constant channel in the delayDoppler domain, since the delayDoppler CIR is timeindependent ( does not vary with the variable of ). Moreover, since each transmit data in the delayDoppler domain is expanded onto the whole frequencytime domain as shown in (1) and (2), it can exploit the full diversity of the frequencytime channel. As a result, OTFS has improved performance over the traditional OFDM especially in highmobility scenarios[22, 23, 24].
Equalization is required to eliminate the intersymbol interference, since each transmit data in (12) experiences not only the delayDoppler channel but also the intersymbol interference . To eliminate such intersymbol interference through equalization, the delayDoppler CIR is required, which is obtained through downlink channel estimation.
IiC OTFS Massive MIMO
We explain how OTFS work in massive MIMO systems to further increase the spectrum efficiency by using multiuser MIMO in this section. Fig. 2 shows the OTFS massive MIMO architecture. The BS is equipped with antennas to simultaneously serve singleantenna users. Downlink precoding is performed to eliminate the interuser interference. For example, the zeroforcing TomlinsonHarashima precoding is adopted in [28]. To perform downlink precoding, downlink CSI is required, which is obtained from uplink channel feedback in FDD systems. After precoding, the transmit data block in the delayDoppler domain will be modulated through the OTFS modulation and transmitted at antennas. At the user side, the received signal is first demodulated through the OTFS demodulation to obtain the received data block in the delayDoppler domain. To cancel the intersymbol interference, equalization is performed based on the downlink CSI. Next, we will focus on the downlink channel estimation in OTFS SISO/massive MIMO systems.
Iii Channel Estimation in OTFS SISO Systems
The goal of channel estimation is to obtain the delayDoppler CIR from the received delayDoppler data block in (12). One intuitive method to estimate is to transmit an impulse in the delayDoppler domain as the training pilots[33]. The transmit impulse is expressed as
(14) 
Based on (12), the received signal in the delayDoppler domain can be expressed as
(15) 
The delayDoppler CIR can be estimated from in (15) through the least square (LS) estimator or minimum mean square error (MMSE) estimator [35]. Note that only the nonzero part of need to be estimated due to its finite support, which will be explained later.
This impulse based channel estimation technique, however, is not applicable to massive MIMO systems due to the huge required pilot overhead. In OTFS massive MIMO systems, to distinguish the delayDoppler channels associated with BS antennas at the user side, impulses are required to be transmitted. We assume that the delayDoppler CIRs of antennas have finite support along the delay dimension and along the Doppler dimension [22, 33, 28]. To avoid the interference among multiple antennas, guard intervals between two adjacent impulses should not be smaller than along the Doppler dimension and no smaller than along the delay dimension [33]. As a result, the length of pilots to transmit impulses in OTFS massive MIMO systems should be . With a large number of BS antennas, the pilot overhead will be overwhelming. To solve this problem, we propose a 3DSOMP algorithm based channel estimation technique, which can obtain the accurate CSI with considerably reduced pilot overhead.
Iv Proposed 3DSOMP Based Channel Estimation in OTFS Massive MIMO Systems
In this section, we first demonstrate the 3D structured sparsity of channels in OTFS massive MIMO systems. Then, we formulate the downlink channel estimation problem as a sparse signal recovery problem. To solve this problem, we propose a 3DSOMP algorithm. Finally, we analyze the required pilot overhead for the proposed 3DSOMP based channel estimation technique.
Iva 3D Structured Sparsity of DelayDopplerangle Channel
We consider an OTFS massive MIMO system with antennas at the BS and singleantenna users. Downlink channel estimation is the same for users. Therefore, we focus on a certain user and omit the subscript for the user without loss of generality. We consider the downlink timevariant channel consisting of dominant propagation paths. Each dominant path is composed of subpaths. The th subpath in the th dominant path has a complex path gain and Doppler frequency . The delays of all subpaths in the th dominant path can be regarded as the same [37]. We denote the physical AoD of the th subpath as . When a typical uniform linear array (ULA) of antennas is considered, the spatial angle associated with is defined as [38], where is the antenna spacing and is the wavelength of the carrier frequency. Typically, and , thus . The timevariant channel associated with the th antenna () can be expressed as[39]
(16) 
where is the bandlimited pulse shaping filter response evaluated at and is the system sampling interval. Based on (13), we express the delayDoppler CIR of the th antenna (which is referred to as delayDopplerspace CIR in OTFS massive MIMO systems, where , and correspond to the delay, Doppler and spatial index) as follows
(17)  
where , and .
To investigate the 3D structured sparsity of channels in OTFS massive MIMO systems, we define the delayDopplerangle channel by applying inverse DFT for along the spacedimension as
(18) 
where is the angle index. Then, by substituting (17) into (18), we can express the delayDopplerangle channel as
(19)  
We arrange
into a 3D tensor
, where is the th element of (, , and ).The function has the following characteristic: when [40]. Therefore, has dominant elements only if , , and . As shown in Fig. 3, since the number of dominant paths is small, e.g., [37] (the path delays of subpaths of a dominant path are regarded as the same[39]), the delayDopplerangle channel is sparse along the delay dimension . Assuming that the largest path delay is , then has finite support along the delay dimension , where .
Additionally, the Doppler frequency of the th subpath in the th dominant path can be expressed as [39], where is the moving velocity of the user and is the angle between the user’s moving direction and the arriving direction of the th subpath. Therefore, the maximum Doppler of a subpath is . Since is distributed in , is distributed in . Therefore, has finite support along the Doppler dimension , where . For example, for the typical subcarrier spacing kHz and carrier frequency 2.15 GHz, the maximum Doppler of a user with a speed of 180 km/h equals to Hz. Thus, . There are only about 5% dominant elements along the Doppler dimension. That is to say, the delayDopplerangle channel is blocksparse along the Doppler dimension , where the unique nonzero block is centered around but the length of the nonzero block is unknown.
Finally, for the angle dimension , since the angle spread of a dominant path is small, is distributed in pieces in . Therefore, the delayDopplerangle channel is burstparse [36] along the angle dimension . There are nonzero blocks but the start position of each block is unknown, since the path may arrive from any directions. Note that the difference between the burstsparsity and the traditional blocksparsity is that, the start position of the nonzero burst is not necessarily to be where is the length of nonzero blocks.
To sum up, we decompose the multipaths of timevariant channels to show its structured sparsity along the delay dimension, Doppler dimension, and angle dimension as shown in Fig. 3. The 3D channel tensor is sparse along the delay dimension, blocksparse along the Doppler dimension, and burstsparse along the angle dimension. This 3D structured sparsity can be used to estimate the CSI with low pilot overhead.
IvB Formulation of Downlink Channel Estimation
Fig. 4 shows an OTFS frame of size in the delayDoppler domain. The length of pilots along the Doppler dimension and the delay dimension are and , satisfying that and . We propose to use complex Gaussian random sequences as the training pilots. To avoid interference between pilots and data caused by the twodimensional periodic convolution in the delayDoppler domain, guard intervals are required. Note that the delayDopplerangle channel in OTFS massive MIMO systems has finite supports along the Doppler dimension and along the delay dimension. The length of guard intervals should be along the Doppler dimension and along the delay dimension as shown in Fig. 4. To reduce the overall pilot overhead in OTFS massive MIMO systems, we propose the nonorthogonal pilot pattern, i.e., the transmit pilots at different antennas are completely overlapped in the delayDoppler domain, but the complex Gaussian random sequences (pilots) at different antennas are independent.
The training pilots in the delayDoppler domain at the th antenna are denoted as with , , and . The OTFS frames at antennas will be modulated and transmitted simultaneously. After passing the channel, the received signal is demodulated, and then the guard intervals are discarded. According to (12), the received pilots in the delayDoppler domain at the user side can be expressed as
(20) 
where is the compensate phase, , . The delayDopplerspace channel can be expressed as the DFT of the delayDopplerangle channel based on (18), i.e.,
(21) 
By substituting (21) into (20) and expressing , we have
(22) 
To simplify the expression, we rewrite (22) into the vectormatrix form. We arrange (, ) into column vectors , where the th elements of equal to . We also arrange () into column vector , where the th elements of equal to . As a result, (22) can be rewritten in the vectormatrix form as
(23) 
where is the twodimensional periodic convolution matrix with the th element of being equal to , where , , , and . is a matrix with the th element being . By denoting and , (23) can be expressed as
(24) 
Note that can be inversely vectorized to obtain a truncated delayDopplerangle channel , i.e., , which is composed of the nonzero part of with , , and . In this way, we formulate the OTFS channel estimation problem as a sparse signal recovery problem with the sensing matrix
(25) 
This problem can be solved by traditional CS algorithms such as the OMP algorithm[41]. In the next subsection, we propose a 3DSOMP algorithm to recover the channel vector (or the truncated 3D channel ) in (25) with improved performance compared with the traditional OMP algorithm.
IvC 3DSOMP Algorithm
The proposed 3DSOMP algorithm is presented in Algorithm 1. We borrow the main idea of OMP to obtain the correlation vector between the columns of sensing matrix and the residual measurements with being the initial channel estimate
(26) 
For the traditional OMP algorithm, the support of the sparse channel vector can be identified by finding the columns of that is most correlated to the residual measurement . Different from OMP, to use the 3D structured sparsity of (or ), we rearrange the correlation vector as a tensor in step 6,
(27) 
For the sake of presentation, we first introduce some notations of a dimensional () tensor . The mode fiber is obtained by fixing all indexes but the th index of , i.e., . The slice is obtained by fixing all but two indexes of , i.e., . Finally, the unfolding operation transforms a dimensional tensor to a 2D matrix. The mode unfolding matrix can be obtained by arranging all the mode fibers as the columns of .
Our proposed 3DSOMP algorithm identifies the 3D support of each dominant path in an onebyone fashion. For each dominant path, the algorithm starts by obtaining the mode1 unfolding matrix . By calculating the norm of row vectors of , the correlation vector along the delay dimension is obtained with the th element
(28) 
Thus, the delaydimension index of the th dominant path can be obtain by finding the largest element of , i.e., .
Then, the user fixes the delaydimension index and focuses on the slice to identify the Doppler and angledimension support. By calculating the norm of row vectors of the slice , the correlation vector along the Doppler dimension is obtained with the th element
(29) 
Since the truncated 3D channel is blocksparse along the Doppler dimension and there is only one nonzero block centered around , only the length of the nonzero block is unknown. It can be estimated by finding a smallest block in the Dopplerdimension correlation vector , where the ratio between the block’s norm and should be larger than a threshold , i.e.,
(30)  
Thus, the Dopplerdimension support of the th dominant path is obtained as in step 11.
Finally, we focus on to obtain the angledimension support of the th dominant path. Similarly, by calculating the norm of column vector of , the angledimension correlation vector is obtained with the th element
(31) 
As we have discussed in the previous subsection, the truncated 3D channel is burstsparse along the angle dimension. The length of the nonzero burst is assumed as . The user needs to estimate the start position of the nonzero burst which is correlated with the AoD of the th dominant path. The user first transforms the burst sparsity into the traditional block sparsity through a lifting transformation method following [36]. In this method, a burstsparse vector of size is connected to a blocksparse vector with a higher diemnsion via a lifting matrix . The start position of the nonzero burst in the burstsparse vector is correlated with the support of the nonzero block in the higherdimensional blocksparse vector. The th column of ( and ) only has one nonzero element 1 at location where
(32) 
To transform the burst sparsity of the truncated 3D channel along the angle dimension into the traditional block sparsity, the angledimension correlation vector is modified by the lifting matrix as
(33) 
Then is rearranged as a matrix . By calculating the norm of the row vectors of , we obtain in step 14. Thus, the start position of the nonzero burst is obtained by finding the largest element of . Therefore, the angledimension support correlated to the th dominant path can be obtained as in step 16.
Up to this point, the delayDopplerangle 3D support in the th iteration can be obtained as
Comments
There are no comments yet.