We introduced in [WJH18a, WJH18b] a novel blind (noncoherent) communication scheme for the physical layer, called modulation on conjugate-reciprocal zeros (MOCZ), to reliably transmit sporadic short packets of fixed size over unknown wireless multipath channels of a given bandwidth at very low latency. Here the information of the packet is modulated on the zeros of the transmitted discrete-time baseband signal’s z-transform. We will call the discrete-time baseband signal a MOCZ symbol, similar to an orthogonal frequency division multiplexing (OFDM) symbol, which is a finite-length sequence of complex-valued coefficients. These coefficients then modulate a continuous-time pulse shape at a fixed sample period to generate the continuous-time baseband waveform. Since the MOCZ symbols (sequences) are orthogonal in neither the time nor the frequency domain, the MOCZ design can be seen as a non-orthogonal multiplexing scheme. After up-converting to the desired carrier frequency, the transmitted passband signal propagates in space such that, due to reflections, diffractions, and scattering, differently delayed copies of the attenuated signal interfere at the receiver. Hence, multipath propagation causes a time dispersion which results in a frequency-selective fading channel [TV05]. Due to ubiquitous impairments between transmitter and receiver clocks, a carrier frequency offset (CFO) will be present after down-conversion to the baseband. Doppler shifts due to relative velocity cause additional frequency dispersion, which can also be approximated to first order by a CFO. This is a known weakness of many multi-carrier modulation schemes, such as OFDM [TV05, Moo94, ZGX10, LLTC04], and various approaches have been developed to estimate or eliminate the CFO effect. A common approach for OFDM systems is to learn the CFO in a training phase or from blind estimation algorithms, such as MUSIC [LT98] or ESPRIT [TLZ00].
Furthermore, due to the unknown distance and asynchronous transmission, a timing offset (TO) of the received symbol has to be determined as well, which will otherwise destroy the orthogonality of the OFDM symbols [CKYK10, 5.1], [PKPKKH06]. By “sandwiching” the data symbol between two training symbols, the timing and frequency offsets can be estimated [SC97], [SC96]. By using antenna arrays at the receiver, the antenna diversity of a single-input multiple-output (SIMO) system can be exploited to improve the performance [ZGX10].
Whereas OFDM is typically used in long frames, consisting of many successive OFDM symbols and hence much longer signal lengths, we consider here only one single symbol transmission with a very short signal length. This places high demands on such a bursty signaling scheme, since timing and carrier frequency offsets have to be addressed from only one received symbol. Here our MOCZ scheme will be a promising solution. Since any communication will be scheduled and timed on the MAC layer by a certain bus, running with a known bus clock-rate, timing-offsets of the symbols can be assumed as fractions of the bus clock-rate. We will introduce here an improved receiver design for a coded binary MOCZ (BMOCZ) scheme and demonstrate by bit-error-rate (BER) simulations the robustness against these impairments.
In the MOCZ design, a CFO will result in an unknown common rotation of all received zeros. Since the angular zero spacing in a BMOCZ symbol of length $K+1$ is given by a base angle of $2\pi/K$, a fractional rotation can be easily obtained at the receiver by oversampling during the post-processing to identify the most likely transmitted zeros (zero-pattern). Rotations which are integer multiples of the base angle correspond to cyclic shifts of the binary message word. By using a cyclically permutable code (CPC) for the binary message, the BMOCZ symbol becomes invariant against any cyclic shift and hence against any CFO. This avoids any further symbol transmissions for estimating the CFO, which reduces overhead, latency, and complexity. As a byproduct, this has the appealing feature of providing a CFO estimate from the decoding process of a single BMOCZ symbol. Furthermore, due to the embedding into a cyclic code, such as BCH codes, we can use their error correction capabilities to improve the BER and moreover the block error-rate (BLER) performance tremendously. By measuring the energy of the expected symbol length with a sliding window in the received signal, we can identify arbitrary TOs at the receiver. We will show the robustness of the TO estimation analytically, which reveals another strong property of the MOCZ design.
Finally, we will combine CFO and TO estimation with error correction over multiple receive antennas and demonstrate the antenna diversity of the SIMO system. By simulating the BER over the received SNR for various average power delay profiles, with constant and exponential decay as well as random sparsity constraints, we will demonstrate the performance in various indoor and outdoor scenarios by using the simulation framework QuaDRiGa [JRBT14].
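We will use small letters for complex numbers in $\mathbb{C}$. Capital Latin letters denote natural numbers $\mathbb{N}$ and refer to fixed dimensions, where small letters are used as indices. Boldface small letters denote row vectors and capitalized letters refer to matrices. Upright capital letters denote complex-valued polynomials in $\mathbb{C}[z]$. We will denote the first $N$ natural numbers in $\mathbb{N}$ as $[N] := \{0, 1, \dots, N-1\}$. For $K \in \mathbb{N}$ we denote by $K + [N] = \{K, K+1, \dots, K+N-1\}$ the $K$-shift of the set $[N]$. The Kronecker-delta symbol is given by $\delta_{nm}$ and is $1$ if $n = m$ and $0$ else. For a complex number $x = a + jb$, given by its real part $a \in \mathbb{R}$ and imaginary part $b \in \mathbb{R}$ with imaginary unit $j = \sqrt{-1}$, its complex-conjugation is given by $\overline{x} = a - jb$ and its absolute value by $|x| = \sqrt{x\overline{x}}$. For a vector $\mathbf{x} \in \mathbb{C}^N$ we denote by $\overline{\mathbf{x}^{-}}$ its complex-conjugated time-reversal or conjugate-reciprocal, given as $(\overline{\mathbf{x}^{-}})_k = \overline{x_{N-1-k}}$ for $k \in [N]$. We use $\mathbf{A}^{*} = \overline{\mathbf{A}}^{T}$ for the complex-conjugated transpose of the matrix $\mathbf{A}$. For the identity matrix we write $\mathbf{I}_N$ and for a matrix with all elements zero we write $\mathbf{O}_N$. By $\mathbf{D}_{\mathbf{x}}$ we refer to the diagonal matrix generated by $\mathbf{x}$. The $N \times N$ unitary Fourier matrix $\mathbf{F}_N$ is given entry-wise by $f_{l,k} = e^{-j2\pi lk/N}/\sqrt{N}$ for $l, k \in [N]$. By $\mathbf{T}_N$ we denote the elementary Toeplitz matrix given element-wise as $\delta_{n-1,m}$. The all one and all zero vectors in dimension $N$ will be denoted by $\mathbf{1}_N$ and $\mathbf{0}_N$, respectively. The $\ell_p$-norm of a vector $\mathbf{x} \in \mathbb{C}^N$ is given by $\|\mathbf{x}\|_p = (\sum_{k} |x_k|^p)^{1/p}$ for $p \geq 1$. If $p = \infty$ we write $\|\mathbf{x}\|_\infty = \max_k |x_k|$ and for $p = 2$ we set $\|\mathbf{x}\| = \|\mathbf{x}\|_2$. The expectation of a random variable $x$ is denoted by $\mathbb{E}[x]$.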
II System Model and Requirements
We are interested in a blind and asynchronous transmission of a short single MOCZ symbol at a designated bandwidth . In this “one-shot” communication we assume no synchronization and no packet scheduling between transmitter and receiver. Such extremely sporadic, asynchronous, and ultra-short-packet transmissions are required, for example, in critical control applications, exchange of channel state information (CSI), signaling protocols, secret keys, authentication, commands in wireless industry applications, or initiation, synchronization and channel probing packets to prepare for longer or future transmission phases. By choosing the carrier frequency, transmit sequence length, and bandwidth accordingly, a receive duration in the order of the channel delay spread can be obtained, which pushes the latency at the receiver to the lowest possible. Since the next generation of mobile wireless networks aims for large bandwidths with carrier frequencies beyond GHz, in the so-called mmWave band, the transmitted signal duration will be in the order of nanoseconds. Hence, even at moderate mobility, the wireless channel in an indoor or outdoor scenario can be considered as approximately time-invariant over such a short time duration. On the other hand, wideband channels are highly frequency selective, which is due to the superposition of differently delayed versions (echoes) of the transmitted signal at the receiver. This makes equalization in the time domain very challenging, which is commonly avoided by using OFDM instead. But conventional OFDM requires an additional cyclic prefix to convert the frequency-selective channel to parallel scalar channels, and in coherent mode it requires additional pilots (training) to learn the channel coefficients. This will increase the latency for short messages dramatically.
For communication in the mmWave band, massive antenna arrays are exploited to overcome the large attenuation, which increases the complexity and energy consumption of estimating the huge number of channel parameters and becomes the bottleneck in mmWave MIMO systems, especially for mobile scenarios. However, in a sporadic communication only one symbol will be transmitted and the next symbol may follow at an unknown later time. In a random access channel (RACH), a different user may transmit the next symbol from a different location, which will therefore experience an independent channel realization. Hence, the receiver can barely use any channel information learned from past communications. OFDM systems approach this by transmitting many successive OFDM symbols as a long frame to estimate the channel impairments, which causes considerable overhead and latency if only a few data bits need to be communicated. Furthermore, to achieve orthogonal subcarriers in OFDM, the cyclic prefix has to be at least as long as the channel impulse response (CIR) length, resulting in signal lengths of at least twice the CIR length, during which the channel also needs to be static. Using OFDM signal lengths much longer than the coherence time might not be feasible for fast time-varying block-fading channels. Furthermore, the maximal CIR length needs to be known at the transmitter and, if underestimated, will lead to a serious performance loss. This is in stark contrast to our MOCZ design, where the signal length can be chosen for a single MOCZ symbol independently of the CIR length. The goal of this work is to address the ubiquitous impairments of the MOCZ design under such ad-hoc communication assumptions and signal lengths in the order of the CIR length.
After up-converting the MOCZ symbol, which is a discrete-time complex-valued baseband signal of two-sided bandwidth , to the desired carrier frequency , the transmitted passband signal will propagate in space. Regardless of directional or omnidirectional antennas, the signal will be reflected and diffracted at point scatterers, resulting in different delays of the attenuated signal which interfere at the receiver if the maximal delay spread of the channel is larger than the sample period . Hence, the multipath propagation causes time dispersion resulting in a frequency-selective fading channel. Due to ubiquitous impairments between transmitter and receiver clocks, an unknown frequency offset will be present after the down-conversion to the received continuous-time baseband signal
By sampling at the sample period , the received discrete-time baseband signal can be represented by a tapped delay line (TDL) model. Here the channel action is given as the convolution of the MOCZ symbol with a finite impulse response , where the th complex-valued channel tap describes the th averaged path over the bin , which we model by a circularly symmetric Gaussian random variable in for and zero elsewhere. The average power delay profile (PDP) of the channel can be sparse and exponentially decaying, where defines the sparsity pattern of non-zero coefficients and the exponential decay rate. To obtain equal average transmit and average receive power we will eliminate in our analysis the overall channel gain by normalizing the CIR realization by its average energy (for a given sparsity pattern), such that . The convolution output is then additively distorted by Gaussian noise
of zero mean and variance (average power density) for as
Here denotes the carrier frequency offset (CFO) and the timing offset (TO), which marks the delay of the first symbol coefficient via the first channel path , measured in integer multiples of the sample time . The modulated MOCZ symbol will have rotated coefficients as well as the channel , which will also be affected by a global phase . Since the channel taps have a uniform independent phase, the distribution does not change. By the same argument, the Gaussian noise distribution is not altered by the phase, hence we have for any and .
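The tapped-delay-line model described above can be sketched in a few lines. This is only an illustration under stated assumptions: the received sample at index n is taken to be the delayed convolution of the symbol with the CIR, rotated by exp(j·n·phi), plus complex Gaussian noise, and the power delay profile is taken to be decay**l, normalized to unit average energy. The function names (`received_signal`, `exponential_pdp_channel`) and all parameter values are hypothetical and not taken from the paper.

```python
import cmath
import random


def exponential_pdp_channel(length, decay, seed=0):
    """Draw taps h_l ~ CN(0, decay**l / sum_l decay**l), so E||h||^2 = 1 (assumed PDP)."""
    rng = random.Random(seed)
    avg_energy = sum(decay ** l for l in range(length))
    taps = []
    for l in range(length):
        std = (decay ** l / avg_energy) ** 0.5
        # real and imaginary parts each carry half of the tap variance
        taps.append(complex(rng.gauss(0, 1), rng.gauss(0, 1)) * std / 2 ** 0.5)
    return taps


def received_signal(x, h, tau0, phi, noise_std, n_obs, seed=0):
    """Delay x by tau0 samples, convolve with h, apply the CFO phase ramp, add noise."""
    rng = random.Random(seed)
    clean = [0j] * n_obs
    for l, tap in enumerate(h):
        for k, coeff in enumerate(x):
            n = tau0 + l + k
            if n < n_obs:
                clean[n] += tap * coeff
    out = []
    for n, sample in enumerate(clean):
        noise = complex(rng.gauss(0, 1), rng.gauss(0, 1)) * noise_std / 2 ** 0.5
        out.append(cmath.exp(1j * n * phi) * sample + noise)
    return out


# Hypothetical example values: 4 taps, decay 0.7, TO of 3 samples, CFO phase 0.2 rad/sample
h = exponential_pdp_channel(length=4, decay=0.7, seed=1)
r = received_signal([1, 0.5j, -0.25], h, tau0=3, phi=0.2, noise_std=0.05, n_obs=16)
```

Since the phase ramp has unit modulus, the sample powers |r_n|^2 of the noiseless signal do not depend on phi, which is the property the timing-offset estimation later relies on.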
In [WJH18b, WJH18a] a good signal-codebook is given for Binary MOCZ (BMOCZ) by the set of normalized Huffman sequences , i.e., by all with positive first coefficient and “impulsive-like” autocorrelation [Huf62], given by
for some . The absolute value of (3) forms a trident with one main peak at the center, given by the energy , and two equal side-peaks of , see Figure (1). From an analytical and empirical investigation [WJH18a], the BMOCZ symbols are most robust against noise if
Hence, the BMOCZ codebook (constellation set) is only determined by the number . Each BMOCZ symbol (constellation, Huffman sequence) defines the coefficients of a polynomial of degree , where the zeros are uniformly placed on a circle of radius or , selected by the message bits as
see also Figure (3). Hence, the BMOCZ encoder is defined iteratively for by its zero codeword as
where we normalize after the last iteration step . From the received noisy signal samples (no CFO and TO)
the decoder is given as a Direct Zero Testing (DiZeT) of the received polynomial at the possible zero positions as
see [WJH18a, WJH18b]. A global phase in will have no effect on the DiZeT decoder or on the received zeros. But the CFO modulates the BMOCZ symbol in (2) and causes a rotation of its zeros by in (5) (the CFO would rotate the zeros in any scheme of modulation on zeros, but for simplicity we consider here only the BMOCZ scheme), which will destroy the hypothesis test of the DiZeT decoder. Hence, one needs to either estimate or use an outer code for BMOCZ to be invariant against an arbitrary rotation of the entire zero codebook , which we will introduce in Section (IV). However, before we can apply the DiZeT decoder, we have to identify the timing offset of the symbol, which yields the convolution output in (7).
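To make the encoder and the DiZeT decoder concrete, the following sketch builds a BMOCZ symbol from its zeros and decodes it after a noiseless multipath channel. It rests on assumptions that are not spelled out in the text above: the k-th zero (0-based) has angle 2πk/K, bit 1 selects radius R and bit 0 selects 1/R, the radius is R = sqrt(1 + sin(π/K)), and the decision compares |Y(R e^{jθ_k})| against R^{N-1} |Y(R^{-1} e^{jθ_k})| for a received length N. The channel taps are hypothetical example values.

```python
import cmath
import math


def bmocz_encode(bits):
    """Return (coefficients, R) of the unit-energy polynomial with the selected zeros."""
    K = len(bits)
    R = math.sqrt(1 + math.sin(math.pi / K))      # assumed radius choice
    coeffs = [1 + 0j]                             # coeffs[n] multiplies z**n
    for k, bit in enumerate(bits):
        zero = (R if bit else 1 / R) * cmath.exp(2j * math.pi * k / K)
        shifted = [0j] + coeffs                   # z * X(z)
        scaled = [zero * c for c in coeffs] + [0j]  # zero * X(z)
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in coeffs))
    return [c / norm for c in coeffs], R


def poly_eval(coeffs, z):
    """Horner evaluation of sum_n coeffs[n] * z**n."""
    acc = 0j
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc


def dizet_decode(y, K, R):
    """Direct zero testing at the 2K candidate zero positions."""
    N = len(y)
    bits = []
    for k in range(K):
        phase = cmath.exp(2j * math.pi * k / K)
        outer = abs(poly_eval(y, R * phase))
        inner = R ** (N - 1) * abs(poly_eval(y, phase / R))  # weight balances noise gain
        bits.append(1 if outer < inner else 0)
    return bits


def convolve(x, h):
    out = [0j] * (len(x) + len(h) - 1)
    for i, a in enumerate(x):
        for j, b in enumerate(h):
            out[i + j] += a * b
    return out


message = [1, 0, 1, 1, 0, 0, 1, 0]
x, R = bmocz_encode(message)
y = convolve(x, [1, 0.5, 0.2j])                  # hypothetical 3-tap channel, no noise
decoded = dizet_decode(y, len(message), R)
```

The channel only adds zeros to the received polynomial, so the data zeros survive the convolution; this is why no channel estimate appears anywhere in the decoder. Multiplying `y` by a global phase leaves both magnitudes in the comparison unchanged.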
III Timing Offset and Effective Delay Spread for BMOCZ
In an asynchronous communication, the receiver does not know when a packet from a transmitter (user) will arrive. Hence, at first the receiver has to detect a transmitted packet, which is already one bit of information. We will assume that the receiver decides correctly that, in an observation window of received samples, one single MOCZ packet of length with maximal channel length of is captured. By assuming a maximal length and a known or a maximal at the receiver, the observation window can be chosen, for example, as . From the noise floor knowledge at the receiver, a simple energy detector with a hard threshold over the observation window can be used for packet detection. Then, an unknown TO and CFO will be present in the observation window
The challenge here is to identify and the efficient channel length which contains most of the energy of the instantaneous CIR realization . The estimation of these Timing-of-Arrival (TOA) parameters is usually done by observing the same channel under many symbol transmissions, to obtain a sufficient statistic of the channel's PDP [GGKST03], [CWM02]. Since we only have one observation available, a good estimation is very challenging.
The efficient (instantaneous) channel length , defined by an energy concentration window, will be much less than the maximal channel length , due to blockage and attenuation by the environment, which might also cause a sparse, clustered, and exponentially decaying power delay profile. For the MOCZ scheme, it is essential to correctly identify in the window (9) the first received sample from the transmitted symbol , or at least not to miss it, since it will carry most of the energy if is the line-of-sight (LOS) path. It was shown in [WJH18b] that for the optimal radius in BMOCZ, carries on average to of the BMOCZ symbol energy, see also Figure (1). On the other hand, an overestimated channel length will reduce the overall bit-error performance because the receiver collects unnecessary noise samples.
Since we assume no CSI at the receiver, the channel characteristic, i.e., the instantaneous power delay profile, has to be determined entirely from the received MOCZ symbol. We will introduce here an efficient approach for the BMOCZ design, by exploiting the radar properties of the Huffman sequences, to obtain excellent estimation of the timing offset and the effective channel delay in moderate and high SNR.
Huffman sequences have an impulsive autocorrelation (3), originally designed for radar applications, and are therefore very suitable to measure the channel impulse response [GG05]. Since the transmitted Huffman sequence is still unknown at the receiver, we cannot correlate the received signal with the correct Huffman sequence to retrieve the CIR. Instead, we will use an approximate universal Huffman sequence, which is just the first and last peak of a typical Huffman sequence, expressed by the impulses and for as
which we call the Huffman bracket of phase . Since the first and last coefficients are
see [WJH18b], typical Huffman sequences, i.e., having same amount of ones and zeros, will have
By correlating the modulated Huffman sequence with the Huffman bracket we keep the locational properties of the Huffman autocorrelation (trident)
where denotes the exterior signature and the interior signature of the Huffman sequence , see Figure (1). Here, the interior signature can be seen as the data noise floor distorting the trident in (3). Taking the absolute-squares in (13), we get for the three peaks of the approximated trident
where the side-peaks have energy
Since we get by (3) and that , where the lower bound is achieved for typical sequences with (having the same amount of ones and zeros) and the upper bound for (all ones or all zeros). If then the two coefficients (the exterior signature ) will carry all the energy of the Huffman sequence. But then also and the only Huffman sequences (real valued first and last coefficient) are given by and for else, which are the coefficients of polynomials with uniform zeros on the unit circle, see [WJH17a]. For given by (4) the autocorrelation side-lobe is exponentially decaying in but is bounded to for . Hence, , such that almost half of the Huffman sequence energy is always carried in the two peaks. If the CFO would be known, we can set and get for the center peak in (14)
i.e., the energy of the center peak is roughly twice as large as the energy of the side-peaks, and reveals the trident in the approximated Huffman autocorrelation . But , since we do not know the CFO and for some then we get for typical Huffman sequences, such that the power of the center peak will vanish. Hence, in the presence of an unknown CFO the center peak does not always identify the trident. We will therefore correlate the positive Huffman bracket with the absolute-square value of or in presence of noise and channel with the absolute-square of the received signal , which will result approximately in
where and are colored noise and
denotes the noisy trident which collects three times the instantaneous power delay profile of the shifted CIR. These three echoes of the CIR will be separated if we have . The approximation in (17) can be justified by the isometry property of the Huffman convolution. Briefly, the generated (banded) Toeplitz matrix , for any Huffman sequence , is a stable linear time-invariant (LTI) system, since the energy of the output satisfies for any CIR realization
Here, is the autocorrelation matrix of , which is the identity scaled by if . Hence, each normalized Huffman sequence generates an isometric operator having the best stability among all discrete-time LTI systems, as studied in [WJP15].
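The trident shape of the Huffman autocorrelation is easy to check numerically. The sketch below assumes the zero placement used for BMOCZ (angles 2πk/K, radii R or 1/R, with R = sqrt(1 + sin(π/K)) taken as an assumption) and unit-energy normalization. Because each conjugate-reciprocal zero pair contributes the same factor regardless of the bit, the autocorrelation polynomial is proportional to z^{2K} - (R^K + R^{-K}) z^K + 1, so the side-peak magnitude is 1/(R^K + R^{-K}); this value is derived here for the check and is not quoted from the text.

```python
import cmath
import math


def huffman_sequence(bits):
    """Unit-energy coefficients of the polynomial with zeros R**(+-1) * exp(2j*pi*k/K)."""
    K = len(bits)
    R = math.sqrt(1 + math.sin(math.pi / K))      # assumed radius choice
    coeffs = [1 + 0j]
    for k, bit in enumerate(bits):
        zero = (R if bit else 1 / R) * cmath.exp(2j * math.pi * k / K)
        shifted = [0j] + coeffs
        scaled = [zero * c for c in coeffs] + [0j]
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in coeffs))
    return [c / norm for c in coeffs], R


def autocorrelation(x):
    """a[m + K] = sum_n x[n] * conj(x[n - m]) for lags m = -K, ..., K."""
    K = len(x) - 1
    return [
        sum(x[n] * x[n - m].conjugate() for n in range(max(m, 0), min(K, K + m) + 1))
        for m in range(-K, K + 1)
    ]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
K = len(bits)
x, R = huffman_sequence(bits)
a = autocorrelation(x)
eta = 1 / (R ** K + R ** -K)                      # expected side-peak magnitude
bracket_energy = abs(x[0]) ** 2 + abs(x[K]) ** 2  # energy in first and last coefficient
```

Since |x_0 x_K| equals the side-peak magnitude, the inequality |x_0|^2 + |x_K|^2 >= 2|x_0 x_K| shows that the two outer coefficients carry at least twice the side-peak value of the unit symbol energy, in line with the statement that they hold almost half of it.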
III-A Timing Offset Estimation
The delay of the strongest path can be identified from the maximum in (17)
where the last equality follows from the fact that both peaks in are contributing between and . If the CIR has a LOS path, then and we have immediately found an estimate for the timing-offset by . In case of NLOS, or if the first paths are equally strong, we have to go further back and identify the first significant peak above the noise floor, since the convolution sum of the CIR with the interior signature might produce a significant peak. Note that this might result in a misidentification of the trident's center peak by (20), for example if . Therefore we will use as a peak threshold
which is the average power of the Huffman sequence distorted by the channel and noise. Comparing with the noise power, we found empirically that the noise-dependent threshold should be set to
to ensure that it lies, with high probability, above the instantaneous noise energy. Using the iterative back-stepping in Algorithm (1), we stop as soon as the sample power falls below the threshold , which finally yields an estimate of the timing-offset. In line of Algorithm (1) we update the timing-offset estimate if the sample power is larger than the threshold and the average power of the preceding samples is larger than the threshold divided by , which will be weighted by the number of back-steps.
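The first step of the timing-offset estimation, locating the trident's center by correlating the sample powers with the positive Huffman bracket, can be sketched as follows. This is a deliberately reduced illustration: it assumes a LOS-dominated channel, treats the bracket as two unit impulses K samples apart, and omits the thresholding and back-stepping of Algorithm (1), which is not reproduced in this text. The encoder, the radius R = sqrt(1 + sin(π/K)), and all numeric values are assumptions for the example.

```python
import cmath
import math


def bmocz_encode(bits):
    """Unit-energy BMOCZ symbol (assumed zero placement and radius, see lead-in)."""
    K = len(bits)
    R = math.sqrt(1 + math.sin(math.pi / K))
    coeffs = [1 + 0j]
    for k, bit in enumerate(bits):
        zero = (R if bit else 1 / R) * cmath.exp(2j * math.pi * k / K)
        shifted = [0j] + coeffs
        scaled = [zero * c for c in coeffs] + [0j]
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in coeffs))
    return [c / norm for c in coeffs]


def estimate_timing_offset(r, K):
    """Return argmax_n |r[n]|^2 + |r[n+K]|^2 (bracket correlation of the sample powers)."""
    power = [abs(v) ** 2 for v in r]
    scores = [power[n] + power[n + K] for n in range(len(r) - K)]
    return max(range(len(scores)), key=scores.__getitem__)


# One symbol placed at offset tau0 in a 20-sample window, with a CFO phase ramp
bits = [1, 0, 1, 1, 0, 0, 1, 0]
x = bmocz_encode(bits)
tau0, phi = 5, 0.4
r = [0j] * 20
for n, c in enumerate(x):
    r[tau0 + n] = cmath.exp(1j * (tau0 + n) * phi) * c
tau_hat = estimate_timing_offset(r, len(bits))
```

Working on |r_n|^2 rather than r_n removes the CFO phase ramp from the statistic, so the estimate does not depend on `phi`. In NLOS channels the maximum marks the strongest path rather than the first one, which is the case the back-stepping in the text is meant to handle.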
III-B Efficient Channel Length Estimation
Since the BMOCZ design does not need any channel knowledge at the receiver, it is also well suited for estimating the channel itself at the receiver. Here, a good channel length estimation is essential for the performance of the decoder if the power delay profile (PDP) is decaying. To some extent, the channel delays will fade out exponentially and the receiver can cut off the received signal by using a certain energy ratio threshold. Let us recall the average received SNR
where is the energy of the BMOCZ symbols, which is constant for the codebook. If the power delay profile is flat, then the collected energy will be uniform and the SNR will not change if we cut the channel length at the receiver. However, the additional channel zeros will increase the confusion for the DiZeT decoder and reduce the BER performance. Therefore, the performance will decrease for increasing at a fixed symbol length , see simulations in [WJH18b]. For the most interesting scenario of the BER performance loss is only dB over , but will increase dramatically if . The reason for this behaviour is the collection of many noise taps, which will lead to more distortion of the transmitted zeros. Since in most realistic scenarios the PDP will be decaying, most of the channel energy will be concentrated in the first channel taps. Hence, if we cut the received signal length to , we will reduce the channel length to and improve the rSNR for non-flat PDPs with , since it holds
Since we obtain a significant gain in SNR if and . Hence, by cutting the received signal to the effective channel length, given by a certain energy concentration, we can improve the SNR and reduce at the same time the amount of channel zeros, which we will demonstrate by simulations in Section (V-C).
Assuming the knowledge of the noise floor at the receiver, a cut-off time can be defined as the window time which, for example, contains of the received energy. The estimation of the efficient channel length can be done after the detection of the timing-offset with Algorithm (1). We assume here that the maximal channel delay is . Since the BMOCZ symbol length is , we know that the samples of the received time-discrete signal in (2), which is the CIR correlated by shifts of and distorted by additive noise (we ignore here the CFO distortion since it is not relevant for the PDP estimation), see Figure (1(a)). We therefore need to determine by an energy concentration threshold, which depends on the instantaneous SNR of . We know that the last channel tap will be multiplied by , which is as strong as on average. There are many signal processing methods to detect the efficient energy window in the received samples, like total variation smoothing [BV04] or regularized least-squares methods [BV04, FR13], which promote short window sizes (sparsity). We propose in Algorithm (2) to iteratively increase , starting at , until enough channel energy is collected. Here we set the estimated channel/signal energy to
where we start with the maximal CIR length . By assuming a path exponent of we can calculate a threshold for the effective energy with . The algorithm then collects as many samples of until the energy is achieved and sets . The extracted modulated signal is then given by
which will be processed further for CFO detection and final decoding.
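The idea of growing the window until enough energy is collected can be illustrated with a simplified stand-in for Algorithm (2), which is not reproduced in this text. The sketch assumes the timing offset has already been removed, a symbol of K+1 coefficients, a known noise variance per sample, and a hypothetical energy fraction `fraction` as the stopping rule. The paper's actual threshold, which is based on a path exponent, is not modeled here.

```python
def estimate_channel_length(r, K, L_max, noise_var, fraction=0.95):
    """Smallest L_e such that the first K + L_e samples hold `fraction` of the
    noise-corrected energy found in the first K + L_max samples."""
    n_total = K + L_max
    power = [abs(v) ** 2 for v in r[:n_total]]
    power += [0.0] * (n_total - len(power))       # pad if the window is shorter
    signal_energy = max(sum(power) - n_total * noise_var, 0.0)
    collected = sum(power[:K])
    for L_e in range(1, L_max + 1):
        collected += power[K + L_e - 1]
        if collected - (K + L_e) * noise_var >= fraction * signal_energy:
            return L_e
    return L_max


def truncate(r, K, L_e):
    """Keep only the K + L_e samples assigned to symbol plus effective channel."""
    return r[:K + L_e]


# Hypothetical noiseless example: K = 4, almost all energy in the first five samples
r = [1, 1, 1, 1, 1, 0.1, 0, 0, 0, 0]
L_loose = estimate_channel_length(r, K=4, L_max=6, noise_var=0.0, fraction=0.95)
L_tight = estimate_channel_length(r, K=4, L_max=6, noise_var=0.0, fraction=0.999)
```

A lower `fraction` cuts the tail earlier, which removes noise-only samples and spurious channel zeros at the price of discarding some signal energy; a value close to one keeps weak late taps.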
IV Carrier Frequency Offset
We assume now, that the down-converted baseband signal in (9) has no further timing-offset and captured all path delays up to . The signal will experience an unknown CFO of
This is a common problem in many multi-carrier systems, such as OFDM, which therefore require CFO estimation algorithms [Moo94, LLTC04, ZGX10]. For a bandwidth of , the relative frequency offset is
Let us consider, for example, a carrier frequency of GHz with a drastic frequency offset of MHz and bandwidth MHz. This would result in a relative frequency offset , which is able to rotate all zeros by any in the z-plane. Hence, the received polynomial (noiseless) will experience a rotation of all its zeros by the angle
As illustrated in Figure (3), we have to ensure that each rotated zero (red) does not leave the zero-codebook set (blue) . To apply the DiZeT decoder, we have to find such that , i.e., we need to ensure that all the data zeros will lie on the uniform grid. Hence, for the CFO can be split in
for some and , where is called the integer and the fractional CFO, which are also present in OFDM systems [CKYK10, Ch. 5.2]. Only if (or if it is correctly compensated) will the DiZeT decoder sample at the correct zero positions and decode, due to the unknown integer shift , a cyclically permuted bit sequence , which we will correct in Section (IV-C) by a cyclically permutable code.
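The split into an integer and a fractional part is plain arithmetic and can be sketched as below. The sketch assumes a sample period equal to the inverse bandwidth, so that the phase increment per sample is 2π·Δf/W taken modulo 2π, and a base angle of 2π/K. The numbers in the example are hypothetical and are not the values of the example in the text, which are not recoverable here.

```python
import math


def split_cfo(delta_f, bandwidth, K):
    """Return (phi, integer_part, fractional_angle) with
    phi = integer_part * (2*pi/K) + fractional_angle and 0 <= fractional_angle < 2*pi/K."""
    phi = (2 * math.pi * delta_f / bandwidth) % (2 * math.pi)
    base = 2 * math.pi / K
    integer_part = min(int(phi // base), K - 1)   # guard against rounding at the wrap-around
    fractional_angle = phi - integer_part * base
    return phi, integer_part, fractional_angle


# Hypothetical numbers: 0.3 MHz offset at 1 MHz bandwidth with K = 8 zeros
phi, l, theta = split_cfo(delta_f=0.3e6, bandwidth=1e6, K=8)
```

In this example the offset corresponds to 2.4 base angles: two whole grid steps, which only permute the message cyclically, plus 0.4 of a step that has to be estimated and compensated before the zero testing.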
IV-A Decoding BMOCZ via FFT
The DiZeT decoder for BMOCZ allows also a simple hardware implementation at the receiver. Let us scale the received samples with the radius powers respectively
By applying the point unitary IDFT matrix on the zero-padded scaled signal, where with , we get the samples of the z-transform (an even more efficient FFT calculation with could be achieved if for some ) by
where . Hence, the DiZeT decoder simplifies to
Here, can be seen as an oversampling factor of the IDFT, where we pick each th sample point to obtain the zero sample values. Hence, the decoder can be fully implemented by a simple IDFT from the delayed amplified received signal, by using for example FPGA or even analog front-ends. We can also rewrite the diagonal scaling matrix (31) in the symmetric form
such that corresponds to a time-reversal of the diagonal, which brings us to
since the absolute values cancel the phases from a circular shift and the conjugate-time-reversal , where is the circular time-reversal, can be rewritten by using .
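The transform-based decoder can be sketched as follows. The received samples are scaled by powers of R and of 1/R, zero-padded to Q·K points, transformed, and every Q-th output is compared. For self-containment the sketch uses a direct O(N²) IDFT sum instead of an FFT routine and two separate transforms rather than the symmetric rewriting discussed above; the unnormalized transform is used because a common scale factor does not affect the comparison. Zero placement, the radius R = sqrt(1 + sin(π/K)), and the example channel are assumptions.

```python
import cmath
import math


def bmocz_encode(bits):
    """Unit-energy BMOCZ symbol (assumed zero placement and radius, see lead-in)."""
    K = len(bits)
    R = math.sqrt(1 + math.sin(math.pi / K))
    coeffs = [1 + 0j]
    for k, bit in enumerate(bits):
        zero = (R if bit else 1 / R) * cmath.exp(2j * math.pi * k / K)
        shifted = [0j] + coeffs
        scaled = [zero * c for c in coeffs] + [0j]
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in coeffs))
    return [c / norm for c in coeffs], R


def convolve(x, h):
    out = [0j] * (len(x) + len(h) - 1)
    for i, a in enumerate(x):
        for j, b in enumerate(h):
            out[i + j] += a * b
    return out


def idft(v):
    """Unnormalized inverse DFT: out[q] = sum_t v[t] * exp(+2j*pi*t*q/n)."""
    n = len(v)
    return [sum(v[t] * cmath.exp(2j * math.pi * t * q / n) for t in range(n))
            for q in range(n)]


def dizet_decode_idft(y, K, R):
    """Sample the z-transform on the circles of radius R and 1/R and pick every Q-th bin."""
    N = len(y)
    Q = -(-N // K)                                # smallest Q with Q*K >= N
    pad = [0j] * (Q * K - N)
    outer = idft([v * R ** n for n, v in enumerate(y)] + pad)
    inner = idft([v * R ** -n for n, v in enumerate(y)] + pad)
    weight = R ** (N - 1)
    return [1 if abs(outer[Q * k]) < weight * abs(inner[Q * k]) else 0
            for k in range(K)]


message = [1, 0, 1, 1, 0, 0, 1, 0]
x, R = bmocz_encode(message)
y = convolve(x, [1, 0.5, 0.2j])                   # hypothetical 3-tap channel, no noise
decoded = dizet_decode_idft(y, len(message), R)
```

Bin Q·k of the Q·K-point transform of the R-scaled samples equals the received polynomial evaluated at R·exp(2πjk/K), so the result matches direct zero testing. In practice the `idft` helper would be replaced by an FFT, which is where the complexity saving comes from.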
IV-B Fractional CFO Estimation via Oversampled FFTs
To estimate the fractional frequency offset, we will oversample by choosing to add further zero blocks to . This leads to an oversampling factor of and allows us to quantize in uniform bins with separation for a base angle . Hence, the absolute values of the sampled z-transform in (27) of the rotated codebook-zeros are given by
for each and , where is addition modulo . To estimate the fractional frequency offset of the base angle, we will sum the smaller sample values and select the fraction corresponding to the smallest sum
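The selection rule, summing the K smaller sample values for each candidate fraction and picking the smallest sum, can be sketched as below. Several conventions are assumptions of this illustration: the CFO multiplies sample n by exp(j·n·phi), which moves each zero from angle θ to θ − phi; candidate fraction q therefore tests the bins (Q·k − q) mod Q·K; the inner-circle samples are weighted by R^{N-1} before comparison; and a direct IDFT sum stands in for the FFT. With a different sign convention the index offset would flip.

```python
import cmath
import math


def bmocz_encode(bits):
    """Unit-energy BMOCZ symbol (assumed zero placement and radius R = sqrt(1 + sin(pi/K)))."""
    K = len(bits)
    R = math.sqrt(1 + math.sin(math.pi / K))
    coeffs = [1 + 0j]
    for k, bit in enumerate(bits):
        zero = (R if bit else 1 / R) * cmath.exp(2j * math.pi * k / K)
        shifted = [0j] + coeffs
        scaled = [zero * c for c in coeffs] + [0j]
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in coeffs))
    return [c / norm for c in coeffs], R


def convolve(x, h):
    out = [0j] * (len(x) + len(h) - 1)
    for i, a in enumerate(x):
        for j, b in enumerate(h):
            out[i + j] += a * b
    return out


def idft(v):
    n = len(v)
    return [sum(v[t] * cmath.exp(2j * math.pi * t * q / n) for t in range(n))
            for q in range(n)]


def estimate_fractional_cfo(y, K, R, Q):
    """Return (q_hat, bits): the fraction index with the smallest sum of the K smaller
    sample values, and the (possibly cyclically shifted) bits decoded on that grid."""
    N, n_fft = len(y), Q * K
    if n_fft < N:
        raise ValueError("Q*K must be at least the received length")
    pad = [0j] * (n_fft - N)
    outer = idft([v * R ** n for n, v in enumerate(y)] + pad)
    inner = idft([v * R ** -n for n, v in enumerate(y)] + pad)
    weight = R ** (N - 1)
    best = (0, float("inf"), [])
    for q in range(Q):
        total, bits = 0.0, []
        for k in range(K):
            i = (Q * k - q) % n_fft
            a, b = abs(outer[i]), weight * abs(inner[i])
            total += min(a, b)
            bits.append(1 if a < b else 0)
        if total < best[1]:
            best = (q, total, bits)
    return best[0], best[2]


message = [1, 0, 1, 1, 0, 0, 1, 0]
K, Q = len(message), 4
x, R = bmocz_encode(message)
y = convolve(x, [1, 0.5, 0.2j])                   # hypothetical channel, no noise
phi = 2 * math.pi * (3 * Q + 1) / (Q * K)         # 3 base angles plus 1/4 of a base angle
y_cfo = [cmath.exp(1j * n * phi) * v for n, v in enumerate(y)]
q_hat, shifted_bits = estimate_fractional_cfo(y_cfo, K, R, Q)
```

In this noiseless example the estimator returns the fraction index 1, and the decoded word is the message cyclically shifted by the three integer steps, which is the ambiguity the cyclically permutable code is meant to resolve. The resolution of the fractional estimate is one Q-th of the base angle, so a larger Q trades transform size for accuracy.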
IV-C Using Cyclically Permutable Codes
To be robust against rotations which are integer multiples of the base angle, we will need an outer block code in for the binary message , which is invariant against cyclic shifts, i.e., a bijective mapping on the Galois Field
such that for any . We will use the common notation for the code length . Such a block code is called a cycling register code (CRC) [Gol67], which can be constructed from the linear block code , by separating it in all its cyclic equivalence classes
where has cyclic order if for the smallest possible . To make coding one-to-one, each equivalence class can be represented by the codeword with the smallest decimal value [RW75] (the authors of [RW75] refer to cycling register codes as cyclically permutable codes, which are nowadays defined differently; furthermore, they claim that CRCs are also comma-free codes, which is not true by definition). Then is given by the union of all its equivalence class representatives and their cyclic shifts, i.e.
This generates, in a systematic way, a look-up table for the cycling register code. Unfortunately, the construction is non-linear and combinatorially difficult. However, the cardinality of such a code is proven explicitly for any positive integer $n$ in [Gol67, Thm.VI.3] to be (number of cycles in a cycling register)
$$\frac{1}{n}\sum_{d \mid n} \Phi(d)\, 2^{n/d},$$
where $\Phi(d)$ is the Euler function, which counts the number of elements in $\{1, \dots, d\}$ coprime to $d$. For $n$ prime, we obtain
$$\frac{2^{n} - 2}{n} + 2,$$
which would allow encoding at least bits. For this would result in a loss of only