I Relevance
The origin of the Bussgang decomposition is a technical report by Julian J. Bussgang from 1952 [1]. Interestingly, the decomposition is not explicitly stated in his report but is rather a consequence of his results. In fact, it is mainly nontrivial extensions of his results that are utilized in current research; for example, applications to complex-valued multiple-input multiple-output (MIMO) systems are popular in the communication community. There is no standard reference that presents and proves those extended results, and it can be hard to tell which results are exact and which are mere approximations. This lecture note fills these gaps.
II Prerequisites
This lecture note requires basic knowledge of random variables, linear algebra, signals and systems, and estimation theory.
III Original Bussgang Decomposition for Real Gaussian Random Variables
In the original paper [1], Bussgang considers two jointly Gaussian stationary random processes $x(t)$ and $y(t)$. The process $y(t)$ undergoes a nonlinear memoryless distortion represented by the function $U(\cdot)$. The resulting non-Gaussian random process is
$$z(t) = U(y(t)). \tag{1}$$
Bussgang computed the cross-correlation of the two random variables obtained by sampling $x(t)$ and $z(t)$ at specific time instances. Let $x = x(t_1)$ and $y = y(t_2)$ denote the zero-mean Gaussian random variables obtained by sampling at time $t_1$ and $t_2$, respectively. Moreover, let $z = U(y)$ be the sampled output of the nonlinear distortion function. We then have the following main result from [1, Sec. III].
Theorem 1 (The Bussgang theorem).
The cross-correlation of $x$ and $z$ is
$$\mathbb{E}\{z x\} = B \, \mathbb{E}\{y x\}, \tag{2}$$
where $B = \mathbb{E}\{U(y) y\} / \mathbb{E}\{y^2\}$ is called the Bussgang gain and $\mathbb{E}\{y x\}$ is the cross-correlation of $x$ and $y$.
The Bussgang theorem shows that the cross-correlation between two Gaussian signals is the same before and after one of them has passed through a nonlinear function, except for a scaling factor $B$. The value of $B$ depends on the choice of $U(\cdot)$, but the theorem holds for any function for which the expectations exist.
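Theorem 1 can be checked numerically. The following is a minimal Monte Carlo sketch, not taken from [1]: it writes $B$ for the Bussgang gain, and the distortion function (`tanh`), the correlation `rho`, the unit variances, the sample size, and the seed are all illustrative assumptions.

```python
import math
import random

def bussgang_check(U, rho=0.6, n=200_000, seed=0):
    """Estimate E{zx}, the Bussgang gain B, and E{yx} by Monte Carlo.

    x and y are zero-mean, unit-variance, jointly Gaussian with E{yx} = rho
    (assumed values for illustration); z = U(y).
    """
    rng = random.Random(seed)
    s_zx = s_zy = s_yy = s_yx = 0.0
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        # Build x so that it is jointly Gaussian with y and E{yx} = rho.
        x = rho * y + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        z = U(y)
        s_zx += z * x
        s_zy += z * y
        s_yy += y * y
        s_yx += y * x
    B = s_zy / s_yy  # Bussgang gain E{U(y) y} / E{y^2}
    return s_zx / n, B, s_yx / n

c_zx, B, c_yx = bussgang_check(math.tanh)
print(f"E{{zx}}      = {c_zx:.4f}")
print(f"B * E{{yx}}  = {B * c_yx:.4f}")
```

The two printed numbers should agree up to Monte Carlo error; swapping `math.tanh` for another memoryless function changes $B$ but not the agreement.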
A consequence of Theorem 1 for $x = y$ is that the output signal $z = U(y)$ can be decomposed as
$$z = U(y) = B y + \eta, \tag{3}$$
where $\eta$ is a zero-mean random variable that is uncorrelated with both $y$ and $x$. This is the Bussgang decomposition in its elementary form and shows that the output contains the useful part $By$ and the distortion part $\eta$. In other words, the output of a nonlinear function is equal to a scaled version of the input plus the uncorrelated distortion $\eta$. Note that $y$ and $\eta$ are not independent. Since $z = U(y)$ is a deterministic function of $y$, the distortion term $\eta = U(y) - By$ is non-Gaussian distributed and statistically dependent on $y$. Even though the Bussgang decomposition is named after Bussgang, the result is not explicitly stated in [1].
IV Bussgang Decomposition for Complex Random Variables
The Bussgang theorem was extended to the complex case in [2]. We will present this result and then provide a direct proof that is inspired by [3]. For notational convenience, in the remainder of this lecture note, we use $p_x = \mathbb{E}\{|x|^2\}$ to denote the power of a signal $x$ and we use $C_{xy} = \mathbb{E}\{x y^*\}$ to denote the cross-correlation between $x$ and $y$.
Theorem 2 (The complex Bussgang theorem).
Consider the jointly circularly symmetric complex Gaussian random variables $x$ and $y$. Let $z = U(y)$ be the output of a deterministic function $U(\cdot)$. The cross-correlations $C_{zx}$ and $C_{yx}$ are then related as
$$C_{zx} = \frac{C_{zy}}{p_y} C_{yx}. \tag{4}$$
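Proof:
We begin by decomposing $x$ into two parts:
$$x = \frac{C_{xy}}{p_y} y + \varepsilon. \tag{5}$$
Interestingly, this is equivalent to computing a minimum mean-squared error (MMSE) estimate of $x$ given $y$, with $\varepsilon$ representing the estimation error. Hence, it follows that the second part, $\varepsilon$, in (5) is uncorrelated with $y$:
$$\mathbb{E}\{\varepsilon y^*\} = \mathbb{E}\{x y^*\} - \frac{C_{xy}}{p_y} \mathbb{E}\{|y|^2\} = C_{xy} - C_{xy} = 0. \tag{6}$$
Since $x$ and $y$ are jointly Gaussian, the fact that $\varepsilon$ and $y$ are uncorrelated implies that they are also independent complex Gaussian variables. By using the decomposition in (5), it follows that
$$C_{zx} = \mathbb{E}\{U(y) x^*\} = \frac{C_{xy}^*}{p_y} \mathbb{E}\{U(y) y^*\} + \mathbb{E}\{U(y) \varepsilon^*\} = \frac{C_{zy}}{p_y} C_{yx} \tag{7}$$
by using that $C_{xy}^* = C_{yx}$ and that the independence between $\varepsilon$ and $y$ implies $\mathbb{E}\{U(y) \varepsilon^*\} = \mathbb{E}\{U(y)\} \mathbb{E}\{\varepsilon^*\} = 0$. ∎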
The complex Bussgang theorem is the natural complex-valued extension of Theorem 1. The corresponding complex Bussgang decomposition is given by (3) with the only exception that the Bussgang gain is now computed as $B = C_{zy}/p_y = \mathbb{E}\{U(y) y^*\} / \mathbb{E}\{|y|^2\}$ instead.
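The complex theorem and the resulting decomposition can be illustrated with a short simulation. This is a sketch under stated assumptions: the soft limiter below (amplitude clipping that preserves the phase) is a hypothetical distortion chosen for illustration, and `rho`, the unit powers, the sample size, and the seed are likewise arbitrary. The notation follows the text: $B$ is the Bussgang gain and $\eta = z - By$ is the distortion term.

```python
import math
import random

def cn(rng, power=1.0):
    """One sample of a circularly symmetric complex Gaussian CN(0, power)."""
    s = math.sqrt(power / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def soft_limiter(y, a_max=1.0):
    """Hypothetical distortion: clip the amplitude at a_max, keep the phase."""
    a = abs(y)
    return y if a <= a_max else y * (a_max / a)

def corr(a, b):
    """Sample cross-correlation E{a b*}."""
    return sum(u * v.conjugate() for u, v in zip(a, b)) / len(a)

rng = random.Random(1)
n = 200_000
rho = 0.5 + 0.3j                        # E{x y*} for unit-power x and y
ys = [cn(rng) for _ in range(n)]
xs = [rho * y + cn(rng, 1.0 - abs(rho) ** 2) for y in ys]
zs = [soft_limiter(y) for y in ys]

B = corr(zs, ys) / corr(ys, ys)         # complex Bussgang gain C_zy / p_y
eta = [z - B * y for z, y in zip(zs, ys)]

print("C_zx      :", corr(zs, xs))
print("B * C_yx  :", B * corr(ys, xs))
print("E{eta y*} :", corr(eta, ys))     # zero up to rounding, by construction

# Uncorrelated is not independent: the powers |eta|^2 and |y|^2 co-vary.
pw_eta = [abs(e) ** 2 for e in eta]
pw_y = [abs(y) ** 2 for y in ys]
cov_pw = (sum(a * b for a, b in zip(pw_eta, pw_y)) / n
          - (sum(pw_eta) / n) * (sum(pw_y) / n))
print("cov(|eta|^2, |y|^2):", cov_pw)
```

The first two printed values should match up to Monte Carlo error, while the clearly nonzero power covariance shows that the distortion remains statistically dependent on the input.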
A first use case of the Bussgang decomposition is to quantify the signal-to-distortion ratio (SDR) at the output of the distortion function. The SDR is simply the power ratio of the desired signal $By$ to the additive distortion $\eta$:
$$\mathrm{SDR} = \frac{\mathbb{E}\{|By|^2\}}{\mathbb{E}\{|\eta|^2\}} = \frac{|B|^2 p_y}{p_z - |B|^2 p_y}, \tag{8}$$
where we have used that the additive distortion $\eta$ is uncorrelated with the desired signal $By$.
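As a rough illustration of (8), the sketch below estimates the SDR by Monte Carlo as $|B|^2 p_y / (p_z - |B|^2 p_y)$, with $B$ the Bussgang gain. The soft limiter and its clipping levels `a_max` are assumptions made for this example only, as are the unit input power and the seed.

```python
import math
import random

def estimate_sdr(a_max, n=100_000, seed=2):
    """Monte Carlo estimate of the SDR for a soft limiter with level a_max.

    The input is CN(0, 1); the limiter clips the amplitude and keeps the phase.
    """
    rng = random.Random(seed)
    s = math.sqrt(0.5)                  # per-component std for p_y = 1
    c_zy = 0j
    p_y = p_z = 0.0
    for _ in range(n):
        y = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        a = abs(y)
        z = y if a <= a_max else y * (a_max / a)
        c_zy += z * y.conjugate()
        p_y += a * a
        p_z += abs(z) ** 2
    c_zy, p_y, p_z = c_zy / n, p_y / n, p_z / n
    B = c_zy / p_y                      # Bussgang gain
    useful = abs(B) ** 2 * p_y          # power of the desired part B*y
    return useful / (p_z - useful)      # distortion power is p_z - |B|^2 p_y

for a_max in (0.5, 1.0, 2.0):
    sdr = estimate_sdr(a_max)
    print(f"a_max = {a_max}: SDR = {10 * math.log10(sdr):.1f} dB")
```

A higher clipping level distorts fewer samples, so the estimated SDR grows with `a_max`.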
A second use case is to analyze the performance of a communication system where $y$ is the transmitted information signal. Suppose the received signal is the noisy distorted signal $r = U(y) + n$, where $U(\cdot)$ models the hardware distortion and $n$ is thermal noise with power $\sigma^2$. The hardware distortion might, for example, be caused by a sequence of non-ideal blocks in the receiver hardware [4], as illustrated in Fig. 1. The first block is the low-noise amplifier (LNA) that can distort both the amplitude and phase of the input signal. In the yellow figure, the amplitude distortion is exemplified and clipping occurs for input signals with large amplitudes. The second block is the in-phase/quadrature (I/Q) demodulator that might have mismatches between its branches, leading to I/Q imbalance. In the green curve, the effect of I/Q imbalance is shown on a QPSK constellation where the actual transmitted points are affected by leakage from the mirror subcarriers. Finally, in the analog-to-digital converter (ADC) block, the real and imaginary parts of the received signal are quantized to be represented by a finite number of bits. Quantization distortion is inevitable even if a large number of ADC bits is used [5, 6]. We can use the Bussgang decomposition in (3) to rewrite the received signal as
$$r = B y + \eta + n. \tag{9}$$
This signal contains a desired part $By$ and an uncorrelated additive “noise” term $\eta + n$. Since the latter term is uncorrelated with $y$, we can utilize the worst-case uncorrelated additive noise theorem from [7] to compute an achievable data rate. That theorem says that the worst distribution of $\eta + n$ from a rate perspective is independent complex Gaussian, in which case the rate is
$$\log_2\left(1 + \frac{|B|^2 p_y}{p_\eta + \sigma^2}\right). \tag{10}$$
One can possibly achieve a larger rate than (10) by making use of the information content in $\eta$, but (10) is achieved if we treat $\eta + n$ as independent Gaussian noise in the decoder.
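The rate expression in (10) is easy to evaluate once the Bussgang gain $B$, the distortion power $p_\eta$, and the noise power $\sigma^2$ are known. The sketch below assumes the form $\log_2(1 + |B|^2 p_y / (p_\eta + \sigma^2))$; the numerical values of `B`, `p_y`, and `p_eta` are hypothetical and serve only to show the trend.

```python
import math

def achievable_rate(B, p_y, p_eta, sigma2):
    """Rate when distortion plus noise is treated as independent Gaussian noise."""
    return math.log2(1.0 + abs(B) ** 2 * p_y / (p_eta + sigma2))

# Hypothetical numbers, chosen only for illustration.
B, p_y, p_eta = 0.8, 1.0, 0.05
for snr_db in (0, 10, 20, 30, 40):
    sigma2 = p_y / 10 ** (snr_db / 10)
    rate = achievable_rate(B, p_y, p_eta, sigma2)
    print(f"SNR = {snr_db:2d} dB: rate = {rate:.2f} bit per channel use")

# As the thermal noise vanishes, the rate saturates at log2(1 + SDR).
ceiling = math.log2(1.0 + abs(B) ** 2 * p_y / p_eta)
print(f"distortion-limited ceiling: {ceiling:.2f} bit per channel use")
```

The saturation reflects that the distortion term scales with the signal, so increasing the transmit power alone cannot remove it.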
IV-A Alternative Computation of the Bussgang Gain and Two Examples
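If the distortion function is differentiable, there is an alternative way of computing the Bussgang gain that might be easier. We will exemplify how to compute it in the real-valued case where $y \sim \mathcal{N}(0, p_y)$ has the probability density function (PDF) $f_Y(y) = \frac{1}{\sqrt{2\pi p_y}} e^{-y^2/(2 p_y)}$. Note that the derivative of this PDF is $f_Y'(y) = -\frac{y}{p_y} f_Y(y)$. We can then rewrite the Bussgang gain as
$$B = \frac{\mathbb{E}\{U(y) y\}}{p_y} = \int_{-\infty}^{\infty} U(y) \frac{y}{p_y} f_Y(y) \, dy \overset{(a)}{=} -\int_{-\infty}^{\infty} U(y) f_Y'(y) \, dy \overset{(b)}{=} \int_{-\infty}^{\infty} U'(y) f_Y(y) \, dy = \mathbb{E}\{U'(y)\}, \tag{11}$$
where we identify $f_Y'(y)$ in $(a)$ and integrate by parts to get $(b)$. The last expression in (11) reveals that the Bussgang gain can also be computed as the expected value of the first derivative of the distortion function. This result is a special case of Price’s Theorem [8, Example 9-17].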
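The equality between the correlation form $\mathbb{E}\{U(y) y\}/p_y$ and the derivative form $\mathbb{E}\{U'(y)\}$ of the Bussgang gain can be verified by numerical integration against the Gaussian PDF. In this sketch, the choice $U(y) = \tanh(y)$, the variance `p_y`, and the trapezoidal rule with its truncation limits are assumptions made for illustration.

```python
import math

def gaussian_pdf(y, p_y):
    """PDF of a zero-mean real Gaussian variable with variance p_y."""
    return math.exp(-y * y / (2.0 * p_y)) / math.sqrt(2.0 * math.pi * p_y)

def integrate(g, lo, hi, steps=20_000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (g(lo) + g(hi))
    for k in range(1, steps):
        total += g(lo + k * h)
    return total * h

def U(y):
    return math.tanh(y)                 # example differentiable distortion

def dU(y):
    return 1.0 / math.cosh(y) ** 2      # its derivative

p_y = 2.0
lim = 10.0 * math.sqrt(p_y)             # the PDF is negligible beyond 10 std

# Correlation form: E{U(y) y} / p_y
B_corr = integrate(lambda y: U(y) * y * gaussian_pdf(y, p_y), -lim, lim) / p_y
# Derivative form: E{U'(y)}
B_deriv = integrate(lambda y: dU(y) * gaussian_pdf(y, p_y), -lim, lim)

print(f"E{{U(y) y}} / p_y = {B_corr:.8f}")
print(f"E{{U'(y)}}       = {B_deriv:.8f}")
```

Both integrals should agree to many digits, since the boundary term in the integration by parts vanishes for a bounded $U$.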
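Example 1 (One-bit quantization).
Consider a real-valued signal $y \sim \mathcal{N}(0, p_y)$ that enters the nonlinear distortion function $U(y) = \mathrm{sgn}(y)$, which represents one-bit quantization. The Bussgang gain can then be found as $B = \mathbb{E}\{U'(y)\} = \mathbb{E}\{2\delta(y)\} = 2 f_Y(0) = \sqrt{\frac{2}{\pi p_y}}$, where $\delta(\cdot)$ is the Dirac function. The same Bussgang gain can be computed as $B = \frac{\mathbb{E}\{\mathrm{sgn}(y) y\}}{p_y} = \frac{\mathbb{E}\{|y|\}}{p_y} = \sqrt{\frac{2}{\pi p_y}}$.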
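The one-bit example can be cross-checked by simulation. The sketch below assumes the sign function as the one-bit quantizer and compares a Monte Carlo estimate of $\mathbb{E}\{U(y) y\}/p_y$ with the closed form $\sqrt{2/(\pi p_y)}$; the variance `p_y`, the sample size, and the seed are arbitrary choices.

```python
import math
import random

rng = random.Random(5)
p_y = 4.0                                # assumed input variance
n = 200_000
ys = [rng.gauss(0.0, math.sqrt(p_y)) for _ in range(n)]
zs = [math.copysign(1.0, y) for y in ys]  # one-bit quantizer sgn(y)

# Sample version of E{U(y) y} / E{y^2}
B_mc = sum(z * y for z, y in zip(zs, ys)) / sum(y * y for y in ys)
B_closed = math.sqrt(2.0 / (math.pi * p_y))

print(f"Monte Carlo : {B_mc:.4f}")
print(f"Closed form : {B_closed:.4f}")
```

Note that the gain depends on the input power: since the quantizer output has fixed amplitude, a stronger input yields a smaller $B$.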
IV-B Additive Quantization Noise Model is Nothing But Bussgang Decomposition
The Bussgang decomposition is unique in the sense that it is the only decomposition of a distorted signal having the property that the additive distortion noise $\eta$ is uncorrelated with the desired signal $By$. No other value of $B$ can be used to achieve that.
One seemingly different decomposition is the Additive Quantization Noise Model (AQNM), originally proposed in [5] to model quantization errors. This model is sometimes described as an alternative decomposition; however, AQNM is nothing but the Bussgang decomposition for quantization. In [5, Lemma 1], a scalar quantizer function $Q(\cdot)$ is considered, which has the property $\mathbb{E}\{y \mid Q(y)\} = Q(y)$, which means that each quantization interval is represented by its mean value. When the input is $y$, it is shown that the output $z = Q(y)$ can be expressed as a summation of a scaled version of $y$ plus an uncorrelated distortion term $\eta$ as follows:
$$z = \alpha y + \eta, \tag{14}$$
where $\alpha = 1 - \beta$ and $\beta = \mathbb{E}\{|z - y|^2\} / p_y$.
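We will show that (14) equals the Bussgang decomposition $z = By + \eta$, where the Bussgang gain equals $B = \alpha = 1 - \beta$. Using the assumption $\mathbb{E}\{y \mid z\} = z$ from [5], we have
$$\mathbb{E}\{z y^*\} = \mathbb{E}\{z \, \mathbb{E}\{y^* \mid z\}\} = \mathbb{E}\{|z|^2\} = p_z. \tag{15}$$
By utilizing this result, the scaling in (14) can be rewritten as
$$\alpha = 1 - \frac{\mathbb{E}\{|z - y|^2\}}{p_y} = 1 - \frac{p_z + p_y - 2\Re(\mathbb{E}\{z y^*\})}{p_y} = \frac{p_z}{p_y} = \frac{\mathbb{E}\{z y^*\}}{p_y} = B. \tag{16}$$
Hence, the AQNM is a special case of the Bussgang decomposition for distortion functions that satisfy a particular condition. The bottom line is that the Bussgang decomposition is unique, but the value of $B$ depends on the distortion function.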
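The equivalence between the AQNM scaling and the Bussgang gain can be seen on sample data. In this sketch, which is an illustration rather than the construction in [5], a hypothetical 2-bit quantizer with fixed thresholds `edges` maps every cell to the sample mean of the inputs falling into it, so the centroid condition holds exactly on the sample. The symbol `beta` denotes the normalized quantization error power (an assumed reading of the AQNM parameter) and `B` the Bussgang gain.

```python
import random

rng = random.Random(3)
n = 100_000
ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
edges = (-1.0, 0.0, 1.0)                 # hypothetical quantizer thresholds

def cell(y):
    """Index (0..3) of the quantization interval that contains y."""
    return sum(y > e for e in edges)

# Centroid rule: represent each interval by the mean of the samples inside it.
sums = [0.0] * 4
counts = [0] * 4
for y in ys:
    k = cell(y)
    sums[k] += y
    counts[k] += 1
levels = [s / c for s, c in zip(sums, counts)]
zs = [levels[cell(y)] for y in ys]

p_y = sum(y * y for y in ys) / n
beta = sum((z - y) ** 2 for z, y in zip(zs, ys)) / n / p_y  # normalized error
B = sum(z * y for z, y in zip(zs, ys)) / n / p_y            # Bussgang gain

print(f"1 - beta = {1.0 - beta:.6f}")
print(f"B        = {B:.6f}")
```

With centroid levels the two numbers coincide up to rounding; with other output levels (for example, the interval midpoints) they would generally differ, and only the Bussgang gain would give an uncorrelated distortion term.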
V Extension to MIMO Systems
In recent years, it has become popular to analyze MIMO systems that are subject to hardware impairments, in particular, in MIMO communications [11, 12, 6]. In this part, we extend the Bussgang results to be applicable to such cases.
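Consider two jointly circularly symmetric Gaussian random vectors $\mathbf{x}$ and $\mathbf{y}$, which both have length $M$. The correlation matrices are denoted as $\mathbf{C}_{x} = \mathbb{E}\{\mathbf{x}\mathbf{x}^H\}$ and $\mathbf{C}_{y} = \mathbb{E}\{\mathbf{y}\mathbf{y}^H\}$ and are assumed to have full rank. The cross-correlation matrix is denoted as $\mathbf{C}_{xy} = \mathbb{E}\{\mathbf{x}\mathbf{y}^H\}$. Using this notation, we can generalize the Bussgang theorem as follows.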
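Theorem 3 (Bussgang Theorem for MIMO Distortions).
Consider the jointly circularly symmetric Gaussian random vectors $\mathbf{x}$ and $\mathbf{y}$. Let $\mathbf{U}(\cdot): \mathbb{C}^M \to \mathbb{C}^M$ denote a distortion function and let $\mathbf{z} = \mathbf{U}(\mathbf{y})$ be the distorted signal when using $\mathbf{y}$ as input. The cross-correlation matrix $\mathbf{C}_{zx}$ of $\mathbf{z}$ and $\mathbf{x}$ is a linear transformation of the cross-correlation matrix $\mathbf{C}_{yx}$ of $\mathbf{y}$ and $\mathbf{x}$:
$$\mathbf{C}_{zx} = \mathbf{C}_{zy} \mathbf{C}_{y}^{-1} \mathbf{C}_{yx}. \tag{17}$$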
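Proof:
The proof is a matrix extension of the proof of Theorem 2. Let us express $\mathbf{x}$ as a summation of the MMSE estimate of it given $\mathbf{y}$ and the estimation error $\boldsymbol{\varepsilon}$:
$$\mathbf{x} = \mathbf{C}_{xy} \mathbf{C}_{y}^{-1} \mathbf{y} + \boldsymbol{\varepsilon}, \tag{18}$$
where $\boldsymbol{\varepsilon}$ is defined as $\boldsymbol{\varepsilon} = \mathbf{x} - \mathbf{C}_{xy} \mathbf{C}_{y}^{-1} \mathbf{y}$. If we multiply both sides of (18) by $\mathbf{y}^H$ from the right and take the expectation, we obtain
$$\mathbf{C}_{xy} = \mathbf{C}_{xy} \mathbf{C}_{y}^{-1} \mathbf{C}_{y} + \mathbb{E}\{\boldsymbol{\varepsilon} \mathbf{y}^H\} = \mathbf{C}_{xy} + \mathbb{E}\{\boldsymbol{\varepsilon} \mathbf{y}^H\}, \tag{19}$$
from which it follows that $\mathbb{E}\{\boldsymbol{\varepsilon} \mathbf{y}^H\} = \mathbf{0}$. Hence, $\boldsymbol{\varepsilon}$ and $\mathbf{y}$ are uncorrelated, which implies that they are also independent since these are jointly Gaussian variables. Finally, we obtain (17) as
$$\mathbf{C}_{zx} = \mathbb{E}\{\mathbf{U}(\mathbf{y}) \mathbf{x}^H\} = \mathbb{E}\{\mathbf{U}(\mathbf{y}) \mathbf{y}^H\} \mathbf{C}_{y}^{-1} \mathbf{C}_{yx} + \mathbb{E}\{\mathbf{U}(\mathbf{y}) \boldsymbol{\varepsilon}^H\} = \mathbf{C}_{zy} \mathbf{C}_{y}^{-1} \mathbf{C}_{yx} \tag{20}$$
by utilizing that $\mathbb{E}\{\mathbf{U}(\mathbf{y}) \mathbf{y}^H\} = \mathbf{C}_{zy}$ and that $\mathbb{E}\{\mathbf{U}(\mathbf{y}) \boldsymbol{\varepsilon}^H\} = \mathbb{E}\{\mathbf{U}(\mathbf{y})\} \mathbb{E}\{\boldsymbol{\varepsilon}^H\} = \mathbf{0}$ since $\boldsymbol{\varepsilon}$ and $\mathbf{y}$ are independent. ∎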
From this theorem we notice that the Bussgang gain is represented by the matrix
$$\mathbf{B} = \mathbf{C}_{zy} \mathbf{C}_{y}^{-1} \tag{21}$$
and we call it a MIMO extension since the distortion function takes multiple inputs and provides multiple outputs. It is possible to extend the result to the case where $\mathbf{C}_{y}$ is rank-deficient, in which case the inverse in (21) is replaced by a pseudoinverse; see [3, Section II-A] for details.
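A consequence of Theorem 3 is the Bussgang decomposition for MIMO functions:
$$\mathbf{z} = \mathbf{U}(\mathbf{y}) = \mathbf{B} \mathbf{y} + \boldsymbol{\eta}, \tag{22}$$
where the additive distortion term $\boldsymbol{\eta}$ is uncorrelated both with $\mathbf{y}$ and any other random vector $\mathbf{x}$ that is jointly Gaussian and correlated with $\mathbf{y}$. This result is illustrated in Fig. 2(a).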
V-A Element-Wise Distortion for MIMO Systems
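The Bussgang decomposition for MIMO functions has been widely used to model the hardware impairments in multiple-antenna communication systems [6, 11]. In this case, $M$ is the number of receive antennas and the distortion function represents impairments in the antenna branches. A common assumption is that there is no crosstalk between the branches, so that each one can be separately modeled in the way shown in Fig. 1. The distortion function then has the form
$$\mathbf{U}(\mathbf{y}) = [U_1(y_1), \ldots, U_M(y_M)]^T, \tag{23}$$
where $y_m$ denotes the $m$th element of $\mathbf{y}$. Hence, each output is a distorted version of only the input having the same index. We can then simplify the Bussgang matrix by utilizing Theorem 3. More precisely, applying Theorem 2 to each pair of elements gives $\mathbb{E}\{U_m(y_m) y_n^*\} = B_m \mathbb{E}\{y_m y_n^*\}$, so it follows that $\mathbf{C}_{zy} = \mathbf{D} \mathbf{C}_{y}$, where $\mathbf{D} = \mathrm{diag}(B_1, \ldots, B_M)$ is a diagonal matrix and $B_m$ is the Bussgang gain corresponding to the $m$th component of the distortion function, i.e., $B_m = \mathbb{E}\{U_m(y_m) y_m^*\} / \mathbb{E}\{|y_m|^2\}$. Hence, the Bussgang gain matrix of the overall MIMO distortion becomes $\mathbf{B} = \mathbf{C}_{zy} \mathbf{C}_{y}^{-1} = \mathbf{D}$ and we obtain the simplified Bussgang decomposition
$$\mathbf{z} = \mathbf{D} \mathbf{y} + \boldsymbol{\eta}. \tag{24}$$
Hence, when an element-wise distortion function affects the Gaussian signal $\mathbf{y}$, the output is an element-wise scaled version of $\mathbf{y}$ plus a distortion vector $\boldsymbol{\eta}$ that is uncorrelated with $\mathbf{y}$.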
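The diagonal structure of the Bussgang gain matrix under element-wise distortion can be observed in a small simulation. This is a sketch for two branches: the soft limiters with clipping levels `a_max`, the branch correlation `rho`, and the hand-written 2-by-2 matrix helpers are illustrative assumptions. The matrix `B` is the sample version of the cross-correlation of output and input times the inverse input correlation matrix, and `D` holds the per-branch scalar gains.

```python
import math
import random

def cn(rng, power=1.0):
    """One sample of a circularly symmetric complex Gaussian CN(0, power)."""
    s = math.sqrt(power / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def soft_limiter(y, a_max):
    """Hypothetical per-branch distortion: clip the amplitude, keep the phase."""
    a = abs(y)
    return y if a <= a_max else y * (a_max / a)

def inv2(A):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def matmul2(A, C):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * C[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

rng = random.Random(4)
n = 200_000
rho = 0.7                                # assumed correlation between branches
a_max = (0.8, 1.5)                       # assumed clipping level per branch
C_zy = [[0j, 0j], [0j, 0j]]
C_y = [[0j, 0j], [0j, 0j]]
for _ in range(n):
    y1 = cn(rng)
    y = (y1, rho * y1 + cn(rng, 1.0 - rho ** 2))
    z = (soft_limiter(y[0], a_max[0]), soft_limiter(y[1], a_max[1]))
    for i in range(2):
        for j in range(2):
            C_zy[i][j] += z[i] * y[j].conjugate() / n
            C_y[i][j] += y[i] * y[j].conjugate() / n

B = matmul2(C_zy, inv2(C_y))             # Bussgang gain matrix C_zy C_y^{-1}
D = [C_zy[m][m] / C_y[m][m] for m in range(2)]  # per-branch scalar gains

for row in B:
    print([f"{v.real:+.3f}{v.imag:+.3f}j" for v in row])
print("per-branch gains:", [f"{d.real:.3f}" for d in D])
```

Even though the inputs are strongly correlated, the off-diagonal entries of `B` are close to zero and the diagonal matches the per-branch gains, up to Monte Carlo error.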
V-B Are the Elements of Distortion Uncorrelated?
Since the Bussgang gain matrix is diagonal when having element-wise distortions, one may tend to think that the elements of the distortion will also be uncorrelated, so that we effectively get one separate Bussgang decomposition per received signal. However, this is generally not the case, as we will show next. Let $\mathbf{C}_{\eta} = \mathbb{E}\{\boldsymbol{\eta}\boldsymbol{\eta}^H\}$ denote the correlation matrix of the distortion vector $\boldsymbol{\eta} = \mathbf{z} - \mathbf{D}\mathbf{y}$. Using the fact that $\boldsymbol{\eta}$ is uncorrelated with $\mathbf{y}$, it can be computed as
$$\mathbf{C}_{\eta} = \mathbb{E}\{(\mathbf{z} - \mathbf{D}\mathbf{y})(\mathbf{z} - \mathbf{D}\mathbf{y})^H\} = \mathbf{C}_{z} - \mathbf{D} \mathbf{C}_{y} \mathbf{D}^H. \tag{25}$$
Whenever the input signal $\mathbf{y}$ contains correlated elements, such that $\mathbf{C}_{y}$ is non-diagonal, the correlation matrix $\mathbf{C}_{\eta}$ will likely also be non-diagonal. This is intuitively quite clear: If two (almost) identical signals are sent through identical hardware components, then the distortion should also be (almost) identical. This type of correlation typically appears in wireless communications since each receive antenna observes a different linear combination of the same transmitted information signals. Some conditions for when the correlation can be neglected, so that $\mathbf{C}_{\eta}$ is approximately diagonal, are derived in [3]. However, it is rather common that the correlation is neglected without motivation (cf. [6, 12]), which might lead to substantial approximation errors.
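The correlation between distortion elements is easy to reproduce numerically. The sketch below is a deliberately simplified, real-valued two-branch case and not the setup of Fig. 3: both branches use a one-bit quantizer, the inputs have unit variance, and their correlation `rho` is an assumed value. The distortion correlation matrix is formed as the output correlation matrix minus the gain-scaled input correlation matrix, following the expression for the element-wise case.

```python
import math
import random

rng = random.Random(6)
n = 200_000
rho = 0.9                                # assumed correlation between the inputs

s_z1y1 = s_z2y2 = s_y1y1 = s_y2y2 = 0.0
s_z1z2 = s_y1y2 = 0.0
for _ in range(n):
    y1 = rng.gauss(0.0, 1.0)
    y2 = rho * y1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    z1 = math.copysign(1.0, y1)          # one-bit quantizer on branch 1
    z2 = math.copysign(1.0, y2)          # identical quantizer on branch 2
    s_z1y1 += z1 * y1
    s_z2y2 += z2 * y2
    s_y1y1 += y1 * y1
    s_y2y2 += y2 * y2
    s_z1z2 += z1 * z2
    s_y1y2 += y1 * y2

B1 = s_z1y1 / s_y1y1                     # per-branch Bussgang gains
B2 = s_z2y2 / s_y2y2

# Entries of C_eta = C_z - D C_y D^T (the outputs have unit power: z^2 = 1).
c_eta_11 = 1.0 - B1 ** 2 * s_y1y1 / n
c_eta_22 = 1.0 - B2 ** 2 * s_y2y2 / n
c_eta_12 = s_z1z2 / n - B1 * B2 * s_y1y2 / n
coeff = c_eta_12 / math.sqrt(c_eta_11 * c_eta_22)

print(f"distortion powers      : {c_eta_11:.3f}, {c_eta_22:.3f}")
print(f"off-diagonal element   : {c_eta_12:.3f}")
print(f"correlation coefficient: {coeff:.3f}")
```

The clearly nonzero correlation coefficient illustrates that a diagonal gain matrix does not imply a diagonal distortion correlation matrix; lowering `rho` towards zero makes the off-diagonal element vanish.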
As an example, we consider a setup where a 4-antenna receiver quantizes the real and imaginary parts of each entry in the received signal using identical $b$-bit ADCs. The input signal is generated as $\mathbf{y} = \mathbf{H}\mathbf{s}$, where $\mathbf{H} \in \mathbb{C}^{4 \times 4}$ is the MIMO channel matrix from a 4-antenna transmitter. We consider Rayleigh fading where $\mathbf{H}$ has independent $\mathcal{N}_{\mathbb{C}}(0, 1)$-distributed entries. For each channel realization, $\mathbf{H}$ is assumed perfectly known and the transmitted signal is $\mathbf{s} \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \mathbf{I})$, so $\mathbf{y}$ is conditionally complex Gaussian distributed. The Bussgang decomposition then says that the ADC output can be written as $\mathbf{z} = \mathbf{D}\mathbf{y} + \boldsymbol{\eta}$. To demonstrate that the elements of $\boldsymbol{\eta}$ are correlated, Fig. 3 shows the cumulative distribution function (CDF) of the normalized off-diagonal elements of $\mathbf{C}_{\eta}$ (i.e., the correlation coefficients) for different numbers of ADC bits. When the ADC resolution is low, most of the correlation coefficients are nonzero and some are rather large. However, when the ADC resolution is high, the off-diagonal elements are almost zero and can potentially be approximated as zero when quantifying communication rates.
V-C Generalized Bussgang Decomposition for Non-Gaussian Input Signals
In the Bussgang theorem, we are utilizing that $\mathbf{x}$ and $\mathbf{y}$ are Gaussian signals. The main result cannot be generalized to non-Gaussian signals. However, we can always decompose the distorted signal $\mathbf{z} = \mathbf{U}(\mathbf{y})$ according to (22) using the Bussgang gain matrix $\mathbf{B} = \mathbf{C}_{zy} \mathbf{C}_{y}^{-1}$, but it generally will not be a diagonal matrix, even if an element-wise distortion of the type in (23) is used. The intuition is that $\mathbf{B}\mathbf{y}$ is the linear MMSE estimate of $\mathbf{z}$ given a non-Gaussian distributed observation $\mathbf{y}$. In this analogy, $\boldsymbol{\eta} = \mathbf{z} - \mathbf{B}\mathbf{y}$ is the estimation error, which is uncorrelated with $\mathbf{y}$ since
$$\mathbb{E}\{\boldsymbol{\eta} \mathbf{y}^H\} = \mathbb{E}\{(\mathbf{z} - \mathbf{B}\mathbf{y}) \mathbf{y}^H\} = \mathbf{C}_{zy} - \mathbf{C}_{zy} \mathbf{C}_{y}^{-1} \mathbf{C}_{y} = \mathbf{0}. \tag{26}$$
The generalized Bussgang decomposition for non-Gaussian input is illustrated in Fig. 2(b). It is suitable both for quantifying the SDR and for analyzing the performance of nonlinear communication systems. For example, [10] did this using practically modulated data signals. The paper also shows that although treating the uncorrelated distortion as independent Gaussian noise is convenient, one can increase the performance by exploiting its information content.
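The loss of the diagonal structure for non-Gaussian inputs can be demonstrated with a toy case. This sketch is real-valued for simplicity and is not the setup of [10]: the first input is a 4-PAM symbol, the second is the same symbol plus Gaussian noise, and both pass through a hypothetical element-wise cubic distortion. The gain matrix `B` is the sample cross-correlation of output and input times the inverse input correlation matrix.

```python
import math
import random

rng = random.Random(7)
n = 200_000
pam = [v / math.sqrt(5.0) for v in (-3.0, -1.0, 1.0, 3.0)]  # unit-power 4-PAM

C_zy = [[0.0, 0.0], [0.0, 0.0]]
C_y = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(n):
    s = rng.choice(pam)                  # non-Gaussian data symbol
    y = (s, s + rng.gauss(0.0, 1.0))     # two correlated non-Gaussian inputs
    z = (y[0] ** 3, y[1] ** 3)           # hypothetical element-wise distortion
    for i in range(2):
        for j in range(2):
            C_zy[i][j] += z[i] * y[j] / n
            C_y[i][j] += y[i] * y[j] / n

det = C_y[0][0] * C_y[1][1] - C_y[0][1] * C_y[1][0]
C_y_inv = [[C_y[1][1] / det, -C_y[0][1] / det],
           [-C_y[1][0] / det, C_y[0][0] / det]]
B = [[sum(C_zy[i][k] * C_y_inv[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

# E{eta y^T} = C_zy - B C_y vanishes by construction of B.
resid = [[C_zy[i][j] - sum(B[i][k] * C_y[k][j] for k in range(2))
          for j in range(2)] for i in range(2)]

for row in B:
    print([f"{v:+.3f}" for v in row])
print("max |E{eta y^T}| entry:", max(abs(v) for row in resid for v in row))
```

The distortion stays uncorrelated with the input by construction, but the lower-left entry of `B` is clearly nonzero although each output depends only on its own input.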
VI Lessons Learned
The Bussgang decomposition establishes that the output of a nonlinear function is a scaled version of the random input signal plus an uncorrelated distortion term. It is an exact and unique representation. The distortion is neither independent of the input nor Gaussian, but it can be treated as such to obtain a lower bound on the communication performance. The decomposition can be extended to MIMO systems, but then the entries of the distortion vector are generally mutually correlated.
References
[1] J. J. Bussgang, “Crosscorrelation functions of amplitude-distorted Gaussian signals,” Research Laboratory of Electronics, Massachusetts Institute of Technology, Tech. Rep. 216, 1952.
[2] J. Minkoff, “The role of AM-to-PM conversion in memoryless nonlinear systems,” IEEE Trans. Commun., vol. 33, no. 2, pp. 139–144, 1985.
[3] E. Björnson, L. Sanguinetti, and J. Hoydis, “Hardware distortion correlation has negligible impact on UL massive MIMO spectral efficiency,” IEEE Trans. Commun., vol. 67, no. 2, pp. 1085–1098, Feb. 2019.
[4] T. Schenk, RF imperfections in high-rate wireless systems: Impact and digital compensation. Springer, 2008.
[5] A. K. Fletcher, S. Rangan, V. K. Goyal, and K. Ramchandran, “Robust predictive quantization: Analysis and design via convex optimization,” IEEE Journal of Selected Topics in Signal Processing, vol. 1, no. 4, pp. 618–632, 2007.
[6] Q. Bai, A. Mezghani, and J. A. Nossek, “On the optimization of ADC resolution in multi-antenna systems,” in IEEE ISWCS, 2013.
[7] B. Hassibi and B. M. Hochwald, “How much training is needed in multiple-antenna wireless links?” IEEE Trans. Inform. Theory, vol. 49, no. 4, pp. 951–963, 2003.
[8] A. Papoulis and S. U. Pillai, Probability, Random Variables, and Stochastic Processes, 4th ed. McGraw-Hill Higher Education, 2002.
[9] W. McGee, “Circularly complex Gaussian noise–a Price theorem and a Mehler expansion,” IEEE Transactions on Information Theory, vol. 15, no. 2, pp. 317–319, 1969.
[10] Ö. T. Demir and E. Björnson, “Channel estimation in massive MIMO under hardware nonlinearities: Bayesian methods versus deep learning,” IEEE Open Journal of the Communications Society, vol. 1, pp. 109–124, 2020.
[11] E. Björnson, J. Hoydis, M. Kountouris, and M. Debbah, “Massive MIMO systems with non-ideal hardware: Energy efficiency, estimation, and capacity limits,” IEEE Trans. Inform. Theory, vol. 60, no. 11, pp. 7112–7139, 2014.
[12] L. Xu, X. Lu, S. Jin, F. Gao, and Y. Zhu, “On the uplink achievable rate of massive MIMO system with low-resolution ADC and RF impairments,” IEEE Communications Letters, vol. 23, no. 3, pp. 502–505, 2019.