# On approximate least squares estimators of parameters of one-dimensional chirp signal

Chirp signals are quite common in many natural and man-made systems like audio signals, sonar, radar etc. Estimation of the unknown parameters of a signal is a fundamental problem in statistical signal processing. Recently, Kundu and Nandi (2008) studied the asymptotic properties of the least squares estimators of the unknown parameters of a simple chirp signal model under the assumption of stationary noise. In this paper, we propose periodogram-type estimators, called the approximate least squares estimators, to estimate the unknown parameters, and study the asymptotic properties of these estimators under the same error assumptions. It is observed that the approximate least squares estimators are strongly consistent and asymptotically equivalent to the least squares estimators. Similar to the periodogram estimators, these estimators can also be used as initial guesses to find the least squares estimators of the unknown parameters. We perform some numerical simulations to assess the performance of the proposed estimators and compare them with the least squares estimators and the estimators proposed by Lahiri et al. (2013). We have analysed two real data sets for illustrative purposes.


## 1 Introduction

In this paper we consider the following multiple component chirp signal model:

$$y(t)=\sum_{k=1}^{p}\Bigl(A^0_k\cos(\alpha^0_k t+\beta^0_k t^2)+B^0_k\sin(\alpha^0_k t+\beta^0_k t^2)\Bigr)+X(t);\quad p\geq 1, \tag{1}$$

for $t=1,\cdots,n$. Here $y(t)$ is the real valued signal observed at time $t$; the $A^0_k$s and $B^0_k$s are real valued amplitudes, the $\alpha^0_k$s and $\beta^0_k$s are the frequencies and the frequency rates, respectively, and $p$ is the number of components of the model. Here, $\{X(t)\}$ is a sequence of error random variables with mean zero and finite fourth moment. The explicit assumption on the error structure is provided in Section 2.

Unlike a sinusoidal signal, a chirp signal has a frequency that changes with time. Such signals occur in many physical phenomena of interest in science and engineering. The chirp model has its roots in radar signal modelling and is used in various forms for modelling trajectories of moving objects. Many estimation procedures have been proposed in the literature for the estimation of the unknown parameters of chirp signals, which is of primary interest. See Bello, Kelly, Abatzoglou, Djuric and Kay, Peleg and Porat, Shamsunder et al., Ikram et al., Besson et al., Saha and Kay, Nandi and Kundu, Kundu and Nandi, and the references cited therein. For recent references, see Lahiri et al. and Mazumder.

Least squares estimators (LSEs) are a reasonable choice for estimating the unknown parameters of a linear or a non-linear model. The theoretical properties of the LSEs for a chirp signal model were first obtained by Nandi and Kundu under the assumption that the additive errors are independently and identically distributed (i.i.d.) random variables with mean zero and finite variance. They proved that if the errors are i.i.d. normal, the asymptotic variances attain the Cramer-Rao lower bound. Since in practice the errors may not be independent, to make the model more realistic, Kundu and Nandi assumed stationarity of the error component to incorporate the dependence structure and studied the properties of the LSEs of the same model. It is observed that the dispersion matrix of the asymptotic distribution of the LSEs turns out to be quite complicated. Using a number theoretic result of Vinogradov, Lahiri et al. provided a simplified structure of this dispersion matrix.

Although the LSEs have nice theoretical properties, finding the least squares estimates is computationally quite demanding. For instance, for the sinusoidal model, it has been observed by Rice and Rosenblatt that the least squares surface has several local minima near the true parameter value (see Fig. 1, page 481), and for this reason most iterative procedures, even when they converge, often converge to a local minimum rather than the global minimum. The same problem is observed for the chirp model. Thus a very good set of initial values is required for any iterative method to work.

One of the most popular estimators for finding the initial values for the frequencies of the sinusoidal model is the periodogram estimator (PE). It is obtained by maximizing the following periodogram function:

$$I(\omega)=\frac{1}{n}\left|\sum_{t=1}^{n}y(t)e^{-i\omega t}\right|^2 \tag{2}$$

at the Fourier frequencies, namely at $\frac{2\pi k}{n}$; $k=0,1,\cdots,n-1$. It has been proved that if the periodogram function is maximised over the entire range $(0,\pi)$, the estimators obtained, called the approximate least squares estimators (ALSEs), are consistent and asymptotically equivalent to the least squares estimators (see Whittle, Walker). In this paper, we study the behaviour of periodogram-type estimators of the unknown parameters of the chirp model and see how they compare with the corresponding least squares estimators theoretically. Analogous to the periodogram function for the sinusoidal model, a periodogram-type function for the chirp model can be defined as follows:
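As a quick illustration (not part of the paper; the function names are ours), the periodogram in (2) and its maximisation over the Fourier frequencies can be sketched in a few lines of numpy:

```python
import numpy as np

def periodogram(y, omega):
    # I(omega) = (1/n) |sum_{t=1}^n y(t) exp(-i omega t)|^2, as in (2)
    n = len(y)
    t = np.arange(1, n + 1)
    return np.abs(np.sum(y * np.exp(-1j * omega * t))) ** 2 / n

def periodogram_estimate(y):
    # Evaluate I at the Fourier frequencies 2*pi*k/n and return the maximiser,
    # which serves as an initial guess for the frequency of a sinusoidal signal.
    n = len(y)
    freqs = 2 * np.pi * np.arange(1, n // 2) / n
    return freqs[np.argmax([periodogram(y, w) for w in freqs])]
```

The returned Fourier frequency is accurate only up to the grid spacing $2\pi/n$; maximising over the whole continuous range instead yields the ALSE.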

$$I(\alpha,\beta)=\frac{2}{n}\left|\sum_{t=1}^{n}y(t)e^{-i(\alpha t+\beta t^2)}\right|^2. \tag{3}$$

Corresponding to the Fourier frequencies at which $I(\omega)$ is maximised for the sinusoidal model, it seems reasonable for the chirp model to maximise $I(\alpha,\beta)$ at an analogous set of grid points, to obtain the initial guesses for the frequency and the frequency rate parameters, respectively.

Consider the periodogram-like function defined in equation (3), which can also be written as:

$$I(\alpha,\beta)=\frac{2}{n}\left\{\left(\sum_{t=1}^{n}y(t)\cos(\alpha t+\beta t^2)\right)^2+\left(\sum_{t=1}^{n}y(t)\sin(\alpha t+\beta t^2)\right)^2\right\}. \tag{4}$$
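The periodogram-like function (3)/(4) is cheap to evaluate directly; a minimal numpy sketch (illustrative only, with our own function name):

```python
import numpy as np

def chirp_periodogram(y, alpha, beta):
    # I(alpha, beta) = (2/n) |sum_t y(t) exp(-i(alpha t + beta t^2))|^2, as in (3)
    n = len(y)
    t = np.arange(1, n + 1)
    s = np.sum(y * np.exp(-1j * (alpha * t + beta * t**2)))
    return (2.0 / n) * np.abs(s) ** 2
```

For a pure chirp with amplitudes $A$ and $B$, $I(\alpha^0,\beta^0)$ is approximately $\frac{n}{2}(A^2+B^2)$, while the value away from $(\alpha^0,\beta^0)$ stays bounded; this sharp peak is what makes the maximiser a sensible estimator.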

The ALSEs of $\alpha^0$ and $\beta^0$ are obtained by maximising $I(\alpha,\beta)$ with respect to $\alpha$ and $\beta$ simultaneously. Our primary focus is to estimate the non-linear parameters $\alpha^0$ and $\beta^0$; once we estimate these parameters efficiently, the linear parameters $A^0$ and $B^0$ can be obtained by the separable linear regression technique of Richards.

In this paper, we prove that the ALSEs are strongly consistent. As a matter of fact, the consistency of the ALSEs of the linear parameters $A^0$ and $B^0$ is obtained under slightly weaker conditions than that of the LSEs, as we do not require their parameter space to be bounded in this case. Also, the rate of convergence of the ALSEs of the linear parameters is $n^{-1/2}$, and those of the frequency and the frequency rate are $n^{-3/2}$ and $n^{-5/2}$, respectively. The convergence rates of the ALSEs are thus the same as those of their corresponding LSEs. We show that the asymptotic distribution of the ALSEs is equivalent to that of the LSEs.

Recently, Lahiri et al. proposed an efficient algorithm to compute the estimators of the unknown parameters of the chirp model. We perform numerical simulations to compare the proposed ALSEs with the LSEs and with the estimators obtained by the efficient algorithm. We observe that in most cases, although the LSEs provide the best results, the time taken by the ALSEs is comparatively less. Among the three estimators, the estimators computed using the efficient algorithm take the least amount of time, though their biases and MSEs increase as compared to the other two estimators.

The rest of the paper is organised as follows. In section 2, we prove the consistency of the ALSEs and their asymptotic equivalence to the LSEs. In section 3, we discuss parameter estimation for the multiple component chirp model. In section 4, we present some simulation results, and in section 5, we analyze some real life data sets for illustrative purposes. Finally, in section 6, we conclude the paper. All the proofs are provided in the appendices.

## 2 Main Results for the One Component Chirp Model

In this section, we study the asymptotic properties of the following one component chirp model:

$$y(t)=A^0\cos(\alpha^0 t+\beta^0 t^2)+B^0\sin(\alpha^0 t+\beta^0 t^2)+X(t). \tag{5}$$

We will use the following notations: $\theta=(A,B,\alpha,\beta)$, $\theta^0=(A^0,B^0,\alpha^0,\beta^0)$, $\hat{\theta}=(\hat{A},\hat{B},\hat{\alpha},\hat{\beta})$, the LSE of $\theta^0$, and $\tilde{\theta}=(\tilde{A},\tilde{B},\tilde{\alpha},\tilde{\beta})$, the ALSE of $\theta^0$. The following assumption is made on the error component of model (5): Assumption 1. Let $\mathbb{Z}$ be the set of integers. $\{X(t),\ t\in\mathbb{Z}\}$ is a stationary linear process of the following form:

$$X(t)=\sum_{j=-\infty}^{\infty}a(j)e(t-j), \tag{6}$$

where $\{e(t),\ t\in\mathbb{Z}\}$ is a sequence of i.i.d. random variables with $E(e(t))=0$ and $V(e(t))=\sigma^2$, and the $a(j)$s are real constants such that

$$\sum_{j=-\infty}^{\infty}|a(j)|<\infty. \tag{7}$$

This is a standard assumption for a stationary linear process. Any finite dimensional stationary MA, AR or ARMA process can be represented as (6) when the coefficients $a(j)$ satisfy condition (7), and hence this covers a large class of stationary random variables.

Let $\tilde{A}$, $\tilde{B}$, $\tilde{\alpha}$ and $\tilde{\beta}$ be the ALSEs of $A^0$, $B^0$, $\alpha^0$ and $\beta^0$, respectively. First we find $\tilde{\alpha}$ and $\tilde{\beta}$ by maximising $I(\alpha,\beta)$, as defined in (4), with respect to $\alpha$ and $\beta$, and once we obtain $\tilde{\alpha}$ and $\tilde{\beta}$, the ALSEs of the linear parameters $A^0$ and $B^0$ can be obtained as follows:

$$\tilde{A}=\frac{2}{n}\sum_{t=1}^{n}y(t)\cos(\tilde{\alpha}t+\tilde{\beta}t^2)\quad\text{and}\quad\tilde{B}=\frac{2}{n}\sum_{t=1}^{n}y(t)\sin(\tilde{\alpha}t+\tilde{\beta}t^2). \tag{8}$$

In the following two theorems, we state the consistency of the ALSE $\tilde{\theta}$.

###### Theorem 1.

Let $(\alpha^0,\beta^0)$ be an interior point of $[0,\pi]\times[0,\pi]$. If $\{X(t)\}$ satisfies Assumption 1, then the ALSEs $\tilde{\alpha}$ and $\tilde{\beta}$ are strongly consistent estimators of $\alpha^0$ and $\beta^0$, respectively.

###### Proof.

See Appendix A. ∎

###### Theorem 2.

Under the conditions of Theorem 1, the ALSEs $\tilde{A}$ and $\tilde{B}$ of the linear parameters $A^0$ and $B^0$ are strongly consistent estimators.

###### Proof.

See Appendix A. ∎

The following theorem shows that the ALSEs asymptotically have the same distribution as the LSEs.

###### Theorem 3.

Under Assumption 1, the limiting distribution of $(\tilde{\theta}-\theta^0)\mathbf{D}^{-1}$ is the same as that of $(\hat{\theta}-\theta^0)\mathbf{D}^{-1}$, where $\tilde{\theta}$ is the ALSE of $\theta^0$, $\hat{\theta}$ is the LSE of $\theta^0$, and $\mathbf{D}=\operatorname{diag}\Bigl(\frac{1}{\sqrt{n}},\frac{1}{\sqrt{n}},\frac{1}{n\sqrt{n}},\frac{1}{n^2\sqrt{n}}\Bigr)$.

###### Proof.

See Appendix B. ∎

## 3 Main Results for the Multiple Component Chirp Model

In this section, we consider a chirp signal model with multiple components. Mathematically, a multiple-component chirp model is given by:

$$y(t)=\sum_{k=1}^{p}\Bigl(A^0_k\cos(\alpha^0_k t+\beta^0_k t^2)+B^0_k\sin(\alpha^0_k t+\beta^0_k t^2)\Bigr)+X(t),\quad p>1,\ t=1,2,\cdots,n, \tag{9}$$

where $y(t)$ is the real valued signal observed at $t=1,2,\cdots,n$; $A^0_k$ and $B^0_k$ are the amplitudes, and $\alpha^0_k$ and $\beta^0_k$ are the frequencies and the frequency rates, respectively, for $k=1,2,\cdots,p$.

To estimate the unknown parameters, we propose a sequential procedure to find the ALSEs. This method reduces the computational complexity of the estimators significantly without compromising their efficiency. The algorithm to find the ALSEs through the sequential method is as follows:

Step 1: Compute $\tilde{\alpha}_1$ and $\tilde{\beta}_1$ by maximizing the periodogram-like function

$$I_1(\alpha,\beta)=\frac{1}{n}\left\{\left(\sum_{t=1}^{n}y(t)\cos(\alpha t+\beta t^2)\right)^2+\left(\sum_{t=1}^{n}y(t)\sin(\alpha t+\beta t^2)\right)^2\right\}. \tag{10}$$

Then the linear parameter estimates can be obtained by substituting $\tilde{\alpha}_1$ and $\tilde{\beta}_1$ in (8). Thus

$$\tilde{A}_1=\frac{2}{n}\sum_{t=1}^{n}y(t)\cos(\tilde{\alpha}_1 t+\tilde{\beta}_1 t^2)\quad\text{and}\quad\tilde{B}_1=\frac{2}{n}\sum_{t=1}^{n}y(t)\sin(\tilde{\alpha}_1 t+\tilde{\beta}_1 t^2).$$

Step 2: Now we have the estimates of the parameters of the first component of the observed signal. We subtract the contribution of the first component from the original signal to remove its effect and obtain new data, say

$$y_1(t)=y(t)-\tilde{A}_1\cos(\tilde{\alpha}_1 t+\tilde{\beta}_1 t^2)-\tilde{B}_1\sin(\tilde{\alpha}_1 t+\tilde{\beta}_1 t^2),\quad t=1,2,\cdots,n.$$

Step 3: Now compute $\tilde{\alpha}_2$ and $\tilde{\beta}_2$ by maximizing $I_2(\alpha,\beta)$, which is obtained by replacing the original data $\{y(t)\}$ by the new data $\{y_1(t)\}$ in (10), and obtain $\tilde{A}_2$ and $\tilde{B}_2$ by substituting $\tilde{\alpha}_2$ and $\tilde{\beta}_2$ in (8).

Step 4: Continue the process up to $p$ steps.

Note that we use the following notation: $\theta_k=(A_k,B_k,\alpha_k,\beta_k)$ denotes the parameter vector and $\theta^0_k=(A^0_k,B^0_k,\alpha^0_k,\beta^0_k)$ the true parameter vector, for all $k=1,2,\cdots,p$, and $\Theta$ denotes the parameter space.

Next to establish the asymptotic properties of these estimators, we further make the following model assumptions:

Assumption 2. $\theta^0_k$ is an interior point of $\Theta$ for each $k$, and the frequencies $\alpha^0_k$ and the frequency rates $\beta^0_k$ are such that $(\alpha^0_j,\beta^0_j)\neq(\alpha^0_k,\beta^0_k)$ for $j\neq k$.

Assumption 3. The amplitudes $A^0_k$s and $B^0_k$s satisfy the following relationship:

$$\infty>{A^0_1}^2+{B^0_1}^2>{A^0_2}^2+{B^0_2}^2>\cdots>{A^0_p}^2+{B^0_p}^2>0.$$

In the following theorems we prove that the ALSEs obtained by the sequential method described above are strongly consistent.

###### Theorem 4.

Under Assumptions 1, 2 and 3, $\tilde{\alpha}_1$ and $\tilde{\beta}_1$ are strongly consistent estimators of $\alpha^0_1$ and $\beta^0_1$, respectively, that is, $(\tilde{\alpha}_1,\tilde{\beta}_1)\xrightarrow{a.s.}(\alpha^0_1,\beta^0_1)$ as $n\to\infty$.

###### Proof.

See Appendix C. ∎

###### Theorem 5.

If Assumptions 1, 2 and 3 are satisfied and $p\geqslant 2$, then $\tilde{\alpha}_2$ and $\tilde{\beta}_2$ are strongly consistent estimators of $\alpha^0_2$ and $\beta^0_2$, respectively, that is, $(\tilde{\alpha}_2,\tilde{\beta}_2)\xrightarrow{a.s.}(\alpha^0_2,\beta^0_2)$ as $n\to\infty$.

###### Proof.

See Appendix C. ∎

The result obtained in the above theorem can be extended up to the $p$-th step. Thus for any $k\leqslant p$, the ALSEs obtained at the $k$-th step are strongly consistent.

###### Theorem 6.

If Assumptions 1, 2 and 3 are satisfied, and if $\tilde{A}_k$, $\tilde{B}_k$, $\tilde{\alpha}_k$ and $\tilde{\beta}_k$ are the estimators obtained at the $k$-th step, with $k\leqslant p$, then $\tilde{A}_k-A^0_k\xrightarrow{a.s.}0$ and $\tilde{B}_k-B^0_k\xrightarrow{a.s.}0$ as $n\to\infty$.

###### Proof.

See Appendix C. ∎

Lahiri et al. proved that the ordinary LSEs of the unknown parameters of the $p$ component chirp model have the following asymptotic distribution:

$$\Bigl((\hat{\theta}_1-\theta^0_1)\mathbf{D}^{-1},\cdots,(\hat{\theta}_p-\theta^0_p)\mathbf{D}^{-1}\Bigr)\xrightarrow{d}\mathcal{N}_{4p}\bigl(0,\,2c\sigma^2\Sigma(\theta^0)\bigr).
$$

$$\Sigma(\theta^0)=\begin{bmatrix}\Sigma_1&0&\cdots&0\\0&\Sigma_2&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&\Sigma_p\end{bmatrix},$$

where each $\Sigma_k$, $k=1,\ldots,p$, is a $4\times 4$ matrix.

Also, note that

$$\Sigma^{-1}_k=\begin{bmatrix}1&0&\dfrac{B^0_k}{2}&\dfrac{B^0_k}{3}\\0&1&-\dfrac{A^0_k}{2}&-\dfrac{A^0_k}{3}\\\dfrac{B^0_k}{2}&-\dfrac{A^0_k}{2}&\dfrac{{A^0_k}^2+{B^0_k}^2}{3}&\dfrac{{A^0_k}^2+{B^0_k}^2}{4}\\\dfrac{B^0_k}{3}&-\dfrac{A^0_k}{3}&\dfrac{{A^0_k}^2+{B^0_k}^2}{4}&\dfrac{{A^0_k}^2+{B^0_k}^2}{5}\end{bmatrix}. \tag{12}$$

We have the following result regarding the asymptotic distribution of the ALSEs.

###### Theorem 7.

Under Assumptions 1, 2, and 3, the asymptotic distribution of $(\tilde{\theta}_k-\theta^0_k)\mathbf{D}^{-1}$ is equivalent to the asymptotic distribution of $(\hat{\theta}_k-\theta^0_k)\mathbf{D}^{-1}$ for all $k=1,\cdots,p$, where $\tilde{\theta}_k$ is the ALSE and $\hat{\theta}_k$ is the LSE of the unknown parameter vector associated with the $k$-th component of the $p$ component model.

###### Proof.

See Appendix D. ∎

## 4 Numerical Experiments

In this section, we present simulation studies for one component and two component chirp models. We first consider the following one component chirp model:

$$y(t)=A^0\cos(\alpha^0 t+\beta^0 t^2)+B^0\sin(\alpha^0 t+\beta^0 t^2)+X(t),$$

with fixed true parameter values, and with $\{X(t)\}$ an MA(1) process, that is, $X(t)=e(t)+\rho e(t-1)$, where the $e(t)$s are i.i.d. normal random variables with mean zero and variance $\sigma^2$. For the simulations we consider different values of $\sigma^2$: 0.1, 0.5 and 1, and several sample sizes $n$; for each $n$ we replicate the process, that is, generate the data and obtain the estimates, 1000 times. We estimate the parameters by the least squares estimation method, by the approximate least squares estimation method, and using the efficient algorithm proposed by Lahiri et al.
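For concreteness, data from this design can be generated as below; the particular default values of $\rho$ and $\sigma^2$ are placeholders of ours, since the design only requires an MA(1) error with i.i.d. normal innovations:

```python
import numpy as np

def simulate_chirp(n, A, B, alpha, beta, rho=0.5, sigma2=0.1, rng=None):
    """Generate y(t) = A cos(alpha t + beta t^2) + B sin(alpha t + beta t^2) + X(t),
    with MA(1) errors X(t) = e(t) + rho * e(t-1), e(t) ~ N(0, sigma2)."""
    rng = np.random.default_rng(rng)
    e = rng.normal(0.0, np.sqrt(sigma2), n + 1)  # one extra innovation for the lag
    x = e[1:] + rho * e[:-1]
    t = np.arange(1, n + 1)
    signal = A * np.cos(alpha * t + beta * t**2) + B * np.sin(alpha * t + beta * t**2)
    return signal + x
```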

For the LSEs, we first minimize the error sum of squares function with respect to $\alpha$ and $\beta$ using the Nelder-Mead method of optimization (using the optim function in the R Stats Package). For the initial values, it is intuitive to minimize the function over a grid of the parameter space, analogous to what is suggested by Rice and Rosenblatt for the sinusoidal model. For the ALSEs, we maximize the periodogram-like function $I(\alpha,\beta)$, as defined in (4), again using the Nelder-Mead method, with starting values obtained by maximizing $I(\alpha,\beta)$ over the same grid points as used for the corresponding LSEs.

Tables 1, 2 and 3 provide the results, averaged over 1000 simulation runs, for the one component model. In these tables, we observe that the ALSEs have very small absolute bias. The MSEs of the LSEs are very close to their asymptotic variances, and the MSEs of the ALSEs also get very close to those of the LSEs as $n$ increases, and hence to the theoretical asymptotic variances of the LSEs, showing that the two estimators are asymptotically equivalent. Also, when we increase the sample size, the MSEs of both estimators decrease, showing that they are consistent. We observe that the estimators obtained by the efficient algorithm are close to the true values, but their biases and MSEs are not as small as those of the other two estimators. However, the time taken to compute the estimates by the efficient algorithm is much less than the time taken by the ALSEs and the LSEs.

We also perform simulations for the following two component model using the proposed sequential estimators:

$$y(t)=A^0_1\cos(\alpha^0_1 t+\beta^0_1 t^2)+B^0_1\sin(\alpha^0_1 t+\beta^0_1 t^2)+A^0_2\cos(\alpha^0_2 t+\beta^0_2 t^2)+B^0_2\sin(\alpha^0_2 t+\beta^0_2 t^2)+X(t).$$

For the simulations, we take the true values of the two components' parameters as 2, 1.75, 1.5 and 0.1, and 3, 2.25, 2.5 and 0.2, respectively, and compute both the LSEs and the ALSEs of all the unknown parameters sequentially. The error structure is the same as that for the one component simulation study.

This process of data generation and estimation of the unknown parameters is replicated 1000 times, and we calculate the average values, biases and MSEs of these estimates. We also report the time taken for the entire simulation process by each of the estimation methods, and we compute the asymptotic variances of the estimates to compare the MSEs with them. Simulation results for the two component model, provided in Tables 4, 5 and 6, show that the MSEs of the proposed sequential estimators match the MSEs of the LSEs well, and they become closer as $n$ increases. Also, they are comparable to the asymptotic variances of the LSEs. In many cases, it is observed that the MSEs of the ALSEs of the first component are smaller than those of the corresponding LSEs. In all the tables, it is consistently observed that the computation of the ALSEs takes less time than that of the LSEs.

It is observed that the estimates of the unknown non-linear parameters of the second component of the two component model have very small bias as compared to those obtained at the first stage, or those obtained for the one component model. Since the proposed ALSEs have desirable properties, it is a good idea to obtain the initial estimates by maximising the periodogram-like function defined in (4) and then carry out the least squares estimation.

## 5 Real Data Analysis

For illustration, we perform analysis of two speech signal data sets "AHH" and "AAA". These data have been obtained from a sound instrument at the Speech Signal Processing laboratory of the Indian Institute of Technology Kanpur. We have 469 data points in the "AHH" signal data set and 477 data points in the "AAA" signal data set, both sampled at 10 kHz frequency. Figure 1 gives the plot of the observed signal "AHH" and Figure 2 gives the plot of the observed signal "AAA".

We try to fit a multiple component chirp model to both the data sets, using the proposed sequential estimation procedure which computes ALSEs at each stage. At the same time, we compute the sequential LSEs as proposed by Lahiri et al. for comparison purposes. To find the initial values of the frequency and the frequency rate at each stage, we maximize the periodogram-like function over a fine grid of the frequency and frequency rate space. For the estimation of the number of components, we use the following form of BIC:

$$\text{BIC}(k)=n\,\ln(\text{SSE}(k))+2\,(4k+1)\,\ln(n).$$
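The criterion is a one-line computation once the residual sum of squares of each fitted sequential model is available; a sketch (function name ours):

```python
import numpy as np

def bic(sse, k, n):
    # BIC(k) = n ln(SSE(k)) + 2 (4k + 1) ln(n); each chirp component
    # contributes 4 parameters (A_k, B_k, alpha_k, beta_k)
    return n * np.log(sse) + 2 * (4 * k + 1) * np.log(n)
```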

The model order is estimated as the value of $k$ for which the BIC attains its minimum. For the "AHH" data, when we estimate the parameters using the sequential least squares estimation procedure, it is evident from Figure 3 that the number of components that fits this data is 8. Fitting the model using the proposed sequential ALSEs gives the same estimated number of components, as can be seen in Figure 4. For the "AAA" data, the number of components estimated using the sequential least squares procedure is 9, as can be seen from Figure 5; the proposed sequential ALSEs again give the same estimated number of components, as can be seen in Figure 6.

Figure 3: BIC plot: "AHH" data set when estimates are obtained by the sequential LSE procedure.
Figure 4: BIC plot: "AHH" data set when estimates are obtained by the sequential ALSE procedure.
Figure 5: BIC plot: "AAA" data set when estimates are obtained by the sequential LSE procedure.
Figure 6: BIC plot: "AAA" data set when estimates are obtained by the sequential ALSE procedure.

Figures 7 and 8 give the observed as well as the fitted signal for the "AHH" data, estimated using the sequential LSEs and the sequential ALSEs, respectively. We observe from these plots that both fits look similar; hence we may conclude from here as well that the ALSEs are equivalent to the LSEs. Figures 9 and 10 give the observed as well as the fitted signal for the "AAA" data, estimated using the sequential LSEs and the sequential ALSEs, respectively.

Figure 7: Observed "AHH" signal and signal fitted using sequential LSEs.
Figure 8: Observed "AHH" signal and signal fitted using sequential ALSEs.
Figure 9: Observed "AAA" signal and signal fitted using sequential LSEs.
Figure 10: Observed "AAA" signal and signal fitted using sequential ALSEs.

We analyze the residuals by performing an augmented Dickey-Fuller (ADF) test and a Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test to check for their stationarity. This is done using the in-built R functions "adf.test" and "kpss.test" in the "tseries" package in R. The ADF test tests the null hypothesis that a unit root is present in the time series against the alternative of no unit root, that is, stationarity; the KPSS test tests the null hypothesis that an observable time series is stationary around a deterministic trend against the alternative of a unit root. For both the "AHH" and the "AAA" data sets, we reject the null hypothesis in the ADF test and do not reject the null hypothesis in the KPSS test, and thereby, from the results of both tests, we conclude that the residuals are stationary. Figures 11-14 provide the residual plots for the two data sets under the two sequential procedures.

Figure 11: Residual plot of the "AHH" data when the estimation is done using LSEs.
Figure 12: Residual plot of the "AHH" data when the estimation is done using ALSEs.
Figure 13: Residual plot of the "AAA" data when the estimation is done using LSEs.
Figure 14: Residual plot of the "AAA" data when the estimation is done using ALSEs.

## 6 Conclusion

In this paper, we proposed periodogram-type estimators, called the approximate least squares estimators (ALSEs), for the parameters of a one-dimensional one component chirp model and studied their asymptotic properties. We showed that they are consistent and asymptotically equivalent to the LSEs. Moreover, we obtained the consistency of the ALSEs under weaker conditions than those required for the LSEs. For the multiple component chirp model, we proposed a sequential procedure based on computing ALSEs, and established that the estimators obtained at each step of the sequential procedure are strongly consistent and asymptotically equivalent to the corresponding sequential LSEs, with the same rates of convergence. Simulation studies presented in the paper also confirm this large sample equivalence. Hence one may use the periodogram-like estimators as the initial values to find the LSEs. We also analysed two speech signal data sets for illustrative purposes, and the performance is quite satisfactory.

## Acknowledgement

The authors would like to thank the two anonymous reviewers and the associate editor for their constructive comments, which have helped to improve the manuscript significantly.

## Appendix A

The following lemmas are required to prove Theorem 1.

###### Lemma 1.

If $\{X(t)\}$ satisfies Assumption 1, then:

(a) $\displaystyle\sup_{\alpha,\beta}\,\biggl|\frac{1}{n}\sum_{t=1}^{n}X(t)\cos(\alpha t+\beta t^2)\biggr|\xrightarrow{a.s.}0$,

(b) $\displaystyle\sup_{\alpha,\beta}\,\biggl|\frac{1}{n}\sum_{t=1}^{n}X(t)\sin(\alpha t+\beta t^2)\biggr|\xrightarrow{a.s.}0$.

###### Proof.

Refer to Kundu and Nandi.

###### Lemma 2.

If $(\alpha,\beta)\in(0,\pi)\times(0,\pi)$, then except for a countable number of points, the following results are true:

(a) $\displaystyle\lim_{n\to\infty}\frac{1}{n^{k+1}}\sum_{t=1}^{n}t^k\cos(\alpha t+\beta t^2)=0$,

(b) $\displaystyle\lim_{n\to\infty}\frac{1}{n^{k+1}}\sum_{t=1}^{n}t^k\sin(\alpha t+\beta t^2)=0$,

(c) $\displaystyle\lim_{n\to\infty}\frac{1}{n^{k+1}}\sum_{t=1}^{n}t^k\cos^2(\alpha t+\beta t^2)=\frac{1}{2(k+1)}$,

(d) $\displaystyle\lim_{n\to\infty}\frac{1}{n^{k+1}}\sum_{t=1}^{n}t^k\sin^2(\alpha t+\beta t^2)=\frac{1}{2(k+1)}$,

(e) $\displaystyle\lim_{n\to\infty}\frac{1}{n^{k+1}}\sum_{t=1}^{n}t^k\cos(\alpha t+\beta t^2)\sin(\alpha t+\beta t^2)=0$,

for all $k=0,1,2,\ldots$

###### Proof.

Refer to Lahiri et al.

###### Lemma 3.

Suppose $\tilde{\alpha}$ and $\tilde{\beta}$ are the ALSEs of $\alpha^0$ and $\beta^0$, respectively. Let us denote $\xi=(\alpha,\beta)$, $\xi^0=(\alpha^0,\beta^0)$ and

$$S_c=\{\xi:\xi=(\alpha,\beta),\ |\xi-\xi^0|>c\}.$$

If there exists a $c>0$ such that

$$\limsup_{n\to\infty}\ \sup_{S_c}\ \frac{1}{n}\bigl[I(\xi)-I(\xi^0)\bigr]<0\quad a.s., \tag{13}$$

then $\tilde{\xi}=(\tilde{\alpha},\tilde{\beta})\xrightarrow{a.s.}\xi^0$. Here $I(\xi)$ is as defined in (4).

###### Proof.

Let us denote $\tilde{\xi}$ by $\tilde{\xi}_n$ and $I$ by $I_n$ to emphasize that they depend on $n$. Suppose (13) is true and $\tilde{\xi}_n$ does not converge to $\xi^0$ as $n\to\infty$. Then there exists a $c>0$ such that

$$P\Bigl(\limsup_{n\to\infty}|\tilde{\xi}_n-\xi^0|>c\Bigr)>0.$$

Hence, there exist a $c>0$ and a subsequence $\{n_k\}$ of $\{n\}$ such that $|\tilde{\xi}_{n_k}-\xi^0|>c$ for all $k=1,2,\ldots$, that is, $\tilde{\xi}_{n_k}\in S_c$ for all $k=1,2,\ldots$. Since $\tilde{\xi}_{n_k}$ is the ALSE of $\xi^0$ when $n=n_k$, it maximises $I_{n_k}(\xi)$,

$$\Rightarrow\ I_{n_k}(\tilde{\xi}_{n_k})\geqslant I_{n_k}(\xi^0)\ \Rightarrow\ \frac{1}{n_k}\bigl[I_{n_k}(\tilde{\xi}_{n_k})-I_{n_k}(\xi^0)\bigr]\geqslant 0.$$

$$\text{Hence,}\quad \limsup_{k\to\infty}\ \sup_{S_c}\ \frac{1}{n_k}\bigl[I_{n_k}(\alpha,\beta)-I_{n_k}(\alpha^0,\beta^0)\bigr]\geqslant 0,$$

which contradicts (13). Hence, the result follows. ∎

###### Lemma 4.

Suppose $\tilde{\alpha}$ and $\tilde{\beta}$ are the ALSEs of $\alpha^0$ and $\beta^0$, respectively. Let us define $\tilde{\xi}=(\tilde{\alpha},\tilde{\beta})$, $\xi^0=(\alpha^0,\beta^0)$ and $\mathbf{D}_1=\operatorname{diag}\Bigl(\frac{1}{n\sqrt{n}},\frac{1}{n^2\sqrt{n}}\Bigr)$; then

$$(\tilde{\xi}-\xi^0)(\sqrt{n}\mathbf{D}_1)^{-1}\xrightarrow{a.s.}0.$$
###### Proof.

Let us denote by $I'(\xi)$ the $1\times 2$ first derivative vector, that is, $I'(\xi)=\Bigl(\frac{\partial I(\alpha,\beta)}{\partial\alpha}\ \ \frac{\partial I(\alpha,\beta)}{\partial\beta}\Bigr)$, and by $I''(\xi)$ the $2\times 2$ second derivative matrix of $I(\xi)$, that is,

$$I''(\xi)=\begin{pmatrix}\dfrac{\partial^2 I(\alpha,\beta)}{\partial\alpha^2}&\dfrac{\partial^2 I(\alpha,\beta)}{\partial\alpha\partial\beta}\\\dfrac{\partial^2 I(\alpha,\beta)}{\partial\alpha\partial\beta}&\dfrac{\partial^2 I(\alpha,\beta)}{\partial\beta^2}\end{pmatrix}.$$

Using multivariate Taylor series expansion of $I'(\tilde{\xi})$ around $\xi^0$, we get:

$$I'(\tilde{\xi})-I'(\xi^0)=(\tilde{\xi}-\xi^0)I''(\bar{\xi}), \tag{14}$$

where $\bar{\xi}$ is a point on the line segment joining $\tilde{\xi}$ and $\xi^0$. Since $I'(\tilde{\xi})=0$, (14) can be re-written as the following:

$$(\tilde{\xi}-\xi^0)(\sqrt{n}\mathbf{D}_1)^{-1}=-\Bigl[\frac{1}{\sqrt{n}}I'(\xi^0)\mathbf{D}_1\Bigr]\bigl[\mathbf{D}_1 I''(\bar{\xi})\mathbf{D}_1\bigr]^{-1}.$$

Let us first consider

$$\frac{1}{\sqrt{n}}I'(\xi^0)\mathbf{D}_1=\Bigl(\frac{\partial I(\alpha^0,\beta^0)}{\partial\alpha}\ \ \frac{\partial I(\alpha^0,\beta^0)}{\partial\beta}\Bigr)\frac{1}{\sqrt{n}}\begin{pmatrix}\frac{1}{n\sqrt{n}}&0\\0&\frac{1}{n^2\sqrt{n}}\end{pmatrix}=\Bigl(\frac{1}{n^2}\frac{\partial I(\alpha^0,\beta^0)}{\partial\alpha}\ \ \frac{1}{n^3}\frac{\partial I(\alpha^0,\beta^0)}{\partial\beta}\Bigr).$$

Using Lemmas 1 and 2, it can be shown that:

$$\frac{1}{n^2}\frac{\partial I(\alpha^0,\beta^0)}{\partial\alpha}\xrightarrow{a.s.}0\quad\text{and}\quad\frac{1}{n^3}\frac{\partial I(\alpha^0,\beta^0)}{\partial\beta}\xrightarrow{a.s.}0.$$

Thus we have $\frac{1}{\sqrt{n}}I'(\xi^0)\mathbf{D}_1\xrightarrow{a.s.}0$. Now, consider the $2\times 2$ matrix $\mathbf{D}_1 I''(\bar{\xi})\mathbf{D}_1$. Since $\bar{\xi}\xrightarrow{a.s.}\xi^0$ and $I''(\xi)$ is a continuous function of $\xi$,

$$\lim_{n\to\infty}\bigl[\mathbf{D}_1 I''(\bar{\xi})\mathbf{D}_1\bigr]=\lim_{n\to\infty}\bigl[\mathbf{D}_1 I''(\xi^0)\mathbf{D}_1\bigr].$$

$$\mathbf{D}_1 I''(\xi^0)\mathbf{D}_1=\begin{pmatrix}\frac{1}{n\sqrt{n}}&0\\0&\frac{1}{n^2\sqrt{n}}\end{pmatrix}\begin{pmatrix}\frac{\partial^2 I(\alpha^0,\beta^0)}{\partial\alpha^2}&\frac{\partial^2 I(\alpha^0,\beta^0)}{\partial\alpha\partial\beta}\\\frac{\partial^2 I(\alpha^0,\beta^0)}{\partial\beta\partial\alpha}&\frac{\partial^2 I(\alpha^0,\beta^0)}{\partial\beta^2}\end{pmatrix}\begin{pmatrix}\frac{1}{n\sqrt{n}}&0\\0&\frac{1}{n^2\sqrt{n}}\end{pmatrix}=\begin{pmatrix}\frac{1}{n^3}\frac{\partial^2 I(\alpha^0,\beta^0)}{\partial\alpha^2}&\frac{1}{n^4}\frac{\partial^2 I(\alpha^0,\beta^0)}{\partial\alpha\partial\beta}\\\frac{1}{n^4}\frac{\partial^2 I(\alpha^0,\beta^0)}{\partial\beta\partial\alpha}&\frac{1}{n^5}\frac{\partial^2 I(\alpha^0,\beta^0)}{\partial\beta^2}\end{pmatrix}.$$