On Distributed Nonlinear Signal Analytics: Bandwidth and Approximation Error Tradeoffs

05/22/2018 ∙ by Vijay Anavangot, et al. ∙ IIT Bombay

Analytics will be a part of the upcoming smart city and Internet of Things (IoT). The focus of this work is approximate distributed signal analytics. It is envisaged that distributed IoT devices will record signals, which may be of interest to the IoT cloud. Communication of these signals from IoT devices to the IoT cloud will require (lowpass) approximations. Linear signal approximations are well known in the literature. It will be outlined that in many IoT analytics problems, it is desirable that the approximated signals (or their analytics) should always over-predict the exact signals (or their analytics). This distributed nonlinear approximation problem has not been studied before. An algorithm to perform distributed over-predictive signal analytics in the IoT cloud, based on signal approximations by IoT devices, is proposed. The fundamental tradeoff between the signal approximation bandwidth used by IoT devices and the approximation error in signal analytics at the IoT cloud is quantified for the class of differentiable signals. Simulation results are also presented.

I Introduction

Signal representation using a finite number of coefficients is well known and is variously termed source coding [1], sampling [2], or function approximation [3]. Analytics will be a part of the upcoming smart city and Internet of Things (IoT). It is envisaged that distributed signals may be recorded at the IoT devices [4]. At the IoT cloud, analytics derived from the signals recorded by many IoT devices are of interest.

Due to bandwidth constraints, each IoT device should send an approximation of its signal to the IoT cloud for the intended signal analytics. A parsimonious signal approximation at each IoT device that minimizes the approximation error at the IoT cloud is desired. Characterizing the fundamental tradeoff between the approximation bandwidth used by IoT devices and the approximation error in signal analytics at the IoT cloud is therefore of interest.

Linear or mean-squared signal approximations are well known in the literature. An obvious method to approximate in a distributed manner is to compute the signals’ linear approximations using existing algorithms and communicate them [3, 5, 6, 7]. However, this method will not be applicable in certain nonlinear problems. The following envisaged smart city applications motivate distributed nonlinear signal analytics.

Fig. 1: A schematic of nonlinear signal analytics in IoT is illustrated. IoT devices communicate their approximate signals to the IoT cloud in a distributed manner.

Smart meters: Consider the setup shown in Fig. 1. A smart city planner installs smart meters (IoT devices) in each home. Each smart meter records the (instantaneous) power consumption signal and has to report it to the IoT cloud. The planner has to calculate the smallest supply capacity sufficient to meet the energy demand at all times. If $s_1(t), \ldots, s_N(t)$ are the power consumption signals at the various devices, then the planner is interested in $\max_t \sum_{i=1}^{N} s_i(t)$. However, due to bandwidth constraints, the IoT devices can only send approximations $\hat{s}_1(t), \ldots, \hat{s}_N(t)$. The IoT cloud will compute $\max_t \sum_{i=1}^{N} \hat{s}_i(t)$. For sufficient supply capacity calculation, it is required that

$$\max_t \sum_{i=1}^{N} \hat{s}_i(t) \;\geq\; \max_t \sum_{i=1}^{N} s_i(t).$$

In the above equation, the difference between the two quantities is the approximation error. Using an (over-predictive) envelope approximation, i.e., $\hat{s}_i(t) \geq s_i(t)$ for all $t$, is one approach to tackle this problem. In this case, the IoT cloud reconstructs the sum signal $\sum_{i} \hat{s}_i(t)$ and then reports its maximum to the planner (see Fig. 1). The desired signal analytic and the associated approximation are both nonlinear.
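The over-prediction requirement can be illustrated with a small numerical sketch. The two load signals and their envelopes below are hypothetical and chosen only for illustration; the point is that if every device reports an approximation that stays at or above its own signal, the peak of the reported sum cannot fall below the peak of the true sum.

```python
import math

GRID = [j / 1000 for j in range(1000)]  # sample times in [0, 1)

# Two hypothetical household load signals (arbitrary units).
loads = [
    lambda t: 1.0 + 0.5 * math.sin(2 * math.pi * t),
    lambda t: 0.8 + 0.3 * math.cos(4 * math.pi * t),
]

# Hypothetical over-predictive approximations: each one stays at or
# above its own load signal at every time instant.
envelopes = [
    lambda t: 1.1 + 0.5 * math.sin(2 * math.pi * t),
    lambda t: 1.2,
]

def peak_of_sum(signals):
    """Maximum over the grid of the sum of the given signals."""
    return max(sum(s(t) for s in signals) for t in GRID)

true_peak = peak_of_sum(loads)          # what the planner wants
reported_peak = peak_of_sum(envelopes)  # what the cloud can compute

print(f"true peak     = {true_peak:.3f}")
print(f"reported peak = {reported_peak:.3f}")
print("capacity sufficient:", reported_peak >= true_peak)
```

The gap between the two printed peaks plays the role of the approximation error; tighter envelopes shrink it at the cost of more coefficients per device.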

Pollution control: Consider a smart city regulator, which has to report pollutant concentration (such as PM 2.5 levels). IoT devices with PM 2.5 sensors can be deployed in the smart city to record the local pollutant concentration signals. To save bandwidth, each IoT device has to communicate its approximated pollution signal to the IoT cloud. The regulator can use air diffusion models (see [8]) to characterize the spatio-temporal evolution of the pollutant for the entire city. The regulator may want to provide a (pessimistic) pollution signal approximation, which requires over-predictive nonlinear signal analytics.

Renewable energy in a smart grid: Consider a smart city where solar panels generate electricity that is fed back to the power grid in the region. Let the renewable power generated by each panel be recorded as a function of time $t$. These signals will be communicated approximately to the IoT cloud. The electricity planner will be interested in

to estimate an (under-predictive) envelope of the renewable power. This is also a nonlinear signal analytics problem.

With a focus towards over-predictive nonlinear signal analytics in an IoT, the following main results will be presented:

  1. An algorithm using envelope approximations [9] is presented for over-predictive signal analytics with IoT devices and cloud. Its simulation results are presented.

  2. For the above algorithm, with Fourier basis, fundamental tradeoffs between approximation error and bandwidth will be analyzed for differentiable signals.

Prior Art: Linear signal analytics is well understood: distributed signals can be projected (approximated) in a linear basis and approximately communicated. However, this method will not apply to the above-mentioned applications. As far as we know, nonlinear signal analytics has not been studied. On the other hand, analytics of distributed scalars is well reported. Computation and reporting of symmetric functions of the scalar parameters (such as weighted average) has been studied [5, 6]. Gossip based algorithms compute scalar functions in large networks, using linear fusion methods [7, 10]. Recently, there have been efforts towards big-data analytics in smart city/IoT [4, 11, 12]. To the best of our knowledge, these works do not address signal analytics where continuous-time signals are involved.

The paper is organized as follows. Section II explains the system background and problem formulation. An order optimal approximation algorithm for the over-predictive signal analytics is proposed in Section III. The bandwidth and error tradeoff for this scheme is analyzed in Section IV. Simulation setup and numerical results using electricity load data sets are presented in Section V.

II Background and problem setup

IoT signals, their assumed smoothness properties, tradeoffs of interest, and problem formulation are discussed in this section.

II-A Signal model and IoT description

Consider an IoT with $N$ distributed IoT devices and an IoT cloud. We assume a cloud-based IoT architecture with computing capability at the individual IoT devices [12]. The IoT cloud acts as a central server with the capacity to handle massive data arriving from a number of IoT devices. The IoT devices, indexed by $i = 1, \ldots, N$, record signals $s_i(t)$ at various locations. Due to bandwidth constraints, the IoT devices individually communicate approximations $\hat{s}_i(t)$ to the IoT cloud. Signal analytics are derived in the IoT cloud based on $\hat{s}_1(t), \ldots, \hat{s}_N(t)$. Without loss of generality, it is assumed that the $s_i(t)$ are recorded over the finite observation interval $[0, 1]$. The signals will be assumed to be $p$-times differentiable in $[0, 1]$ for some integer $p \geq 1$.

II-B Fourier representation of IoT signals

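Due to space constraints, this first exposition will use the Fourier series basis for $s_i(t)$ on $[0, 1]$, with the constraint $s_i(0) = s_i(1)$. (Footnote 1: This end-point symmetry constraint prevents the Gibbs phenomenon during reconstruction [13] of the signals. It can be avoided by considering other bases (e.g., polynomials), but this is omitted due to space constraints [3].) The Fourier series of $s_i(t)$ are given by

$$s_i(t) = \sum_{k \in \mathbb{Z}} a_i[k] \, e^{j 2 \pi k t}, \qquad a_i[k] = \int_0^1 s_i(t) \, e^{-j 2 \pi k t} \, dt.$$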

The $p$-times differentiability of $s_i(t)$ implies that their Fourier coefficients decay polynomially in $k$:

Fact II.1 (Sec. 2.3, [14])

A signal $s(t)$, with $s(0) = s(1)$, is $p$-times differentiable if its Fourier coefficients $a[k]$ obey

II-C Approximation analysis and its definitions

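For $i = 1, \ldots, N$, bandlimited approximations of $s_i(t)$ with Fourier coefficients $b_i[k]$ are defined as

$$\hat{s}_i(t) = \sum_{|k| \leq L} b_i[k] \, e^{j 2 \pi k t} \qquad (1)$$

where the $b_i[k]$ will be functions of the $a_i[k]$, chosen later according to a specified criterion. Here, $L$ is the bandwidth parameter of the approximate signals $\hat{s}_i(t)$.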
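To see why the choice of coefficients matters, consider the most obvious bandlimited approximation: keeping the signal's own low-order Fourier coefficients (a linear, least-squares projection). The sketch below uses a hypothetical smooth test signal, a unit observation interval, and Riemann-sum coefficient estimates, all assumptions made for illustration. It shows that such a projection can fall below the signal at its peak, so it is not over-predictive.

```python
import cmath
import math

M = 256  # number of uniform samples on [0, 1)
GRID = [j / M for j in range(M)]

def fourier_coeffs(samples, L):
    """Riemann-sum estimate of the Fourier coefficients a[k], |k| <= L."""
    m = len(samples)
    return {
        k: sum(x * cmath.exp(-2j * math.pi * k * j / m)
               for j, x in enumerate(samples)) / m
        for k in range(-L, L + 1)
    }

def synthesize(coeffs, t):
    """Evaluate the truncated Fourier series at time t (real part)."""
    return sum(c * cmath.exp(2j * math.pi * k * t)
               for k, c in coeffs.items()).real

# Hypothetical smooth, 1-periodic signal with infinitely many harmonics.
signal = [math.exp(math.cos(2 * math.pi * t)) for t in GRID]

L = 1  # bandwidth parameter: only coefficients with |k| <= L are kept
coeffs = fourier_coeffs(signal, L)
projection = [synthesize(coeffs, t) for t in GRID]

print(f"peak of signal     = {max(signal):.4f}")
print(f"peak of projection = {max(projection):.4f}")
print("projection under-predicts the peak:", max(projection) < max(signal))
```

An envelope approximation uses the same number of coefficients but chooses them so that the reconstruction never drops below the signal.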

Let the sum signal analytic of interest be . From , for example, or can be obtained. The corresponding approximation obtained by the IoT cloud is . Let be the approximation error according to some distance measure. For a given bandwidth , the overpredictive distributed signal analytics problem (see Section I) is:

(2)

For a signal $s(t)$, an approximation $\hat{s}(t)$ such that $\hat{s}(t) \geq s(t)$ for all $t$ is called an envelope approximation [9].

II-D Distance measures of interest

For overpredictive approximation, we will construct an envelope of the signal recorded by each IoT device [9]. While making an envelope approximation $\hat{s}(t)$ of a signal $s(t)$, a distance function is needed to capture the proximity of $\hat{s}(t)$ to $s(t)$ [9]. In this work, the $L^1$, $L^2$, and $L^\infty$ distance measures will be used. Using the Fourier representations of $\hat{s}(t)$ (akin to (1)) and $s(t)$, and the envelope property, it can be observed that

(3)
(4)
(5)

In the above equation, the $L^\infty$ distance is upper-bounded using an $\ell^1$ error between the two coefficient sequences in the sequence space. The approximation error in signal analytics corresponding to the distance measures will be denoted by , respectively.
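As a sketch of how such coefficient-domain expressions can arise, assume the three distances are the $L^1$, $L^2$, and $L^\infty$ norms on $[0,1]$, and write $s(t) = \sum_k a[k] e^{j2\pi kt}$ and $\hat{s}(t) = \sum_{|k| \le L} b[k] e^{j2\pi kt}$ with $b[k] = 0$ for $|k| > L$. The symbols $a$, $b$, $L$ and the unit interval are notation assumed here for illustration. Because $\hat{s}(t) - s(t) \ge 0$, the absolute value in the $L^1$ norm can be dropped, and only the $k = 0$ terms survive the integral:

```latex
\begin{align*}
\|\hat{s}-s\|_{1} &= \int_0^1 \big(\hat{s}(t)-s(t)\big)\,dt \;=\; b[0]-a[0], \\
\|\hat{s}-s\|_{2}^{2} &= \sum_{k\in\mathbb{Z}} \big|b[k]-a[k]\big|^{2}
  \qquad \text{(Parseval)}, \\
\|\hat{s}-s\|_{\infty} &= \sup_{t}\Big|\sum_{k\in\mathbb{Z}} \big(b[k]-a[k]\big)e^{j2\pi kt}\Big|
  \;\le\; \sum_{k\in\mathbb{Z}} \big|b[k]-a[k]\big| .
\end{align*}
```

The first two lines are equalities, while the last is only an upper bound, which is why an optimization based on it is a relaxation of the true $L^\infty$ cost.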

II-E Bandwidth and approximation error tradeoff

The fundamental tradeoffs between the bandwidth parameter and the approximation error for distance metrics will be presented, with the constraint that .

III Overpredictive signal analytics

Recall the smart city applications described in Fig. 1, with IoT devices reporting the approximate signals observed to the IoT cloud. The signal analytics problems of interest are

(6)

for , and

(7)

where the above minimizations are over .

III-A Algorithm for overpredictive signal analytics in IoT

It is assumed that each IoT device works in a distributed manner. To ensure , we propose that each IoT device can perform envelope approximation of its observed signal [9]. The following steps are proposed for obtaining a bandwidth- approximation of :

  1. Each device records its individual signal , calculates its envelope , and communicates its Fourier coefficients to the IoT cloud.

  2. Using Fourier coefficients from each IoT device, the cloud calculates .

Signal envelope calculation in Step 1 above is outlined next. For distance (see (3)), it will be calculated as [9]

minimize
subject to (8)

where and

are the Fourier series coefficients of the envelope approximation. The above linear program is efficiently solvable [9]. For the $L^2$ and $L^\infty$ distances, the cost function is replaced by those in (4) and (5).
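One way to make the envelope computation in Step 1 concrete is to enforce the envelope constraint only on a finite grid of time points, which yields a finite linear program. The following is a sketch under stated assumptions, not necessarily the formulation of [9]: the signal $s(t)$ is real-valued on $[0,1]$, the cost is the $L^1$ distance, the grid is $t_m = m/M$, and the real variables $c_0, c_k, d_k$ (the cosine and sine weights of the envelope) are introduced here for illustration.

```latex
\begin{align*}
\underset{c_0,\,c_1,\ldots,c_L,\,d_1,\ldots,d_L}{\text{minimize}} \quad & c_0 \\
\text{subject to} \quad &
c_0 + \sum_{k=1}^{L}\big(c_k\cos(2\pi k t_m) + d_k\sin(2\pi k t_m)\big) \;\ge\; s(t_m),
\qquad m = 0, 1, \ldots, M-1 .
\end{align*}
```

Since the integral of the envelope over $[0,1]$ equals $c_0$ and the integral of $s(t)$ is fixed, minimizing $c_0$ is equivalent to minimizing the $L^1$ distance. The program has $2L+1$ variables and $M$ linear constraints. A grid-based constraint only guarantees over-prediction at the grid points, so a fine grid or a small safety margin would be needed to cover all $t$.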

As is increased, the envelopes become more proximal to their target . It is expected that , , and will decrease as increases. However, analyzing the dependence of versus is difficult. Accordingly, naïve envelope approximation [15] will be used to analyze fundamental bounds on their tradeoff. Since each IoT device approximates the signal in a distributed manner, the error in will increase with the number of IoT devices in the network as discussed in the next section.

III-B The naïve envelope approximation

First consider IoT device in isolation. Let be the orthogonal projection of on the span of for . Then the naïve envelope approximation scheme is as follows [15]:

(9)

where . Using the triangle inequality,

(10)

For  [16, Sec. 2.2] we can show that, .
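The naïve scheme and its use in the distributed algorithm can be sketched numerically. The code below assumes that the naïve envelope is the lowpass projection with its constant (DC) term raised by the largest projection error; it also assumes a unit observation interval, Riemann-sum coefficient estimates on a grid, and three hypothetical device signals. Function names are illustrative. Each device sends only its $2L+1$ coefficients, and the cloud adds coefficients across devices before reconstructing the sum signal.

```python
import cmath
import math

M = 256  # samples per device on [0, 1)

def fourier_coeffs(samples, L):
    """Riemann-sum estimate of Fourier coefficients for |k| <= L."""
    m = len(samples)
    return {k: sum(x * cmath.exp(-2j * math.pi * k * j / m)
                   for j, x in enumerate(samples)) / m
            for k in range(-L, L + 1)}

def synthesize(coeffs, m=M):
    """Reconstruct samples on the grid from Fourier coefficients."""
    return [sum(c * cmath.exp(2j * math.pi * k * j / m)
                for k, c in coeffs.items()).real for j in range(m)]

def naive_envelope_coeffs(samples, L):
    """Device side: lowpass projection with its DC term raised by the
    largest projection error on the grid, so the result over-predicts."""
    coeffs = fourier_coeffs(samples, L)
    proj = synthesize(coeffs, len(samples))
    coeffs[0] += max(abs(x - p) for x, p in zip(samples, proj))
    return coeffs

def cloud_sum_envelope(all_coeffs):
    """Cloud side: add coefficients across devices, then reconstruct."""
    total = {}
    for coeffs in all_coeffs:
        for k, c in coeffs.items():
            total[k] = total.get(k, 0) + c
    return synthesize(total)

# Hypothetical smooth, 1-periodic device signals.
devices = [
    [math.exp(math.cos(2 * math.pi * j / M)) for j in range(M)],
    [math.exp(math.sin(2 * math.pi * j / M)) for j in range(M)],
    [1.0 + 0.5 * math.cos(4 * math.pi * j / M) ** 3 for j in range(M)],
]
true_sum = [sum(vals) for vals in zip(*devices)]

def run(L):
    """Return (peak reported by the cloud, mean gap to the true sum)."""
    est = cloud_sum_envelope([naive_envelope_coeffs(s, L) for s in devices])
    mean_gap = sum(e - x for e, x in zip(est, true_sum)) / M
    return max(est), mean_gap

for L in (1, 2, 4):
    peak, gap = run(L)
    print(f"L={L}: cloud peak={peak:.4f} "
          f"(true {max(true_sum):.4f}), mean gap={gap:.4f}")
```

In this sketch the mean gap equals the sum of the per-device shifts, because the projection preserves each signal's mean. It shrinks as $L$ grows, while the reported peak stays at or above the true peak on the grid.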

IV Naive envelope approximation analysis

Fig. 2: (a) Distributed envelope approximation applied on electricity load datasets collected from Eastern and Western Region power grids in India [17] for approximation coefficients. (b) Sum envelope recovered at the IoT cloud using error optimization. (c) Convergence of to SA with increasing available bandwidth at IoT device. (d) Error versus bandwidth plots for and error metric compared to the theoretical upper bound with (see Theorem IV.1).

Fig. 3: Dataset: PM2.5 (Speciation) concentration in Denver from US-EPA database [18]. (a) Lower and upper envelopes for with optimization. (b) Lower and upper envelopes for with optimization. (c) The variation of average gap between upper and lower envelopes with increasing values of . (d) Error in time estimate of peak pollution concentration, for various values of .

The main result is proved in this section.

Theorem IV.1

In (6) and (7), let be the optimal distance and be the distance corresponding to the naïve approximation in (9). There exist signals such that for

Proof: Using (10) and the triangle inequality on sum of signals and their envelopes, we determine bounds on . They are tabulated in Table I, where the result follows for by taking ratios of and . The steps are omitted due to space constraints. This result shows that the naive approximation is order optimal for the and errors, if the IoT device signals are -times differentiable.

The optimality of the naïve envelopes for the distance holds for a signal class with the following properties: (i) the Fourier coefficients , and (ii) is real and even, that is . From these symmetry assumptions, it follows that . Restricted to this signal class, the envelope approximation is re-stated as:

(11)

The above optimization can be shown to result in if for . In this case . And, for this signal as all the Fourier coefficients are positive. Since , as naive method will be suboptimal, so the two are equal.

It is noted that and in Table I are comparable. The presented result is for a single IoT device. For IoT devices, and as well as and scale linearly with . And as well as scale quadratically with . Their ratios remain the same as in .
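One way to see where such scaling can come from is the triangle inequality applied to the sum signal. The sketch below assumes $N$ devices with signals $s_i$ and envelopes $\hat{s}_i$ (notation introduced here for illustration), and assumes the quadratically scaling quantity is the squared $L^2$ error. For the $L^1$ or $L^\infty$ norm, denoted $\|\cdot\|$:

```latex
\begin{align*}
\Big\|\sum_{i=1}^{N}\hat{s}_i-\sum_{i=1}^{N}s_i\Big\|
 &\le \sum_{i=1}^{N}\big\|\hat{s}_i-s_i\big\|
 \;\le\; N\max_{1\le i\le N}\big\|\hat{s}_i-s_i\big\|, \\
\Big\|\sum_{i=1}^{N}\hat{s}_i-\sum_{i=1}^{N}s_i\Big\|_{2}^{2}
 &\le \Big(\sum_{i=1}^{N}\big\|\hat{s}_i-s_i\big\|_{2}\Big)^{2}
 \;\le\; N^{2}\max_{1\le i\le N}\big\|\hat{s}_i-s_i\big\|_{2}^{2}.
\end{align*}
```

For the $L^1$ norm with envelopes, the first inequality is in fact an equality, since every difference $\hat{s}_i(t) - s_i(t)$ is nonnegative and the integrals simply add.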

error metric
TABLE I: Bounds on the approximation errors

V Numerical simulations

Dataset Description: Time series data of electricity loads (in kWh) from the Eastern and Western region grids of India are considered [17] (see Fig. 2). The time samples, available at 30-minute intervals, are interpolated using a lowpass projection on the Fourier basis with 25 coefficients. For brevity of simulations, a normalized amplitude scale is used. We also consider the pollution dataset from the US EPA (United States Environmental Protection Agency) [18], consisting of the time-series variation of PM2.5 concentration in Denver. The hourly sampled data, for a duration of 1 day, is smoothed by lowpass filtering using 10 Fourier coefficients (see Fig. 3).

Simulation setup and analysis: Simulations for the approximation error and bandwidth tradeoffs are presented for grid load data in Fig. 2 [17]. The time series plots for individual load variation are shown in Fig. 2(a). The sum signal envelope is reconstructed at the IoT cloud with Fourier coefficients (see Fig. 2(b)). The convergence of the estimate to with increasing is studied in Fig. 2(c). It is observed that the maximum of the sum signal is eventually tracked under each of the cost functions. The oscillations in the error plot (see Fig. 2(c)) are artifacts of the distributed and nonlinear nature of the approximation algorithm. At any single IoT device, the maximum (a nonlinear analytic) of the signal is tracked with uniformly decreasing error. However, since each device reports its approximation to the IoT cloud independently (i.e., in a distributed manner), the errors do not decrease uniformly, but show occasional rises. Fig. 2(d) captures the bandwidth-error tradeoff for the three distance measures considered in the paper. The performance of the naïve approximation scheme is seen to coincide with that of the error optimization. This is attributed to the relaxation of the cost in terms of the coefficients, as discussed in (5).

We analyze the gap between the upper and lower signal envelopes using the pollution dataset [18] (Fig. 3). We observe that the average gap between the envelopes decays with increasing value of . It is to be noted that the lower envelope is negative in certain time intervals. (Footnote 3: This indicates the limitation of the Fourier basis representation in capturing the signal for a generic signal class. Alternate basis representations such as polynomials or wavelets suitable for the signal class will be considered as future extensions.) In Fig. 3(d) we analyze the error in the time estimate of the peak pollutant concentration, . The estimates eventually approach the true value of the signal analytic (i.e., ) with growing bandwidth parameter .

Energy consumption in IoT device and IoT cloud: Using IoT energy consumption data, a comparison of different IoT energy requirements is shown in Fig. 4 [19]. For a communication mode with energy/bit requirement and bits/coefficient, the proposed envelope scheme requires energy units per device. The computing energy requirement (see Fig. 2(d)) is determined by counting the number of floating point operations (FLOP) in MATLAB to execute the optimization algorithm at the device. The conversion of FLOP to equivalent joules is performed with respect to a 100 GFLOP/sec/W processor (see TI C667x processor [20]). It is observed that the communication and computation energy expenditures increase linearly with the number of approximation coefficients.
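The energy accounting described above can be sketched as follows. All numeric parameters below (bits per coefficient, energy per bit, FLOP per coefficient) are hypothetical placeholders and are not the values used in the simulations. The only figure taken from the text is the 100 GFLOP/sec/W processor rating, which is equivalent to $10^{11}$ FLOP per joule. The sketch also assumes the FLOP count grows in proportion to the number of coefficients.

```python
import math

def communication_energy_joules(num_coeffs, bits_per_coeff, joules_per_bit):
    """Energy to transmit num_coeffs coefficients over one link."""
    return num_coeffs * bits_per_coeff * joules_per_bit

def computing_energy_joules(flop_count, flop_per_joule=100e9):
    """Energy to execute flop_count operations on a processor rated at
    flop_per_joule (100 GFLOP/sec/W equals 100e9 FLOP per joule)."""
    return flop_count / flop_per_joule

# Hypothetical parameters, for illustration only.
BITS_PER_COEFF = 32
JOULES_PER_BIT = 1e-7
FLOP_PER_COEFF = 5_000  # assumes FLOP count is linear in coefficients

for num_coeffs in (5, 11, 21):
    e_comm = communication_energy_joules(
        num_coeffs, BITS_PER_COEFF, JOULES_PER_BIT)
    e_comp = computing_energy_joules(FLOP_PER_COEFF * num_coeffs)
    print(f"{num_coeffs:2d} coefficients: "
          f"communication {e_comm:.2e} J, computing {e_comp:.2e} J")
```

Under these assumptions both terms are proportional to the number of coefficients, which matches the linear trend reported for Fig. 4.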

Fig. 4: Energy expenditure at IoT device for communication and computing. Simulation parameters - bit for Ethernet, bit for WiFi, bits per coefficient bits and GFLOPS/W processor.

VI Conclusions

Distributed nonlinear signal analytics was introduced for the first time. It was observed that in applications such as energy distribution, signals observed by IoT devices should be subjected to envelope approximation. An algorithm for over-predictive nonlinear signal analytics in the IoT cloud was developed in this work by using the envelope approximation technique. The fundamental tradeoff between the approximation error in signal analytics and the bandwidth parameter was established. It was observed that this tradeoff depends on the smoothness of the signals at the IoT devices. Simulation results were presented for load data collected from two power grids. Envelope approximation schemes using polynomial and wavelet bases are proposed as future extensions.

References

  • [1] T.M. Cover and J.A. Thomas, Elements of Information Theory, John Wiley and Sons, 2006.
  • [2] C. E. Shannon, “Communication in the presence of noise,” Proceedings of the IRE, vol. 37, no. 1, pp. 10–21, Jan 1949.
  • [3] Ronald A. DeVore and George G. Lorentz, Constructive Approximation, Springer, 1993.
  • [4] A. Zanella, N. Bui, A. Castellani, L. Vangelista, and M. Zorzi, “Internet of things for smart cities,” IEEE Internet of Things Journal, vol. 1, no. 1, pp. 22–32, Feb 2014.
  • [5] A. Giridhar and P. R. Kumar, “Computing and communicating functions over sensor networks,” IEEE Journal on Selected Areas in Communications, vol. 23, no. 4, pp. 755–764, April 2005.
  • [6] P. Vyavahare, N. Limaye, and D. Manjunath, “Optimal embedding of functions for in-network computation: Complexity analysis and algorithms,” IEEE/ACM Transactions on Networking, vol. 24, no. 4, pp. 2019–2032, Aug 2016.
  • [7] P. Jesus, C. Baquero, and P. S. Almeida, “A survey of distributed data aggregation algorithms,” IEEE Communications Surveys Tutorials, vol. 17, no. 1, pp. 381–404, Firstquarter 2015.
  • [8] J. Ranieri, I. Dokmanic, A. Chebira, and M. Vetterli, “Sampling and reconstruction of time-varying atmospheric emissions,” in 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2012, pp. 3673–3676.
  • [9] Animesh Kumar, “Optimal envelope approximation in Fourier basis with applications in TV white space,” arXiv, 2017, Available at http://arxiv.org/abs/1706.00900.
  • [10] John Nikolas Tsitsiklis, Problems in Decentralized Decision Making and Computation, Ph.D. thesis, Massachusetts Institute of Technology. Laboratory for Information and Decision Systems, 1984.
  • [11] Y. Sun, H. Song, A. J. Jara, and R. Bie, “Internet of things and big data analytics for smart and connected communities,” IEEE Access, vol. 4, pp. 766–773, 2016.
  • [12] J. Jin, J. Gubbi, S. Marusic, and M. Palaniswami, “An information framework for creating a smart city through internet of things,” IEEE Internet of Things Journal, vol. 1, no. 2, pp. 112–121, April 2014.
  • [13] Alan V Oppenheim, Discrete-time signal processing, Pearson Education India, 1999.
  • [14] Stephane Mallat, A Wavelet Tour of Signal Processing, Third Edition: The Sparse Way, Academic Press, 3rd edition, 2008.
  • [15] G. Maheshwari and A. Kumar, “Optimal quantization of TV white space regions for a broadcast based geolocation database,” in 2016 24th European Signal Processing Conference (EUSIPCO), Aug 2016, pp. 418–422.
  • [16] Rajendra Bhatia, Fourier Series, Hindustan Book Agency, 2nd edition, 1993.
  • [17] Power System Operation Corporation Ltd., “Electricity load factor in Indian power system,” Report of the National Load Dispatch Center, 2016.
  • [18] “Tables of hourly data for particulates,” https://www.epa.gov/outdoor-air-quality-data, Accessed: 2018-02-20.
  • [19] Internet Society, “The Internet of Things (IoT): An Overview,” 2015.
  • [20] “TMS320C6678 multicore fixed and floating-point digital signal processor,” http://www.ti.com/lit/ds/symlink/tms320c6678.pdf, Accessed: 2018-02-20.