Preconditioned Gradient Descent Algorithm for Inverse Filtering on Spatially Distributed Networks

07/22/2020 ∙ by Cheng Cheng, et al. ∙ Duke University University of Central Florida 0

Graph filters and their inverses have been widely used in denoising, smoothing, sampling, interpolating and learning. Implementation of an inverse filtering procedure on spatially distributed networks (SDNs) is a remarkable challenge, as each agent on an SDN is equipped with a data processing subsystem with limited capacity and a communication subsystem with confined range due to engineering limitations. In this letter, we introduce a preconditioned gradient descent algorithm to implement the inverse filtering procedure associated with a graph filter having small geodesic-width. The proposed algorithm converges exponentially, and it can be implemented at vertex level and applied to time-varying inverse filtering on SDNs.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

Spatially distributed networks (SDNs) have been widely used in (wireless) sensor networks, drone fleets, smart grids and many real world applications [1][4]. An SDN has a large amount of agents and each agent equipped with a data processing subsystem having limited data storage and computation power and a communication subsystem for data exchanging to its “neighboring” agents within communication range. The topology of an SDN can be described by a connected, undirected and unweighted finite graph with a vertex in representing an agent and an edge in between vertices indicating that the corresponding agents are within some range in the spatial space. In this letter, we consider SDNs equipped with a communication subsystem at each agent to directly communicate between two agents if the geodesic distance between their corresponding vertices is at most ,

(I.1)

where the geodesic distance is the number of edges in a shortest path connecting , and we call the minimal integer in (I.1) as the communication range of the SDN. Therefore the implementation of data processing on our SDNs is a distributed task and it should be designed at agent/vertex level with confined communication range. In this letter, we consider the implementation of graph filtering and inverse filtering on SDNs, which are required to be fulfilled at agent level with communication range no more than .

A signal on a graph

is a vector

indexed by the vertex set, and a graph filter maps a graph signal linearly to another graph signal , which is usually represented by a matrix indexed by vertices in . Graph filtering and its inverse filtering play important roles in graph signal processing and they have been used in smoothing, sampling, interpolating and many real-world applications [2], [5][9]. A graph filter is said to have geodesic-width if

(I.2)

[4, 10, 11]. For a filter with geodesic-width , the corresponding filtering process

(I.3)

can be implemented at vertex level, and the output at a vertex is a “weighted” sum of the input in its -neighborhood,

(I.4)

For SDNs with communication range , the above implementation at vertex level provides an essential tool for the filtering procedure (I.3), in which each agent has equipped with subsystems to store and with , to compute addition and multiplication in (I.4), and to exchange data to its neighboring agents satisfying .

For an invertible filter , the implementation of the inverse filtering procedure

(I.5)

cannot be directly applied for our SDNs, since the inverse filter may have geodesic-width larger than the communication range . For the consideration of implementing inverse filtering on an SDN with communication range , we construct a diagonal preconditioning matrix in (II.1) at vertex level, and propose the preconditioned gradient descent algorithm (PGDA) (II.8) to implement inverse filtering on the SDN, see Algorithms II.1 and II.2.

A conventional approach to implement the inverse filtering procedure (I.5) is via the iterative quasi-Newton method

(I.6)

with arbitrary initial , where the graph filter is an approximation to the inverse . A challenge in the quasi-Newton method is how to select the approximation filter appropriately. For the widely used polynomial graph filters of a graph shift where [11][19], several methods have been proposed to construct polynomial approximation filters [13, 14, 16, 17, 19]. However, for the convergence of the corresponding quasi-Newton method, some prior knowledge is required for the polynomial and the graph shift , such as the whole spectrum of the shift in the optimal polynomial approximation method [19], the interval containing the spectrum of the shift in the Chebyshev approximation method [16, 17, 19], and the spectral radius of the shift and the zero set of the polynomial in the autoregressive moving average filtering algorithm [13, 14]. For a non-polynomial graph filter , the approximation filter in the gradient descent method is of the form with selection of the optimal step length

depending on maximal and minimal singular values of the filter

[12, 20], and the approximation filter in the iterative matrix inverse approximation algorithm (IMIA) could be selected under a strong assumption on [10, Theorem 3.2]. The proposed PGDA (II.8) is the quasi-Newton method (I.6) with being selected as the approximation filter , see (II.3). Comparing with the quasi-Newton methods in [10, 12, 13, 14, 16, 17, 19, 20], one significance of the proposed PGDA is that the sequence , in (II.8) converges exponentially to the output of the inverse filtering procedure (I.5) whenever the filter is invertible, see Theorems II.3 and III.1.

For a time-varying filter the quasi-Newton method (I.6) to implement their inverse filtering on SDNs should be self-adaptive as each agent does not have the whole time-varying filter and it only receives the submatrix of the filter within the communication range [4]. The IMIA algorithm is self-adaptive [10, Eq. (3.4)], and the gradient descent method [12, 20] is not self-adaptive in general except that the step length can be chosen to be time-independent. The second significance of the proposed PGDA is its self-adaptivity and compatibility to implement the time-varying inverse filtering procedure on our SDNs, as the preconditioner (and hence the approximation filter in the PGDA) is constructed at vertex level with confined communication range, see Algorithm II.1.

Ii Preconditioned gradient descent algorithm for inverse filtering

Let be a connected, undirected and unweighted graph and be a filter on the graph with geodesic-width . In this section, we induce a diagonal matrix with diagonal elements , given by

(II.1)

where we denote the set of all -hop neighbors of a vertex by . The above diagonal matrix can be evaluated at vertex level, and constructed on SDNs with communication range , see Algorithm II.1.

  Inputs: Geodesic width of the filter and nonzero entries and for in the -th row and column of the filter .
  1) Calculate.
  2) Send to all neighbors and receive from neighbors .
  3) Calculate .
  Output: .
Algorithm II.1 Realization of the preconditioner at a vertex .

For symmetric matrices and , we use and to denote the positive semidefiniteness and positive definiteness of their difference respectively. A crucial observation about the diagonal matrix is as follows.

Theorem II.1.

Let be a graph filter with geodesic-width and be as in (II.1). Then

(II.2)
Proof.

For , we have

This proves (II.2) and completes the proof. ∎

Denote the spectral radius of a matrix by . By Theorem II.1, is an approximation filter to the inverse filter in the sense that

(II.3)
Remark II.2.

Define the Schur norm of a matrix by

and denote the zero and identity matrices of appropriate size by and respectively. One may verify that

(II.4)

By (II.1), we have

(II.5)

Then we may consider the conclusion (II.2) for the preconditioner as a distributed version of the well-known matrix dominance (II.4) for the graph filter .

  Inputs: Iteration number , geodesic-width , preconditioning constant , observation at vertex , and filter coefficients and .
  1) Calculate .
  Initialization: Initial , and .
  2) Calculate .
  3) Send to neighbors and receive from neighbors .
  4) Update
  5) Send to neighbors and receive from neighbors .
  6) Set and return to Step 2) if .
  Outputs: .
Algorithm II.2 Implementation of the PGDA (II.8) at a vertex .

Preconditioning technique has been widely used in numerical analysis to solve a linear system, where the difficulty is how to select the preconditioner appropriately. In this letter, we use as a right preconditioner to the linear system

(II.6)

associated with the inverse filtering procedure (I.5), and we solve the following right preconditioned linear system

(II.7)

via the gradient descent algorithm

with initial . The above iterative algorithm can be reformulated as a quasi-Newton method (I.6) with replaced by ,

(II.8)

with initial . We call the above approach to implement the inverse filtering procedure (I.5) by the preconditioned gradient descent algorithm, or PGDA for abbreviation.

Define , and the norm for . By (II.8), we have

(II.9)

Therefore the iterative algorithm (II.8) converges exponentially by (II.3) and (II.9).

Theorem II.3.

Let be an invertible graph filter and , be as in (II.8). Then

In addition to the exponential convergence in Theorem II.3, the PGDA is that each iteration can be implemented at vertex level, see Algorithm II.2. Therefore for an invertible filter with , the PGDA (II.8) can implement the inverse filtering procedure (I.5) on SDNs with each agent only storing, computing and exchanging the information in a -hop neighborhood.

Iii Symmetric preconditioned gradient descent algorithm for inverse filtering

  Inputs: Iteration number , geodesic-width , observation at vertex , and filter coefficients and .
  1) Calculate , and .
  Initialization: Initial and .
  2) Compute.
  3) Send to neighbors and receive from neighbors .
  4) Set and return to Step 2) if .
  Outputs: .
Algorithm III.1 Implementation of the SPGDA (III.3) at a vertex .

In this section, we consider implementing the inverse filtering procedure (I.5) associated with a positive definite filter on a connected, undirected and unweighted graph . Define the diagonal matrix with diagonal entries

(III.1)

and set

(III.2)

We remark that the normalized matrix in (III.2) associated with a diffusion matrix has been used to understand diffusion process [21], and the one corresponding to the Laplacian on the graph is half of its normalized Laplacian , where is degree matrix of [11]. Similar to the PGDA (II.8), we propose the following symmetric preconditioned gradient descent algorithm, or SPGDA for abbreviation,

(III.3)

with initial , to solve the following preconditioned linear system

(III.4)

Comparing with the PGDA (II.8), the SPGDA for a positive definite graph filter has less computation and communication cost in each iteration and it also can be implemented at vertex level, see Algorithm III.1.

For , we obtain from (III.1) and the symmetry of the matrix that

Combining (II.1) and (III.1) proves that

(III.5)

cf. (II.2). This together with (III.2) implies that

(III.6)

Similar to the proof of Theorem II.3, we have

Theorem III.1.

Let be a positive definite graph filter. Then , in (III.3) converges exponentially,

Iv Numerical simulations

Fig. 1: Plotted on the left is a corrupted blockwise polynomial signal and in the middle is the output of the filtering procedure, where and the condition number of the filter is . Shown on the right is average of the relative inverse filtering error over 1000 trials, where and , , are the outputs of SPGDA, PGDA, OpGD and IMIA.

Let be a random geometric graph with 512 vertices deployed on and an undirected edge between two vertices if their physical distance is not larger than [11, 22]. In the first simulation, we consider the inverse filtering procedure associated with the graph filter

where is the normalized Laplacian on the graph , the filter is defined by

is the coordinator of a vertex and

are i.i.d random noises uniformly distributed on

. Let be the blockwise polynomial consisting of four strips and imposes on the first and third diagonal strips and on the second and fourth strips respectively [11, 19]. In the simulation, the signals

are obtained by a blockwise polynomial corrupted by noises

with their components being i.i.d. random variables with uniform distribution on

, and the observations of the filtering procedure are given by , see the left and middle images of Figure 1. In the simulation, we use the SPGDA (III.3) and the PGDA (II.8) with zero initial to implement the inverse filtering procedure , and also we compare their performances with the gradient decent algorithm

(IV.1)

with zero initial and optimal step length selected in [12, 19, 20], OpGD in abbreviation, and the iterative matrix inverse approximation algorithm,

(IV.2)

IMIA in abbreviation, where and the diagonal matrix has entries , see [10, Eq. (3.4)] with . Shown in Figure 1 is the average of the relative inverse filtering error over 1000 trials, and it reaches the relative error 5% at about th iteration for IMIA, 118th iteration for SPGDA, and more than 3000 iterations for PGDA and OpGD. This confirms that , in the SPGDA, PGDA, OpGD and IMIA converge exponentially to the output of the inverse filtering, and the convergence rate are spectral radii of matrices , , and , see Theorems II.3 and III.1. Here the average of spectral radii in SPGDA, PGDA, OpGD and IMIA are respectively. We remark that the reason for PGDA and OpGD to have slow convergence in the above simulation could be that their spectral radii are too close to . For the filter perturbation level , our simulations indicate that for some filters being invertible but not positive definite, the corresponding PGDA and OpGD converge while SPGDA and IMIA diverge.

Fig. 2: Plotted on the left is the original temperature data

. Shown on the right is average of the signal-to-noise ratio

, over 1000 trials, where , , are the outputs of PGDA, SPGDA, OpGD, IMIA and ICPA, and average of the limit SNR is 16.7869.

Let be the undirected graph with 218 locations in the United States as vertices and edges constructed by the 5 nearest neighboring locations, and let be the recorded temperature vector of those 218 locations on August 1st, 2010 at 12:00 PM, see Figure 2 [19, 23]. In the second simulation, we consider to implement the inverse filtering procedure

arisen from the minimization problem

in denoising the hourly temperature data , where is the normalized Laplacian on , is a penalty constraint and is the temperature vector corrupted by i.i.d. random noise with its components being randomly selected in in a uniform distribution [19, 23]. Shown in Figure 2 is the performance of the SPGDA, PGDA, OpGD, IMIA and ICPA to implement the above inverse filtering procedure with noise level and the penalty constraint [19], where ICPA is the iterative Chebyshev polynomial approximation algorithm of order one [16, 17, 19]. This indicates that the 3rd term in ICPA, the 5th term in IMIA, the 8th term of SPGDA, the 10th term of OpGD and the 30th term of PGDA can be used as the denoised temperature vector .

To implement the inverse filter procedure (I.5) on SDNs, we observe from the above two simulations that OpGD outperforms PGDA while the selection of optimal step length in OpGD is computationally expensive. If the filter is positive definite, SPGDA, IMIA and ICPA may have better performance than OpGD and PGDA have. On the other hand, SPGDA always converges, but the requirement in [10, Theorem 3.2] to guarantee the convergence of IMIA may not be satisfied and ICPA is applicable for polynomial filters.

References

  • [1] J. Yick, B. Mukherjee, and D. Ghosal, “Wireless sensor network survey,” Comput. Netw., vol. 52, no. 12, pp. 2292-2330, Aug. 2008.
  • [2]

    D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, “The emerging field of signal processing on graphs: extending high-dimensional data analysis to networks and other irregular domains,”

    IEEE Signal Process. Mag., vol. 30, no. 3, pp. 83-98, May 2013.
  • [3] R. Hebner, “The power grid in 2030,” IEEE Spectrum, vol. 54, no. 4, pp. 50-55, Apr. 2017.
  • [4] C. Cheng, Y. Jiang, and Q. Sun, “Spatially distributed sampling and reconstruction,” Appl. Comput. Harmon. Anal., vol. 47, no. 1, pp. 109-148, July 2019.
  • [5] A. Sandryhaila and J. M. F. Moura, “Discrete signal processing on graphs,” IEEE Trans. Signal Process., vol. 61, no. 7, pp. 1644-1656, Apr. 2013.
  • [6] A. Sandryhaila and J. M. F. Moura, “Big data analysis with signal processing on graphs: representation and processing of massive data sets with irregular structure,” IEEE Signal Process. Mag., vol. 31, no. 5, pp. 80-90, Sept. 2014.
  • [7] A. Ortega, P. Frossard, J. Kovačević, J. M. F. Moura, and P. Vandergheynst, “Graph signal processing: overview, challenges, and applications,” Proc. IEEE , vol. 106, no. 5, pp. 808-828, May 2018.
  • [8] M. Coutino, E. Isufi, and G. Leus, “Advances in distributed graph filtering,” IEEE Trans. Signal Process., vol. 67, no. 9, pp. 2320-2333, May 2019.
  • [9] J. Jiang, D. B. Tay, Q. Sun and S. Ouyang, “Recovery of time-varying graph signals via distributed algorithms on regularized problems”, IEEE Trans. Signal Inf. Process. Netw., to appear, 2020.
  • [10] J. Jiang, and D. B. Tay, “Decentralised signal processing on graphs via matrix inverse approximation,” Signal Process., vol. 165, pp.292-302, Dec. 2019.
  • [11] J. Jiang, C. Cheng, and Q. Sun, “Nonsubsampled graph filter banks: theory and distributed algorithms,” IEEE Trans. Signal Process., vol. 67, no. 15, pp. 3938-3953, Aug. 2019.
  • [12] S. Chen, A. Sandryhaila, J. M. F. Moura, and J. Kovačević, “Signal recovery on graphs: variation minimization,” IEEE Trans. Signal Process., vol. 63, no. 17, pp. 4609-4624, Sept. 2015.
  • [13] E. Isufi, A. Loukas, A. Simonetto, and G. Leus, “Autoregressive moving average graph filtering,” IEEE Trans. Signal Process., vol. 65, no. 2, pp. 274-288, Jan. 2017.
  • [14] E. Isufi, A. Loukas, N. Perraudin, and G. Leus, “Forecasting time series with VARMA recursions on graphs,” IEEE Trans. Signal Process., vol. 67, no. 18, pp. 4870-4885, Sept. 2019.
  • [15] W. Waheed and D. B. H. Tay, “Graph polynomial filter for signal denoising,” IET Signal Process., vol. 12, no. 3, pp. 301-309, Apr. 2018.
  • [16] D. I. Shuman, P. Vandergheynst, D. Kressner, and P. Frossard, “Distributed signal processing via Chebyshev polynomial approximation,” IEEE Trans. Signal Inf. Process. Netw., vol. 4, no. 4, pp. 736-751, Dec. 2018.
  • [17] C. Cheng, J. Jiang, N. Emirov, and Q. Sun, “Iterative Chebyshev polynomial algorithm for signal denoising on graphs”, in Proceeding 13th Int. Conf. on SampTA, Bordeaux, France, July 2019, pp. 1-5.
  • [18] J. Jiang, D. B. Tay, Q. Sun, and S. Ouyang, “Design of nonsubsampled graph filter banks via lifting schemes,” IEEE Signal Process. Lett., vol. 27, pp. 441-445, Feb. 2020.
  • [19] N. Emirov, C. Cheng, J. Jiang, and Q. Sun, “Polynomial graph filter of multiple shifts and distributed implementation of inverse filtering,” arXiv: 2003.11152, Mar. 2020.
  • [20] X. Shi, H. Feng, M. Zhai, T. Yang, and B. Hu, “Infinite impulse response graph filters in wireless sensor networks,” IEEE Signal Process. Lett., vol. 22, pp. 1113-1117, Aug. 2015.
  • [21]

    B. Nadler, S. Lafon, I. Kevrekidis, and R. Coifman, “Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators,” In

    Advances in Neural Information Processing Systems 18, Y. Weiss, B. Schölkopf, and J. Platt eds, MIT Press, Cambridge, 2006, pp. 955-962.
  • [22] P. Nathanael, J. Paratte, D. Shuman, L. Martin, V. Kalofolias, P. Vandergheynst, and D. K. Hammond, “GSPBOX: A toolbox for signal processing on graphs,” arXiv:1408.5781, Aug. 2014.
  • [23] J. Zeng, G. Cheung, and A. Ortega, “Bipartite approximation for graph wavelet signal decomposition,” IEEE Trans. Signal Process., vol. 65, no. 20, pp. 5466-5480, Oct. 2017.