1 Introduction
Consider a network represented by a graph $\mathcal{G}$ consisting of a node set $\mathcal{N}$ of cardinality $N$ and a weighted adjacency matrix $\mathbf{A}$ whose entry $A_{ij}$ denotes the weight of the edge connecting node $i$ to node $j$. A graph signal can be viewed as a vertex-valued network process represented by a vector $\mathbf{x} \in \mathbb{R}^N$ supported on $\mathcal{N}$, where its $i$-th component denotes the signal value at node $i$. Under the assumption that the properties of the network process relate to the underlying graph, the goal of graph signal processing (GSP) is to generalize traditional signal processing tasks and develop algorithms that fruitfully exploit this relational structure [1, 2].
A keystone generalization which has drawn considerable attention in recent years pertains to sampling and reconstruction of graph signals [3, 4, 5, 6, 7, 8, 9, 10]. The task of finding an exact sampling set that enables reconstruction with minimal information loss is known to be NP-hard. Conditions for exact reconstruction of graph signals from noiseless samples were put forth in [3, 4, 5, 6]. Sampling of noise-corrupted signals using randomized schemes, including uniform and leverage score sampling, is studied in [11], for which optimal sampling distributions and performance bounds are derived. In [7, 10], reconstruction of graph signals and their power spectral density is studied and greedy schemes are developed. However, the performance guarantees in [11, 10] are restricted to the case of stationary graph signals, i.e., signals whose covariance matrix in the nodal or spectral domain has a certain structure (e.g., diagonal; see also [12, 13, 14]).
In this paper, we study the problem of sampling and reconstruction of graph signals and propose two algorithms that solve it approximately. First, we develop a semidefinite programming (SDP) relaxation that finds a near-optimal sampling set in polynomial time. Then, we formulate the sampling task as that of maximizing a monotone weak submodular function and propose an efficient randomized greedy algorithm motivated by [15]. We analyze the performance of the randomized greedy algorithm and, in doing so, show that the MSE associated with the selected sampling set is in expectation a constant factor away from that of the optimal set. Moreover, we prove that the randomized greedy algorithm achieves the derived approximation bound with high probability for every sampling task. In contrast to prior work, our results do not require stationarity of the signal. Finally, in simulation studies we illustrate the superiority of the proposed schemes over state-of-the-art randomized and greedy algorithms [11, 10] in terms of running time, accuracy, or both.^1

^1 MATLAB implementations of the proposed algorithms are available at https://github.com/realabolfazl/GSsampling.

Notation. $A_{ij}$ denotes the $(i,j)$ entry of matrix $\mathbf{A}$, $\mathbf{a}_i^\top$ is the $i$-th row of $\mathbf{A}$, $[\mathbf{A}]_{\mathcal{S}}$ ($[\mathbf{A}]^{\mathcal{S}}$) is the submatrix of $\mathbf{A}$ that contains the rows (columns) indexed by the set $\mathcal{S}$, and $\lambda_{\max}(\mathbf{A})$ and $\lambda_{\min}(\mathbf{A})$ represent the maximum and minimum eigenvalues of $\mathbf{A}$, respectively. $\mathbf{I}$ denotes the identity matrix.

2 Preliminaries and Problem Statement
Let $\mathbf{x} \in \mathbb{R}^N$ be a zero-mean random graph signal which is bandlimited in a given basis $\mathbf{V} \in \mathbb{R}^{N \times N}$. This means that the signal's so-called graph Fourier transform (GFT) $\tilde{\mathbf{x}} = \mathbf{V}^{-1}\mathbf{x}$ is sparse. There are several choices for $\mathbf{V}$ in the literature, with most aiming to decompose a graph signal into different modes of variation with respect to the graph topology. For instance, $\mathbf{V}$ can be defined via the Jordan decomposition of the adjacency matrix [16, 17], through the eigenvectors of the Laplacian when $\mathcal{G}$ is undirected [1], or it can be obtained as the result of an optimization procedure [18, 19]. We also assume that the signal is not necessarily stationary with respect to $\mathcal{G}$, and that $\tilde{\mathbf{x}}$ is a zero-mean random vector with (generally non-diagonal) covariance matrix $\mathbf{C}_{\tilde{x}} = \mathbb{E}[\tilde{\mathbf{x}}\tilde{\mathbf{x}}^\top]$. Recall that since $\mathbf{x}$ is bandlimited, $\tilde{\mathbf{x}}$ is sparse with at most $k < N$ nonzero entries. Let $\mathcal{K} = \operatorname{supp}(\tilde{\mathbf{x}})$ be the support set of $\tilde{\mathbf{x}}$, where $|\mathcal{K}| \le k$. Then, one can write $\mathbf{x} = \mathbf{V}_{\mathcal{K}}\tilde{\mathbf{x}}_{\mathcal{K}}$, where $\mathbf{V}_{\mathcal{K}}$ collects the columns of $\mathbf{V}$ indexed by $\mathcal{K}$.
Suppose that only a few (possibly noisy) entries of $\mathbf{x}$ can be observed, corresponding to taking measurements from a subset of nodes in $\mathcal{N}$. The goal of sampling is to select the subset that enables reconstruction of the original signal with the smallest possible distortion. Formally, let $\mathbf{y} = \mathbf{x} + \mathbf{n}$ be the noise-corrupted signal, where $\mathbf{n}$ is a zero-mean noise vector with covariance matrix $\mathbf{C}_n$. Let $\mathcal{S} \subseteq \mathcal{N}$ be a sampling subset of nodes and let $\hat{\mathbf{x}}$ be the reconstructed graph signal based on the measurements (i.e., the entries of $\mathbf{y}$) indexed by $\mathcal{S}$. Since the signal $\mathbf{x}$ is bandlimited, $\tilde{\mathbf{x}}$ has at most $k$ nonzero entries. Therefore, we assume that the least mean square estimator of $\tilde{\mathbf{x}}$ has at most $k$ nonzero entries. This in turn imposes the constraint $|\mathcal{S}| \ge k$. Now, since $\mathbf{x} = \mathbf{V}_{\mathcal{K}}\tilde{\mathbf{x}}_{\mathcal{K}}$, the samples $\mathbf{y}_{\mathcal{S}}$ and the nonzero frequency components of $\mathbf{x}$ are related via the Bayesian linear model

(1) $\mathbf{y}_{\mathcal{S}} = [\mathbf{V}_{\mathcal{K}}]_{\mathcal{S}}\,\tilde{\mathbf{x}}_{\mathcal{K}} + \mathbf{n}_{\mathcal{S}}.$
Hence, in order to find $\hat{\mathbf{x}}$ it suffices to estimate $\tilde{\mathbf{x}}_{\mathcal{K}}$ based on $\mathbf{y}_{\mathcal{S}}$. The least mean square estimator of $\tilde{\mathbf{x}}_{\mathcal{K}}$, denoted by $\hat{\tilde{\mathbf{x}}}_{\mathcal{K}}$, satisfies the Bayesian counterparts of the normal equations in the Gauss–Markov theorem (see, e.g., [20, Ch. 10]). Accordingly, it is given by

(2) $\hat{\tilde{\mathbf{x}}}_{\mathcal{K}} = \mathbf{P}_{\mathcal{S}}\,[\mathbf{V}_{\mathcal{K}}]_{\mathcal{S}}^\top \mathbf{C}_{n,\mathcal{S}}^{-1}\,\mathbf{y}_{\mathcal{S}},$

where $\mathbf{C}_{n,\mathcal{S}}$ denotes the principal submatrix of $\mathbf{C}_n$ indexed by $\mathcal{S}$ and

(3) $\mathbf{P}_{\mathcal{S}} = \left(\mathbf{C}_{\tilde{x}}^{-1} + [\mathbf{V}_{\mathcal{K}}]_{\mathcal{S}}^\top \mathbf{C}_{n,\mathcal{S}}^{-1} [\mathbf{V}_{\mathcal{K}}]_{\mathcal{S}}\right)^{-1}$

is the error covariance matrix of $\hat{\tilde{\mathbf{x}}}_{\mathcal{K}}$. Therefore, $\hat{\mathbf{x}} = \mathbf{V}_{\mathcal{K}}\hat{\tilde{\mathbf{x}}}_{\mathcal{K}}$, and its error covariance matrix can be obtained as $\mathbf{V}_{\mathcal{K}}\mathbf{P}_{\mathcal{S}}\mathbf{V}_{\mathcal{K}}^\top$.
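As a concrete illustration of (2)-(3), the following sketch (our own helper, with hypothetical names `H` for the sampled rows of the basis, `C_x` for the frequency-domain covariance, and `C_n_S` for the sampled noise covariance) computes the least mean square estimate and its error covariance:

```python
import numpy as np

def lmmse_estimate(y_S, H, C_x, C_n_S):
    """LMMSE estimate of the frequency components from samples y_S,
    for the Bayesian linear model y_S = H x_tilde + n_S."""
    # Error covariance P_S = (C_x^{-1} + H^T C_n^{-1} H)^{-1}
    P_S = np.linalg.inv(np.linalg.inv(C_x) + H.T @ np.linalg.inv(C_n_S) @ H)
    # Estimator x_tilde_hat = P_S H^T C_n^{-1} y_S
    x_tilde_hat = P_S @ H.T @ np.linalg.inv(C_n_S) @ y_S
    return x_tilde_hat, P_S
```

The returned `P_S` is exactly the quantity whose trace the sampling schemes below seek to minimize.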
The problem of sampling for near-optimal reconstruction can now be formulated as the task of choosing $\mathcal{S}$ so as to minimize the mean square error (MSE) of the estimator $\hat{\mathbf{x}}$. Since the MSE is the trace of the error covariance matrix, we obtain the following optimization problem,

(4) $\mathcal{S}^\star = \operatorname*{arg\,min}_{\mathcal{S} \subseteq \mathcal{N},\, |\mathcal{S}| = s} \operatorname{tr}\left(\mathbf{V}_{\mathcal{K}}\mathbf{P}_{\mathcal{S}}\mathbf{V}_{\mathcal{K}}^\top\right).$

Using trace properties (in particular, the cyclic property together with the orthonormality of the basis) and the fact that $\mathbf{P}_{\mathcal{S}}$ is a Hermitian positive semidefinite matrix, (4) simplifies to

(5) $\mathcal{S}^\star = \operatorname*{arg\,min}_{\mathcal{S} \subseteq \mathcal{N},\, |\mathcal{S}| = s} \operatorname{tr}\left(\mathbf{P}_{\mathcal{S}}\right).$
The optimization problem (5) is NP-hard, and exhaustively evaluating all candidate sampling sets is intractable even for relatively small graphs. In the next section we propose two alternatives that find near-optimal solutions in polynomial time.
3 New Schemes for Sampling Graph Signals
Here we resort to two approximation methods to find a near-optimal solution of (5). Our proposed algorithms are based on semidefinite and weak submodular optimization techniques that have recently shown superior performance in applications such as sensor selection [21], graph sketching [9], wireless sensor networks [22, 23], and sparse signal recovery [24, 25].

3.1 Sampling via SDP relaxation
We first develop an SDP relaxation for problem (5). Our proposed scheme is motivated by the framework of [21] developed in the context of sensor scheduling. However, our focus is on sampling and reconstruction of graph signals, which entails a different objective function, i.e., the MSE. Let $z_i \in \{0,1\}$ indicate whether the $i$-th node of $\mathcal{N}$ is included in the sampling set and define $\mathbf{z} = [z_1, \dots, z_N]^\top$. With a diagonal noise covariance $\mathbf{C}_n = \operatorname{diag}(\sigma_1^2, \dots, \sigma_N^2)$, (3) can alternatively be written as

(6) $\mathbf{P}(\mathbf{z}) = \Big(\mathbf{C}_{\tilde{x}}^{-1} + \sum_{i=1}^{N} \frac{z_i}{\sigma_i^2}\,\mathbf{v}_i \mathbf{v}_i^\top\Big)^{-1},$

where $\mathbf{v}_i^\top$ denotes the $i$-th row of $\mathbf{V}_{\mathcal{K}}$. Therefore, by relaxing the binary constraint $z_i \in \{0,1\}$ one can obtain a convex relaxation of (5),

(7) $\min_{\mathbf{z} \in [0,1]^N}\ \operatorname{tr}\big(\mathbf{P}(\mathbf{z})\big) \quad \text{subject to} \quad \mathbf{1}^\top \mathbf{z} = s.$

In order to obtain an SDP in standard form, let $\mathbf{Q}$ be a positive semidefinite matrix such that $\mathbf{Q} \succeq \mathbf{P}(\mathbf{z})$. Then, (7) is equivalent to

(8) $\min_{\mathbf{Q} \succeq \mathbf{0},\, \mathbf{z} \in [0,1]^N}\ \operatorname{tr}(\mathbf{Q}) \quad \text{subject to} \quad \mathbf{1}^\top \mathbf{z} = s, \quad \mathbf{Q} \succeq \mathbf{P}(\mathbf{z}).$

The last constraint in (8), i.e., $\mathbf{Q} \succeq \mathbf{P}(\mathbf{z})$, can be thought of as the Schur complement [26] of the block matrix

(9) $\mathbf{M}(\mathbf{Q}, \mathbf{z}) = \begin{bmatrix} \mathbf{Q} & \mathbf{I} \\ \mathbf{I} & \mathbf{C}_{\tilde{x}}^{-1} + \sum_{i=1}^{N} \frac{z_i}{\sigma_i^2}\,\mathbf{v}_i \mathbf{v}_i^\top \end{bmatrix}.$

Note that the Schur complement of $\mathbf{M}(\mathbf{Q}, \mathbf{z})$ is positive semidefinite if and only if $\mathbf{M}(\mathbf{Q}, \mathbf{z}) \succeq \mathbf{0}$ [26].
Therefore, replacing the last constraint in (8) with a positive semidefiniteness constraint on $\mathbf{M}(\mathbf{Q}, \mathbf{z})$ results in the following SDP relaxation:

(10) $\min_{\mathbf{Q},\, \mathbf{z}}\ \operatorname{tr}(\mathbf{Q}) \quad \text{subject to} \quad \mathbf{M}(\mathbf{Q}, \mathbf{z}) \succeq \mathbf{0}, \quad \mathbf{0} \le \mathbf{z} \le \mathbf{1}, \quad \mathbf{1}^\top \mathbf{z} = s.$

An exact solution to (10) can be obtained by means of existing SDP solvers; see, e.g., [27, 28]. However, the solution contains real-valued entries, and hence a rounding procedure is needed to obtain a binary solution. Here, we adopt the rounding procedure introduced in [21] and accordingly select the nodes of $\mathcal{N}$ corresponding to the $s$ largest entries of $\mathbf{z}$. The proposed SDP relaxation sampling scheme is summarized as Algorithm 1.
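The ingredients of the relaxation can be sanity-checked numerically. The sketch below (our own helper names, assuming white noise of variance `noise_var` so the objective takes the weighted-sum form of (6)) evaluates the relaxed objective, checks the Schur-complement equivalence behind (10) at a given point, and applies the top-$s$ rounding step:

```python
import numpy as np

def trace_P(z, V_K, C_x_inv, noise_var):
    """Relaxed MSE objective tr(P(z)) for weights z in [0, 1]^N,
    assuming white noise with variance noise_var."""
    R = C_x_inv + sum(z[i] / noise_var * np.outer(V_K[i], V_K[i])
                      for i in range(len(z)))
    return np.trace(np.linalg.inv(R))

def schur_psd_check(Q, z, V_K, C_x_inv, noise_var, tol=1e-9):
    """For R(z) positive definite, [[Q, I], [I, R(z)]] is PSD exactly
    when the Schur complement Q - R(z)^{-1} is PSD; return both verdicts."""
    R = C_x_inv + sum(z[i] / noise_var * np.outer(V_K[i], V_K[i])
                      for i in range(len(z)))
    k = R.shape[0]
    M = np.block([[Q, np.eye(k)], [np.eye(k), R]])
    block_psd = np.linalg.eigvalsh(M).min() >= -tol
    schur_psd = np.linalg.eigvalsh(Q - np.linalg.inv(R)).min() >= -tol
    return block_psd, schur_psd

def round_top_s(z, s):
    """Rounding step of Algorithm 1: keep the s nodes with largest weights."""
    return np.sort(np.argsort(z)[-s:])
```

Since increasing any weight $z_i$ only adds information, the relaxed objective is non-increasing in each coordinate, which is what makes the relaxation a meaningful surrogate for the discrete problem.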
3.2 Sampling via a randomized greedy scheme
The computational complexity of the SDP approach developed in Section 3.1 might be prohibitive in applications dealing with large graphs. Hence, we now propose an iterative randomized greedy algorithm for the task of sampling and reconstruction of graph signals by formulating (5) as the problem of maximizing a monotone weak submodular set function. First, we define the notions of monotonicity, submodularity, and curvature that will be useful in the subsequent analysis.
A set function $f : 2^{\mathcal{X}} \to \mathbb{R}$ is submodular if

(11) $f(\mathcal{S} \cup \{j\}) - f(\mathcal{S}) \ge f(\mathcal{T} \cup \{j\}) - f(\mathcal{T})$

for all subsets $\mathcal{S} \subseteq \mathcal{T} \subset \mathcal{X}$ and $j \in \mathcal{X} \setminus \mathcal{T}$. The term $f_j(\mathcal{S}) = f(\mathcal{S} \cup \{j\}) - f(\mathcal{S})$ is the marginal value of adding element $j$ to set $\mathcal{S}$. Moreover, the function $f$ is monotone if $f(\mathcal{S}) \le f(\mathcal{T})$ for all $\mathcal{S} \subseteq \mathcal{T}$.
The concept of submodularity can be generalized by the notion of curvature or submodularity ratio [24], which quantifies how close a set function is to being submodular. Specifically, the maximum element-wise curvature of a monotone nondecreasing function $f$ is defined as

(12) $\mathcal{C}_{\max} = \max_{(\mathcal{S}, \mathcal{T}, j)} \frac{f_j(\mathcal{T})}{f_j(\mathcal{S})},$

with $\mathcal{S} \subseteq \mathcal{T} \subset \mathcal{X}$ and $j \in \mathcal{X} \setminus \mathcal{T}$. Note that a set function is submodular if and only if $\mathcal{C}_{\max} \le 1$. Set functions with $\mathcal{C}_{\max} > 1$ are called weak or approximately submodular functions [24].
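To build intuition, the maximum element-wise curvature of a small set function can be brute-forced; the helper below (our own, for illustration only) evaluates (12) by enumerating nested pairs $\mathcal{S} \subseteq \mathcal{T}$ and elements $j \notin \mathcal{T}$:

```python
from itertools import combinations

def max_elementwise_curvature(f, ground):
    """Brute-force C_max = max f_j(T) / f_j(S) over S <= T < ground, j not in T,
    where f_j(S) = f(S | {j}) - f(S). Assumes strictly positive marginals."""
    subsets = [frozenset(c) for r in range(len(ground))
               for c in combinations(ground, r)]
    best = 0.0
    for T in subsets:
        for j in ground - T:
            gain_T = f(T | {j}) - f(T)
            for S in subsets:
                if S <= T:
                    gain_S = f(S | {j}) - f(S)
                    best = max(best, gain_T / gain_S)
    return best
```

A modular function (a sum of fixed weights) attains $\mathcal{C}_{\max} = 1$, while a function with growing marginals, e.g. $f(\mathcal{S}) = |\mathcal{S}|^2$, has $\mathcal{C}_{\max} > 1$ and is only weakly submodular.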
Next, similar to [10], we formulate (5) as a set function maximization task. Let $f(\mathcal{S}) = \operatorname{tr}(\mathbf{C}_{\tilde{x}}) - \operatorname{tr}(\mathbf{P}_{\mathcal{S}})$ denote the reduction in MSE achieved by the sampling set $\mathcal{S}$, where $f(\emptyset) = 0$ since $\mathbf{P}_{\emptyset} = \mathbf{C}_{\tilde{x}}$. Then, (5) can equivalently be written as

(13) $\mathcal{S}^\star = \operatorname*{arg\,max}_{\mathcal{S} \subseteq \mathcal{N},\, |\mathcal{S}| = s} f(\mathcal{S}).$
In Proposition 1 below, by applying the matrix inversion lemma [29] we establish that $f(\mathcal{S})$ is monotone and weakly submodular. Moreover, we derive an efficient recursion for the marginal gain of adding a new node to the sampling set $\mathcal{S}$.

Proposition 1. $f(\mathcal{S})$ is a weakly submodular, monotonically increasing set function with $f(\emptyset) = 0$, and for all $j \in \mathcal{N} \setminus \mathcal{S}$,

(14) $f_j(\mathcal{S}) = \frac{\mathbf{v}_j^\top \mathbf{P}_{\mathcal{S}}^2 \mathbf{v}_j}{\sigma_j^2 + \mathbf{v}_j^\top \mathbf{P}_{\mathcal{S}} \mathbf{v}_j},$

(15) $\mathbf{P}_{\mathcal{S} \cup \{j\}} = \mathbf{P}_{\mathcal{S}} - \frac{\mathbf{P}_{\mathcal{S}} \mathbf{v}_j \mathbf{v}_j^\top \mathbf{P}_{\mathcal{S}}}{\sigma_j^2 + \mathbf{v}_j^\top \mathbf{P}_{\mathcal{S}} \mathbf{v}_j},$

where $\mathbf{v}_j^\top$ is the $j$-th row of $\mathbf{V}_{\mathcal{K}}$ and $\sigma_j^2$ is the $j$-th diagonal entry of the (diagonal) noise covariance $\mathbf{C}_n$.
Proposition 1 enables efficient construction of the sampling set in an iterative fashion. To further reduce the computational cost, we propose a randomized greedy algorithm that performs the task of sampling set selection in the following way. Starting with $\mathcal{S} = \emptyset$, at each iteration of the algorithm a subset $\mathcal{R}$ of size $R$ is sampled uniformly at random and without replacement from $\mathcal{N} \setminus \mathcal{S}$. The marginal gain of each node in $\mathcal{R}$ is found using (14), and the node with the highest marginal gain is added to $\mathcal{S}$. Then, the algorithm employs the recursive relation (15) to update $\mathbf{P}_{\mathcal{S}}$ for the subsequent iteration. This procedure is repeated until a stopping criterion, such as a condition on the cardinality of $\mathcal{S}$, is met. Regarding the choice of $R$, we follow the suggestion in [15] and set $R = \frac{N}{s}\log(1/\epsilon)$, where $\epsilon$ is a predetermined parameter that controls the trade-off between the computational cost and the MSE of the reconstructed signal; the randomized greedy algorithm with smaller $\epsilon$ produces sampling solutions with lower MSE, while larger $\epsilon$ incurs lower computational cost. Note that if $R = N$, the randomized greedy algorithm in each iteration considers all the available nodes and hence matches the greedy scheme proposed in [10]. However, as we illustrate in our simulation studies, the proposed randomized greedy algorithm is significantly faster than the method in [10] for larger $\epsilon$ while returning essentially the same sampling solution. The randomized greedy algorithm is formalized as Algorithm 2.
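Under the same diagonal-noise simplification, the procedure above can be sketched as follows (our own variable names; `R_size` plays the role of the random-subset size, and the rank-one update implements the recursions (14)-(15)):

```python
import numpy as np

def randomized_greedy_sampling(V_K, C_x, noise_var, s, R_size, rng=None):
    """Randomized greedy sampling sketch (in the spirit of Algorithm 2),
    assuming white noise with variance noise_var. Each step draws a random
    candidate subset, keeps the node with the largest marginal MSE
    reduction, and updates the error covariance P by a rank-one
    (matrix inversion lemma) recursion."""
    rng = rng or np.random.default_rng()
    N = V_K.shape[0]
    P = np.array(C_x, dtype=float)   # P_empty equals the prior covariance
    remaining = list(range(N))
    S = []
    for _ in range(s):
        cand = rng.choice(remaining, size=min(R_size, len(remaining)),
                          replace=False)
        # Marginal gain of node j: ||P v_j||^2 / (sigma^2 + v_j^T P v_j)
        gains = [(np.sum((P @ V_K[j]) ** 2)
                  / (noise_var + V_K[j] @ P @ V_K[j]), j) for j in cand]
        _, best = max(gains)
        v = V_K[best]
        Pv = P @ v
        P = P - np.outer(Pv, Pv) / (noise_var + v @ Pv)   # rank-one update
        S.append(best)
        remaining.remove(best)
    return sorted(S), P
```

Setting `R_size = N` recovers the deterministic greedy scheme, in which case the recursively updated `P` must coincide with the direct formula (3) evaluated at the selected set.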
Performance guarantees. We now analyze the performance of the randomized greedy algorithm. First, Theorem 1 below states that if $f(\mathcal{S})$ has a bounded maximum element-wise curvature, Algorithm 2 returns a sampling subset yielding an MSE that is, in expectation, within a multiplicative factor of the MSE associated with the optimal sampling set.
Theorem 1.
The proof of Theorem 1 relies on the argument that, with the stated choice of $R$, the random subset $\mathcal{R}$ drawn in each iteration of Algorithm 2 contains at least one node of the optimal sampling set with high probability.
Next, we study the performance of the randomized greedy algorithm using the tools of probably approximately correct (PAC) learning theory [30, 31]. The randomization of Algorithm 2 can be interpreted as approximating the marginal gains of the nodes selected by the greedy scheme proposed in [10]: in each iteration, the gain attained by Algorithm 2 equals that of the greedy algorithm up to a random multiplicative factor. In view of this argument and by employing Bernstein's inequality [32], we obtain Theorem 2, which states that the randomized greedy algorithm selects a near-optimal sampling set with high probability.

Theorem 2.
Instate the notation and hypotheses of Theorem 1. Assume the per-iteration approximation factors are independent random variables. Then, with high probability it holds that
(17) 
Indeed, in simulation studies (see Section 4) we empirically verify the results of Theorems 1 and 2 and illustrate that Algorithm 2 performs favorably compared to the competing schemes both on average and for each individual sampling task. Before moving on to these numerical studies, in Theorem 3 we show that the maximum elementwise curvature of is bounded, even for nonstationary graph signals.
Theorem 3.
Let $\mathcal{C}_{\max}$ be the maximum element-wise curvature of $f(\mathcal{S})$. Then, it holds that
(18) 
An implication of Theorem 3 is a generalization of a result in [10] for stationary signals. There, it was shown that if the signal is stationary and an additional technical condition holds, then the curvature of the MSE objective is bounded. Theorem 3, in contrast, holds even in scenarios where the signal is nonstationary and the covariance matrix $\mathbf{C}_{\tilde{x}}$ is non-diagonal.
4 Numerical Simulations
We study the recovery of simulated noisy signals supported on synthetic and real-world graphs to assess the performance of the proposed sampling algorithms in terms of MSE and running time. To this end, we first consider an undirected Erdős–Rényi random graph with edge probability 0.2 [33]. Bandlimited graph signals are generated by taking the leading eigenvectors of the graph adjacency matrix as the basis. The nonzero frequency components are drawn from a zero-mean multivariate Gaussian distribution whose covariance matrix is selected uniformly at random from the set of positive semidefinite (PSD) matrices. Zero-mean Gaussian noise is added to the signal. Algorithms 1 and 2 are run to recover the signal for different sampling set sizes. We compare the MSE performance of the proposed schemes with the state-of-the-art greedy algorithm [10] and the random sampling approaches in [11]. Fig. 1 (top) depicts the MSE versus the sample size, where the results are obtained by averaging over Monte Carlo simulations. As the figure indicates, Algorithms 1 and 2 outperform the random sampling schemes of [11] and perform nearly as well as the greedy sampling algorithm [10]. While not shown here for clarity of presentation, similar patterns were also observed for other workhorse random graphs, e.g., preferential attachment and Barabási–Albert models [33].

Next, we study the performance of the proposed schemes for each individual sampling task (i.e., each Monte Carlo realization). Bandlimited graph signals are generated as before, except that this time we take the leading eigenvectors of the adjacency matrix as the basis. Fig. 1 (bottom) depicts superimposed MSE histograms of Algorithms 1 and 2 as well as the greedy sampling scheme [10] for 100 realizations per method and a fixed sample size. As the figure illustrates, the proposed SDP relaxation and randomized greedy schemes perform well and are comparable with the greedy approach.
Finally, we test Algorithm 2 on the Minnesota road network^2 in order to showcase the scalability of the proposed graph sampling method. To that end, bandlimited graph signals are generated by taking the leading eigenvectors of the graph Laplacian matrix as the basis, where the nonzero frequency components are drawn from a zero-mean multivariate Gaussian distribution with a randomly chosen PSD covariance matrix. The signals are corrupted with additive white Gaussian noise. As expected, Figs. 2 (top) and (bottom) depict trends of decreasing MSE and increasing running time as the sample size grows, respectively. The results are averaged over Monte Carlo simulations run on a commercial laptop with an Intel Core i-series processor. Remarkably, the proposed randomized greedy procedure achieves an order-of-magnitude speedup over the state-of-the-art algorithm in [10] while incurring only a marginal degradation in MSE performance.

^2 https://sparse.tamu.edu/Gleich/minnesota
5 Conclusion
We considered the problem of sampling a bandlimited graph signal in the presence of noise, where the goal is to select a subset of graph nodes of prescribed cardinality so as to minimize the mean square reconstruction error. First, we developed an SDP relaxation method to find an approximate solution to the NP-hard sample-set selection task. Then, the problem was reformulated as the maximization of a monotone weak submodular function, and a novel randomized greedy algorithm was proposed to find a near-optimal sample subset. In addition, we analyzed the performance of the randomized greedy algorithm and showed that the resulting MSE is a constant factor away from the optimal MSE. Unlike prior work, our guarantees do not require stationarity of the graph signal. Simulation studies showed that the proposed sampling algorithms compare favorably to competing alternatives in terms of accuracy and runtime.
References

[1] David I. Shuman, Sunil K. Narang, Pascal Frossard, Antonio Ortega, and Pierre Vandergheynst, "The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains," IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 83–98, 2013.
[2] Aliaksei Sandryhaila and José M. F. Moura, "Discrete signal processing on graphs," IEEE Transactions on Signal Processing, vol. 61, no. 7, pp. 1644–1656, 2013.
[3] Han Shomorony and A. Salman Avestimehr, "Sampling large data on graphs," in IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2014, pp. 933–936.
[4] Mikhail Tsitsvero, Sergio Barbarossa, and Paolo Di Lorenzo, "Signals on graphs: Uncertainty principle and sampling," IEEE Transactions on Signal Processing, vol. 64, no. 18, pp. 4845–4860, 2016.
[5] Aamir Anis, Akshay Gadde, and Antonio Ortega, "Efficient sampling set selection for bandlimited graph signals using graph spectral proxies," IEEE Transactions on Signal Processing, vol. 64, no. 14, pp. 3775–3789, 2016.
[6] Siheng Chen, Rohan Varma, Aliaksei Sandryhaila, and Jelena Kovacevic, "Discrete signal processing on graphs: Sampling theory," IEEE Transactions on Signal Processing, vol. 63, no. 24, pp. 6510–6523, 2015.
[7] Sundeep Prabhakar Chepuri and Geert Leus, "Subsampling for graph power spectrum estimation," in IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM), 2016, pp. 1–5.
[8] Antonio G. Marques, Santiago Segarra, Geert Leus, and Alejandro Ribeiro, "Sampling of graph signals with successive local aggregations," IEEE Transactions on Signal Processing, vol. 64, no. 7, pp. 1832–1843, 2016.
[9] Fernando Gama, Antonio G. Marques, Gonzalo Mateos, and Alejandro Ribeiro, "Rethinking sketching as sampling: Linear transforms of graph signals," in 50th Asilomar Conference on Signals, Systems and Computers, 2016, pp. 522–526.
[10] Luiz F. O. Chamon and Alejandro Ribeiro, "Greedy sampling of graph signals," IEEE Transactions on Signal Processing, 2017.
[11] Siheng Chen, Rohan Varma, Aarti Singh, and Jelena Kovacevic, "Signal recovery on graphs: Fundamental limits of sampling strategies," IEEE Transactions on Signal and Information Processing over Networks, vol. 2, no. 4, pp. 539–554, 2016.
[12] A. G. Marques, S. Segarra, G. Leus, and A. Ribeiro, "Stationary graph processes and spectral estimation," IEEE Transactions on Signal Processing, vol. 65, no. 22, pp. 5911–5926, 2017.
[13] N. Perraudin and P. Vandergheynst, "Stationary signal processing on graphs," IEEE Transactions on Signal Processing, vol. 65, no. 13, pp. 3462–3477, 2017.
[14] B. Girault, "Stationary graph signals using an isometric graph translation," 2015, pp. 1516–1520.
[15] Baharan Mirzasoleiman, Ashwinkumar Badanidiyuru, Amin Karbasi, Jan Vondrák, and Andreas Krause, "Lazier than lazy greedy," in AAAI Conference on Artificial Intelligence, 2015.
[16] Aliaksei Sandryhaila and José M. F. Moura, "Discrete signal processing on graphs: Frequency analysis," IEEE Transactions on Signal Processing, vol. 62, no. 12, pp. 3042–3054, 2014.
[17] Joya A. Deri and José M. F. Moura, "Spectral projector-based graph Fourier transforms," IEEE Journal of Selected Topics in Signal Processing, vol. 11, no. 6, pp. 785–795, 2017.
[18] Rasoul Shafipour, Ali Khodabakhsh, Gonzalo Mateos, and Evdokia Nikolova, "A digraph Fourier transform with spread frequency components," in Proc. of IEEE Global Conference on Signal and Information Processing, 2017.
[19] Stefania Sardellitti, Sergio Barbarossa, and Paolo Di Lorenzo, "On the graph Fourier transform for directed graphs," IEEE Journal of Selected Topics in Signal Processing, vol. 11, no. 6, pp. 796–811, 2017.
[20] Steven M. Kay, Fundamentals of Statistical Signal Processing, Prentice Hall PTR, 1993.
[21] Siddharth Joshi and Stephen Boyd, "Sensor selection via convex optimization," IEEE Transactions on Signal Processing, vol. 57, no. 2, pp. 451–462, 2009.
[22] Manohar Shamaiah, Siddhartha Banerjee, and Haris Vikalo, "Greedy sensor selection under channel uncertainty," IEEE Wireless Communications Letters, vol. 1, no. 4, pp. 376–379, 2012.
[23] Abolfazl Hashemi, Mahsa Ghasemi, Haris Vikalo, and Ufuk Topcu, "A randomized greedy algorithm for near-optimal sensor scheduling in large-scale sensor networks," arXiv preprint arXiv:1709.08823, 2017.
[24] Abhimanyu Das and David Kempe, "Submodular meets spectral: Greedy algorithms for subset selection, sparse approximation and dictionary selection," in Proceedings of the 28th International Conference on Machine Learning, Omnipress, 2011, pp. 1057–1064.
[25] Abolfazl Hashemi and Haris Vikalo, "Sparse linear regression via generalized orthogonal least-squares," in IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2016, pp. 1305–1309.
[26] Roger A. Horn and Charles R. Johnson, Matrix Analysis, Cambridge University Press, 2012.
[27] Sanjeev Arora, Elad Hazan, and Satyen Kale, "Fast algorithms for approximate semidefinite programming using the multiplicative weights update method," in 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2005, pp. 339–348.
[28] Michael Grant, Stephen Boyd, and Yinyu Ye, "CVX: Matlab software for disciplined convex programming," 2008.
[29] Richard Bellman, Introduction to Matrix Analysis, SIAM, 1997.
[30] Leslie G. Valiant, "A theory of the learnable," Communications of the ACM, vol. 27, no. 11, pp. 1134–1142, 1984.
[31] Leslie Valiant, Probably Approximately Correct: Nature's Algorithms for Learning and Prospering in a Complex World, 2013.
[32] Joel A. Tropp, "An introduction to matrix concentration inequalities," Foundations and Trends in Machine Learning, vol. 8, no. 1–2, pp. 1–230, 2015.
[33] Mark Newman, Networks: An Introduction, Oxford University Press, 2010.