
Spatial Graph Convolution Neural Networks for Water Distribution Systems

11/17/2022
by Inaam Ashraf, et al.
Bielefeld University

We investigate the task of missing value estimation in graphs as given by water distribution systems (WDS) based on sparse signals as a representative machine learning challenge in the domain of critical infrastructure. The underlying graphs have a comparably low node degree and high diameter, while information in the graph is globally relevant; hence graph neural networks face the challenge of long-term dependencies. We propose a specific architecture based on message passing which displays excellent results for a number of benchmark tasks in the WDS domain. Further, we investigate a multi-hop variation, which requires considerably fewer resources and opens an avenue towards big WDS graphs.


1 Introduction

Transportation systems, energy grids, and water distribution systems (WDS) constitute parts of our critical infrastructure that are vital to our society and subject to special protective measures and regulations. As they are under increasing strain in the face of limited resources and as they are vulnerable to attacks, their efficient management and continuous monitoring is of great importance. As an example, the average amount of non-revenue water amounts to 25% in the EU [6], making the detection of leaks in WDS an important task. Advances in sensor technology and increasing digitalisation hold the potential for intelligent monitoring and adaptive control using AI technologies [5, 25, 13]. In addition to more classical AI approaches, deep learning technologies are increasingly being used to solve learning tasks in the context of critical infrastructures [4].

A common feature of WDS, energy networks and transport networks is that the data has a temporal and spatial character: data is generated in real time according to an underlying graph, given by the power grid, the pipe network and the transport routes, respectively. Measurements are available for some nodes that correspond to local sensors, e.g. pressure sensors or smart meters. Based on this partial information, the task is to derive corresponding quantities at every node of the graph, to identify the system state, or to derive optimal planning and control strategies. In this paper, we target the first of these tasks: inferring relevant quantities at each location of the graph based on few measurements. While classical deep learning models such as convolutional networks or recurrent models can reliably handle Euclidean data, graphs constitute non-Euclidean data that require techniques from geometric deep learning. Based on initial approaches dating back more than a decade [28, 10], a variety of graph neural networks (GNNs) have recently been proposed that are able to directly process information such as is present in critical infrastructure [2, 3, 7, 14, 29]. First applications demonstrate the suitability of GNNs for the latter [23, 5, 3].

Graphs from the domain of WDS or smart grids display specific characteristics (s. Fig. 3): as they are located in the plane, the node degree is small and the network diameter is large. These characteristics pose a challenge for GNNs, as the problems of long-term dependencies and over-smoothing occur [32, 27]. In this contribution, we design a GNN architecture capable of dealing with these specific graph structures: we show that our spatial GNN is able to effectively integrate long-range node dependencies, and we demonstrate the impact of a suitable transfer function and residual connections. As the required resources quickly become infeasible for big graphs, we also investigate whether a sparse multi-hop alternative achieves comparable results. All methods are evaluated for pressure prediction in WDS on a variety of benchmark networks, displaying promising results.

2 Related Work

The task of pressure estimation at all nodes in a WDS from pressure values available at a few nodes has recently been dealt with [8]. The authors employed spectral graph convolutional neural networks (GCNs) and performed extensive experiments to demonstrate their approach. However, their methodology does not fully benefit from the available structural information of the graph; we provide further details on this in Sec. 4. We propose a spatial GCN based methodology that effectively utilizes the graph structure by using both node and edge features and thus produces significantly better results (s. Sec. 5).

A related task of state (pressure, flow) estimation in WDS based on demand patterns and sparse pressure information has been addressed in [31]. The authors used hydraulics in the optimization objective, since the task was to model the complex hydraulics of the popular EPANET simulator [26] using GNNs. They present promising results only on relatively small WDS; the ability to scale to larger WDS is yet to be investigated. While their model solves the task of state estimation in WDS, their approach requires demand patterns from every consumer also during inference. In contrast, our proposed model relies on pressure values computed by the EPANET solver (based on demand patterns) only during the training process. During evaluation, our model estimates pressures solely based on sparse pressure values obtained from a few sensors. Further, it successfully estimates pressures even in the case of noisy demands (s. Sec. 5).

GNNs were first introduced in [28] as an extension of recursive neural networks for tree structures [11]. Since then, a number of GCN algorithms have been developed, which can be classified into spectral-based and spatial-based approaches. The work [2] introduced spectral GCNs based on spectral graph theory, which was followed by further work [14, 3, 12, 18, 20]. The counterpart are spatial GCNs, which apply a local approximation of the spectral graph kernel [9, 22, 7, 24, 32, 29]; these are also referred to as message passing neural networks.

Unlike convolutional neural networks (CNNs), spatial GCNs suffer from issues like vanishing gradients, over-smoothing and over-fitting when used to build deeper models. Generalized aggregation functions, residual connections and normalization layers can address these issues and improve performance on diverse GCN tasks and large-scale graph datasets [19].

To enable high-level embeddings in feed-forward neural networks, self-normalizing neural networks (SNNs) were introduced [15], based on a special activation function called the scaled exponential linear unit (SeLU). We combine residual connections [19] with SNNs, since residual connections help solve the over-smoothing problem when we use multiple GCN layers, whereas the self-normalizing property of SeLU enables the required information propagation in case of sparse features.

3 Methodology

The main contribution of our work is a spatial GCN capable of efficiently dealing with the specific graph characteristics as present in WDS. We address the estimation of missing node features based on sparse measurements. As we detail below, we employ multiple spatial GCN layers without suffering from the typical problems of vanishing gradients, over-smoothing and over-fitting. For this purpose, we combine residual connections [19] with the SeLU activation function [15]. To decrease the model size, we leverage GCN layers with multiple hops, realizing message passing between more distant neighbors comparable to [21]. Our model employs spatial GCNs using both node and edge features. The complete architecture is depicted in Fig. 1. Formally, a graph is represented as $G = (\mathcal{V}, \mathcal{E}, X, Z)$, where:

  • $\mathcal{V} = \{v_1, \ldots, v_n\}$ is the set of nodes,

  • $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is the set of edges,

  • $X = \{x_i \in \mathbb{R}^{d_x} \mid v_i \in \mathcal{V}\}$ is the set of node features, where $d_x$ is the number of node features,

  • $Z = \{z_{ij} \in \mathbb{R}^{d_z} \mid (v_i, v_j) \in \mathcal{E}\}$ is the set of edge features, where $d_z$ is the number of edge features.

Node and edge features are embedded by fully connected linear layers $f_v$ and $f_e$:

$h_i^{(0)} = f_v(x_i), \qquad e_{ij}^{(0)} = f_e(z_{ij})$ (1)

We denote intermediate model activations as $h_i^{(l)}$ for nodes and $e_{ij}^{(l)}$ for edges. Multiple GCN layers convolve the information from the neighboring nodes for the estimation of node features. Each GCN layer employs the three-step process of message generation, message aggregation and node feature update. In the $l$-th layer, the edge features are updated by a learned map $g_e$:

$e_{ij}^{(l)} = g_e\big(\big[\, e_{ij}^{(l-1)} \,\|\, |h_i^{(l-1)} - h_j^{(l-1)}| \,\big]\big)$ (2)
Figure 1: Model architecture employing multiple GCN layers. Each GCN layer consists of message generation, sum aggregation and a final MLP.

Adding the absolute difference between the current and neighboring nodes' features empirically improves the learning. Then, messages are generated as follows:

$m_{ij}^{(l)} = \big[\, h_j^{(l-1)} \,\|\, e_{ij}^{(l)} \,\big]$ (3)

where $\|$ denotes vector concatenation. After concatenation, we apply the SeLU activation function [15] to all messages, which is given by:

$\mathrm{SeLU}(x) = \lambda \begin{cases} x & \text{if } x > 0 \\ \alpha e^{x} - \alpha & \text{if } x \le 0 \end{cases}$ (4)

where $\lambda$ and $\alpha$ are hyperparameters as in [15]. SeLU's self-normalizing nature greatly improves learning in the light of highly sparse values at the beginning of the training process. All messages from the neighboring nodes are sum-aggregated:

$m_i^{(l)} = \sum_{j \in \mathcal{N}(i)} \mathrm{SeLU}\big(m_{ij}^{(l)}\big)$ (5)

Similar to [19], we add residual connections to the aggregated messages and pass these through a Multi-Layer Perceptron (MLP):

$h_i^{(l)} = \mathrm{MLP}\big(h_i^{(l-1)} + m_i^{(l)}\big)$ (6)

The overall message construction, aggregation and update is [19]:

$h_i^{(l)} = \mathrm{MLP}\Big(h_i^{(l-1)} + \sum_{j \in \mathcal{N}(i)} \mathrm{SeLU}\big(\big[\, h_j^{(l-1)} \,\|\, e_{ij}^{(l)} \,\big]\big)\Big)$ (7)

After employing multiple GCN layers, the resultant node embeddings are fed to a final fully connected linear layer $f_{\mathrm{out}}$ to estimate all node features:

$\hat{x}_i = f_{\mathrm{out}}\big(h_i^{(L)}\big)$ (8)

where $\hat{x}_i$ are the estimated node features and $L$ is the last GCN layer. We use the L1 loss as objective function:

$\mathcal{L} = \frac{1}{S} \sum_{s=1}^{S} \frac{1}{n} \sum_{i=1}^{n} \big| x_i^{(s)} - \hat{x}_i^{(s)} \big|$ (9)

with $x_i^{(s)}$ as ground truth, $n$ as the number of nodes and $S$ as the number of samples in a mini-batch.
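To make the layer computations concrete, the following is a minimal PyTorch sketch of Eqs. (1)–(9) as reconstructed above. All names (MGCNLayer, edge_fn, msg_fn) and the exact parametrisation of the learned maps are our assumptions; only the three-step structure (message generation, sum aggregation, residual MLP update) and the SeLU design follow the text.

```python
import torch
import torch.nn as nn

class MGCNLayer(nn.Module):
    """Sketch of one GCN layer (Eqs. 2-7). The linear maps and hidden
    sizes are assumptions; the message/aggregate/update structure and
    the SeLU + residual design follow the paper's description."""

    def __init__(self, dim: int):
        super().__init__()
        self.edge_fn = nn.Linear(2 * dim, dim)   # Eq. (2): edge update
        self.msg_fn = nn.Linear(2 * dim, dim)    # projects [h_j || e_ij]
        self.mlp = nn.Sequential(                # Eq. (6): node update
            nn.Linear(dim, dim), nn.SELU(), nn.Linear(dim, dim))

    def forward(self, h, e, edge_index):
        # edge_index: (2, E) long tensor, each pipe listed in both directions
        src, dst = edge_index
        # Eq. (2): update edge embeddings with |h_i - h_j|
        e = self.edge_fn(torch.cat([e, (h[dst] - h[src]).abs()], dim=-1))
        # Eqs. (3)+(4): messages from [h_j || e_ij], passed through SeLU
        m = torch.selu(self.msg_fn(torch.cat([h[src], e], dim=-1)))
        # Eq. (5): sum-aggregate incoming messages at each node i
        agg = torch.zeros_like(h).index_add_(0, dst, m)
        # Eq. (6): residual connection followed by the MLP
        return self.mlp(h + agg), e

# Eq. (1) embeddings, stacked layers, Eq. (8) readout and Eq. (9) loss:
embed_v, embed_e = nn.Linear(1, 32), nn.Linear(2, 32)
layers = nn.ModuleList(MGCNLayer(32) for _ in range(5))
readout = nn.Linear(32, 1)
loss_fn = nn.L1Loss()
```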

Figure 2: Model architecture employing multiple multi-hop GCN layers.
Multi-hop Variation

Given the sparsity and size of a graph, our methodology requires a comparably large number of GCN layers, proportional to the size of the graph. This reduces scalability to larger graphs. To reduce the number of parameters, we propose GCN layers with multiple hops, as shown in Fig. 2. Specifically, message generation and aggregation are repeated in each GCN layer before passing the result to the MLP:

$h_i^{(l,k)} = \sum_{j \in \mathcal{N}(i)} \mathrm{SeLU}\big(\big[\, h_j^{(l,k-1)} \,\|\, e_{ij}^{(l)} \,\big]\big), \qquad k = 1, \ldots, K, \quad h_i^{(l,0)} = h_i^{(l-1)}$ (10)

with $K$ as the number of hops. The embedding for the next layer is:

$h_i^{(l)} = \mathrm{MLP}\big(h_i^{(l-1)} + h_i^{(l,K)}\big)$ (11)

This enables the model to gather information from neighbors that are multiple hops away, requiring fewer GCN layers.
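A corresponding sketch of the multi-hop layer, extending the single-hop sketch above: the inner loop repeats message generation and aggregation K times before the single MLP update, so one layer reaches K-hop neighbors. Whether the message weights and edge embeddings are shared across hops is our assumption.

```python
class MultiHopMGCNLayer(MGCNLayer):
    """Multi-hop variant (Eqs. 10-11): K rounds of message generation
    and sum aggregation share one edge update and one MLP update."""

    def __init__(self, dim: int, hops: int):
        super().__init__(dim)
        self.hops = hops

    def forward(self, h, e, edge_index):
        src, dst = edge_index
        e = self.edge_fn(torch.cat([e, (h[dst] - h[src]).abs()], dim=-1))
        agg = h
        for _ in range(self.hops):  # Eq. (10): repeat generation+aggregation
            m = torch.selu(self.msg_fn(torch.cat([agg[src], e], dim=-1)))
            agg = torch.zeros_like(h).index_add_(0, dst, m)
        return self.mlp(h + agg), e  # Eq. (11)
```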

4 Experiments

Figure 3: L-Town Water Distribution System ([30]) – nodes in red have sensors.

The methodology can be applied to missing node feature estimation on any graph. Here, we investigate WDS, which are modelled as graphs by representing junctions as nodes and pipes between junctions as edges. WDS are especially challenging because pressure sensors are installed at only a few nodes due to constraints (size of the system, cost, availability, practicality) [17], resulting in graphs with sparse feature information. Additionally, the node degree in WDS is usually low (s. Tab. 1). These properties can be observed in the popular L-Town WDS [30] shown in Fig. 3. Such characteristics require GNNs to model long-range dependencies between nodes to properly integrate the available information.

WDS                      Anytown        C-Town         L-Town         Richmond
Number of junctions      22             388            785            865
Number of pipes          41             429            909            949
Diameter                 5              66             79             234
Degree (min, mean, max)  (1, 3.60, 7)   (1, 2.24, 4)   (1, 2.32, 5)   (1, 2.19, 4)
Table 1: Major attributes of WDS.
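The attributes in Table 1 can be checked directly from the benchmark .inp files. A short sketch using wntr and networkx; the file path is an assumption, and tanks or reservoirs connected by pipes are included as graph nodes here:

```python
import networkx as nx
import wntr

# Compute Table-1-style attributes for a WDS .inp file (path assumed).
wn = wntr.network.WaterNetworkModel('networks/CTOWN.inp')

G = nx.Graph()
G.add_nodes_from(wn.junction_name_list)
for name in wn.pipe_name_list:
    pipe = wn.get_link(name)
    G.add_edge(pipe.start_node_name, pipe.end_node_name)

degrees = [d for _, d in G.degree()]
print('junctions:', len(wn.junction_name_list), '| pipes:', G.number_of_edges())
print('diameter:', nx.diameter(G))  # assumes a connected graph
print('degree (min, mean, max):', min(degrees),
      round(sum(degrees) / len(degrees), 2), max(degrees))
```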

To the best of our knowledge, the task of node feature estimation in WDS using GNNs based on sparse features has only been dealt with by [8]. These researchers compared their model to two non-GNN baselines: the first baseline uses the mean of the known node features as the value for all unknown node features, the second baseline uses interpolated regularization [1]. The work [8] demonstrates that the GNN model significantly outperforms both baselines. Therefore, in our experiments, we compare our approach only to the GNN model, ChebNet, of [8]. We run two experiments on simulated data. First, we compare our approach to [8] on the three WDS datasets Anytown, C-Town, and Richmond. Second, we conduct an in-depth evaluation on L-Town with extensive hyperparameter tuning.

4.1 Datasets

We use a total of four WDS datasets for our experiments: Anytown, C-Town, L-Town and Richmond (https://engineering.exeter.ac.uk/research/cws/resources/benchmarks/#a8, https://www.batadal.net/data.html, [30]). Major attributes of the WDS are listed in Table 1. We use the dataset generation methodology of [8] for three of the WDS (Anytown, C-Town, Richmond) and record 1000 consecutive time steps for each of the three networks. For each network, we use three different sparsity levels, i.e. sensor ratios of 0.05, 0.1 and 0.2. We do not evaluate on the sparsity levels of 0.4 and 0.8 used in [8], since these are easier. We sample 5 different random sensor configurations for each sparsity level and each WDS instead of 20.

Model: ChebNet            Anytown          C-Town            Richmond          L-Town
No. of layers             4                4                 4                 4
Degrees (K)               [39, 43, 45, 1]  [200, 200, 20, 1] [240, 120, 20, 1] [240, 120, 20, 1]
No. of filters (F)        [14, 20, 27, 1]  [60, 60, 30, 1]   [120, 60, 30, 1]  [120, 60, 30, 1]
Parameters (million)      0.038            0.780             0.958             0.929

Model: m-GCN              Anytown          C-Town            Richmond          L-Town (45 × 1)   L-Town (10 × 5)
No. of GCN layers         5                33                60                45                10
No. of hops               1                2                 3                 1                 5
No. of MLP layers         2                2                 2                 2                 2
Latent dimension          32               32                48                96                96
Parameters (million)      0.031            0.203             0.830             2.488             0.553
Table 2: Model hyperparameters and parameter counts.

For the popular L-Town network, we use only a single configuration of sensors as designed by [30], which gives a sensor ratio of 0.0422. We use two different sets of simulation settings: one with smooth toy demands and the other close to actual noisy demand patterns. The simulations are carried out using EPANET [26] via the Python package wntr [16]. Samples are generated every 15 minutes, resulting in 96 samples per day. We use one month of data for training (2880 samples) and evaluate on the data of the next two months (5760 samples). The training data is divided into train, validation and test splits with a 60-20-20 ratio.
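A minimal sketch of this simulation setup with wntr; the .inp path is an assumption, and the construction of the smooth versus noisy demand patterns is omitted:

```python
import wntr

# Simulate a WDS with EPANET via wntr and record nodal pressures
# every 15 minutes (96 samples per day, one month shown here).
wn = wntr.network.WaterNetworkModel('networks/L-TOWN.inp')
wn.options.time.duration = 30 * 24 * 3600     # one month, in seconds
wn.options.time.hydraulic_timestep = 15 * 60  # 15-minute steps
wn.options.time.report_timestep = 15 * 60

sim = wntr.sim.EpanetSimulator(wn)
results = sim.run_sim()

# (time steps x junctions) pressure matrix used as training data
pressures = results.node['pressure'][wn.junction_name_list]
print(pressures.shape)
```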

4.2 Training setup

The model parameters are summarized in Table 2. All models are implemented in PyTorch and trained using the Adam optimizer. For the ChebNet baseline [8], we set a learning rate of 3e-4 and a weight decay of 6e-6. For our m-GCN models, we use a learning rate of 1e-5 and no weight decay. We now describe the training setup of the ChebNet baseline and our m-GCN model for the two experiments, respectively.

For the first experiment, the models are trained for 2000 epochs. We set an early-stopping criterion that stops training after 250 epochs if the change in loss is no larger than 1e-6. We configure ChebNet similar to [8]. The input is masked as per the sensor ratio and the mask is concatenated with the pressure values; hence there are two node features. ChebNet can only use scalar edge features, i.e. edge weights. Out of the three types of edge weights used by [8] (binary, weighted, logarithmically weighted), we use the binary weights, since the other types did not increase performance. For our model (m-GCN), we did not perform an extensive hyperparameter search, since we achieved considerably better results than the ChebNet model of [8] with a set of intuitive hyperparameter values. We use a single-hop configuration for Anytown and multi-hop architectures for the C-Town and Richmond WDS. We only use masked pressure values as input, i.e. one node feature. Further, we use two edge features, namely pipe length and diameter.
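A sketch of this sparse input construction: sample a sensor set at a given ratio, zero out the non-sensor pressures, and optionally concatenate the binary mask as a second feature for ChebNet. Function and variable names are ours, not from the original code:

```python
import numpy as np

def make_sparse_input(pressures, ratio, rng, with_mask=False):
    """Zero out pressures at non-sensor nodes; `pressures` is a
    (nodes,) vector and `ratio` the fraction of sensor nodes."""
    n = pressures.shape[0]
    sensors = rng.choice(n, size=max(1, int(round(ratio * n))), replace=False)
    mask = np.zeros(n)
    mask[sensors] = 1.0
    x = pressures * mask
    if with_mask:
        return np.stack([x, mask], axis=-1)  # ChebNet: value + mask
    return x[:, None]                        # m-GCN: one node feature

rng = np.random.default_rng(0)
x = make_sparse_input(rng.random(388), ratio=0.05, rng=rng, with_mask=True)
```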

For the second, in-depth evaluation on L-Town, we dropped the second node feature for ChebNet, since this significantly improved the results. We use the ChebNet model configuration used for the Richmond WDS by the authors. We train our m-GCN model with two configurations: one with the default single hop and the second with multiple hops, as listed in Table 2. For both m-GCN models, we add a third edge feature, namely a pressure reducing valve (PRV) mask. PRVs are used at certain connections in a WDS to reduce pressure; hence these edges should be modeled differently. We use a binary mask to pass this information to the model, which helps to improve the pressure estimation at neighboring nodes. We train all three models for 5000 epochs without early stopping.
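A sketch of how the three edge features (pipe length, diameter, binary PRV mask) could be assembled from a wntr model. The file path, the zero length assigned to valve links, and the absence of feature scaling are assumptions:

```python
import numpy as np
import wntr

wn = wntr.network.WaterNetworkModel('networks/L-TOWN.inp')

edge_index, edge_feats = [], []
for name in wn.pipe_name_list:
    pipe = wn.get_link(name)
    edge_index.append((pipe.start_node_name, pipe.end_node_name))
    edge_feats.append([pipe.length, pipe.diameter, 0.0])  # PRV mask = 0

# PRVs are separate valve links; mark them via the binary third feature
for name in wn.valve_name_list:
    valve = wn.get_link(name)
    if valve.valve_type == 'PRV':
        edge_index.append((valve.start_node_name, valve.end_node_name))
        edge_feats.append([0.0, valve.diameter, 1.0])     # PRV mask = 1

edge_feats = np.asarray(edge_feats, dtype=np.float32)
```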

5 Results

WDS Anytown C-Town Richmond
Ratio Error (%) ChebNet m-GCN Diff ChebNet m-GCN Diff ChebNet m-GCN Diff
0.05 All 54.19 53.15 -1.04 12.88 9.77 -3.11 4.34 2.17 -2.17
Sensor 7.06 3.77 -3.28 7.50 4.61 -2.89 3.47 1.81 -1.66
Non-sensor 56.44 55.50 -0.94 13.16 10.04 -3.12 4.38 2.19 -2.19
0.1 All 35.43 34.85 -0.57 8.16 5.47 -2.69 3.86 1.93 -1.93
Sensor 6.66 7.19 0.53 7.10 4.83 -2.27 3.45 2.02 -1.43
Non-sensor 38.3 37.62 -0.68 8.28 5.55 -2.73 3.90 1.92 -1.98
0.2 All 14.98 13.51 -1.47 7.05 5.58 -1.47 3.24 1.59 -1.65
Sensor 5.40 3.06 -2.34 6.46 5.46 -1.00 3.03 1.62 -1.40
Non-sensor 17.11 15.83 -1.28 7.20 5.61 -1.59 3.29 1.59 -1.71
Table 3: Mean errors across nodes and samples, for 5 different sensor configurations and 3 different sensor ratios.
Comparison with spectral GCN-based approach

First, we compare our model with the work of [8] using their datasets and training settings. The results of the experiments on the Anytown, C-Town and Richmond WDS are shown in Table 3. Here, we evaluate on the basis of the mean relative absolute error given by:

$\mathrm{MRAE} = \frac{1}{S} \sum_{s=1}^{S} \frac{1}{n} \sum_{i=1}^{n} \frac{\big| x_i^{(s)} - \hat{x}_i^{(s)} \big|}{\big| x_i^{(s)} \big|} \cdot 100\%$ (12)

Since Anytown is a much smaller WDS, the sensor ratios translate to very few sensors (0.05: 1 sensor, 0.1: 2 sensors, 0.2: 4 sensors). Hence, neither model estimates the pressures accurately in these cases. The number of available sensors is larger for both the C-Town and Richmond WDS, even for the smallest ratio, which naturally increases performance. As can be seen, m-GCN outperforms ChebNet [8] by a considerable margin.
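A direct implementation of this metric, mirroring our reconstruction of Eq. (12); the epsilon guard against division by zero is our addition:

```python
import torch

def mean_relative_absolute_error(x_hat, x, eps=1e-8):
    """Percent MRAE over nodes and samples, per Eq. (12) as
    reconstructed above; `eps` avoids division by zero."""
    return 100.0 * ((x_hat - x).abs() / x.abs().clamp_min(eps)).mean()
```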

Figure 4: Mean relative absolute errors for all nodes on noisy data for L-Town WDS.
Figure 5: Estimation results of m-GCN and ChebNet compared to ground truth on L-Town.
Detailed analysis on L-Town

We present a more in-depth analysis of the evaluation results on L-Town. Mean relative absolute errors for the ChebNet and single-hop m-GCN models are plotted in Fig. 4. Both models are trained on smooth data and evaluated on noisy, realistic data. As can be seen, error values for m-GCN are much lower across all nodes compared to ChebNet. We plot four-day time series for two nodes in Fig. 5. The first node (top plot) has an installed sensor, hence the model receives the ground-truth value as input and only has to reconstruct it. The second node (bottom plot) does not have an installed sensor and the model receives zero input. As depicted, m-GCN successfully reconstructs and estimates both nodes, whereas the results from ChebNet suffer considerable errors. There are areas in the L-Town WDS where pressure values are essentially stagnant up to some noise. As shown in Fig. 6, our m-GCN is able to model those nodes correctly. In contrast, spectral convolutions do not fully take the graph structure into account and thus end up imposing the seasonality of nodes from other areas of the graph onto the nodes in this area.

Similar to our first experiment, we present mean relative absolute error values for all, sensor and non-sensor nodes for L-Town in Table 4. Our model produces significantly better results compared to ChebNet. Since our model is based on neighborhood aggregation, the number of GCN layers required grows with the size of the graph. In order to reduce the number of layers and model parameters, we also trained our model with only 10 GCN layers of 5 hops each. As evident from Table 4, this reduces the number of parameters by almost a factor of five at the expense of some performance. Nevertheless, it is still significantly better than the baseline ChebNet model. Our main motivation here is that the multi-hop approach makes the model more scalable to larger graphs. Further, it is a step towards developing a generalized version of the model that can work for different sensor configurations and/or different graph sizes without hyperparameter tuning and re-training.

Figure 6: Estimation results of m-GCN and ChebNet compared to ground truth on nodes from an area in L-Town with essentially stagnant pressure values.
Model            Error (%): All    Sensor          Non-sensor
Smooth Data
ChebNet          2.55 ± 2.87       2.38 ± 3.55     2.55 ± 2.83
m-GCN (45 × 1)   0.39 ± 0.37       0.43 ± 0.52     0.39 ± 0.36
m-GCN (10 × 5)   0.83 ± 0.68       0.74 ± 0.59     0.83 ± 0.69
Noisy Data
ChebNet          2.92 ± 3.35       2.78 ± 4.02     2.93 ± 3.32
m-GCN (45 × 1)   0.54 ± 0.75       0.64 ± 1.06     0.53 ± 0.73
m-GCN (10 × 5)   0.90 ± 0.82       0.81 ± 0.74     0.90 ± 0.83
Table 4: Mean errors across nodes and samples on L-Town.

6 Conclusion

We have proposed a spatial GCN which is particularly suited for tasks on graphs with small node degree and sparse node features, since it is able to model long-term dependencies. We have demonstrated its suitability for node pressure inference based on sparse measurement values as an important and representative task from the domain of WDS, displaying its behavior on a number of benchmarks. Notably, the model generalizes not only across time windows, but also from noise-less toy demand signals to realistic ones. In addition to a very good performance overall, we also proposed first steps towards the challenge of scalability to larger graphs by introducing multi-hop architectures with considerably fewer parameters compared to fully connected deep ones. In future work, we will investigate the behavior for larger networks based on these first results. Moreover, unlike simulation tools in the domain, the GNN has the potential to generalize over different graph structures, including partially faulty ones. We will evaluate this capability in future work.

Acknowledgements

We gratefully acknowledge funding from the European Research Council (ERC) under the ERC Synergy Grant Water-Futures (Grant agreement No. 951424). This research was also supported by the research training group “Dataninja” (Trustworthy AI for Seamless Problem Solving: Next Generation Intelligence Joins Robust Data Analysis) funded by the German federal state of North Rhine-Westphalia, and by funding from the VW-Foundation for the project IMPACT funded in the frame of the funding line AI and its Implications for Future Society.

References

  • [1] M. Belkin, I. Matveeva, and P. Niyogi (2004) Regularization and semi-supervised learning on large graphs. In Learning Theory, J. Shawe-Taylor and Y. Singer (Eds.), Berlin, Heidelberg, pp. 624–638.
  • [2] J. Bruna, W. Zaremba, A. D. Szlam, and Y. LeCun (2014) Spectral networks and locally connected networks on graphs. CoRR.
  • [3] M. Defferrard, X. Bresson, and P. Vandergheynst (2016) Convolutional neural networks on graphs with fast localized spectral filtering. NIPS 29, pp. 3844–3852.
  • [4] K. Dick, L. Russell, Y. S. Dosso, F. Kwamena, and J. R. Green (2019) Deep learning for critical infrastructure resilience. JIS 25 (2), pp. 05019003.
  • [5] C. Eichenberger et al. (2022) Traffic4cast at NeurIPS 2021 – temporal and spatial few-shot transfer learning in gridded geo-spatial processes. In Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, Vol. 176, pp. 97–112.
  • [6] EurEau (2021) Europe's water in figures.
  • [7] H. Gao, Z. Wang, and S. Ji (2018) Large-scale learnable graph convolutional networks. In SIGKDD, pp. 1416–1424.
  • [8] G. Hajgató, B. Gyires-Tóth, and G. Paál (2021) Reconstructing nodal pressures in water distribution systems with graph neural networks. arXiv.
  • [9] W. L. Hamilton, R. Ying, and J. Leskovec (2017) Inductive representation learning on large graphs. In NIPS, pp. 1025–1035.
  • [10] B. Hammer, A. Micheli, and A. Sperduti (2005) Universal approximation capability of cascade correlation for structures. Neural Comput. 17 (5), pp. 1109–1159.
  • [11] B. Hammer (2000) Learning with recurrent neural networks. Lecture Notes in Control and Information Sciences, Vol. 254, Springer.
  • [12] M. Henaff, J. Bruna, and Y. LeCun (2015) Deep convolutional networks on graph-structured data. arXiv preprint arXiv:1506.05163.
  • [13] M. Kammoun, A. Kammoun, and M. Abid (2022) Leak detection methods in water distribution networks: a comparative survey on artificial intelligence applications. Journal of Pipeline Systems Engineering and Practice 13 (3), pp. 04022024.
  • [14] T. N. Kipf and M. Welling (2017) Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations (ICLR).
  • [15] G. Klambauer, T. Unterthiner, A. Mayr, and S. Hochreiter (2017) Self-normalizing neural networks. In NIPS, pp. 972–981.
  • [16] K. A. Klise, R. Murray, and T. Haxton (2018) An overview of the water network tool for resilience (WNTR).
  • [17] K. A. Klise, C. A. Phillips, and R. J. Janke (2013) Two-tiered sensor placement for large water distribution network models. JIS 19 (4), pp. 465–473.
  • [18] R. Levie, F. Monti, X. Bresson, and M. M. Bronstein (2018) CayleyNets: graph convolutional neural networks with complex rational spectral filters. IEEE Transactions on Signal Processing 67 (1), pp. 97–109.
  • [19] G. Li, C. Xiong, A. Thabet, and B. Ghanem (2020) DeeperGCN: all you need to train deeper GCNs. arXiv:2006.07739.
  • [20] R. Li, S. Wang, F. Zhu, and J. Huang (2018) Adaptive graph convolutional neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32.
  • [21] Y. Li, R. Yu, C. Shahabi, and Y. Liu (2017) Diffusion convolutional recurrent neural network: data-driven traffic forecasting. arXiv.
  • [22] F. Monti, D. Boscaini, J. Masci, E. Rodola, J. Svoboda, and M. M. Bronstein (2017) Geometric deep learning on graphs and manifolds using mixture model CNNs. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 5115–5124.
  • [23] S. P. Nandanoori, S. Guan, S. Kundu, S. Pal, K. Agarwal, Y. Wu, and S. Choudhury (2022) Graph neural network and Koopman models for learning networked dynamics: a comparative study on power grid transients prediction. arXiv.
  • [24] M. Niepert, M. Ahmed, and K. Kutzkov (2016) Learning convolutional neural networks for graphs. In ICML, pp. 2014–2023.
  • [25] O. A. Omitaomu and H. Niu (2021) Artificial intelligence techniques in smart grid: a survey. Smart Cities 4 (2), pp. 548–568.
  • [26] L. Rossman, H. Woo, M. Tryby, F. Shang, R. Janke, and T. Haxton (2020) EPANET 2.2 user's manual, Water Infrastructure Division. CESER.
  • [27] R. Sato (2020) A survey on the expressive power of graph neural networks. CoRR, arXiv:2003.04078.
  • [28] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini (2009) The graph neural network model. IEEE Transactions on Neural Networks 20 (1), pp. 61–80.
  • [29] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio (2018) Graph attention networks. In ICLR.
  • [30] S. G. Vrachimis, D. G. Eliades, R. Taormina, A. Ostfeld, Z. Kapelan, S. Liu, M. Kyriakou, P. Pavlou, M. Qiu, and M. M. Polycarpou (2020) BattLeDIM: battle of the leakage detection and isolation methods. In CCWI/WDSA Joint Conference.
  • [31] L. Xing and L. Sela (2022) Graph neural networks for state estimation in water distribution systems: application of supervised and semisupervised learning. Journal of Water Resources Planning and Management 148 (5).
  • [32] K. Xu, W. Hu, J. Leskovec, and S. Jegelka (2018) How powerful are graph neural networks? arXiv preprint arXiv:1810.00826.