In Search of Lost Edges: A Case Study on Reconstructing Financial Networks

09/03/2019 ∙ by Michael Lebacher, et al. ∙ Financial Network Analytics, Universität München, Humboldt-Universität zu Berlin

To capture the systemic complexity of international financial systems, network data is an important prerequisite. However, dyadic data is often not available, raising the need for methods that reconstruct networks from limited information. In this paper, we review different methods designed to estimate matrices from their marginals and, potentially, exogenous information. This includes a general discussion of the available methodology that provides edge probabilities as well as models that focus on the reconstruction of edge values. Besides summarizing the advantages, shortfalls and computational issues of the approaches, we compare them competitively using the SWIFT (Society for Worldwide Interbank Financial Telecommunication) MT 103 payment messages network (MT 103: Single Customer Credit Transfer). This network is not only economically meaningful but also fully observed, which allows for an extensive horse race of methods. The comparison of the binary reconstruction is divided into an evaluation of the edge probabilities and of the quality of the reconstructed degree structures. Furthermore, the accuracy of the predicted edge values is investigated. To test the methods on different topologies, the application is split into two parts. The first part considers the full MT 103 network, illustrating the reconstruction of large, sparse financial networks. The second part is concerned with reconstructing a subset of the full network, representing a dense, medium-sized network. Substantively, we find that no method is superior in every respect and that the preferred model choice depends strongly on the goal of the analysis, the presumed network structure and the availability of exogenous information.


1 Introduction

In recent years, interest in applying network-based methodology to financial data has strongly increased (see e.g. Soramäki et al., 2007, Schweitzer et al., 2009, Imakubo et al., 2010, Baek et al., 2014, Battiston et al., 2016). A large share of this research effort is directed towards the study and assessment of systemic risk (see e.g. Gai and Kapadia, 2010, Kauê Dal’Maso Peron et al., 2012, Billio et al., 2012, Chinazzi et al., 2013, Thurner and Poledna, 2013, Soramäki and Cook, 2013, Bardoscia et al., 2017 and Caccioli et al., 2018). This focus stems from the fact that in the aftermath of the financial crisis it became clear that the banking system forms a complex network with inherent interdependencies and feedback loops. As a consequence, the centrality and connectedness of a financial institution can be just as important as its size for its potential to wreak havoc on the system overall (Markose et al., 2012, Liu et al., 2015). Battiston et al. (2012) even suggest adding the term “too-central-to-fail” to the discussion of “too-big-to-fail” institutions. Given that, investigating the topologies of financial networks is very important for regulators, central banks and other institutions concerned with the stability of the financial system. Although considerable effort is put into modeling systemic risk, those methods generally require information from the full network, which is most often not observed. This raises the need for a methodology that provides an accurate reconstruction of networks from the limited information available.

The canonical examples of network reconstruction in finance are exposure networks created by interbank loans. In these networks, the total assets and liabilities of a given bank are mostly known, but the actual loans made to other banks, i.e. the binary edge structure (existence or non-existence of loans) and their corresponding edge weights (loan volume), are unobserved. Knowledge of the edges and their values is nevertheless crucial to measure the systemic risk in the exposure network. If one bank fails to meet its obligations, its creditor(s) may become unable to meet their own obligations, which leads to further contagion, potentially affecting all banks or a large portion of the network. An example of how to process such information, if available, is DebtRank (Battiston et al., 2012), a popular metric for assessing systemic importance in exposure networks based on the values of loans between bank pairs.

Although the reconstruction problem is introduced here as a task that belongs to the realms of Finance or Economics, it emerges in many different disciplines. For an overview from the perspective of Economics, see for example Sheldon and Maurer (1998), Upper (2011) and Elsinger et al. (2013). The article by Squartini et al. (2018) provides a very broad overview from a methodological perspective, based on maximum-entropy methods and Statistical Physics (see also Cimini et al., 2015 and Mastrandrea et al., 2014). In Computer Sciences and Statistics, a similar problem is often called traffic matrix estimation or network tomography, and this research branch developed its own methodological toolkit (e.g. Castro et al., 2004, Zhang et al., 2003b, Airoldi and Blocker, 2013, Zhou et al., 2016 and Nie et al., 2017). In this paper, not all models proposed in the different research fields can be included, but we have selected the ones that are feasible and potentially useful for the given data situation.

A good reference point for this paper is certainly the extensive study by Anand et al. (2018). In their paper, they apply seven different reconstruction methods to 25 different networks. Although we are not that ambitious regarding the variety of use cases, our paper can be seen as a related study that focusses on other aspects. First of all, we do not restrict our methodology to methods that rely only on aggregated row and column sums but also include density-calibrated methods and models that are capable of incorporating exogenous covariates. Further, we propose regularized least-squares models inspired by the network tomography literature and new methodology not considered by Anand et al. (2018). Additionally, we provide a more detailed technical exposition of the models in a manner that is comprehensible for practitioners. Regarding the evaluation techniques, we separate the evaluation of the binary and valued reconstruction more clearly and employ measures that are more standard in Statistics and Machine Learning.

To compare the different models, we use data provided by the Society for Worldwide Interbank Financial Telecommunication (SWIFT, www.swift.com). SWIFT acts as an infrastructure for financial institutions and enables them to send and receive information about financial transactions encoded in the form of secure standardized messages. One of the most important types of messages is the MT 103 single customer credit transfer, representing payments sent between clients of financial institutions. The MT 103 data under study consist of monthly bilateral message counts aggregated at the country level between January 2003 and February 2018. Note that the concept of a country here is not limited to independent, passport-granting states, but also includes territories (e.g. Turks and Caicos Islands), dependencies (e.g. Guernsey and Jersey) and autonomous constituent states (e.g. Greenland).

The SWIFT network is especially suitable for testing network reconstruction methods because it is an economically meaningful data set (see Cook and Soramaki, 2014 for an extensive investigation). Further, the data provides a long time series with full link data available for testing. Hence, in this dataset, it is known exactly how well different methods work, which allows the models to be compared.

The article is structured as follows. In Section 2 we formalize the problem and give general notation for the paper. This is followed by a description of the SWIFT data in Section 3. In Section 4 we introduce the models under study and their evaluation is provided in Section 5. Section 6 discusses the results and concludes the paper. The code is provided online on GitHub at https://github.com/lebachem/lost_edges. Because the data set used is confidential, the code is not accompanied by the actual data but by a “fake dataset” that does not represent the original data and only shares its dimension and a similar density.

2 Notation

The SWIFT MT 103 messages can be represented as a series of $n \times n$ matrices containing dyadic count data. The elements $x_{ij}^t$ can be interpreted as directed edge values among countries $i, j = 1, \dots, n$ at time points $t = 1, \dots, T$. We exclude self-loops from our study and, therefore, elements $x_{ij}^t$ are left undefined for $i = j$. Accordingly, within-country payments are not regarded. We also assume that the number of nodes is invariant with respect to time so that at each time point the number of variables is given by $N = n(n-1)$.

2.1 Binary Network Structure

Although the binary network structure is readily available if the valued structure is given, both aspects of the network need to be modelled separately. To account for this, we also introduce notation for the binary network structure. Let $\mathbf{a}^t$ denote the binary networks, defined via

$$a_{ij}^t = \mathbb{1}(x_{ij}^t > 0), \quad i \neq j,$$

with elements being indicators whether the corresponding entry $x_{ij}^t$ of the valued network is zero or greater than zero. Let the density (also called the connectivity) of the network be

$$\rho^t = \frac{1}{n(n-1)} \sum_{i \neq j} a_{ij}^t,$$

providing the number of non-zero edges in the network relative to the $n(n-1)$ possible edges among the $n$ nodes at time point $t$. Additionally, we define the number of outgoing edges to be the outdegree, and the number of ingoing edges is measured by the indegree. Formally, the outdegree and the indegree for node $i$ at time point $t$ are given by

$$d_i^{\text{out},t} = \sum_{j \neq i} a_{ij}^t \quad \text{and} \quad d_i^{\text{in},t} = \sum_{j \neq i} a_{ji}^t. \tag{1}$$
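As an illustration of the definitions in this subsection, the following minimal Python sketch derives the binary structure, the density and the degrees from a small valued matrix. The toy matrix and all function names are assumptions made for this example and are not taken from the original study or its code repository.

```python
def binarize(x):
    """Indicator matrix: 1 where an off-diagonal entry of x is positive, else 0."""
    n = len(x)
    return [[1 if i != j and x[i][j] > 0 else 0 for j in range(n)]
            for i in range(n)]


def density(a):
    """Number of edges relative to the n(n-1) possible directed edges."""
    n = len(a)
    edges = sum(a[i][j] for i in range(n) for j in range(n) if i != j)
    return edges / (n * (n - 1))


def out_degrees(a):
    """Row sums of the binary matrix (the diagonal is zero by construction)."""
    return [sum(row) for row in a]


def in_degrees(a):
    """Column sums of the binary matrix."""
    n = len(a)
    return [sum(a[i][j] for i in range(n)) for j in range(n)]


# Hypothetical valued network with three nodes (diagonal unused).
x = [[0, 5, 0],
     [2, 0, 0],
     [1, 3, 0]]

a = binarize(x)
print(a, density(a), out_degrees(a), in_degrees(a))
```

In this toy example the third node receives nothing, so its indegree is zero while its outdegree is two.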

2.2 Valued Network

Similarly, we are interested in the row and column sums of the valued network, i.e. the valued in- and outdegree. Other names in the network literature describing the same concepts are the in-strength and out-strength or the weighted in- and outdegree. Let the $i$th valued outdegree and valued indegree be

$$x_{i\bullet}^t = \sum_{j \neq i} x_{ij}^t \quad \text{and} \quad x_{\bullet i}^t = \sum_{j \neq i} x_{ji}^t. \tag{2}$$

For a more compact formulation, we stack the row and column sums, resulting in a $2n$-dimensional column vector of marginals

$$\mathbf{y}^t = (x_{1\bullet}^t, \dots, x_{n\bullet}^t, x_{\bullet 1}^t, \dots, x_{\bullet n}^t)^\top.$$

Furthermore, let

$$\mathbf{x}^t = (x_{12}^t, \dots, x_{1n}^t, x_{21}^t, \dots, x_{n,n-1}^t)^\top$$

be an $N$-dimensional column vector, with $N = n(n-1)$, containing the values of the edges (without diagonal elements) and define the known binary routing matrix $\mathbf{A}$ of dimension $2n \times N$ such that the linear relation

$$\mathbf{A}\mathbf{x}^t = \mathbf{y}^t \tag{3}$$

holds for $t = 1, \dots, T$. Note that relation (3) is just a compact way of writing equations (2) in matrix notation. Henceforth, we will refer to relation (3) as the marginal restrictions. The restriction that all matrix entries are non-negative is referred to as the non-negativity constraint. If we refer to methods that yield stochastic solutions, we adopt the nomenclature from Physics and label a collection of sampled networks as a network ensemble (e.g. Bargigli, 2014). In the model description we will suppress the time-superscript in most representations for ease of notation.
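To make the marginal restrictions concrete, the sketch below builds a binary routing matrix for a three-node network and checks that multiplying it with the stacked edge vector returns the stacked row and column sums. The row-wise ordering of the edges, the toy matrix and the helper names are assumptions of this example.

```python
def off_diagonal_pairs(n):
    """Row-wise ordering of all directed dyads (i, j) with i != j."""
    return [(i, j) for i in range(n) for j in range(n) if i != j]


def routing_matrix(n):
    """Binary 2n x n(n-1) matrix mapping edge values to row and column sums."""
    pairs = off_diagonal_pairs(n)
    A = [[0] * len(pairs) for _ in range(2 * n)]
    for k, (i, j) in enumerate(pairs):
        A[i][k] = 1        # edge (i, j) contributes to the row sum of node i
        A[n + j][k] = 1    # and to the column sum of node j
    return A


def matvec(A, v):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


# Hypothetical valued network with three nodes.
x_mat = [[0, 5, 0],
         [2, 0, 0],
         [1, 3, 0]]
n = len(x_mat)

A = routing_matrix(n)
x_vec = [x_mat[i][j] for i, j in off_diagonal_pairs(n)]
y = matvec(A, x_vec)   # first n entries: row sums, last n entries: column sums
print(y)
```

Every column of the routing matrix contains exactly two ones, because each edge enters one row sum and one column sum.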

As a general convention, vectors and matrices are given in bold and (with the exception of the deterministic routing matrix $\mathbf{A}$), random variables are given by upper case and realisations by lower case letters.

3 Data description

Figure 1: Time series of valued edges in the full MT 103 network on a monthly basis. Messages per edge on the vertical axis, time measured in months on the horizontal axis.
Source: SWIFT BI Watch.

The data under study is provided by the Society for Worldwide Interbank Financial Telecommunication (SWIFT, www.swift.com) and consists of standardized messages called MT 103, representing payment transfers. The data is aggregated to the country level and allows us to construct a network where the countries are the nodes and the directed, valued edges between them represent the number of messages sent from country $i$ to country $j$ at time point $t$. The available database covers $T = 182$ time points on a monthly basis, ranging from January 2003 to February 2018.

We restrict our analysis to the countries that exist throughout the whole observational period. This includes one entity that is not a country or a territory but represents international market infrastructure, referring to monetary organizations that operate in many countries (see the country list in Table 4 of Appendix A). The network including all 203 countries serves as the baseline for all models that can deal with “big” networks and is labeled the full network.

In Figure 1 we plot all individual edges of the full network against time. Notably, there is a great deal of country-related heterogeneity in the data. The time series with the highest time-averaged number of messages sent corresponds to the edge United States - China (US-CN) and is on average almost ten times higher than the second-highest valued edge (United States - Hong Kong, US-HK). Furthermore, even the yearly 80% quantile of the number of messages within each month ranges only between one and five messages per edge. This implies that the major share of all messages is sent and received by a small subset of countries characterized by a high-intensity exchange.

To analyze both the full network and its “dense core”, we additionally investigate a reduced dataset containing the 59 most important countries. This network is labeled the reduced network. Depending on the month, the reduced network accounts for about up to of all messages sent in the whole system. In the following, we show descriptive measures for the full network; the same descriptives are provided for the reduced network in Annex B.

Figure 2: Summary statistics for the full MT 103 network as monthly time series. Density of the network (left), share of non-zero marginals (middle) and cumulative edge values (right).
Source: SWIFT BI Watch.

The structure of the full binary networks is summarized in Figure 2. On the left-hand side, it can be seen that the density of the network decreases steadily from 2006 on, whereas the total number of MT 103 messages (right-hand side) follows a clear upward trend. This pattern implies that increasingly more messages are sent per edge. Similarly, but in a more modest form, this can also be concluded for the reduced network (see Figure 10 in Annex B). The SWIFT data exclusively contains countries that send or receive MT 103 messages. Therefore, a country can have a zero outdegree or a zero indegree, but not both. However, if many countries were restricted to only receiving or only sending messages, the dimensionality of the problem could be greatly reduced. This can be investigated by calculating the share of non-zero marginals for each year. The result is shown in the middle panel of Figure 2, and we find that the low density is not mirrored by a low share of non-zero marginals. This means that almost no information about the density can be inferred from the marginals since the vast majority of them are greater than zero. In the reduced network, the density is much higher (about 0.85 averaged over all months) and there are no zero marginals.

Figure 3: Binary network topology of the full MT 103 network aggregated for all time points. Cumulative indegree (left) and outdegree distributions (middle) with maximum and minimum values indicated in grey. Outdegree against indegree (right) for all months, shown as dots with colour intensity indicating frequency. 45 degree line in solid black.
Source: SWIFT BI Watch.

In Figure 3 the degree structure of the full network is visualized. The first two panels show the cumulative degree distributions (indegree on the left and outdegree in the middle) aggregated for all months. The realizations between the monthly minimum and maximum values are indicated in grey. It can be seen that the cumulative distributions of both the indegree and the outdegree grow almost linearly, so the degrees are almost uniformly distributed. This is a rather uncommon finding and does not match common structures like scale-free networks (Barabási and Albert, 1999; Albert and Barabási, 2002) or random graphs (Erdös-Rényi graphs, Erdös and Rényi, 1959). This is consistent with the findings of Cook and Soramaki (2014), who noted that the data cannot appropriately be described with standard power-law distributions. Again, similar and even more pronounced results can be found for the reduced network, shown in Figure 11 of Annex B.

The right panel of Figure 3 shows the indegree versus the outdegree for all nodes and all months. This can be thought of as a check of how “symmetric” the network is, and it appears that there is a strong positive (non-linear) relationship between the in- and outdegrees.

In some models, exogenous data can be incorporated. Based on the empirical investigation by Cook and Soramaki (2014), we assume it to be plausible that financial activity in a given country is related to its economic size and consider the annual Gross Domestic Product (GDP, in current USD Billions) as a valid covariate. The data is provided by the International Monetary Fund (IMF) and we denote the GDP of country by .

4 Models for Network Reconstruction

4.1 Overview

Since almost none of the zeros in the network can be inferred from the marginals, most of the models that provide edge probabilities rely on two crucial assumptions: (i) the true density is known and (ii) the row and column sums of the valued edges carry information about the binary structure. Both points are closely related and revolve around the basic problem that knowledge of the marginals is not sufficient to provide information about the edge probabilities (Gandy and Veraart, 2017, Proposition 3.1). This fundamental identification problem can be overcome only by adding additional constraints (i.e. knowledge of the density). For the sake of this article, we assume the true density to be known. In practice, this implies that models that are found to perform very well in our comparison might not do so with an incorrectly specified density. Given knowledge about the true density, the second assumption is less problematic and, depending on the presumed structure of the network under study, it can be plausible that the marginals provide information that helps to determine the edge probabilities.

Another issue that complicates network reconstruction is the high dimensionality of the full network. With $n = 203$ nodes, the number of dyads amounts to $N = n(n-1) = 41{,}006$. This brings many methods to their computational limits. In the reduced network, the problem greatly simplifies as $N$ shrinks to $59 \cdot 58 = 3{,}422$, which allows us to apply almost all methods considered in this paper. An exception is the density-corrected directed weighted configuration model (DWCM) by Bargigli (2014), which is not considered in this paper because the algorithm failed to converge even in the small network. In the model description, we mention which methods are computationally tractable in the full network; those that are not are applied only to the reduced network.

All methods used are summarized in Table 1, including their names, abbreviations and references to the corresponding sections with a detailed description. Additionally, it is shown whether the methods are applied to the full network and whether knowledge of the true density is needed to calibrate the models.

Method Abbreviation Section Full netw. Calibrated
Maximum-Entropy IPFP 4.2 X
Maximum-Entropy, GDP IPFP-GDP 4.2
Maximum-Entropy, lag. values IPFP-LAG 4.2
Gravity Model GRAVITY 4.3 X
Dens. cor. Gravity Model DC-GRAVITY 4.3 X X
Dens. cor. Gravity, GDP DC-GRAVITY-GDP 4.3 X
Dens. cor. Gravity, lag. values DC-GRAVITY-LAG 4.3 X
Tomogravity Model TOMOGRAVITY 4.4
LASSO LASSO 4.5 X X
Hierarchical Erdös-Rényi Model H-ER 4.6 X X
Hierarchical Fitness Model H-FIT 4.6 X X
Minimum Density MINDENS 4.7 X
Table 1: Summary of reconstruction methods used in this article together with abbreviations, their ability to fit the full network (Full netw.) and whether calibration to the true density is needed (Calibrated).

4.2 Iterative Proportional Fitting

A very simplistic, but nevertheless powerful method to reconstruct dense networks is given by the iterative proportional fitting procedure (IPFP, Deming and Stephan, 1940, Fienberg et al., 1970). The algorithm has gained much attention in the matrix reconstruction literature under the name maximum-entropy method because it allows for estimating the parameters of the maximum-entropy probability distribution (the methodological backbone of many reconstruction tasks, Squartini et al., 2018).

In the Statistics literature, the procedure was originally intended to provide maximum likelihood estimates for parameters of log-linear models in contingency tables (Bishop et al., 1975, Haberman, 1978, 1979). In the given case, this interpretation is convenient because it allows the outcomes to be interpreted as the maximum likelihood estimates for the expectation of a Poisson-distributed random variable

$$X_{ij} \sim \text{Poisson}(\mu_{ij}) \tag{4}$$

with log-linear expectation $\mu_{ij} = \exp(\delta_i + \gamma_j)$. The two parameters $\delta_i$ and $\gamma_j$ correspond to row- and column-effects. Furthermore, the model provides a model-based possibility to calculate the probability of observing a value greater than zero:

$$P(X_{ij} > 0) = 1 - \exp(-\mu_{ij}).$$

However, with high values for $\mu_{ij}$, the probabilities approach one exponentially fast, meaning that for high marginals most probabilities will be almost or numerically even equal to one.
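A short numerical sketch shows how quickly the Poisson-based probability of a non-zero edge saturates. It assumes the standard Poisson identity that the probability of a positive count is one minus the probability of a zero count; the function name and the chosen expectations are illustrative only.

```python
import math


def edge_probability(mu):
    """Probability that a Poisson variable with expectation mu is positive."""
    return 1.0 - math.exp(-mu)


# Hypothetical expectations, from a very weak to a very strong edge.
for mu in (0.1, 1.0, 10.0, 50.0):
    print(mu, edge_probability(mu))

# In double precision the probability is exactly one for mu = 50.
saturated = edge_probability(50.0) == 1.0
print(saturated)
```

With expectations in the range of realistic message counts, such probabilities therefore carry almost no information for separating existing from non-existing edges.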

The model is a dense reconstruction method that provides non-zero edge values for all pairs of rows and columns with non-zero marginals. Besides this drawback, the model has the merit of being computationally efficient (see the R package ipfp by Blocker et al., 2014) and, due to the construction of the algorithm, it is guaranteed that the row and column sums of the predicted entries match the observed marginals exactly.
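The alternating row and column scaling behind IPFP can be sketched in a few lines. The following plain Python version is a minimal illustration under stated assumptions: it starts from a matrix of ones with a zero diagonal, uses hypothetical marginals, and is not the implementation of the ipfp package.

```python
def ipfp(row_sums, col_sums, n_iter=1000, tol=1e-10):
    """Iterative proportional fitting on a matrix with a structural zero diagonal."""
    n = len(row_sums)
    x = [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
    for _ in range(n_iter):
        # Scale every row to match its target row sum.
        for i in range(n):
            s = sum(x[i])
            if s > 0:
                x[i] = [v * row_sums[i] / s for v in x[i]]
        # Scale every column to match its target column sum.
        for j in range(n):
            s = sum(x[i][j] for i in range(n))
            if s > 0:
                for i in range(n):
                    x[i][j] *= col_sums[j] / s
        # Columns are exact after the column step, so only rows are checked.
        err = max(abs(sum(x[i]) - row_sums[i]) for i in range(n))
        if err < tol:
            break
    return x


# Hypothetical marginals; both vectors sum to 15.
row_sums = [6.0, 4.0, 5.0]
col_sums = [5.0, 5.0, 5.0]
x_hat = ipfp(row_sums, col_sums)
for row in x_hat:
    print([round(v, 3) for v in row])
```

All off-diagonal predictions are strictly positive here, which illustrates why the method is called a dense reconstruction technique.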

In Lebacher and Kauermann (2019) the IPFP model from above is extended to incorporate informative dyadic exogenous information, which is labeled here as . In particular, the covariates can be included in the log-linear expectation

If the association between and the unknown is high, the prediction accuracy increases relative to the standard IPFP solution. However, this approach comes at a price since model fitting is based on constrained non-linear optimization making computation significantly more demanding than standard IPFP. Furthermore, only dyadic covariates have the potential to increase the predictive power but in practice often only monadic information is available. We take a pragmatic approach and use a transformation of the GDP values of countries and that is not linearly separable and can be interpreted as dyadic overall GDP, defined through:

(5)

This yields the expectation

To test the power of the model in situations where a covariate with a strong association is available, we also include a logarithmic transformation of the lagged edge values:

(6)

Estimation is pursued as described in Lebacher and Kauermann (2019), with a constrained Poisson likelihood. In the following, the IPFP-type models using GDP and the lagged variables are denoted IPFP-GDP and IPFP-LAG, respectively.

4.3 Gravity Models

Gravity models are at the heart of many methods related to the analysis of network flow data (Kolaczyk, 2009). Besides their successful application to economic trade data (Disdier and Head, 2008, Head and Mayer, 2014) they are also among the preferred models for network tomography in Computer Sciences (Vardi, 1996). Network tomography relates to a problem that often appears when analyzing computer networks. Here, the individual edge loads are assumed to be known but the flow is allowed to intersect the nodes in the network. The task is then to provide accurate predictions for flows between arbitrary nodes. Very often the gravity model is found to be among the best algorithms to solve this problem (Zhang et al., 2003a). Although the formulation of the problem seems to be very different compared to the network reconstruction task, it leads to the same mathematical structure.

From a methodological point of view, the gravity model is simply a special case of the IPFP model discussed above and, in fact, the gravity model is the immediate maximum-entropy solution in each network reconstruction problem where self-loops are allowed (Squartini et al., 2018, Sheldon and Maurer, 1998). Mathematically, the model builds on a simple multiplicative structure

$$\hat{x}_{ij} = \frac{x_{i\bullet}\, x_{\bullet j}}{x_{\bullet\bullet}}, \tag{7}$$

where $x_{i\bullet}$ is the valued outdegree of node $i$, $x_{\bullet j}$ is the valued indegree of node $j$, and

$$x_{\bullet\bullet} = \sum_{i} x_{i\bullet} = \sum_{j} x_{\bullet j}$$

represents the sum over all valued in- or outdegrees. Though simple in structure and fast to compute, the model has two main drawbacks. First, the model yields biased results if the diagonal elements are restricted to be zero because then the row and column sums of the predictions do not match the marginal restrictions exactly. However, in big networks, the bias is often negligible. Second, as in all maximum-entropy models, the approach relies on inferring sparseness from the marginals and predicts exclusively non-zero matrix entries if all marginals are greater than zero.
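The bias caused by the zero diagonal can be seen in a small sketch of the multiplicative gravity prediction. The marginals and the function name are assumptions of this example; the prediction itself is the product of the row and column sum divided by the overall total.

```python
def gravity(row_sums, col_sums, allow_self_loops=False):
    """Gravity prediction: product of the marginals divided by the total."""
    n = len(row_sums)
    total = sum(row_sums)
    return [[row_sums[i] * col_sums[j] / total
             if (i != j or allow_self_loops) else 0.0
             for j in range(n)]
            for i in range(n)]


# Hypothetical marginals; both vectors sum to 15.
row_sums = [6.0, 4.0, 5.0]
col_sums = [5.0, 5.0, 5.0]

with_loops = gravity(row_sums, col_sums, allow_self_loops=True)
without_loops = gravity(row_sums, col_sums)

print([sum(row) for row in with_loops])     # reproduces the row sums
print([sum(row) for row in without_loops])  # falls short of the row sums
```

In such a tiny network a third of the predicted mass sits on the diagonal, so the shortfall is large; with many nodes the diagonal share, and hence the bias, becomes small.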

Because economic and financial networks most often exhibit a density smaller than one, Cimini et al. (2015) proposed a model that is designed for reconstructing the binary structure of networks with limited information available. Basically, they extend the gravity model from above towards a two-step procedure. In the first step they propose to model the probabilities $p_{ij}$ of observing an edge with a parameter $z$ such that they match the pre-defined target density $\rho$:

$$p_{ij} = \frac{z\, \chi_i \psi_j}{1 + z\, \chi_i \psi_j}, \qquad \frac{1}{n(n-1)} \sum_{i \neq j} p_{ij} = \rho, \tag{8}$$

where the parameters $\chi_i$ and $\psi_j$ are node-specific fitness variables. Following the idea that the marginals carry information about the binary network structure, they are typically set equal to the marginals (i.e. $\chi_i = x_{i\bullet}$ and $\psi_j = x_{\bullet j}$) or some transformation of them. Another interpretation is that the economic strength determines the fitness of a country or previous bilateral exchanges influence the fitness of dyadic relations. We include the transformed GDP values and set as defined in equation (5). For the logarithmic lagged exchange we set as fitness variables. Note that adding instead of prevents the probabilities from being zero irrespective of in cases with .

The parameter $z$ can be found by any precise root-search program. In applications with larger dimensionality, the values for $z$ might become numerically very small and we use a genetic algorithm (implemented in the R package GA by Scrucca, 2013) to overcome this problem. Given an estimate for $z$ that satisfies (8), Cimini et al. (2015) propose to sample network ensembles of binary networks with independent Bernoulli variables $A_{ij}$ that have success probabilities $p_{ij}$ and to use a density-corrected version of model (7) for the edge values

$$\hat{x}_{ij} = \frac{x_{i\bullet}\, x_{\bullet j}}{x_{\bullet\bullet}\, p_{ij}}\, a_{ij}. \tag{9}$$

In the following we refer to the density-corrected gravity model by DC-GRAVITY and the models with GDP and lagged variables are abbreviated by DC-GRAVITY-GDP and DC-GRAVITY-LAG.
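The two-step idea of the density-corrected gravity model can be sketched as follows. The sketch assumes the fitness form of Cimini et al. (2015), in which the edge probability is z times the product of the two fitness values divided by one plus that quantity, and it sets the fitness values equal to the marginals. It replaces the genetic algorithm mentioned above with a plain bisection on the logarithmic scale, which is an assumption of this example and adequate only for small problems. All names and numbers are hypothetical.

```python
import math
import random


def edge_probabilities(z, out_fit, in_fit):
    """p_ij = z f_i g_j / (1 + z f_i g_j) for i != j, zero on the diagonal."""
    n = len(out_fit)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                s = z * out_fit[i] * in_fit[j]
                p[i][j] = s / (1.0 + s)
    return p


def expected_density(z, out_fit, in_fit):
    n = len(out_fit)
    p = edge_probabilities(z, out_fit, in_fit)
    return sum(sum(row) for row in p) / (n * (n - 1))


def calibrate_z(target, out_fit, in_fit, lo=1e-12, hi=1e12, steps=200):
    """Bisection on the log scale; the expected density increases in z."""
    for _ in range(steps):
        mid = math.sqrt(lo * hi)
        if expected_density(mid, out_fit, in_fit) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def sample_network(z, out_strength, in_strength, rng):
    """Draw Bernoulli edges and assign density-corrected gravity weights."""
    n = len(out_strength)
    total = sum(out_strength)
    p = edge_probabilities(z, out_strength, in_strength)
    x = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and rng.random() < p[i][j]:
                x[i][j] = out_strength[i] * in_strength[j] / (total * p[i][j])
    return x


out_s = [6.0, 4.0, 5.0]   # hypothetical valued outdegrees
in_s = [5.0, 5.0, 5.0]    # hypothetical valued indegrees
z_hat = calibrate_z(0.5, out_s, in_s)
sampled = sample_network(z_hat, out_s, in_s, random.Random(0))
print(z_hat, expected_density(z_hat, out_s, in_s))
```

Dividing the gravity weight by the edge probability ensures that the expected value of each sampled edge equals the plain gravity prediction, so the marginals are matched on average over the ensemble rather than in every draw.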

4.4 Tomogravity model

An important model candidate from the network tomography literature is proposed by Zhang et al. (2003b). In their article, the problem of learning origin-destination flows from link load data in IP networks motivates the estimation of a traffic matrix. The authors regard the problem as an ill-posed regression problem that must be regularized with the Kullback-Leibler divergence from an independence model. The predicted values can be found by minimizing the loss-function

(10)

with respect to $\mathbf{x}$ and subject to the non-negativity constraint. The first term is simply the sum of squared deviations from the marginals. In the penalization term, the gravity model serves as a null model together with a regularization parameter $\lambda$. Note that the model is a dense reconstruction technique that provides neither probabilities nor predictions that match the observed marginals.

Although this appears to be an appealing combination between the successful gravity model and information-theoretic reasoning, the procedure is so far seldom applied to the reconstruction of networks. The approach is implemented in the R package tomogravity (see Blocker et al., 2014). The implementation is computationally expensive and we, therefore, apply this model only for the reduced data set. Zhang et al. (2003b) show in a simulation study, that the performance of the algorithm is not very sensitive to varying values of and as a rule of thumb they recommend to use if no training data are available and we follow their rule in the application section.

4.5 LASSO Model

Regarding the network reconstruction problem again as an ill-posed regression problem, it might not even be necessary to make use of a new penalization term. Instead, the least absolute shrinkage and selection operator (LASSO) approach proposed by Tibshirani (1996) can be employed, which uses an $\ell_1$ penalty to enforce sparsity in the model. Although approaches with some kind of regularization are common in network tomography (Castro et al., 2004), the LASSO is rarely applied to network reconstruction. An exception is given by Chen et al. (2017), who propose a LASSO-type model to predict flows in a bike-sharing network from station traffic (number of ingoing and outgoing bikes at each station).

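Technically, the quadratic deviation from the marginals is combined with a regularization term that penalizes the sum of the predicted matrix entries, yielding the following loss function

$$\mathcal{L}(\mathbf{x}) = \lVert \mathbf{y} - \mathbf{A}\mathbf{x} \rVert_2^2 + \lambda \sum_{i \neq j} \lvert x_{ij} \rvert, \tag{11}$$

where $\mathbf{y}$ is the vector of marginals, $\mathbf{A}$ the routing matrix, $\mathbf{x}$ the vector of edge values and $\lambda \geq 0$ the penalty parameter. By the non-negativity constraint, the absolute value in the penalization term can be dropped. The R package glmnet by Friedman et al. (2009) allows for efficient and scalable estimation.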

In principle, the model might appear to be attractive because the regularization shrinks some predictions exactly to zero. However, it is not clear how to derive the penalization parameter $\lambda$ because cross-validation aiming at the marginals does not lead to satisfactory results. Chen et al. (2017) propose to use a training data set, which is information that might not always be available. To use the approach nevertheless in the competitive comparison without a training set available, we optimize the penalty parameter on a grid such that the number of non-zero coefficients is consistent with the real density.

Note further that the predicted marginals are, by construction, always smaller than the observed ones because of the shrinkage property of the LASSO. On the other hand, the model has much potential for exploratory analysis by investigating the path plots of the coefficients, i.e. the values of the coefficients against increasing values of $\lambda$.
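The shrinkage of the predicted marginals can be reproduced with a small coordinate-descent sketch of a non-negative LASSO. It assumes a loss consisting of the squared deviation from the marginals plus the penalty parameter times the sum of the entries, and it exploits the fact that each edge enters exactly one row sum and one column sum. It is an illustrative stand-in for, not a reproduction of, the glmnet-based estimation; all names and numbers are hypothetical.

```python
def nonneg_lasso(row_sums, col_sums, lam, n_sweeps=500):
    """Coordinate descent for: squared marginal deviation + lam * sum(x), x >= 0."""
    n = len(row_sums)
    x = [[0.0] * n for _ in range(n)]
    res_row = [float(v) for v in row_sums]   # observed minus predicted row sums
    res_col = [float(v) for v in col_sums]   # observed minus predicted column sums
    for _ in range(n_sweeps):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                old = x[i][j]
                # Inner product of the edge's routing column with the partial
                # residual, i.e. the residual with this edge removed.
                rho = (res_row[i] + old) + (res_col[j] + old)
                # Each routing column has squared norm 2; the update is
                # soft-thresholding followed by truncation at zero.
                new = max(0.0, (rho - lam / 2.0) / 2.0)
                x[i][j] = new
                res_row[i] -= new - old
                res_col[j] -= new - old
    return x


# Hypothetical symmetric marginals.
row_sums = [2.0, 2.0, 2.0]
col_sums = [2.0, 2.0, 2.0]

x_lasso = nonneg_lasso(row_sums, col_sums, lam=2.0)
print([round(sum(row), 6) for row in x_lasso])   # predicted row sums

x_empty = nonneg_lasso(row_sums, col_sums, lam=1000.0)
```

For these symmetric toy marginals the fitted row sums are 1.5 instead of the observed 2, and a very large penalty removes all edges, which mirrors the grid search over the penalty described above.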

4.6 Hierarchical Fitness Models

A central finding of the study by Anand et al. (2018) states that no method works equally well for different reconstruction tasks. Based on this insight, Gandy and Veraart (2017) proposed that a reconstruction method should be adjustable to topological characteristics and especially to the density of a network. To do so, they present a hierarchical model designed for the reconstruction of financial networks. In the hierarchy of the model, the first step consists of estimating the edge probabilities consistent with the target density . As a baseline model, the authors propose an Erdös-Rényi model with

treating each edge as equally likely. Given the obtained set of probabilities, edge weights are sampled from an exponential distribution with common expectation

(12)

The sampling algorithm is constructed such that the sampled networks provide stochastic network ensembles but each realization is consistent with the marginal restrictions.

Additionally, they proposed a model that is inspired by fitness-based approaches similar to the one in equation (9). In this model, the edge probability is determined by the logistic function

(13)

with being some constant that is estimated for consistency with the target density. In this model, the marginals serve as log-transformed fitness variables. In principle, any kind of variable could be used for the fitness model, but so far only the marginals are implemented in the R package systemicrisk. The software implementation is very efficient and copes with the dimensionality of the full network. Nevertheless, the algorithm struggles with the high values of the marginals. In the given application the marginals are therefore scaled down in the estimation procedure and the predictions are rescaled afterwards.

By construction, the model puts much more emphasis on the binary network structure than on the prediction of the edge values. This is because the marginals are used directly only in the first step to estimate the edge probabilities. In the second step, all edge values are assumed to share the same expectation (12) and the marginal constraints enter only indirectly as a restriction.

In the comparison, the hierarchical Erdös-Rényi model is abbreviated by H-ER and the hierarchical fitness model is called H-FIT.

4.7 Minimum Density

Anand et al. (2015) noted that the problem of binary network reconstruction can be viewed as finding a solution between two extreme points in the space of possible networks. Either a maximally dense solution is sought (maximum-entropy approaches), or the goal is to find a solution with a minimal number of non-zero edges that is still consistent with the marginal constraints. Given that financial networks are typically sparse and disassortative, maximum-entropy solutions almost certainly provide an incorrect binary network structure.

In principle, if the density of the network is driven to the lowest level possible, the allocation of the edge weights might even become a simple task because of the small number of possibilities that are left. In its original form, the loss function of the minimum density model is simply given by the number of non-zero edges

subject to the marginal constraints and the non-negativity constraint. The loss function is not differentiable and direct minimization is computationally expensive. To circumvent this obstacle, Anand et al. (2015) relax the problem by giving up the requirement that the marginal constraints must hold exactly and shift the focus to the quadratic deviations from the marginals. The authors then propose an algorithm that implements two Markov processes: one adds new edges and weights, the other deletes edges. Initialized with an arbitrary network, the algorithm iterates until the loss function no longer decreases and the fit of the marginals is sufficient.
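To make the relaxed objective concrete, the sketch below evaluates a simplified loss that counts the non-zero edges and adds the quadratic deviations from the row and column sums. This is an assumption-laden illustration rather than the objective of Anand et al. (2015): the additive form, the `penalty` weight and the function name are hypothetical.

```python
def relaxed_min_density_loss(weights, row_sums, col_sums, penalty=1.0):
    """Number of non-zero edges plus penalized squared marginal deviations."""
    n = len(weights)
    n_edges = sum(1 for i in range(n) for j in range(n)
                  if i != j and weights[i][j] > 0)
    row_dev = sum((sum(weights[i]) - row_sums[i]) ** 2 for i in range(n))
    col_dev = sum((sum(weights[i][j] for i in range(n)) - col_sums[j]) ** 2
                  for j in range(n))
    return n_edges + penalty * (row_dev + col_dev)


# Hypothetical marginals of a three-node network.
row_sums = [3.0, 2.0, 1.0]
col_sums = [1.0, 3.0, 2.0]

# Two candidate networks that both satisfy the marginals exactly.
sparse = [[0.0, 3.0, 0.0],
          [0.0, 0.0, 2.0],
          [1.0, 0.0, 0.0]]
dense = [[0.0, 2.0, 1.0],
         [1.0, 0.0, 1.0],
         [0.0, 1.0, 0.0]]

loss_sparse = relaxed_min_density_loss(sparse, row_sums, col_sums)
loss_dense = relaxed_min_density_loss(dense, row_sums, col_sums)
```

Both candidates reproduce the marginals, so the loss differs only through the number of edges, and the sparser allocation is preferred.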

The proposed algorithm is stochastic with non-unique solutions and generates ensembles of low-density networks. Typically, the realization with the lowest density is taken to be the optimal estimate (called MINDENS henceforth). By definition, the method does not rely on knowledge of the real density. Therefore, it is appropriate to regard the model as a lower bound (in the space of feasible networks that satisfy the marginal constraints) instead of viewing it as an accurate reconstruction. This also has implications for the edge values, because a minimal number of edges in the system leads to a maximal concentration of the edge values on a few nodes.

5 Evaluation

5.1 Binary Network Reconstruction

We evaluate the quality of the binary network reconstruction with different measures. For models that provide edge probabilities, we use the area under the curve (AUC) of the receiver-operating characteristic (ROC) curve and the precision-recall (PR) curve (see Grau et al., 2015). We regard both measures as complementary for model evaluation. While the ROC curve is, so to speak, ignorant about how well we predict either absent or present edges, the PR curve describes how well the models do in predicting present edges. This is relevant because in low-density networks it is simpler to predict a zero than a one. Further, we look at the Brier score decomposition proposed by Murphy (1973) (see also Siegert, 2017). For each time period and model, the predictions are grouped by their distinct probability values, so that each group collects the edges that share the same predicted probability. The Brier score decomposition is given by

The reliability (REL) measures the distance between the estimated probabilities and the average real frequencies, with zero being the best value that can be achieved. This means that a low reliability is actually the preferred outcome. The term labeled UNC is called uncertainty and gives the variability of the edges in the sample. Resolution (RES) gives the difference between the empirical edge frequencies within each category of predicted probabilities and their overall average. Hence, it measures the ability to discriminate between zero and one, and a higher value indicates a better resolution. If the ability to discriminate is at its maximum, all probabilities are either one or zero, and in this situation the resolution equals the uncertainty. We report these measures aggregated over all years and show the aggregated difference UNC-RES, which can be interpreted as the uncertainty that remains after accounting for resolution.
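The decomposition can be checked numerically with a short sketch. It follows the standard form of the Murphy (1973) partition, in which the Brier score equals reliability minus resolution plus uncertainty when the predictions are grouped by their distinct values. The function name and the toy data are hypothetical.

```python
from collections import defaultdict


def brier_decomposition(probs, outcomes):
    """Return (Brier score, reliability, resolution, uncertainty)."""
    n = len(probs)
    base_rate = sum(outcomes) / n

    # Group the observed outcomes by the distinct predicted probability.
    groups = defaultdict(list)
    for p, y in zip(probs, outcomes):
        groups[p].append(y)

    reliability = sum(len(ys) * (p - sum(ys) / len(ys)) ** 2
                      for p, ys in groups.items()) / n
    resolution = sum(len(ys) * (sum(ys) / len(ys) - base_rate) ** 2
                     for ys in groups.values()) / n
    uncertainty = base_rate * (1.0 - base_rate)
    brier = sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / n
    return brier, reliability, resolution, uncertainty


# Hypothetical edge probabilities and observed binary edges.
probs = [0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.5]
edges = [1, 1, 0, 0, 0, 0, 1, 1]
brier, rel, res, unc = brier_decomposition(probs, edges)

# Perfect discrimination: reliability is zero, resolution equals uncertainty.
perfect = brier_decomposition([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0])
```

The second call illustrates the limiting case described above, where all predicted probabilities are either zero or one and correct.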

We follow Squartini et al. (2018) and provide graphical representations of the reconstructed networks in the Appendices C.1 and D.1. There, the reconstructed adjacency matrices (based on binarization with a threshold according to the true density) for the most recent full and reduced network are shown.

We are not only interested in the prediction of individual edge occurrences but also in the quality of the reconstructed network topology. Given the strong heterogeneity in the network, the degree distribution can be regarded as a very important measure for the binary structure. We evaluate the fit of the outdegree distribution using the square root of the mean squared error of the real and the reconstructed outdegree distribution

and correspondingly for the indegree.
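The degree-based measure can be sketched as follows. The normalization by the number of nodes is an assumption of this example, as are the helper names; the sketch only illustrates how the root mean squared error of the out- and indegrees is obtained from two binary adjacency matrices.

```python
import math


def out_degrees(adjacency):
    """Row sums of a binary adjacency matrix."""
    return [sum(row) for row in adjacency]


def in_degrees(adjacency):
    """Column sums of a binary adjacency matrix."""
    return [sum(column) for column in zip(*adjacency)]


def rmse(real, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(r - p) ** 2 for r, p in zip(real, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical real and reconstructed binary networks with three nodes.
real = [[0, 1, 1],
        [1, 0, 0],
        [0, 0, 0]]
reconstructed = [[0, 1, 0],
                 [1, 0, 1],
                 [0, 0, 0]]

rmse_out = rmse(out_degrees(real), out_degrees(reconstructed))
rmse_in = rmse(in_degrees(real), in_degrees(reconstructed))
```

In this toy case the indegrees are reproduced exactly although one edge is misplaced, which shows that a good degree fit does not imply a correct edge-by-edge reconstruction.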

Figure 4: Evaluation of probabilities in the full network. Time series of the area under the curve (AUC) values for receiver-operating-characteristics (ROC, left panel) curve and precision-recall (PR, right panel) curve for the IPFP model, the degree corrected Gravity model (DC-GRAVITY), the hierarchical Erdös-Rényi model (H-ER) and the hierarchical fitness model.
Source: SWIFT BI Watch.

To make the models comparable, we calibrate all estimates to the same target density. For models that are not scaled to the real density, we use a pragmatic approach: the highest estimates, as many as there are edges in the real network, are taken to be one and all other estimates to be zero. In the probability-based models, we correspondingly use the same number of highest probabilities to predict a one.
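This thresholding rule amounts to keeping the top-ranked dyads. A minimal sketch, with a hypothetical function name and toy scores, could look as follows; ties are broken by the original ordering of the dyads, which is an arbitrary choice made here.

```python
def binarize_top_k(scores, n_edges):
    """Set the n_edges highest off-diagonal scores to one, all others to zero."""
    n = len(scores)
    dyads = [(scores[i][j], i, j)
             for i in range(n) for j in range(n) if i != j]
    dyads.sort(key=lambda item: item[0], reverse=True)  # stable sort
    adjacency = [[0] * n for _ in range(n)]
    for _, i, j in dyads[:n_edges]:
        adjacency[i][j] = 1
    return adjacency


# Hypothetical edge probabilities (or edge value estimates) for three nodes.
scores = [[0.0, 0.8, 0.1],
          [0.6, 0.0, 0.3],
          [0.2, 0.05, 0.0]]
binary = binarize_top_k(scores, n_edges=2)
```

By construction, the binarized network has exactly the density of the real network, whatever the scale of the underlying estimates.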

A visual impression of the quality of the degree reconstruction is given in Appendix C.2, plotting the predicted outdegree (indegree) against the real outdegree (indegree) for the most recent network observation of the full network and in D.2 for the reduced network.

Figure 5: Decomposition of the Brier score for the full network into reliability (REL) and remaining uncertainty after subtracting resolution (UNC-RES) for the IPFP model, the Gravity model (GRAVITY), the degree corrected gravity model (DC-GRAVITY) and the hierarchical models (H-ER, H-FIT). Uncertainty (UNC) as a dashed line.
Source: SWIFT BI Watch.

5.1.1 Full Network

In the full network, four different models can be compared using AUC values and the decomposed Brier score. These four models include the iterative proportional fitting model from Section 4.2 (IPFP), the density-corrected gravity model by Cimini et al. (2015) from Section 4.3 (DC-GRAVITY) and the two hierarchical models from Section 4.6, with edge probabilities coming either from the Erdös-Rényi (H-ER) or the fitness model (H-FIT). In Figure 4, we plot the AUC values for the ROC (left panel) and PR (right panel) curves against time. In Figure 5, we show the decomposition of the Brier score, with uncertainty (UNC) as a dashed vertical line.

For the reconstruction of the degrees, additionally the Gravity model (GRAVITY) from Section 4.3, the LASSO from Section 4.5 and the minimum density (MINDENS) method of Section 4.7 enter the comparison. This is visualized in Figure 6 with the root mean squared errors for the outdegree in the left panel and for the indegree on the right. In both figures, the abbreviations of models are ordered to approximately match the time-averaged height of the respective measures.

Edge Probabilities
The hierarchical Erdös-Rényi (H-ER) model performs worst in the left panel and second-worst in the right panel of Figure 4. Seemingly, the assumption of equal probabilities for all dyads is strongly violated in this network. This relates to the discussion in Section 3 where we showed that the patterns of the degree distribution do not match with the Erdös-Rényi model.

However, the IPFP model, which allows for differing edge probabilities in its Poisson interpretation, does not perform satisfactorily either, and we see a declining trend of the prediction accuracy over time. As a consequence, the AUC values of the ROC curve decrease strongly and, when evaluated with the PR curves, the model even provides the worst outcomes. This is a result of the growing values of the marginals, implying that the IPFP probabilities become very close to one or even numerically equal to one, leading to a loss of variation among the probabilities.
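The saturation effect can be illustrated numerically. The sketch below rests on two assumptions that are not taken from the text: the Poisson interpretation is assumed to translate an expected edge value mu into the edge probability 1 - exp(-mu), and the IPFP solution is replaced by the simpler product of the marginals divided by their total. All names and numbers are hypothetical.

```python
import math


def poisson_edge_probabilities(row_sums, col_sums):
    """P(edge) = 1 - exp(-mu) with mu = row_sum * col_sum / total (assumed)."""
    total = sum(row_sums)
    n = len(row_sums)
    return [[0.0 if i == j
             else 1.0 - math.exp(-row_sums[i] * col_sums[j] / total)
             for j in range(n)]
            for i in range(n)]


def off_diagonal(matrix):
    """All entries of a square matrix except the diagonal."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]


# Small marginals: the probabilities still differ across dyads.
small = poisson_edge_probabilities([2.0, 1.0, 1.0], [1.0, 2.0, 1.0])

# The same marginals scaled by 1000: every probability rounds to one.
large = poisson_edge_probabilities([2000.0, 1000.0, 1000.0],
                                   [1000.0, 2000.0, 1000.0])

distinct_small = len(set(off_diagonal(small)))
distinct_large = len(set(off_diagonal(large)))
```

Once all probabilities equal one in floating-point arithmetic, the dyads can no longer be ranked, so rank-based measures such as the AUC lose their discriminative power.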

The two winners of this comparison, the density-corrected gravity model (DC-GRAVITY) and the hierarchical fitness model (H-FIT), give very similar accuracy measures in both panels of Figure 4. The AUC values for the ROC curves provided by the H-FIT model are slightly better than the ones of the DC-GRAVITY model and the other way round when evaluated with the PR curves. The strong similarity of the models’ predictive power is, in fact, intuitive and results from the comparable choice of functions for determining the edge probabilities.

These results are supported by the decomposition of the aggregated Brier score shown in Figure 5. The DC-GRAVITY model and the H-FIT model both provide a very low reliability measure and a comparatively high resolution. Interestingly, they are closely followed by the H-ER model, which does not appear to be much worse with respect to the Brier score. In contrast, we find that the IPFP model has a low resolution and a high reliability measure, indicating that the provided probabilities deviate strongly from the real ones and that the ability to separate the predictions into “0” and “1” is rather low. Again, this is because the IPFP model is not calibrated and many predictions are numerically equal to one.

Degree Structure
Turning to the reconstruction of the degree structure, the different scaling of the two panels in Figure 6 shows that it is simpler to reconstruct the indegrees as compared to the outdegrees. The minimum density solution (MINDENS) marks an extreme case, resulting in the worst reconstruction of the out- and indegree structure. However, MINDENS has the comparative disadvantage of not being calibrated to the density and predicts far fewer edges than present in the real networks. Therefore, fewer edges can be allocated to certain nodes. With the exception of the United States, the model predicts no out- or indegrees above 65 at all (see also Figure 24 in the Annex C.2).

The LASSO provides the most unstable behavior and exhibits a high variance. Although the model is calibrated to the real density, the edge reconstruction is second-worst and delivers unsatisfactory reconstructions for the out- and the indegree. In Figure 25 of Annex C.1 it can be seen that the reconstructed degrees look almost random and Figure 18 indicates that the model is not able to make efficient use of the provided information on the row and column sums.

Figure 6: Time series of root mean squared error (RMSE) for the reconstruction of the outdegree (right) and the indegree (left) of the full network for the IPFP model, the Gravity model (GRAVITY), the degree corrected gravity model (DC-GRAVITY), the hierarchical models (H-ER, H-FIT), the LASSO model and the minimum density model (MINDENS).
Source: SWIFT BI Watch.

The performance of the hierarchical Erdös-Rényi model (H-ER) shows that ignorance about the marginals for predicting the binary structure also can lead to unsatisfactory outcomes. The H-ER model “over-estimates” the out- and indegree for countries with medium-sized degrees and “under-estimates” the out- and indegree for countries with high degrees. This is clearly a result of the assumption that all edges are equally likely (as long as consistent with the marginals), leading to a random block structure in the network.

The Gravity model (GRAVITY) and the IPFP model make the best use of the information provided by the marginals to reconstruct the outdegree but not for the indegree. Their predictive quality concerning the degrees is almost identical and it is hard to distinguish both models in the left panel (IPFP overlays GRAVITY in both plots).

The hierarchical fitness model (H-FIT) and the degree corrected gravity model (DC-GRAVITY) perform slightly worse than the GRAVITY and IPFP models concerning the outdegree but can be said to be the winners in the competition for the indegree reconstruction.

5.1.2 Reduced Network

In the reduced network, a greater variety of models can be investigated. Essentially, we can add four additional models to our comparative study: the degree corrected gravity model (DC-GRAVITY) from Section 4.3 extended by GDP (DC-GRAVITY-GDP) and by lagged values (DC-GRAVITY-LAG) for determining the edge probabilities, as well as the extended IPFP approach from Section 4.2 using the GDP values (IPFP-GDP) and the lagged values (IPFP-LAG) as covariates. In the degree reconstruction part, the TOMOGRAVITY model (Section 4.4) is considered in addition. The MINDENS model, however, is not considered for the reduced network because this network is very dense.

Figure 7: Evaluation of probabilities in the reduced network. Time series of area under the curve (AUC) values for receiver-operating-characteristics (ROC, left panel) and precision-recall (PR, right panel) for the IPFP model, the degree corrected gravity model (DC-GRAVITY) with covariates (DC-GRAVITY-GDP, DC-GRAVITY-LAG), the hierarchical Erdös-Rényi model (H-ER), the hierarchical fitness model (H-FIT) and the IPFP-based models with covariates (IPFP-GDP, IPFP-LAG).
Source: SWIFT BI Watch.
Figure 8: Decomposition of the Brier score for the reduced network into reliability (REL) and remaining uncertainty after subtracting resolution (UNC-RES) for the IPFP model, the degree corrected gravity model (DC-GRAVITY) with covariates (DC-GRAVITY-GDP, DC-GRAVITY-LAG), the hierarchical Erdös-Rényi model (H-ER), the hierarchical fitness model (H-FIT) and the IPFP-based models with covariates (IPFP-GDP, IPFP-LAG). Uncertainty (UNC) as a dashed line.
Source: SWIFT BI Watch.

Edge Probabilities
The IPFP probabilities are among the worst in both panels of Figure 7. Similar to the full network, the AUC values for the ROC curve are strongly decreasing with time. An almost parallel pattern can be found for the IPFP-based reconstruction with GDP values (IPFP-GDP). Although the exogenous information helps to improve the performance relative to IPFP, the outcome is still very bad in comparison to the other models.

While the information on the GDP nevertheless improves the fit in the IPFP-based models, this is not the case for the density-corrected gravity model (DC-GRAVITY). It turns out that the version that includes GDP values (DC-GRAVITY-GDP) performs even worse than the version without (DC-GRAVITY) with both measures. Again, we find that the DC-GRAVITY and the H-FIT model behave very similarly.

The two models with lagged variables as covariates, the DC-GRAVITY-LAG model and the IPFP model combined with the lagged values (IPFP-LAG), have the unfair advantage of incorporating much more information than all others and reach outstanding AUC values in both panels of Figure 7 (both lines overlay in the plots). In Figures 30 and 35 it can be seen that the reconstructed network based on the lagged covariates is almost identical to the original one, showing that having observed an edge in the preceding period almost determines the presence of an edge in the current period.

The decomposition of the Brier score in Figure 8 mirrors the results discussed above. However, it is striking that the IPFP-LAG model has a much higher reliability measure than the DC-GRAVITY-LAG model, which results from not being calibrated to the real density. Further note that the three models IPFP, IPFP-GDP and DC-GRAVITY-GDP have a resolution score of almost zero, indicating that the knowledge of GDP does not contribute much information about the binary edge structure.

Figure 9: Time series of root mean squared error (RMSE) for the reconstruction of the outdegree (right) and the indegree (left) of the reduced network for the IPFP model, the Gravity model (GRAVITY), the degree corrected gravity model (DC-GRAVITY) with covariates (DC-GRAVITY-GDP, DC-GRAVITY-LAG), the hierarchical Erdös-Rényi model (H-ER), the hierarchical fitness model (H-FIT), the IPFP-based models with covariates (IPFP-GDP, IPFP-LAG), the LASSO model and the TOMOGRAVITY model.
Source: SWIFT BI Watch.

Degree Structure
Given that the reduced network is very dense, the degree structures might be more easily reconstructed as compared to the sparse full network. However, the predicted edges still need to be allocated correctly to the corresponding nodes, which is not a trivial task. This becomes obvious when regarding the visualization of the degree reconstruction in the Supplementary Material. There it can be seen that the reconstruction of the binary degrees is partly very bad. Quantified with the root mean squared errors as shown in Figure 9, the models can be compared directly.

In both panels of Figure 9 it can be seen very clearly that the models that incorporate the lagged matrix entries (IPFP-LAG, DC-GRAVITY-LAG) lead to degree reconstructions that are superior in every respect. Except for some spikes, that might reflect a kind of seasonality pattern, the root mean squared errors are close to zero.

If no information from exogenous covariates is available, the DC-GRAVITY model is found to perform very well for the indegree. Especially regarding the outdegree, almost all methods (amongst others H-FIT, GRAVITY, IPFP) give good and very comparable results.

Again the LASSO proves to be a bad choice for reconstructing the degree structure, exhibiting a high variance over time as well as large deviations from the actual degrees.

5.2 Valued Network Prediction

The mechanisms that determine the edge probabilities might differ fundamentally from the ones that lead to certain edge values. Additionally, some models are restricted to the prediction of edge values, and predicting binary networks by thresholding is not the use they were originally built for. Therefore, we now turn to the predictive quality of the valued reconstruction in terms of the L1 errors (the sum of absolute deviations between the real and the predicted edge values) and the L2 errors (the square root of the sum of squared deviations).

These measures are reported as overall errors aggregated over all time points as well as monthly averages with the corresponding standard errors.
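The two error measures can be sketched as follows, assuming that the L1 error is the sum of absolute deviations between real and predicted edge values and that the L2 error is the square root of the sum of squared deviations. Function names and the toy matrices are hypothetical, and any rescaling of units is ignored.

```python
import math


def l1_error(real, predicted):
    """Sum of absolute deviations over all matrix entries."""
    return sum(abs(r - p)
               for real_row, pred_row in zip(real, predicted)
               for r, p in zip(real_row, pred_row))


def l2_error(real, predicted):
    """Square root of the sum of squared deviations over all matrix entries."""
    return math.sqrt(sum((r - p) ** 2
                         for real_row, pred_row in zip(real, predicted)
                         for r, p in zip(real_row, pred_row)))


# Hypothetical real edge values and a reconstruction that predicts no flows.
real_values = [[0.0, 3.0],
               [4.0, 0.0]]
predicted_values = [[0.0, 0.0],
                    [0.0, 0.0]]

error_l1 = l1_error(real_values, predicted_values)
error_l2 = l2_error(real_values, predicted_values)
```

Because the L2 error squares the deviations, it penalizes a few large misallocations more heavily than many small ones, which matters for sparse reconstructions that concentrate the edge values on few dyads.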

5.2.1 Full Network

Method   overall L1   overall L2   average L1   SE   average L2   SE
IPFP
GRAVITY 47,971.710 211.125 263.581 68.321 14.886 4.843
DC-GRAVITY
LASSO
H-FIT
H-ER
MINDENS
Table 2: Evaluation of the reconstructed valued full MT 103 networks, Method in the first column. Aggregated L1 and L2 errors in columns two and three as well as average errors and their standard errors over time in the last four columns. Minimal values in bold.
Source: SWIFT BI Watch.

In Table 2, it can be seen that the two dense reconstruction models IPFP and GRAVITY give the best reconstruction evaluated with the L1 and L2 errors, with the GRAVITY model being slightly ahead. The third-best prediction quality is delivered by the DC-GRAVITY model. It can be inferred that the risk of guessing the wrong edges to be zero or one (and placing a high weight or no weight on the false edges) strongly outweighs the seeming disadvantage of the dense reconstruction methods. This effect is pronounced in the MINDENS model and even more so in the LASSO model, which comes with extremely high errors. However, the hierarchical fitness model (H-FIT), one of the best models for binary network reconstruction, is also found to provide edge value predictions that are far worse than those of the GRAVITY solution.

5.2.2 Reduced Network

Method   overall L1   overall L2   average L1   SE   average L2   SE
IPFP 54.409
GRAVITY 38,646.430 203.317 212.343 54.268 14.321 4.709
DC-GRAVITY
TOMOGRAVITY
LASSO
H-FIT
H-ER
DC-GRAVITY-GDP
DC-GRAVITY-LAG
IPFP-GDP 38,231.110 214.741 210.061 15.018
IPFP-LAG 4,137.484 34.178 22.859 11.523 2.128 1.391
Table 3: Evaluation of the reconstructed valued reduced MT 103 networks, Method in the first column. Aggregated L1 and L2 errors in columns two and three as well as average errors and their standard errors over time in the last four columns. The last four rows give models with exogenous information included. Minimal values in bold.
Source: SWIFT BI Watch.

In the reduced network, we conclude that IPFP, the GRAVITY model and the TOMOGRAVITY model result in very similar aggregated L1 and L2 errors. These models are closely followed by the DC-GRAVITY-GDP model. The hierarchical models (H-FIT, H-ER) perform comparably and far better than the LASSO.

Models that include exogenous information are separated and given in the last four rows of Table 3. Among these models, the second-best result is given by the IPFP-GDP model, showing that the GDP values provide useful information that improves the quality of the edge value reconstruction, for example, relative to IPFP or the GRAVITY model. The IPFP model that incorporates lagged edge values (IPFP-LAG) as covariates performs outstandingly well. But again, it might be unrealistic to assume the availability of lagged data points. Interestingly, both density-corrected gravity models with exogenous covariates (DC-GRAVITY-GDP, DC-GRAVITY-LAG) are only slightly better than the DC-GRAVITY model. This can be explained by the fact that the exogenous information is only used to determine the edge probabilities.

6 Discussion

In this paper, we have compared different models for network reconstruction using the SWIFT MT 103 networks. The models are compared along different dimensions, including the accuracy of edge prediction, degree reconstruction, and edge value estimation. Overall, four conclusions can be drawn from this competitive comparison.

(i) The task of reconstructing edge values differs fundamentally from the task of estimating edge probabilities. Technically, this is very intuitive because the marginals give exclusively information about the edge values and all approaches that output edge probabilities are necessarily dependent on further restrictions (the real density).

Even if the true density is assumed to be known, no model emerged that achieves outstanding predictions of both the edge probabilities and the edge values. This conclusion is also in line with the findings of the extensive comparison by Anand et al. (2018). We therefore recommend that the model choice be governed by the specific use case and by the importance attached to either reconstruction task. If the binary structure is of interest and the network is presumed to be sparse, the hierarchical fitness model (H-FIT) and the density-corrected gravity model (DC-GRAVITY) are good choices. While Anand et al. (2018) highlight the ability of the minimum density method (MINDENS) to detect absent edges, we must supplement this by noting that the method nevertheless does not perform well if interest lies in detecting present edges.

Regarding the quality of the edge value prediction, the maximum entropy models (IPFP and GRAVITY) work very well in both sparse and dense networks. The same was found by Anand et al. (2018), pointing to the good quality of maximum entropy solutions. However, in contrast to their findings, we highlight more clearly the potential shortfalls of sparse reconstruction methods for the prediction of edge values.

(ii) Unlike Anand et al. (2018), we do not find that the preferred models change when either a dense or a sparse network is to be reconstructed. However, this statement must be taken with care since in our analysis the dense network is, in fact, a subset of the sparse one.

(iii) Including exogenous information can help to improve both the binary and the valued network reconstruction and partly leads to dramatic increases in predictive performance. However, this increase in predictive accuracy is not guaranteed. If variables with a low association to the unknown edge values are chosen, the quality of the reconstruction might even decline (see also Lebacher and Kauermann, 2019). Especially regarding the binary network reconstruction, the inclusion of GDP led to mixed results.

(iv) As an “off the shelf” model in situations without exogenous information available, the density-corrected gravity model (DC-GRAVITY) can be recommended because it is found to work well on the big sparse network as well as on the small dense network with respect to the edge probabilities and the edge values. A similar conclusion can be found in Anand et al. (2018, p. 116), stating that among the probabilistic methods the model is the “clear winner across all measures of interest”. Similarly, Gandy and Veraart (2019) report that this model is performing very well in binary and valued reconstruction. Further, the model can be extended towards the inclusion of exogenous information in a simple way.

For further research, it seems necessary to compare the performance of edge probabilities when using calibration densities that differ from the real one. Another important research question relates to the ability of reconstruction models to provide uncertainty quantification. Many of the approaches introduced above result in network ensembles or come with an associated stochastic structure that can be used to construct prediction intervals.

Acknowledgement

We would like to thank Peter Ware and Nancy Murphy for providing the SWIFT data.

Declaration of Interest

The project was supported by the European Cooperation in Science and Technology [COST Action CA15109 (COSTNET)]. We also gratefully acknowledge funding provided by the German Research Foundation (DFG) for the project KA 1188/10-1: International Trade of Arms: A Network Approach.

References

  • Airoldi and Blocker (2013) Airoldi, E. M. and A. W. Blocker (2013): “Estimating latent processes on a network from indirect measurements,” Journal of the American Statistical Association, 108, 149–164.
  • Albert and Barabási (2002) Albert, R. and A.-L. Barabási (2002): “Statistical mechanics of complex networks,” Rev. Mod. Phys., 74, 47–97.
  • Anand et al. (2015) Anand, K., B. Craig, and G. Von Peter (2015): “Filling in the blanks: Network structure and interbank contagion,” Quantitative Finance, 15, 625–636.
  • Anand et al. (2018) Anand, K., I. van Lelyveld, Ádám Banai, S. Friedrich, R. Garratt, G. Hałaj, J. Fique, I. Hansen, S. M. Jaramillo, H. Lee, J. L. Molina-Borboa, S. Nobili, S. Rajan, D. Salakhova, T. C. Silva, L. Silvestri, and S. R. S. de Souza (2018): “The missing links: A global study on uncovering financial network structures from partial data,” Journal of Financial Stability, 35, 107 – 119.
  • Baek et al. (2014) Baek, S., K. Soramaki, and J. Yoon (2014): “Network indicators for monitoring intraday liquidity in bok-wire+,” Bank of Korea Working Paper, 1.
  • Barabási and Albert (1999) Barabási, A.-L. and R. Albert (1999): “Emergence of Scaling in Random Networks,” Science, 286, 509–512.
  • Bardoscia et al. (2017) Bardoscia, M., S. Battiston, F. Caccioli, and G. Caldarelli (2017): “Pathways towards instability in financial networks,” Nature Communications, 8, 14416.
  • Bargigli (2014) Bargigli, L. (2014): “Statistical ensembles for economic networks,” Journal of Statistical Physics, 155, 810–825.
  • Battiston et al. (2016) Battiston, S., J. D. Farmer, A. Flache, D. Garlaschelli, A. G. Haldane, H. Heesterbeek, C. Hommes, C. Jaeger, R. May, and M. Scheffer (2016): “Complexity theory and financial regulation,” Science, 351, 818–819.
  • Battiston et al. (2012) Battiston, S., M. Puliga, R. Kaushik, P. Tasca, and G. Caldarelli (2012): “DebtRank: Too central to fail? Financial networks, the Fed and systemic risk,” Scientific reports, 2, 541.
  • Billio et al. (2012) Billio, M., M. Getmansky, A. W. Lo, and L. Pelizzon (2012): “Econometric measures of connectedness and systemic risk in the finance and insurance sectors,” Journal of Financial Economics, 104, 535–559.
  • Bishop et al. (1975) Bishop, Y. M., P. W. Holland, and S. E. Fienberg (1975): Discrete multivariate analysis: theory and practice, Cambridge: MIT Press.
  • Blocker et al. (2014) Blocker, A. W., P. Koullick, and E. Airoldi (2014): “networkTomography: Tools for network tomography,” R package version 0.3.
  • Caccioli et al. (2018) Caccioli, F., P. Barucca, and T. Kobayashi (2018): “Network models of financial systemic risk: A review,” Journal of Computational Social Science, 1, 81–114.
  • Castro et al. (2004) Castro, R., M. Coates, G. Liang, R. Nowak, and B. Yu (2004): “Network tomography: Recent developments,” Statistical Science, 19, 499–517.
  • Chen et al. (2017) Chen, L., X. Ma, G. Pan, J. Jakubowicz, et al. (2017): “Understanding bike trip patterns leveraging bike sharing system open data,” Frontiers of computer science, 11, 38–48.
  • Chinazzi et al. (2013) Chinazzi, M., G. Fagiolo, J. A. Reyes, and S. Schiavo (2013): “Post-mortem examination of the international financial network,” Journal of Economic Dynamics and Control, 37, 1692–1713.
  • Cimini et al. (2015) Cimini, G., T. Squartini, D. Garlaschelli, and A. Gabrielli (2015): “Systemic risk analysis on reconstructed economic and financial networks,” Scientific reports, 5, 15758.
  • Cook and Soramaki (2014) Cook, S. and K. Soramaki (2014): “The global network of payment flows,” SWIFT Institute Working Paper.
  • Deming and Stephan (1940) Deming, W. E. and F. F. Stephan (1940): “On a least squares adjustment of a sampled frequency table when the expected marginal totals are known,” The Annals of Mathematical Statistics, 11, 427–444.
  • Disdier and Head (2008) Disdier, A.-C. and K. Head (2008): “The puzzling persistence of the distance effect on bilateral trade,” The Review of Economics and Statistics, 90, 37–48.
  • Elsinger et al. (2013) Elsinger, H., A. Lehar, and M. Summer (2013): “Network models and systemic risk assessment,” in Handbook on Systemic Risk, ed. by J.-P. Fouque and J. A. Langsam, Cambridge: Cambridge University Press, chap. IV, 287–305.
  • Erdös and Rényi (1959) Erdös, P. and A. Rényi (1959): “On Random Graphs I,” Publicationes Mathematicae Debrecen, 6, 290.
  • Fienberg et al. (1970) Fienberg, S. E. et al. (1970): “An iterative procedure for estimation in contingency tables,” The Annals of Mathematical Statistics, 41, 907–917.
  • Friedman et al. (2009) Friedman, J., T. Hastie, and R. Tibshirani (2009): “glmnet: LASSO and elastic-net regularized generalized linear models,” R package version 2.0-1.6.
  • Gai and Kapadia (2010) Gai, P. and S. Kapadia (2010): “Contagion in financial networks,” Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 466, 2401–2423.
  • Gandy and Veraart (2017) Gandy, A. and L. A. Veraart (2017): “A Bayesian methodology for systemic risk assessment in financial networks,” Management Science, 63, 4428–4446.
  • Gandy and Veraart (2019) Gandy, A. and L. A. M. Veraart (2019): “Adjustable network reconstruction with applications to CDS exposures,” Journal of Multivariate Analysis, 172, 193 – 209.
  • Grau et al. (2015) Grau, J., I. Grosse, and J. Keilwagen (2015): “PRROC: computing and visualizing precision-recall and receiver operating characteristic curves in R,” Bioinformatics, 31, 2595–2597.
  • Haberman (1978) Haberman, S. J. (1978): Analysis of qualitative data 1: Introductory topics, New York: Academic Press.
  • Haberman (1979) ——— (1979): Analysis of qualitative data 2: New Developments, New York: Academic Press.
  • Head and Mayer (2014) Head, K. and T. Mayer (2014): “Gravity equations: Workhorse, toolkit, and cookbook,” in Handbook of international economics, ed. by G. Gopinath, E. Helpman, and K. Rogoff, Amsterdam: Elsevier Science Publishing, vol. 4, 131–195.
  • Imakubo et al. (2010) Imakubo, K., Y. Soejima, et al. (2010): “The transaction network in Japan’s interbank money markets,” Monetary and Economic Studies, 28, 107–150.
  • Kauê Dal’Maso Peron et al. (2012) Kauê Dal’Maso Peron, T., L. da Fontoura Costa, and F. A. Rodrigues (2012): “The structure and resilience of financial market networks,” Chaos: An Interdisciplinary Journal of Nonlinear Science, 22, 013117.
  • Kolaczyk (2009) Kolaczyk, E. D. (2009): Statistical analysis of network data. Methods and Models, New York: Springer.
  • Lebacher and Kauermann (2019) Lebacher, M. and G. Kauermann (2019): “Regression-based network reconstruction with nodal and dyadic covariates and random effects,” arXiv preprint arXiv:1903.11886.
  • Liu et al. (2015) Liu, Z., S. Quiet, and B. Roth (2015): “Banking sector interconnectedness: what is it, how can we measure it and why does it matter?” Bank of England Quarterly Bulletin, Q2.
  • Markose et al. (2012) Markose, S., S. Giansante, and A. R. Shaghaghi (2012): “Too interconnected to fail - financial network of US CDS market: Topological fragility and systemic risk,” Journal of Economic Behavior & Organization, 83, 627–646.
  • Mastrandrea et al. (2014) Mastrandrea, R., T. Squartini, G. Fagiolo, and D. Garlaschelli (2014): “Enhanced reconstruction of weighted networks from strengths and degrees,” New Journal of Physics, 16, 043022.
  • Murphy (1973) Murphy, A. H. (1973): “A new vector partition of the probability score,” Journal of applied Meteorology, 12, 595–600.
  • Nie et al. (2017) Nie, L., D. Jiang, and Z. Lv (2017): “Modeling network traffic for traffic matrix estimation and anomaly detection based on Bayesian network in cloud computing networks,” Annals of Telecommunications, 72, 297–305.
  • Schweitzer et al. (2009) Schweitzer, F., G. Fagiolo, D. Sornette, F. Vega-Redondo, A. Vespignani, and D. R. White (2009): “Economic networks: The new challenges,” Science, 325, 422–425.
  • Scrucca (2013) Scrucca, L. (2013): “GA: A package for genetic algorithms in R,” Journal of Statistical Software, 53, 1–37.
  • Sheldon and Maurer (1998) Sheldon, G. and M. Maurer (1998): “Interbank lending and systemic risk: An empirical analysis for Switzerland,” Swiss Journal of Economics and Statistics, 134, 685–704.
  • Siegert (2017) Siegert, S. (2017): “Simplifying and generalising Murphy’s Brier score decomposition,” Quarterly Journal of the Royal Meteorological Society, 143, 1178–1183.
  • Soramäki et al. (2007) Soramäki, K., M. L. Bech, J. Arnold, R. J. Glass, and W. E. Beyeler (2007): “The topology of interbank payment flows,” Physica A: Statistical Mechanics and its Applications, 379, 317–333.
  • Soramäki and Cook (2013) Soramäki, K. and S. Cook (2013): “SinkRank: An algorithm for identifying systemically important banks in payment systems,” Economics: The Open-Access, Open-Assessment E-Journal, 7, 1–27.
  • Squartini et al. (2018) Squartini, T., G. Caldarelli, G. Cimini, A. Gabrielli, and D. Garlaschelli (2018): “Reconstruction methods for networks: The case of economic and financial systems,” Physics Reports, 757, 1–47.
  • Thurner and Poledna (2013) Thurner, S. and S. Poledna (2013): “DebtRank-transparency: Controlling systemic risk in financial networks,” Scientific reports, 3, 1888.
  • Tibshirani (1996) Tibshirani, R. (1996): “Regression shrinkage and selection via the LASSO,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 58, 267–288.
  • Upper (2011) Upper, C. (2011): “Simulation methods to assess the danger of contagion in interbank markets,” Journal of Financial Stability, 7, 111–125.
  • Vardi (1996) Vardi, Y. (1996): “Network tomography: Estimating source-destination traffic intensities from link data,” Journal of the American Statistical Association, 91, 365–377.
  • Zhang et al. (2003a) Zhang, Y., M. Roughan, N. Duffield, and A. Greenberg (2003a): “Fast accurate computation of large-scale IP traffic matrices from link loads,” in ACM SIGMETRICS Performance Evaluation Review, ACM, vol. 31, 206–217.
  • Zhang et al. (2003b) Zhang, Y., M. Roughan, C. Lund, and D. Donoho (2003b): “An information-theoretic approach to traffic matrix estimation,” in Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, Association for Computing Machinery, 301–312.
  • Zhou et al. (2016) Zhou, H., L. Tan, Q. Zeng, and C. Wu (2016): “Traffic matrix estimation: A neural network approach with extended input and expectation maximization iteration,” Journal of Network and Computer Applications, 60, 220–232.

Appendix A Countries included

ISO name reduced ISO name reduced ISO name reduced
AD Andorra 0 GQ Equatorial Guinea 0 PE Peru 0
AE United Arab Emirates 1 GR Greece 1 PF French Polynesia 0
AG Antigua & Barbuda 0 GT Guatemala 0 PG Papua New Guinea 0
AI Anguilla 0 GY Guyana 0 PH Philippines 1
AL Albania 0 HK Hong Kong SAR China 1 PK Pakistan 0
AM Armenia 0 HN Honduras 0 PL Poland 1
AO Angola 0 HR Croatia 1 PR Puerto Rico 0
AR Argentina 0 HT Haiti 0 PS Palestinian Territories 0
AT Austria 1 HU Hungary 1 PT Portugal 1
AU Australia 1 ID Indonesia 1 PY Paraguay 0
AW Aruba 0 IE Ireland 1 QA Qatar 0
AZ Azerbaijan 0 IL Israel 1 RE Réunion 0
BA Bosnia & Herzegovina 0 IM Isle of Man 0 RO Romania 1
BB Barbados 0 IMI Internat. Market Infrastrct. 0 RU Russia 1
BD Bangladesh 0 IN India 1 RW Rwanda 0
BE Belgium 1 IR Iran 0 SA Saudi Arabia 1
BF Burkina Faso 0 IS Iceland 0 SB Solomon Islands 0
BG Bulgaria 1 IT Italy 1 SC Seychelles 0
BH Bahrain 0 JE Jersey 0 SD Sudan 0
BI Burundi 0 JM Jamaica 0 SE Sweden 1
BJ Benin 0 JO Jordan 0 SG Singapore 1
BM Bermuda 0 JP Japan 1 SI Slovenia 1
BN Brunei 0 KE Kenya 0 SK Slovakia 1
BO Bolivia 0 KG Kyrgyzstan 0 SL Sierra Leone 0
BR Brazil 1 KH Cambodia 0 SM San Marino 0
BS Bahamas 0 KN St. Kitts & Nevis 0 SN Senegal 0
BW Botswana 0 KR South Korea 1 SR Suriname 0
BY Belarus 1 KW Kuwait 1 SV El Salvador 0
BZ Belize 0 KY Cayman Islands 0 SY Syria 0
CA Canada 1 KZ Kazakhstan 1 TC Turks & Caicos Islands 0
CF Central African Republic 0 LA Laos 0 TG Togo 0
CH Switzerland 1 LB Lebanon 0 TH Thailand 1
CI Côte d’Ivoire 0 LC St. Lucia 0 TJ Tajikistan 0
CL Chile 0 LI Liechtenstein 0 TL Timor-Leste 0
CM Cameroon 0 LK Sri Lanka 0 TM Turkmenistan 0
CN China 1 LS Lesotho 0 TN Tunisia 0
CO Colombia 0 LT Lithuania 1 TO Tonga 0
CR Costa Rica 0 LU Luxembourg 1 TR Turkey 1
CU Cuba 0 LV Latvia 1 TT Trinidad & Tobago 0
CV Cape Verde 0 LY Libya 0 TW Taiwan 1
CY Cyprus 1 MA Morocco 0 TZ Tanzania 0
CZ Czechia 1 MC Monaco 0 UA Ukraine 1
DE Germany 1 MD Moldova 0 UG Uganda 0
DJ Djibouti 0 MG Madagascar 0 US United States 1
DK Denmark 1 MK Macedonia 0 UY Uruguay 0
DM Dominica 0 ML Mali 0 UZ Uzbekistan 0
DO Dominican Republic 0 MN Mongolia 0 VC St. Vincent & Grenadines 0
DZ Algeria 0 MO Macau SAR China 0 VE Venezuela 0
EC Ecuador 0 MR Mauritania 0 VG British Virgin Islands 0
EE Estonia 1 MS Montserrat 0 VI U.S. Virgin Islands 0
EG Egypt 0 MT Malta 0 VN Vietnam 1
ES Spain 1 MU Mauritius 0 VU Vanuatu 0
ET Ethiopia 0 MV Maldives 0 WS Samoa 0
FI Finland 1 MW Malawi 0 YE Yemen 0
FJ Fiji 0 MX Mexico 1 YT Mayotte 0
FO Faroe Islands 0 MY Malaysia 1 ZA South Africa 1
FR France 1 MZ Mozambique 0 ZM Zambia 0
GA Gabon 0 NA Namibia 0 ZW Zimbabwe 0
GB United Kingdom 1 NC New Caledonia 0 GF French Guiana 0
GD Grenada 0 NE Niger 0 KI Kiribati 0
GE Georgia 0 NG Nigeria 1 CD Congo - Kinshasa 0
GG Guernsey 0 NI Nicaragua 0 CG Congo - Brazzaville 0
GH Ghana 0 NL Netherlands 1 MQ Martinique 0
GI Gibraltar 0 NO Norway 1 SZ Swaziland 0
GL Greenland 0 NP Nepal 0 CK Cook Islands 0
GM Gambia 0 NZ New Zealand 1 VA Holy See 0
GN Guinea 0 OM Oman 0 TD Chad 0
GP Guadeloupe 0 PA Panama 1
Table 4: Countries included in the analysis, with ISO 2 country code (ISO), name of the country (name) and an indicator of membership in the reduced MT 103 network (reduced = 1 if included, 0 otherwise).
Source: SWIFT BI Watch.

Appendix B Descriptives for the reduced data set

Figure 10: Summary statistics for the reduced MT 103 network as monthly time series. Density of the network (left), share of non-zero marginals (middle) and cumulative edge values (right).
Source: SWIFT BI Watch.
Figure 11: Binary network topology of the reduced MT 103 network, aggregated over all time points. Cumulative indegree (left) and outdegree (middle) distributions, with maximum and minimum values indicated in grey. Outdegree against indegree (right) for all months, shown as dots with colour intensity indicating frequency; the 45-degree line is drawn in solid black.
Source: SWIFT BI Watch.
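
For readers who want to reproduce descriptives of this kind on their own data, the quantities named in Figures 10 and 11 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name summary_statistics and the toy matrix are hypothetical, and the definitions used here (density as the share of realised directed edges among the n(n-1) possible ones, self-loops ignored, marginals taken as row and column sums) are assumptions that may differ in detail from those in the main text.

```python
def summary_statistics(X):
    """Descriptives of a weighted directed network given as an n x n nested list.

    Assumptions (not taken from the paper): self-loops are ignored, and an
    edge i -> j exists whenever the edge value X[i][j] is strictly positive.
    """
    n = len(X)
    # Edge values with the diagonal set to zero.
    W = [[X[i][j] if i != j else 0 for j in range(n)] for i in range(n)]
    # Binary adjacency matrix.
    A = [[1 if W[i][j] > 0 else 0 for j in range(n)] for i in range(n)]

    row_sums = [sum(W[i]) for i in range(n)]                        # outgoing marginals
    col_sums = [sum(W[i][j] for i in range(n)) for j in range(n)]   # incoming marginals
    outdegree = [sum(A[i]) for i in range(n)]
    indegree = [sum(A[i][j] for i in range(n)) for j in range(n)]

    nonzero = sum(s > 0 for s in row_sums) + sum(s > 0 for s in col_sums)
    return {
        "density": sum(outdegree) / (n * (n - 1)),
        "share_nonzero_marginals": nonzero / (2 * n),
        "total_edge_value": sum(row_sums),
        "outdegree": outdegree,
        "indegree": indegree,
    }


# Toy network with three nodes and two edges (values are made up).
X = [[0, 5, 0],
     [2, 0, 0],
     [0, 0, 0]]
stats = summary_statistics(X)
```

Applied month by month, the first three entries would give time series comparable in spirit to Figure 10, and the degree vectors would feed the distributions shown in Figure 11.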

Appendix C Binary Reconstruction: Full Network

C.1 Predicted Adjacency Matrices: Full Network

Figure 12: Adjacency matrices, representing the full MT 103 network in February 2018. IPFP reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 13: Adjacency matrices, representing the full MT 103 network in February 2018. GRAVITY reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 14: Adjacency matrices, representing the full MT 103 network in February 2018. DC-GRAVITY reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 15: Adjacency matrices, representing the full MT 103 network in February 2018. H-ER reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 16: Adjacency matrices, representing the full MT 103 network in February 2018. H-FIT reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 17: Adjacency matrices, representing the full MT 103 network in February 2018. MINDENS reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 18: Adjacency matrices, representing the full MT 103 network in February 2018. LASSO reconstruction (left), real network (right).
Source: SWIFT BI Watch.

C.2 Degree Reconstruction: Full Network

Figure 19: Degree Reconstruction in the full MT 103 network in February 2018. IPFP reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 20: Degree Reconstruction in the full MT 103 network in February 2018. GRAVITY reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 21: Degree Reconstruction in the full MT 103 network in February 2018. DC-GRAVITY reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 22: Degree Reconstruction in the full MT 103 network in February 2018. H-ER reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 23: Degree Reconstruction in the full MT 103 network in February 2018. H-FIT reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 24: Degree Reconstruction in the full MT 103 network in February 2018. MINDENS reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 25: Degree Reconstruction in the full MT 103 network in February 2018. LASSO reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.

Appendix D Binary Reconstruction: Reduced Network

D.1 Predicted Adjacency Matrices: Reduced Network

Figure 26: Adjacency matrices, representing the reduced MT 103 network in February 2018. IPFP reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 27: Adjacency matrices, representing the reduced MT 103 network in February 2018. GRAVITY reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 28: Adjacency matrices, representing the reduced MT 103 network in February 2018. DC-GRAVITY reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 29: Adjacency matrices, representing the reduced MT 103 network in February 2018. DC-GRAVITY-GDP reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 30: Adjacency matrices, representing the reduced MT 103 network in February 2018. DC-GRAVITY-LAG reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 31: Adjacency matrices, representing the reduced MT 103 network in February 2018. H-ER reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 32: Adjacency matrices, representing the reduced MT 103 network in February 2018. H-FIT reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 33: Adjacency matrices, representing the reduced MT 103 network in February 2018. LASSO reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 34: Adjacency matrices, representing the reduced MT 103 network in February 2018. IPFP-GDP reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 35: Adjacency matrices, representing the reduced MT 103 network in February 2018. IPFP-LAG reconstruction (left), real network (right).
Source: SWIFT BI Watch.
Figure 36: Adjacency matrices, representing the reduced MT 103 network in February 2018. TOMOGRAVITY reconstruction (left), real network (right).
Source: SWIFT BI Watch.

D.2 Degree Reconstruction: Reduced Network

Figure 37: Degree Reconstruction in the reduced MT 103 network in February 2018. IPFP reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 38: Degree Reconstruction in the reduced MT 103 network in February 2018. GRAVITY reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 39: Degree Reconstruction in the reduced MT 103 network in February 2018. DC-GRAVITY reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 40: Degree Reconstruction in the reduced MT 103 network in February 2018. DC-GRAVITY-GDP reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 41: Degree Reconstruction in the reduced MT 103 network in February 2018. DC-GRAVITY-LAG reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 42: Degree Reconstruction in the reduced MT 103 network in February 2018. H-ER reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 43: Degree Reconstruction in the reduced MT 103 network in February 2018. H-FIT reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 44: Degree Reconstruction in the reduced MT 103 network in February 2018. LASSO reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 45: Degree Reconstruction in the reduced MT 103 network in February 2018. IPFP-GDP reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 46: Degree Reconstruction in the reduced MT 103 network in February 2018. IPFP-LAG reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.
Figure 47: Degree Reconstruction in the reduced MT 103 network in February 2018. TOMOGRAVITY reconstruction of the outdegree (left), outdegree (middle) and in- and outdegree (right).
Source: SWIFT BI Watch.