International trade research addresses important questions about the drivers and effects of international trade in goods and services, as well as the design and implications of trade policy, regional integration and the global trading system. It provides vital information for trade and economic policy making and, at the same time, sheds light on wider issues relating to poverty, development, migration, productivity and the global economy. Traditionally, international trade research has emphasized theory (Sen, 2010). An empirical shift in the discipline did not occur until the beginning of this century (Davis and Weinstein, 2001).
With recent advances in information technology, the United Nations (UN) system, the World Bank, and the International Monetary Fund, among others, collect, produce and distribute an enormous amount of internationally comparable trade data over time, providing a gold mine for empirical analysis of international trade. These dynamic network data provide a wide variety of information (e.g. the patterns of interactions, the evolution of relative importance, and the natural grouping of actors in the network) that can be extracted to understand many aspects of international trade. Smith and White (1992) measure the structure of the world economic system and identify the roles that particular countries play in the global division of labor by applying block modeling (Lorrain and White, 1971) to international commodity trade flows at three time points (1965, 1970 and 1980). Kim and Shin (2002) study globalization and regionalization in international trade by calculating network-related quantities, such as in-degree, out-degree, centralization and block densities, during three consecutive periods. Mahutga (2006)
examines structural equivalence in international trade by applying correspondence analysis (one of a family of techniques based on the singular value decomposition) to the equivalence matrix that summarizes the degree of regular equivalence for each pair of countries in the network data. See also Hafner-Burton et al. (2009) for a survey of network analysis in international relations. The econometric tools necessary for the empirical analysis of such data, which reflect pairwise interactions between economic agents, are still in their infancy. Moreover, much of the existing empirical trade literature is concerned with patterns of international trade at a single point in time. This focus of empirical work stands in marked contrast with the theoretical literature on growth and trade, which is dynamic and evolves over time.
In this paper, we propose an empirical framework for analyzing the evolution of patterns of international trade over time. We model the trade flow data as a time series of square matrices that describe pairwise relationships among a set of entities. Specifically, trade data among $n$ countries over a period of time can be represented as a matrix-variate time series $\{X_t\}_{t=1}^{T}$, where $X_t$ is an $n \times n$ matrix and each element $x_{ij,t}$ is the directed volume of trade from country $i$ to country $j$ at time $t$. The $i$-th row represents data for which country $i$ is the exporter and the $j$-th column represents data for which country $j$ is the importer. We explore the underlying latent lower-dimensional structure of the dynamic network by using variations of the matrix factor model (Wang et al., 2018). The latent networks and their connection to the surface networks provide a clear view of the evolution of international trade over three decades. The resulting lower-dimensional representation of the dynamic network can be used for second-step analyses such as prediction of matrix time series.
Researchers have studied dynamic network/relational data analysis from various aspects. Snijders and colleagues (Snijders, 2001; Huisman and Snijders, 2003; Snijders, 2005, 2006; Snijders et al., 2007, 2010a, 2010b) developed an actor-driven, or actor-oriented, model for network evolution that incorporates individual level attributes. The change of network structure is the result of the economic rational choice of social actors (selection) and the characteristics of others to whom they are tied (influence). They apply the analysis to an evolving friendship network and the focus is link evolution between friends. Hanneke et al. (2010) and Krivitsky and Handcock (2014)
introduced a class of temporal exponential random graph models for longitudinal network data (i.e. the networks are observed in panels). They model the formation and dissolution of edges in a separable fashion, assuming an exponential family model for the transition probability from the network at one time point to the network at the next. Westveld and Hoff (2011) represent the network and temporal dependencies with a random effects model, resulting in a stochastic process defined by a set of stationary covariance matrices. Xing et al. (2010) extend an earlier work on a mixed membership stochastic block model for static networks (Airoldi et al., 2008)
to the dynamic scenario by using a state-space model where the mixed membership is characterized through the observation function and the dynamics of the latent 'tomographic' states are defined by the state function. Estimation is based on the maximum likelihood principle using a variational EM algorithm. These methods focus on the connectivity of the nodes, that is, the 0-1 status of the links rather than their weights. The methods are derived from random graph theory and model the relational data at the relation (edge) or entity (node) level, and thus are often confronted with computational challenges, over-parameterization, and over-fitting issues.
In contrast to the existing research in dynamic network analysis, the approach we propose focuses more on the edges (traffic flows) of the network and their dynamic properties. The nodes are characterized by the flows to and from other nodes. Specifically, the traffic flows in a network are represented as a time series of matrix observations (the relational matrices) instead of the traditional nodes-and-edges characterization. The structure of a matrix preserves the pairwise relationships and the sequence of matrices preserves the dynamic property of such relationships. We adopt a matrix factor model where the observed surface dynamic network is assumed to be driven by a latent dynamic network of lower dimension. The linear relationship between the surface network and the latent network is characterized by unknown but deterministic loading matrices. The latent network and the corresponding loadings are estimated via an eigenanalysis of a positive definite matrix constructed from the auto-cross-moments of the network time series, thus capturing the dynamics present in the network. Since the dimension of the latent network is typically small, or at least much smaller than that of the surface network, the proposed model often yields a concise description of the whole network series, achieving the objective of dimension reduction. The resulting latent network of much smaller dimension can also be used for downstream microscopic analysis of the dynamic network.
In contrast to Xing et al. (2010), who summarize the relational data by the relationships between a small number of groups, we impose neither distributional assumptions on the underlying network nor parametric forms on its moment function. The latent network is learned directly from the data with little subjective input. The meaning of the nodes of the latent network in our model is automatically learned from the data and is not confined to the 'groups' to which the actors belong, which provides a more flexible interpretation of the data. Additionally, our modeling framework is flexible and extendable: using a matrix factor model framework, it can accommodate continuous and ordinal relational data. It can be extended to incorporate prior information on the network structure or to include exogenous and endogenous covariates as explanatory variables of the relationships. Although the focus of the analysis in this paper is to estimate the latent lower-dimensional network underlying the surface network, the idea of modeling a dynamic network as a time series of relational matrices is simple, yet quite general. Autoregressive models for matrix-variate time series (Chen et al., 2018) can be included under this framework to model the dynamics of latent matrix factors and to provide predictions of the network flows.
The remainder of the paper is organized as follows. In Section 2, we describe the international trade flow data from 1982 to 2015 and present some exploratory analysis results. In Section 3, we introduce two factor models for network time series data and discuss their interpretations. In Section 4, we present the estimation procedure and the properties of the estimators. In Section 5, we apply the proposed factor models to the international trade data described in Section 2. In Section 6, we summarize the paper and present future research directions.
2 International Trade Flow Data and Exploratory Analysis
2.1 Trade Flow Data
In the following analysis, we use monthly multilateral import and export volumes of commodity goods among 24 countries and regions over the 1982 – 2015 period. The data come from the International Monetary Fund (IMF) Direction of Trade Statistics (DOTS) (IMF, 2017), which provides monthly data on countries’ exports and imports by their partners. The source has been widely used in international trade analysis such as the Bloomberg Trade Flow. Even though IMF-DOTS provides data from January, 1948 containing 236 countries, the quality of data varies across time and countries. Some countries failed to report their volumes of trade in some or all years. Most of these missing cases are concentrated in small and underdeveloped countries or current or former Communist countries. In this study, we restrict the sample to 24 countries and regions from three major trading groups, namely NAFTA, EU and APEC, over a 408-month period from January, 1982 to December, 2015. The countries and regions used in alphabetic order are Australia, Canada, China Mainland, Denmark, Finland, France, Germany, Hong Kong, Indonesia, Ireland, Italy, Japan, Korea, Malaysia, Mexico, Netherlands, New Zealand, Singapore, Spain, Sweden, Taiwan, Thailand, United Kingdom, and United States.
We use the import CIF data for all goods, denominated in U.S. dollars, since it is generally believed that they are more accurate than the export data (Durand, 1953; Linnemann, 1966). This is especially true when we are interested in tracing countries of production and consumption rather than countries of consignment or of purchase and sale (Linnemann, 1966). The figures for exports are determined by imputing them from imports. For example, Canada's export volume to France is determined as France's import volume from Canada. This calculation is done to make world total imports and exports equal. Note that trade data for Taiwan as a reporting region are not published in the IMF-DOTS. In this paper, import data for Taiwan are imputed from the export data reported by its partner countries. As Linnemann (1966) notes, in order to reduce the effect of incidental transactions of unusual size and of incidental difficulties in trade contracts, trade flows are measured as three-month averages rather than as direct observations of a particular month. For example, the trade flows in March, 2014 are the averages of those in February, March, and April of 2014.
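The two preprocessing steps described above (imputing exports from reported imports, and three-month centered averaging) can be sketched as follows. This is a minimal illustration, not the authors' code: the array layout, the function names, and the choice to drop the first and last month (which lack a neighbor) are all assumptions made for the example.

```python
import numpy as np

def flows_from_imports(imports):
    """Build exporter-by-importer flow matrices from reported imports.

    Assumed layout: imports[t, j, i] is country j's reported imports
    from country i in month t. The result X satisfies
    X[t, i, j] = imports[t, j, i], so rows index exporters and columns
    index importers. The diagonal is undefined and set to NaN.
    """
    X = np.transpose(np.asarray(imports, dtype=float), (0, 2, 1)).copy()
    idx = np.arange(X.shape[1])
    X[:, idx, idx] = np.nan
    return X

def three_month_average(X):
    """Centered three-month average along the time axis.

    Output entry s is the mean of months s, s+1 and s+2, i.e. it is
    centered on month s+1. Dropping the two end months is an assumption;
    the paper does not say how the end points are handled.
    """
    X = np.asarray(X, dtype=float)
    return (X[:-2] + X[1:-1] + X[2:]) / 3.0
```

For instance, if country 1 reports imports from country 0 of 3, 6, 9 and 12 in four consecutive months, the averaged flow from 0 to 1 is 6 for the second month and 9 for the third.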
2.2 Exploratory Analysis
The dynamic trading network can be cast into a time series of relational matrices that record the ties (trading volumes) between the nodes (countries) in the network. The length of our network matrix time series is 408 months. At each time point, the observation is a square matrix whose rows and columns represent the same set of countries. Each row (column) corresponds to an export (import) country. Each cell in the matrix contains the dollar trading volume that the exporting country exports to the importing country. The diagonal elements are undefined.
Figure 1 plots the time series of trading volumes in U.S. dollars among the top 13 countries in our dataset from January, 1982 to December, 2015. Each time series is normalized for ease of visualization. These 13 countries are representative of all countries and regions in our dataset. They fall into three major groups: Canada, Mexico, and United States compose the NAFTA group; France, Germany, Italy, Spain, and United Kingdom are in the EU group; Australia, China Mainland, Indonesia, Japan and Korea belong to the APEC group. Overall, all countries experienced rapid growth in trade along with the accelerating wave of globalization. The world saw the largest collapse in the value of goods traded in 2009, when the impact of the global financial crisis was at its worst. Some countries have not yet recovered. For example, Spain's imports have not recovered from the downturn, though its exports mostly have. While the upward trends are shared among all countries, the patterns of trading are more alike among countries within the same group. For example, the export time series of the five European countries resemble each other more than they resemble the export time series of the Asian countries.
In order to illustrate the pattern of bilateral relationships, a set of four circular trading plots are shown in Figure 2. The direction of flow is indicated by the arrowhead. The size of the flow is shown by the width of the arrow at its base. Numbers on the outer section axis, used to read the size of trading flows, are in billions. Each plot is based on the monthly flows over a one-year period, aggregated to selected annual volumes. Note that the four plots are representative of the bilateral relationship patterns in the 1980’s, 1990’s, 2000’s and 2010’s.
For the three groups (EU, NAFTA, and APEC), most of the trade flows occur within the same group. This phenomenon is most prominent within the EU group where the imports and exports are all in red shade that denotes EU countries in Figure 2. The trade flows of NAFTA countries are least confined within the group, mainly because the U.S. alone trades a lot with both EU and APEC countries.
For individual countries, the most noticeable changes are in the share and direction of trade of the U.S., China Mainland, Mexico and Japan. Over the years, the U.S. remains the most distinctive of all countries because of its large trading volumes and wide range of trading counterparties. The destinations of U.S. exports gradually shift from Japan and European countries to China Mainland and Mexico. In the 1980's Japan accounted for the largest importing and exporting flows among APEC countries. As shown clearly in Figure 2, China Mainland's slice of the pie in global trade grew steadily and became the largest in the 2010's. Mexico experienced a similar steady growth in global trade, although less prominent than that of China Mainland. Trading patterns are most stable among the EU countries, which keep almost the same shares of imports and exports over the years.
The exploratory statistical analysis and visualization tools provide clear and powerful, but only descriptive, observations. They suggest that a possibly lower-dimensional latent network underlies the large-scale dynamic network on the surface. However, there are few statistical tools available to quantify this latent structure. In the next section, we present a new methodology that quantifies the latent dynamic networks that underpin the observed surface dynamic networks, as well as the relationship that connects the latent networks and the surface networks.
3 Matrix Factor Models for Dynamic Transport Network
In this section, we propose a new general methodology for investigating the evolving structure of dynamic networks. Here we focus on the traffic flows in a dynamic network, such as an international import-export trade network, air-passenger volumes between cities, or the number of directional interactions among people. The networks under consideration are typically dense. We refer to such a dynamic network as a dynamic transport network. In the proposed framework, the bilateral relationships in the network at time $t$ are recorded in a relational matrix $X_t$ whose rows and columns correspond to the same set of actors in the network. The elements of $X_t$ record information on the ties between each pair of actors. The dynamic features of the network are characterized by the temporal dependencies among consecutive observations. Specifically, the entire dynamic network is modeled as a sequence of temporally dependent matrix-variate observations $\{X_t\}_{t=1}^{T}$. An important attribute of this modeling framework is that it captures both the network structure and the temporal dynamics at a high level without any distributional assumption, in contrast to the more common node-and-edge level modeling.
To formalize the methods, let $X_t$ represent the $n$ by $n$ relational matrix of observed pairwise asymmetrical relationships at time $t$, $t = 1, \ldots, T$. A general entry of $X_t$, denoted as $x_{ij,t}$, represents the directed relationship of actor $i$ to actor $j$. For example, in the international trade context $x_{ij,t}$ expresses the volume of trade flow from country $i$ to country $j$ at time $t$; in the transportation context $x_{ij,t}$ represents the volume, fare, or length of a trip from location $i$ to location $j$ starting at time $t$.
Our model for the dynamic transport network can be written as:
$$X_t = A F_t A^\top + E_t, \qquad (3.1)$$
where $A$ is an $n \times r$ (vertical) matrix of "loadings" of the $n$ actors on a relatively small number $r$ of components (we will call them "hubs"), $F_t$ is a small, usually asymmetric, $r$ by $r$ matrix giving the directional relationships among the latent hubs, and $E_t$ is simply a matrix of error terms. Since $X_t$ does not have diagonal elements, $E_t$ has a missing diagonal as well. The loading matrix $A$ relates the observed actors to the latent hubs and $F_t$ describes the dynamic interrelations among the hubs.
The interpretation of Model (3.1) can be demonstrated with the example of international trade. Model (3.1) describes basic factors underlying the pattern of international trade for a given set of countries. One can view the latent factors in Model (3.1) as hubs through which all export-import trading among the countries passes. Each country exports to the hubs in certain proportions (determined by the loading matrix $A$) and imports from the hubs in the same proportions. The hubs trade, on behalf of the participating countries, among themselves and also within themselves. The trading volumes among the hubs are reflected by the factor matrix $F_t$, which changes over time (dynamic). The $(k,l)$-th element $f_{kl,t}$ of $F_t$ reflects the export trading volume from hub $k$ to hub $l$ at time $t$. By examining the loading (distribution) of each country, it is often possible to 'label' the hubs, even though they are purely estimated from data rather than obtained by construction. For example, if a hub's imports are mainly contributed by major energy (such as oil and gas) producing countries, then it can be labeled as an energy hub. Or if a hub's contribution mainly comes from countries in a geographic region such as the Euro Zone, then it can be labeled as a Europe hub. Note that under Model (3.1),
$$x_{ij,t} = \sum_{k=1}^{r} \sum_{l=1}^{r} a_{ik} \, f_{kl,t} \, a_{jl} + e_{ij,t}.$$
Each term $a_{ik} f_{kl,t} a_{jl}$ combines the (export) contribution $a_{ik}$ of country $i$ to hub $k$ and the (import) contribution $a_{jl}$ of country $j$ to hub $l$ in the export activity $f_{kl,t}$ from hub $k$ to hub $l$. The total volume $x_{ij,t}$ is the sum of the exporting volumes from country $i$ to country $j$ through all the latent hubs.
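The hub-by-hub reading of Model (3.1) can be checked numerically. The sketch below assumes the signal part of the model is the product of a loading matrix `A`, a hub matrix `F_t`, and the transpose of `A`; the dimensions and random values are purely illustrative and are not taken from the trade data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 6, 2                      # illustrative: 6 countries, 2 hubs
A = rng.random((n, r))           # loadings of countries on hubs
F_t = rng.random((r, r))         # hub-to-hub flows at one time point
signal = A @ F_t @ A.T           # signal part of X_t under Model (3.1)

def flow_through_hubs(A, F_t, i, j):
    """Sum over all hub pairs (k, l) of a_ik * f_kl * a_jl."""
    r = A.shape[1]
    return sum(A[i, k] * F_t[k, l] * A[j, l]
               for k in range(r) for l in range(r))

# The (i, j) entry of the signal equals the total flow routed through hubs.
matches = np.isclose(flow_through_hubs(A, F_t, 1, 4), signal[1, 4])
```

Because `F_t` is not symmetric, `signal[1, 4]` and `signal[4, 1]` generally differ, which is how a single loading matrix can still represent directed flows.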
An interesting feature of the above model is that, while $F_t$ is allowed to be asymmetric, the left and right loading matrices are required to be identical. This provides a description of the data in terms of asymmetric relations among a single set of hubs rather than two different sets of hubs. For example, in our international trade example Model (3.1) implies that the countries have the same set of hubs in their "exporting" role as they have in their "importing" role. A second possible approach, where the left loading matrix may be different from the right one, can be written as:
$$X_t = A_1 F_t A_2^\top + E_t, \qquad (3.2)$$
where $A_1$ ($A_2$) is the $n \times r_1$ ($n \times r_2$) vertical loading matrix of the row (column) actors on $r_1$ ($r_2$) hubs. The matrices $F_t$ ($r_1 \times r_2$) and $E_t$ are defined as in (3.1). This formulation is the matrix factor model considered in Wang et al. (2018).
Model (3.1) describes asymmetric relationships among actors in terms of asymmetric relationships among a single set of underlying hubs. Model (3.2) is a more general model where there are two sets of underlying hubs, and the directional relationships are hypothesized to hold from hubs of one kind to hubs of the other kind. Figure 3 illustrates the differences between the two models. In the international trade example, Model (3.1) would identify a single set of hubs, corresponding to nodes #1 to #4 in the left network plot in Figure 3, and provide $F_t$ matrices that describe how much each hub tends to trade with each of the other hubs, corresponding to the colored solid lines connecting different nodes in the figure. A single loading matrix $A$ characterizes the relationship between individual countries and the latent hubs, shown as the green dotted lines connecting countries and hubs. In contrast, Model (3.2) provides two sets of underlying hubs: $A_1$ relates the countries in their row position to the exporting hubs, corresponding to 'Ex' nodes #1 to #4 in the right network plot in Figure 3; and $A_2$ relates the countries in their column position to the importing hubs, corresponding to 'Im' nodes #1 to #4. The matrix $F_t$ then gives the directed relationships from the exporting hubs to the importing hubs. In the international trade example, $A_1$ describes countries' contributions to the exporting hubs and $A_2$ describes countries' contributions to the importing hubs.
When $A_1$ and $A_2$ are not linear transformations of one another, Models (3.1) and (3.2) are not equivalent. Consequently, Model (3.1) makes a strong claim about a given data set. When the rows and the columns of a given directional relationship matrix can be shown to span the same space, this agreement is unlikely to arise by chance and supports the validity of (3.1). With data containing noise, the row and column spaces will probably not match exactly, but a close agreement might still be interpreted as surprising and interesting. We will not discuss statistical goodness-of-fit tests for these two models in this article, but in Section 5 we present detailed comparisons of the two models applied to the international trade data.
4 Estimation Procedure and Properties
Similar to all factor models, the latent factors in the proposed Model (3.1) for asymmetric directional matrix time series can be linearly transformed into alternative but equivalent factors. In general, if $H$ is any nonsingular $r \times r$ transformation matrix, we can define an alternative loading matrix, $\tilde{A}$, by letting $\tilde{A} = A H$ and defining the associated factor matrix $\tilde{F}_t = H^{-1} F_t (H^{-1})^\top$. Here, we may assume that the columns of $A$ are orthonormal, that is, $A^\top A = I_r$, where $I_r$ denotes the identity matrix of dimension $r$. Even with these constraints, $A$ and $F_t$ are not uniquely determined in (3.1), as the aforementioned linear transformation is still valid for any orthonormal $H$. However, the column space of the loading matrix $A$ is uniquely determined. Hence, in what follows, we will focus on the estimation of the column space of $A$. We denote the factor loading space by $\mathcal{M}(A)$. For simplicity, we will suppress the matrix column space notation and use the matrix notation directly.
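The indeterminacy described above is easy to verify numerically. The sketch below assumes a loading matrix `A`, a factor matrix `F_t`, and an arbitrary nonsingular matrix `H` (all values are illustrative); it checks that the transformed pair reproduces the same signal and that the column space of the loadings is unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 5, 2
A = rng.standard_normal((n, r))
F_t = rng.standard_normal((r, r))
H = np.array([[2.0, 1.0],
              [0.5, 3.0]])       # any nonsingular r x r matrix

H_inv = np.linalg.inv(H)
A_alt = A @ H                    # alternative loadings
F_alt = H_inv @ F_t @ H_inv.T    # matching alternative factors

# Both parameterizations give the same signal part of X_t.
same_signal = np.allclose(A @ F_t @ A.T, A_alt @ F_alt @ A_alt.T)

def projector(B):
    """Orthogonal projector onto the column space of B."""
    return B @ np.linalg.pinv(B)

# The column space of the loadings is unchanged by the transformation.
same_space = np.allclose(projector(A), projector(A_alt))
```

Comparing projectors rather than the matrices themselves is one convenient way to compare column spaces, since the projector does not depend on the choice of basis.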
To facilitate the estimation, we use the QR decomposition $A = QW$ to normalize the loading matrix, so that Model (3.1) can be re-expressed as
$$X_t = A F_t A^\top + E_t = Q Z_t Q^\top + E_t,$$
where $Z_t = W F_t W^\top$ and $Q^\top Q = I_r$.
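We assume that $F_t$ is zero mean. Let $h$ be a positive integer. For $i, j = 1, \ldots, n$, define
$$\Omega_{zq,ij}(h) = \frac{1}{T-h} \sum_{t=1}^{T-h} \mathrm{E}\left[ Z_t Q_{i\cdot}^\top Q_{j\cdot} Z_{t+h}^\top \right], \qquad \Omega_{x,ij}(h) = \frac{1}{T-h} \sum_{t=1}^{T-h} \mathrm{E}\left[ X_{t,\cdot i} X_{t+h,\cdot j}^\top \right],$$
where $Q_{i\cdot}$ denotes the $i$-th row of $Q$ and $X_{t,\cdot i}$ the $i$-th column of $X_t$. These can be interpreted as the auto-cross-moment matrices at lag $h$ between column $i$ and column $j$ of $\{Z_t Q^\top\}$ and $\{X_t\}$, respectively.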
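For a predetermined integer $h_0 \ge 1$, we define
$$M_{col} = \sum_{h=1}^{h_0} \sum_{i=1}^{n} \sum_{j=1}^{n} \Omega_{x,ij}(h) \, \Omega_{x,ij}(h)^\top = Q \left\{ \sum_{h=1}^{h_0} \sum_{i=1}^{n} \sum_{j=1}^{n} \Omega_{zq,ij}(h) \, \Omega_{zq,ij}(h)^\top \right\} Q^\top.$$
Similar to the column vector version, we define the matrix $M_{row}$ for the row vectors of the $X_t$'s as follows:
$$M_{row} = \sum_{h=1}^{h_0} \sum_{i=1}^{n} \sum_{j=1}^{n} \Omega_{x^\top,ij}(h) \, \Omega_{x^\top,ij}(h)^\top = Q \left\{ \sum_{h=1}^{h_0} \sum_{i=1}^{n} \sum_{j=1}^{n} \Omega_{zq^\top,ij}(h) \, \Omega_{zq^\top,ij}(h)^\top \right\} Q^\top,$$
where $\Omega_{x^\top,ij}(h) = \frac{1}{T-h} \sum_{t=1}^{T-h} \mathrm{E}\left[ X_{t,i\cdot}^\top X_{t+h,j\cdot} \right]$ and $\Omega_{zq^\top,ij}(h) = \frac{1}{T-h} \sum_{t=1}^{T-h} \mathrm{E}\left[ Z_t^\top Q_{i\cdot}^\top Q_{j\cdot} Z_{t+h} \right]$, with $X_{t,i\cdot}$ denoting the $i$-th row of $X_t$.
Finally, we define $M = M_{col} + M_{row}$, that is
$$M = Q \left\{ \sum_{h=1}^{h_0} \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \Omega_{zq,ij}(h) \, \Omega_{zq,ij}(h)^\top + \Omega_{zq^\top,ij}(h) \, \Omega_{zq^\top,ij}(h)^\top \right) \right\} Q^\top. \qquad (4.8)$$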
Obviously, $M$ is a non-negative definite matrix. By Condition 2 and others in Wang et al. (2018), it can be shown by a similar argument that the right side of (4.8) constitutes a positive definite matrix sandwiched by $Q$ and $Q^\top$. Applying the spectral decomposition to this positive definite matrix, we have
$$M = Q U D U^\top Q^\top,$$
where $U$ is an $r \times r$ orthogonal matrix and $D$ is a diagonal matrix with diagonal elements in descending order. As $Q^\top Q = I_r$, the columns of $QU$ are the eigenvectors of $M$ corresponding to its non-zero eigenvalues. Thus the eigenspace of $M$ is the same as $\mathcal{M}(QU)$, which is the same as $\mathcal{M}(Q)$. Under certain regularity conditions, the matrix $M$ has rank $r$. Hence, the columns of the factor loading matrix $Q$ can be estimated by the $r$ orthogonal eigenvectors of the matrix $M$ corresponding to its non-zero eigenvalues, with the columns arranged so that the corresponding eigenvalues are in descending order.
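Now we define the sample versions of these quantities and introduce the estimation procedure. For a prescribed positive integer $h_0$, let
$$\hat{M} = \sum_{h=1}^{h_0} \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \hat{\Omega}_{x,ij}(h) \, \hat{\Omega}_{x,ij}(h)^\top + \hat{\Omega}_{x^\top,ij}(h) \, \hat{\Omega}_{x^\top,ij}(h)^\top \right),$$
where $\hat{\Omega}_{x,ij}(h) = \frac{1}{T-h} \sum_{t=1}^{T-h} X_{t,\cdot i} X_{t+h,\cdot j}^\top$ and $\hat{\Omega}_{x^\top,ij}(h) = \frac{1}{T-h} \sum_{t=1}^{T-h} X_{t,i\cdot}^\top X_{t+h,j\cdot}$, with $X_{t,\cdot i}$ and $X_{t,i\cdot}$ denoting the $i$-th column and the $i$-th row of $X_t$. Note that the above calculations are carried out by omitting the NA values. Since the diagonal of the transport volume matrix is undefined (NA), omitting the NAs is equivalent to setting them to zero.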
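A natural estimator for the $Q$ specified above is defined as $\hat{Q} = \{\hat{q}_1, \ldots, \hat{q}_r\}$, where $\hat{q}_i$ is the eigenvector of $\hat{M}$ corresponding to its $i$-th largest eigenvalue. Consequently, we estimate the factors and residuals respectively by
$$\hat{Z}_t = \hat{Q}^\top X_t \hat{Q}, \qquad \hat{E}_t = X_t - \hat{Q} \hat{Z}_t \hat{Q}^\top.$$
The above estimation procedure assumes the number of factors $r$ is known. To determine $r$ we could use: (a) the eigenvalue ratio-based estimator in Lam and Yao (2012); (b) the scree plot, which is standard in principal component analysis. Let $\hat{\lambda}_1 \ge \hat{\lambda}_2 \ge \cdots \ge \hat{\lambda}_n \ge 0$ be the ordered eigenvalues of $\hat{M}$. The ratio-based estimator for $r$ is defined as
$$\hat{r} = \arg\min_{1 \le j \le r_{\max}} \frac{\hat{\lambda}_{j+1}}{\hat{\lambda}_j}, \qquad (4.11)$$
where $r \le r_{\max} \le n$ is an integer. In practice we may take $r_{\max} = n/2$ or $r_{\max} = n/3$.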
The theoretical properties of the above estimators follow directly from those of the general matrix factor models. For more details, see Wang et al. (2018).
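The estimation steps of this section can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are assumed to be stored as an array of shape `(T, n, n)`, sample auto-cross-moments are averaged over `T - h` terms, NaN diagonals are set to zero as described above, and the function names are hypothetical.

```python
import numpy as np

def estimate_M(X, h0=1):
    """Sample matrix M-hat built from lagged column and row cross-moments."""
    X = np.nan_to_num(np.asarray(X, dtype=float))   # NaN diagonal -> 0
    T, n, _ = X.shape
    M = np.zeros((n, n))
    for h in range(1, h0 + 1):
        lead, lag = X[:T - h], X[h:]
        # omega_col[i, j] = mean_t of (column i of X_t)(column j of X_{t+h})'
        omega_col = np.einsum('tai,tbj->ijab', lead, lag) / (T - h)
        # omega_row[i, j] = mean_t of (row i of X_t)'(row j of X_{t+h})
        omega_row = np.einsum('tia,tjb->ijab', lead, lag) / (T - h)
        for omega in (omega_col, omega_row):
            # add the sum over (i, j) of Omega_ij Omega_ij'
            M += np.einsum('ijab,ijcb->ac', omega, omega)
    return M

def estimate_loadings(X, r, h0=1):
    """Top-r eigenvectors of M-hat, the factors Q'X_tQ, and all eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(estimate_M(X, h0))  # ascending order
    order = np.argsort(eigvals)[::-1]                     # make it descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Q_hat = eigvecs[:, :r]
    Z_hat = np.einsum('ai,tab,bj->tij', Q_hat, np.nan_to_num(X), Q_hat)
    return Q_hat, Z_hat, eigvals

def eigenvalue_ratio_rank(eigvals, r_max):
    """Ratio-based estimate: argmin over j <= r_max of lambda_{j+1}/lambda_j."""
    ratios = eigvals[1:r_max + 1] / eigvals[:r_max]
    return int(np.argmin(ratios)) + 1
```

Only the column space of `Q_hat` is meaningful, so estimates are best compared through the projector `Q_hat @ Q_hat.T` rather than entry by entry. The four-index intermediate arrays have `n**4` entries, which is manageable for a few dozen actors but would need a loop-based variant for much larger networks.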
5 Analysis of the International Trade Flow Data
By examining the network of international trade, we analyze how countries compare to each other in terms of trade volumes and patterns, and how these volumes and patterns evolve as economic cycles and political events unfold. We emphasize that our analysis does not draw on aggregate country statistics such as GNP, production statistics or any other national attributes.
5.1 Five-Year Rolling Estimation
To allow for structural changes over time, we break the 408-month period into rolling 5-year periods: 1982 through 1986, 1983 through 1987, and so forth. For each 5-year period, we assume that the loadings are constant and we estimate the loading matrix $A$ under Model (3.1) and $A_1$ and $A_2$ under Model (3.2). We estimate the three loading matrices $A$, $A_1$ and $A_2$ with the same number of factors across the periods for comparison purposes. We index these matrices by the mid-year of the five-year periods.
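The rolling scheme can be sketched as a small generator. The function name and the convention of returning half-open month index ranges are assumptions made for this illustration; the sample period and window width follow the description above.

```python
def rolling_windows(first_year=1982, last_year=2015, width=5):
    """Yield (mid_year, start, stop) for each rolling window.

    start and stop are month offsets from January of first_year,
    with stop exclusive, so a monthly series X can be sliced as X[start:stop].
    """
    for start_year in range(first_year, last_year - width + 2):
        start = (start_year - first_year) * 12
        yield start_year + width // 2, start, start + width * 12

windows = list(rolling_windows())
```

With the defaults, the first window is 1982 through 1986 (indexed by 1984) and the last is 2011 through 2015 (indexed by 2013), each covering 60 months.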
As noted in Section 4, we can only identify the column spaces of the loading matrices because of the rotational indeterminacy. Let $Q$ be a matrix whose columns constitute a basis of the loading space; then $QH$ can be used to represent the column space of the loading matrix for any non-singular matrix $H$. The indeterminacy actually provides flexibility for better model interpretation. Which rotation we select can depend on which perspective we wish to take toward the interpretation of $A$ and $F_t$. Although in general we might like to seek some kind of approximate simple structure for the columns of $A$, this can be done in different ways, corresponding to different orthogonal or oblique rotation criteria in factor analysis.
In the analyses presented in this article, we adopt as standard a procedure that applies Varimax (Kaiser, 1958) to the columns of $A$ after they have been scaled to have unit length (eigenvectors are automatically of unit length); this keeps the columns of $A$ mutually orthonormal. We further standardize the columns of $A$ so that they sum to one. This is feasible because we are dealing with data that contain only positive values, and the estimated columns of $A$ contain mostly positive entries with only a few small negative values. It is safe to truncate the negative values to zero while maintaining consistency in our estimation. We note that non-negative matrix decomposition could further be employed to obtain an $A$ with all positive entries.
When the columns of $A$ are standardized to sum to one (i.e. $\mathbf{1}^\top A = \mathbf{1}^\top$), the factor matrix $F_t$ can be thought of as a compressed or miniature version of the original observation matrix $X_t$. Note that $X_t = A F_t A^\top + E_t$, hence $\mathbf{1}^\top A F_t A^\top \mathbf{1} = \mathbf{1}^\top F_t \mathbf{1}$. The sum of all the elements in $F_t$ is equal to the sum of all elements in $A F_t A^\top$, the signal part of $X_t$ fit by the model. The factor matrix $F_t$ can therefore be interpreted as expressing relationships among the latent hubs in the same units as the original data. That is, $F_t$ is of the same kind as the original data matrix $X_t$, but describes the relations among the latent hubs of the countries rather than among the countries themselves. The diagonals of the observed relational matrices are undefined, and are ignored in the analysis by setting their values to zero. The diagonals of the latent factor matrices can be interpreted as the relationship within the same hub, e.g. the import-export between European countries. With the normalization, the columns of $A$ show the percentage that each country contributes to the hub (from the hub's point of view). This interpretation of our model is different from that of the mixed membership model (Airoldi et al., 2008; Xing et al., 2010), where the rows of the membership matrix sum to one, measuring each actor's percentage of membership in different communities.
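The rescaling and the resulting "same units" property can be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, the input loadings are taken to have nonzero column sums, and any truncation of small negative loadings is assumed to have been done beforehand.

```python
import numpy as np

def normalize_columns_to_sum_one(Q, Z_t):
    """Rescale loadings so each column sums to one, adjusting the factor
    matrix so that the fitted signal Q Z_t Q' is unchanged."""
    s = Q.sum(axis=0)                       # column sums, assumed nonzero
    A = Q / s                               # each column of A now sums to one
    F_t = s[:, None] * Z_t * s[None, :]     # F_t = D Z_t D with D = diag(s)
    return A, F_t

rng = np.random.default_rng(2)
Q = rng.random((6, 3))                      # illustrative nonnegative loadings
Z_t = rng.random((3, 3))
A, F_t = normalize_columns_to_sum_one(Q, Z_t)

signal = A @ F_t @ A.T
totals_match = np.isclose(F_t.sum(), signal.sum())
```

Because the signal is unchanged by the rescaling, the total volume in `F_t` equals the total fitted volume among countries, which is what allows the hub-level matrix to be read in the units of the original data.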
5.2 Model Trading Volume with Same Export and Import Loadings
We first apply Model (3.1) to the international trade volume data. We use the ratio-based method in (4.11) as well as the scree plot to estimate the number of latent dimensions. The comparison between these two methods of estimating latent dimensions in different time periods is shown in Table 1. The scree plot method selects the minimal number of dimensions that explain at least 85 percent of the variance in the original data. The estimate given by (4.11) tends to be smaller than the one given by the scree plot. The percentage of total variance explained by the factor model is shown in the last line.
As shown in Table 1, most dimension estimates are smaller than or equal to 4 and the factor model with explains at least of the total variance. Thus, latent dimension will be used for illustration in all periods for ease of comparison. We will focus on the loading matrix , which prescribes the interpretations of the latent hubs by linking them to the observed countries, and the factor matrix , which characterizes the directional relationship between latent hubs.
Figure 4 presents the heat maps of the loadings of each country/region on the top four latent hubs from 1984 to 2013. See the supplementary material for the plotted values. The four vertically aligned heat maps correspond to the four columns of the loading matrix $A$ for each year from 1984 to 2013. For example, the first columns (denoted by 1984) of plots (a), (b), (c), and (d) are the four columns of the loading matrix calculated using data from 1982 to 1986; the second columns (denoted by 1985) of the four heat maps correspond to the four columns of the loading matrix calculated using data from 1983 to 1987; and so on.
Although traditional eigenanalysis arranges the spectral decomposition by the rank of the eigenvalues, the choice of ranking is actually flexible. We choose to rank the columns of $A$ from different years according to their maximum loading on the United States, United Kingdom, and China Mainland for plots (a), (b) and (c), respectively. The reason for our choice is that the structure of international trade changes over time. The latent factors or hubs may rank differently in terms of their explained variance in different time periods. For example, the latent hub of European countries accounts for the largest portion of variance in 1985, but it ranks third in 2001 and no longer belongs to the top four hubs in 2009. Plot (d) contains the remaining factor for all the years. In this representation, plots (a), (b), (c) and (d) are considered together as the top four hubs. The factors in one heat map may be ranked differently in terms of explained variance at different times, but they carry the same interpretation over certain time periods.
Recall that each column in a heat map sums to one. Thus, the value of each cell reflects a country’s contribution to the corresponding hub in a given year. For example, in Figure 4 (a), the darkest cell, which corresponds to the USA in 1984, indicates that the share of trading taken by the USA on latent hub (a) is the largest among all countries. The changes in color intensity of the cells show the evolution of a country’s participation in the four hubs over the years.
The latent hub corresponding to Figure 4 (a) can be interpreted as a United States dominated hub, as the loadings of the United States on this hub are much larger than those of all other countries. From the plot, it is clear that the United States dominates this hub very strongly from to . However, its contribution gradually decreases since and reaches its minimum from year onwards, possibly due to the aftermath of the financial crisis. The decrease from the United States is offset by increases from the United Kingdom, Netherlands, Hong Kong, Japan, Taiwan and Korea, as shown by the increasingly darker cells since for those countries in Figure 4 (a).
The latent hub corresponding to Figure 4 (b) is aligned according to the maximum loading on the United Kingdom and, not surprisingly, is also heavily loaded by European countries such as France, Italy, Netherlands, Spain and Germany. Therefore, this hub can be interpreted as a hub dominated by European countries. From 1985 to 1989, Germany’s trading was so distinct from that of other European countries that it took a separate hub, as shown in Figure 4 (d). During this period, France, United Kingdom, Italy and Netherlands accounted for a large portion of Europe’s trading. After 1990, Germany, France, United Kingdom, and Italy took approximately equal portions. With the introduction of the Euro in , the contributions of Netherlands, Spain, and United Kingdom to trade increase. We also note that the loadings of some Asian economies, such as Hong Kong, Japan, Taiwan, Malaysia and Singapore, on this hub are significant in certain periods, including from 1992 to 1994 and from 2008 onwards. This suggests that, in these periods, the hub representing Asian economies explains more variance in the original data than the European hub and replaces the European hub as one of the top four hubs.
The latent hub corresponding to Figure 4 (c) is the hub on which China Mainland has its maximum loading. Before 1989, Japan loads more on this hub than China Mainland does. China Mainland’s loading on this hub kept increasing throughout the period, and its contribution to the hub becomes larger than Japan’s from 1989 onwards. This shows a clear transition of trading centrality among the large Asian economies, though Japan also participates actively in all other hubs.
The latent hub corresponding to Figure 4 (d) features sizable loadings on Canada, Mexico, Japan, Taiwan, and Korea. Thus the fourth hub of the latent factor matrix represents the group of large economies in North America and Asia-Pacific other than the US and China Mainland. The evolution of hub (d) is striking. Before 1990, Germany’s trading is so distinct from that of the other European countries that it uses this hub exclusively. After that, this hub is dominated by NAFTA countries between 1990 and 2000 and between 2007 and 2012. APEC countries dominated this hub between 2001 and 2007.
Figure 5 plots the trading network among the four latent hubs as well as the relationships between countries and latent hubs for four selected years. The trading network among latent hubs is plotted based on the average of the latent factor matrix in the corresponding 5-year rolling window. The colored circles represent the four latent hubs. Note that the eigen-decomposition algorithm we used does not guarantee positive entries in the latent factor matrix (hubs). Negative values in this matrix are interpreted as a change of trading direction. Non-negative matrix factorization, proposed by Lee and Seung (2001), can be used to ensure positive entries, though we did not use it here for simplicity. There are very few negative entries in this example. The size of each circle conveys the trading volume within the hub, i.e., the value of the corresponding diagonal element of the latent factor matrix. The width of the solid lines connecting the circles conveys the trading volume between different hubs, i.e., the values of the off-diagonal elements of the latent factor matrix. The direction of the flow is conveyed by the color of the line: the line has the same color as its exporting hub. For example, a blue line connecting a blue node and a red node represents the trade flow from the blue node to the red node. Note that the widths of the solid lines are not comparable across network plots for different years: they are scaled to fit each individual plot because the trading volume changes dramatically over the period.
The relationships between countries and the four hubs, shown as the dotted lines, are plotted using a truncated version of the estimated loading matrix to provide an uncluttered view that only captures the prominent relations. The truncation is achieved by first rounding all entries of to integers and then normalizing the non-zero entries to have column sum one.
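The truncation step described above (round, then renormalize each column) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `truncate_loadings`, the toy matrix `A`, and in particular the `scale` multiplier applied before rounding are hypothetical and are not taken from the text.

```python
import numpy as np


def truncate_loadings(loadings, scale=1.0):
    """Round scaled loadings to integers, then renormalize each column.

    `scale` is a hypothetical multiplier applied before rounding; it is an
    assumption of this sketch, not a value taken from the text.
    """
    rounded = np.rint(scale * np.asarray(loadings, dtype=float))
    col_sums = rounded.sum(axis=0)
    # Columns that round entirely to zero are left as zeros.
    safe = np.where(col_sums == 0, 1.0, col_sums)
    return rounded / safe


# Toy 4-country, 2-hub loading matrix whose columns sum to one.
A = np.array([
    [0.70, 0.04],
    [0.22, 0.46],
    [0.04, 0.30],
    [0.04, 0.20],
])
A_trunc = truncate_loadings(A, scale=10)
```

Small loadings are rounded to zero and drop out of the plot, while the surviving entries in each column are rescaled so that they again sum to one.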
As the network plot clearly shows, in 1985 the United States and Germany solely dominate hubs #1 and #4, respectively. Latent hub #2 is mainly used by European countries such as Spain, Netherlands, France, Sweden, United Kingdom and Italy. Latent hub #3 is mainly used by Japan, Korea, Taiwan and Canada. As shown by the thick orange lines, hub #1, representing the U.S., exports mostly to hub #3, which loads mostly on large Asian economies and Canada. The thick pink and purple lines connecting hubs #4 and #2 imply that Germany trades a lot with other European countries even though it stands apart from them.
In 1995, European countries move closer together and all mainly use a single hub #2, which reflects the effect of the foundation of the European Union in 1993. The trade within this hub (group) is the largest, indicating strong intra-European trading activity. The year 1995 also marks the rise of Asian countries, which dominate two latent hubs, namely hubs #3 and #4. This can be explained by the fast development of these Asian countries as they emulated the developed economies in North America and Europe during the late 80’s and early 90’s. There is a large amount of exports from Asian countries to the United States and European countries, as indicated by the thick pink and green lines to hub #1 and hub #2. The trading among Asian countries is also large, as shown by the thick lines connecting green hub #3 and pink hub #4. Mexico and Canada also contribute to these two hubs.
In 2003, hub #3 is mostly used by Canada, China Mainland and Mexico. It represents the latent hub that exports a lot to hub #1 (Hong Kong and United States). Hub #2, to which Netherlands, France, Italy, Spain and United Kingdom contribute meaningfully, stays the same as the European hub in 1995. Hub #4 can be interpreted as an APEC hub because it is mostly composed of Japan, Taiwan, Korea, and Malaysia. The United States still loads on hub #1. However, Hong Kong also loads heavily on this hub, indicating that these two economies share similar import/export patterns. For example, Hong Kong trades a lot with Canada, China Mainland and Mexico (the thick orange and green lines between nodes #1 and #3) and also imports a large volume from the APEC hub #4. The export volumes from hub #3 (Canada, China Mainland and Mexico) to U.S. hub #1 and from APEC hub #4 to U.S. hub #1 are among the largest trading volumes in this period.
In 2013, China Mainland dominates a single hub #3, indicating China Mainland’s growing importance in international trade in the 2010’s. Netherlands, Germany, Italy and France continue to contribute a large portion to the European hub #2, while the United Kingdom contributes more to the United States hub #1. The United States still loads completely on hub #1. At the same time, the compositions of the hubs are less geographically concentrated: European hub #2 and United States hub #1 are shared with Taiwan, Korea, Mexico and Hong Kong, and Asian hub #4 is shared with Canada. This indicates that international trade is more global and less regional in the 2010’s.
A hierarchical clustering algorithm (Xu and Wunsch, 2005; Murtagh and Legendre, 2014) is employed to cluster countries based on their contribution patterns over the years, using Euclidean distance and the ward.D criterion. The dendrograms in Figure 6 show the detailed structure of the hierarchical clustering results. The rectangles denote clusters that divide countries into four groups. This offers a different perspective on the dynamics of countries’ trading behaviors. Generally speaking, geographically or culturally proximate countries are usually in the same group and behave similarly. For example, one can easily identify the European group and the Asia-Pacific group from the dendrograms. Countries with similar trading behaviors also tend to be clustered in the same group. For example, in the 1990’s, Canada and Mexico are in the same group as Hong Kong, Japan and China Mainland; they all export in large volumes to the United States. The overall structure of international trading seems steady over the years: the four groups in all years can be labeled ‘United States’, ‘European active’, ‘Asia-pacific active’, and ‘European-Asia-pacific less active’. However, the relationships between individual countries change over the period. In the 1980’s, the United States and Germany are in the same group, reflecting the fact that they are the most active countries in trading, especially exporting, in the 80’s. In the 1990’s, the United States forms a group by itself because of its dominant position in international trade in that decade. China Mainland’s participation in global trade has been gradually increasing over the years: in the 1980’s, China Mainland’s trading behavior is more like that of economies such as Korea. From the 1990’s to the 2010’s, as China Mainland becomes more active in importing and exporting, its trading behavior becomes more similar to that of the United States. Later its trading behavior becomes so distinctive that it makes up a single cluster in the 2010’s.
Again, these patterns are consistent with some of the observations from Figures 4 and 5.
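The clustering step can be illustrated with a short sketch on synthetic data. Everything below is an assumption for illustration: the six country names, the synthetic contribution profiles, and the use of SciPy's `ward` linkage as a stand-in, which need not coincide exactly with the ward.D option mentioned above (see Murtagh and Legendre, 2014, for the differences between implementations of Ward's criterion).

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Synthetic stand-in for the data: 6 hypothetical countries, each described by
# its loadings on 4 hubs over 3 years, flattened into a 12-dimensional row.
names = ["A", "B", "C", "D", "E", "F"]
centers = np.repeat(np.eye(4)[[0, 0, 1, 1, 2, 3]], 3, axis=1)
profiles = centers + 0.01 * rng.standard_normal(centers.shape)

# Ward linkage on Euclidean distances between contribution profiles.
Z = linkage(profiles, method="ward", metric="euclidean")

# Cut the dendrogram into four groups, as the rectangles in the figure do.
labels = fcluster(Z, t=4, criterion="maxclust")
groups = dict(zip(names, labels))
```

Countries whose contribution profiles are close over the whole period (here A and B, and C and D) end up in the same group, which mirrors how the dendrograms group countries with similar trading behavior.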
5.3 Model Trading Volume with Different Export and Import Loadings
Now we apply Model (3.2) to the international trade volume data. We use the ratio-based method in (4.11) as well as the scree plot to estimate the number of latent dimensions. The comparison between these two methods of estimating the importing and exporting dimensions in different time periods is shown in Table 2. Note that Model (3.2) assumes different export and import loading matrices. Similar to Figure 1, the scree plot method selects the minimal number of dimensions that explain at least 85 percent of the variance in the original data. The percentage of total variance explained by the factor model is shown in the last line. With the additional flexibility of allowing different row and column loading matrices, the estimated dimension is slightly smaller than that in Table 1, though the ratio estimate becomes less stable.
| Ratio | (1, 1) | (1, 1) | (8, 1) | (1, 1) | (11, 1) | (6, 1) | (6, 3) | (2, 1) | (2, 1) | (2, 2) |
| Scree | (2, 2) | (2, 2) | (3, 3) | (3, 3) | (4, 4) | (5, 4) | (4, 4) | (4, 4) | (3, 3) | (3, 3) |
| (4,4) | (98, 98) | (95, 96) | (92, 94) | (91, 92) | (85, 91) | (85, 90) | (88, 89) | (91, 90) | (94, 93) | (95, 94) |

| Ratio | (5, 2) | (2, 2) | (2, 2) | (2, 2) | (2, 2) | (1, 1) | (1, 1) | (1, 1) | (1, 1) | (1, 1) |
| Scree | (3, 3) | (3, 3) | (2, 2) | (2, 2) | (2, 2) | (3, 3) | (3, 3) | (3, 3) | (3, 3) | (3, 3) |
| (4,4) | (93, 92) | (95, 94) | (96, 95) | (97, 97) | (96, 96) | (94, 94) | (92, 92) | (93, 94) | (95, 95) | (93, 93) |

| Ratio | (1, 1) | (6, 6) | (1, 1) | (6, 6) | (1, 6) | (1, 6) | (1, 5) | (5, 5) | (7, 1) | (1, 1) |
| Scree | (3, 3) | (3, 3) | (3, 3) | (4, 4) | (4, 4) | (4, 3) | (3, 3) | (3, 3) | (3, 3) | (3, 3) |
| (4,4) | (94, 93) | (93, 93) | (91, 91) | (88, 89) | (88, 91) | (89, 91) | (92, 93) | (94, 94) | (95, 95) | (90, 91) |
As shown in Table 2, most dimension estimates are smaller than 4 and the factor model with dimension (4, 4) explains at least 85 percent of the total variance. Thus, latent dimension (4, 4) will be used for illustration and comparison across periods. In the following analysis, we employ the same visualization tools as those in Section 5.2. However, there are separate plots for the export and import loading matrices since Model (3.2) differentiates the importing and exporting dimensions.
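The two dimension-selection rules compared in Table 2 can be sketched on a generic list of eigenvalues. This is a minimal illustration in the spirit of the eigenvalue-ratio estimator of Lam and Yao (2012) and of the 85 percent scree rule described above; the function names, the default search range `max_dim`, and the eigenvalues themselves are assumptions and are not taken from the trade data or from equation (4.11).

```python
import numpy as np


def ratio_estimate(eigvals, max_dim=None):
    """Eigenvalue-ratio rule: the k minimizing lambda_{k+1} / lambda_k."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    if max_dim is None:
        max_dim = len(lam) // 2  # an assumed search range for this sketch
    ratios = lam[1:max_dim + 1] / lam[:max_dim]
    return int(np.argmin(ratios)) + 1


def scree_estimate(eigvals, threshold=0.85):
    """Smallest k whose leading eigenvalues explain at least `threshold`."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    share = np.cumsum(lam) / lam.sum()
    return int(np.searchsorted(share, threshold)) + 1


# Hypothetical eigenvalues, not taken from the trade data.
eigvals = [50.0, 30.0, 10.0, 5.0, 2.0, 1.5, 1.0, 0.5]
k_ratio = ratio_estimate(eigvals)
k_scree = scree_estimate(eigvals)
```

On these toy eigenvalues the two rules disagree (the ratio rule picks 2, the scree rule picks 3), which illustrates why the estimates in a table such as Table 2 can differ between the two methods.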
Figures 7 and 8 present the heat maps for the exporting loading matrix and the importing loading matrix, respectively. They are designed in the same way as those in Figure 4. The patterns are strikingly similar across the three sets of heat maps. Plots (a) in Figures 4, 7 and 8 all represent the latent hub of the United States. Plots (b), (c), and (d) in all figures represent the latent hubs of European countries, Japan/China Mainland, and NAFTA countries (except the US), respectively. The loadings of countries on these top four latent hubs evolve in the same way in all three figures.
There are a few noticeable differences between import and export behavior, though. For example, the US’s import activities dominate import hub #1 throughout the period, but its export activities in export hub #1 weaken during the 2000’s in the face of competition from Asian countries. China’s export activities start in the early 1990’s, but its import activities only show dominance in the 2000’s.
Figure 9 plots the trading network among the four latent hubs as well as the relationships between countries and latent hubs for four selected years. Since we use different export (left) and import (right) loading matrices, the relationships between countries and latent hubs are different for import and export activities. The meanings of the row and column dimensions of the latent factor matrix are different too. Specifically, the rows of the latent factor matrix represent the exporting hubs while the columns correspond to the importing hubs. Thus we distinguish the row and column hubs and have eight circle nodes for the latent hubs in Figure 9. The nodes annotated with “Ex” and “Im” correspond to the export (row) hubs and the import (column) hubs, respectively. For certain years, we notice symmetry between the exporting and importing nodes or hubs, which empirically supports the validity of Model (3.1). For example, in 1995, the exporting node “Ex1” and importing node “Im1” both represent the United States hub; the exporting node “Ex2” and importing node “Im2” both represent the Europe hub; the exporting node “Ex3” and importing node “Im3” both represent the Asia hub; and the exporting node “Ex4” and importing node “Im4” both represent the Canada & Mexico hub. However, in this paper we do not devise a formal statistical method for testing Models (3.1) and (3.2), which is an important problem for future research.
6 Summary and Conclusion
In this paper, we proposed an innovative framework for modeling dynamic transport networks and an effective method for estimating the dynamic structure that underpins the surface networks. We have collected, cleaned, and analyzed a data set of a dynamic transport network of monthly international trade volumes among 24 countries and regions over 34 years. We have investigated the trading hubs, centrality, patterns and trends in the trading network of the 24 countries and regions under the proposed framework and methodology. The results offer sensible insights into international trading and show change points that match changes in trading policies.
Unlike the traditional node-and-edge level modeling of dynamic networks, which mainly focuses on link connectivity, the framework and the estimation method proposed in this paper offer an effective way of unveiling the latent structure of the surface nodes and their relations in a dynamic transport network. The proposed methodology has several distinctive features in its structure and implementation. First, the matrix-variate time series modeling concisely captures the amount of data (traffic) moving across a network: the direction and size of a traffic flow are captured by the location and value of an entry in the matrix, respectively. Second, we impose neither distributional assumptions on the underlying network nor parametric forms on its covariance function; the latent network is learned directly from the data with little subjective input. Third, the idea is simple, yet quite general and flexible, and it can be easily extended to include factor dynamics and covariates.
Our results on international trade flow consist of two major parts: the latent factor matrices that capture the structure and the dynamics of the latent low-dimensional network; and the loading matrices that connect the latent nodes with the surface nodes and characterize the semantics of the latent nodes.
Based on the latent factor matrices and the loading matrices, we have findings on (i) meaningful trading hubs that aggregate and distribute trading flows among countries over the three decades; (ii) distinct countries that are central in the sense that some hubs are used exclusively by them, and the changes of centrality in international trading; and (iii) international trading patterns and trends for the 24 countries and regions and for the latent trading hubs. These findings are elaborated in the following paragraphs.
Trading Hubs. Figure 4 consists of heat maps of the loading matrices that characterize the connection between surface nodes and latent nodes (trading hubs). Figure 5 consists of network plots of the latent nodes and the connections between surface nodes and latent nodes (trading hubs). Both show four major trading hubs, namely, the United States, European countries, large Asian economies, and Germany in the 1980’s and 1990’s or North American countries other than the United States in the 2000’s and 2010’s. Figures 7, 8, and 9 differentiate the exporting and importing behaviors but suggest the same trading hubs.
Centrality. In the 1980’s, the United States and Germany are the two largest economies that trade in large volumes and with a wide range of countries. The United States keeps its central role from the 1980’s to the 2010’s. Germany lost its centrality in the late 1990’s. China Mainland gradually grows its trading capacity in the late 1990’s and assumes a central role in the 2000’s and 2010’s. Throughout the decades, Europe has been a tight trading bloc that trades more among its own members than with outside countries. The observations from Figures 7, 8, and 9 on the evolution of centrality are the same.
Patterns and trends. Note that the analysis is based on a rolling five-year window. Our estimates are a sequence of time-evolving loading matrices and factor matrices that capture the dynamics of the networks of trading hubs and countries. Figure 4 provides several interesting observations on the pattern changes over the decades. First, the United States uses the first hub exclusively throughout, although its contribution to this hub decreases slightly in the 2000’s and 2010’s. Second, European countries use the second hub exclusively, though their dominance in this hub was interrupted by the emergence of developing Asian countries from 1992 to 1994 and from 2008 to 2011. The first period corresponds to the growth of the Four Asian Tiger economies, which is attributed to export-oriented policies and strong development policies. The second period corresponds to the 2008 financial crisis, which affected most European countries. Third, Japan uses the large Asian economies hub exclusively in the 1980’s, and China Mainland has gradually taken it over since the 1990’s. Fourth, in the 1980’s Germany differs from the other European countries in that it exclusively uses a single hub by itself. Gradually, Germany has blended into the European group and become a member sharing the latent European hub since the formation of the European Union in 1993. This phenomenon is most prominent after the common currency “euro” was established in 1999 and came into full force in 2002. The fourth latent hub is later taken over by Canada and Mexico, which trade heavily with the United States. Figure 5 additionally offers more detail on the trading between latent trading hubs. Figures 7, 8, and 9 present more details on exporting and importing patterns and trends.
There are many directions in which we can extend our current work. The proposed methods are able to effectively reduce the dimension of dynamic networks and uncover their core structure. The estimated latent dynamic networks and their relation to the surface networks can be further used for testing and predicting the networks. Also, the current model does not explicitly model the dynamics of the matrix factors. Incorporating an autoregressive model for the latent matrix factors will enable prediction of future network flows. This will result in a dynamic factor model for matrix-variate time series. Including covariates of nodes, such as the GDP of a country or the geographic distance between countries, will also be interesting future research. Methods for testing Models (3.1) and (3.2) are also of great importance.
- Airoldi et al. (2008) Airoldi, E. M., Blei, D. M., Fienberg, S. E., and Xing, E. P. (2008). Mixed membership stochastic blockmodels. Journal of Machine Learning Research, 9(9):1981–2014.
- Chen et al. (2018) Chen, R., Xiao, H., and Yang, D. (2018). Autoregressive models for matrix-valued time series. Working paper.
- Davis and Weinstein (2001) Davis, D. R. and Weinstein, D. E. (2001). What role for empirics in international trade? MIT Press.
- Durand (1953) Durand, D. E. (1953). Country classification. In Allen, R. G. D. and Ely, E. J., editors, International Trade Statistics, pages 117–129. Wiley.
- Hafner-Burton et al. (2009) Hafner-Burton, E. M., Kahler, M., and Montgomery, A. H. (2009). Network analysis for international relations. International Organization, 63(3):559–592.
- Hanneke et al. (2010) Hanneke, S., Fu, W., and Xing, E. P. (2010). Discrete temporal models of social networks. Electronic Journal of Statistics, 4:585–605.
- Huisman and Snijders (2003) Huisman, M. and Snijders, T. A. (2003). Statistical analysis of longitudinal network data with changing composition. Sociological Methods & Research, 32(2):253–287.
- IMF (2017) IMF (2017). Direction of Trade Statistics, International Monetary Fund.
- Kaiser (1958) Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23(3):187–200.
- Kim and Shin (2002) Kim, S. and Shin, E.-H. (2002). A longitudinal analysis of globalization and regionalization in international trade: A social network approach. Social Forces, 81(2):445–468.
- Krivitsky and Handcock (2014) Krivitsky, P. N. and Handcock, M. S. (2014). A separable model for dynamic networks. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(1):29–46.
- Lam and Yao (2012) Lam, C. and Yao, Q. (2012). Factor modeling for high-dimensional time series: inference for the number of factors. The Annals of Statistics, 40(2):694–726.
- Lee and Seung (2001) Lee, D. D. and Seung, H. S. (2001). Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems, pages 556–562.
- Linnemann (1966) Linnemann, H. (1966). An econometric study of international trade flows, volume 234. North-Holland Publishing Company Amsterdam.
- Lorrain and White (1971) Lorrain, F. and White, H. C. (1971). Structural equivalence of individuals in social networks. The Journal of Mathematical Sociology, 1(1):49–80.
- Mahutga (2006) Mahutga, M. C. (2006). The persistence of structural inequality? a network analysis of international trade, 1965–2000. Social Forces, 84(4):1863–1889.
- Murtagh and Legendre (2014) Murtagh, F. and Legendre, P. (2014). Ward’s hierarchical agglomerative clustering method: which algorithms implement ward’s criterion? Journal of Classification, 31(3):274–295.
- Sen (2010) Sen, S. (2010). International trade theory and policy: a review of the literature. Working paper.
- Smith and White (1992) Smith, D. A. and White, D. R. (1992). Structure and dynamics of the global economy: network analysis of international trade 1965–1980. Social Forces, 70(4):857–893.
- Snijders (2006) Snijders, T. (2006). Statistical methods for network dynamics. In Luchini, S., editor, Proceedings of the XLIII Scientific Meeting, Italian Statistical Society, pages 281–296. Padova:CLEUP.
- Snijders et al. (2007) Snijders, T., Steglich, C., and Schweinberger, M. (2007). Modeling the coevolution of networks and behavior. In van Montfort, K., Oud, J., and Satorra, A., editors, Longitudinal Models in the Behavioral and Related Sciences, chapter 3, pages 41 – 72. Mahwah: Routledge Academic.
- Snijders (2001) Snijders, T. A. (2001). The statistical evaluation of social network dynamics. Sociological Methodology, 31(1):361–395.
- Snijders (2005) Snijders, T. A. (2005). Models for longitudinal network data. Models and Methods in Social Network Analysis, 1:215–247.
- Snijders et al. (2010a) Snijders, T. A., Koskinen, J., and Schweinberger, M. (2010a). Maximum likelihood estimation for social network dynamics. The Annals of Applied Statistics, 4(2):567.
- Snijders et al. (2010b) Snijders, T. A., Van de Bunt, G. G., and Steglich, C. E. (2010b). Introduction to stochastic actor-based models for network dynamics. Social Networks, 32(1):44–60.
- Wang et al. (2018) Wang, D., Liu, X., and Chen, R. (2018). Factor models for matrix-valued high-dimensional time series. Journal of Econometrics.
- Westveld and Hoff (2011) Westveld, A. H. and Hoff, P. D. (2011). A mixed effects model for longitudinal relational and network data, with applications to international trade and conflict. The Annals of Applied Statistics, pages 843–872.
- Xing et al. (2010) Xing, E. P., Fu, W., and Song, L. (2010). A state-space mixed membership blockmodel for dynamic network tomography. The Annals of Applied Statistics, 4(2):535–566.
- Xu and Wunsch (2005) Xu, R. and Wunsch, D. (2005). Survey of clustering algorithms. IEEE Transactions on Neural Networks, 16(3):645–678.