Manufacturing firms with non-trivial product offerings scale up by procuring subcomponents, services, or capabilities (Barney, 1991). Inevitably, due to labour cost arbitrage and an ever-increasing focus on cost efficiencies, supply networks have become more global as firms position themselves to optimise profitability. Whilst globalisation and outsourcing can have financial benefits and lead to faster time to market for manufactured goods, a supply network leads to structural dependencies amongst firms and a subsequent concentration of risk, leaving value chains vulnerable to disruptions. The effects of globalisation mean that individual firms have little control or visibility over their extended supply network, exacerbating the risk of disruption.
In particular, the lack of visibility may result in firms procuring goods and services from firms which are known to perform nefarious activities, examples of which include, but are not limited to, the use of child labour, unsustainable business practices, and more general violations of employment law. An illustrative example of the structure of a complex supply network is given in Figure 1, where the focal firm (customer) remains unaware of a Tier 2 supplier and is also supplied by a Tier 3 supplier.
Recently, methods that leverage web scraping, entity recognition, and labelling have been proposed to provide transparency of the supply chain (Wichmann et al., 2020). In these methods, entity recognition is used to derive nodes, with edges built through binary classification applied to text data on the entities. There are two main drawbacks to such Natural Language Processing (NLP) based approaches: (i) they implicitly assume that all procurement activities are published as articles or metadata on the internet, and (ii) they are not statistically or otherwise verifiable.
In this work, we propose an automated approach to synthesise an appropriate representation for a downstream link prediction task. We argue that automated approaches can complement methods that gather incomplete information and help towards statistical verification of the links that have been found. Specifically, we:
- Introduce the first method to learn a heterogeneous graph (knowledge graph) of supply chain network data.
- Leverage the learned representation to achieve state-of-the-art performance on link prediction using a relational graph convolutional network.
2.1 Supply Chain Networks as Graphs
Representing supply chain networks as graphs was first proposed by Choi et al. (2001). Since then, researchers have studied the impacts of ripple effects (Chauhan et al., 2020; Dolgui et al., 2018), demonstrated that supply chain networks naturally form hubs and exhibit scale-free characteristics, and even trained algorithms to locate hidden links in these networks (Brintrup et al., 2018b) using manually-specified homogeneous graphs (single edge and node type, cf. Section 4). In this work, we build on this body of work by developing a heterogeneous supply chain graph representation that yields improved performance in the downstream task of link prediction.
2.2 Supply Chain Link Prediction
Various techniques have been proposed for link prediction in domains beyond supply chain applications. One of the most commonly-used techniques is based on computing similarity between pairs of nodes. Such similarities are derived from handcrafted heuristics such as node degrees or the number of common neighbours; these include the Jaccard Coefficient (Liben-Nowell and Kleinberg, 2007), Katz (Katz, 1953), the LHN Index (Leicht et al., 2006), Preferential Attachment (Barabasi and Albert, 1999), Adamic-Adar (Adamic and Adar, 2003), Resource Allocation (Zhou et al., 2009), and path-based similarity (Lu et al., 2009).
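As a concrete illustration, these neighbourhood heuristics can be computed directly from an adjacency structure. The sketch below uses a toy undirected graph of four hypothetical firms (A–D); none of the values come from our dataset.

```python
from math import log

# Toy undirected supplier graph as an adjacency dict (hypothetical firms).
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"A", "C"},
}

def jaccard(u, v):
    """Jaccard coefficient: |N(u) & N(v)| / |N(u) | N(v)|."""
    inter = adj[u] & adj[v]
    union = adj[u] | adj[v]
    return len(inter) / len(union) if union else 0.0

def adamic_adar(u, v):
    """Adamic-Adar: sum over common neighbours z of 1 / log(deg(z))."""
    return sum(1.0 / log(len(adj[z])) for z in adj[u] & adj[v] if len(adj[z]) > 1)

def preferential_attachment(u, v):
    """Preferential attachment: deg(u) * deg(v)."""
    return len(adj[u]) * len(adj[v])
```

Each heuristic assigns a score to a candidate node pair; ranking unobserved pairs by score yields a link prediction.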
While many existing heuristic similarity techniques can work well in practice, they rely on domain experts to handcraft features. Given that we work with larger datasets with more attributes, manually defining such formulae is expensive. Additionally, while handcrafted heuristics can work well in a particular application, transferring them to different contexts is likely to fail. For instance, Kovacs et al. (2019) show that Common Neighbours (CN), a heuristic used in social network analysis, fails to perform in protein graph networks. This is due to the inductive bias stemming from CN's assumption of homophily (similar nodes are connected), an assumption that does not hold in protein networks. Such issues have also been observed in supply chains (Brintrup et al., 2018a).
An alternative line of work learns node embeddings directly (e.g. DeepWalk, node2vec, and LINE). Here, nodes are represented as vectors derived from topological features obtained by performing various forms of random walk within the neighbourhood of the nodes. Link prediction then becomes a binary classification task, where a decoder scores a pair of node embeddings to estimate the likelihood of an edge forming between them.
Recent approaches extract more complex node embeddings using graph neural networks (GNNs) (Hamilton, 2020; Bruna et al., 2014; Duvenaud et al., 2015; Kipf and Welling, 2017; Niepert et al., 2016). GNNs have outperformed many existing algorithms across various domains such as airline carrier networks, citation networks, political blogs, protein interactions, power grids, router-level internet topologies, and E. coli metabolite reactions (Zhang and Chen, 2017; Zhang and Chen, 2018; Zhang et al., 2020; Huang and Zitnik, 2021; Teru et al., 2020).
While GNNs have been applied to extract node embeddings, they may also be used to learn representations of a triplet (a pair of nodes with an edge between them). One implementation of such a GNN is called the relational graph convolutional network.
Relational Graph Convolutional Networks (RGCNs) generate latent representations for entities within multi-relational graphs (or knowledge graphs) for downstream graph reasoning tasks (Schlichtkrull et al., 2018b). Our approach begins by leveraging the GraphSAGE architecture proposed by Hamilton et al. (2017) to learn functions that inductively generate node embeddings for all entities in the knowledge graph. The inductive learning paradigm is chosen because supply chain networks evolve over time as companies (which act as autonomous agents in our network representation) choose their locations, product offerings, or procurement relationships. All entities within the knowledge graph are initialised with a random embedding vector: the feature vector of each node is drawn at random from $\mathbb{R}^{d}$, where $d$ is the dimensionality of the feature vector associated with the nodes and is treated as a hyperparameter to be tuned during cross-validation.
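To make the aggregation step concrete, the following is a minimal numpy sketch of a single GraphSAGE mean-aggregation layer over randomly initialised node features. The weight matrices are fixed at random purely for illustration (in practice they are trained), and the layer form (ReLU followed by l2-normalisation) follows the standard GraphSAGE formulation rather than our exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding dimensionality, a hyperparameter tuned in cross-validation

# Randomly initialised features for 4 nodes, as described above.
X = rng.normal(size=(4, d))
neighbours = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}

# Trainable weights of one layer (fixed at random here for illustration).
W_self = rng.normal(size=(d, d))
W_neigh = rng.normal(size=(d, d))

def sage_layer(X, neighbours):
    """One GraphSAGE step: h_v = ReLU(x_v W_self + mean_{u in N(v)} x_u W_neigh)."""
    H = np.zeros_like(X)
    for v, nbrs in neighbours.items():
        agg = X[nbrs].mean(axis=0)  # mean aggregator over the sampled neighbourhood
        H[v] = np.maximum(0.0, X[v] @ W_self + agg @ W_neigh)
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    return H / np.maximum(norms, 1e-12)  # l2-normalise each embedding

H = sage_layer(X, neighbours)
```

Stacking $K$ such layers yields the depth-$K$ embeddings used by the decoder below.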
3 Our Approach
3.1 Learning a Heterogeneous Graph Representation of a Supply Chain Network
The formal definition of a knowledge graph varies between application fields. For the purposes of graph representation learning over supply chain networks, a definition in line with Palumbo et al. (2020) is adopted. In this paradigm, a knowledge graph can be conceptualised as a 3-tuple $\mathcal{K} = (\mathcal{E}, \mathcal{R}, \mathcal{O})$, where $\mathcal{E}$ is the set of entities (or nodes), $\mathcal{R}$ is the set of relations, and $\mathcal{O}$ is the ontology of the knowledge graph.
The ontology defines the set of entity types and the set of relation types. Additionally, it assigns nodes to their entity type and entity types to their related properties. Effectively, the ontology defines the underlying data structure of a knowledge graph. Within this context, the set of entity types comprises company, product, country, capability, and certification. The corresponding set of edges between entities is defined with business-specific use cases in mind. A pictorial representation of the defined ontology is shown in Figure 4. Considering Figure 4 for the entity type country, only a single relation type, located_in, is allowed for triplets containing that entity type.
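The type restrictions imposed by the ontology can be sketched as a lookup of relation signatures. The relation and entity type names below follow Tables 1 and 2; the helper `valid_triplet` and the example entities (e.g. Bosch) are hypothetical illustrations.

```python
# Relation signatures from the ontology: relation -> (source type, destination type),
# following the entity and relation types listed in Tables 1 and 2.
ONTOLOGY = {
    "buys_from": ("company", "company"),
    "makes_product": ("company", "product"),
    "located_in": ("company", "country"),
    "has_capability": ("company", "capability"),
    "has_cert": ("company", "certification"),
    "capability_produces": ("capability", "product"),
    "complimentary_product_to": ("product", "product"),
}

def valid_triplet(entity_type, s, r, o):
    """Hypothetical helper: check a triplet (s, r, o) against the type restrictions."""
    if r not in ONTOLOGY:
        return False
    src_t, dst_t = ONTOLOGY[r]
    return entity_type.get(s) == src_t and entity_type.get(o) == dst_t

# Toy entity-type assignments (Bosch is a made-up example supplier).
types = {"General Motors": "company", "Bosch": "company",
         "Floor mat": "product", "Germany": "country"}
```

Only triplets respecting these signatures are admitted when populating the knowledge graph.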
Populating the knowledge graph: The ontology is populated through a tabular data structure comprising (incomplete) attribute information about companies within the automotive sector (the data is obtained from MarkLines, a company that specialises in automotive supply chain data collection). The tabular data is converted into multiple multipartite graphs to derive relations. As an indicative example, Figure 5 demonstrates this procedure for two relation types: (company, buys_from, company) and (company, makes_product, product). Where relationships could not be deduced from the tabular data, bipartite projections were taken over the entity set where information was missing. This is a crucial step, as complementary capability and product offerings may embed inductive bias when predicting buys_from relationships.
Some links included within the ontology were not immediately available in the collected data but could be deduced. In this case, a co-occurrence frequency was used to derive these relations. The intuition here is that if a company possesses a capability (e.g. Plastic Injection Moulding) and produces products (Seat Belts, Bumpers, etc.), then enough instances of co-occurrence of capabilities with the same product would imply that the capability and product can be tied into the capability_produces relation.
The histogram of co-occurrence frequency is shown in Figure 6. As the data exhibits noise, potentially due to spurious information, a cutoff threshold is required to filter relations based on co-occurrence frequency. This threshold is treated as a hyperparameter during training and can be optimised for whichever edge type a company may deem the riskiest. For example, if a company is interested in geographic risk, then the cutoff threshold is optimised for successfully predicting buys_from relationships.
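The co-occurrence derivation can be sketched as a simple counting procedure; the records and threshold below are illustrative stand-ins, not taken from the dataset.

```python
from collections import Counter

# Hypothetical company records: (capabilities, products) observed per firm.
records = [
    ({"Plastic Injection Moulding"}, {"Bumpers", "Seat Belts"}),
    ({"Plastic Injection Moulding"}, {"Bumpers"}),
    ({"Machining"}, {"Bumpers"}),
]

# Count capability/product co-occurrences across companies.
cooc = Counter()
for capabilities, products in records:
    for c in capabilities:
        for p in products:
            cooc[(c, p)] += 1

THRESHOLD = 2  # cutoff, treated as a hyperparameter during training
capability_produces = {pair for pair, n in cooc.items() if n >= THRESHOLD}
# Only ("Plastic Injection Moulding", "Bumpers") co-occurs often enough here.
```

Pairs clearing the threshold become capability_produces edges in the knowledge graph.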
The other edge type that has to be deduced from the data is complimentary_product_to. For this edge type, a bipartite graph consisting of makes_product relations between companies and their respective product portfolios is leveraged. A bipartite projection is taken onto the product entities, where the weights in the projection space indicate the number of times companies purchased similar products. Figure 7 shows the distribution of edge weights in the projection space. The cutoff threshold for introducing complimentary_product_to relations is also treated as a hyperparameter during training.
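A minimal sketch of the weighted bipartite projection, using hypothetical companies c1–c3 and products p1–p3:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical bipartite graph: company -> product portfolio.
portfolio = {
    "c1": {"p1", "p2"},
    "c2": {"p1", "p2", "p3"},
    "c3": {"p2", "p3"},
}

# Project onto the product side: the weight of (p, q) is the number of
# companies whose portfolios contain both products.
weights = defaultdict(int)
for products in portfolio.values():
    for p, q in combinations(sorted(products), 2):
        weights[(p, q)] += 1

THRESHOLD = 2  # cutoff hyperparameter, as described above
complementary = {edge for edge, w in weights.items() if w >= THRESHOLD}
```

Product pairs whose projection weight clears the cutoff yield complimentary_product_to edges.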
Triplets, or labelled directed edges, are represented as factual tuples $(s, r, o)$ for $s, o \in \mathcal{E}$ and $r \in \mathcal{R}$. For example, the edge (company, has_capability, capability) indicates known information about a company and its capability, since the has_capability relation type is restricted to hold between the entity types company and capability.
Table 1 and Table 2 convey the extracted entities (161k in total) and the extracted facts (647k in total), respectively.
Finally, the learning objective is the same as knowledge graph completion, and is geared towards predicting missing edges to complete the knowledge graph representation.
Table 1: Extracted entity types.

| Entity type | Example | Count |
| --- | --- | --- |
| company | General Motors | 41,826 |
| product | Floor mat | 119,618 |
| country | Germany | 74 |
| capability | Machining | 36 |
| certification | ISO9001 | 9 |
Table 2: Extracted facts per relation type.

| Relation type | Count |
| --- | --- |
| (capability, capability_produces, product) | 21,857 |
| (company, buys_from, company) | 88,997 |
| (company, has_capability, capability) | 83,787 |
| (company, has_cert, certification) | 32,654 |
| (company, located_in, country) | 40,421 |
| (company, makes_product, product) | 119,618 |
| (product, complimentary_product_to, product) | 260,658 |
3.2 Loss Function for Link Prediction
The latent embeddings for nodes are generated using the GraphSAGE architecture in the minibatch setting. In the GraphSAGE paradigm, trainable functions are learned to generate compact embeddings by sampling and aggregating features from local neighbourhoods of nodes, to be used in downstream tasks (link prediction in our case). The aggregator functions for each depth $k \in \{1, \dots, K\}$, as well as the trainable weight matrices used in updating the latent embedding $h_{v}^{k}$ of each node $v$, are trained to minimise the binary cross-entropy loss across all relation types. This choice of link prediction loss is similar to that proposed by Schlichtkrull et al. (2018a) and is given as:

$$\mathcal{L} = -\frac{1}{|\mathcal{T}|} \sum_{(s, r, o, y) \in \mathcal{T}} y \log \sigma\big(f(s, r, o)\big) + (1 - y) \log\Big(1 - \sigma\big(f(s, r, o)\big)\Big)$$

where triples $(s, r, o) \in \mathcal{T}$ with relation $r$ are scored according to $f(s, r, o)$, and $y$ is an indicator denoting whether or not the triplet exists (detailed further in Section 5). The score is derived from the $K$-th node embeddings for the source and destination nodes, $z_{s} = h_{s}^{K}$ and $z_{o} = h_{o}^{K}$ respectively, and was chosen as $f(s, r, o) = z_{s}^{\top} R_{r} z_{o}$, which is the DistMult scoring function (Yang et al., 2015). In this context, $R_{r} \in \mathbb{R}^{d \times d}$ is a diagonal matrix for every relation $r$, and $d$ is the size of the initialised node embeddings. The loss naturally incentivises the model to associate higher scores to observable triples and lower scores to unobserved triples.
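A numpy sketch of the DistMult score and the binary cross-entropy objective described above. The diagonal relation matrix is stored as a vector, and the embeddings are random stand-ins for trained ones.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # embedding size

def distmult(z_s, r_diag, z_o):
    """DistMult score f(s, r, o) = z_s^T R_r z_o, with R_r diagonal (stored as a vector)."""
    return z_s @ (r_diag * z_o)

def bce_loss(scores, labels):
    """Binary cross-entropy over positive (y = 1) and corrupted (y = 0) triples."""
    p = 1.0 / (1.0 + np.exp(-scores))  # sigmoid of the raw scores
    return -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))

# One positive triple (s, r, o) and one negative with a corrupted destination.
z_s, z_o, z_neg = rng.normal(size=(3, d))
r_diag = rng.normal(size=d)
scores = np.array([distmult(z_s, r_diag, z_o), distmult(z_s, r_diag, z_neg)])
labels = np.array([1.0, 0.0])
loss = bce_loss(scores, labels)
```

Note that the diagonal form makes DistMult symmetric in its endpoints, which is one known limitation of this decoder.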
4 Related Work
The application of link prediction in supply networks has been scarce. To the best of our knowledge, Supply Network Link Prediction (SNLP) (Brintrup et al., 2018a) is the only published work to apply link prediction to supply networks. SNLP was applied on the same automotive supply chain dataset that is used in our work. The authors represent the supply chain network as a homogeneous graph with a single edge/relation type, buys_from, unlike our heterogeneous knowledge graph which has multiple relation types.
This baseline model represents every node with a set of attributes derived from handcrafted heuristics, such as the number of existing suppliers, overlaps between both companies' product portfolios, product outsourcing associations, and the likelihood of having common buyers. This bears similarity to modern graph node embedding techniques, although their representations were not learnable. The approach treats link prediction as a binary classification problem given a pair of nodes with their respective attributes. They report an Area Under the Receiver Operating Characteristic curve (AUC) score of 0.76.
5 Experiments and Results
The task of relational link prediction is to discern whether a given edge $e \in E \setminus \hat{E}$ is present, where $E$ is the set of all possible edges and $\hat{E}$ is the set of captured edges in the knowledge graph representation. The set $E \setminus \hat{E}$ comprises edges that were not captured when building the supply chain knowledge graph, or edges that will present themselves in the future (a new partnership between two companies is formed, new capabilities are invested in, etc.). The learning regime involves cross-validation by splitting the set of all actualised triples into a training (70%), validation (20%), and test (10%) set. Negative triplets (triplets which are not facts in the knowledge graph) are then generated by corrupting positive triplets: either swapping the source or destination nodes, or uniformly sampling a new relation type between them. Models are assessed on their capability to differentiate between factual and non-factual triplets. The task is therefore distilled into a binary classification task (for all relation types), and the commonly-used Area Under the Receiver Operating Characteristic curve (AUC) is used to assess model performance. To the best of our knowledge, the best previously reported AUC for this task is 0.76 (for the buys_from relation in our context). As shown in Table 3, our multi-relational model outperforms the existing baseline and extends the prediction task to multiple relations.
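The negative-sampling scheme can be sketched as follows; the entities, relations, and facts are toy stand-ins, and the pairwise `auc` helper is one standard way to compute the metric.

```python
import random

random.seed(0)
entities = ["c1", "c2", "c3", "p1", "p2"]
relations = ["buys_from", "makes_product"]
facts = {("c1", "buys_from", "c2"), ("c1", "makes_product", "p1")}

def corrupt(triple):
    """Corrupt a positive triple by swapping an endpoint or resampling the relation."""
    s, r, o = triple
    move = random.choice(["src", "dst", "rel"])
    if move == "src":
        s = random.choice(entities)
    elif move == "dst":
        o = random.choice(entities)
    else:
        r = random.choice(relations)
    return (s, r, o)

negatives = []
for t in facts:
    cand = corrupt(t)
    while cand in facts:  # reject accidental positives (including the original triple)
        cand = corrupt(t)
    negatives.append(cand)

def auc(pos_scores, neg_scores):
    """AUC = probability a positive scores above a negative (ties count half)."""
    pairs = [(p, n) for p in pos_scores for n in neg_scores]
    return sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs) / len(pairs)
```

Scoring the positives and their corrupted counterparts with the trained decoder and feeding both score lists to `auc` reproduces the evaluation protocol described above.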
Due to the effects of globalisation, supply chains are becoming more complex, and obtaining visibility into interdependencies within the network has become a tremendous challenge. While better information extraction techniques have been developed, there remains a large gap towards obtaining a complete representation of the network. The raw data alone often has missing information due to a company's propensity to engage in secretive and competitive behaviour. This information, however, is particularly important for supply chain practitioners to detect operational risks, such as unfair manufacturing practices and overreliance on sole suppliers. Graph representation learning, in the form of link prediction, can help impute such missing data.
Our paper proposes a novel method for learning a representation of a supply chain network as a heterogeneous graph, allowing us to predict the existence of various types of dependencies, as opposed to the incumbent state-of-the-art (SNLP) approach of predicting just one type of dependency using a homogeneous graph. Moreover, our embeddings are learnable, which may also be responsible for the improved performance relative to SNLP.
In future work we wish to perform an ablation study to isolate the contributions of the learnable embedding and heterogeneous graph components. An extension of this work will include exploration of graph learning techniques for multi-hop reasoning to detect more complex dependencies associated with paths in the graphs, as opposed to single links.
- Adamic and Adar (2003). Friends and neighbors on the Web. Social Networks 25(3), pp. 211–230.
- Barabasi and Albert (1999). Emergence of Scaling in Random Networks. Science 286(5439), pp. 509–512.
- Barney (1991). Firm resources and sustained competitive advantage. Journal of Management 17(1), pp. 99–120.
- Brintrup et al. (2018a). Predicting Hidden Links in Supply Networks. Complexity 2018, pp. 1–12.
- Brintrup et al. (2018b). Predicting Hidden Links in Supply Networks. Complexity 2018 (January).
- Bruna et al. (2014). Spectral Networks and Locally Connected Networks on Graphs. arXiv:1312.6203.
- Chauhan et al. (2020). The relationship between nested patterns and the ripple effect in complex supply networks. International Journal of Production Research, pp. 1–17.
- Choi et al. (2001). Supply networks and complex adaptive systems: control versus emergence. Journal of Operations Management 19(3), pp. 351–366.
- Dolgui et al. (2018). Ripple effect in the supply chain: an analysis and recent literature. International Journal of Production Research 56(1–2), pp. 414–430.
- Duvenaud et al. (2015). Convolutional Networks on Graphs for Learning Molecular Fingerprints. arXiv:1509.09292.
- Grover and Leskovec (2016). node2vec: Scalable Feature Learning for Networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864.
- Hamilton et al. (2017). Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 1025–1035.
- Hamilton (2020). Graph Representation Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning 14(3), pp. 1–159.
- Huang and Zitnik (2021). Graph Meta Learning via Local Subgraphs. arXiv:2006.07889.
- Katz (1953). A new status index derived from sociometric analysis. Psychometrika 18(1), pp. 39–43.
- Kipf and Welling (2017). Semi-Supervised Classification with Graph Convolutional Networks. arXiv:1609.02907.
- Kovacs et al. (2019). Network-based prediction of protein interactions. Nature Communications 10(1), 1240.
- Leicht et al. (2006). Vertex similarity in networks. Physical Review E 73(2), 026120.
- Liben-Nowell and Kleinberg (2007). The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology 58(7), pp. 1019–1031.
- Lu et al. (2009). Similarity index based on local paths for link prediction of complex networks. Physical Review E 80(4), 046122.
- Niepert et al. (2016). Learning Convolutional Neural Networks for Graphs. In International Conference on Machine Learning, pp. 2014–2023.
- Palumbo et al. (2020). entity2rec: Property-specific Knowledge Graph Embeddings for Item Recommendation. Expert Systems with Applications 151, 113235.
- Perozzi et al. (2014). DeepWalk: online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710.
- Schlichtkrull et al. (2018a). Modeling relational data with graph convolutional networks. In European Semantic Web Conference, pp. 593–607.
- Schlichtkrull et al. (2018b). Modeling relational data with graph convolutional networks. In The Semantic Web, Cham, pp. 593–607.
- Tang et al. (2015). LINE: Large-scale Information Network Embedding. In Proceedings of the 24th International Conference on World Wide Web, pp. 1067–1077.
- Teru et al. (2020). Inductive Relation Prediction by Subgraph Reasoning. arXiv:1911.06962.
- Wichmann et al. (2020). Extracting supply chain maps from news articles using deep neural networks. International Journal of Production Research 58(17), pp. 5320–5336.
- Yang et al. (2015). Embedding entities and relations for learning and inference in knowledge bases. In Proceedings of the International Conference on Learning Representations (ICLR) 2015.
- Zhang and Chen (2017). Weisfeiler-Lehman Neural Machine for Link Prediction. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 575–583.
- Zhang and Chen (2018). Link Prediction Based on Graph Neural Networks. arXiv:1802.09691.
- Zhang et al. (2020). Revisiting Graph Neural Networks for Link Prediction. arXiv:2010.16103.
- Zhou et al. (2009). Predicting missing links via local information. The European Physical Journal B 71(4), pp. 623–630.