A Framework for Institutional Risk Identification using Knowledge Graphs and Automated News Profiling

by Mahmoud Mahfouz, et al.
JPMorgan Chase & Co.

Organizations around the world face an array of risks impacting their operations globally. It is imperative to have a robust risk identification process to detect and evaluate the impact of potential risks before they materialize. Given the nature of the task and its current reliance on deep subject matter expertise, most organizations utilize a heavily manual process. In our work, we develop an automated system that (a) continuously monitors global news, (b) autonomously identifies and characterizes risks, (c) estimates how close a risk's triggers are to being reached, and hence how far the risk is from manifesting its impact, and (d) identifies the organization's operational areas that may be most impacted by the risk. Other contributions include (a) a knowledge graph representation of risks and (b) matching of relevant news to the risks identified by the organization, using a neural embedding model to match the textual description of a given risk with multi-lingual news.




Introduction & Related Work

Global institutions are exposed to various types of risks, ranging from market risks related to the institution’s core functions CFI (2020), BOE (2020) to operational, compliance, cyber-security, geopolitical and reputational risks Matellio (2020). Risks in these areas are inherently hard to identify and quantify. Risk mitigation is also extremely challenging, which is why the runway provided by early identification and quantification of a risk is crucial. Unfortunately, the lack of proper risk assessment has led to the demise of several organizations once the risk manifested De Haas and Van Horen (2012). In our work, we present a system for risk identification, tailored towards financial institutions, that utilizes knowledge graphs for representing risk areas and a neural embedding model Reimers and Gurevych (2019) for multi-lingual news matching.

The formal definition of a risk and the study of methods for risk assessment and mitigation have a long history of academic research Henley and Kumamoto (1981), Covello and Mumpower (1985), Rechard (1999), Bedford et al. (2001), Thompson et al. (2005) and Zio (2009). Similarly, the use of natural language processing and knowledge graphs for news recommendation has been studied extensively and is used widely in practice. The majority of the work has focused on developing news recommendation systems tailored towards users' preferences.

Wang et al. (2020) describe a news recommendation system employed by a major financial ratings agency that utilizes a neural embedding model developed by Peters et al. (2018) to represent news as contextual embeddings. Other approaches to news recommendation include IJntema et al. (2010), which utilizes externally developed ontologies to find news, collaborative filtering Lu et al. (2015) and graph embeddings Ren et al. (2019).

System Architecture

The system proposed (Figure 1) consists of four main components, starting with a given set of risks identified by domain experts and producing a list of relevant news for each risk. The input to the system is a repository of risk descriptions provided by the domain experts. The first component extracts a set of relevant entities from the textual description of the risk. The second component uses these entities to construct a knowledge graph. The third component searches multiple news sources for a set of keywords related to the risk and parses the news returned. The fourth component is a neural network model used to rank news events using contextual embeddings generated for the headlines as well as the risk descriptors.

In order to demonstrate the end-to-end workflow of our system, we created a set of artificial risks that a financial institution would face, shown in Table 1 and inspired by the risk types defined in Matellio (2020).

(1) Cyber-attacks targeting the retail banking business causing a loss of customer data
(2) US - China trade war escalation affecting the corporate and investment banking business causing a decrease in revenues
(3) Employee misconduct in the investment banking business causing reputational damage
(4) Technology infrastructure failure in the corporate and investment banking business causing reputational damage and/or monetary loss
Table 1: Examples of Institutional Risks
Figure 1: System architecture

Risk Information Extraction

Given a textual description of the risk, the text is decomposed into (1) trigger (the root cause of the risk), (2) outcome (the impact of the given risk) and (3) exposure vessel (the entity/vessel the risk impacts).

Several approaches were tested to decompose the text into the three categories above. One of these is based on a deep bi-LSTM neural network sequence prediction model developed by Stanovsky et al. (2018) for supervised open information extraction. The model breaks a given sentence (in our case the risk text) into the relationships it expresses. In particular, the model extracts a list of propositions, each composed of a single predicate and an arbitrary number of arguments Stanovsky et al. (2018). As an example, consider risk (1) in Table 1. The model breaks the sentence into the components described in Figure 2.

Figure 2: Risk information extraction example Gardner et al. (2018).

In this example, the first argument maps to the trigger of the risk, which in this case is cyber-attacks. The outcome and exposure vessel typically follow the first verb in the sentence. Here, the retail banking business is the exposure vessel and the argument after the second verb, loss of customer data, is the outcome.
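The positional heuristic described above can be sketched in a few lines of Python. This is a minimal illustration rather than the system's actual implementation: the `Proposition` and `RiskComponents` containers and the `decompose` function are hypothetical names, each proposition is assumed to list the argument preceding the verb first and the arguments following it afterwards, and the propositions for risk (1) are written by hand instead of being produced by the open information extraction model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Proposition:
    """One predicate with its ordered arguments (hypothetical container).

    Assumption: args[0] precedes the verb, args[1:] follow it.
    """
    verb: str
    args: List[str]


@dataclass
class RiskComponents:
    trigger: Optional[str]
    exposure_vessel: Optional[str]
    outcome: Optional[str]


def decompose(propositions: List[Proposition]) -> RiskComponents:
    """Map propositions to trigger, exposure vessel and outcome by position."""
    if not propositions:
        return RiskComponents(None, None, None)

    first = propositions[0]
    # The first argument of the first proposition is taken as the trigger.
    trigger = first.args[0] if first.args else None
    # The argument following the first verb is taken as the exposure vessel.
    vessel = first.args[1] if len(first.args) > 1 else None

    # The argument following the second verb is taken as the outcome.
    outcome = None
    if len(propositions) > 1 and len(propositions[1].args) > 1:
        outcome = propositions[1].args[1]

    return RiskComponents(trigger, vessel, outcome)


# Hand-written propositions for risk (1); not actual model output.
risk_1 = [
    Proposition("targeting", ["Cyber-attacks", "the retail banking business"]),
    Proposition("causing", ["Cyber-attacks", "a loss of customer data"]),
]
components = decompose(risk_1)
print(components)
```

A purely positional rule like this is brittle for risk descriptions whose sentence structure differs from the examples, which is one reason to treat it as a starting point that domain experts review.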

Risks Knowledge Graph

We utilize the extracted information to construct a knowledge graph to assist in the risk identification and assessment process. A knowledge graph formally represents semantics by describing entities and their relationships. In our system, the knowledge graph is designed for visualising the risks faced by the institution and for reasoning over data. This is intended to help domain experts understand, for example, how risks are related to each other and which triggers are key to the risks facing the institution. The nodes of the graph describe the triggers, outcomes and exposure vessels, and the edges describe the relationships between the three categories. In our case, a trigger causes a given outcome and the outcome impacts a given exposure vessel. Figure 3 shows the knowledge graph representation for the risks in Table 1.

Figure 3: Knowledge graph of the risks in Table 1
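As a rough sketch of this graph structure, the example below stores the "causes" and "impacts" edges in a plain adjacency map and answers one of the questions a domain expert might ask: which triggers ultimately affect a given exposure vessel. The (trigger, outcome, exposure vessel) triples are labelled by hand from Table 1 for illustration, and `build_graph` and `triggers_impacting` are hypothetical helper names, not part of the system described here.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# (trigger, outcome, exposure vessel), hand-labelled from Table 1 for illustration.
RISKS: List[Tuple[str, str, str]] = [
    ("cyber-attacks", "loss of customer data", "retail banking business"),
    ("US-China trade war escalation", "decrease in revenues",
     "corporate and investment banking business"),
    ("employee misconduct", "reputational damage",
     "investment banking business"),
    ("technology infrastructure failure",
     "reputational damage and/or monetary loss",
     "corporate and investment banking business"),
]

Graph = Dict[str, Set[Tuple[str, str]]]


def build_graph(risks: List[Tuple[str, str, str]]) -> Graph:
    """Return an adjacency map: node -> set of (relation, neighbour)."""
    graph: Graph = defaultdict(set)
    for trigger, outcome, vessel in risks:
        graph[trigger].add(("causes", outcome))
        graph[outcome].add(("impacts", vessel))
    return graph


def triggers_impacting(graph: Graph, vessel: str) -> List[str]:
    """List triggers whose outcome impacts the given exposure vessel."""
    found = set()
    for node, edges in graph.items():
        for relation, outcome in edges:
            # Follow trigger -causes-> outcome -impacts-> vessel.
            if relation == "causes" and ("impacts", vessel) in graph.get(outcome, set()):
                found.add(node)
    return sorted(found)


graph = build_graph(RISKS)
cib_triggers = triggers_impacting(graph, "corporate and investment banking business")
print(cib_triggers)
```

Because nodes are keyed by their text, two risks that share an identically worded outcome would share a node, so node labels need to be normalised with care before drawing conclusions from paths through the graph.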

Risk and News Embeddings

The news for each risk is retrieved from (1) Google News and (2) the Global Database of Events, Language and Tone (GDELT) Leetaru and Schrodt (2013) based on the trigger identified. This, however, returns a large amount of news, much of which is irrelevant to the risk itself. To help filter the news retrieved, we use a neural network model coupled with a cosine similarity metric to identify the most relevant news for each risk. The model used is based on the Bidirectional Encoder Representations from Transformers (BERT) neural network Devlin et al. (2018), which is pre-trained to predict masked words in a sentence. Our system uses Sentence-BERT, an extension of the original model used to compute contextual sentence embeddings Reimers and Gurevych (2019).
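The ranking step can be illustrated with a small, self-contained sketch. It assumes the risk description and each headline have already been converted to embedding vectors; the three-dimensional vectors and the headlines below are invented placeholders standing in for Sentence-BERT output, and `cosine_similarity` and `rank_headlines` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_headlines(
    risk_vec: Sequence[float],
    headlines: List[Tuple[str, Sequence[float]]],
    top_k: int = 2,
) -> List[Tuple[float, str]]:
    """Return the top_k (score, headline) pairs, most similar first."""
    scored = [(cosine_similarity(risk_vec, vec), text) for text, vec in headlines]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for sentence embeddings (invented for illustration).
risk_embedding = [0.9, 0.1, 0.0]
news = [
    ("Bank discloses breach of customer records", [0.8, 0.2, 0.1]),
    ("Local team wins championship", [0.0, 0.1, 0.9]),
    ("Retail lender hit by ransomware", [0.7, 0.0, 0.2]),
]
top = rank_headlines(risk_embedding, news, top_k=2)
for score, text in top:
    print(f"{score:.3f}  {text}")
```

In a real pipeline the vectors would come from the sentence embedding model, and a similarity threshold could be used in place of, or alongside, a fixed `top_k`.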

Usage & Conclusion

We presented a system for institutional risk identification using knowledge graphs and automated news profiling. The system discussed was tested on a set of 1,250 risks faced by our institution for the fourth quarter of 2019. The results were vetted by our business partners who provided strong positive feedback on the output. The model achieved an accuracy of 96.6% in identifying relevant news on a set of 1,132 news headlines reviewed by domain experts.


  • T. Bedford, R. Cooke, et al. (2001) Probabilistic risk analysis: foundations and methods. Cambridge University Press.
  • BOE (2020) What risks do banks take?
  • CFI (2020) Major risks for banks.
  • V. T. Covello and J. Mumpower (1985) Risk analysis and risk management: an historical perspective. Risk Analysis 5 (2), pp. 103–120.
  • R. De Haas and N. Van Horen (2012) International shock transmission after the Lehman Brothers collapse: evidence from syndicated lending. American Economic Review 102 (3), pp. 231–37.
  • J. Devlin, M. Chang, K. Lee, and K. Toutanova (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  • M. Gardner, J. Grus, M. Neumann, O. Tafjord, P. Dasigi, N. Liu, M. Peters, M. Schmitz, and L. Zettlemoyer (2018) AllenNLP: a deep semantic natural language processing platform. arXiv preprint arXiv:1803.07640.
  • E. J. Henley and H. Kumamoto (1981) Reliability engineering and risk assessment. Prentice Hall.
  • W. IJntema, F. Goossen, F. Frasincar, and F. Hogenboom (2010) Ontology-based news recommendation. In Proceedings of the 2010 EDBT/ICDT Workshops, pp. 1–6.
  • K. Leetaru and P. A. Schrodt (2013) GDELT: global data on events, location, and tone, 1979–2012. In ISA annual convention, Vol. 2, pp. 1–49.
  • Z. Lu, Z. Dou, J. Lian, X. Xie, and Q. Yang (2015) Content-based collaborative filtering for news topic recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 29.
  • Matellio (2020) 10 major risks faced by banks in 2021.
  • M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer (2018) Deep contextualized word representations. arXiv preprint arXiv:1802.05365.
  • R. P. Rechard (1999) Historical relationship between performance assessment for radioactive waste disposal and other types of risk assessment. Risk Analysis 19 (5), pp. 763–807.
  • N. Reimers and I. Gurevych (2019) Sentence-BERT: sentence embeddings using siamese BERT-networks. arXiv preprint arXiv:1908.10084.
  • J. Ren, J. Long, and Z. Xu (2019) Financial news recommendation based on graph embeddings. Decision Support Systems 125, 113115.
  • G. Stanovsky, J. Michael, L. Zettlemoyer, and I. Dagan (2018) Supervised open information extraction. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 885–895.
  • K. M. Thompson, P. F. Deisler Jr, and R. C. Schwing (2005) Interdisciplinary vision: the first 25 years of the Society for Risk Analysis (SRA), 1980–2005. Risk Analysis: An International Journal 25 (6), pp. 1333–1386.
  • C. Wang, L. Kim, G. Bang, H. Singh, R. Kociuba, S. Pomerville, and X. Liu (2020) Discovery news: a generic framework for financial news recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, pp. 13390–13395.
  • E. Zio (2009) Reliability engineering: old problems and new challenges. Reliability Engineering & System Safety 94 (2), pp. 125–141.


The authors would like to acknowledge Danny Schwartzman, Brian O’toole, Cecilia Tilli, Natraj Raman, Salwa Alamir, Charese Smiley and Andrea Stefanucci from J.P. Morgan for their input and suggestions at various stages of the project.

This paper was prepared for informational purposes by the Artificial Intelligence Research group of JPMorgan Chase & Co and its affiliates (“J.P. Morgan”), and is not a product of the Research Department of J.P. Morgan. J.P. Morgan makes no representation and warranty whatsoever and disclaims all liability, for the completeness, accuracy or reliability of the information contained herein. This document is not intended as investment research or investment advice, or a recommendation, offer or solicitation for the purchase or sale of any security, financial instrument, financial product or service, or to be used in any way for evaluating the merits of participating in any transaction, and shall not constitute a solicitation under any jurisdiction or to any person, if such solicitation under such jurisdiction or to such person would be unlawful.