Introduction & Related Work
Global institutions are exposed to various types of risks, ranging from market risks related to the institution’s core functions CFI (2020), BOE (2020) to operational, compliance, cyber-security, geopolitical and reputational risks Matellio (2020). Risks in these areas are inherently hard to identify and quantify. Risk mitigation is also extremely challenging, which is why the lead time gained by identifying and quantifying a risk early is crucial. Unfortunately, the lack of proper risk assessment has led to the demise of several organizations once the risk materialized De Haas and Van Horen (2012). In our work, we present a system for risk identification that uses knowledge graphs to represent risk areas and a neural embedding model Reimers and Gurevych (2019) for multi-lingual news matching tailored towards financial institutions.
The formal definition of a risk and the study of methods for risk assessment and mitigation have a long history of academic research Henley and Kumamoto (1981), Covello and Mumpower (1985), Rechard (1999), Bedford et al. (2001), Thompson et al. (2005) and Zio (2009). Similarly, the use of natural language processing and knowledge graphs for news recommendation has been studied extensively and is used widely in practice. The majority of the work has focused on developing news recommendation systems tailored towards users' preferences. Wang et al. (2020) describe a news recommendation system employed by a major financial ratings agency that uses a neural embedding model developed by Peters et al. (2018) to produce contextual embeddings of news. Other approaches to news recommendation include IJntema et al. (2010), who use externally developed ontologies to find news, collaborative filtering Lu et al. (2015) and graph embeddings Ren et al. (2019).
The proposed system (Figure 1) consists of four main components. It starts with a given set of risks identified by domain experts and produces a list of relevant news for each risk. The input to the system is a repository of risk descriptions provided by the domain experts. The first component extracts a set of relevant entities from the textual description of the risk. The second component uses these entities to construct a knowledge graph. The third component searches multiple news sources for a set of keywords related to the risk and parses the results. The fourth component is a neural network model used to rank news events using contextual embeddings generated for the headlines as well as the risk descriptors.
In order to demonstrate the end-to-end workflow of our system, we created a set of artificial risks that a financial institution might face, shown in Table 1 and inspired by the risk types defined in Matellio (2020).
Risk Information Extraction
Given a textual description of the risk, the text is decomposed into (1) trigger (the root cause of the risk), (2) outcome (the impact of the given risk) and (3) exposure vessel (the entity/vessel the risk impacts).
Several approaches were tested to decompose the text into the three categories above. One of these is based on a deep bi-LSTM neural network sequence prediction model developed by Stanovsky et al. (2018) for supervised open information extraction. The model breaks a given sentence (in our case the risk text) into the relationships it expresses. In particular, the model extracts a list of propositions, each composed of a single predicate and an arbitrary number of arguments Stanovsky et al. (2018). As an example, consider risk (1) in Table 1. The model breaks the sentence into the components described in Figure 2.
In this example, the first argument maps to the trigger of the risk, which in this case is cyber attacks. The outcome and exposure vessel typically follow the first verb in the sentence. Here, the retail banking business is the exposure vessel, and the argument after the second verb, loss of customer data, is the outcome.
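The mapping from propositions to the three categories can be illustrated with a minimal sketch. It does not call the extraction model; it starts from hand-written propositions, and the example sentence, the `Proposition` and `RiskComponents` types and the `decompose_risk` helper are all hypothetical names introduced here. The positional heuristic is only one reading of the worked example above, not necessarily the rule used in the system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Proposition:
    """One open-IE proposition: a predicate plus its ordered arguments."""
    predicate: str
    arguments: List[str]


@dataclass
class RiskComponents:
    trigger: Optional[str]
    exposure_vessel: Optional[str]
    outcome: Optional[str]


def decompose_risk(propositions: List[Proposition]) -> RiskComponents:
    """Map propositions (in sentence order) to the three risk categories.

    Assumed heuristic, following the worked example in the text:
      - trigger: first argument of the first predicate
      - exposure vessel: second argument of the first predicate
      - outcome: last argument of the second predicate, if there is one
    """
    trigger = exposure = outcome = None
    if propositions:
        first = propositions[0]
        if len(first.arguments) >= 1:
            trigger = first.arguments[0]
        if len(first.arguments) >= 2:
            exposure = first.arguments[1]
    if len(propositions) >= 2 and propositions[1].arguments:
        outcome = propositions[1].arguments[-1]
    return RiskComponents(trigger, exposure, outcome)


# Hypothetical extractor output for a sentence such as
# "Cyber attacks impact the retail banking business causing loss of customer data."
example = [
    Proposition("impact", ["cyber attacks", "the retail banking business"]),
    Proposition("causing", ["cyber attacks", "loss of customer data"]),
]
components = decompose_risk(example)
print(components)
```

A purely positional rule like this is brittle for passive or heavily nested sentences, which is presumably one reason several decomposition approaches were tested.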
Risks Knowledge Graph
We use the extracted information to construct a knowledge graph to assist in the risk identification and assessment process. A knowledge graph formally represents semantics by describing entities and their relationships. In our system, the knowledge graph is designed for visualising the risks faced by the institution and for reasoning over data. This is intended to help domain experts understand, for example, how risks are related to each other and what the key triggers of risk facing the institution are. The nodes of the graph describe the triggers, outcomes and exposure vessels, and the edges describe the relationships between the three categories. In our case, a trigger causes a given outcome and the outcome impacts a given exposure vessel. Figure 3 shows the knowledge graph representation for the risks in Table 1.
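The graph structure described above (typed nodes, with "causes" and "impacts" edges) can be sketched in a few lines of standard-library Python. This is an illustrative sketch rather than the system's implementation: the `RiskKnowledgeGraph` class, its method names, and the second and third sample risks are assumptions made for the example. Only the cyber attack risk is taken from the text.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source node, relation, target node)


class RiskKnowledgeGraph:
    """Typed nodes plus 'causes' and 'impacts' edges."""

    def __init__(self) -> None:
        # Simplification: each node keeps a single type (the last one assigned).
        self.node_types: Dict[str, str] = {}
        self.edges: Set[Edge] = set()

    def add_risk(self, trigger: str, outcome: str, exposure_vessel: str) -> None:
        """Add one risk as trigger -causes-> outcome -impacts-> exposure vessel."""
        self.node_types[trigger] = "trigger"
        self.node_types[outcome] = "outcome"
        self.node_types[exposure_vessel] = "exposure_vessel"
        self.edges.add((trigger, "causes", outcome))
        self.edges.add((outcome, "impacts", exposure_vessel))

    def vessels_exposed_to(self, trigger: str) -> Set[str]:
        """Follow the two-hop path from a trigger to its exposure vessels."""
        outcomes = {t for s, r, t in self.edges if s == trigger and r == "causes"}
        return {t for s, r, t in self.edges if s in outcomes and r == "impacts"}

    def triggers_by_reach(self) -> List[Tuple[str, int]]:
        """Rank triggers by the number of distinct exposure vessels they reach."""
        triggers = [n for n, kind in self.node_types.items() if kind == "trigger"]
        ranked = [(t, len(self.vessels_exposed_to(t))) for t in triggers]
        return sorted(ranked, key=lambda item: (-item[1], item[0]))


graph = RiskKnowledgeGraph()
# Risk taken from the worked example in the text.
graph.add_risk("cyber attacks", "loss of customer data", "the retail banking business")
# Hypothetical risks, added only to make the queries non-trivial.
graph.add_risk("cyber attacks", "payment system outage", "the payments business")
graph.add_risk("regulatory change", "higher compliance costs", "the retail banking business")

print(graph.vessels_exposed_to("cyber attacks"))
print(graph.triggers_by_reach())
```

Queries of this kind correspond to the questions raised above, such as which triggers affect the most parts of the institution. A graph library or graph database would be a natural replacement for the edge set at larger scale.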
Risk and News Embeddings
News items for each risk are retrieved from (1) Google News and (2) the Global Database of Events, Language and Tone (GDELT) Leetaru and Schrodt (2013), based on the trigger identified. This, however, returns a large amount of news, much of which is irrelevant to the risk itself. To help filter the retrieved news, we use a neural network model coupled with a cosine similarity metric to identify the most relevant news for each risk. The model is based on the bidirectional encoder representations from transformers (BERT) neural network Devlin et al. (2018), which is pre-trained in part by predicting masked words in a sentence. Our system uses Sentence-BERT, an extension of the original model used to compute contextual sentence embeddings Reimers and Gurevych (2019).
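The filtering step can be sketched as ranking headline embeddings by their cosine similarity to the risk embedding. The sketch below is a sketch under stated assumptions: the three-dimensional vectors are made-up stand-ins for Sentence-BERT embeddings, the headlines are invented, and the `top_k` and `min_score` parameters are hypothetical, since the text does not state how many items are kept or whether a similarity threshold is applied.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_headlines(
    risk_vec: Sequence[float],
    headlines: List[str],
    headline_vecs: List[Sequence[float]],
    top_k: int = 2,          # assumed cut-off, not given in the text
    min_score: float = 0.0,  # assumed threshold, not given in the text
) -> List[Tuple[str, float]]:
    """Return up to top_k (headline, score) pairs, most similar first."""
    scored = [
        (headline, cosine_similarity(risk_vec, vec))
        for headline, vec in zip(headlines, headline_vecs)
    ]
    kept = [item for item in scored if item[1] >= min_score]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]


# Toy stand-ins for sentence embeddings; real ones would come from a
# Sentence-BERT encoder and have several hundred dimensions.
risk_vec = [1.0, 0.0, 0.0]
headlines = [
    "Bank discloses customer data breach",
    "Ransomware hits payment processor",
    "Local team wins championship",
]
headline_vecs = [
    [0.9, 0.1, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 1.0],
]

top = rank_headlines(risk_vec, headlines, headline_vecs)
for headline, score in top:
    print(f"{score:.3f}  {headline}")
```

Because cosine similarity depends only on the angle between vectors, the ranking is unaffected by embedding magnitude, which makes it a common choice for comparing sentence embeddings of texts with different lengths.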
Usage & Conclusion
We presented a system for institutional risk identification using knowledge graphs and automated news profiling. The system discussed was tested on a set of 1,250 risks faced by our institution for the fourth quarter of 2019. The results were vetted by our business partners who provided strong positive feedback on the output. The model achieved an accuracy of 96.6% in identifying relevant news on a set of 1,132 news headlines reviewed by domain experts.
- Probabilistic risk analysis: foundations and methods. Cambridge University Press.
- What risks do banks take?.
- Major risks for banks.
- Risk analysis and risk management: an historical perspective. Risk Analysis 5 (2), pp. 103–120.
- International shock transmission after the Lehman Brothers collapse: evidence from syndicated lending. American Economic Review 102 (3), pp. 231–37.
- BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
- AllenNLP: a deep semantic natural language processing platform. arXiv preprint arXiv:1803.07640.
- Reliability engineering and risk assessment. Prentice Hall.
- Ontology-based news recommendation. In Proceedings of the 2010 EDBT/ICDT Workshops, pp. 1–6.
- GDELT: global data on events, location, and tone, 1979–2012. In ISA annual convention, Vol. 2, pp. 1–49.
- Content-based collaborative filtering for news topic recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 29.
- 10 major risks faced by banks in 2021.
- Deep contextualized word representations. arXiv preprint arXiv:1802.05365.
- Historical relationship between performance assessment for radioactive waste disposal and other types of risk assessment. Risk Analysis 19 (5), pp. 763–807.
- Sentence-BERT: sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084.
- Financial news recommendation based on graph embeddings. Decision Support Systems 125, pp. 113115.
- Supervised open information extraction. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 885–895.
- Interdisciplinary vision: the first 25 years of the Society for Risk Analysis (SRA), 1980–2005. Risk Analysis: An International Journal 25 (6), pp. 1333–1386.
- Discovery news: a generic framework for financial news recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, pp. 13390–13395.
- Reliability engineering: old problems and new challenges. Reliability Engineering & System Safety 94 (2), pp. 125–141.
The authors would like to acknowledge Danny Schwartzman, Brian O’toole, Cecilia Tilli, Natraj Raman, Salwa Alamir, Charese Smiley and Andrea Stefanucci from J.P. Morgan for their input and suggestions at various stages of the project.
This paper was prepared for informational purposes by the Artificial Intelligence Research group of JPMorgan Chase & Co and its affiliates (“J.P. Morgan”), and is not a product of the Research Department of J.P. Morgan. J.P. Morgan makes no representation and warranty whatsoever and disclaims all liability, for the completeness, accuracy or reliability of the information contained herein. This document is not intended as investment research or investment advice, or a recommendation, offer or solicitation for the purchase or sale of any security, financial instrument, financial product or service, or to be used in any way for evaluating the merits of participating in any transaction, and shall not constitute a solicitation under any jurisdiction or to any person, if such solicitation under such jurisdiction or to such person would be unlawful.