An AI-based Approach for Tracing Content Requirements in Financial Documents

10/28/2021
by Xiaochen Li, et al.

The completeness (in terms of content) of financial documents is a fundamental requirement for investment funds. To ensure completeness, financial regulators spend a huge amount of time carefully checking every financial document against the relevant content requirements, which prescribe the information types to be included in financial documents (e.g., the description of shares' issue conditions). Although several techniques have been proposed to automatically detect certain types of information in documents in various application domains, they provide limited support for helping regulators automatically identify the text chunks related to financial information types, due to the complexity of financial documents and the diversity of the sentences characterizing an information type. In this paper, we propose FITI, an artificial intelligence (AI)-based method for tracing content requirements in financial documents. Given a new financial document, FITI first selects a set of candidate sentences for efficient information type identification. Then, to rank the candidate sentences, FITI combines rule-based and data-centric approaches, leveraging information retrieval (IR) and machine learning (ML) techniques that analyze the words, sentences, and contexts related to an information type. Finally, a heuristic-based selector, which considers both the sentence ranking and a list of domain-specific indicator phrases related to each information type, determines the list of sentences corresponding to each information type. We evaluated FITI by assessing its effectiveness in tracing financial content requirements in 100 financial documents. Experimental results show that FITI provides accurate identification with average precision and recall values of 0.824 and 0.646, respectively. Furthermore, FITI can detect about 80 financial documents.
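
The three stages described above (candidate selection, ranking, heuristic selection) and the precision/recall evaluation can be illustrated with a minimal sketch. This is not FITI's implementation: the abstract does not specify the features, models, or thresholds, so every function name, the token-count filter, the bag-of-words cosine score (a simple stand-in for the IR/ML ranking), and the `threshold` value are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_candidates(sentences, min_tokens=4):
    """Stage 1 (assumed filter): drop very short lines such as headings."""
    return [s for s in sentences if len(tokenize(s)) >= min_tokens]


def rank(candidates, examples):
    """Stage 2 (stand-in for IR/ML ranking): similarity to known example sentences."""
    profile = Counter(t for e in examples for t in tokenize(e))
    scored = [(cosine(Counter(tokenize(s)), profile), s) for s in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def select(ranked, indicator_phrases, threshold=0.5):
    """Stage 3 (assumed heuristic): keep high scorers or indicator-phrase matches."""
    return [
        s for score, s in ranked
        if score >= threshold or any(p in s.lower() for p in indicator_phrases)
    ]


def trace(sentences, examples, indicator_phrases, threshold=0.5):
    """Run the three hypothetical stages for a single information type."""
    ranked = rank(select_candidates(sentences), examples)
    return select(ranked, indicator_phrases, threshold)


def precision_recall(predicted, relevant):
    """Precision and recall of predicted sentences against a labelled set."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Invented toy document; not taken from the paper's dataset.
    document = [
        "Fund Prospectus",
        "Shares are issued at the net asset value per share on each valuation day.",
        "The board of directors meets twice a year in Luxembourg.",
        "Subscription requests must reach the registrar before noon.",
    ]
    examples = ["Shares are issued at the net asset value per share."]
    found = trace(document, examples, indicator_phrases=["subscription requests"])
    print(found)
    print(precision_recall(found, relevant=[document[1], document[3]]))
```

In this toy setup the indicator-phrase rule lets a sentence through even when its similarity score is low, which mirrors the abstract's point that the selector considers both the ranking and the domain-specific phrases.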


