A Consolidated System for Robust Multi-Document Entity Risk Extraction and Taxonomy Augmentation

09/23/2019
by Berk Ekmekci, et al.

We introduce a hybrid human-automated system that provides scalable entity-risk relation extractions across large data sets. Given an expert-defined keyword taxonomy, entities, and data sources, the system returns text extractions based on bidirectional token distances between entities and keywords, and it expands taxonomy coverage with word vector encodings. Our system has a simpler architecture than alerting-focused systems, motivated by high-coverage use cases in the risk mining space such as due diligence activities and intelligence gathering. We provide an overview of the system and expert evaluations for a range of token distances. We demonstrate that single- and multi-sentence distance groups significantly outperform baseline extractions, with shorter, single-sentence extractions being preferred by analysts. As the taxonomy expands, the amount of relevant information increases and multi-sentence extractions become more preferred, but this is tempered by entity-risk relations becoming more indirect. We discuss the implications of these observations for users, for the management of ambiguity and taxonomy expansion, and for future system modifications.
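
To make the idea of bidirectional token-distance extraction concrete, here is a minimal sketch in Python. It is not the authors' implementation. The function names, the regex tokenizer, the restriction to single-token entities and keywords, and the example text are all assumptions made for illustration. The sketch looks for taxonomy keywords within a fixed number of tokens on either side of an entity mention and returns the span between them.

```python
import re


def tokenize(text):
    # Assumption: a simple lowercase word tokenizer; the paper's tokenizer is not specified here.
    return re.findall(r"\w+", text.lower())


def extract_by_token_distance(text, entity, keywords, max_distance):
    """Return (keyword, distance, snippet) for every taxonomy keyword found
    within max_distance tokens before or after an entity mention.

    Hypothetical helper: handles single-token entities and keywords only.
    """
    tokens = tokenize(text)
    entity = entity.lower()
    keyword_set = {k.lower() for k in keywords}
    hits = []
    for e, token in enumerate(tokens):
        if token != entity:
            continue
        # Search window extends in both directions from the entity mention.
        lo = max(0, e - max_distance)
        hi = min(len(tokens), e + max_distance + 1)
        for k in range(lo, hi):
            if tokens[k] in keyword_set:
                start, end = min(e, k), max(e, k)
                snippet = " ".join(tokens[start:end + 1])
                hits.append((tokens[k], abs(k - e), snippet))
    return hits


if __name__ == "__main__":
    # Made-up example text and entity, for illustration only.
    text = ("Regulators said Acme faces a lawsuit over the spill. "
            "Earlier, a fraud probe had targeted Acme.")
    for hit in extract_by_token_distance(text, "Acme", ["lawsuit", "fraud"], 4):
        print(hit)
```

A larger `max_distance` lets the window cross sentence boundaries, which corresponds loosely to the multi-sentence distance groups described above. A smaller value keeps extractions short and closer to a direct entity-risk relation.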
