False Positive and Cross-relation Signals in Distant Supervision Data

11/14/2017
by   Anca Dumitrache, et al.

Distant supervision (DS) is a well-established method for relation extraction from text, based on the assumption that when a knowledge base contains a relation between a term pair, sentences that contain that pair are likely to express the relation. In this paper, we use the results of a crowdsourcing relation extraction task to identify two problems with DS data quality: the widely varying degree of false positives across different relations, and the observed causal connection between relations, which the DS method does not take into account. The crowdsourcing data is aggregated using ambiguity-aware CrowdTruth metrics, which capture and interpret inter-annotator disagreement. We also present preliminary results of using the crowd to enhance DS training data for a relation classification model, without requiring the crowd to annotate the entire set.
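The DS assumption described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy knowledge base, the `distant_labels` helper, and the example sentences are all hypothetical, and term matching is reduced to a case-insensitive substring check. The second sentence shows how the heuristic can produce a false positive, since it mentions the term pair without expressing the relation.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base: (term1, term2) -> relation name.
KB: Dict[Tuple[str, str], str] = {
    ("aspirin", "headache"): "treats",
}

def distant_labels(
    sentences: List[str], kb: Dict[Tuple[str, str], str]
) -> List[Tuple[str, str, str, str]]:
    """Label every sentence that mentions both terms of a KB pair
    with that pair's relation (the distant supervision assumption)."""
    labeled = []
    for sentence in sentences:
        text = sentence.lower()
        for (term1, term2), relation in kb.items():
            # Simplification: substring match instead of entity linking.
            if term1 in text and term2 in text:
                labeled.append((sentence, term1, term2, relation))
    return labeled

# Hypothetical example sentences.
SENTENCES = [
    "Aspirin is often used to relieve headache.",
    "The patient took aspirin, but the headache got worse.",  # likely false positive
    "Ibuprofen reduces fever.",
]

LABELED = distant_labels(SENTENCES, KB)
for sentence, term1, term2, relation in LABELED:
    print(f"{relation}({term1}, {term2}): {sentence}")
```

Both of the first two sentences receive the `treats` label, although only the first arguably expresses it. Noise of this kind is what crowd annotations could help detect.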

research
09/03/2018

Crowdsourcing Semantic Label Propagation in Relation Classification

Distant supervision is a popular method for performing relation extracti...
research
01/09/2017

Crowdsourcing Ground Truth for Medical Relation Extraction

Cognitive computing systems require human labeled data for evaluation, a...
research
11/19/2015

Knowledge Base Population using Semantic Label Propagation

A crucial aspect of a knowledge base population system that extracts new...
research
12/07/2020

H-FND: Hierarchical False-Negative Denoising for Distant Supervision Relation Extraction

Although distant supervision automatically generates training data for r...
research
08/18/2018

CrowdTruth 2.0: Quality Metrics for Crowdsourcing with Disagreement

Typically crowdsourcing-based approaches to gather annotated data use in...
research
05/24/2018

Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning

Distant supervision has become the standard method for relation extracti...
research
06/08/2023

Open Set Relation Extraction via Unknown-Aware Training

The existing supervised relation extraction methods have achieved impres...
