Joint Structured Models for Extraction from Overlapping Sources

05/01/2010
by Rahul Gupta, et al.
We consider the problem of jointly training structured models for extraction from sources whose instances partially overlap. This has important applications such as user-driven ad-hoc information extraction on the web. Such applications present new challenges in the number of sources and their arbitrary pattern of overlap, neither of which arises in earlier collective training schemes applied to two sources. We present an agreement-based learning framework and alternatives within it that trade off tractability, robustness to noise, and extent of agreement. We provide a principled scheme to discover low-noise agreement sets in unlabeled data across the sources. Through extensive experiments over 58 real datasets, we establish that our method of additively rewarding agreement over maximal segments of text provides the best trade-offs and also outperforms alternatives such as collective inference, staged training, and multi-view learning.

research
06/04/2018

Agreement-based Learning

Model selection is a problem that has occupied machine learning research...
research
10/27/2021

Deep Transfer Learning for Multi-source Entity Linkage via Domain Adaptation

Multi-source entity linkage focuses on integrating knowledge from multip...
research
03/09/2022

ASET: Ad-hoc Structured Exploration of Text Collections [Extended Abstract]

In this paper, we propose a new system called ASET that allows users to ...
research
12/17/2020

InSRL: A Multi-view Learning Framework Fusing Multiple Information Sources for Distantly-supervised Relation Extraction

Distant supervision makes it possible to automatically label bags of sen...
research
11/11/2018

Multi-Source Neural Variational Inference

Learning from multiple sources of information is an important problem in...
research
07/09/2020

Multi-view Orthonormalized Partial Least Squares: Regularizations and Deep Extensions

We establish a family of subspace-based learning methods for multi-view l...
research
02/14/2023

Investigating Multi-source Active Learning for Natural Language Inference

In recent years, active learning has been successfully applied to an arr...
