DefExt: A Semi Supervised Definition Extraction Tool

06/08/2016
by Luis Espinosa-Anke, et al.

We present DefExt, an easy-to-use, semi-supervised Definition Extraction Tool. DefExt is designed to extract from a target corpus those textual fragments in which a term is explicitly mentioned together with its core features, i.e. its definition. It combines a sequential labeling algorithm based on Conditional Random Fields with a bootstrapping approach. Bootstrapping enables the model to gradually adapt to the idiosyncrasies of the target corpus. In this paper we describe the main components of the toolkit and report experimental results from both automatic and manual evaluation. We release DefExt as open source along with the files needed to run it on any Unix machine. We also provide access to training and test data for immediate use.
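
The bootstrapping idea described above can be illustrated with a minimal self-training sketch. This is not DefExt's code: the token-overlap scorer below is a deliberately simple stand-in for a Conditional Random Fields model, and the names `train`, `bootstrap`, `iterations` and `threshold` are hypothetical choices made for this illustration. It only shows the general loop: train on seed data, label the target corpus, add confident predictions to the training set, and retrain.

```python
from collections import Counter
from typing import Callable, List, Tuple

Labeled = Tuple[str, bool]  # (sentence, is_definition)


def train(labeled: List[Labeled]) -> Callable[[str], float]:
    """Toy stand-in for CRF training (assumption, not DefExt's model).

    Returns a scorer giving the fraction of a sentence's distinct tokens
    seen more often in definitions than in non-definitions.
    """
    pos, neg = Counter(), Counter()
    for sentence, is_def in labeled:
        (pos if is_def else neg).update(set(sentence.lower().split()))

    def score(sentence: str) -> float:
        tokens = set(sentence.lower().split())
        if not tokens:
            return 0.0
        hits = sum(1 for t in tokens if pos[t] > neg[t])
        return hits / len(tokens)

    return score


def bootstrap(seed: List[Labeled], target: List[str],
              iterations: int = 3, threshold: float = 0.5):
    """Self-training loop: grow the training set with confident predictions."""
    labeled = list(seed)
    pool = list(target)
    for _ in range(iterations):
        score = train(labeled)  # retrain on everything gathered so far
        confident = [s for s in pool if score(s) >= threshold]
        if not confident:
            break  # nothing new to learn from the target corpus
        labeled.extend((s, True) for s in confident)
        pool = [s for s in pool if s not in confident]
    return labeled, pool
```

Calling `bootstrap(seed, target)` returns the enlarged training set and the target sentences that were never confidently labeled. In a real system the scorer would be replaced by a sequence labeler, and the confidence threshold would need tuning on held-out data.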


