Iterative Paraphrastic Augmentation with Discriminative Span Alignment

07/01/2020
by Ryan Culkin, et al.
We introduce a novel paraphrastic augmentation strategy based on sentence-level lexically constrained paraphrasing and discriminative span alignment. Our approach allows for the large-scale expansion of existing resources, or the rapid creation of new resources from a small, manually produced seed corpus. We illustrate our framework on the Berkeley FrameNet Project, a large-scale language understanding effort spanning more than two decades of human labor. Based on roughly four days of collecting training data for the alignment model and approximately one day of parallel compute, we automatically generate 495,300 unique (Frame, Trigger) combinations annotated in context, a roughly 50x expansion atop FrameNet v1.7.

