Primer AI's Systems for Acronym Identification and Disambiguation

12/14/2020
by Nicholas Egan, et al.

The prevalence of ambiguous acronyms makes scientific documents harder to understand for humans and machines alike, creating a need for models that can automatically identify acronyms in text and disambiguate their meaning. We introduce new methods for acronym identification and disambiguation: our acronym identification model projects learned token embeddings onto tag predictions, and our acronym disambiguation model finds training examples whose sentence embeddings are similar to those of test examples. Both of our systems achieve significant performance gains over previously proposed methods and perform competitively on the SDU@AAAI-21 shared task leaderboard. Our models were trained in part on new distantly supervised datasets for these tasks, which we call AuxAI and AuxAD. We also identified a duplication conflict issue in the SciAD dataset and created a deduplicated version of SciAD that we call SciAD-dedupe. We have publicly released all three of these datasets and hope that they help the community make further strides in scientific document understanding.
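
To illustrate the disambiguation idea described above, here is a minimal sketch of picking an expansion by looking up the most similar training example. It is not the authors' implementation: the abstract only says that the model finds training examples with similar sentence embeddings. The use of cosine similarity, the restriction to training examples that share the acronym, the function names (`cosine`, `disambiguate`), and the hand-written three-dimensional vectors standing in for real sentence embeddings are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def disambiguate(acronym, test_embedding, train_examples):
    """Return the expansion of the most similar training example.

    Only training examples for the same acronym are considered; this
    filtering step is an assumption, not something the abstract states.
    Returns None when the acronym never appears in the training data.
    """
    candidates = [ex for ex in train_examples if ex["acronym"] == acronym]
    if not candidates:
        return None
    best = max(candidates, key=lambda ex: cosine(test_embedding, ex["embedding"]))
    return best["expansion"]


# Toy training set: the vectors are hypothetical placeholders for
# sentence embeddings that a real encoder would produce.
TRAIN = [
    {"acronym": "CNN", "embedding": [0.9, 0.1, 0.0],
     "expansion": "convolutional neural network"},
    {"acronym": "CNN", "embedding": [0.0, 0.2, 0.9],
     "expansion": "Cable News Network"},
    {"acronym": "SVM", "embedding": [0.5, 0.5, 0.1],
     "expansion": "support vector machine"},
]

if __name__ == "__main__":
    print(disambiguate("CNN", [0.8, 0.2, 0.1], TRAIN))
```

In this sketch, a test sentence whose embedding lies close to the first training example is assigned that example's expansion. A real system would obtain the embeddings from a trained sentence encoder and might aggregate over several neighbors rather than taking only the single closest one.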


