ABB-BERT: A BERT model for disambiguating abbreviations and contractions

07/08/2022
by prateek-kacker, et al.

Abbreviations and contractions are common in text across many domains. For example, doctors' notes contain many contractions, which are often personalized to the individual writer. Existing spelling correction models are poorly suited to expanding them because a shortened form can drop many characters from the original word. In this work, we propose ABB-BERT, a BERT-based model for ambiguous language containing abbreviations and contractions. ABB-BERT can rank candidate expansions from thousands of options and is designed for scale. It is trained on Wikipedia text, and the algorithm allows it to be fine-tuned with little compute to improve performance for a specific domain or person. We are publicly releasing the training dataset for abbreviations and contractions derived from Wikipedia.
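
To make the task concrete, the following is a minimal sketch of the problem setup, not the ABB-BERT method itself. It assumes that a shortened form keeps the first letter of the full word and preserves character order, which is an illustrative simplification and not something stated in the abstract. The function names (`is_subsequence`, `candidate_expansions`, `rank`) and the toy scores are hypothetical; in a real system the scoring function would be a contextual model rather than a lookup table.

```python
from typing import Callable, List


def is_subsequence(short: str, full: str) -> bool:
    """Return True if the characters of `short` appear in `full` in the same order."""
    remaining = iter(full.lower())
    # `ch in remaining` advances the iterator, so order is enforced.
    return all(ch in remaining for ch in short.lower())


def candidate_expansions(abbr: str, vocabulary: List[str]) -> List[str]:
    """Collect words that could plausibly shorten to `abbr`.

    Assumption (for illustration only): the shortened form keeps the first
    letter and drops characters without reordering them.
    """
    abbr = abbr.lower()
    if not abbr:
        return []
    return [
        word
        for word in vocabulary
        if len(word) > len(abbr)
        and word[0].lower() == abbr[0]
        and is_subsequence(abbr, word)
    ]


def rank(candidates: List[str], score: Callable[[str], float]) -> List[str]:
    """Order candidates from most to least likely under a scoring function."""
    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    vocabulary = ["patient", "point", "pt", "part", "payment", "doctor"]
    candidates = candidate_expansions("pt", vocabulary)

    # Hypothetical context scores standing in for a model's output.
    toy_scores = {"patient": 0.9, "point": 0.1}
    print(rank(candidates, lambda word: toy_scores.get(word, 0.0)))
```

Even this small vocabulary yields several valid candidates for a two-letter form, which is why character-level spelling correction is a poor fit and a context-aware ranking step is needed.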


research
05/02/2020

Generating Derivational Morphology with BERT

Can BERT generate derivationally complex words? We present the first stu...
research
11/12/2020

An Interpretable End-to-end Fine-tuning Approach for Long Clinical Text

Unstructured clinical text in EHRs contains crucial information for appl...
research
10/15/2022

AraLegal-BERT: A pretrained language model for Arabic Legal text

The effectiveness of the BERT model on multiple linguistic tasks has bee...
research
05/23/2021

Killing Two Birds with One Stone: Stealing Model and Inferring Attribute from BERT-based APIs

The advances in pre-trained models (e.g., BERT, XLNET and etc) have larg...
research
06/27/2023

Investigating Cross-Domain Behaviors of BERT in Review Understanding

Review score prediction requires review text understanding, a critical r...
research
08/15/2021

Maps Search Misspelling Detection Leveraging Domain-Augmented Contextual Representations

Building an independent misspelling detector and serve it before correct...
research
11/15/2019

Evaluating robustness of language models for chief complaint extraction from patient-generated text

Automated classification of chief complaints from patient-generated text...
