Atypical lexical abbreviations identification in Russian medical texts

06/04/2022
by Anna Berdichevskaia, et al.

Abbreviation is a method of word formation that constructs a shortened term from the first letters of an initial phrase. Implicit abbreviations frequently cause comprehension difficulties for unprepared readers. In this paper, we propose an efficient ML-based algorithm that identifies abbreviations in Russian texts. The method achieves a ROC AUC score of 0.926 and an F1 score of 0.706, which are competitive with the baselines. Along with the pipeline, we also establish what is, to our knowledge, the first Russian dataset relevant to this task.
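
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how ROC AUC and F1 are computed for a binary "abbreviation / not abbreviation" classifier. The labels, scores, and the 0.5 decision threshold are toy values invented for illustration; they are not taken from the paper, and the function names `f1_score` and `roc_auc` are hypothetical helpers, not the authors' code.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is O(P * N) and is meant
    for illustration, not for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed): 1 = token is an abbreviation, 0 = it is not.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]   # hypothetical classifier scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"F1 = {f1:.3f}, ROC AUC = {auc:.3f}")
```

ROC AUC depends only on how the scores rank positives against negatives, while F1 depends on the chosen threshold, which is one reason a model can score high on the former and noticeably lower on the latter.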


