RuBioRoBERTa: a pre-trained biomedical language model for Russian language biomedical text mining

04/08/2022
by Alexander Yalunin, et al.

This paper presents two BERT-based models for Russian-language biomedical text mining: RuBioBERT and RuBioRoBERTa. The models are pre-trained on a corpus of freely available Russian biomedical texts. With this pre-training, our models achieve state-of-the-art results on RuMedBench, a Russian medical language understanding benchmark that covers a diverse set of tasks, including text classification, question answering, natural language inference, and named entity recognition.
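For readers who want to experiment with such a checkpoint, the sketch below shows one plausible way to load a RoBERTa-style model with the Hugging Face transformers library and extract contextual embeddings. The repository identifier alexyalunin/RuBioRoBERTa and the example sentence are assumptions for illustration; substitute the actual published model name.

```python
from transformers import AutoModel, AutoTokenizer

# Assumed Hugging Face Hub identifier for the released checkpoint.
MODEL_NAME = "alexyalunin/RuBioRoBERTa"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)

# Russian biomedical sentence: "The patient complains of acute pain
# in the right hypochondrium."
text = "Пациент жалуется на острую боль в правом подреберье."

inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# Contextual token embeddings of shape (batch, seq_len, hidden_size);
# downstream benchmark tasks fine-tune task-specific heads on top of these.
print(outputs.last_hidden_state.shape)
```

For RuMedBench-style evaluation, a task head such as AutoModelForSequenceClassification (classification, NLI) or AutoModelForTokenClassification (NER) would be fine-tuned on the same backbone.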


Related research

01/25/2019
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
Biomedical text mining has become more important than ever as the number...

10/12/2020
BioMegatron: Larger Biomedical Domain Language Model
There has been an influx of biomedical domain-specific language models, ...

07/20/2023
UMLS-KGI-BERT: Data-Centric Knowledge Integration in Transformers for Biomedical Entity Recognition
Pre-trained transformer language models (LMs) have in recent years becom...

05/14/2020
A pre-training technique to localize medical BERT and enhance BioBERT
Bidirectional Encoder Representations from Transformers (BERT) models fo...

10/08/2020
Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition
Knowledge of a disease includes information of various aspects of the di...

08/28/2023
Biomedical Entity Linking with Triple-aware Pre-Training
Linking biomedical entities is an essential aspect in biomedical natural...

02/03/2023
GLADIS: A General and Large Acronym Disambiguation Benchmark
Acronym Disambiguation (AD) is crucial for natural language understandin...
