Priberam at MESINESP Multi-label Classification of Medical Texts Task

05/12/2021
by Ruben Cardoso, et al.

Medical articles give practitioners and other professionals access to state-of-the-art treatments and diagnostics. Public databases such as MEDLINE contain over 27 million articles, which makes it difficult to extract relevant content without efficient search engines. Information retrieval tools are therefore crucial for navigating this literature and for providing meaningful recommendations of articles and treatments. Classifying articles into broader medical topics can improve the retrieval of related articles. The label set considered for the MESINESP task comprises several thousand labels (DeCS codes), which places it in the extreme multi-label classification setting. The heterogeneous and highly hierarchical structure of medical topics makes manual classification extremely laborious and costly, so automating the process is crucial. Typical machine learning algorithms become computationally demanding with such a large number of labels, and improving recall on such datasets remains an open problem. This work presents Priberam's participation in the BioASQ MESINESP task. We address the large multi-label classification problem with four models: a Support Vector Machine (SVM), a customised search engine (Priberam Search), a BERT-based classifier, and an SVM-rank ensemble of the three previous models. Results show that all three individual models perform well and that the best performance is achieved by their ensemble, which placed 6th in the challenge and made Priberam the 2nd best team.
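To make the ensemble idea concrete, the following is a minimal sketch of how per-label scores from three individual models could be combined by a linear ranking function, which is the kind of scoring function an SVM-rank model learns. It is an illustration under stated assumptions and not the authors' implementation: the function names, the label identifiers, the weights, and the threshold are all hypothetical, and in a real system the weights would be learned from training data rather than set by hand.

```python
from typing import Dict, List

# Hypothetical type: scores one model assigns to candidate labels for one article.
Scores = Dict[str, float]


def build_features(model_scores: List[Scores]) -> Dict[str, List[float]]:
    """Build one feature vector per candidate label, with one entry per model.

    A model that did not propose a label contributes 0.0 for it.
    """
    labels = set()
    for scores in model_scores:
        labels.update(scores)
    return {
        label: [scores.get(label, 0.0) for scores in model_scores]
        for label in sorted(labels)
    }


def rank_labels(
    model_scores: List[Scores], weights: List[float], threshold: float
) -> List[str]:
    """Score each label with a weighted sum and keep those at or above threshold.

    The weights stand in for a learned linear ranking model (assumption:
    here they are fixed by hand purely for illustration).
    """
    features = build_features(model_scores)
    scored = [
        (sum(w * x for w, x in zip(weights, vector)), label)
        for label, vector in features.items()
    ]
    # Highest combined score first; ties broken alphabetically by label.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [label for score, label in scored if score >= threshold]


# Made-up scores for a single article from three hypothetical models.
svm_scores = {"label_a": 0.9, "label_b": 0.2}
search_scores = {"label_a": 0.6, "label_c": 0.7}
bert_scores = {"label_b": 0.8, "label_c": 0.1}

predicted = rank_labels(
    [svm_scores, search_scores, bert_scores],
    weights=[0.4, 0.3, 0.3],
    threshold=0.3,
)
print(predicted)
```

The design point is that each candidate label is represented by the scores the individual models gave it, so the ensemble can promote labels that several models agree on and discard labels that only one model scored weakly.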
