DeepAI
Log In Sign Up

Large-Scale Multi-Label Text Classification on EU Legislation

06/05/2019
by   Ilias Chalkidis, et al.
0

We consider Large-Scale Multi-Label Text Classification (LMTC) in the legal domain. We release a new dataset of 57k legislative documents from EURLEX, annotated with 4.3k EUROVOC labels, which is suitable for LMTC, few- and zero-shot learning. Experimenting with several neural classifiers, we show that BIGRUs with label-wise attention perform better than other current state of the art methods. Domain-specific WORD2VEC and context-sensitive ELMO embeddings further improve performance. We also find that considering only particular zones of the documents is sufficient. This allows us to bypass BERT's maximum text length limit and fine-tune BERT, obtaining the best results in all but zero-shot learning cases.

READ FULL TEXT

page 1

page 2

page 3

page 4

05/26/2019

Extreme Multi-Label Legal Text Classification: A case study in EU Legislation

We consider the task of Extreme Multi-Label Text Classification (XMTC) i...
10/04/2020

An Empirical Study on Large-Scale Multi-Label Text Classification Including Few and Zero-Shot Labels

Large-scale Multi-label Text Classification (LMTC) has a wide range of N...
02/11/2022

Metadata-Induced Contrastive Learning for Zero-Shot Multi-Label Text Classification

Large-scale multi-label text classification (LMTC) aims to associate a d...
08/30/2022

Flexible Job Classification with Zero-Shot Learning

Using a taxonomy to organize information requires classifying objects (d...
09/28/2019

Generalized Zero-shot ICD Coding

The International Classification of Diseases (ICD) is a list of classifi...
04/04/2021

Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning

Exploiting label hierarchies has become a promising approach to tackling...
04/01/2020

Deep Learning Based Multi-Label Text Classification of UNGA Resolutions

The main goal of this research is to produce a useful software for Unite...