Multilingual Hierarchical Attention Networks for Document Classification

07/04/2017
by Nikolaos Pappas, et al.

Hierarchical attention networks have recently achieved remarkable performance for document classification in a given language. However, when multilingual document collections are considered, training such models separately for each language entails linear parameter growth and a lack of cross-language transfer. Learning a single multilingual model with fewer parameters is therefore a challenging but potentially beneficial objective. To this end, we propose multilingual hierarchical attention networks for learning document structures, with shared encoders and/or shared attention mechanisms across languages, using multi-task learning and an aligned semantic space as input. We evaluate the proposed models on multilingual document classification with disjoint label sets, on a large dataset which we provide, with 600k news documents in 8 languages and 5k labels. The multilingual models outperform monolingual ones in low-resource as well as full-resource settings, and use fewer parameters, thus confirming their computational efficiency and the utility of cross-language transfer.
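To make the parameter-sharing idea concrete, below is a minimal PyTorch sketch of such a model: word- and sentence-level encoders with attention pooling, where the encoders and/or the attention layers can be tied across languages while the classifiers remain language-specific (the label sets are disjoint). All names here (MultilingualHAN, Attention, share_encoders, share_attention) are illustrative assumptions, not the authors' code, and documents are assumed to arrive already mapped into an aligned multilingual embedding space.

import torch
import torch.nn as nn

class Attention(nn.Module):
    """Additive attention that pools a sequence of hidden states into one vector."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.context = nn.Linear(dim, 1, bias=False)

    def forward(self, h):                                # h: (batch, seq, dim)
        scores = self.context(torch.tanh(self.proj(h)))  # (batch, seq, 1)
        weights = torch.softmax(scores, dim=1)
        return (weights * h).sum(dim=1)                  # (batch, dim)

class MultilingualHAN(nn.Module):
    """Word -> sentence -> document hierarchy; encoders and/or attention
    layers can be tied across languages, classifiers stay per-language."""
    def __init__(self, emb_dim, hid_dim, labels_per_lang,
                 share_encoders=True, share_attention=True):
        super().__init__()
        langs = list(labels_per_lang)

        def per_lang(build, shared):
            if shared:  # one instance registered under every language key
                module = build()
                return nn.ModuleDict({l: module for l in langs})
            return nn.ModuleDict({l: build() for l in langs})

        gru = lambda d_in: nn.GRU(d_in, hid_dim, batch_first=True,
                                  bidirectional=True)
        self.word_enc = per_lang(lambda: gru(emb_dim), share_encoders)
        self.sent_enc = per_lang(lambda: gru(2 * hid_dim), share_encoders)
        self.word_att = per_lang(lambda: Attention(2 * hid_dim), share_attention)
        self.sent_att = per_lang(lambda: Attention(2 * hid_dim), share_attention)
        # Label sets are disjoint, so one classifier per language, never shared.
        self.classifier = nn.ModuleDict(
            {l: nn.Linear(2 * hid_dim, n) for l, n in labels_per_lang.items()})

    def forward(self, x, lang):        # x: (batch, sents, words, emb_dim)
        b, s, w, e = x.shape
        h, _ = self.word_enc[lang](x.reshape(b * s, w, e))
        sents = self.word_att[lang](h).reshape(b, s, -1)
        h, _ = self.sent_enc[lang](sents)
        return self.classifier[lang](self.sent_att[lang](h))

# Usage: each mini-batch comes from a single language; multi-task training
# simply alternates languages between mini-batches.
model = MultilingualHAN(emb_dim=300, hid_dim=100,
                        labels_per_lang={"en": 4000, "de": 1000})
x = torch.randn(2, 5, 30, 300)         # 2 docs, 5 sentences, 30 words, 300-dim
logits = model(x, lang="en")           # shape: (2, 4000)

Sharing the encoders and attention keeps the parameter count roughly constant as languages are added, since only the per-language classifiers grow; this is the source of the efficiency and transfer effects the abstract reports.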
