DeepAI AI Chat
Log In Sign Up

Cross-lingual Text Classification with Heterogeneous Graph Neural Network

by   ZiYun Wang, et al.

Cross-lingual text classification aims at training a classifier on the source language and transferring the knowledge to target languages, which is very useful for low-resource languages. Recent multilingual pretrained language models (mPLM) achieve impressive results in cross-lingual classification tasks, but rarely consider factors beyond semantic similarity, causing performance degradation between some language pairs. In this paper we propose a simple yet effective method to incorporate heterogeneous information within and across languages for cross-lingual text classification using graph convolutional networks (GCN). In particular, we construct a heterogeneous graph by treating documents and words as nodes, and linking nodes with different relations, which include part-of-speech roles, semantic similarity, and document translations. Extensive experiments show that our graph-based method significantly outperforms state-of-the-art models on all tasks, and also achieves consistent performance gain over baselines in low-resource settings where external tools like translators are unavailable.


page 1

page 2

page 3

page 4


Exploiting Cross-Lingual Subword Similarities in Low-Resource Document Classification

Text classification must sometimes be applied in situations with no trai...

Cross-Lingual Text Classification with Minimal Resources by Transferring a Sparse Teacher

Cross-lingual text classification alleviates the need for manually label...

Cross-lingual Transfer for Text Classification with Dictionary-based Heterogeneous Graph

In cross-lingual text classification, it is required that task-specific ...

Massively Multilingual Language Models for Cross Lingual Fact Extraction from Low Resource Indian Languages

Massive knowledge graphs like Wikidata attempt to capture world knowledg...

Multi-Task Text Classification using Graph Convolutional Networks for Large-Scale Low Resource Language

Graph Convolutional Networks (GCN) have achieved state-of-art results on...

Cross-Lingual Task-Specific Representation Learning for Text Classification in Resource Poor Languages

Neural network models have shown promising results for text classificati...

Detecting Text Formality: A Study of Text Classification Approaches

Formality is an important characteristic of text documents. The automati...

Code Repositories