ParCourE: A Parallel Corpus Explorer for a Massively Multilingual Corpus

07/14/2021
by Ayyoob Imani, et al.

With more than 7000 languages worldwide, multilingual natural language processing (NLP) is essential from both an academic and a commercial perspective. Research on the typological properties of languages is fundamental for progress in multilingual NLP. Examples include assessing language similarity for effective transfer learning, injecting inductive biases into machine learning models, and creating resources such as dictionaries and inflection tables. We provide ParCourE, an online tool for browsing a word-aligned parallel corpus that covers 1334 languages. We give evidence that this tool is useful for typological research. ParCourE can be set up for any parallel corpus and can thus be used for typological research on other corpora as well as for exploring their quality and properties.


