BERT Cannot Align Characters

09/20/2021
by Antonis Maronikolakis, et al.

Previous work has shown that BERT can adequately align cross-lingual sentences at the word level. Here we investigate whether BERT can also operate as a character-level aligner. The languages examined are English, Fake-English, German and Greek. We show that the closer two languages are, the better BERT can align them at the character level. BERT works well for English to Fake-English alignment, but this does not generalize to natural languages to the same extent. Nevertheless, the proximity of two languages does seem to be a factor: English is more closely related to German than to Greek, and this is reflected in how well BERT aligns them, with English to German alignment better than English to Greek. We examine multiple setups and show that the similarity matrices for natural languages show weaker relations the further apart two languages are.
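The abstract refers to similarity matrices without describing how an alignment is read off them. As an illustration only, the sketch below builds a cosine-similarity matrix between two sets of vectors and keeps mutual best matches. The function names, the mutual-argmax rule, and the toy vectors standing in for character embeddings are all assumptions made for this example; they are not taken from the paper, and no BERT model is involved.

```python
import numpy as np


def cosine_similarity_matrix(src, tgt):
    """Return an (n_src, n_tgt) matrix of cosine similarities between rows."""
    src_unit = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt_unit = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    return src_unit @ tgt_unit.T


def mutual_argmax_alignment(sim):
    """Keep (i, j) pairs where i and j are each other's best match.

    This is one simple extraction rule, assumed here for illustration.
    """
    best_tgt = sim.argmax(axis=1)  # best target index for each source row
    best_src = sim.argmax(axis=0)  # best source index for each target column
    return [(i, int(j)) for i, j in enumerate(best_tgt) if best_src[j] == i]


# Toy stand-ins for character embeddings (hypothetical values, not model output).
src_vecs = np.array([
    [1.0, 0.1, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.0, 1.0],
])
# Target vectors are a reordered, slightly shifted copy of the source vectors.
tgt_vecs = src_vecs[[2, 0, 1]] + 0.05

sim = cosine_similarity_matrix(src_vecs, tgt_vecs)
alignment = mutual_argmax_alignment(sim)
print(np.round(sim, 2))
print(alignment)
```

In this toy case each source row has one clearly dominant entry in the matrix. The "weaker relations" described for distant language pairs would correspond to rows with no such dominant entry, in which case the mutual-argmax rule returns fewer or less reliable pairs.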

