Language classification from bilingual word embedding graphs

07/18/2016
by Steffen Eger, et al.
We study the role of the second language in bilingual word embeddings in monolingual semantic evaluation tasks. We find strongly and weakly positive correlations between downstream task performance and the second language's similarity to the target language. Additionally, we show how bilingual word embeddings can be employed for the task of semantic language classification and that joint semantic spaces vary in meaningful ways across second languages. Our results support the hypothesis that semantic language similarity is influenced by both structural similarity and geography/contact.

research
02/22/2021

Co-occurrences using Fasttext embeddings for word similarity tasks in Urdu

Urdu is a widely spoken language in South Asia. Though immoderate litera...
research
09/18/2015

Word, graph and manifold embedding from Markov processes

Continuous vector representations of words and objects appear to carry s...
research
06/02/2021

Evaluating Word Embeddings with Categorical Modularity

We introduce categorical modularity, a novel low-resource intrinsic metr...
research
10/13/2020

BRUMS at SemEval-2020 Task 3: Contextualised Embeddings for Predicting the (Graded) Effect of Context in Word Similarity

This paper presents the team BRUMS submission to SemEval-2020 Task 3: Gr...
research
03/31/2019

SART - Similarity, Analogies, and Relatedness for Tatar Language: New Benchmark Datasets for Word Embeddings Evaluation

There is a huge imbalance between languages currently spoken and corresp...
research
08/24/2018

Features of word similarity

In this theoretical note we compare different types of computational mod...
research
08/18/2019

Parsimonious Morpheme Segmentation with an Application to Enriching Word Embeddings

Traditionally, many text-mining tasks treat individual word-tokens as th...
