Exploring Language Similarities with Dimensionality Reduction Technique

In recent years, several novel models have been developed to process natural language, and accurate language translation systems have helped us overcome geographical barriers and communicate ideas effectively. These models are developed mostly for a few widely used languages, while other languages are ignored. Most spoken languages share lexical, syntactic, and semantic similarities with several other languages, and knowing this can help us leverage existing models to build more specific and accurate models for other languages. Here I explore the idea of representing several popular languages in a lower-dimensional space such that their similarities can be visualized using simple two-dimensional plots. This can even help us understand newly discovered languages that may not share their vocabulary with any existing language.
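
To make the idea of a two-dimensional language map concrete, here is a minimal sketch. The abstract does not say which features or which reduction method are used, so everything below is an assumption for illustration: the toy word lists, the character-bigram features, the choice of PCA (computed via SVD), and the helper names `bigram_counts` and `project_2d` are all hypothetical and not taken from the paper.

```python
from collections import Counter

import numpy as np

# Toy word lists, illustrative only (not the paper's data).
SAMPLES = {
    "English": "water night mother new sun",
    "German": "wasser nacht mutter neu sonne",
    "Spanish": "agua noche madre nuevo sol",
    "Italian": "acqua notte madre nuovo sole",
}


def bigram_counts(text):
    """Count character bigrams per word, with ^ and $ as word boundaries."""
    counts = Counter()
    for word in text.split():
        padded = f"^{word}$"
        counts.update(padded[i:i + 2] for i in range(len(padded) - 1))
    return counts


def project_2d(samples):
    """Project each language's bigram profile onto its top two principal axes."""
    names = list(samples)
    counters = [bigram_counts(samples[name]) for name in names]
    vocab = sorted(set().union(*counters))
    # Rows are languages, columns are bigram frequencies.
    X = np.array([[c[g] for g in vocab] for c in counters], dtype=float)
    X /= X.sum(axis=1, keepdims=True)  # counts -> relative frequencies
    X -= X.mean(axis=0)                # center each feature (PCA assumption)
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return names, U[:, :2] * S[:2]     # coordinates along the first two components


names, coords = project_2d(SAMPLES)
for name, (x, y) in zip(names, coords):
    print(f"{name:8s} {x:+.3f} {y:+.3f}")
```

The resulting coordinates could be passed to any scatter-plot routine. With word lists this small the layout is not meaningful evidence of relatedness; the sketch only shows the shape of the pipeline (features, centering, projection).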


