
DialectGram: Detecting Dialectal Variation at Multiple Geographic Resolutions

10/04/2019
by Hang Jiang, et al.

Several computational models have been developed in recent years to detect and analyze dialect variation. Most of these models assume a predefined set of geographical regions over which they detect and analyze dialectal variation. However, dialect variation occurs at multiple levels of geographic resolution, ranging from cities within a state, to states within a country, to countries across continents. In this work, we propose a model that detects dialectal variation at multiple levels of geographic resolution, obviating the need to define the resolution level a priori. Our method, DialectGram, learns dialect-sensitive word embeddings while remaining agnostic of the geographic resolution. Specifically, it requires only one-time training and enables analysis of dialectal variation at a chosen resolution post hoc, a significant departure from prior models, which must be retrained whenever the predefined set of regions changes. Furthermore, DialectGram explicitly models word senses, which makes it possible to estimate the proportion of each sense's usage in any given region. Finally, we quantitatively evaluate our model against baselines on a new English evaluation dataset, DialectSim, and show that DialectGram can effectively model linguistic variation.
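The post-hoc analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a trained multi-sense model has already assigned a sense id to each usage of a word, and that each usage carries a location. The function `sense_proportions`, the `region_of` callback, and the toy data are all hypothetical names introduced for illustration. The point is only that the same sense-tagged usages can be regrouped at any resolution without retraining.

```python
from collections import Counter, defaultdict


def sense_proportions(usages, region_of):
    """Estimate per-region sense proportions for one word.

    usages: iterable of (location, sense_id) pairs (assumed model output).
    region_of: function mapping a location to a region label; swapping this
               function changes the geographic resolution.
    """
    counts = defaultdict(Counter)
    for location, sense in usages:
        counts[region_of(location)][sense] += 1
    return {
        region: {sense: n / sum(c.values()) for sense, n in c.items()}
        for region, c in counts.items()
    }


# Hypothetical toy data: (country, state) locations with a predicted sense id.
usages = [
    (("US", "CA"), 0),
    (("US", "CA"), 0),
    (("US", "NY"), 1),
    (("UK", "ENG"), 1),
]

# The same usages aggregated at two resolutions, with no retraining step.
by_country = sense_proportions(usages, lambda loc: loc[0])
by_state = sense_proportions(usages, lambda loc: loc)
```

Here `by_country` groups the usages by the first element of each location and `by_state` by the full pair, so choosing a resolution amounts to choosing the grouping function.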

10/04/2019

DialectGram: Automatic Detection of Dialectal Variation at Multiple Geographic Resolutions

We propose DialectGram, a method to detect dialectical variation across ...
10/22/2015

Freshman or Fresher? Quantifying the Geographic Variation of Internet Language

We present a new computational technique to detect and analyze statistic...
06/12/2020

Evaluating a Multi-sense Definition Generation Model for Multiple Languages

Most prior work on definition modeling has not accounted for polysemy, o...
09/19/2020

Word class flexibility: A deep contextualized approach

Word class flexibility refers to the phenomenon whereby a single word fo...
03/28/2021

On the limits of algorithmic prediction across the globe

The impact of predictive algorithms on people's lives and livelihoods ha...
10/25/2020

Contextualized Word Embeddings Encode Aspects of Human-Like Word Sense Knowledge

Understanding context-dependent variation in word meanings is a key aspe...
06/27/2022

Adaptive Cluster Thresholding with Spatial Activation Guarantees Using All-resolutions Inference

Classical cluster inference is hampered by the spatial specificity parad...

Code Repositories

DialectGram

[SCiL 2020] DialectGram: Automatic Detection of Dialectal Changes with Multi-geographic Resolution Analysis

