Stability of Syntactic Dialect Classification Over Space and Time

09/11/2022
by Jonathan Dunn, et al.

This paper analyses the degree to which dialect classifiers based on syntactic representations remain stable over space and time. Previous work has shown that combining grammar induction with geospatial text classification produces robust dialect models, but it remains unknown how changing grammars and changing populations influence those models. This paper constructs a test set for 12 dialects of English that spans three years at monthly intervals, with a fixed spatial distribution across 1,120 cities. Syntactic representations are formulated within the usage-based Construction Grammar (CxG) paradigm. The decay rate of classification performance for each dialect over time allows us to identify regions undergoing syntactic change, while the distribution of classification accuracy within dialect regions allows us to identify the degree to which the grammar of a dialect is internally heterogeneous. The main contribution of this paper is to show that a rigorous evaluation of dialect classification models can be used to find both variation over space and change over time.
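As a rough illustration of the decay-rate idea, the sketch below fits a least-squares line to a dialect's monthly classification scores and ranks dialects by slope. The abstract does not say how decay is measured, so the linear fit, the function names `decay_rate` and `rank_by_decay`, and the input layout (a dictionary mapping a dialect label to a list of monthly scores) are all assumptions made for this example, not the authors' method.

```python
def decay_rate(scores):
    """Least-squares slope of score against month index (0, 1, 2, ...).

    Assumption: decay is summarised as a linear trend. A negative slope
    means classification performance falls as the test month moves away
    from the training period.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two monthly scores")
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(scores))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var


def rank_by_decay(monthly_scores):
    """Return (dialect, slope) pairs, steepest decline first.

    `monthly_scores` is a hypothetical mapping from a dialect label to
    its list of monthly scores.
    """
    rates = {dialect: decay_rate(s) for dialect, s in monthly_scores.items()}
    return sorted(rates.items(), key=lambda item: item[1])
```

Under this reading, dialects at the top of the ranking would be the candidates for ongoing syntactic change; a non-linear decay model could be swapped in without changing the ranking step.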
