Probabilistic Method of Measuring Linguistic Productivity

08/24/2023
by Sergei Monakhov, et al.

In this paper I propose a new way of measuring linguistic productivity that objectively assesses the ability of an affix to be used to coin new complex words and, unlike other popular measures, is not directly dependent upon token frequency. Specifically, I suggest that linguistic productivity may be viewed as the probability that an affix will combine with a random base. This approach has three advantages. First, token frequency does not dominate the productivity measure but naturally influences the sampling of bases. Second, rather than merely counting attested word types with an affix, the method simulates the construction of these types and then checks whether they are attested in the corpus. Third, the corpus-based approach and randomised design ensure that true neologisms and words coined long ago have an equal chance of being selected. The proposed algorithm is evaluated on both English and Russian data. The results shed light on the relation between linguistic productivity and the number of types and tokens. Growing linguistic productivity appears to manifest itself in an increasing number of types. However, this process unfolds in two stages: first the number of high-frequency items increases, and only then does the number of low-frequency items increase.
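
The idea of productivity as the probability that an affix combines with a random base can be illustrated with a minimal Monte Carlo sketch. This is not the algorithm from the paper, whose details the abstract does not give. The function name `estimate_productivity`, the naive string concatenation of base and suffix, the frequency-weighted sampling of bases, and the toy data are all assumptions made for illustration only.

```python
import random
from collections import Counter


def estimate_productivity(affix, base_counts, attested_words,
                          n_samples=10_000, seed=0):
    """Estimate the share of randomly sampled bases whose suffixed form is attested.

    Hypothetical helper, not taken from the paper. Bases are drawn with
    probability proportional to their token frequency (an assumption), so
    frequency shapes the sampling rather than the measure itself.
    """
    rng = random.Random(seed)
    bases = list(base_counts)
    weights = [base_counts[b] for b in bases]
    # Draw bases with replacement, weighted by token frequency.
    sampled = rng.choices(bases, weights=weights, k=n_samples)
    # Simulate word construction by naive concatenation (an assumption),
    # then check whether the constructed form is attested in the corpus.
    hits = sum((base + affix) in attested_words for base in sampled)
    return hits / n_samples


# Toy data, invented for illustration only.
base_counts = Counter({"kind": 50, "dark": 30, "happy": 20, "table": 40})
attested_words = {"kindness", "darkness", "happiness"}

score = estimate_productivity("ness", base_counts, attested_words)
print(f"estimated productivity of -ness: {score:.3f}")
```

In this toy setup "happy" never counts as a hit, because naive concatenation yields "happyness" rather than "happiness". A realistic implementation would need morphophonological rules or a morphological analyser at the construction step.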


