Language Distribution Prediction based on Batch Markov Monte Carlo Simulation with Migration

02/26/2018
by XingYu Fu, et al.

Language spreading is a complex mechanism that involves factors such as culture, economics, migration, and population. In this paper, we propose a set of methods to model the dynamics of the spreading system. To model the randomness of language spreading, we propose the Batch Markov Monte Carlo Simulation with Migration (BMMCSM) algorithm, in which each agent is treated as a language stack. Each agent learns languages and migrates based on the proposed Batch Markov Property, according to the transition matrix T and the migration matrix M. Since population plays a crucial role in language spreading, we also introduce a Mortality and Fertility Mechanism into the BMMCSM algorithm, which controls the birth and death of the simulated agents. The simulation results of BMMCSM show that the numerical and geographic distribution of languages varies over time, and that the change in distribution fits world cultural and economic development trends. When constructing the matrix T, some entries can be calculated directly from historical statistics, while others are unknown. Thus, the key to the success of BMMCSM lies in accurately estimating the unknown entries of T under the supervision of the known entries. To achieve this, we first construct a 20 by 20 by 5 factor tensor X to characterize each entry of T. We then train a Random Forest Regressor on the known entries of T and use the trained regressor to predict the unknown entries. We choose Random Forest (RF) because, compared to a single decision tree, it overcomes the problem of overfitting; the Shapiro test also suggests that the residuals of RF follow a normal distribution.
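The agent-based simulation described in the abstract can be illustrated with a small sketch. The abstract does not define the Batch Markov Property, so the code below is only a plain first-order Markov approximation under stated assumptions, not the authors' BMMCSM implementation. It assumes that the next language an agent learns is drawn from the row of T indexed by the language on top of the agent's stack, that migration is drawn from the row of M indexed by the agent's current region, and that death and birth occur with fixed per-step probabilities. The names `sample_index`, `step`, `language_counts`, `death_rate`, and `birth_rate`, as well as the toy 3-language, 2-region matrices, are hypothetical and chosen for illustration only.

```python
import random


def sample_index(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point round-off


def step(agents, T, M, death_rate, birth_rate, rng):
    """Advance the population by one time step (illustrative assumptions only).

    Each agent is a dict {"stack": [language indices], "region": region index}.
    """
    next_agents = []
    for agent in agents:
        # Mortality (assumed): the agent dies with probability death_rate.
        if rng.random() < death_rate:
            continue
        # Learning (assumed): sample from the row of T for the top-of-stack
        # language and push the result only if the agent does not know it yet.
        stack = list(agent["stack"])
        learned = sample_index(T[stack[-1]], rng)
        if learned not in stack:
            stack.append(learned)
        # Migration (assumed): sample the next region from the row of M.
        region = sample_index(M[agent["region"]], rng)
        next_agents.append({"stack": stack, "region": region})
        # Fertility (assumed): a child inherits only the parent's first
        # language and is born in the parent's new region.
        if rng.random() < birth_rate:
            next_agents.append({"stack": [stack[0]], "region": region})
    return next_agents


def language_counts(agents, n_languages):
    """Count how many agents know each language (native or learned)."""
    counts = [0] * n_languages
    for agent in agents:
        for lang in agent["stack"]:
            counts[lang] += 1
    return counts


# Toy example: 3 languages, 2 regions, made-up probabilities.
rng = random.Random(0)
T = [[0.9, 0.1, 0.0],
     [0.2, 0.8, 0.0],
     [0.3, 0.1, 0.6]]
M = [[0.95, 0.05],
     [0.10, 0.90]]
agents = [{"stack": [i % 3], "region": i % 2} for i in range(300)]
for _ in range(10):
    agents = step(agents, T, M, death_rate=0.01, birth_rate=0.012, rng=rng)
print(language_counts(agents, 3))
```

In this sketch the numerical distribution of languages is read from `language_counts`, and a geographic distribution could be obtained in the same way by grouping agents by their `region` field. In the paper, the unknown entries of T would first be estimated with the Random Forest Regressor; here T is simply filled in by hand.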


