Instate: Predicting the State of Residence From Last Name

by Atul Dhingra, et al.

India has twenty-two official languages. Serving such a linguistically diverse population is a challenge for survey statisticians, call center operators, software developers, and other service providers. To support better localization for different language communities, we introduce a machine learning model that infers the language(s) a user is likely to speak from their name. Using nearly 438M records spanning 33 Indian states and 1.13M unique last names from the Indian Electoral Rolls Corpus, we build a character-level, transformer-based model that predicts the state of residence from the last name. When used to infer the languages understood by the respondent, the model achieves a top-3 accuracy of 85.3%. We provide open-source software that implements the method discussed in the paper.
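To make the link between state predictions, language inference, and top-3 accuracy concrete, here is a minimal sketch. It is not the released software: the function names, the probability values, and the small state-to-language table are all hypothetical, and it assumes the model's output is available as a dictionary mapping state names to probabilities.

```python
from typing import Dict, List

# Illustrative subset only (an assumption for this sketch):
# one official language per state, not the paper's actual mapping.
STATE_LANGUAGES: Dict[str, List[str]] = {
    "Kerala": ["Malayalam"],
    "Tamil Nadu": ["Tamil"],
    "Punjab": ["Punjabi"],
    "West Bengal": ["Bengali"],
}


def top_k_states(probs: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k states with the highest predicted probability."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    return [state for state, _ in ranked[:k]]


def languages_for(states: List[str]) -> List[str]:
    """Map states to languages, keeping order and dropping duplicates."""
    seen: List[str] = []
    for state in states:
        for lang in STATE_LANGUAGES.get(state, []):
            if lang not in seen:
                seen.append(lang)
    return seen


def top_k_accuracy(predictions: List[Dict[str, float]],
                   truths: List[str], k: int = 3) -> float:
    """Fraction of examples whose true state is among the top-k predictions."""
    hits = sum(t in top_k_states(p, k) for p, t in zip(predictions, truths))
    return hits / len(truths)


# Hypothetical model output for a single last name.
example_probs = {"Kerala": 0.6, "Tamil Nadu": 0.25,
                 "Punjab": 0.1, "West Bengal": 0.05}
example_top3 = top_k_states(example_probs)
example_langs = languages_for(example_top3)
```

Under these assumptions, a prediction counts as correct for top-3 accuracy whenever the true state appears anywhere among the three highest-probability states, which is why the metric suits a setting where a service can offer several candidate languages.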




