It's All in the Name: A Character Based Approach To Infer Religion

10/27/2020
by Rochana Chaturvedi, et al.

Demographic inference from text has received a surge of attention in natural language processing over the last decade. In this paper, we use personal names to infer religion in South Asia, where religion is a salient social division and yet disaggregated data on it remain scarce. Existing work predicts religion using dictionary-based methods and therefore cannot classify unseen names. We use character-based models, which learn character patterns and can therefore classify unseen names with high accuracy. These models are also much faster and scale easily to large data sets. We improve our classifier by combining the name of an individual with that of their parent or spouse, achieving remarkably high accuracy. Finally, we trace the classification decisions of a convolutional neural network model using layer-wise relevance propagation, which can explain the predictions of complex non-linear classifiers and circumvent their purported black-box nature. We show how character patterns learned by the classifier are rooted in the linguistic origins of names.
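To make the contrast with dictionary lookup concrete, here is a minimal sketch of a character-based classifier. It is an illustration under stated assumptions, not the paper's model: it swaps in a plain multinomial naive Bayes over character n-grams where the abstract mentions a convolutional neural network, and the names `char_ngrams`, `combine_names`, and `CharNGramNB` are hypothetical. The training strings are invented tokens with placeholder labels (`group_a`, `group_b`), not real names or real data.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(name, n_min=1, n_max=3):
    """Return character n-grams of a name, with ^ and $ marking its boundaries."""
    text = "^" + name.strip().lower() + "$"
    return [
        text[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(text) - n + 1)
    ]


def combine_names(person, relative):
    """Hypothetical feature scheme: n-grams of the person's name plus
    prefixed n-grams of a parent/spouse name, kept in a separate namespace."""
    return char_ngrams(person) + ["rel:" + g for g in char_ngrams(relative)]


class CharNGramNB:
    """Multinomial naive Bayes over n-gram features (a stand-in technique,
    assumed here for illustration only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # Laplace smoothing constant
        self.class_counts = Counter()               # samples seen per label
        self.feature_counts = defaultdict(Counter)  # feature counts per label
        self.vocab = set()

    def fit(self, samples, labels):
        for feats, label in zip(samples, labels):
            self.class_counts[label] += 1
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # Log prior plus smoothed log likelihood of every feature.
            score = math.log(count / total)
            for f in feats:
                score += math.log((counts[f] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy strings with placeholder labels; not real names or real data.
train = [
    ("talvian", "group_a"), ("morvian", "group_a"), ("selvian", "group_a"),
    ("kordosk", "group_b"), ("bendosk", "group_b"), ("tamdosk", "group_b"),
]
model = CharNGramNB().fit(
    [char_ngrams(name) for name, _ in train],
    [label for _, label in train],
)

# Neither string appears in the training set, so a dictionary lookup would
# have no entry for it; shared character patterns still drive a prediction.
pred_a = model.predict(char_ngrams("nelvian"))
pred_b = model.predict(char_ngrams("vardosk"))
print(pred_a, pred_b)
```

Because the features are sub-word character patterns rather than whole names, the classifier can score a string it has never seen. `combine_names` shows one assumed way to add a relative's name as extra features; the abstract does not say how the two names are actually combined.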

research
02/24/2019

MaskDGA: A Black-box Evasion Technique Against DGA Classifiers and Adversarial Defenses

Domain generation algorithms (DGAs) are commonly used by botnets to gene...
research
02/07/2021

What's in a Name? – Gender Classification of Names with Character Based Machine Learning Models

Gender information is no longer a mandatory input when registering for a...
research
08/12/2019

Self-supervised Data Bootstrapping for Deep Optical Character Recognition of Identity Documents

The essential task of verifying person identities at airports and nation...
research
06/23/2016

Explaining Predictions of Non-Linear Classifiers in NLP

Layer-wise relevance propagation (LRP) is a recently proposed technique ...
research
06/18/2021

Predicting gender of Brazilian names using deep learning

Predicting gender by the name is not a simple task. In many applications...
research
06/01/2023

Examining the Causal Effect of First Names on Language Models: The Case of Social Commonsense Reasoning

As language models continue to be integrated into applications of person...
research
05/09/2022

Behind the Mask: Demographic bias in name detection for PII masking

Many datasets contain personally identifiable information, or PII, which...
