Going beyond accuracy: estimating homophily in social networks using predictions

01/30/2020
by George Berry, et al.

In online social networks, it is common to use predictions of node categories to estimate measures of homophily and other relational properties. However, online social network data often lacks basic demographic information about the nodes. Researchers must rely on predicted node attributes to estimate measures of homophily, but little is known about the validity of these measures. We show that estimating homophily in a network can be viewed as a dyadic prediction problem, and that homophily estimates are unbiased when dyad-level residuals sum to zero in the network. Node-level prediction models, such as the use of names to classify ethnicity or gender, do not generally have this property and can introduce large biases into homophily estimates. Bias occurs due to error autocorrelation along dyads. Importantly, node-level classification performance is not a reliable indicator of estimation accuracy for homophily. We compare estimation strategies that make predictions at the node and dyad levels, evaluating performance in different settings. We propose a novel "ego-alter" modeling approach that outperforms standard node and dyad classification strategies. While this paper focuses on homophily, results generalize to other relational measures which aggregate predictions along the dyads in a network. We conclude with suggestions for research designs to study homophily in online networks. Code for this paper is available at https://github.com/georgeberry/autocorr.
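The claim that homophily estimates are unbiased when dyad-level residuals sum to zero can be illustrated with a small sketch. This is not the authors' code. It assumes a toy six-node network, made-up node-level probabilities, and a simple homophily measure (the fraction of edges joining same-category nodes), and it treats the two endpoint predictions as independent. All names (`true_label`, `pred_prob`, `predicted_same`) are hypothetical.

```python
# Toy network with binary node categories (all values are illustrative).
true_label = {"a": 1, "b": 1, "c": 1, "d": 0, "e": 0, "f": 0}
# Hypothetical node-level model output: P(category == 1) for each node.
pred_prob = {"a": 0.8, "b": 0.8, "c": 0.8, "d": 0.2, "e": 0.2, "f": 0.2}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("c", "d")]


def true_same(i, j):
    """Dyad-level ground truth: 1.0 if both endpoints share a category."""
    return 1.0 if true_label[i] == true_label[j] else 0.0


def predicted_same(i, j):
    """Dyad-level prediction built from node-level probabilities.

    Assumes the two endpoint predictions are independent, which is the
    assumption that error autocorrelation along dyads violates.
    """
    p, q = pred_prob[i], pred_prob[j]
    return p * q + (1 - p) * (1 - q)


n_edges = len(edges)
true_homophily = sum(true_same(i, j) for i, j in edges) / n_edges
estimated_homophily = sum(predicted_same(i, j) for i, j in edges) / n_edges

# The estimation error equals the mean dyad-level residual, so the estimate
# matches the truth exactly when the residuals sum to zero.
residual_sum = sum(true_same(i, j) - predicted_same(i, j) for i, j in edges)

print(f"true homophily:      {true_homophily:.4f}")
print(f"estimated homophily: {estimated_homophily:.4f}")
print(f"mean dyad residual:  {residual_sum / n_edges:.4f}")
```

In this toy setting the residuals do not sum to zero, so the estimate built from node-level predictions differs from the true value by exactly the mean dyad-level residual.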


