A Generalized Estimating Equation Approach to Network Regression

01/12/2023
by   Riddhi Pratim Ghosh, et al.

Regression models applied to network data in which node attributes are the dependent variables pose a methodological challenge. As has been well studied, naive regression neither properly accounts for community structure nor accounts for the dependent variable acting as both model outcome and covariate. To address this methodological gap, we propose a network regression model motivated by the observation that, when a network is modular, controlling for community structure can account for much of the meaningful correlation between observations induced by network connections. We propose a generalized estimating equation (GEE) approach to learn model parameters based on clusters defined through any single-membership community detection algorithm applied to the observed network. We provide a necessary condition on the network size and edge formation probabilities to establish the asymptotic normality of the model parameters under the assumption that the graph structure is a stochastic block model. We evaluate the performance of our approach through simulations and apply it to estimate the joint impact of baseline covariates and network effects on the COVID-19 incidence rate among countries connected by a network of commercial airline traffic. We find that the network effect has some influence at the beginning of the pandemic, whereas after the travel ban took effect the percentage of urban population has more influence on the incidence rate than the network effect.
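
For orientation, the generic GEE score equation, with detected communities playing the role of clusters, can be sketched as follows. This is the standard textbook form of a GEE and not necessarily the exact estimating equation used in the paper. All symbols are assumptions introduced here for illustration: K is the number of communities, y_k and \mu_k(\beta) are the outcome vector and its modelled mean in community k, A_k is the diagonal matrix of variance functions, R_k(\alpha) is a working correlation matrix, and \phi is a dispersion parameter.

```latex
% Generic GEE score equation (standard form; notation assumed, not taken from the paper).
% The K clusters are the communities returned by a single-membership
% community detection algorithm applied to the observed network.
\[
  \sum_{k=1}^{K} D_k^{\top} V_k^{-1} \bigl( y_k - \mu_k(\beta) \bigr) = 0,
  \qquad
  D_k = \frac{\partial \mu_k(\beta)}{\partial \beta^{\top}},
  \qquad
  V_k = \phi \, A_k^{1/2} R_k(\alpha) \, A_k^{1/2}.
\]
```

Solving this equation for \beta treats observations in different communities as uncorrelated, while R_k(\alpha) absorbs within-community dependence. This is why a modular network, where most connections fall inside communities, suits the approach.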

research
03/04/2022

Bayesian community detection for networks with covariates

The increasing prevalence of network data in a vast variety of fields an...
research
12/11/2021

Latent Community Adaptive Network Regression

The study of network data in the social and health sciences frequently c...
research
10/07/2021

High Dimensional Logistic Regression Under Network Dependence

Logistic regression is one of the most fundamental methods for modeling ...
research
07/04/2020

On identifying unobserved heterogeneity in stochastic blockmodel graphs with vertex covariates

Both observed and unobserved vertex heterogeneity can influence block st...
research
11/28/2018

Distribution Regression with Sample Selection, with an Application to Wage Decompositions in the UK

We develop a distribution regression model under endogenous sample selec...
research
09/15/2022

Selecting a significance level in sequential testing procedures for community detection

While there have been numerous sequential algorithms developed to estima...
research
02/08/2022

Predicting Voting Outcomes in the Presence of Communities, Echo Chambers and Multiple Parties

A recently proposed graph-theoretic metric, the influence gap, has shown...
