Sparse network asymptotics for logistic regression

10/09/2020
by Bryan S. Graham, et al.

Consider a bipartite network where N consumers choose to buy or not to buy M different products. This paper considers the properties of the logistic regression of the N × M array of i-buys-j purchase decisions, [Y_ij] for 1 ≤ i ≤ N and 1 ≤ j ≤ M, onto known functions of consumer and product attributes under asymptotic sequences where (i) both N and M grow large and (ii) the average number of products purchased per consumer is finite in the limit. This latter assumption implies that the network of purchases is sparse: only a (very) small fraction of all possible purchases are actually made, as in many real-world settings. Under sparse network asymptotics, the first and last terms in an extended Hoeffding-type variance decomposition of the score of the logit composite log-likelihood are of equal order. In contrast, under dense network asymptotics, the last term is asymptotically negligible. Asymptotic normality of the logistic regression coefficients is shown using a martingale central limit theorem (CLT) for triangular arrays. Unlike in the dense case, the normality result derived here also holds under degeneracy of the network graphon. Relatedly, when there happens to be no dyadic dependence in the dataset at hand, the result specializes to recently derived results on the behavior of logistic regression with rare events and iid data. Sparse network asymptotics may lead to better inference in practice since they suggest variance estimators that (i) incorporate additional sources of sampling variation and (ii) are valid under varying degrees of dyadic dependence.
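To make the setting concrete, the following is a minimal sketch of a sparse bipartite purchase array and a pooled logit fit to it. It is an illustration under stated assumptions, not the paper's design or estimator: the function names, the single dyad-level regressor (a product of one consumer attribute and one product attribute), the parameter values, and the use of plain Newton iterations are all hypothetical choices. Sparsity is imposed by letting the intercept fall like log(1/M), so that the expected number of purchases per consumer stays bounded as M grows.

```python
import numpy as np


def simulate_purchases(n_consumers, n_products, mean_purchases=3.0, slope=0.5, seed=0):
    """Simulate a sparse N x M purchase array (all settings are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_consumers)   # hypothetical consumer attribute
    z = rng.normal(size=n_products)    # hypothetical product attribute
    w = np.outer(x, z)                 # dyad-level regressor built from both attributes
    # Intercept shrinks with M so purchases per consumer stay bounded (sparsity).
    intercept = np.log(mean_purchases / n_products)
    prob = 1.0 / (1.0 + np.exp(-(intercept + slope * w)))
    y = (rng.random((n_consumers, n_products)) < prob).astype(float)
    return y, w


def fit_logit(y, w, n_iter=25, tol=1e-10):
    """Pooled logit of all N*M dyads on a constant and w, via Newton's method."""
    design = np.column_stack([np.ones(y.size), w.ravel()])
    target = y.ravel()
    # Start at the intercept-only fit, which is stable when purchases are rare.
    base_rate = target.mean()
    theta = np.array([np.log(base_rate / (1.0 - base_rate)), 0.0])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(design @ theta)))
        score = design.T @ (target - p)
        hessian = (design * (p * (1.0 - p))[:, None]).T @ design
        step = np.linalg.solve(hessian, score)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta


if __name__ == "__main__":
    y, w = simulate_purchases(400, 500)
    theta = fit_logit(y, w)
    print("share of possible purchases made:", y.mean())
    print("average purchases per consumer:", y.sum(axis=1).mean())
    print("estimated (intercept, slope):", theta)
```

The sketch only produces point estimates. It does not compute standard errors; the abstract's point is that variance estimation under sparse network asymptotics must account for dyadic dependence and additional sources of sampling variation, which a naive iid formula would ignore.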

research
08/25/2022

Infill asymptotics for logistic regression estimators for spatio-temporal point processes

This paper discusses infill asymptotics for logistic regression estimato...
research
03/02/2020

Score Engineered Logistic Regression

In several FICO studies logistic regression has been shown to be a very ...
research
05/13/2020

An Asymptotic Result of Conditional Logistic Regression Estimator

In cluster-specific studies, ordinary logistic regression and conditiona...
research
04/05/2023

Distributed Logistic Regression for Massive Data with Rare Events

Large-scale rare events data are commonly encountered in practice. To ta...
research
06/05/2017

The Likelihood Ratio Test in High-Dimensional Logistic Regression Is Asymptotically a Rescaled Chi-Square

Logistic regression is used thousands of times a day to fit data, predic...
research
04/27/2022

Asymptotic Inference for Infinitely Imbalanced Logistic Regression

In this paper we extend the work of Owen (2007) by deriving a second ord...
research
12/28/2021

Improving Nonparametric Classification via Local Radial Regression with an Application to Stock Prediction

For supervised classification problems, this paper considers estimating ...
