Statistical Estimation from Dependent Data

07/20/2021
by Yuval Dagan, et al.

We consider a general statistical estimation problem in which binary labels across different observations are not independent conditioned on their feature vectors, but dependent. This captures settings where, e.g., the observations are collected on a spatial domain, a temporal domain, or a social network, each of which induces dependencies. We model these dependencies in the language of Markov Random Fields and, importantly, allow them to be substantial, i.e., we do not assume that the Markov Random Field capturing these dependencies is in high temperature. As our main contribution we provide algorithms and statistically efficient estimation rates for this model, giving several instantiations of our bounds in logistic regression, sparse logistic regression, and neural network settings with dependent data. Our estimation guarantees follow from novel results for estimating the parameters (i.e., external fields and interaction strengths) of Ising models from a single sample. We evaluate our estimation approach on real networked data, showing that it outperforms standard regression approaches that ignore dependencies across three text classification datasets: Cora, Citeseer, and Pubmed.
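To make the setting concrete, here is a minimal sketch of single-sample estimation for an Ising model whose external fields come from a logistic-regression-style linear score. The abstract does not spell out the model formula or the estimator, so everything specific here is an assumption for illustration: labels y_i in {-1, +1}, a known symmetric adjacency matrix `A` with zero diagonal, external field `theta @ x_i`, a single interaction strength `beta`, and maximum pseudo-likelihood as the fitting method (a standard technique for this kind of problem, not necessarily the authors' exact procedure). The helper names `ring_adjacency`, `gibbs_sample`, and `fit_mple` are hypothetical.

```python
import numpy as np

def ring_adjacency(n):
    """Adjacency matrix of a cycle graph: a toy stand-in for a network."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return A

def gibbs_sample(X, A, theta, beta, sweeps, rng):
    """Draw one approximate sample y in {-1,+1}^n from the assumed model
    P(y) proportional to exp(sum_i (theta.x_i) y_i + (beta/2) y^T A y)."""
    n = X.shape[0]
    y = rng.choice([-1.0, 1.0], size=n)
    field = X @ theta
    for _ in range(sweeps):
        for i in range(n):
            h = field[i] + beta * (A[i] @ y)   # local field at node i
            p_plus = 0.5 * (1.0 + np.tanh(h))  # equals sigmoid(2h)
            y[i] = 1.0 if rng.random() < p_plus else -1.0
    return y

def neg_log_pseudolikelihood(params, X, A, y):
    """Sum over i of -log P(y_i | y_{-i}) = log(1 + exp(-2 y_i h_i))."""
    theta, beta = params[:-1], params[-1]
    h = X @ theta + beta * (A @ y)
    return np.sum(np.logaddexp(0.0, -2.0 * y * h))

def fit_mple(X, A, y, steps=2000, lr=0.05):
    """Minimize the (convex) negative log pseudo-likelihood by plain
    gradient descent on the mean loss. Returns (theta_hat, beta_hat)."""
    n, d = X.shape
    Z = np.column_stack([X, A @ y])  # features augmented with neighbor sums
    w = np.zeros(d + 1)
    for _ in range(steps):
        margin = 2.0 * y * (Z @ w)
        s = 0.5 * (1.0 - np.tanh(margin / 2.0))  # sigmoid(-margin)
        grad = -((2.0 * y * s) @ Z) / n
        w -= lr * grad
    return w[:-1], w[-1]

# Toy demo: one dependent sample on a ring, then estimate from that sample.
rng = np.random.default_rng(0)
n, d = 400, 3
A = ring_adjacency(n)
X = rng.normal(size=(n, d))
theta_true, beta_true = np.array([1.0, -0.5, 0.25]), 0.4
y = gibbs_sample(X, A, theta_true, beta_true, sweeps=50, rng=rng)
theta_hat, beta_hat = fit_mple(X, A, y)
print("theta_hat:", theta_hat, "beta_hat:", beta_hat)
```

Under these assumptions the pseudo-likelihood objective reduces to a logistic regression on the features augmented with each node's neighbor sum `(A @ y)_i`, which is why ignoring the network term amounts to fitting ordinary logistic regression. Setting `beta_true = 0` in the demo recovers the independent case.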

research
05/08/2019

Regression from Dependent Observations

The standard linear and logistic regression models assume that the respo...
research
10/07/2021

High Dimensional Logistic Regression Under Network Dependence

Logistic regression is one of the most fundamental methods for modeling ...
research
08/28/2020

Introduction to logistic regression

For random field theory based multiple comparison corrections In brain i...
research
07/09/2023

On the sample complexity of estimation in logistic regression

The logistic regression model is one of the most popular data generation...
research
11/07/2018

Interpreting the Ising Model: The Input Matters

The Ising model is a widely used model for multivariate binary data. It ...
research
06/30/2020

Bayesian Analysis of Social Influence

The network influence model is a model for binary outcome variables that...
research
02/22/2023

An Interpretable Determinantal Choice Model for Subset Selection

Understanding how subsets of items are chosen from offered sets is criti...
