Online Updating Statistics for Heterogenous Updating Regressions via Homogenization Techniques

06/23/2021
by Lin Lu, et al.

With big data streams, the variable set of a model commonly changes as the condition of the stream changes. In this paper, we propose a homogenization strategy to represent the heterogeneous models that are gradually updated as the data stream in. With the homogenized representations, we can easily construct various online updating statistics, such as parameter estimates, the residual sum of squares, and the F-statistic, for the heterogeneous updating regression models. The main difference from the classical scenarios is that the artificial covariates in the homogenized models are not identically distributed as the natural covariates in the original models; consequently, the related theoretical properties are distinct from the classical ones. We establish the asymptotic properties of the online updating statistics, which show that the new method can achieve estimation efficiency and the oracle property without any constraint on the number of data batches. The behavior of the method is further illustrated by numerical examples from simulation experiments.
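
For orientation, the kind of online updating statistic mentioned above can be sketched for the classical case, in which every batch shares the same variable set. This is not the homogenization method of the paper, whose details the abstract does not give; it is a minimal illustration, assuming ordinary least squares and a hypothetical class name `OnlineOLS`, of how parameter estimates, the residual sum of squares, and an F-statistic can be maintained from accumulated summaries without storing past batches.

```python
import numpy as np


class OnlineOLS:
    """Classical online least squares with a FIXED variable set.

    Hypothetical helper for illustration only; it does not implement
    the paper's homogenization of changing variable sets.
    """

    def __init__(self, p):
        self.p = p
        self.xtx = np.zeros((p, p))  # running sum of X^T X
        self.xty = np.zeros(p)       # running sum of X^T y
        self.yty = 0.0               # running sum of y^T y
        self.n = 0                   # number of observations seen

    def update(self, X, y):
        """Absorb one batch; the raw batch can be discarded afterwards."""
        self.xtx += X.T @ X
        self.xty += X.T @ y
        self.yty += float(y @ y)
        self.n += X.shape[0]

    def _fit(self, cols):
        """Least squares estimate and RSS using only the columns in `cols`."""
        a = self.xtx[np.ix_(cols, cols)]
        b = self.xty[cols]
        beta = np.linalg.solve(a, b)
        # At the least squares solution, RSS = y^T y - beta^T X^T y.
        return beta, self.yty - float(beta @ b)

    def beta(self):
        return self._fit(list(range(self.p)))[0]

    def rss(self):
        return self._fit(list(range(self.p)))[1]

    def f_stat(self, keep):
        """F-statistic for H0: coefficients outside `keep` are all zero."""
        rss_full = self.rss()
        rss_restricted = self._fit(list(keep))[1]
        q = self.p - len(keep)  # number of restrictions tested
        return ((rss_restricted - rss_full) / q) / (rss_full / (self.n - self.p))
```

Calling `update` once per batch and then `beta()`, `rss()`, or `f_stat(keep)` reproduces the full-data least squares quantities, because the three running sums are sufficient for all of them.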


