Scalable Estimation and Inference with Large-scale or Online Survival Data

01/06/2020
by Jinfeng Xu, et al.

With the rapid development of data collection and aggregation technologies in many scientific disciplines, it is increasingly common to conduct large-scale or online regression to analyze real-world data and uncover real-world evidence. In such applications, storing the entire dataset in memory is often numerically challenging and sometimes infeasible. Consequently, classical batch-based estimation methods that require the entire dataset become less attractive or no longer applicable. Recursive estimation methods such as stochastic gradient descent, which process data points sequentially, are more appealing because they are both numerically convenient and memory efficient. In this paper, for scalable estimation with large-scale or online survival data, we propose a stochastic gradient descent method that recursively updates the estimates as data points arrive sequentially in streams. Theoretical results, including asymptotic normality and estimation efficiency, are established to justify its validity. Furthermore, to quantify the uncertainty of the proposed stochastic gradient descent estimator and facilitate statistical inference, we develop a scalable resampling strategy tailored to the large-scale or online setting. Simulation studies and a real data application are provided to assess its performance and illustrate its practical utility.
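
To make the idea of a recursive update concrete, the following is a minimal sketch of averaged stochastic gradient descent on streaming right-censored data. It is not the estimator from the paper: the abstract does not name the survival model, so the sketch assumes a simple exponential regression model with hazard exp(x'beta). The function names, the step-size schedule gamma0 * (n + n0) ** (-alpha), the iterate averaging, and the simulated data are all illustrative assumptions.

```python
import math
import random


def simulate_stream(n, beta_true, censor_rate, rng):
    """Yield (x, t, delta) one observation at a time (hypothetical data generator).

    x     : covariates drawn uniformly from [-1, 1]
    t     : observed time, the minimum of event time and censoring time
    delta : 1 if the event was observed, 0 if the observation was censored
    """
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in beta_true]
        eta = sum(xj * bj for xj, bj in zip(x, beta_true))
        event_time = rng.expovariate(math.exp(eta))  # assumed hazard: exp(x'beta)
        censor_time = rng.expovariate(censor_rate)   # independent censoring
        delta = 1 if event_time <= censor_time else 0
        yield x, min(event_time, censor_time), delta


def averaged_sgd(stream, dim, gamma0=0.5, n0=20, alpha=0.6):
    """One pass over the stream; memory use does not grow with the sample size."""
    beta = [0.0] * dim      # current SGD iterate
    beta_bar = [0.0] * dim  # running average of the iterates
    for n, (x, t, delta) in enumerate(stream, start=1):
        eta = sum(xj * bj for xj, bj in zip(x, beta))
        # Per-observation log-likelihood: delta * eta - t * exp(eta),
        # so its gradient in beta is x * (delta - t * exp(eta)).
        score = delta - t * math.exp(eta)
        step = gamma0 * (n + n0) ** (-alpha)  # decaying step size (assumed schedule)
        beta = [bj + step * score * xj for bj, xj in zip(beta, x)]
        # Recursive mean: no past observations or iterates are stored.
        beta_bar = [bb + (bj - bb) / n for bb, bj in zip(beta_bar, beta)]
    return beta_bar


if __name__ == "__main__":
    rng = random.Random(0)
    truth = [0.5, -0.5]
    estimate = averaged_sgd(simulate_stream(20000, truth, 0.3, rng), dim=len(truth))
    print("true beta:", truth)
    print("averaged SGD estimate:", estimate)
```

Each observation is used once and then discarded, which is what makes this style of update usable when the full dataset cannot be held in memory.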


research
07/01/2017

On Scalable Inference with Stochastic Gradient Descent

In many applications involving large dataset or online updating, stochas...
research
06/21/2019

Trade-offs in Large-Scale Distributed Tuplewise Estimation and Learning

The development of cluster computing frameworks has allowed practitioner...
research
11/23/2017

Online and Batch Supervised Background Estimation via L1 Regression

We propose a surprisingly simple model for supervised video background e...
research
05/21/2021

Online Statistical Inference for Parameters Estimation with Linear-Equality Constraints

Stochastic gradient descent (SGD) and projected stochastic gradient desc...
research
02/24/2023

Statistical Inference with Stochastic Gradient Methods under φ-mixing Data

Stochastic gradient descent (SGD) is a scalable and memory-efficient opt...
research
01/31/2023

Patch Gradient Descent: Training Neural Networks on Very Large Images

Traditional CNN models are trained and tested on relatively low resoluti...
research
01/21/2011

A fast and recursive algorithm for clustering large datasets with k-medians

Clustering with fast algorithms large samples of high dimensional data i...
