You Only Compress Once: Optimal Data Compression for Estimating Linear Models

02/22/2021
by   Jeffrey Wong, et al.
0

Linear models are used in online decision making, such as in machine learning, policy algorithms, and experimentation platforms. Many engineering systems that use linear models achieve computational efficiency through distributed systems and expert configuration. While there are strengths to this approach, it is still difficult to have an environment that enables researchers to interactively iterate and explore data and models, as well as leverage analytics solutions from the open source community. Consequently, innovation can be blocked. Conditionally sufficient statistics is a unified data compression and estimation strategy that is useful for the model development process, as well as the engineering deployment process. The strategy estimates linear models from compressed data without loss on the estimated parameters and their covariances, even when errors are autocorrelated within clusters of observations. Additionally, the compression preserves almost all interactions with the the original data, unlocking better productivity for both researchers and engineering systems.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/03/2019

Efficient Computation of Linear Model Treatment Effects in an Experimentation Platform

Linear models are a core component for statistical software that analyze...
research
08/27/2018

Empirical likelihood for linear models with spatial errors

For linear models with spatial errors, the empirical likelihood ratio st...
research
11/01/2018

Score-Matching Representative Approach for Big Data Analysis with Generalized Linear Models

We propose a fast and efficient strategy, called the representative appr...
research
09/16/2022

Detection of Interacting Variables for Generalized Linear Models via Neural Networks

The quality of generalized linear models (GLMs), frequently used by insu...
research
09/25/2019

Optimally Compressed Nonparametric Online Learning

Batch training of machine learning models based on neural networks is no...
research
03/02/2017

Active Learning for Accurate Estimation of Linear Models

We explore the sequential decision making problem where the goal is to e...
research
06/08/2020

Learning the Truth From Only One Side of the Story

Learning under one-sided feedback (i.e., where examples arrive in an onl...

Please sign up or login with your details

Forgot password? Click here to reset