Optimal Mean Estimation without a Variance

We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist. Concretely, given a sample 𝐗 = {X_i}_{i=1}^n from a distribution 𝒟 over ℝ^d with mean μ which satisfies the following weak-moment assumption for some α ∈ [0, 1]: ∀ ‖v‖ = 1: 𝔼_{X∼𝒟}[|⟨X − μ, v⟩|^{1+α}] ≤ 1, and given a target failure probability, δ, our goal is to design an estimator which attains the smallest possible confidence interval as a function of n, d, δ. For the specific case of α = 1, foundational work of Lugosi and Mendelson exhibits an estimator achieving subgaussian confidence intervals, and subsequent work has led to computationally efficient versions of this estimator. Here, we study the case of general α, and establish the following information-theoretic lower bound on the optimal attainable confidence interval: Ω(√(d/n) + (d/n)^{α/(1+α)} + (log(1/δ)/n)^{α/(1+α)}). Moreover, we devise a computationally-efficient estimator which achieves this lower bound.
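To get a feel for how the stated rate depends on α, the expression inside the Ω(·) can be evaluated numerically. The sketch below is only an illustration: the function name lower_bound_rate is hypothetical, and the hidden constant in Ω(·) is assumed to be 1, which the abstract does not specify.

```python
import math


def lower_bound_rate(n: int, d: int, delta: float, alpha: float) -> float:
    """Evaluate sqrt(d/n) + (d/n)^(a/(1+a)) + (log(1/delta)/n)^(a/(1+a)).

    Assumption: the constant hidden by Omega(.) is taken to be 1, so only
    the scaling in n, d, delta and alpha is meaningful, not the value itself.
    """
    if n <= 0 or d <= 0:
        raise ValueError("n and d must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    exponent = alpha / (1.0 + alpha)  # equals 1/2 when alpha = 1
    dimension_term = math.sqrt(d / n)
    weak_moment_dimension_term = (d / n) ** exponent
    confidence_term = (math.log(1.0 / delta) / n) ** exponent
    return dimension_term + weak_moment_dimension_term + confidence_term


if __name__ == "__main__":
    # Compare the rate across a few values of alpha at fixed n, d, delta.
    for a in (0.25, 0.5, 1.0):
        print(a, lower_bound_rate(n=10_000, d=10, delta=0.01, alpha=a))
```

At α = 1 the exponent α/(1+α) equals 1/2, so the expression reduces to order √(d/n) + √(log(1/δ)/n), the subgaussian rate mentioned above. For smaller α the exponent shrinks, so the last two terms decay more slowly in n.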
