Data-adaptive trimming of the Hill estimator and detection of outliers in the extremes of heavy-tailed data

08/23/2018
by Shrijita Bhattacharya, et al.

We introduce a trimmed version of the Hill estimator for the index of a heavy-tailed distribution that is robust to perturbations in the extreme order statistics. In the ideal Pareto setting, the estimator is essentially finite-sample efficient among all unbiased estimators with a given strict upper breakdown point. For general heavy-tailed models, we establish the asymptotic normality of the estimator under second-order regular variation conditions and show that it is minimax rate-optimal in the Hall class of distributions. We also develop an automatic, data-driven method for choosing the trimming parameter, which yields a new type of robust estimator that adapts to the unknown level of contamination in the extremes. This adaptive robustness property makes our estimator particularly appealing and superior to other robust estimators when the extremes of the data are contaminated. As an important application of the data-driven selection of the trimming parameter, we obtain a methodology for the principled identification of extreme outliers in heavy-tailed data. Indeed, the method has been shown to correctly identify the number of outliers in the previously explored Condroz data set.
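
The abstract does not state the estimator's formula, so the following is only a minimal sketch of the trimming idea, not the authors' definition. It assumes one standard form of a trimmed Hill estimator: the top k0 order statistics are discarded and their contribution is replaced by a weight of k0 on the largest retained observation, which keeps the estimate unbiased for exact Pareto data. The function name `trimmed_hill` and the parameters `k` and `k0` are illustrative choices; consult the full text for the exact estimator and for the data-driven choice of the trimming parameter, which is not reproduced here.

```python
import math


def trimmed_hill(sample, k, k0=0):
    """Sketch of a trimmed Hill estimate of the tail index xi.

    Assumed form (not taken from the abstract): with x[0] >= x[1] >= ...
    the sample sorted in decreasing order,

        xi_hat = ( k0 * log(x[k0] / x[k])
                   + sum_{i=k0}^{k-1} log(x[i] / x[k]) ) / (k - k0)

    k  : number of upper order statistics considered
    k0 : number of top order statistics trimmed away (0 <= k0 < k)
    """
    n = len(sample)
    if not (0 <= k0 < k < n):
        raise ValueError("need 0 <= k0 < k < len(sample)")
    x = sorted(sample, reverse=True)  # x[0] is the largest observation
    if x[k] <= 0:
        raise ValueError("the top k + 1 observations must be positive")

    log_threshold = math.log(x[k])  # log of the (k+1)-th largest value
    # The k0 trimmed values never enter the formula; their place is
    # taken by k0 copies of the largest retained log-excess.
    boundary = k0 * (math.log(x[k0]) - log_threshold)
    inner = sum(math.log(x[i]) - log_threshold for i in range(k0, k))
    return (boundary + inner) / (k - k0)


# Toy data whose logarithms are 1, 2, ..., 10.
clean = [math.exp(j) for j in range(1, 11)]
# Same data with the two largest values replaced by gross outliers.
contaminated = clean[:8] + [math.exp(50), math.exp(100)]

classical = trimmed_hill(clean, k=4, k0=0)
trimmed_clean = trimmed_hill(clean, k=4, k0=2)
trimmed_dirty = trimmed_hill(contaminated, k=4, k0=2)
```

Setting `k0=0` recovers the classical Hill estimator. In this toy example the trimmed estimate is identical for the clean and the contaminated data, because the two largest values are never used; that insensitivity to the top order statistics is the robustness property the abstract describes.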

research
10/16/2021

A Reduced-Bias Weighted least square estimation of the Extreme Value Index

In this paper, we propose a reduced-bias estimator of the EVI for Pareto...
research
04/18/2018

Estimation of the extreme value index in a censorship framework: asymptotic and finite sample behaviour

We revisit the estimation of the extreme value index for randomly censor...
research
12/20/2017

Extreme Value Analysis Without the Largest Values: What Can Be Done?

In this paper we are concerned with the analysis of heavy-tailed data wh...
research
12/05/2019

Outlier detection and a tail-adjusted boxplot based on extreme value theory

Whether an extreme observation is an outlier or not, depends strongly on...
research
06/09/2023

Two-level histograms for dealing with outliers and heavy tail distributions

Histograms are among the most popular methods used in exploratory analys...
research
05/22/2023

Robust heavy-tailed versions of generalized linear models with applications in actuarial science

Generalized linear models (GLMs) form one of the most popular classes of...
research
07/04/2020

Estimating Extreme Value Index by Subsampling for Massive Datasets with Heavy-Tailed Distributions

Modern statistical analyses often encounter datasets with massive sizes ...
