A framework for statistical modelling of the extremes of longitudinal data, applied to elite swimming

by Harry Spearing, et al.

We develop methods, based on extreme value theory, for analysing observations in the tails of longitudinal data, i.e., data sets consisting of a large number of short time series, which are typically irregularly and non-simultaneously sampled, yet share some common structure and are independent of one another. Extreme value theory has not previously been applied to the unique features of longitudinal data. Across time series, the data are assumed to follow a common generalised Pareto distribution above a high threshold. To account for the temporal dependence of such data, we require a model that describes (i) the variation in properties between the different time series, (ii) the changes in distribution over time, and (iii) the temporal dependence within each series. Our methodology has the flexibility to capture both asymptotic dependence and asymptotic independence, with this characteristic determined by the data. Bayesian inference is used because some parameters are unique to each time series. Our novel methodology is illustrated through the analysis of data from elite swimmers in the men's 100m breaststroke. Unlike previous analyses of personal-best data in this event, we are able to make inference about the careers of individual swimmers, such as the probability that an individual will break the world record or swim the fastest time next year.
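To make the generalised Pareto assumption concrete, the following is a minimal sketch of the standard tail formula P(X > x | X > u) = (1 + ξ(x − u)/σ)^(−1/ξ), together with a simple method-of-moments fit. It is not the authors' Bayesian model: it ignores variation between swimmers, changes over time, and within-series dependence. The function names, the use of negated swim times (so that faster swims are larger values), and the moment estimator are all assumptions made for illustration.

```python
import math


def gpd_survival(x, u, sigma, xi):
    """P(X > x | X > u) for a generalised Pareto tail above threshold u.

    sigma > 0 is the scale and xi the shape; both are assumed known here.
    """
    if x <= u:
        return 1.0
    z = (x - u) / sigma
    if abs(xi) < 1e-12:
        # xi -> 0 limit: exponential tail
        return math.exp(-z)
    t = 1.0 + xi * z
    if t <= 0.0:
        # xi < 0 gives a finite upper endpoint; x lies beyond it
        return 0.0
    return t ** (-1.0 / xi)


def gpd_moment_fit(excesses):
    """Method-of-moments estimates (sigma, xi) from excesses over a threshold.

    Uses mean = sigma / (1 - xi) and var = mean^2 / (1 - 2 xi),
    which requires xi < 1/2. This stands in for the paper's Bayesian fit.
    """
    n = len(excesses)
    mean = sum(excesses) / n
    var = sum((e - mean) ** 2 for e in excesses) / (n - 1)
    ratio = mean ** 2 / var
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * mean * (ratio + 1.0)
    return sigma, xi


def excesses_over_threshold(times_seconds, threshold_seconds):
    """Negate swim times so fast swims are 'large', then keep excesses.

    A swim faster than threshold_seconds by d seconds yields excess d.
    """
    return [threshold_seconds - t for t in times_seconds if t < threshold_seconds]
```

Given excesses computed from a set of swim times and a chosen threshold, `gpd_moment_fit` returns the scale and shape, and `gpd_survival` then gives the conditional probability that a swim already beyond the threshold also beats a faster target time (passed on the negated scale).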


