Calibration of P-values for calibration and for deviation of a subpopulation from the full population

01/31/2022
by Mark Tygert, et al.

The author's recent research papers, "Cumulative deviation of a subpopulation from the full population" and "A graphical method of cumulative differences between two subpopulations" (both published in volume 8 of Springer's open-access "Journal of Big Data" during 2021), propose graphical methods and summary statistics, without extensively calibrating formal significance tests. The summary metrics and methods can measure the calibration of probabilistic predictions and can assess differences in responses between a subpopulation and the full population while controlling for a covariate or score via conditioning on it. Those papers construct significance tests based on the scalar summary statistics, but only sketch how to calibrate the attained significance levels (also known as "P-values") for the tests. The present article reviews and synthesizes work spanning many decades in order to detail how to calibrate the P-values. It presents computationally efficient, easily implemented numerical methods for evaluating properly calibrated P-values, together with rigorous mathematical proofs guaranteeing their accuracy, and illustrates and validates the methods with open-source software and numerical examples.
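To make concrete what calibrating a P-value for a scalar summary statistic can mean, the following is a minimal sketch under stated assumptions, not the article's method. It assumes the statistic is the maximum absolute value of the cumulative differences between observed outcomes and predicted probabilities, taken in order of increasing score, and it calibrates by brute-force Monte Carlo simulation under the null hypothesis that each outcome is a Bernoulli draw with its predicted probability. The function names and parameters are hypothetical and chosen only for illustration; the article itself concerns computationally efficient numerical methods rather than simulation.

```python
import random
from itertools import accumulate


def max_abs_cumulative_deviation(scores, outcomes):
    """Largest absolute cumulative difference (outcome minus score),
    accumulated in order of increasing score and normalized by n."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    diffs = [(outcomes[i] - scores[i]) / n for i in order]
    return max(abs(c) for c in accumulate(diffs))


def monte_carlo_p_value(scores, outcomes, num_draws=1000, seed=0):
    """Estimate the P-value by simulating outcomes under the null
    hypothesis that outcome i is Bernoulli with probability scores[i].
    This is an illustrative simulation, not an exact computation."""
    rng = random.Random(seed)
    observed = max_abs_cumulative_deviation(scores, outcomes)
    exceed = 0
    for _ in range(num_draws):
        simulated = [1.0 if rng.random() < s else 0.0 for s in scores]
        if max_abs_cumulative_deviation(scores, simulated) >= observed:
            exceed += 1
    # The +1 terms keep the estimate strictly positive.
    return (exceed + 1) / (num_draws + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    scores = [rng.random() for _ in range(500)]
    # Outcomes drawn with probability score**2 are miscalibrated on purpose.
    outcomes = [1.0 if rng.random() < s * s else 0.0 for s in scores]
    print(monte_carlo_p_value(scores, outcomes, num_draws=200))
```

Simulation of this kind is simple but costs one full pass over the data per draw, and its resolution is limited to about one over the number of draws, which is one reason efficient, provably accurate numerical methods are attractive.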


