confidence-planner: Easy-to-Use Prediction Confidence Estimation and Sample Size Planning

01/12/2023
by Antoni Klorek, et al.

Machine learning applications, especially in medicine and the social sciences, are coming under increasing scrutiny. Just as sample size planning is performed in clinical and social studies, lawmakers and funding agencies may expect statistical uncertainty estimates for machine learning applications that impact society. In this paper, we present an easy-to-use Python package and web application for estimating prediction confidence intervals. The package offers eight different procedures to determine and justify the sample size and confidence of predictions from holdout, bootstrap, cross-validation, and progressive validation experiments. Since the package builds directly on established data analysis libraries, it integrates seamlessly into preprocessing and exploratory data analysis steps. Code related to this paper is available at: https://github.com/dabrze/confidence-planner.
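To make the two tasks named in the abstract concrete, here is a minimal sketch of a confidence interval for holdout accuracy and of the matching sample size calculation. It uses the textbook normal (Wald) approximation to a binomial proportion, which is an assumption made for illustration and not necessarily one of the package's eight procedures. The function names `wald_interval` and `required_holdout_size` are hypothetical and are not part of the confidence-planner API.

```python
import math
from statistics import NormalDist


def wald_interval(accuracy, n, confidence=0.95):
    """Normal-approximation confidence interval for holdout accuracy.

    Hypothetical helper for illustration; not the confidence-planner API.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of a binomial proportion estimated from n test examples.
    half_width = z * math.sqrt(accuracy * (1 - accuracy) / n)
    # Clip to [0, 1] because accuracy is a proportion.
    return max(0.0, accuracy - half_width), min(1.0, accuracy + half_width)


def required_holdout_size(half_width, confidence=0.95, accuracy=0.5):
    """Smallest holdout size whose interval half-width is at most `half_width`.

    The default accuracy of 0.5 maximizes the variance, so it gives a
    conservative (largest) sample size when the true accuracy is unknown.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * accuracy * (1 - accuracy) / half_width ** 2)


if __name__ == "__main__":
    low, high = wald_interval(accuracy=0.9, n=1000)
    print(f"95% interval for 90% accuracy on 1000 samples: [{low:.3f}, {high:.3f}]")
    print("Holdout size for a +/-0.05 half-width:", required_holdout_size(0.05))
```

The normal approximation becomes unreliable for small test sets or accuracies close to 0 or 1, so this sketch only conveys the shape of the calculation.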


research
08/22/2023

Toward Generalizable Machine Learning Models in Speech, Language, and Hearing Sciences: Power Analysis and Sample Size Estimation

This study's first purpose is to provide quantitative evidence that woul...
research
12/29/2020

Statistical Formulas for F Measures

We provide analytic formulas for the standard error and confidence inter...
research
05/28/2021

Distribution-free inference for regression: discrete, continuous, and in between

In data analysis problems where we are not able to rely on distributiona...
research
02/23/2023

RESI: An R Package for Robust Effect Sizes

Effect size indices are useful parameters that quantify the strength of ...
research
09/29/2019

Machine Learning vs Statistical Methods for Time Series Forecasting: Size Matters

Time series forecasting is one of the most active research topics. Machi...
research
05/23/2022

Please, Don't Forget the Difference and the Confidence Interval when Seeking for the State-of-the-Art Status

This paper argues for the widest possible use of bootstrap confidence in...
research
05/01/2019

Scalable GWR: A linear-time algorithm for large-scale geographically weighted regression with polynomial kernels

While a number of studies have developed fast geographically weighted re...
