Tea: A High-level Language and Runtime System for Automating Statistical Analysis

04/10/2019
by Eunice Jun, et al.

Though statistical analyses are centered on research questions and hypotheses, current statistical analysis tools are not. Users must first translate their hypotheses into specific statistical tests and then perform API calls with functions and parameters. To do so accurately requires that users have statistical expertise. To lower this barrier to valid, replicable statistical analysis, we introduce Tea, a high-level declarative language and runtime system. In Tea, users express their study design, any parametric assumptions, and their hypotheses. Tea compiles these high-level specifications into a constraint satisfaction problem that determines the set of valid statistical tests, and then executes them to test the hypothesis. We evaluate Tea using a suite of statistical analyses drawn from popular tutorials. We show that Tea generally matches the choices of experts while automatically switching to non-parametric tests when parametric assumptions are not met. We simulate the effect of mistakes made by non-expert users and show that Tea automatically avoids both false negatives and false positives that could be produced by the application of incorrect statistical tests.
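The idea of deriving the set of valid tests from declared properties can be illustrated with a small sketch. This is not Tea's API or its actual constraint encoding: the class `CandidateTest`, the `CATALOGUE` list, the property names, and the function `valid_tests` are all hypothetical, and the constraint solving is reduced to a plain subset check, assuming each test is valid exactly when every property it requires holds for the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateTest:
    """A statistical test plus the properties it needs (illustrative only)."""
    name: str
    requires: frozenset


# Hypothetical, heavily simplified catalogue for comparing two groups.
CATALOGUE = [
    CandidateTest("Student's t-test",
                  frozenset({"two_groups", "independent", "normal", "equal_variance"})),
    CandidateTest("Welch's t-test",
                  frozenset({"two_groups", "independent", "normal"})),
    CandidateTest("Mann-Whitney U test",
                  frozenset({"two_groups", "independent"})),
]


def valid_tests(properties):
    """Return names of tests whose required properties all hold."""
    props = frozenset(properties)
    return [t.name for t in CATALOGUE if t.requires <= props]


# Properties that hold when the parametric assumptions are met.
parametric = {"two_groups", "independent", "normal", "equal_variance"}
# The same study design when normality and equal variance do not hold.
nonparametric = {"two_groups", "independent"}

print(valid_tests(parametric))
print(valid_tests(nonparametric))
```

When the normality property is absent, only the non-parametric test remains in the result, which mirrors the fallback behaviour the abstract describes. A real system would also have to establish these properties from the data and the study design rather than take them as given.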


