Tisane: Authoring Statistical Models via Formal Reasoning from Conceptual and Data Relationships

01/07/2022
by Eunice Jun, et al.

Proper statistical modeling incorporates domain theory about how concepts relate and details of how data were measured. However, data analysts currently lack tool support for recording and reasoning about domain assumptions, data collection, and modeling choices in an integrated manner, which leads to mistakes that can compromise scientific validity. For instance, generalized linear mixed-effects models (GLMMs) help answer complex research questions, but omitting random effects impairs the generalizability of results. To address this need, we present Tisane, a mixed-initiative system for authoring generalized linear models with and without mixed effects. Tisane introduces a study design specification language for expressing and asking questions about relationships between variables. Tisane contributes an interactive compilation process that represents relationships in a graph, infers candidate statistical models, and asks follow-up questions to disambiguate user queries and construct a valid model. In case studies with three researchers, we find that Tisane helps them focus on their goals and assumptions while avoiding past mistakes.
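To make the graph-based idea concrete, here is a minimal sketch in Python. It is not Tisane's API: the `StudyGraph` class, its fields, the `candidate_model` and `to_formula` helpers, and the tutoring example are all hypothetical, and the inference rule is a deliberate simplification. The sketch assumes that random intercepts can be proposed by walking up the nesting chain from the unit on which the dependent variable is measured, and that other declared causes of the dependent variable are surfaced as follow-up questions rather than added silently.

```python
from dataclasses import dataclass, field


@dataclass
class StudyGraph:
    """Toy graph of conceptual and data-measurement relationships (hypothetical)."""
    causes: set = field(default_factory=set)       # (cause, effect) pairs
    nested_in: dict = field(default_factory=dict)  # unit -> enclosing group
    unit_of: dict = field(default_factory=dict)    # variable -> unit it is measured on

    def candidate_model(self, dv, ivs):
        # Other declared causes of the DV become questions for the analyst.
        ask_about = sorted(c for c, e in self.causes if e == dv and c not in ivs)
        # Random intercepts: walk up the nesting chain from the DV's unit.
        groups = []
        group = self.nested_in.get(self.unit_of[dv])
        while group is not None:
            groups.append(group)
            group = self.nested_in.get(group)
        return {
            "main_effects": list(ivs),
            "ask_about": ask_about,
            "random_intercepts": groups,
        }


def to_formula(dv, model):
    """Render the candidate in lme4-style formula notation, as a plain string."""
    terms = model["main_effects"] + [f"(1 | {g})" for g in model["random_intercepts"]]
    return f"{dv} ~ " + " + ".join(terms)


# Made-up study: student test scores, with students nested in classrooms in schools.
graph = StudyGraph()
graph.causes |= {("tutoring", "score"), ("motivation", "score")}
graph.unit_of.update({"score": "student", "tutoring": "student", "motivation": "student"})
graph.nested_in["student"] = "classroom"
graph.nested_in["classroom"] = "school"

model = graph.candidate_model("score", ["tutoring"])
formula = to_formula("score", model)
print(formula)             # score ~ tutoring + (1 | classroom) + (1 | school)
print(model["ask_about"])  # ['motivation']
```

Because the nesting structure is declared once in the graph, the grouping terms are derived rather than remembered, which is the kind of omission the abstract warns about. A real system would also need to handle random slopes, interactions, and family and link functions, none of which this sketch attempts.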
