
A Statistical Analysis of Compositional Surveys

by Michelle Pistner Nixon, et al.
Penn State University

A common statistical problem is inference from positive-valued multivariate measurements where the scale (e.g., sum) of the measurements is not representative of the scale (e.g., total size) of the system being studied. This situation is common in the analysis of modern sequencing data. The field of Compositional Data Analysis (CoDA) axiomatically states that analyses must be invariant to scale. Yet many scientific questions rely on the unmeasured system scale for identifiability. In practice, many existing tools make a wide variety of assumptions to identify models, often imputing the unmeasured scale. Here, we analyze the theoretical limits on inference given these data and formalize the assumptions required to provide principled scale-reliant inference. Using statistical concepts such as consistency and calibration, we provide guidance on how to make scale-reliant inference from these data. We prove that the Frequentist ideal is often unachievable and that existing methods can demonstrate bias and a breakdown of Type-I error control. We introduce scale simulation estimators and scale sensitivity analysis as a rigorous, flexible, and computationally efficient means of performing scale-reliant inference.




A Guideline for the Statistical Analysis of Compositional Data in Immunology

The study of immune cellular composition is of great scientific interest...

LinDA: Linear Models for Differential Abundance Analysis of Microbiome Compositional Data

One fundamental statistical task in microbiome data analysis is differen...

Radically Compositional Cognitive Concepts

Despite ample evidence that our concepts, our cognitive architecture, an...

Compositional Active Inference I: Bayesian Lenses. Statistical Games

We introduce the concepts of Bayesian lens, characterizing the bidirecti...

Generalizations to Corrections for the Effects of Measurement Error in Approximately Consistent Methodologies

Measurement error is a pervasive issue which renders the results of an a...

A causal view on compositional data

Many scientific datasets are compositional in nature. Important examples...

Statistical Inference for Coadded Astronomical Images

Coadded astronomical images are created by stacking multiple single-expo...