The N-ary in the Coal Mine: Avoiding Mixture Model Failure with Proper Validation

08/11/2023
by Travis Maxfield, et al.

Modeling the properties of chemical mixtures is a difficult but important part of any modeling process intended to apply to the often messy and impure phenomena of everyday life, including food and environmental safety and healthcare. Part of this difficulty stems from the added complexity of designing suitable model validation schemes for mixture data, which previous work has addressed only for binary mixture models. We extend these previously defined validation strategies for QSAR modeling of binary mixtures to the more complex case of general N-ary mixtures, and argue that these strategies apply to many modeling tasks beyond simple chemical mixtures. Additionally, we propose a method of establishing a baseline model performance for each mixture dataset, to be used in model selection comparisons. This baseline is intended to account for the statistical dependence generically present between the properties of mixtures that share constituents. We contend that without such a baseline, model performance can be dramatically overestimated, and we demonstrate this with multiple case studies using real and simulated data.
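As a rough illustration of why mixtures that share constituents complicate validation, the sketch below splits a mixture dataset by constituent rather than by row. It is not the procedure proposed in the paper; the function name `split_by_constituents`, the `holdout_fraction` parameter, and the representation of each mixture as a tuple of compound identifiers are all assumptions made for this example.

```python
import random


def split_by_constituents(mixtures, holdout_fraction=0.3, seed=0):
    """Hypothetical constituent-aware train/test split for N-ary mixtures.

    mixtures: list of tuples of compound identifiers, one tuple per mixture.
    Returns (train_indices, test_entries, held_out_compounds), where each
    test entry is (mixture_index, number_of_held_out_constituents).
    """
    rng = random.Random(seed)
    # Collect every distinct compound; sort so sampling is reproducible.
    compounds = sorted({c for m in mixtures for c in m})
    n_holdout = max(1, round(holdout_fraction * len(compounds)))
    held_out = set(rng.sample(compounds, n_holdout))

    train, test = [], []
    for idx, mixture in enumerate(mixtures):
        n_unseen = len(set(mixture) & held_out)
        if n_unseen == 0:
            # No held-out compound present: safe to train on.
            train.append(idx)
        else:
            # Contains at least one compound never seen in training.
            test.append((idx, n_unseen))
    return train, test, held_out


# Toy data (invented for illustration): binary and ternary mixtures.
example_mixtures = [
    ("a", "b"),
    ("b", "c"),
    ("a", "c", "d"),
    ("d", "e"),
    ("e", "f", "g"),
]
train_idx, test_entries, held_out = split_by_constituents(example_mixtures)
```

Recording the number of held-out constituents per test mixture makes it possible to report performance separately for mixtures that are partly versus entirely novel, which a random row-wise split would not distinguish.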


