Breiman's two cultures: You don't have to choose sides

by Andrew C. Miller, et al.

Breiman's classic paper casts data analysis as a choice between two cultures: data modelers and algorithmic modelers. Broadly, data modelers analyze data with simple, interpretable models whose theoretical properties are well understood, while algorithmic modelers prioritize predictive accuracy and use more flexible function approximators. This dichotomy overlooks a third class of models: mechanistic models derived from scientific theories (e.g., ODE/SDE simulators), which encode application-specific scientific knowledge about the data. While these three categories represent extreme points in model space, modern computational and algorithmic tools let us interpolate between them, producing flexible, interpretable, and scientifically informed hybrids that can deliver accurate and robust predictions and resolve issues with data analysis that Breiman describes, such as the Rashomon effect and Occam's dilemma. Challenges remain in finding an appropriate point in model space: there are many ways to compose model components and to set the degree to which each component informs inferences.
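As a rough illustration of what interpolating between a mechanistic model and a flexible one can look like, the sketch below adds a ridge-penalized polynomial correction to the closed-form solution of a simple decay ODE. It is not taken from the paper. The ODE, the synthetic data, the polynomial residual, and every name (`mechanistic`, `fit_hybrid`, `predict_hybrid`, `penalty`) are assumptions chosen for illustration.

```python
import numpy as np


def mechanistic(t, x0, k):
    """Closed-form solution of the ODE dx/dt = -k * x with x(0) = x0."""
    return x0 * np.exp(-k * t)


def fit_hybrid(t, y, x0, k, degree=3, penalty=1.0):
    """Fit a ridge-penalized polynomial to the mechanistic model's residual.

    A large `penalty` shrinks the flexible component toward zero, so the
    prediction falls back on the mechanistic model alone.
    """
    residual = y - mechanistic(t, x0, k)
    # Polynomial features with columns 1, t, t^2, ..., t^degree.
    features = np.vander(t, degree + 1, increasing=True)
    gram = features.T @ features + penalty * np.eye(degree + 1)
    return np.linalg.solve(gram, features.T @ residual)


def predict_hybrid(t, x0, k, weights):
    """Mechanistic prediction plus the learned polynomial correction."""
    features = np.vander(t, len(weights), increasing=True)
    return mechanistic(t, x0, k) + features @ weights


# Synthetic data (an assumption): decay plus an effect the ODE does not capture.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
y = mechanistic(t, 2.0, 1.5) + 0.3 * t**2 + 0.01 * rng.standard_normal(t.size)

# Two points in model space: nearly unconstrained vs. nearly pure mechanistic.
weights_flexible = fit_hybrid(t, y, 2.0, 1.5, penalty=1e-6)
weights_rigid = fit_hybrid(t, y, 2.0, 1.5, penalty=1e6)

mse_flexible = np.mean((y - predict_hybrid(t, 2.0, 1.5, weights_flexible)) ** 2)
mse_rigid = np.mean((y - predict_hybrid(t, 2.0, 1.5, weights_rigid)) ** 2)
print(mse_flexible, mse_rigid)
```

In this sketch the single `penalty` parameter plays the role of the degree to which the flexible component informs inferences. Choosing it, for example by held-out validation, is one small instance of the model-space search the abstract describes.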



