Entropy-based Characterization of Modeling Constraints

06/27/2022
by Orestis Loukas, et al.

In most data-scientific approaches, the principle of Maximum Entropy (MaxEnt) is invoked a posteriori to justify a parametric model that has already been chosen based on experience, prior knowledge, or computational simplicity. In a formulation orthogonal to conventional model building, we start from the linear system of phenomenological constraints and asymptotically derive the distribution over all viable distributions that satisfy the provided set of constraints. The MaxEnt distribution plays a special role: it is the most typical among all phenomenologically viable distributions, and hence a good expansion point for large-N techniques. This enables us to formulate hypothesis testing consistently and in a fully data-driven manner. The appropriate parametric model supported by the data can always be deduced at the end of model selection. Within the MaxEnt framework, we recover major scores and selection procedures used in multiple applications and assess their ability to capture associations in the data-generating process and to identify the most generalizable model. This data-driven counterpart of standard model selection demonstrates the unifying perspective of the deductive logic advocated by the MaxEnt principle, while potentially shedding new light on the inverse problem.
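To make the starting point concrete: given linear constraints of the form E[f(X)] = F, the MaxEnt distribution is the exponential family p_i ∝ exp(λ f_i), with the Lagrange multiplier λ fixed by the constraint. The sketch below (an illustration, not code from the paper) solves Jaynes' classic loaded-die problem, where the only phenomenological constraint is a prescribed mean; the helper `maxent_die` and the bisection tolerance are assumptions of this example.

```python
import math

def maxent_die(target_mean, faces=range(1, 7), tol=1e-12):
    """MaxEnt distribution over die faces subject to E[X] = target_mean.

    The MaxEnt solution under a single linear constraint is the
    exponential family p_i proportional to exp(lam * i); since the
    induced mean is monotone in lam, we solve for the Lagrange
    multiplier lam by bisection.
    """
    faces = list(faces)

    def mean(lam):
        # Mean of the exponential-family distribution at multiplier lam.
        w = [math.exp(lam * f) for f in faces]
        z = sum(w)  # partition function
        return sum(f * wi for f, wi in zip(faces, w)) / z

    lo, hi = -50.0, 50.0  # bracket; mean(lam) is monotone increasing
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = [math.exp(lam * f) for f in faces]
    z = sum(w)
    return [wi / z for wi in w]

# A fair die has mean 3.5; constraining the mean to 4.5 tilts the
# distribution toward the high faces, as MaxEnt prescribes.
p = maxent_die(4.5)
print([round(pi, 4) for pi in p])
```

With several constraints the same logic applies, but λ becomes a vector and the one-dimensional bisection is replaced by a convex optimization over the multipliers.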


Related research

- Robust Hypothesis Testing and Model Selection for Parametric Proportional Hazard Regression Models (09/26/2020): The semi-parametric Cox proportional hazards regression model has been w...
- Adaptive debiased machine learning using data-driven model selection techniques (07/24/2023): Debiased machine learning estimators for nonparametric inference of smoo...
- Categorical Distributions of Maximum Entropy under Marginal Constraints (04/07/2022): The estimation of categorical distributions under marginal constraints s...
- Distribution free MMD tests for model selection with estimated parameters (05/12/2023): Several kernel based testing procedures are proposed to solve the proble...
- Dirichlet Bayesian Network Scores and the Maximum Entropy Principle (08/02/2017): A classic approach for learning Bayesian networks from data is to select...
- Model Selection by Loss Rank for Classification and Unsupervised Learning (11/05/2010): Hutter (2007) recently introduced the loss rank principle (LoRP) as a ge...
- On a computationally-scalable sparse formulation of the multidimensional and non-stationary maximum entropy principle (05/07/2020): Data-driven modelling and computational predictions based on maximum ent...
