A robust multivariate linear non-parametric maximum likelihood model for ties

23 Feb 2021, by Landon Hurley et al.

Statistical analysis in applied research, across almost every field (e.g., biomedicine, economics, computer science, and psychology), relies upon samples for which the error distribution of the dependent variable is unknown or, at best, difficult to model linearly. Yet distributional assumptions of this kind remain extremely common. When the assumed distribution is incorrectly specified, the resulting estimator is biased, compromising the generalisability of our interpretations: the linearly unbiased Euclidean distance is very difficult to identify correctly upon finite samples, and when misapplied yields an estimator that is neither unbiased nor maximally informative. The common alternative, non-parametric statistics, has fundamental flaws of its own. In particular, these flaws revolve around order statistics and estimation in the presence of ties, which often preclude the introduction of multiple independent variables and the estimation of interactions. We introduce a competitor to the Euclidean norm, the Kemeny norm, which we prove induces a valid Banach space, and construct a multivariate linear expansion of the Kendall-Theil-Sen estimator which performs without compromising the extensibility of the parameter space; we then establish its linear maximum likelihood properties. Empirical demonstrations upon both simulated and real data show that the new estimator is nearly equivalent in power to the GLM upon Gaussian data, but greatly superior across a vast array of analytic scenarios, including finite ordinal sum-score analysis, thereby aiding the resolution of replication failures in the applied sciences.
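As background to the abstract, the two classical building blocks it names can be sketched briefly: the Kendall tau (Kemeny-style) distance between rankings counts pairwise disagreements, and the univariate Theil-Sen estimator takes the median of all pairwise slopes. The sketch below shows only these standard constructions, not the paper's multivariate Kemeny-norm extension or its treatment of ties, which are the paper's contributions.

```python
from itertools import combinations
from statistics import median


def kendall_tau_distance(a, b):
    """Number of discordant pairs between two equal-length rankings.

    This pairwise-disagreement count is the classical Kendall tau
    distance; the paper's Kemeny norm generalises this notion,
    including a principled handling of ties.
    """
    assert len(a) == len(b)
    return sum(
        1
        for i, j in combinations(range(len(a)), 2)
        if (a[i] - a[j]) * (b[i] - b[j]) < 0
    )


def theil_sen_slope(x, y):
    """Classical univariate Theil-Sen estimator: the median of all
    pairwise slopes (undefined pairs with equal x are skipped).
    The paper extends this estimator to the multivariate setting.
    """
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    return median(slopes)
```

For example, fully reversed rankings of length 3 disagree on all 3 pairs, and exactly linear data recovers the true slope regardless of outlier-free noise in the median.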


