Supervised learning with probabilistic morphisms and kernel mean embeddings
In this paper I propose a concept of a correct loss function in a generative model of supervised learning for an input space 𝒳 and a label space 𝒴, both of which are measurable spaces. A correct loss function in a generative model of supervised learning must accurately measure the discrepancy between elements of a hypothesis space ℋ of possible predictors and the supervisor operator, even when the supervisor operator does not belong to ℋ. To define correct loss functions, I propose a characterization of a regular conditional probability measure μ_𝒴|𝒳 for a probability measure μ on 𝒳×𝒴 relative to the projection Π_𝒳: 𝒳×𝒴→𝒳 as a solution of a linear operator equation. If 𝒴 is a separable metrizable topological space with the Borel σ-algebra ℬ(𝒴), I propose a further characterization of a regular conditional probability measure μ_𝒴|𝒳 as a minimizer of the mean square error on the space of Markov kernels, also called probabilistic morphisms, from 𝒳 to 𝒴; this characterization uses kernel mean embeddings. Building on these results, and using inner measure to quantify the generalizability of a learning algorithm, I extend a result of Cucker and Smale on the learnability of regression models to the setting of conditional probability estimation. Finally, I present a variant of Vapnik's regularization method for solving stochastic ill-posed problems that incorporates inner measure, and demonstrate its applications.
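To make the kernel-mean-embedding characterization concrete, the following is a minimal numerical sketch, not the paper's construction: it uses the standard empirical conditional mean embedding estimator, fitted by kernel ridge regression in a reproducing kernel Hilbert space, so that the estimated conditional measure μ_𝒴|𝒳=x is represented by weights β(x) on the observed labels. The Gaussian kernel, the bandwidth σ, the regularizer λ, and the toy data-generating process are all illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions throughout, not the paper's method):
# the standard empirical conditional mean embedding, fitted by kernel ridge
# regression, represents the estimated conditional measure mu_{Y|X=x} by
# weights beta(x) on the observed labels Y_1, ..., Y_n.
import numpy as np

def gaussian_gram(A, B, sigma=0.5):
    # Gram matrix k(a, b) = exp(-|a - b|^2 / (2 sigma^2)) for rows of A, B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

rng = np.random.default_rng(0)
n = 200
X = rng.uniform(-1.0, 1.0, size=(n, 1))                # inputs drawn from mu_X
Y = np.sin(3 * X) + 0.1 * rng.standard_normal((n, 1))  # labels drawn from mu_{Y|X}

K = gaussian_gram(X, X)       # Gram matrix on inputs
lam = 1e-3                    # ridge regularizer (hypothetical value)

# Weights beta(x) = (K + n*lam*I)^{-1} k_X(x); the embedding of the estimated
# conditional measure at x is  sum_i beta_i(x) k(Y_i, .).
Xt = np.linspace(-1.0, 1.0, 5)[:, None]                # test inputs
beta = np.linalg.solve(K + n * lam * np.eye(n), gaussian_gram(X, Xt))

# Any conditional expectation E[f(Y) | X = x] is then approximated by
# sum_i beta_i(x) f(Y_i); e.g. the conditional mean, with f the identity:
cond_mean = beta.T @ Y
print(np.c_[Xt, cond_mean, np.sin(3 * Xt)])            # estimate vs. true E[Y|X=x]
```

The closed form above arises from minimizing the empirical mean square error between the embedded labels k(Y_i, ·) and the predicted embeddings, plus a ridge penalty; at the level of this toy estimator it mirrors the paper's characterization of μ_𝒴|𝒳 as a mean-square-error minimizer over probabilistic morphisms.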