Zellner [1] modeled statistical inference in terms of information processing and postulated the Information Conservation Principle (ICP) between the input and output of the information processing block, showing that this yielded Bayesian inference as the optimum information processing rule. Recently, Alemi [2] reviewed Zellner's work in the context of machine learning and showed that the ICP could be seen as a special case of a more general optimum information processing criterion, namely the predictive information bottleneck objective. However, [2] modeled machine learning as using training and test data sets only, and did not account for the use of a validation data set during training. The present note is an attempt to extend Alemi's information processing formulation of machine learning, and the predictive information bottleneck objective for model training, to the widely used scenario where training utilizes not only a training set but also a validation data set.
2 Review of the Information Processing Formulation of Machine Learning
2.1 Introduction and Notation
We will use Alemi's formulation and notation from [2], with some additional detail for clarity. Consider a data-generating process that generates the features $x$ according to a distribution $p(x)$ (a PMF or PDF, depending on whether $x$ is discrete or continuous-valued, respectively). We collect samples of $x$ in the training set $x_P$, with the choice of subscript 'P' emphasizing that these are past observations. Depending on whether we are testing the performance of a trained model on a test set or deploying a trained model in a production environment to perform inference, we may have a finite or (potentially) infinite set $x_F$ of future (i.e., not seen during training) samples of $x$ from the same process (also emphasized by the choice of subscript 'F').
2.2 The Predictive Information Bottleneck Objective for Model Training
Model training or "learning" is the extraction of the model parameters $\theta$ from the training set $x_P$. Viewed from the perspective of information processing, we may see model training as computing, and sampling from, the distribution $p(\theta \mid x_P)$.
Again from the perspective of information processing, the trained model may be evaluated by how much information the training-derived representation $\theta$ captures about future samples $x_F$. In other words, we want to find the $p(\theta \mid x_P)$ that maximizes the mutual information $I(\theta; x_F)$:
where for any two random variables $X$ and $Y$, their mutual information $I(X; Y)$ is defined as the Kullback-Leibler distance between their joint distribution $p(x, y)$ and the product of their marginal distributions $p(x)$ and $p(y)$:
$$
I(X; Y) = \mathrm{KL}\big( p(x, y) \,\|\, p(x)\, p(y) \big) = \mathbb{E}_{p(x, y)}\left[ \log \frac{p(x, y)}{p(x)\, p(y)} \right].
$$
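As a small numerical illustration of this definition, the following sketch (using a hypothetical joint PMF chosen only for illustration, not taken from the formulation above) computes $I(X;Y)$ directly as the Kullback-Leibler distance between the joint distribution and the product of the marginals:

```python
import math

# Hypothetical joint PMF of a dependent binary pair (X, Y)
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Marginal distributions p(x) and p(y)
px = {x: sum(v for (a, b), v in p.items() if a == x) for x in (0, 1)}
py = {y: sum(v for (a, b), v in p.items() if b == y) for y in (0, 1)}

# I(X;Y) = sum_{x,y} p(x,y) * log[ p(x,y) / (p(x) p(y)) ]  (in nats)
mi = sum(v * math.log(v / (px[x] * py[y])) for (x, y), v in p.items())
```

The result is nonnegative, and equals zero exactly when the joint factorizes into the product of the marginals.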
Alemi [2] proposes to limit the complexity of the model representation $\theta$ obtained from training by incorporating a bottleneck requirement on the mutual information $I(\theta; x_P)$. This yields the following predictive information bottleneck objective on training:
Applying a Lagrange multiplier $\beta$ to the constraint on $I(\theta; x_P)$, (1) can be rewritten as the unconstrained optimization problem
Note that there is no constraint on the sign of $\beta$. Next, from the Markov property of the information processing chain $x_F \leftrightarrow x_P \leftrightarrow \theta$, we have $I(\theta; x_F \mid x_P) = 0$. Combined with the chain-rule identity
$$
I(\theta; x_P, x_F) = I(\theta; x_P) + I(\theta; x_F \mid x_P) = I(\theta; x_F) + I(\theta; x_P \mid x_F),
$$
this gives $I(\theta; x_F) = I(\theta; x_P) - I(\theta; x_P \mid x_F)$, which can be substituted into the unconstrained objective. Since the resulting problem (5) cannot be solved directly, Alemi employs two variational approximations that are described next.
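The Markov-chain step above can be checked numerically. The sketch below (with toy distributions chosen only for illustration) builds a joint distribution satisfying the Markov factorization $p(x_F)\,p(x_P \mid x_F)\,p(\theta \mid x_P)$ and verifies that $I(\theta; x_P) = I(\theta; x_F) + I(\theta; x_P \mid x_F)$, which in particular implies the data-processing inequality $I(\theta; x_F) \le I(\theta; x_P)$:

```python
import itertools
import math

p_xf = {0: 0.6, 1: 0.4}                      # p(x_F)
p_xp_given_xf = {0: {0: 0.8, 1: 0.2},        # p(x_P | x_F): correlated with x_F
                 1: {0: 0.3, 1: 0.7}}
p_th_given_xp = {0: {0: 0.9, 1: 0.1},        # p(theta | x_P): depends on x_P only
                 1: {0: 0.2, 1: 0.8}}

# Full joint p(x_F, x_P, theta) under the Markov factorization
joint = {(f, xp, t): p_xf[f] * p_xp_given_xf[f][xp] * p_th_given_xp[xp][t]
         for f, xp, t in itertools.product((0, 1), repeat=3)}

def mi(pairs):
    """Mutual information (nats) of a joint PMF over (a, b) pairs."""
    pa, pb = {}, {}
    for (a, b), v in pairs.items():
        pa[a] = pa.get(a, 0.0) + v
        pb[b] = pb.get(b, 0.0) + v
    return sum(v * math.log(v / (pa[a] * pb[b]))
               for (a, b), v in pairs.items() if v > 0)

# Pair marginals (theta, x_P) and (theta, x_F)
p_t_xp, p_t_xf = {}, {}
for (f, xp, t), v in joint.items():
    p_t_xp[(t, xp)] = p_t_xp.get((t, xp), 0.0) + v
    p_t_xf[(t, f)] = p_t_xf.get((t, f), 0.0) + v

i_t_xp = mi(p_t_xp)
i_t_xf = mi(p_t_xf)

# Conditional MI: I(theta; x_P | x_F) = sum_f p(f) * I(theta; x_P | x_F = f)
i_t_xp_given_xf = 0.0
for f in (0, 1):
    cond = {(t, xp): joint[(f, xp, t)] / p_xf[f]
            for xp in (0, 1) for t in (0, 1)}
    i_t_xp_given_xf += p_xf[f] * mi(cond)
```

With these (or any) Markov-consistent distributions, `i_t_xp` equals `i_t_xf + i_t_xp_given_xf` up to floating-point error.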
2.3 Variational Approximations to the Predictive Information Bottleneck Objective
Treat $x_F$ as unobserved variables and use a variational approximation $q(\theta)$ to the true distribution $p(\theta \mid x_F)$, where $q(\theta)$ is a distribution chosen independent of $x_F$. Denoting by $\mathbb{E}$ the expectation with respect to (the true distributions of) all the random variables $(\theta, x_P, x_F)$, we have (recalling that $\theta$ and $x_F$ are conditionally independent given $x_P$):
Treat the distribution $p(x_F \mid \theta)$ as a "likelihood" function and use a variational approximation for it given by $q(x_F \mid \theta)$, for example the factorized form $q(x_F \mid \theta) = \prod_i q\big(x_F^{(i)} \mid \theta\big)$ for some selected distribution $q(\cdot \mid \theta)$. Then we can write
where for any random variable $X$, $H(X)$ is its entropy:
$$
H(X) = -\mathbb{E}_{p(x)}\big[ \log p(x) \big].
$$
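A quick sanity check of the entropy definition, together with the standard decomposition $I(X;Y) = H(X) + H(Y) - H(X,Y)$ relating it to the mutual information used above, on a hypothetical binary joint PMF:

```python
import math

# Hypothetical joint PMF of a dependent binary pair (X, Y)
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def entropy(pmf):
    """H = -E[log p], in nats, over the values of a PMF."""
    return -sum(v * math.log(v) for v in pmf.values() if v > 0)

# Marginals of this joint are both uniform over {0, 1}
px = {0: 0.5, 1: 0.5}
py = {0: 0.5, 1: 0.5}

h_xy = entropy(p)
mi = entropy(px) + entropy(py) - h_xy   # I(X;Y) = H(X) + H(Y) - H(X,Y)
```

The joint entropy never exceeds the sum of the marginal entropies, with equality exactly when $X$ and $Y$ are independent.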
2.4 Variational Formulation of Predictive Information Bottleneck Objective
From (6) and (7) we therefore have, for every choice of $\beta$, variational approximate marginal distribution $q(\theta)$, and variational approximate likelihood function $q(x_F \mid \theta)$, the following upper bound on the objective function of (5):
Note that $H(x_F)$ is a constant outside our control. Since the exact problem (5) cannot be solved directly, we can simply select $\beta$, $q(\theta)$, and $q(x_F \mid \theta)$ based on some external criteria and solve the following problem, whose optimum value yields an upper bound on the exact objective in (5):
and we emphasize that the expectation in (9) is with respect to the true distributions of $(\theta, x_P, x_F)$. Note that the use of the variational approximation $q(x_F \mid \theta)$ has eliminated the dependence on the true likelihood $p(x_F \mid \theta)$.
Finally, it follows from (10) that the optimum distribution $p^*(\theta \mid x_P)$ is given by
3 The Modified Predictive Information Bottleneck Objective
3.1 Shortcomings of the Predictive Information Bottleneck Objective (1)
Recall that in (1), the Lagrange multiplier $\beta$ may be positive or negative-valued. However, all the interpretations in [2, Table 1] of the equivalent optimization problem (5), which is written in terms of $\beta$, are for $\beta \geq 0$, though there is no explanation offered in [2] of why only values of $\beta$ in this range make sense for this problem.
The threshold value in the equality constraint is never used after the constraint is incorporated into the problem formulation in (1). Further, $\beta$ is simply selected as per other criteria, and not set so as to achieve this equality.
3.2 Predictive Information Bottleneck Objective with Inequality Constraint
The fundamental reason for the above shortcomings is the requirement of equality in the bottleneck constraint.
We therefore propose to modify the predictive information bottleneck objective from (1) to one that has an inequality constraint between $I(\theta; x_P)$, the mutual information of the trained model and the training set, and a pre-selected threshold $I_0$:
Note that this formulation imposes a bottleneck on model performance, such that only models that extract a certain threshold level of information from the training set clear the bottleneck. From the Karush-Kuhn-Tucker (KKT) theorem, the optimization problem (1) now changes to
where we have used (3) to go from (12) to (13), and $\beta \geq 0$ by definition of a KKT multiplier. Note that this immediately resolves the interpretability issues (except for the uninteresting case where $I(\theta; x_P) = 0$, i.e., we ignore the data altogether while training; see [2, Table 1]) that arose with the equality bottleneck constraint formulation above. Note also that the objective function of (13) is the same as the left-hand side of (8). Then the variational approximations discussed in Sec. 2.3 can be applied as before, resulting in the same variational objective function (10).
Further, the KKT theorem requires that the optimum $\beta^*$ for (12) satisfy the complementary slackness condition
$$
\beta^* \big( I(\theta; x_P) - I_0 \big) = 0 .
$$
4 Extension to Model Training with a Validation Set
4.1 Introduction and Notation
The machine learning model is trained on a training set $x_P$. For training classical machine learning models like support vector machines, decision trees, and random forests, which are not as computationally expensive to train as deep learning models, this training set can be used for $k$-fold cross-validation to validate the performance of the model during training in order to guard against overfitting the training set. However, $k$-fold cross-validation requires the model to be trained from scratch $k$ times, which often imposes an unacceptable computational burden when the model is a deep learning one. Thus, when training deep learning models, it is widespread practice to set aside a portion of the available training data as a validation set, use the remainder of the training data as the training set to train the model, and, as the model trains, validate its performance on the validation set.
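The hold-out practice described above can be sketched as follows; the function name, split fraction, and use of a fixed seed are illustrative choices, not part of the formulation:

```python
import random

def holdout_split(data, val_fraction=0.2, seed=0):
    """Shuffle `data` and split it into (training, validation) lists."""
    items = list(data)
    random.Random(seed).shuffle(items)   # fixed seed for reproducibility
    n_val = int(len(items) * val_fraction)
    return items[n_val:], items[:n_val]

# Reserve 20% of 100 hypothetical samples as the validation set
train, val = holdout_split(range(100), val_fraction=0.2)
```

Unlike $k$-fold cross-validation, this splits the data once, so the model is trained from scratch only once.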
In the notation of the previous section, if the available training data comprises the samples $x^{(1)}, \dots, x^{(n)}$, the training set is now $x_P = \{x^{(1)}, \dots, x^{(m)}\}$, where $m < n$, with the rest of the training data comprising the validation set $x_V = \{x^{(m+1)}, \dots, x^{(n)}\}$.
4.2 Predictive Information Bottleneck Objective with Validation Set
As in Sec. 3.2, we will require the trained model to clear the bottleneck of a threshold $I_P$ on the mutual information $I(\theta; x_P)$ between the model and the training set:
In addition, we will require the model, after being trained on the training set $x_P$, to clear another threshold $I_V$ of mutual information $I(\theta; x_V)$ between the model and the validation set $x_V$ (analogous to the performance requirement on mutual information between the model and the test set $x_F$):
which is equivalent to requiring a bottleneck condition with an appropriate threshold on the mutual information between the model and the augmented training set created from $x_P$ and $x_V$. However, instead of requiring the single condition (18), we shall impose the stricter requirement to satisfy the two conditions (16) and (17) separately.
The new optimization problem is now
In (19) the distribution $p(\theta \mid x_P, x_V)$ incorporates the dependence of the trained model on the validation set. This dependence arises because we train by running a learning algorithm on (a subset of) the training set, validate the (partially) trained model on the validation set, run the learning algorithm again on (another subset of) the training set, validate the (partially) trained model again on the validation set, and so on, until the performance of the model on both the training and validation sets satisfies some criteria, which we represent here by (16) and (17) respectively.
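The alternating train/validate procedure just described can be sketched as follows. The `train_step` and `validate` functions below are hypothetical stand-ins for a real learning algorithm and validation metric; the point is only the control flow (here, early stopping), through which the returned parameters come to depend on the validation set as well as the training set:

```python
def fit(train_set, val_set, n_epochs=50, patience=3):
    """Alternate training steps on x_P with validation checks on x_V."""
    model = {"weight": 0.0}               # toy stand-in for the parameters theta
    best_score, best_model, stalls = float("-inf"), dict(model), 0
    for epoch in range(n_epochs):
        train_step(model, train_set)      # update theta using the training set
        score = validate(model, val_set)  # evaluate theta on the validation set
        if score > best_score:
            best_score, best_model, stalls = score, dict(model), 0
        else:
            stalls += 1
            if stalls >= patience:        # early stopping: theta now depends
                break                     # on x_V through this decision
    return best_model

def train_step(model, train_set):
    # Hypothetical update: move the weight toward the training-set mean.
    target = sum(train_set) / len(train_set)
    model["weight"] += 0.5 * (target - model["weight"])

def validate(model, val_set):
    # Hypothetical score: negative squared error against the validation mean.
    target = sum(val_set) / len(val_set)
    return -(model["weight"] - target) ** 2

model = fit([1.0, 2.0, 3.0], [2.0, 4.0])
```

Because both the early-stopping decision and the choice of which snapshot to return consult the validation score, the output distribution is $p(\theta \mid x_P, x_V)$ rather than $p(\theta \mid x_P)$.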
4.3 Modified Predictive Information Bottleneck Objective with Inequality Constraint and Validation Set
From the Markov property of the information processing chain $x_F \leftrightarrow (x_P, x_V) \leftrightarrow \theta$, we have $I(\theta; x_F \mid x_P, x_V) = 0$, so we have
4.4 Variational Approximations
Treat the distribution $p(x_F \mid \theta)$ as a "likelihood" function and use a variational approximation for it given by $q(x_F \mid \theta)$, for example the factorized form $q(x_F \mid \theta) = \prod_i q\big(x_F^{(i)} \mid \theta\big)$ for some selected distribution $q(\cdot \mid \theta)$. Then we obtain (7):
Treat the conditional distribution $p(x_V \mid \theta, x_P)$ as a conditional likelihood function and use a variational approximation for it given by $q(x_V \mid \theta, x_P)$, for example the factorized form $q(x_V \mid \theta, x_P) = \prod_i q\big(x_V^{(i)} \mid \theta, x_P\big)$ for some selected distribution $q(\cdot \mid \theta, x_P)$. Then we obtain
where for any two random variables $X$ and $Y$, $H(Y \mid X)$ is the conditional entropy:
$$
H(Y \mid X) = -\mathbb{E}_{p(x, y)}\big[ \log p(y \mid x) \big].
$$
Note that the process of training on $x_P$ along with periodic validation using the validation set $x_V$ induces a dependence between $x_P$ and $x_V$ through the model $\theta$, hence it is not true in general that $p(x_V \mid \theta, x_P) = p(x_V \mid \theta)$, even if $x_P$ and $x_V$ are assumed independent.
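A minimal numerical illustration of this point (the XOR "model" below is a deliberately simple stand-in for $\theta$, not part of the formulation): two marginally independent bits become dependent once we condition on a statistic computed from both of them.

```python
import itertools

# x_P, x_V: independent fair bits; theta is deterministic given both,
# e.g. theta = x_P XOR x_V
joint = {}
for xp, xv in itertools.product((0, 1), repeat=2):
    theta = xp ^ xv
    joint[(xp, xv, theta)] = 0.25

# Condition on the event theta = 0
p_theta0 = sum(v for (xp, xv, t), v in joint.items() if t == 0)
cond = {(xp, xv): v / p_theta0
        for (xp, xv, t), v in joint.items() if t == 0}

p_xp0 = sum(v for (xp, _), v in cond.items() if xp == 0)   # marginal of x_P | theta=0
p_xv0 = sum(v for (_, xv), v in cond.items() if xv == 0)   # marginal of x_V | theta=0
# Given theta = 0: p(x_P=0, x_V=0) = 0.5, but p(x_P=0) p(x_V=0) = 0.25,
# so x_P and x_V are dependent conditional on theta.
```

This is the standard "explaining away" effect: conditioning on a common effect couples otherwise independent causes.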
4.5 Variational Formulation of the Predictive Information Bottleneck Objective with Inequality Constraint and a Validation Set
where in the final step we use the identity that for any two random variables $X$ and $Y$,
$$
I(X; Y) = H(Y) - H(Y \mid X).
$$
We treat the last three terms on the right-hand side of (27) as constants outside our control. Since the exact problem (22) cannot be solved directly, we simply select the Lagrange multipliers and the variational distributions $q(\theta)$, $q(x_F \mid \theta)$, and $q(x_V \mid \theta, x_P)$ based on some external criteria and solve the following problem, whose optimum value yields an upper bound on the exact objective in (22):
and we emphasize that the expectation in (28) is over the true distributions of $\theta$, $(x_P, x_V)$, and $x_F$. Note that the use of the variational approximation $q(x_F \mid \theta)$ has eliminated the dependence on the true likelihood $p(x_F \mid \theta)$.
It follows from (29) that the optimum distribution $p^*(\theta \mid x_P, x_V)$ is given by
[1] A. Zellner, "Optimal Information Processing and Bayes's Theorem," The American Statistician, vol. 42, no. 4, pp. 278–280, Nov. 1988.
[2] A. A. Alemi, "Variational Predictive Information Bottleneck," arXiv:1910.10831, 2019. https://arxiv.org/abs/1910.10831