Evaluation of Error Probability of Classification Based on the Analysis of the Bayes Code
Suppose that we have two training sequences generated by parametrized distributions P_{θ_1^*} and P_{θ_2^*}, where θ_1^* and θ_2^* are unknown. Given the training sequences, we study the problem of classifying whether a test sequence was generated according to P_{θ_1^*} or P_{θ_2^*}. This problem can be viewed as a hypothesis testing problem, and we analyze the weighted sum of the type I and type II error probabilities. To prove the result, we use an analysis of the codeword lengths of the Bayes code. We show that the bound on the error probability is characterized by terms involving the Rényi divergence, the dimension of the parameter space, and the ratio between the lengths of the training sequences and the test sequence.
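To make the setting concrete, the following is a minimal sketch of a code-length-based classifier, not the procedure analyzed in the paper. It assumes binary sequences from a Bernoulli family and uses the Krichevsky-Trofimov (Jeffreys-prior) mixture as the Bayes code; the function names `bayes_conditional_code_length` and `classify` are hypothetical. The test sequence is assigned to the class whose training sequence gives the shorter conditional codeword length.

```python
import math


def bayes_conditional_code_length(train, test):
    """Ideal code length (bits) of `test` given `train` under the
    Krichevsky-Trofimov mixture for a Bernoulli source.
    Assumption: symbols are 0 or 1; the prior is Beta(1/2, 1/2)."""
    ones = sum(train)
    zeros = len(train) - ones
    bits = 0.0
    for symbol in test:
        # Bayes predictive probability of a 1 given the counts so far.
        p_one = (ones + 0.5) / (ones + zeros + 1.0)
        p = p_one if symbol == 1 else 1.0 - p_one
        bits -= math.log2(p)
        # Update the counts with the symbol just encoded.
        if symbol == 1:
            ones += 1
        else:
            zeros += 1
    return bits


def classify(train_1, train_2, test):
    """Return 1 or 2, picking the class with the shorter conditional
    code length for `test` (ties go to class 1)."""
    len_1 = bayes_conditional_code_length(train_1, test)
    len_2 = bayes_conditional_code_length(train_2, test)
    return 1 if len_1 <= len_2 else 2


if __name__ == "__main__":
    train_1 = [1] * 18 + [0] * 2   # mostly ones
    train_2 = [0] * 18 + [1] * 2   # mostly zeros
    test = [1, 1, 1, 0, 1, 1]
    print(classify(train_1, train_2, test))
```

Comparing the two code lengths amounts to a likelihood-ratio test with the unknown parameters replaced by Bayes mixtures, which is one natural way to read the hypothesis-testing view described above.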