Asymptotic Normality for Plug-in Estimators of Generalized Shannon's Entropy
Shannon's entropy is one of the building blocks of information theory and an essential component of machine learning methods (e.g., Random Forests). Yet it is finite only for distributions with fast-decaying tails on a countable alphabet. Because Shannon's entropy is unbounded over the general class of all distributions on an alphabet, its potential utility cannot be fully realized. To fill this void in the foundation of information theory, Zhang (2020) proposed generalized Shannon's entropy, which is finitely defined everywhere. The plug-in estimator, adopted in almost all entropy-based ML method packages, is one of the most popular approaches to estimating Shannon's entropy. The asymptotic distribution of the plug-in estimator of Shannon's entropy has been well studied in the existing literature. This paper studies the asymptotic properties of the plug-in estimator of generalized Shannon's entropy on countable alphabets. The developed asymptotic properties require no assumptions on the original distribution, and they allow interval estimation and statistical tests with generalized Shannon's entropy.
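As a rough illustration of what "plug-in" means here, the sketch below replaces the unknown probabilities with observed relative frequencies and evaluates the entropy formula on them. The function names are hypothetical. The second function rests on an assumption not stated in the abstract: that generalized Shannon's entropy of order m is the Shannon entropy of the distribution with probabilities proportional to p_k^m. Readers should check this against the definition in Zhang (2020) before relying on it.

```python
import math
from collections import Counter


def plugin_entropy(sample):
    """Plug-in estimate of Shannon's entropy (in nats) from a list of symbols."""
    n = len(sample)
    counts = Counter(sample)
    # Replace each unknown p_k with its relative frequency c_k / n.
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def plugin_generalized_entropy(sample, m=2):
    """Plug-in estimate of an order-m generalized entropy (in nats).

    ASSUMPTION (not taken from the abstract): the order-m generalized
    entropy is the Shannon entropy of the distribution with probabilities
    p_k**m / sum_i p_i**m.
    """
    n = len(sample)
    counts = Counter(sample)
    # Raise each relative frequency to the power m, then renormalize.
    weights = [(c / n) ** m for c in counts.values()]
    total = sum(weights)
    return -sum((w / total) * math.log(w / total) for w in weights)
```

For example, `plugin_entropy(list("aabbbc"))` estimates the entropy from the observed frequencies 2/6, 3/6, and 1/6. Interval estimation of the kind the abstract mentions would additionally need the asymptotic variance derived in the paper, which this sketch does not attempt.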