Rate-Distortion Theory for General Sets and Measures
This paper is concerned with a rate-distortion theory for sequences of i.i.d. random variables of general distribution supported on general sets, including manifolds and fractal sets. Manifold structures are prevalent in data science, e.g., in compressed sensing, machine learning, image processing, and handwritten digit recognition. Fractal sets find application in image compression and in the modeling of Ethernet traffic. We derive a lower bound on the (single-letter) rate-distortion function that applies to random variables X of general distribution and, for continuous X, reduces to the classical Shannon lower bound. Moreover, our lower bound is explicit up to a parameter obtained by solving a convex optimization problem in a nonnegative real variable. The only requirement for the bound to apply is the existence of a sigma-finite reference measure for X satisfying a certain subregularity condition. This condition is very general and prevents the reference measure from being highly concentrated on balls of small radii. To illustrate the wide applicability of our result, we evaluate the lower bound for a random variable distributed uniformly on a manifold, namely, the unit circle, and for a random variable distributed uniformly on a self-similar set, namely, the middle-third Cantor set.
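For orientation, the classical Shannon lower bound that the abstract refers to can be written out in its textbook form. The sketch below assumes a specific setting that the abstract does not state: X takes values in R^n, has a Lebesgue density with finite differential entropy h(X), and distortion is measured by (unnormalized) squared error. The symbols n, D, Z, and R_SLB are introduced here for illustration only and are not taken from the paper.

```latex
% Classical Shannon lower bound (textbook form, squared-error distortion).
% Assumptions (not from the abstract): X \in \mathbb{R}^n has a density,
% h(X) is finite, and d(x,\hat{x}) = \|x - \hat{x}\|^2.
\begin{align*}
  R(D) \;\ge\; R_{\mathrm{SLB}}(D)
    &= h(X) \;-\; \max_{Z \,:\, \mathbb{E}\|Z\|^2 \le D} h(Z) \\
    &= h(X) \;-\; \frac{n}{2}\,\log\!\Big(\frac{2\pi e\, D}{n}\Big),
\end{align*}
% where the maximum is attained by Z \sim \mathcal{N}(0, (D/n)\, I_n).
```

In this form the bound requires a differential entropy with respect to Lebesgue measure, which is not available for distributions supported on sets such as the unit circle or the Cantor set; the abstract's reference measure takes the place of Lebesgue measure in the general setting.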