Mismatched Guesswork
We study the problem of mismatched guesswork, where we evaluate the number of symbols y ∈ Y which have higher likelihood than X ∼ μ according to a mismatched distribution ν. We discuss the role of the tilted/exponential families of the source distribution μ and of the mismatched distribution ν. We show that the value of guesswork can be characterized using the tilted family of the mismatched distribution ν, while the probability of guessing is characterized by an exponential family which passes through μ. Using this characterization, we demonstrate that the mismatched guesswork follows a large deviation principle (LDP), where the rate function is described implicitly using information-theoretic quantities. We apply these results to one-to-one source coding (without the prefix-free constraint) to obtain the cost of mismatch in terms of average codeword length. We show that the cost of mismatch in one-to-one codes is no larger than that of prefix-free codes, i.e., D(μ‖ν). Further, the cost of mismatch vanishes if and only if ν lies on the tilted family of the true distribution μ, which is in stark contrast to prefix-free codes. These results imply that one-to-one codes are inherently more robust to mismatch.
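The quantities named in the abstract can be illustrated on a toy single-symbol alphabet. The following is a minimal sketch, not the paper's method: all function names are hypothetical, the guesswork is taken as the rank of a symbol when guessing in decreasing order of ν, and the one-to-one code is assumed to give the k-th guessed symbol the k-th shortest binary string (empty, 0, 1, 00, ...), whose length is floor(log2 k). The tilted family is assumed to be ν_β(x) ∝ μ(x)^β with β > 0. The paper's results concern asymptotic behaviour, so the single-shot numbers here are only indicative.

```python
import math

def guess_ranks(nu):
    """Rank of each symbol (1 = guessed first) when guessing in decreasing nu order."""
    order = sorted(nu, key=lambda s: -nu[s])  # stable sort: ties keep insertion order
    return {s: i + 1 for i, s in enumerate(order)}

def expected_guesswork(mu, nu):
    """Average number of guesses when X ~ mu but the guessing order follows nu."""
    ranks = guess_ranks(nu)
    return sum(mu[s] * ranks[s] for s in mu)

def expected_one_to_one_length(mu, nu):
    """Average length (bits) of a one-to-one code built from the nu ordering.

    Assumption: the k-th guessed symbol gets the k-th shortest binary string,
    which has length floor(log2 k) = k.bit_length() - 1.
    """
    ranks = guess_ranks(nu)
    return sum(mu[s] * (ranks[s].bit_length() - 1) for s in mu)

def kl_divergence_bits(mu, nu):
    """D(mu || nu) in bits, the mismatch cost referenced for prefix-free codes."""
    return sum(p * math.log2(p / nu[s]) for s, p in mu.items() if p > 0)

def tilt(mu, beta):
    """Tilted distribution proportional to mu(x)**beta (assumed form of the family)."""
    weights = {s: p ** beta for s, p in mu.items()}
    z = sum(weights.values())
    return {s: w / z for s, w in weights.items()}

# Toy example (made-up numbers): nu reverses the ordering of mu.
mu = {"a": 0.5, "b": 0.25, "c": 0.15, "d": 0.10}
nu = {"a": 0.1, "b": 0.2, "c": 0.3, "d": 0.4}

cost = expected_one_to_one_length(mu, nu) - expected_one_to_one_length(mu, mu)
print("expected guesswork, matched   :", expected_guesswork(mu, mu))
print("expected guesswork, mismatched:", expected_guesswork(mu, nu))
print("one-to-one mismatch cost (bits):", cost)
print("D(mu || nu) (bits)             :", kl_divergence_bits(mu, nu))

# A tilted version of mu keeps the same ordering, so the guessing order is unchanged.
nu_tilted = tilt(mu, 2.0)
print("cost with tilted nu (bits)     :",
      expected_one_to_one_length(mu, nu_tilted) - expected_one_to_one_length(mu, mu))
```

Because tilting with β > 0 preserves the ordering of symbols, the tilted ν produces exactly the same guessing order as μ in this sketch, which is the single-shot intuition behind the vanishing mismatch cost on the tilted family.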