# The Structure of the Realizations of the Causal Information Rate-Distortion Function for Markovian Sources: Realizations with Densities

The main purpose of this note is to show that in a realization $(x_1^n, y_1^n)$ of the causal information rate-distortion function (IRDF) for a $\kappa$-th order Markovian source $x_1^n$, under a single-letter sum distortion constraint, the smallest integer $\ell$ for which the Markov chain $y_k \longleftrightarrow (y_1^{k-1}, x_{k-\ell+1}^{k}) \longleftrightarrow x_1^{k-\ell}$ holds is $\ell = \kappa$. This result is derived under the assumption that the sequences $(x_1^n, y_1^n)$ have a joint probability density function.


## I Introduction

Consider the causal information rate-distortion function (IRDF) for a random source $x_1^n$, defined as

$$R^{it}_{c,n}(D) \triangleq \frac{1}{n}\,\inf\, I(x_1^n; y_1^n), \tag{1}$$

where the minimization is over all conditional PDFs $f_{y_1^n|x_1^n}$ satisfying the distortion constraint

$$\frac{1}{n}\,\mathbb{E}\left[\sum_{i=1}^{n} \rho(x_i, y_i)\right] \le D \tag{2}$$

and the causality Markov chains

$$y_1^i \longleftrightarrow x_1^i \longleftrightarrow x_{i+1}^n, \qquad i = 1, \ldots, n. \tag{3}$$

If the infimum is achieved by some conditional distribution, the associated pair of sequences $(x_1^n, y_1^n)$ is called a realization of $R^{it}_{c,n}(D)$. Here we assume that such a distribution exists and that the corresponding realization has a joint PDF. This assumption is satisfied if, for example, $x_1^n$ is Gaussian and $\rho$ is the squared-error distortion.
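As a brief illustration (our addition, assuming squared-error distortion and $n = 1$, where no causality constraints arise and the notation $s$, $f_y$ matches the proof below), the optimality conditions produce a conditional of exponentially tilted form:

```latex
% Sketch (not part of the original argument): n = 1, quadratic distortion.
% The optimal test channel has the tilted form
\[
  f_{y|x}(y \mid x)
  \;=\;
  \frac{e^{-s\,(x-y)^2}\, f_{y}(y)}
       {\displaystyle\int e^{-s\,(x-y)^2}\, f_{y}(y)\, dy }.
\]
% If f_y is a Gaussian density, the product e^{-s(x-y)^2} f_y(y) is an
% (unnormalized) Gaussian in y, so f_{y|x} is Gaussian as well and the
% pair (x, y) has a joint PDF, consistent with the standing assumption.
```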

The first purpose of this note is to show that in a realization of the causal IRDF for a $\kappa$-th order Markovian source $x_1^n$, under the average distortion constraint (2), and supposing that in such a realization the sequences $(x_1^n, y_1^n)$ have a joint PDF, it holds that

$$f_{y_k|x_1^n, y_1^{k-1}}(y_k|x_1^n, y_1^{k-1}) = \frac{e^{-s\rho(x_k, y_k)}\, \breve{F}_k(x_{k-\kappa+1}^k, y_1^k)}{\int e^{-s\rho(x_k, y_k)}\, \breve{F}_k(x_{k-\kappa+1}^k, y_1^k)\, dy_k} \tag{4a}$$

where $f_{x_1^n}$ is the PDF of $x_1^n$ and

$$\breve{F}_k(x_{k-\kappa+1}^k, y_1^k) = \exp\left\{ \int \ln\left( \int e^{-s\rho(x_{k+1}, y_{k+1})}\, \breve{F}_{k+1}(x_{k-\kappa+2}^{k+1}, y_1^{k+1})\, dy_{k+1} \right) f_{x_{k+1}^n|x_{k-\kappa+1}^k}(x_{k+1}^n|x_{k-\kappa+1}^k)\, dx_{k+1}^n \right\} \tag{4b}$$

The expressions given in (4) are a special case of those in [1, equations (16), (17), (18)] for abstract spaces, where their derivation is not included. The value of our first result resides in the following:

• We provide a proof for the validity of (4) (absent in [1]).

• In this proof, we pose the causal IRDF optimization problem with $f_{y_1^n|x_1^n}$ as the decision variable (instead of the collection of conditional PDFs $\{f_{y_k|x_1^k, y_1^{k-1}}\}_{k=1}^n$, as would be the case in [1] for probability measures having an associated PDF). Accordingly, we impose an explicit causality constraint on $f_{y_1^n|x_1^n}$, instead of enforcing causality structurally by restricting $f_{y_1^n|x_1^n}$ to be the product of these conditionals, as done in [1, 2].

The second (and main) goal of this document is to note that from (4a) it is clear that

$$y_k \longleftrightarrow (y_1^{k-1}, x_{k-\kappa+1}^{k}) \longleftrightarrow x_1^{k-\kappa} \tag{5}$$

holds, and that

$$y_k \longleftrightarrow (y_1^{k-1}, x_k) \longleftrightarrow x_1^{k-1} \tag{6}$$

does not hold, except when $\kappa = 1$. Crucially, (6) does not become true by supposing that the joint PDF of $(x_1^n, y_1^n)$ is stationary, thus contradicting [2, Remark IV.5] and what is stated in the discussion paragraph at the end of [1, Section V].
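To make the contrast between (5) and (6) concrete, note that for $\kappa = 1$ (a first-order Markov source) the conditioning window $x_{k-\kappa+1}^{k}$ in (5) collapses to the single sample $x_k$:

```latex
% For kappa = 1, the chain (5) reads
\[
  y_k \;\longleftrightarrow\; \big(y_1^{k-1},\, x_k\big)
  \;\longleftrightarrow\; x_1^{k-1},
\]
% which is exactly (6). For kappa >= 2, the window x_{k-kappa+1}^{k}
% in (5) cannot be shortened to x_k alone, which is why (6) fails.
```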

## II Proof

The causal IRDF under the above conditions is given by the solution to the following optimization problem:

$$\text{minimize:}\quad I(x_1^n; y_1^n) \tag{7a}$$

$$\text{subject to:}\quad \left( \int f_{y_1^n|x_1^n}(y_1^n|x_1^n)\, dy_1^n - 1 \right) f_{x_1^n}(x_1^n) = 0, \qquad \forall\, x_1^n \tag{7b}$$

$$\iint f_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n) \sum_{k=1}^{n} \rho(x_k, y_k)\, dy_1^n\, dx_1^n \le D \tag{7c}$$

$$\left( f_{y_1^k|x_1^k}(y_1^k|x_1^k) - f_{y_1^k|x_1^n}(y_1^k|x_1^n) \right) f_{x_1^n}(x_1^n) = 0, \qquad \forall\, y_1^k, x_1^n,\ k = 1, \ldots, n, \tag{7d}$$

where the minimization is over the conditional PDF $f_{y_1^n|x_1^n}$. Notice that (7d) is an explicit causality constraint equivalent to (3).

Let $f'_{y_1^n|x_1^n}$ be any conditional PDF, and define

$$g_{y_1^n|x_1^n} \triangleq f'_{y_1^n|x_1^n} - f_{y_1^n|x_1^n} \tag{8}$$

$$g_{y_1^n}(y_1^n) \triangleq \int g_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n)\, dx_1^n \tag{9}$$

$$f^{\varepsilon}_{y_1^n|x_1^n} \triangleq f_{y_1^n|x_1^n} + \varepsilon\, g_{y_1^n|x_1^n} \tag{10}$$

$$f^{\varepsilon}_{y_1^n}(y_1^n) \triangleq \int f^{\varepsilon}_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n)\, dx_1^n \tag{11}$$

where $\varepsilon \in [0, 1]$.

Before writing the Lagrangian and taking its Gateaux differential, let us obtain the Gateaux differential of $I(x_1^n; y_1^n)$ in the direction $g_{y_1^n|x_1^n}$, given by

$$\left.\frac{dI(x_1^n; y_1^n)}{d\varepsilon}\right|_{\varepsilon=0} = \left.\frac{d}{d\varepsilon}\left[ \iint f^{\varepsilon}_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n) \ln\left( \frac{f^{\varepsilon}_{y_1^n|x_1^n}(y_1^n|x_1^n)}{f^{\varepsilon}_{y_1^n}(y_1^n)} \right) dy_1^n\, dx_1^n \right]\right|_{\varepsilon=0} \tag{12}$$

$$= \iint g_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n) \ln\left( \frac{f_{y_1^n|x_1^n}(y_1^n|x_1^n)}{f_{y_1^n}(y_1^n)} \right) dy_1^n\, dx_1^n + R \tag{13}$$

where

$$R \triangleq \iint f_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n) \left( \frac{g_{y_1^n|x_1^n}(y_1^n|x_1^n)}{f_{y_1^n|x_1^n}(y_1^n|x_1^n)} - \frac{g_{y_1^n}(y_1^n)}{f_{y_1^n}(y_1^n)} \right) dy_1^n\, dx_1^n \tag{14}$$

$$= \iint g_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n)\, dy_1^n\, dx_1^n - \iint f_{y_1^n, x_1^n}(y_1^n, x_1^n)\, \frac{g_{y_1^n}(y_1^n)}{f_{y_1^n}(y_1^n)}\, dy_1^n\, dx_1^n \tag{15}$$

$$= \int g_{y_1^n}(y_1^n)\, dy_1^n - \int \frac{g_{y_1^n}(y_1^n)}{f_{y_1^n}(y_1^n)} \left( \int f_{y_1^n, x_1^n}(y_1^n, x_1^n)\, dx_1^n \right) dy_1^n \tag{16}$$

$$= 0, \tag{17}$$

since the inner integral in (16) equals $f_{y_1^n}(y_1^n)$, so both terms reduce to $\int g_{y_1^n}(y_1^n)\, dy_1^n$, which is zero because $g_{y_1^n|x_1^n}$ is a difference of conditional PDFs.
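The cancellation $R = 0$, and hence formula (13), can be spot-checked numerically in a discrete analogue. The sketch below is a toy example of our own (the two-letter alphabets and specific PMFs are invented for illustration): it compares a finite-difference derivative of the mutual information against the closed-form directional derivative from (13) with $R = 0$.

```python
import math

# Toy numerical check (our addition) of the Gateaux differential formula
# (13) with R = 0, in a discrete analogue: binary source and output.
# All distributions below are invented for illustration.

px = [0.3, 0.7]                      # source PMF f_x
f  = [[0.8, 0.2], [0.4, 0.6]]        # conditional f(y|x)
fp = [[0.6, 0.4], [0.5, 0.5]]        # another conditional f'(y|x)
g  = [[fp[x][y] - f[x][y] for y in range(2)] for x in range(2)]  # direction, cf. (8)

def mutual_info(cond):
    """I(x; y) in nats for the conditional PMF cond[x][y]."""
    fy = [sum(px[x] * cond[x][y] for x in range(2)) for y in range(2)]
    return sum(px[x] * cond[x][y] * math.log(cond[x][y] / fy[y])
               for x in range(2) for y in range(2))

# Closed-form directional derivative from (13), with the term R equal to 0:
fy = [sum(px[x] * f[x][y] for x in range(2)) for y in range(2)]
analytic = sum(g[x][y] * px[x] * math.log(f[x][y] / fy[y])
               for x in range(2) for y in range(2))

# Finite-difference derivative along f + eps * g, cf. (10)-(12):
eps = 1e-6
fe = [[f[x][y] + eps * g[x][y] for y in range(2)] for x in range(2)]
numeric = (mutual_info(fe) - mutual_info(f)) / eps

print(abs(analytic - numeric) < 1e-4)  # prints True
```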

On the other hand, for each $i \in \{1, \ldots, n\}$, the causality constraint (7d) appears in the Lagrangian as

$$\iint \lambda_i(x_1^n, y_1^i) \left[ f_{y_1^i|x_1^i}(y_1^i|x_1^i) - f_{y_1^i|x_1^n}(y_1^i|x_1^n) \right] f_{x_1^n}(x_1^n)\, dy_1^i\, dx_1^n \tag{18}$$

$$= \iint \lambda_i(x_1^n, y_1^i) \left( \int \left[ f_{y_1^n|x_1^i}(y_1^n|x_1^i) - f_{y_1^n|x_1^n}(y_1^n|x_1^n) \right] dy_{i+1}^n \right) f_{x_1^n}(x_1^n)\, dy_1^i\, dx_1^n \tag{19}$$

$$= \int \left( \int \lambda_i(x_1^n, y_1^i)\, f_{y_1^n|x_1^i}(y_1^n|x_1^i)\, f_{x_1^n}(x_1^n)\, dx_1^n - \int \lambda_i(x_1^n, y_1^i)\, f_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n)\, dx_1^n \right) dy_1^n \tag{20}$$

It will be convenient to manipulate this expression so as to give it a structure similar to the other terms in the Lagrangian. For this purpose, notice that

$$\int \lambda_i(x_1^n, y_1^i)\, f_{y_1^n|x_1^i}(y_1^n|x_1^i)\, f_{x_1^n}(x_1^n)\, dx_1^n \tag{21}$$

$$= \int \lambda_i(x_1^n, y_1^i)\, f_{y_1^n, x_1^i}(y_1^n, x_1^i)\, f_{x_{i+1}^n|x_1^i}(x_{i+1}^n|x_1^i)\, dx_1^n \tag{22}$$

$$= \int f_{y_1^n, x_1^i}(y_1^n, x_1^i) \left( \int \lambda_i(x_1^n, y_1^i)\, f_{x_{i+1}^n|x_1^i}(x_{i+1}^n|x_1^i)\, dx_{i+1}^n \right) dx_1^i \tag{23}$$

$$= \int f_{y_1^n, x_1^i}(y_1^n, x_1^i)\, \bar{\lambda}_i(x_1^i, y_1^i)\, dx_1^i \tag{24}$$

$$= \int \left( \int f_{y_1^n, x_1^n}(y_1^n, x_1^n)\, dx_{i+1}^n \right) \bar{\lambda}_i(x_1^i, y_1^i)\, dx_1^i \tag{25}$$

$$= \int f_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n)\, \bar{\lambda}_i(x_1^i, y_1^i)\, dx_1^n \tag{26}$$

where

$$\bar{\lambda}_i(x_1^i, y_1^i) \triangleq \int \lambda_i(x_1^n, y_1^i)\, f_{x_{i+1}^n|x_1^i}(x_{i+1}^n|x_1^i)\, dx_{i+1}^n, \qquad i = 1, \ldots, n. \tag{27}$$

Substituting this into (20) we obtain

$$\iint \lambda_i(x_1^n, y_1^i) \left( f_{y_1^i|x_1^i}(y_1^i|x_1^i) - f_{y_1^i|x_1^n}(y_1^i|x_1^n) \right) f_{x_1^n}(x_1^n)\, dy_1^i\, dx_1^n \tag{28}$$

$$= \iint \left( \bar{\lambda}_i(x_1^i, y_1^i) - \lambda_i(x_1^n, y_1^i) \right) f_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n)\, dy_1^n\, dx_1^n \tag{29}$$

We can now write the Lagrangian associated with optimization problem (7) as

$$\mathcal{L}(f_{y_1^n|x_1^n}) \triangleq I(x_1^n; y_1^n) + \int \eta(x_1^n) \left( \int f_{y_1^n|x_1^n}(y_1^n|x_1^n)\, dy_1^n - 1 \right) f_{x_1^n}(x_1^n)\, dx_1^n \tag{30}$$

$$+ s \left( \iint f_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n) \left( \sum_{i=1}^{n} \rho(x_i, y_i) \right) dx_1^n\, dy_1^n - D \right) \tag{31}$$

$$+ \sum_{i=1}^{n} \iint \left( \bar{\lambda}_i(x_1^i, y_1^i) - \lambda_i(x_1^n, y_1^i) \right) f_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n)\, dy_1^n\, dx_1^n \tag{32}$$

From the theory of Lagrangian optimization on vector spaces [3], $f_{y_1^n|x_1^n}$ is a solution to Optimization Problem (7) only if

$$0 = \left.\frac{d}{d\varepsilon}\, \mathcal{L}\big(f^{\varepsilon}_{y_1^n|x_1^n}\big)\right|_{\varepsilon=0} \tag{33}$$

$$= \iint \left[ \ln\left( \frac{f_{y_1^n|x_1^n}(y_1^n|x_1^n)}{f_{y_1^n}(y_1^n)} \right) + \eta(x_1^n) + \sum_{i=1}^{n} \left( s\rho(x_i, y_i) + \bar{\lambda}_i(x_1^i, y_1^i) - \lambda_i(x_1^n, y_1^i) \right) \right] g_{y_1^n|x_1^n}(y_1^n|x_1^n)\, f_{x_1^n}(x_1^n)\, dy_1^n\, dx_1^n \tag{34}$$

for every function $g_{y_1^n|x_1^n}$ as defined in (8), i.e., for every conditional PDF $f'_{y_1^n|x_1^n}$. This holds if and only if, for every $x_1^n, y_1^n$:

$$\ln\left( \frac{f_{y_1^n|x_1^n}(y_1^n|x_1^n)}{f_{y_1^n}(y_1^n)} \right) = -\eta(x_1^n) - \sum_{i=1}^{n} \left( s\rho(x_i, y_i) + \bar{\lambda}_i(x_1^i, y_1^i) - \lambda_i(x_1^n, y_1^i) \right) \tag{35}$$

$$\iff f_{y_1^n|x_1^n}(y_1^n|x_1^n) = e^{-\eta(x_1^n) - \sum_{i=1}^{n} \left( s\rho(x_i, y_i) + \bar{\lambda}_i(x_1^i, y_1^i) - \lambda_i(x_1^n, y_1^i) \right)}\, f_{y_1^n}(y_1^n) \tag{36}$$

The Lagrange multiplier function $\eta(x_1^n)$ must enforce the normalization constraint (7b). Hence,

$$f_{y_1^n|x_1^n}(y_1^n|x_1^n) = \frac{e^{-\sum_{i=1}^{n} \left( s\rho(x_i, y_i) + \bar{\lambda}_i(x_1^i, y_1^i) - \lambda_i(x_1^n, y_1^i) \right)}\, f_{y_1^n}(y_1^n)}{K_1(x_1^n)}, \tag{37}$$

where

$$K_1(x_1^n) \triangleq \int e^{-\sum_{i=1}^{n} \left( s\rho(x_i, y_i) + \bar{\lambda}_i(x_1^i, y_1^i) - \lambda_i(x_1^n, y_1^i) \right)}\, f_{y_1^n}(y_1^n)\, dy_1^n. \tag{38}$$
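For $n = 1$ the causality multipliers are absent and (37)–(38) reduce to $f_{y|x}(y|x) = e^{-s\rho(x,y)} f_y(y) / K_1(x)$. The following minimal discrete sketch (the alphabets, the value of $s$, the marginal $f_y$, and the distortion $\rho$ are our own illustrative choices) builds this tilted conditional and checks its normalization.

```python
import math

# Discrete n = 1 instance of the optimal form (37)-(38):
#   f(y|x) = exp(-s * rho(x, y)) * f(y) / K1(x).
# Alphabets, s, f(y) and rho below are illustrative assumptions.

s = 1.5
xs, ys = [0, 1], [0, 1]
fy = [0.4, 0.6]                          # output marginal f(y)

def rho(x, y):
    """Single-letter squared-error distortion."""
    return (x - y) ** 2

def K1(x):
    """Normalizer K_1(x), as in (38)."""
    return sum(math.exp(-s * rho(x, y)) * fy[y] for y in ys)

f_cond = [[math.exp(-s * rho(x, y)) * fy[y] / K1(x) for y in ys] for x in xs]

# Each row of f_cond is a valid conditional PMF ...
print(all(abs(sum(row) - 1.0) < 1e-12 for row in f_cond))  # prints True
# ... and the exponential tilting pulls mass toward y = x:
print(f_cond[0][0] > fy[0] and f_cond[1][1] > fy[1])       # prints True
```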

Marginalizing over $y_{k+1}^n$ we obtain

$$f_{y_1^k|x_1^n}(y_1^k|x_1^n) = \frac{e^{-\sum_{i=1}^{k} \left( s\rho(x_i, y_i) + \bar{\lambda}_i(x_1^i, y_1^i) - \lambda_i(x_1^n, y_1^i) \right)} \int e^{-\sum_{i=k+1}^{n} \left( s\rho(x_i, y_i) + \bar{\lambda}_i(x_1^i, y_1^i) - \lambda_i(x_1^n, y_1^i) \right)}\, f_{y_1^n}(y_1^n)\, dy_{k+1}^n}{K_1(x_1^n)}. \tag{39}$$

Using Bayes’ rule we can write

$$f_{y_k|x_1^n, y_1^{k-1}}(y_k|x_1^n, y_1^{k-1}) = \frac{f_{y_1^k|x_1^n}(y_1^k|x_1^n)}{f_{y_1^{k-1}|x_1^n}(y_1^{k-1}|x_1^n)} \tag{40}$$

$$= \frac{e^{-s\rho(x_k, y_k)}\, F_k(x_1^n, y_1^k)}{\int e^{-s\rho(x_k, y_k)}\, F_k(x_1^n, y_1^k)\, dy_k}, \tag{41}$$

where

$$F_k(x_1^n, y_1^k) \triangleq e^{-\left( \bar{\lambda}_k(x_1^k, y_1^k) - \lambda_k(x_1^n, y_1^k) \right)} \int e^{-\sum_{i=k+1}^{n} \left( s\rho(x_i, y_i) + \bar{\lambda}_i(x_1^i, y_1^i) - \lambda_i(x_1^n, y_1^i) \right)}\, f_{y_1^n}(y_1^n)\, dy_{k+1}^n. \tag{42}$$

These functions can be written recursively as

$$F_n(y_1^n) = f_{y_1^n}(y_1^n) \tag{43a}$$

$$F_k(x_1^n, y_1^k) = e^{-\left( \bar{\lambda}_k(x_1^k, y_1^k) - \lambda_k(x_1^n, y_1^k) \right)} \int e^{-s\rho(x_{k+1}, y_{k+1})}\, F_{k+1}(x_1^n, y_1^{k+1})\, dy_{k+1} \tag{43b}$$

In order to attain causality in (41), the functions $F_k$ must depend on $x_1^n$ only through $x_1^k$ (and on $y_1^k$). Since the factor $e^{-s\rho(x_k, y_k)}$ in (41) already depends only on $(x_k, y_k)$, the causality constraint is met if and only if we choose $\lambda_k$ in (43b) such that, for each $k$,

$$F_k(x_1^n, y_1^k) = e^{-\left( \bar{\lambda}_k(x_1^k, y_1^k) - \lambda_k(x_1^n, y_1^k) \right)} \int e^{-s\rho(x_{k+1}, y_{k+1})}\, F_{k+1}(x_1^n, y_1^{k+1})\, dy_{k+1} = \breve{F}_k(x_1^k, y_1^k) \tag{44}$$

for some function $\breve{F}_k(x_1^k, y_1^k)$.

For $k = n$, the causality constraint is satisfied automatically, since $F_n(y_1^n) = f_{y_1^n}(y_1^n)$ (see (43a)). (This reflects the fact that there is no need to enforce the causality constraint for $i = n$, since there are no source samples after time $n$.) Suppose now that (44) (i.e., causality) is satisfied for $k+1$, for some $k < n$. In such a case, one can replace $F_{k+1}(x_1^n, y_1^{k+1})$ in (44) by $\breve{F}_{k+1}(x_1^{k+1}, y_1^{k+1})$ and, defining

$$K_{k+1}(x_1^{k+1}, y_1^k) \triangleq \int e^{-s\rho(x_{k+1}, y_{k+1})}\, \breve{F}_{k+1}(x_1^{k+1}, y_1^{k+1})\, dy_{k+1},$$

write (44) as

$$\bar{\lambda}_k(x_1^k, y_1^k) - \lambda_k(x_1^n, y_1^k) = \ln K_{k+1}(x_1^{k+1}, y_1^k) - \ln \breve{F}_k(x_1^k, y_1^k). \tag{45}$$

Multiplying both sides by $f_{x_{k+1}^n|x_1^k}(x_{k+1}^n|x_1^k)$ and integrating over $x_{k+1}^n$ we obtain

$$0 = \int \left( \bar{\lambda}_k(x_1^k, y_1^k) - \lambda_k(x_1^n, y_1^k) \right) f_{x_{k+1}^n|x_1^k}(x_{k+1}^n|x_1^k)\, dx_{k+1}^n \tag{46}$$

$$= \int \left( \ln K_{k+1}(x_1^{k+1}, y_1^k) - \ln \breve{F}_k(x_1^k, y_1^k) \right) f_{x_{k+1}^n|x_1^k}(x_{k+1}^n|x_1^k)\, dx_{k+1}^n \tag{47}$$