Consider the causal information rate-distortion function (IRDF) for a random source , defined as
where the minimization is over all conditional PDFs satisfying the distortion constraint
and the causality Markov chains
If the infimum is achieved by some conditional distribution, the associated pair of sequences is called a realization of . Here we assume that such a distribution exists and that the corresponding realization has a joint PDF. This assumption is satisfied if, for example, is Gaussian and .
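Since the display equations are missing from this copy, the following is only a hedged sketch of the typical form of this definition, written in assumed notation (the source block $x^k = (x(1),\dots,x(k))$, its reconstruction $y^k$, and the distortion budget $D$ are placeholders, not the original's symbols):

```latex
% Hedged sketch (assumed notation): the causal IRDF minimizes normalized
% mutual information subject to an average distortion budget D and
% per-sample causality Markov chains.
R^{it}_{c}(D) \;\triangleq\; \min_{f(y^k \mid x^k)} \;
\tfrac{1}{k}\, I(x^k ; y^k)
\quad \text{s.t.} \quad
\tfrac{1}{k} \sum_{i=1}^{k} \mathbb{E}\!\left[ d\big(x(i), y(i)\big) \right] \le D,
\qquad
x_{i+1}^{k} \leftrightarrow x^{i} \leftrightarrow y^{i},
\;\; i = 1,\dots,k-1 .
```

The Markov chains formalize causality: given the source samples observed up to time $i$, the reconstructions produced so far carry no additional information about future source samples.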
The first purpose of this note is to show that in a realization of the causal IRDF for a -th order Markovian source , under the average distortion constraint (2), and supposing that in such a realization the sequences have a joint PDF, it holds that
where is the PDF of and
The expressions given in (4) are a special case of the ones given by [1, equations (16), (17), (18)] for abstract spaces, where their derivation is not included. The value of our first result lies in the fact that
In this proof, we pose the causal IRDF optimization problem with as the decision variable (instead of the collection as would be the case in  for probability measures having an associated PDF). Accordingly, we impose an explicit causality constraint on , instead of enforcing causality structurally by restricting to be the product of , as done in [1, 2].
The second (and main) goal of this document is to note that from (4a) it is clear that
holds, and that
does not hold, except for . Crucially, (6) does not become true by supposing that the joint PDF of is stationary, thus contradicting [2, Remark IV.5] and what is stated in the discussion paragraph at the end of [1, Section V].
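The distinction between (5) and (6) parallels a standard chain-rule fact, stated here in assumed placeholder notation ($x^k$, $y^k$ are not the original's symbols): mutual information and directed information decompose differently.

```latex
% Chain rule for mutual information (always true):
I(x^k ; y^k) \;=\; \sum_{i=1}^{k} I\big(x^k ; y(i) \,\big|\, y^{i-1}\big).
% Massey's directed information replaces x^k by x^i in each term:
I(x^k \to y^k) \;\triangleq\;
\sum_{i=1}^{k} I\big(x^i ; y(i) \,\big|\, y^{i-1}\big).
% In general I(x^k \to y^k) \le I(x^k ; y^k); equality requires
% additional conditional-independence conditions, which stationarity
% of the joint PDF alone does not provide.
```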
The causal IRDF under the above conditions is yielded by the solution to the following optimization problem:
Let be any conditional PDF, and define
Before writing the Lagrangian and taking its Gateaux differential, let us obtain the Gateaux differential of in the direction , given by
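For reference, the Gateaux differential invoked here is the standard directional derivative of a functional on a vector space (a textbook fact; the names $J$, $f$, $g$ below are placeholders, not the original's symbols):

```latex
% Gateaux differential of a functional J at f in the direction g:
\partial J(f; g) \;\triangleq\;
\lim_{\epsilon \to 0} \frac{J(f + \epsilon g) - J(f)}{\epsilon}
\;=\; \left. \frac{d}{d\epsilon}\, J(f + \epsilon g) \right|_{\epsilon = 0}.
```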
On the other hand, for each , the causality constraint (7d) appears in the Lagrangian as
It will be convenient to manipulate this expression so as to give it a structure similar to the other terms in the Lagrangian. For this purpose, notice that
Substituting this into (20) we obtain
We can now write the Lagrangian associated with optimization problem (7) as
From the theory of Lagrangian optimization on vector spaces, is a solution to optimization problem (7) only if
for every function as defined in (8), i.e., for every conditional PDF . This holds if and only if for every :
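This is the standard first-order stationarity condition from Lagrangian optimization on vector spaces, sketched here with placeholder symbols ($J$ for the objective, $C_j$ for the constraints, $\lambda_j$ for the multipliers; none are taken from the original):

```latex
% Stationarity of the Lagrangian L(f) = J(f) + \sum_j \lambda_j C_j(f):
% at an optimum f*, the Gateaux differential must vanish in every
% admissible direction g,
\partial L(f^{*}; g) \;=\;
\partial J(f^{*}; g) + \sum_{j} \lambda_{j}\, \partial C_{j}(f^{*}; g)
\;=\; 0
\quad \text{for all admissible } g.
% When the differential takes the integral form
% \partial L(f^*; g) = \int \ell(f^*)\, g, this forces the integrand
% \ell(f^*) to vanish pointwise wherever g is unrestricted.
```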
The Lagrange multiplier function must enforce the constraint (7b). Hence,
Marginalizing over we obtain
Using Bayes’ rule we can write
These functions can be written recursively as
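The Bayes'-rule step presumably rests on the chain-rule factorization of the conditional PDF into per-sample kernels (assumed notation, not the original's):

```latex
% Chain-rule factorization of the test channel:
f(y^k \mid x^k) \;=\;
\prod_{i=1}^{k} f\big(y(i) \,\big|\, y^{i-1}, x^k\big),
% where each factor is obtained recursively as a ratio of joint PDFs:
f\big(y(i) \mid y^{i-1}, x^k\big)
\;=\; \frac{f(y^{i}, x^k)}{f(y^{i-1}, x^k)}.
```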
In order to attain causality in (41), the functions must depend only on and . Since for each , the function does not depend on terms with , the causality constraint is met if and only if we choose in (43b) such that, for each
for some function .
For , the causality constraint is satisfied automatically since (see (43a)). (This reflects the fact that there is no need to enforce the causality constraint for , since there are no source samples for time .) Suppose now that (44) (i.e., causality) is satisfied for , for some . In that case, one can replace in (44) by and, defining
write (44) as
Multiplying both sides by and integrating over we obtain