1. Combination of uncertainties
1.1. Noncommittal approach
Let the stimulus be an integrable function of one variable that depends on two aspects of stimulation:
- Stimulus location $x$ on dimension $X$, where $X$ can be space or time, the "location" indicating, respectively, where or when stimulation occurred.
- Stimulus content $f$ on dimension $F$, where $f$ can be the spatial or temporal frequency of stimulus modulation.
We consider a sensory system equipped with many measuring devices, each able to estimate both stimulus location and content from the stimulus. It is sometimes assumed that sensory systems know the stimulus distribution $p(x, f)$: a case we review in the next section. But in general we do not know $p(x, f)$; we only know (or guess) some of its properties, such as its mean value and variance. In particular, let $\bar{x}$ and $\bar{f}$ be the (marginal) means of $p(x, f)$ on dimensions $X$ and $F$. Sensory systems can optimize their performance with this minimal knowledge, as follows.
To reduce the chances of making gross errors, we use the following strategy. We find the condition of minimal uncertainty against the profile of maximal uncertainty, i.e., we use a minimax approach (von Neumann, 1928; Luce & Raiffa, 1957). We do so in two steps. First we find the probability densities for which measurement uncertainty is maximal. Then we find the condition at which the function of maximal uncertainty has its smallest value: the minimax point.
We evaluate maximal uncertainty using the well-established definition of entropy (Shannon, 1948):

$$H(x) = -\int_X p(x) \log p(x)\, dx, \qquad H(f) = -\int_F p(f) \log p(f)\, df. \quad (S1)$$

Recall that Shannon's entropy is sub-additive:

$$H(x, f) \le H(x) + H(f). \quad (S2)$$

Therefore, we can say that the uncertainty of measurement cannot exceed

$$H_{\max} = H(x) + H(f). \quad (S3)$$

Eq. S3 is the "envelope" of maximal measurement uncertainty: a "worst-case" estimate.
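As a quick numeric illustration of the sub-additivity bound (a sketch in Python; the joint distribution here is arbitrary, not taken from the text):

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats; zero cells contribute nothing."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# An arbitrary correlated joint distribution over (location, content) bins.
rng = np.random.default_rng(0)
joint = rng.random((4, 4)) + 3.0 * np.eye(4)   # diagonal load => correlation
joint /= joint.sum()

h_joint = entropy(joint)                # H(x, f)
h_x = entropy(joint.sum(axis=1))        # marginal entropy over location
h_f = entropy(joint.sum(axis=0))        # marginal entropy over content

# Sub-additivity (Eq. S2): the joint entropy never exceeds the envelope.
assert h_joint <= h_x + h_f + 1e-12

# The envelope is attained exactly when the two aspects are independent.
indep = np.outer(joint.sum(axis=1), joint.sum(axis=0))
assert abs(entropy(indep) - (h_x + h_f)) < 1e-12
```

The second assertion shows why S3 is a worst-case bound: it holds with equality only for the independent (product) distribution.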
By the Boltzmann theorem on maximum-entropy probability distributions (Cover & Thomas, 2006), the maximal entropy of probability densities with fixed means and variances is attained when the densities are Gaussian. The entropy of a Gaussian density is a simple increasing function of its variance (Cover & Thomas, 2006). We obtain

$$H(x) = \frac{1}{2}\log(2\pi e\, \sigma_x^2), \qquad H(f) = \frac{1}{2}\log(2\pi e\, \sigma_f^2), \quad (S4)$$

where $\sigma_x$ and $\sigma_f$ are the standard deviations. And the maximal entropy is simply the sum:

$$H_{\max} = \frac{1}{2}\log(2\pi e\, \sigma_x^2) + \frac{1}{2}\log(2\pi e\, \sigma_f^2).$$

That is, when the densities are unknown, the maximal uncertainty of measurement is determined by the variances of the measurement components: because each entropy term grows monotonically with the corresponding variance, the sum of the component variances can serve as a worst-case measure of measurement uncertainty.
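The maximum-entropy property invoked here can be checked against closed-form differential entropies (a Python sketch; the comparison densities and the values of sigma are illustrative): among zero-mean densities with the same variance, the Gaussian has the largest entropy.

```python
import math

# Differential entropies (in nats) of three densities constrained to the
# same variance sigma^2 (standard closed forms):
#   Gaussian:                          0.5 * ln(2*pi*e*sigma^2)
#   Uniform of width sigma*sqrt(12):   ln(sigma*sqrt(12))
#   Laplace with scale sigma/sqrt(2):  1 + ln(2*sigma/sqrt(2))
def h_gaussian(sigma):
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

def h_uniform(sigma):
    return math.log(sigma * math.sqrt(12.0))

def h_laplace(sigma):
    return 1.0 + math.log(2.0 * sigma / math.sqrt(2.0))

# The Gaussian dominates at every scale, as the Boltzmann theorem asserts.
for sigma in (0.5, 1.0, 1.7):
    assert h_gaussian(sigma) > h_uniform(sigma)
    assert h_gaussian(sigma) > h_laplace(sigma)
```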
This is the method used by Gepshtein, Tyukin, and Kubovy (2007) and Gepshtein, Tyukin, and Albright (2010) in derivations of the joint and composite uncertainty functions.¹ The authors then found the optimal conditions by looking for the minimal values of the uncertainty functions.

¹ For simplicity, Gepshtein et al. (2010) use intervals of measurement, rather than interval variances, as estimates of component uncertainties.
1.2. Top-down approach
Now we assume that the system enjoys some knowledge of stimulation, so we can use likelihood as a measure of uncertainty. Suppose we want to derive a combined estimate $\hat{u}$ from two estimates $u_1$ and $u_2$ of some parameter $u$ of stimulation. We assume that the likelihood functions $L(u; u_1)$, $L(u; u_2)$, and $L(u; u_1, u_2)$ are continuous, differentiable, and known. Let us first assume that the likelihoods are separable:

$$L(u; u_1, u_2) = L(u; u_1)\, L(u; u_2).$$

Then, the most likely value of $u$ is

$$\hat{u} = \arg\max_u L(u; u_1, u_2) = \arg\max_u \left[\log L(u; u_1) + \log L(u; u_2)\right].$$

We can use the logarithmic transformation because it is a strictly monotone continuous function on $(0, \infty)$, and hence it does not change the maxima of continuous functions.
It is commonly assumed that $L(u; u_1)$ and $L(u; u_2)$ are Gaussian functions, or that they are well approximated by Gaussian functions. For example, Yuille and Bülthoff (1996) assumed that the cubic and higher-order terms of the Taylor expansion of the log-likelihood can be neglected, which is equivalent to the assumption of Gaussianity. (We return to this assumption, and also to the assumption of separability, in a moment.) Then

$$\log L(u; u_1, u_2) = -\frac{(u - u_1)^2}{2\sigma_1^2} - \frac{(u - u_2)^2}{2\sigma_2^2} + \mathrm{const}. \quad (S5)$$

The latter expression is maximized when its first derivative with respect to $u$ is zero. Hence

$$\hat{u} = \frac{u_1/\sigma_1^2 + u_2/\sigma_2^2}{1/\sigma_1^2 + 1/\sigma_2^2}, \quad (S6)$$

which is the familiar weighted-average rule of cue combination (Cochran, 1937; Maloney & Landy, 1989; Clark & Yuille, 1990; Landy, Maloney, Johnston, & Young, 1995; Yuille & Bülthoff, 1996). In general, when the number of measurements is greater than two, the combination rule of Eq. S6 becomes

$$\hat{u} = \frac{\sum_i u_i/\sigma_i^2}{\sum_i 1/\sigma_i^2},$$

where $u_i$ are such that the individual likelihood functions attain their maxima at $u_i$.
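The weighted-average rule of Eq. S6 amounts to inverse-variance weighting. A minimal numeric check (Python; the cue values and standard deviations are made up) confirms that it coincides with the maximum of the product of Gaussian likelihoods:

```python
import numpy as np

def combine(estimates, sigmas):
    """Inverse-variance weighted average of independent Gaussian cues."""
    w = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    return float(np.sum(w * np.asarray(estimates, dtype=float)) / np.sum(w))

# Two cues about the same stimulus parameter; the second is twice as precise.
u1, s1 = 10.0, 2.0
u2, s2 = 16.0, 1.0
u_hat = combine([u1, u2], [s1, s2])   # pulled toward the more precise cue

# Brute-force check: maximize the summed Gaussian log-likelihoods on a grid.
grid = np.linspace(0.0, 20.0, 200001)
loglik = -(grid - u1) ** 2 / (2 * s1 ** 2) - (grid - u2) ** 2 / (2 * s2 ** 2)
assert abs(u_hat - 14.8) < 1e-9                 # (10/4 + 16) / (1/4 + 1)
assert abs(grid[np.argmax(loglik)] - u_hat) < 1e-3
```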
Why is the assumption common that likelihood functions have the simple form of Eq. S5, i.e., are separable and Gaussian? An answer follows from the argument we presented in the previous section. Suppose that one seeks to estimate the likelihood function $L(u; u_1, u_2)$ when its shape is unknown. We saw in the previous section that the least certain estimate is the likelihood function for which the entropy is maximal. Hence, by the sub-additivity of entropy (Eq. S2), the least certain estimate of $L(u; u_1, u_2)$ is

$$L(u; u_1, u_2) = L(u; u_1)\, L(u; u_2),$$

as in Eq. S5. Moreover, if the mean values and variances of $L(u; u_1)$ and $L(u; u_2)$ are fixed, then the likelihood functions must be Gaussian, by the same argument. Indeed, separable Gaussian likelihood functions are the least certain estimates.
2. Resource allocation
In Gepshtein, Tyukin, and Albright (2010) we asked how sensory systems ought to allocate their resources in the face of uncertainties inherent in measurement and stimulation. We approached this problem in two steps. First, we combined all uncertainties in uncertainty functions: comprehensive descriptions of how the quality of measurement varies across conditions of measurement. Second, we proposed how limited resources are to be allocated given the uncertainty functions. Here we illustrate the second step in more detail, using the approach of constrained optimization.
A key requirement of allocation is to optimize the reliability (reduce the uncertainty) of measurement by many sensors. Satisfying this requirement alone would make the system place all sensors where the conditions of measurement are least uncertain, leaving the system unprepared to sense stimuli that are useful but whose uncertainty is high. To prevent such gaps of allocation, we propose that the minimal requirements be twofold:
A. Reliability: Prefer low uncertainty.
B. Comprehensiveness: Measure all useful stimuli.
We formalize these requirements as follows. Let:
- $T$ be the size of a measuring device ("receptive field"),
- $U(T)$ be the uncertainty function associated with measuring devices of different sizes, and
- $R(T)$ be the amount of resources allocated across $T$ (e.g., the number of cells with receptive fields of size $T$).
Encouraging reliability. By requirement A, the system is penalized for allocating resources where uncertainty is high. This is achieved, for example, when the cost for placing resources at $T$ is

$$\alpha\, U(T)\, R(T),$$

where $\alpha$ is a positive constant. The higher the uncertainty at $T$, or the larger the amount of resources allocated to $T$, the higher the cost. Hence the total cost of allocation is:

$$C_r = \int \alpha\, U(T)\, R(T)\, dT.$$

Functional $C_r$ is minimal when all the detectors are allocated to (i.e., have the size of) the $T$ at the lowest value of $U(T)$.
Encouraging comprehensiveness. By requirement B, the system is penalized for failing to measure particular stimuli. This is achieved, for example, when the allocation cost is

$$\frac{\beta}{R(T)},$$

where $\beta$ is a positive constant. The total penalty of this type is:

$$C_c = \int \frac{\beta}{R(T)}\, dT.$$

Functional $C_c$ is large (infinite) when all resources are allocated to a small vicinity of one point. $C_c$ is small when $R(T)$ is large for all $T$.
Prescription of allocation. The total penalty of requirements A and B is

$$C = C_r + C_c = \int \left[\alpha\, U(T)\, R(T) + \frac{\beta}{R(T)}\right] dT. \quad (S10)$$
Using standard tools of the calculus of variations (e.g., Elsgolc, 2007) we find the function $R(T)$ that minimizes $C$. In particular, we consider a variation of $C$ with respect to changes of $R$:

$$\delta C = \int \left[\alpha\, U(T) - \frac{\beta}{R^2(T)}\right] \delta R\, dT. \quad (S11)$$

Because at the optimal $R$ the value of $\delta C$ is zero for all variations $\delta R$, we deduce that the condition of optimality is:

$$\alpha\, U(T) - \frac{\beta}{R^2(T)} = 0. \quad (S12)$$

In other words,

$$R(T) = \sqrt{\frac{\beta}{\alpha\, U(T)}}.$$
This is the prescription of optimal allocation.
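The prescription can be verified numerically. In this Python sketch the uncertainty function U(T), the constants alpha and beta, and the perturbations are all illustrative; the check is that an allocation of the form R(T) = sqrt(beta / (alpha U(T))) cannot be improved by perturbing it, under a total penalty of the form of Eq. S10:

```python
import numpy as np

alpha, beta = 1.0, 4.0                  # illustrative positive constants
T = np.linspace(0.5, 5.0, 1000)         # receptive-field sizes
dT = T[1] - T[0]
U = 1.0 + (T - 2.0) ** 2                # a hypothetical uncertainty function

def total_cost(R):
    """Riemann-sum approximation of C = integral(alpha*U*R + beta/R) dT."""
    return float(np.sum(alpha * U * R + beta / R) * dT)

R_opt = np.sqrt(beta / (alpha * U))     # the prescription of optimal allocation

# Any perturbation that keeps R positive should not lower the total cost,
# because the integrand is minimized pointwise at R_opt.
rng = np.random.default_rng(1)
for _ in range(20):
    wiggle = 0.05 * rng.standard_normal() * np.sin(rng.uniform(1.0, 4.0) * T)
    R_pert = np.clip(R_opt + wiggle, 1e-6, None)
    assert total_cost(R_pert) >= total_cost(R_opt) - 1e-9

# Resources peak where uncertainty is minimal (here at T = 2).
assert abs(T[np.argmax(R_opt)] - 2.0) < dT
```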
Amount of resources. If the total amount of resources in the system is known and equal to $R_0$:

$$\int R(T)\, dT = R_0, \quad (S13)$$

then we may modify the coefficients $\alpha$ and $\beta$ in Eq. S10 to make Eq. S10 consistent with Eq. S13. Or, we may use the method of Lagrange multipliers, looking for conditions where the variation of the following functional vanishes:

$$C_\lambda = \int \left[\alpha\, U(T)\, R(T) + \frac{\beta}{R(T)}\right] dT + \lambda \left(\int R(T)\, dT - R_0\right), \quad (S14)$$

which yields

$$R(T) = \sqrt{\frac{\beta}{\alpha\, U(T) + \lambda}}. \quad (S15)$$

The constraint of Eq. S13 is used to find $\lambda$ in Eq. S15. In either case, the shape of the optimal allocation function is determined by $U(T)$, such that the allocation function is maximal where $U(T)$ is minimal. The formulation in Eq. S14 has an advantage: it allows one to derive optimal prescriptions under changes in the amount of resources allocated to the task, such as in selective attention.
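When the resource budget is fixed, the Lagrange multiplier can be found numerically. A Python sketch (all constants, the uncertainty function, and the budget R0 are illustrative, not from the text): assuming the stationarity condition alpha U(T) + lambda - beta / R^2(T) = 0, the allocation is R(T) = sqrt(beta / (alpha U(T) + lambda)), and lambda is tuned by bisection until the allocation integrates to R0:

```python
import numpy as np

alpha, beta, R0 = 1.0, 4.0, 3.0         # illustrative constants and budget
T = np.linspace(0.5, 5.0, 1000)
dT = T[1] - T[0]
U = 1.0 + (T - 2.0) ** 2                # a hypothetical uncertainty function

def allocation(lam):
    """R(T) from the stationarity condition alpha*U + lam - beta/R^2 = 0."""
    return np.sqrt(beta / (alpha * U + lam))

def spent(lam):
    """Total resources used by the allocation (Riemann sum)."""
    return float(np.sum(allocation(lam)) * dT)

# spent(lam) decreases monotonically in lam, so bisection finds the budget.
lo, hi = 0.0, 1e6
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if spent(mid) > R0 else (lo, mid)
lam = 0.5 * (lo + hi)
R = allocation(lam)

assert abs(float(np.sum(R)) * dT - R0) < 1e-6   # budget is met
assert abs(T[np.argmax(R)] - 2.0) < dT          # peak at minimal uncertainty
```

Shrinking or enlarging R0 reshapes the allocation without re-deriving the prescription, which is the advantage noted above for the Lagrange formulation.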
Generalizations. In the multidimensional case, when $T$ represents several variables (e.g., the spatial and temporal extents of receptive fields, $T_s$ and $T_t$), and $U$ is a function of many variables, the prescription is

$$R(T_s, T_t) = \sqrt{\frac{\beta}{\alpha\, U(T_s, T_t)}}.$$

The previously derived prescription holds: allocate the maximal amount of resources to the conditions of minimal uncertainty.
- Clark, J. J., & Yuille, A. L. (1990). Data fusion for sensory information processing systems. Norwell, MA: Kluwer Academic Publishers.
- Cochran, W. G. (1937). Problems arising in the analysis of a series of similar experiments. Journal of the Royal Statistical Society (Supplement), 4, 102–118.
- Cover, T. M., & Thomas, J. A. (2006). Elements of information theory. New York: John Wiley.
- Elsgolc, L. D. (2007). Calculus of variations. Dover Publications. (Original work published 1961.)
- Gepshtein, S., Tyukin, I., & Albright, T. (2010). The uncertainty principle of measurement in vision. (Manuscript in preparation.)
- Gepshtein, S., Tyukin, I., & Kubovy, M. (2007). The economics of motion perception and invariants of visual sensitivity. Journal of Vision, 7(8), 1–18. (doi: 10.1167/7.8.8)
- Landy, M., Maloney, L., Johnston, E., & Young, M. (1995). Measurement and modeling of depth cue combination: in defense of weak fusion. Vision Research, 35, 389–412.
- Luce, R. D., & Raiffa, H. (1957). Games and decisions. New York: John Wiley.
- Maloney, L. T., & Landy, M. S. (1989). A statistical framework for robust fusion of depth information. In W. A. Pearlman (Ed.), Visual Communications and Image Processing IV (Proceedings of SPIE, Vol. 1199, pp. 1154–1163).
- Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423, 623–656.
- Taub, A. H. (Ed.). (1963). John von Neumann: Collected works. Volume VI: Theory of games, astrophysics, hydrodynamics and meteorology. New York: Pergamon Press.
- von Neumann, J. (1928). Zur Theorie der Gesellschaftsspiele [On the theory of games of strategy]. Mathematische Annalen, 100, 295–320. (English translation in Taub, 1963.)
- Yuille, A. L., & Bülthoff, H. H. (1996). Bayesian decision theory and psychophysics. In D. C. Knill & W. Richards (Eds.), Perception as Bayesian inference (pp. 123–161). Cambridge, UK: Cambridge University Press.