# A note on averaging prediction accuracy, Green's functions and other kernels

We present the mathematical context of the prediction accuracy index and then introduce the definition of the integral average transform. We establish the relation of our definition with two-variable kernels K(y, x). As an example of an application, we show that integrating against the fundamental solution of the Laplace operator, that is, solving the Poisson equation, can be reinterpreted as an integral of averages of the forcing term over balls. As a result, we obtain a novel integral representation of the solution of the Poisson equation. Our motivation comes from the need for a better mathematical understanding of the prediction accuracy index. This index is used to identify hot spots in predictive security and other applications.


## 1 Introduction

We start by analyzing a procedure that is used in predictive security applications to identify crime hot spots [2, 8, 4, 9]. This procedure is known as the Prediction Accuracy Index (PAI) and it is used to compare possible hot spot regions. In this context, a higher PAI is preferred to a lower one. In particular, in some applications it is used (alongside other measures such as distances and divergences [9, 4]) as an indicator of similarity between two densities of random variables (one density from a prediction and the other from an observation). Roughly speaking, the PAI of a region is defined as the ratio between the hit rate and the volume proportion (see [2, 8, 4] and the next section). When the PAI is used as a similarity measure between two densities, say $\psi$ and $\phi$, this indicator is computed as follows:

1. Identify several regions of high probability according to $\psi$.

2. Evaluate the PAI on each one of the regions identified before. Here the idea is that higher values of PAI will indicate a certain similarity between $\psi$ and $\phi$.

In case several regions of high probability are selected in 1. and 2. above, one can add to this procedure the following:

3. Compute the average of the PAI indicators obtained above. See Section 2.

Our preliminary analysis readily shows that PAI measures (even averaged ones) should not be used as a similarity measure between two densities, and that they have to be used cautiously to identify hot spots. See also [8, 4, 9] for some other issues related to the PAI indicator. We believe this is an important take-home message of this note. See Remark 1.

Anyhow, the procedure described above (1., 2. and 3.) led us to the definition of the integral average transform. See Section 3 and Remark 2. The integral average transform is defined for a given function $f$, assumed regular enough just to fix ideas. The transform is the result of integrating plain unweighted averages of $f$ over a family of neighborhoods of a point $x$, say $\{B_{s,x}\}$, indexed by the real parameter $s$. Note that we are talking about plain averages and not weighted averages. See (12) for a precise definition. We show that this procedure is equivalent to computing integrals against a two-variable kernel with a possible singularity at $y=x$, that is, computing

$$\int f(y)K(y,x)\,dy.$$

We also show that the integral above can be interpreted as an integral average transform, that is, an integral of regular averages of $f$ over a parametrized family of regions. As an example of an application, we show that integrating against the fundamental solution of the Laplace equation, that is, solving the Poisson equation, can be reinterpreted as an integral of averages of the forcing term over balls. See Section 3 and Remark 5.

The rest of the paper is organized as follows. In Section 2 we present the mathematical context related to the PAI, including a modification known as the penalized PAI. We introduce the average PAI that motivates our main definition. In Section 3 we present the definition of the integral average transform and, as an application of this novel interpretation of kernels, we show how to write the fundamental solution of the Laplace operator as an integral average transform. In Section 4 we present some final comments.

## 2 Averaging prediction accuracy

In this section we present the motivation that led us to study integrals of averages of a function over parametrized regions.

Let us consider two real random variables and denote by $\psi$ and $\phi$ the probability density functions associated with them, respectively. Given a measurable set $B$ we write $|B|$ for the Lebesgue measure of $B$. Denote by $[\psi>r]$ the set with probability density level higher (greater) than $r$, that is,

$$[\psi>r]=\{z\in\mathbb{R}^n:\psi(z)>r\}=\psi^{-1}((r,+\infty)). \qquad (1)$$

Let us define . In some practical applications a threshold value $r$ is determined and the region $[\psi>r]$ is referred to as the hot spot area (and, when it is the union of disjoint small regions, these regions are known as hot spots); see [2] and references therein. In fact, note that if is close to the set will contain the most probable occurrences of .

It is common in applications to use the hot spots of one random variable (here, the one with density $\psi$) to predict the hot spots of another random variable (the one with density $\phi$). A common way to measure the performance of such predictions is the so-called Prediction Accuracy Index (PAI), defined next (see [2]). In some cases, when several possible predictions are available, the PAI measure is used to decide which model to use to report hot spot predictions of the target. See [2, 8].

In order to fix ideas and simplify the presentation we assume that $\psi$ and $\phi$ are continuous functions, but we can as well consider discretized versions up to some given spatial resolution, as is most common in applications.

### 2.1 PAI of a subregion

Assume that we want to analyze a measurable study region $A$ with positive volume measure. Region $A$ corresponds to the study region where the random variable with density $\phi$ occurs. The region $B\subseteq A$ is a possible hot spot region for this random variable in $A$. Define the hit rate by ([2])

$$H_\phi(B)=\frac{\int_B\phi}{\int_A\phi}.$$

If $|B|>0$, define the average of $\phi$ on $B$ by

$$\fint_B\phi=\frac{1}{|B|}\int_B\phi.$$

Note that in our notation for $H_\phi$ we do not make explicit reference to the region $A$ since we assume $A$ is fixed. A commonly used indicator ([2]) of the quality of using $B$ to approximate the hot spots of $\phi$ in $A$ is given by the ratio of the hit rate to the percentage of volume $|B|/|A|$, that is,

$$\mathrm{PAI}_\phi(B)=\frac{H_\phi(B)}{|B|/|A|}=\frac{|A|}{\int_A\phi}\,\frac{\int_B\phi}{|B|}=\frac{\fint_B\phi}{\fint_A\phi}.$$

Note that $\mathrm{PAI}_\phi(B)$ can also be written as the ratio between the averages of $\phi$ on $B$ and on $A$, which gives another interpretation of the PAI that readily reveals some possible issues with this indicator, such as the preference for small subregions with high values of $\phi$.

In order to better understand this indicator we present some simple examples. Note that if $B=A$ we get $\mathrm{PAI}_\phi(A)=1$. If for instance we select $B$ with volume percentage

$$\frac{|B|}{|A|}=0.5$$

and hit rate

$$H(B)=0.5$$

we still have $\mathrm{PAI}_\phi(B)=1$, since we can predict half of the occurrences in half of the study area. Assume now that we have

$$\frac{|B|}{|A|}=0.2\quad\text{and}\quad H(B)=0.8.$$

We then have $\mathrm{PAI}_\phi(B)=4$, since we can predict 80% of the occurrences in 20% of the study area.
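The two examples above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names `pai` and `pai_from_grid` and the uniform-grid discretization are assumptions made here, not part of the original note.

```python
def pai(hit_rate, volume_fraction):
    """Prediction accuracy index: hit rate divided by volume proportion."""
    if volume_fraction <= 0:
        raise ValueError("volume fraction must be positive")
    return hit_rate / volume_fraction


def pai_from_grid(phi, in_region):
    """PAI of a region on a uniform grid, written as a ratio of averages of phi.

    phi       -- density values, one per grid cell of the study area A
    in_region -- booleans of the same length, True for the cells in B
    """
    values_in_b = [v for v, inside in zip(phi, in_region) if inside]
    mean_b = sum(values_in_b) / len(values_in_b)
    mean_a = sum(phi) / len(phi)
    return mean_b / mean_a


print(pai(0.5, 0.5))  # half of the occurrences in half of the area
print(pai(0.8, 0.2))  # 80% of the occurrences in 20% of the area
```

The second function makes the "ratio of averages" reading explicit: a single cell with a large value of the density already yields a large PAI, whatever happens elsewhere.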

In practical applications we are interested in finding regions with a high value of PAI for a given density $\phi$. We make the following observations:

• Since we always have we see that

$$\mathrm{PAI}_\phi(B)=\frac{\fint_B\phi}{\fint_A\phi}\le\frac{R}{\fint_A\phi}.$$

• Let . If is such that then and all its positive-measure subsets maximize the PAI indicator. In fact, for all such that we have

$$\mathrm{PAI}_\phi(\tilde B)=\frac{R}{\fint_A\phi}.$$

We then see that, in practical applications, the PAI indicator will favor small-area regions around the maximum values of the density $\phi$. Here "small" depends on the spatial resolution at which subregions are computed. This might not be convenient in some applications, as pointed out in [8].

• If defined above is such that then, given a subregion with and , there always exists with and such that

 PAIϕ(˜B)

Therefore we see that searching for Borel subregions with as high PAI as possible is not a well posed problem under general considerations. Other possible optimization problems may be needed for practical applications. See [8, 4, 9] for more details.

Due to the fact that the PAI indicator prefers small-area regions, some modifications have been introduced. Among them, a penalized PAI (PPAI), where the area ratio is penalized, was introduced in [8], that is,

$$\mathrm{PPAI}_\phi(B)=\lambda(B)\,\mathrm{PAI}_\phi(B)=\frac{|A|}{\int_A\phi}\,\lambda(B)\,\frac{\int_B\phi}{|B|}.$$

Here the penalization $\lambda(B)$ was introduced. The exponent may depend on , e.g., (and in this case, for small hit rate, the indicator is penalized multiplicatively by the volume proportion of the region, while for large hit rate close to 1 we do not have that penalization, [8]). Many other penalization alternatives can also be considered in the light of practical applications. For instance, a penalization of the form

$$\lambda(B)=\frac{|B|}{|\partial B|}$$

where $|\partial B|$ is the surface volume of $B$, with $\partial B$ regular enough. See (17) below.

One additional observation is the following. One can think of computing an average of several PAI values over subregions. Let

$$B_N\subseteq B_{N-1}\subseteq\cdots\subseteq B_1$$

and consider

$$\mathrm{avP}=\frac{1}{N}\sum_{i=1}^{N}\mathrm{PPAI}_\phi(B_i).$$

Define the piecewise constant layered function

$$K(x)=\frac{1}{N}\sum_{i=1}^{k}\lambda(B_i)\frac{1}{|B_i|}\quad\text{for } x\in B_k\setminus B_{k+1},$$

$k=1,\dots,N$, where $B_{N+1}=\emptyset$ (and $K=0$ outside $B_1$). We have that

$$\mathrm{avP}=\frac{|A|}{\int_A\phi}\int_{\mathbb{R}^n}\phi(x)K(x)\,dx. \qquad (2)$$

Note that the function $K$ depends on the sets $B_i$ and the weight $\lambda$, but we do not make this dependence explicit in our notation. We conclude that the average value (of PAI indicators) over some regions corresponds to the inner product between $\phi$ and a piecewise constant function that weights the regions by the inverse of their areas penalized by $\lambda$. The layered function can be thought of as an approximation of a kernel that has singularities in the region where it takes its maximum value.
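Identity (2) can be checked numerically on a one-dimensional grid. The sketch below is an illustration under assumptions made here: a uniform grid, regions given as sets of cell indices, and a penalization that depends only on the volume of the region. None of the function names come from the original note.

```python
def average_ppai(phi, regions, lam, cell):
    """Average of PPAI over the given regions on a uniform grid.

    phi     -- density values on the cells of the study area A
    regions -- list of sets of cell indices
    lam     -- penalization, here a function of the region's volume
    cell    -- volume of one grid cell
    """
    vol_a = len(phi) * cell
    mass_a = sum(phi) * cell
    total = 0.0
    for b in regions:
        vol_b = len(b) * cell
        mass_b = sum(phi[i] for i in b) * cell
        total += (vol_a / mass_a) * lam(vol_b) * mass_b / vol_b
    return total / len(regions)


def layered_kernel(n_cells, regions, lam, cell):
    """K(x) = (1/N) * sum of lam(B_i) / |B_i| over the regions B_i containing x."""
    k = [0.0] * n_cells
    for b in regions:
        vol_b = len(b) * cell
        for i in b:
            k[i] += lam(vol_b) / vol_b / len(regions)
    return k


def inner_product_form(phi, kernel, cell):
    """Right-hand side of (2): |A| / int_A phi times int phi K."""
    vol_a = len(phi) * cell
    mass_a = sum(phi) * cell
    return (vol_a / mass_a) * sum(p * k for p, k in zip(phi, kernel)) * cell


# Hypothetical data: 20 cells of width 0.1 and three nested regions.
cell = 0.1
phi = [1.0 + 0.5 * min(i, 19 - i) for i in range(20)]
regions = [set(range(8, 12)), set(range(6, 14)), set(range(3, 17))]
no_penalty = lambda volume: 1.0

lhs = average_ppai(phi, regions, no_penalty, cell)
rhs = inner_product_form(phi, layered_kernel(len(phi), regions, no_penalty, cell), cell)
print(lhs, rhs)
```

The kernel is largest on the innermost region, which is the discrete counterpart of the singular behavior mentioned above.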

### 2.2 PAI of a random variable

Recall that is the density of the random variable that we want to use to predict the hot spots of the random variable . See [2]. Assume that is continuous and that . For any define by

$$r(s)=\inf\Big\{r\ge 0:\int_{[\psi>r]\cap A}\psi=(1-s)\int_A\psi\Big\} \qquad (3)$$

and the subregion $B_s$ by

$$B_s=B_s(\psi)=[\psi>r(s)]\cap A. \qquad (4)$$

Define the prediction accuracy index at level $s$ as the PPAI of the subregion $B_s$,

$$p(s)=p(s;\psi,\phi)=\mathrm{PPAI}_\phi(B_s)=\frac{|A|}{\int_A\phi}\,\lambda(B_s)\,\frac{\int_{B_s}\phi}{|B_s|}. \qquad (5)$$

As before, the level $s$ prediction accuracy index is computed by dividing the hit rate by the volume percentage using the region $B_s$ and multiplying by a penalization.

From the comments on the previous subsection we have for the case (no penalization),

• If is such that then .

• If defined above is such that and there exists such that with and , then for small enough we have and

 p(s−ϵ;ϕ,ϕ)
• If (with finite volume) we have .

As mentioned before, in order to have an overall quantity, the performance may be measured by an average of the prediction accuracy index at different levels. More precisely, choose an integer $N$ and define

$$P_N(\psi,\phi)=\frac{1}{N}\sum_{i=1}^{N}p\Big(\frac{i}{N};\psi,\phi\Big). \qquad (6)$$

Note that, under appropriate assumptions, we shall have a limiting value when $N\to\infty$. We define

$$P(\psi,\phi)=\lim_{N\to\infty}P_N(\psi,\phi)=\int_0^1 p(s;\psi,\phi)\,ds. \qquad (7)$$
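The level regions in (3)-(4) and the averages in (6)-(7) can be approximated on a grid by sorting cells by the value of the predicting density. The sketch below is a hypothetical discretization (no penalization, uniform cells, ties broken by sort order) and uses a triangular density on $[-1,1]$ as test data. For that density with $\phi=\psi$, a direct computation gives $p(s)=1+\sqrt{s}$ and therefore $P(\psi,\psi)=5/3$.

```python
def level_region(psi, s):
    """Cells of B_s: the highest-psi cells holding a (1 - s) share of the mass of psi."""
    order = sorted(range(len(psi)), key=lambda i: psi[i], reverse=True)
    target = (1.0 - s) * sum(psi)
    region, mass = [], 0.0
    for i in order:
        if region and mass >= target:
            break
        region.append(i)
        mass += psi[i]
    return region


def pai_at_level(psi, phi, s):
    """p(s; psi, phi) with no penalization, as a ratio of averages of phi."""
    region = level_region(psi, s)
    mean_b = sum(phi[i] for i in region) / len(region)
    mean_a = sum(phi) / len(phi)
    return mean_b / mean_a


def average_pai(psi, phi, n_levels):
    """P_N(psi, phi): mean of p(i / N) for i = 1, ..., N."""
    levels = [i / n_levels for i in range(1, n_levels + 1)]
    return sum(pai_at_level(psi, phi, s) for s in levels) / n_levels


# Triangular density 1 - |x| sampled at the midpoints of 2000 cells of [-1, 1].
m = 2000
xs = [-1.0 + (j + 0.5) * 2.0 / m for j in range(m)]
psi = [1.0 - abs(x) for x in xs]

print(pai_at_level(psi, psi, 0.25))  # close to 1 + sqrt(0.25)
print(average_pai(psi, psi, 200))    # close to 5/3
```

Both printed values carry a discretization error of the order of the cell size and of $1/N$.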

### 2.3 Average PAI and kernels

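$$\begin{aligned}
P(\psi,\phi)&=\frac{1}{\fint_A\phi}\int_0^1\frac{\lambda(B_s)}{|B_s|}\int_{B_s}\phi(y)\,dy\,ds\\
&=\frac{1}{\fint_A\phi}\int_0^1\int_{\mathbb{R}^n}\phi(y)\frac{\lambda(B_s)}{|B_s|}1_{B_s}(y)\,dy\,ds\\
&=\frac{1}{\fint_A\phi}\int_{\mathbb{R}^n}\phi(y)\Big(\int_0^1\frac{\lambda(B_s)}{|B_s|}1_{B_s}(y)\,ds\Big)dy
=\frac{1}{\fint_A\phi}\int_{\mathbb{R}^n}\phi K_\psi\,dy.
\end{aligned}$$

Here we have introduced the “layered” function

$$K_\psi(y)=\int_0^1\frac{\lambda(B_s)}{|B_s|}1_{B_s}(y)\,ds.$$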

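In case $\psi$ is continuous, $K_\psi$ and $\psi$ have the same level sets. Indeed, if we define $t(y)$ such that $r(t(y))=\psi(y)$, where $r$ is defined in (3), we have that

$$1_{B_s}(y)=\begin{cases}1,&s\le t(y),\\ 0,&s>t(y).\end{cases}$$

Then,

$$K_\psi(y)=\int_0^{t(y)}\frac{\lambda([\psi>r(s)])}{|[\psi>r(s)]|}\,ds=\int_0^{t(y)}\frac{\lambda(B_s)}{|B_s|}\,ds.$$

Summarizing,

$$P(\psi,\phi)=\frac{1}{\fint_A\phi}\int_{\mathbb{R}^n}\phi(y)K_\psi(y)\,dy. \qquad (8)$$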

The function $K_\psi$ is positive, has the same level curves as $\psi$, and attains its maximum value (with possible singularities) at the points where $\psi$ attains its maximum. The appearance of singularities depends on the integrability of the function

$$s\mapsto\frac{\lambda(B_s)}{|B_s|}=\frac{\lambda([\psi>r(s)])}{|[\psi>r(s)]|}\quad\text{in }[0,1].$$

Therefore, the value $P(\psi,\phi)$ will give higher weight to the regions containing the maximum points of $\psi$. To further clarify our point we present the next example.

###### Example 1

Consider , and defined by

$$\psi(x)=\begin{cases}0,&x<-1,\\ \frac{1}{2}p\,(1-|x|)^{p-1},&-1\le x<1,\\ 0,&1\le x.\end{cases}$$

We have and and . Additionally if we take we have and it holds that the mass of the region is given by

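$$\int_{[\psi>r]}\psi=2\int_0^{1-\left(\frac{2r}{p}\right)^{\frac{1}{p-1}}}\psi=1-\Big(\frac{2r}{p}\Big)^{\frac{p}{p-1}}.$$

Therefore, given $s$, we can find $r$ such that $\int_{[\psi>r]}\psi=1-s$ to obtain

$$r=r(s)=\frac{1}{2}p\,s^{\frac{p-1}{p}}.$$

Then, the measure of the region $B_s$ is the length of the interval, that is,

$$|B_s|=|[\psi\ge r(s)]|=2\Big(1-\Big(\frac{2r}{p}\Big)^{\frac{1}{p-1}}\Big)=2\Big(1-\big(s^{\frac{p-1}{p}}\big)^{\frac{1}{p-1}}\Big)=2\big(1-s^{\frac{1}{p}}\big).$$

Given $y\in(0,1)$, we can find $t=t(y)$ by

$$r(t(y))=\frac{1}{2}p\,t^{\frac{p-1}{p}}=\frac{1}{2}p\,(1-y)^{p-1}$$

which gives

$$t=(1-y)^p.$$

Thus we obtain, for the case $\lambda=1$ and for $0<y<1$,

$$\begin{aligned}
K_\psi(y)&=\int_0^{t(y)}\frac{1}{|B_s|}\,ds &(9)\\
&=\int_0^{(1-y)^p}\frac{1}{2\big(1-s^{\frac{1}{p}}\big)}\,ds &(10)\\
&=\frac{1}{2}p\Big(-\log(y)+\sum_{\ell=1}^{p-1}\binom{p-1}{\ell}\frac{(-1)^\ell}{\ell}\big(1-y^\ell\big)\Big). &(11)
\end{aligned}$$

The last equality follows from the substitution $s=(1-v)^p$ in (10), which gives $K_\psi(y)=\frac{p}{2}\int_y^1\frac{(1-v)^{p-1}}{v}\,dv$, and from the binomial expansion of $(1-v)^{p-1}$.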

See Figure 1 for an illustration.
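As a sanity check of Example 1, the integral in (10) can be evaluated numerically and compared with a closed form. The sketch below is only an illustration: the closed form coded in `k_psi_closed` was derived for this sketch from (10) via the substitution $s=(1-v)^p$ and a binomial expansion, and it assumes an integer $p\ge 2$ and $0<y<1$.

```python
import math


def k_psi_numeric(y, p, n_steps=20000):
    """Midpoint-rule value of (10): integral of 1 / (2 (1 - s**(1/p))) for 0 < s < (1 - y)**p."""
    upper = (1.0 - y) ** p
    h = upper / n_steps
    total = 0.0
    for k in range(n_steps):
        s = (k + 0.5) * h
        total += 1.0 / (2.0 * (1.0 - s ** (1.0 / p)))
    return total * h


def k_psi_closed(y, p):
    """(p/2) * (-log y + sum over l of C(p-1, l) (-1)**l (1 - y**l) / l), for integer p >= 2."""
    series = sum(
        math.comb(p - 1, l) * (-1) ** l * (1.0 - y ** l) / l for l in range(1, p)
    )
    return 0.5 * p * (-math.log(y) + series)


for p in (2, 3):
    for y in (0.25, 0.5):
        print(p, y, k_psi_numeric(y, p), k_psi_closed(y, p))
```

The logarithmic term shows the singularity of $K_\psi$ at $y=0$, the point where $\psi$ is largest.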

From (8) we have $P(\psi,\phi)=\frac{1}{\fint_A\phi}\int\phi(y)K_\psi(y)\,dy$, that is, an inner product with the function in (11). Note that $K_\psi$ puts a very high weight around the value $y=0$ and therefore ignores other possible regions with hot spots. For instance if then the possible hot spots of at and won't be detected but the value of $P(\psi,\phi)$ will be very high (as long as we select high enough).

###### Remark 1

We then conclude that neither the PAI at a particular level nor the average of several PAIs at different levels is a good measure or indicator for selecting one among different possible predictions of a random variable. In particular, hot spots reported after selecting among several models using the PAI as the main indicator are not adequate. See [8, 2].

In order to make better hot spot predictions in practice it is recommended that, before computing hot spots using the PAI as an indicator, the model be selected using other, more suitable measures. See for instance [1, 6].

## 3 The integral average transform

In the previous section we presented an application where integrals of averages of a function on subregions are computed. We summarize this procedure as follows: consider a function $f$ and a family of regions neighboring $x$, say $\{B_{s,x}\}$, such that $x\in B_{s,x}$ for all $s$ and $x$. Define the following integral average transform of $f$ by

$$u(x)=\int_{\mathbb{R}}\frac{\lambda(s,x)}{|B_{s,x}|}\int_{B_{s,x}}f(y)\,dy\,ds=\int_{\mathbb{R}}\lambda(s,x)\fint_{B_{s,x}}f(y)\,dy\,ds. \qquad (12)$$

That is, a weighted integral of plain average values of $f$ over the regions $B_{s,x}$, $s\in\mathbb{R}$. Here $\lambda$ is a non-negative weight function.

This procedure is equivalent to integrating against a possibly singular kernel. We formalize this interpretation in the next statement.

###### Theorem 3.1

Define

$$K(y,x)=\int_{\mathbb{R}}\frac{\lambda(s,x)}{|B_{s,x}|}1_{B_{s,x}}(y)\,ds$$

and consider $u$ defined in (12). Then, $u(x)=\int_{\mathbb{R}^n}f(y)K(y,x)\,dy$.
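The statement can be illustrated numerically in one dimension. The sketch below is a hypothetical test case chosen here, not taken from the note: the regions are intervals $B_{s,x}=(x-s,x+s)$, the weight is $\lambda(s,x)=s$ for $0<s<R$ and zero otherwise, and $f(y)=\max(1-y^2,0)$. For this choice the kernel of the theorem is $K(y,x)=\max(R-|x-y|,0)/2$.

```python
def f(y):
    """Hypothetical test function with compact support in [-1, 1]."""
    return max(1.0 - y * y, 0.0)


def integral_of_f(a, b):
    """Exact integral of f over (a, b), using the antiderivative t - t**3 / 3 on [-1, 1]."""
    a, b = max(a, -1.0), min(b, 1.0)
    if a >= b:
        return 0.0
    return (b - b ** 3 / 3.0) - (a - a ** 3 / 3.0)


def transform(x, radius, n_steps=4000):
    """Integral over 0 < s < radius of lam(s) times the plain average of f on (x - s, x + s)."""
    h = radius / n_steps
    total = 0.0
    for k in range(n_steps):
        s = (k + 0.5) * h
        average = integral_of_f(x - s, x + s) / (2.0 * s)
        total += s * average  # lam(s) = s
    return total * h


def kernel(y, x, radius):
    """K(y, x): integral of lam(s) / |B_s| = 1/2 over |x - y| < s < radius."""
    return max(radius - abs(x - y), 0.0) / 2.0


def kernel_integral(x, radius, n_steps=4000):
    """Midpoint rule for the integral of f(y) K(y, x) over the support of f."""
    h = 2.0 / n_steps
    total = 0.0
    for k in range(n_steps):
        y = -1.0 + (k + 0.5) * h
        total += f(y) * kernel(y, x, radius)
    return total * h


print(transform(0.3, 3.0), kernel_integral(0.3, 3.0))
```

The two printed numbers agree up to quadrature error, which is the content of the theorem for this particular family of sets and weight.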

As in the previous subsections we can choose, for instance, subsets related to level curves of other functions $\psi_x$, say,

$$B_{s,x}=\psi_x^{-1}((-\infty,s)). \qquad (13)$$
###### Remark 2

In Section 2 the average PPAI, , defined in (7), was an integral average transform of where the family of sets were given by levels sets of . In particular, in Section 2 is continuous and has a maximum value , then corresponds to the integral average transform of evaluated at . The definition of subregions in (4) are related to superlevel sets of while for this section, and the rest of the paper, we use as sublevel sets; See (13). In the case of super level set the possible singularity of the associated kernel will be related to the minimum value of that, in order to fix ideas we are assuming to be a singleton. Recall that we are assuming that for all and .

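Let us now consider a non-negative kernel $K(x,y)$ with a possible singularity at $y=x$ and smooth for $y\ne x$. We can then take, for instance, $\psi_x(y)=1/K(x,y)^q$ for some $q>0$. Then

$$B_{s,x}=\Big[\frac{1}{s}<K(x,\cdot)^q\Big]=\Big\{y:\ \frac{1}{K(x,y)^q}<s\Big\}. \qquad (14)$$

Note that $\partial B_{s,x}$ corresponds to the level curve $K(x,\cdot)^q=1/s$. In this case we define, for $s>0$,

$$\lambda(s,x)=\frac{1}{q\,s^{\frac{1}{q}+1}}\,|B_{s,x}|$$

and therefore we have

$$\int_{\mathbb{R}}\frac{\lambda(s,x)}{|B_{s,x}|}1_{B_{s,x}}(y)\,ds=\int_{\frac{1}{K(x,y)^q}}^{\infty}\frac{\lambda(s,x)}{|B_{s,x}|}\,ds=\int_{\frac{1}{K(x,y)^q}}^{\infty}\frac{1}{q\,s^{\frac{1}{q}+1}}\,ds=K(x,y).$$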

We conclude that there are several ways to reinterpret an inner product against a kernel as an integral average transform.

Due to the importance of kernels and their ubiquity in pure and applied mathematics, it is always useful to have several interpretations and equivalent formulations for results that are written as integration against kernels. In the next section we present some well-known examples.

### 3.1 Fundamental solution of Laplace equation as an integral average transform

As a particular example let us consider the application of singular kernels related to the Poisson problem ([7]). In order to fix ideas we do not consider the more general setting regarding minimal regularity of the functions involved. Instead we assume that all functions are sufficiently regular in order to show the usage of the integral average transform (12).

Let . Here we need to find a function such that

 −Δu=f in Rn. (15)

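Consider $\psi_x(z)=\|z-x\|$. In this case

$$B_{s,x}=B_s(x)=\{z\in\mathbb{R}^n:\|z-x\|<s\}. \qquad (16)$$

Denote by $\omega_n$ the volume of the unit ball in $\mathbb{R}^n$ and define the weight

$$\lambda(s,x)=\begin{cases}\dfrac{|B_s(x)|}{|\partial B_s(x)|}=\dfrac{\omega_n s^n}{n\,\omega_n s^{n-1}}=\dfrac{s}{n},&0\le s,\ n\ge 2,\\[2mm] 0,&\text{otherwise}.\end{cases} \qquad (17)$$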

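Then, in order to compute the corresponding kernel note that, for $R>0$ and $\|x-y\|<R$,

$$\begin{aligned}
K_R(y,x)&:=\int_0^R\frac{1}{|\partial B_s(x)|}1_{B_{s,x}}(y)\,ds\\
&=\int_{\|x-y\|}^R\frac{1}{n\,\omega_n s^{n-1}}\,ds\\
&=\begin{cases}\dfrac{1}{2\pi}\log\Big(\dfrac{R}{\|x-y\|}\Big),&n=2,\\[2mm] \dfrac{1}{n(2-n)\omega_n}\big(R^{2-n}-\|x-y\|^{2-n}\big),&n\ge 3,\end{cases} &(19)\\
&=C_n(R)+G(y,x), &(20)
\end{aligned}$$

where the constant $C_n(R)$ is given by

$$C_n(R)=\begin{cases}\dfrac{1}{2\pi}\log(R),&n=2,\\[2mm] \dfrac{1}{n(2-n)\omega_n}R^{2-n},&n\ge 3,\end{cases}$$

and $G$ is the fundamental solution of the Laplace equation. See [7, 5]. By taking $R\to\infty$ we see that, for $n\ge 3$,

$$K(y,x)=\int_0^\infty\frac{1}{|\partial B_s(x)|}1_{B_{s,x}}(y)\,ds=G(x,y).$$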
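The decomposition of the truncated kernel into a constant plus the fundamental solution can be checked numerically. The sketch below is illustrative only: the function names are chosen here, the unit ball volume uses the standard formula $\omega_n=\pi^{n/2}/\Gamma(n/2+1)$, and the radial integral is approximated with a midpoint rule.

```python
import math


def unit_ball_volume(n):
    """Volume omega_n of the unit ball in R^n."""
    return math.pi ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)


def truncated_kernel_numeric(r, big_r, n, n_steps=20000):
    """Midpoint rule for the integral of 1 / (n omega_n s**(n-1)) over r < s < big_r."""
    surface = n * unit_ball_volume(n)  # area of the unit sphere
    h = (big_r - r) / n_steps
    total = 0.0
    for k in range(n_steps):
        s = r + (k + 0.5) * h
        total += 1.0 / (surface * s ** (n - 1))
    return total * h


def fundamental_solution(r, n):
    """Fundamental solution of the Laplace equation as a function of r = |x - y|."""
    if n == 2:
        return -math.log(r) / (2.0 * math.pi)
    return r ** (2 - n) / (n * (n - 2) * unit_ball_volume(n))


def constant_term(big_r, n):
    """The constant C_n(R) appearing next to the fundamental solution."""
    if n == 2:
        return math.log(big_r) / (2.0 * math.pi)
    return big_r ** (2 - n) / (n * (2 - n) * unit_ball_volume(n))


for n in (2, 3, 4):
    numeric = truncated_kernel_numeric(0.5, 10.0, n)
    closed = constant_term(10.0, n) + fundamental_solution(0.5, n)
    print(n, numeric, closed)
```

For $n\ge 3$ the constant term tends to zero as the truncation radius grows, while for $n=2$ it grows like $\log R$.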

We have the following result.

###### Theorem 3.2

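Assume that $n\ge 3$ and that $f$ is regular enough and with compact support. Consider the integral average transform defined in (12) with the family of sets in (16) and weight in (17), that is,

$$u(x)=\int_0^\infty\frac{s}{n}\fint_{\|x-y\|<s}f(y)\,dy\,ds, \qquad (21)$$

where

$$\fint_{\|x-y\|<s}f(y)\,dy=\frac{1}{|B_s(x)|}\int_{\|x-y\|<s}f(y)\,dy$$

is the average of $f$ over the ball $B_s(x)$. Then

$$-\Delta u=f.$$

Proof. Using (20) we have, for $R>0$, the truncated integral,

$$u_R(x)=\int_0^R\frac{s}{n}\fint_{\|x-y\|<s}f(y)\,dy\,ds=\int_{\mathbb{R}^n}f(y)K_R(y,x)\,dy.$$

By taking $R$ large enough, we see that

$$u_R(x)=C_n(R)\int_{\mathbb{R}^n}f(y)\,dy+\int_{\mathbb{R}^n}f(y)G(x,y)\,dy.$$

The result follows from the fact that $G$ is the fundamental solution of the Laplace equation and the other term vanishes when $R\to\infty$ for $n\ge 3$.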
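The truncated formula in the proof can be tested on a case with a known answer. In the sketch below, all choices are assumptions made for illustration: $n=3$, the forcing term is the indicator of the unit ball (not smooth, but its Newtonian potential is $(3-|x|^2)/6$ inside the ball), and the average over $B_s(x)$ is computed from the standard volume of the intersection of two balls. For $R>1+|x|$ the truncated transform should equal $(3-|x|^2)/6-1/(3R)$.

```python
import math


def overlap_volume(s, a):
    """Volume of the intersection of B_s(x) with the unit ball at the origin, |x| = a < 1."""
    if not 0.0 <= a < 1.0:
        raise ValueError("this sketch only covers points inside the unit ball")
    if s <= 1.0 - a:  # B_s(x) lies inside the unit ball
        return 4.0 * math.pi * s ** 3 / 3.0
    if s >= 1.0 + a:  # the unit ball lies inside B_s(x)
        return 4.0 * math.pi / 3.0
    # lens-shaped intersection of two balls of radii s and 1 at distance a
    return (
        math.pi
        * (s + 1.0 - a) ** 2
        * (a * a + 2.0 * a * (s + 1.0) - 3.0 * (s - 1.0) ** 2)
        / (12.0 * a)
    )


def truncated_transform(a, big_r, n_steps=200000):
    """u_R(x): integral over 0 < s < R of (s / 3) times the average of f on B_s(x)."""
    h = big_r / n_steps
    total = 0.0
    for k in range(n_steps):
        s = (k + 0.5) * h
        ball = 4.0 * math.pi * s ** 3 / 3.0
        total += (s / 3.0) * overlap_volume(s, a) / ball
    return total * h


def expected_value(a, big_r):
    """Newtonian potential of the unit ball at |x| = a < 1, plus the constant term for n = 3."""
    return (3.0 - a * a) / 6.0 - 1.0 / (3.0 * big_r)


print(truncated_transform(0.5, 20.0), expected_value(0.5, 20.0))
```

The correction $-1/(3R)$ is the term $C_3(R)\int f$ from the proof, which disappears as $R\to\infty$.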

###### Remark 3

Consider now the case , , and . For all large enough we have

 uR(z)=Cn(R)∫Rnf(y)dy+∫Rnf(y)G(z,y)dy.

for all . It is enough to take where contains the support of . We then see that .

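We also have the following result concerning a generalization of the mean value property. See [3].

###### Theorem 3.3

Assume that $u$ and $f$ are sufficiently smooth and satisfy

$$-\Delta u=f\quad\text{in } B_R(x_0).$$

Then the following mean value property holds,

$$u(x_0)=\fint_{\partial B_R(x_0)}u(y)\,dy+\int_0^R\frac{s}{n}\fint_{\|x_0-y\|<s}f(y)\,dy\,ds.$$

Proof. From [3] we have

$$u(x_0)=\fint_{\partial B_R(x_0)}u(y)\,dy+\int_{B_R(x_0)}f(y)\,G_{B_R}(x_0,y)\,dy$$

where $G_{B_R}$ is the Green function on the ball, given by

$$G_{B_R}(x_0,y)=\begin{cases}\dfrac{1}{2\pi}\log\Big(\dfrac{R}{\|x_0-y\|}\Big),&n=2,\\[2mm] \dfrac{1}{n(2-n)\omega_n}\big(R^{2-n}-\|x_0-y\|^{2-n}\big),&n\ge 3.\end{cases}$$

From (19) and Fubini's theorem it follows that

$$\begin{aligned}
\int_{B_R(x_0)}f(y)\,G_{B_R}(x_0,y)\,dy&=\int_{B_R(x_0)}f(y)\,K_R(x_0,y)\,dy &(25)\\
&=\int_0^R\frac{s}{n}\fint_{\|x_0-y\|<s}f(y)\,dy\,ds.
\end{aligned}$$

This finishes the proof.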

We now turn to show how to rewrite the solution of the Poisson problem on the positive half-space as an integral average transform. Here we need to find a function $u$ such that

$$-\Delta u=f\quad\text{in }\mathbb{R}^n_+=\{z\in\mathbb{R}^n:z_n>0\} \qquad (26)$$

with $u=0$ on $\partial\mathbb{R}^n_+$.

###### Theorem 3.4

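Assume that $n\ge 3$, $f$ is regular enough and with compact support contained in $\mathbb{R}^n_+$, and consider the integral average transform defined by

$$u(x)=\int_0^\infty\frac{s}{n}\fint_{B_s(x)}f(y)\,1_{B_s(x)\setminus B_s(x-2x_ne_n)}\,dy\,ds. \qquad (27)$$

Then

$$-\Delta u=f\quad\text{in }\mathbb{R}^n_+$$

and $u=0$ on $\partial\mathbb{R}^n_+$.

Proof. We have (see [5, 7])

$$\begin{aligned}
u(x)&=\int_{\mathbb{R}^n}f(y)\big(G(x,y)-G(x-2x_ne_n,y)\big)\,dy &(28)\\
&=\int_0^\infty\frac{s}{n}\Big(\fint_{B_s(x)}f(y)\,dy-\fint_{B_s(x-2x_ne_n)}f(y)\,dy\Big)ds. &(29)
\end{aligned}$$

Here we have used (21). The result follows from recalling that the support of $f$ is contained in $\mathbb{R}^n_+$ and by noting that

$$\fint_{B_s(x)}f(y)\,dy-\fint_{B_s(x-2x_ne_n)}f(y)\,dy=\fint_{B_s(x)}f(y)\,1_{B_s(x)\setminus B_s(x-2x_ne_n)}\,dy.$$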

###### Remark 4

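For $n=2$ we have, for $R$ large enough,

$$G(x,y)-G(x-2x_ne_n,y)=K_R(x,y)-K_R(x-2x_ne_n,y)$$

and therefore

$$\begin{aligned}
&\int_{\mathbb{R}^n}f(y)\big(G(x,y)-G(x-2x_ne_n,y)\big)\,dy &(30)\\
&\qquad=\int_{\mathbb{R}^n}f(y)\big(K_R(x,y)-K_R(x-2x_ne_n,y)\big)\,dy. &(31)
\end{aligned}$$

We conclude the same result as in (27) by taking $R\to\infty$.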

###### Remark 5

In the previous results we write the solution of the Poisson equation in the half space as an integral of averages of the forcing term over balls centered at . In the case the ball , the function is cut to . See Figure 2.

Finally we mention that formula (21) gives us an additional interpretation of the solution of the Poisson equation in domains with boundaries. Instead of subtracting two solutions as in formula (28), we could extend the forcing term in such a way that averages centered at the boundary vanish. For instance, let us consider the domain $\mathbb{R}^n_+$. In order to use (21) to obtain the solution of the Poisson equation in $\mathbb{R}^n_+$ we could extend $f$ to the whole $\mathbb{R}^n$, say by $Ef$, in such a way that the averages of $Ef$ over balls centered at boundary points vanish. We then have the following result as an alternative to Theorem 3.4.

###### Theorem 3.5

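Assume that $n\ge 3$, $f$ is regular enough and with compact support contained in $\mathbb{R}^n_+$, and consider

$$Ef(y)=\begin{cases}f(y),&y\in\mathbb{R}^n_+,\\ 0,&y\in\partial\mathbb{R}^n_+,\\ -f(y-2y_ne_n),&\text{elsewhere},\end{cases}$$

and $u$ defined by

$$u(x)=\int_0^\infty\frac{s}{n}\fint_{B_s(x)}Ef(y)\,dy\,ds. \qquad (33)$$

Then

$$-\Delta u=f\quad\text{in }\mathbb{R}^n_+$$

and $u=0$ on $\partial\mathbb{R}^n_+$.

Proof. Observe that formula (33) coincides with (27).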

Similar arguments can be applied to other domains with simple boundaries such as strips and cubes.

## 4 Final comments

In this short note we introduced the integral average transform. Our motivation comes from the need for a better mathematical understanding of some practical measures such as the prediction accuracy index, which is popular in problems related to predictive security. In this paper we have explained the mathematical and practical context of this application in order to motivate our main definition. The integral average transform is defined for a given function $f$ and it is the result of integrating plain averages of $f$ over a family of sets (containing a point $x$) indexed by the integration argument. See (12) for a precise definition. We show that this procedure is equivalent to computing integrals against a two-variable kernel with a possible singularity at $y=x$. We also show that any kernel integral of the form $\int f(y)K(y,x)\,dy$ can also be interpreted as an integral average transform. Given the ubiquity of kernels in the solution of many problems, we believe that this novel interpretation may be worthy of further investigation. For instance, other kernels can be considered, such as Poisson kernels. We can also associate some dynamics to the parameter $s$.

## References

• [1] S. Cha (2007) Comprehensive survey on distance/similarity measures between probability density functions. City 1 (2), pp. 1.
• [2] S. Chainey, L. Tompson, and Uhlig (2008) The utility of hotspot mapping for predicting spatial patterns of crime. Security Journal 21 (1), pp. 4–28.
• [3] J. M. DeLaurentis and L. A. Romero (1990) A Monte Carlo method for Poisson’s equation. Journal of Computational Physics 90 (1), pp. 123–140.
• [4] G. Drawve (2016) A metric comparison of predictive hot spot techniques and RTM. Justice Quarterly 33 (3), pp. 369–397.
• [5] L. C. Evans (1998) Partial differential equations. Graduate Studies in Mathematics 19.
• [6] A. L. Gibbs and F. E. Su (2002) On choosing and bounding probability metrics. International Statistical Review 70 (3), pp. 419–435.
• [7] D. Gilbarg and N. S. Trudinger (2015) Elliptic partial differential equations of second order. Vol. 224, Springer.
• [8] C. Joshi, S. Curtis-Ham, C. D’Ath, and D. Searle (2021) Considerations for developing predictive spatial models of crime and new methods for measuring their accuracy. ISPRS International Journal of Geo-Information 10 (9), pp. 597.
• [9] G. Mohler, M. Porter, J. Carter, and G. LaFree (2020) Learning to rank spatio-temporal event hotspots. Crime Science 9 (1), pp. 1–12.