We present a multiple criteria model for solving an inverse problem for diffusion models described in terms of Partial Differential Equations (PDEs). The theory of PDEs is crucial for modelling several problems in different disciplines, such as Business, Economics, Engineering, Finance and so on. For instance, it can be used to model the process of innovation and the spread of ideas, the evolution of population dynamics, the dynamics of fluids, option pricing, and many others.
A PDE can be analyzed from both a direct and an inverse perspective: the direct problem is the analysis of the existence, uniqueness, and stability of the solution. Together, these properties are referred to as well-posedness in the sense of Hadamard [14].
The inverse problem, instead, aims to identify causes from effects. In practice, this may be done by using observed data to estimate parameters in the functional form of a model. Usually an inverse problem is ill-posed because some of the properties related to existence, uniqueness, and stability fail to hold. When this happens, it is crucial to identify a suitable numerical scheme that ensures convergence to the solution.
The literature is rich in papers studying ad hoc methods to address ill-posed inverse problems by minimizing a suitable approximation error together with some regularization technique [6, 15, 19, 20]. The collage-based approach used in this paper relies instead on an extension of the Collage Theorem, a consequence of Banach’s fixed point theorem that has proved its importance in solving inverse problems for fixed point equations. The Collage Theorem is also the basis of the collage-based compression algorithm in fractal imaging [1, 12]. It has also been extended to inverse problems for ordinary differential equations and their applications in different fields, and to partial differential equations over solid and perforated domains [2, 13].
The results presented in this paper are a further contribution to this stream of research. We extend the collage-based algorithm for steady-state equations by searching for an approximation that not only minimizes the collage error but also maximizes the entropy and minimizes the sparsity. In this extended formulation, the parameter estimation minimization problem can be understood as a multiple criteria problem with three different and conflicting criteria: the generalized collage error, the entropy associated with the unknown parameters, and the sparsity of the set of unknown parameters. We implement a scalarization technique to reduce the multiple criteria program to a single criterion one, by combining all objective functions with different trade-off weights. Numerical examples confirm that the collage method produces good, but sub-optimal, results. A relatively low-weighted entropy term allows for better approximations, while the sparsity term decreases the complexity of the solution in terms of the number of elements in the basis.
The paper is organized as follows: Section 2 recalls some basic definitions in multiple criteria optimization. Section 3 recalls the main ideas of the Generalized Collage Theorem and how it can be used to solve inverse problems for steady-state equations. Section 4 presents the notion of entropy and how this formulation can be adapted to this particular context. Section 5 introduces the notion of sparsity and its importance in determining the complexity of the approximation. Section 6 formulates the multiple criteria model, while Section 7 illustrates some numerical computations. Section 8 presents an application to an inverse problem in population dynamics and Section 9, as usual, concludes.
2 Basics on Multiple Criteria Optimization
This section focuses on recalling some basic facts in Multiple Criteria Optimization (MCO). In an abstract setting, a finite-dimensional MCO problem (see Sawaragi et al., 1985) can be stated as follows:
$$\max_{x \in X} J(x)$$
where $(X, \|\cdot\|)$ is a Banach space and $J : X \to \mathbb{R}^p$ is a vector-valued functional, and $\mathbb{R}^p$ is ordered by the Pareto cone $\mathbb{R}^p_+$. A point $x \in X$ is said to be Pareto optimal or efficient if $J(x)$ is one of the maximal elements of the set of achievable values $J(X)$. Thus a point $x$ is Pareto optimal if it is feasible and, for any possible $x' \in X$, $J(x) \le_{\mathbb{R}^p_+} J(x')$ implies $x = x'$. In a more synthetic way, a point $x \in X$ is said to be Pareto optimal if $(J(x) + \mathbb{R}^p_+) \cap J(X) = \{J(x)\}$.
Among the different techniques to reduce an MCO problem to a single criterion model is the scalarization technique. Using a scalarization technique, a multiple objective model can be reduced to a single criterion problem by summing up all criteria with different weights. The weight in front of each criterion expresses the relative importance of that criterion for the Decision Maker. More precisely, by scalarization an MCO model boils down to:
$$\max_{x \in X} \sum_{i=1}^{p} \beta_i J_i(x)$$
where $\beta = (\beta_1, \dots, \beta_p)$ is a vector taking values in the interior of $\mathbb{R}^p_+$, namely $\beta_i > 0$ for all $i = 1, \dots, p$. The equivalence between the scalarized problem and the original MCO problem is complete if the $J_i$ are linear and, by varying $\beta$, it is possible to obtain different Pareto optimal points. In the other cases linear scalarization provides only partial results. Other scalarization methods can be found in the literature; one worth mentioning is the Chebyshev scalarization model, which can also be used for non-convex problems. Scalarization can also be applied to problems in which the ordering cone differs from the Pareto one. In this case, one has to rely on the elements of the dual cone to scalarize the multicriteria problem.
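The remark that linear scalarization may provide only partial results can be illustrated on a small finite problem. The sketch below is not taken from the paper: the set `ACHIEVABLE` of achievable values is hypothetical, and the weight sweep is only one possible way to explore the trade-offs. It compares the Pareto optimal points of a two-criteria maximization problem with the points that a weighted sum with strictly positive weights can reach.

```python
from typing import List, Sequence, Tuple

Value = Tuple[float, ...]

def dominates(a: Value, b: Value) -> bool:
    """True if a is at least as large as b in every criterion and larger in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(values: Sequence[Value]) -> List[Value]:
    """Maximal elements of a finite set of achievable values (maximisation)."""
    return [v for v in values if not any(dominates(w, v) for w in values)]

def scalarized_optimum(values: Sequence[Value], weights: Sequence[float]) -> Value:
    """Maximiser of the weighted sum of the criteria."""
    return max(values, key=lambda v: sum(w * x for w, x in zip(weights, v)))

# Hypothetical finite set of achievable values with two criteria.
ACHIEVABLE = [(0.0, 4.0), (4.0, 0.0), (3.0, 2.0), (1.8, 2.5), (1.0, 1.0)]

front = pareto_front(ACHIEVABLE)

# Sweep strictly positive weights (beta, 1 - beta) and collect the optima found.
found = {scalarized_optimum(ACHIEVABLE, (k / 20, 1 - k / 20)) for k in range(1, 20)}

print("Pareto optimal points:", front)
print("Reached by weighted sums:", sorted(found))
```

In this toy set the point (1.8, 2.5) is Pareto optimal but lies below the segment joining (0, 4) and (3, 2), so no choice of positive weights selects it. This is the kind of non-convex situation in which other scalarizations, such as the Chebyshev one, become useful.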
2.2 $\epsilon$-constraint method
The second model that is proposed to solve the vector-valued problem is the $\epsilon$-constraint method. In this methodology one of the objective functions is optimised while the others are added to the constraint part of the model. The method is a hybrid methodology: for $p-1$ of the objective functions, least acceptable levels $\epsilon_i$ have to be set, while the remaining objective function is optimised. The decision maker therefore plays a relevant role in this model, choosing the objective function to be optimised and the least acceptable levels for the objective functions added as constraints. The original vector-valued problem can now be written as:
$$\max_{x \in X} J_k(x) \quad \text{subject to} \quad J_i(x) \ge \epsilon_i, \quad i = 1, \dots, p, \ i \ne k$$
This method has the advantage of being theoretically able to identify Pareto optimal points of non-convex problems as well. However, it also has two potential drawbacks: the identified optimal point is only guaranteed to be weakly Pareto optimal, and the problem might become infeasible due to the additional constraints.
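The advantage and the two drawbacks just mentioned can be seen on a finite example. The following is a minimal sketch under stated assumptions: the sets of achievable values and the acceptance levels are hypothetical, and the function name `epsilon_constraint` is introduced here for illustration only.

```python
from typing import Dict, Optional, Sequence, Tuple

Value = Tuple[float, ...]

def epsilon_constraint(values: Sequence[Value], objective: int,
                       floors: Dict[int, float]) -> Optional[Value]:
    """Maximise criterion `objective` subject to v[i] >= floors[i] for the others."""
    feasible = [v for v in values if all(v[i] >= eps for i, eps in floors.items())]
    if not feasible:
        return None  # the added constraints made the problem infeasible
    # max() returns the first maximiser, so ties are not broken in favour
    # of the other criteria: the result may be only weakly Pareto optimal.
    return max(feasible, key=lambda v: v[objective])

# Hypothetical achievable values; (1.8, 2.5) is Pareto optimal but lies in a
# non-convex part of the front, out of reach of positive weighted sums.
ACHIEVABLE = [(0.0, 4.0), (4.0, 0.0), (3.0, 2.0), (1.8, 2.5), (1.0, 1.0)]

# Advantage: a point in the non-convex part of the front is recovered.
unsupported = epsilon_constraint(ACHIEVABLE, objective=0, floors={1: 2.3})

# Drawback 1: with ties, the point returned can be dominated by another one.
weak_only = epsilon_constraint([(2.0, 1.0), (2.0, 3.0)], objective=0, floors={1: 0.5})

# Drawback 2: levels that are too demanding leave no feasible point.
infeasible = epsilon_constraint(ACHIEVABLE, objective=0, floors={1: 5.0})

print(unsupported, weak_only, infeasible)
```

In practice the weak optimality issue is often handled by a second pass that optimises the remaining criteria among the tied points; that refinement is left out of this sketch.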
2.3 Goal Programming
Another method for solving vector-valued problems that is worth mentioning is Goal Programming (GP). Goal Programming was first introduced by Charnes, Cooper, and Ferguson (1955) and Charnes and Cooper (1961). The innovative idea behind this model is the determination of the aspiration levels of an objective function. This model does not try to find an optimal solution but an acceptable one: it tries to achieve the goals set by the decision maker rather than maximising or minimising the objective functions. Given a set of ideal goals $g_i$, with $i = 1, \dots, p$, chosen by the decision maker, it is possible to re-write the problem in a GP form:
$$\min \sum_{i=1}^{p} \left( w_i^+ \delta_i^+ + w_i^- \delta_i^- \right) \quad \text{subject to} \quad J_i(x) + \delta_i^- - \delta_i^+ = g_i, \quad \delta_i^+, \delta_i^- \ge 0, \quad i = 1, \dots, p, \quad x \in X$$
where $\delta_i^+$, $\delta_i^-$ are the positive and negative deviations (slack variables), respectively, and $w_i^+$, $w_i^-$ are the corresponding weights.
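As a rough illustration of how goals, deviations, and weights interact, the sketch below applies a weighted goal programming rule to a finite set of candidates. It assumes the usual weighted form, in which each criterion's overshoot and shortfall with respect to its goal are penalised separately; the candidate values, goals, and weights are hypothetical.

```python
from typing import Sequence, Tuple

Value = Tuple[float, ...]

def gp_cost(value: Value, goals: Sequence[float],
            w_plus: Sequence[float], w_minus: Sequence[float]) -> float:
    """Weighted sum of positive and negative deviations from the goals."""
    total = 0.0
    for j, g, wp, wm in zip(value, goals, w_plus, w_minus):
        d_plus = max(j - g, 0.0)   # overshoot above the goal
        d_minus = max(g - j, 0.0)  # shortfall below the goal
        total += wp * d_plus + wm * d_minus
    return total

def goal_programming(values: Sequence[Value], goals: Sequence[float],
                     w_plus: Sequence[float], w_minus: Sequence[float]) -> Value:
    """Candidate whose criteria are closest to the goals in the weighted sense."""
    return min(values, key=lambda v: gp_cost(v, goals, w_plus, w_minus))

# Hypothetical achievable values of two criteria.
ACHIEVABLE = [(0.0, 4.0), (4.0, 0.0), (3.0, 2.0), (1.8, 2.5), (1.0, 1.0)]
ONES = (1.0, 1.0)

# Moderate goals: the candidate closest to (2, 3) is selected.
moderate = goal_programming(ACHIEVABLE, (2.0, 3.0), ONES, ONES)

# Goals that no candidate attains: the smallest total shortfall wins.
ambitious = goal_programming(ACHIEVABLE, (4.0, 4.0), ONES, ONES)

print(moderate, ambitious)
```

The selected point changes with the goals rather than with any notion of optimality of the criteria themselves, which is the sense in which GP looks for an acceptable rather than an optimal solution.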
3 Inverse Problems for Steady-State Equations using the Generalized Collage Theorem
The purpose of this paper is to provide an extended multiple criteria algorithm for the estimation of unknown parameters in steady-state equations by combining the collage distance, the entropy, and the sparsity of the set of estimated coefficients. Section 8 will illustrate an application of this approach within the context of population dynamics. Here we recall some basic facts about the collage-based approach. Before formulating the inverse problem for a generic family of problems in variational form, let us have a quick look at how a classical steady-state equation can be reformulated in such a form. Consider the steady-state equation: Find such that
It is well-known that if we take the above model, we multiply both sides by a test function - the space of all continuous and differentiable functions with compact support in - and integrate over , the model can be written as
If we define by the vector of all unknown coefficients in and , the model boils down to the more compact form:
where is a family of bilinear forms defined as
and is a family of linear operators defined as
The steady-state equation can be reformulated as a variational problem for which existence and uniqueness are guaranteed by the classical Lax-Milgram theorem over the space of all integrable functions with integrable weak derivative on (see [4] for more details on this).
More generally, if $H$ is a Hilbert space, consider the following variational equation: Find $u \in H$ such that
$$a_\lambda(u, v) = \phi_\lambda(v)$$
for any $v \in H$, where $\phi_\lambda$ and $a_\lambda$ are families of linear and bilinear maps, respectively, both defined on a Hilbert space $H$ for any $\lambda \in \Lambda$. Let $\langle \cdot, \cdot \rangle$ denote the inner product in $H$, $\|u\|^2 = \langle u, u \rangle$ and $d(u, v) = \|u - v\|$, for all $u, v \in H$. The existence and uniqueness of solutions to this kind of equation are provided by the classical Lax-Milgram representation theorem (see [4]). The following theorem presents how to determine the solution to the inverse problem for the above variational problem. Following our earlier studies of inverse problems using fixed points of contraction mappings, we shall refer to it as a “generalized collage method.”
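(Generalized Collage Theorem) [10] Let $\Lambda$ be a compact subset of $\mathbb{R}^n$, and suppose that, for any $\lambda \in \Lambda$, $a_\lambda : H \times H \to \mathbb{R}$ is a family of bilinear forms and $\phi_\lambda : H \to \mathbb{R}$ is a family of linear forms. Furthermore, suppose that:
There exists a constant $M > 0$ such that $|a_\lambda(u, v)| \le M \|u\| \|v\|$ for any $\lambda \in \Lambda$, for all $u, v \in H$,
There exists a constant $m_\lambda > 0$ such that $a_\lambda(u, u) \ge m_\lambda \|u\|^2$ for any $\lambda \in \Lambda$, for all $u \in H$.
Then, by the Lax-Milgram theorem, for any $\lambda \in \Lambda$ there exists a unique vector $u_\lambda \in H$ such that
$$\phi_\lambda(v) = a_\lambda(u_\lambda, v)$$
for all $v \in H$. Then, for any $u \in H$,
$$\|u - u_\lambda\| \le \frac{1}{m_\lambda} F(\lambda), \qquad F(\lambda) = \sup_{v \in H, \ \|v\| = 1} |a_\lambda(u, v) - \phi_\lambda(v)|.$$
In order to ensure that the approximation $u_\lambda$ is close to a target element $u \in H$, we can, by the Generalized Collage Theorem, try to make the term $F(\lambda)/m_\lambda$ as close to zero as possible. The appearance of the $m_\lambda$ factor complicates the procedure. However, if $\inf_{\lambda \in \Lambda} m_\lambda \ge m > 0$ then the inverse problem can be reduced to the minimization of the function $F(\lambda)$ on the space $\Lambda$, that is,
$$\min_{\lambda \in \Lambda} F(\lambda).$$
In the following sections we use the abbreviation CD to denote the function $F(\lambda)$.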
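To make the collage bound concrete, it may help to look at a finite-dimensional toy case. The sketch below is an illustration under stated assumptions and not the setting used in the paper: it takes the Hilbert space to be the plane, a hypothetical two-parameter bilinear form $a_\lambda(u, v) = v \cdot A(\lambda) u$ with a symmetric positive definite matrix $A(\lambda)$, and $\phi_\lambda(v) = b \cdot v$. In that case the collage distance reduces to the residual norm $\|A(\lambda) u - b\|$, the coercivity constant is the smallest eigenvalue of $A(\lambda)$, and the inverse problem can be attacked by a plain grid search.

```python
import math
from itertools import product

def form_matrix(lam):
    """Symmetric matrix A(lam) of the hypothetical form a_lam(u, v) = v . A(lam) u."""
    k1, k2 = lam
    return ((k1 + k2, -k2), (-k2, k1 + k2))

def apply(A, u):
    return (A[0][0] * u[0] + A[0][1] * u[1], A[1][0] * u[0] + A[1][1] * u[1])

def solve(A, b):
    """Solve the 2x2 system A u = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return ((A[1][1] * b[0] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det)

def coercivity(A):
    """Smallest eigenvalue of a symmetric 2x2 matrix (the constant m_lam)."""
    mean = 0.5 * (A[0][0] + A[1][1])
    radius = math.hypot(0.5 * (A[0][0] - A[1][1]), A[0][1])
    return mean - radius

def collage_distance(lam, target, b):
    """Residual norm |A(lam) target - b|, the collage distance in this toy case."""
    Au = apply(form_matrix(lam), target)
    return math.hypot(Au[0] - b[0], Au[1] - b[1])

def error(lam, target, b):
    """Distance between the target and the true solution u_lam."""
    u_lam = solve(form_matrix(lam), b)
    return math.hypot(target[0] - u_lam[0], target[1] - u_lam[1])

B = (1.0, 0.5)                               # hypothetical right-hand side
LAM_TRUE = (1.0, 2.0)                        # hypothetical true parameters
GRID = [0.5 + 0.25 * i for i in range(11)]   # 0.5, 0.75, ..., 3.0

exact = solve(form_matrix(LAM_TRUE), B)
noisy = (exact[0] + 0.01, exact[1] - 0.01)   # perturbed target element

# Inverse problem: minimise the collage distance over the parameter grid.
best = min(product(GRID, GRID), key=lambda lam: collage_distance(lam, noisy, B))
print("recovered parameters:", best)
print("collage bound:", collage_distance(best, noisy, B) / coercivity(form_matrix(best)))
print("actual error:", error(best, noisy, B))
```

The search never solves the forward problem: only the residual of the target in the equation is evaluated, which is the practical appeal of the collage approach. The forward solve in `error` is used here only to check the bound.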
4 The Notion of Entropy
The concept of entropy, as it is now used in information theory, was developed by C.E. Shannon [18]. Over the years it has been used in different areas and applications in various scientific disciplines. In his article, Shannon introduces the concept of information of a discrete random variable with no memory as a functional that quantifies the uncertainty of a random variable. The concept of entropy describes the level of information associated with an event. More precisely, the definition of Shannon’s entropy [18, 5] satisfies the following properties:
The measure is continuous: changing the value of one of the probabilities by a very small amount should only produce a small change in the entropy;
If all the outcomes are equally likely, then entropy should be maximal.
If a certain outcome is a certainty, then the entropy should be zero.
The amount of entropy should be the same independently of how the process is regarded as being divided into parts.
According to these desiderata, Shannon defines the entropy in terms of a discrete random variable $X$, with possible outcomes $x_1, \dots, x_n$, as:
$$H(X) = -\sum_{i=1}^{n} p(x_i) \log p(x_i)$$
For our purposes, this definition needs to be adapted to deal with a set of parameters that can take both positive and negative values. For a set of parameters the notion of entropy is:
where . In the sequel, rather than maximizing the entropy term - that represents the total amount of information associated with that particular combination of parameters’ values - we will consider the minimization of its opposite, also known as neg-entropy. This criterion will be included in the multiple criteria model illustrated in the following Section 6.
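The four properties listed above for Shannon's entropy can be checked numerically on small distributions. The sketch below implements the classical formula for a discrete probability vector; the choice of the natural logarithm and the convention $0 \log 0 = 0$ are assumptions made here, and the adapted parameter entropy of this section is deliberately not reproduced.

```python
import math

def shannon_entropy(probs):
    """H = -sum_i p_i log p_i (natural logarithm), with the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Equally likely outcomes give the maximal entropy, log(n).
uniform = shannon_entropy([0.25] * 4)
skewed = shannon_entropy([0.7, 0.1, 0.1, 0.1])

# A certain outcome carries no entropy.
certain = shannon_entropy([1.0, 0.0, 0.0, 0.0])

# Continuity: a tiny change in the probabilities changes the entropy only slightly.
perturbed = shannon_entropy([0.25 + 1e-6, 0.25 - 1e-6, 0.25, 0.25])

# Grouping: choosing among (1/2, 1/3, 1/6) directly, or first among (1/2, 1/2)
# and then, half of the time, among (2/3, 1/3), carries the same entropy.
direct = shannon_entropy([1 / 2, 1 / 3, 1 / 6])
staged = shannon_entropy([1 / 2, 1 / 2]) + 0.5 * shannon_entropy([2 / 3, 1 / 3])

print(uniform, skewed, certain, direct, staged)
```

Minimizing the neg-entropy, as done in the sequel, amounts to minimizing the negative of a quantity of this kind.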
5 The Notion of Sparsity
In the literature the notion of sparsity has been widely used to reduce the complexity of a model by taking into consideration only those parameters whose values have a major impact on the solution. In other words, by adding this term we wish to determine solutions that are “simple”, or more precisely sparse. We say that a real vector $x$ in $\mathbb{R}^n$ is sparse when most of the entries of $x$ vanish. We also say that a vector $x$ is $s$-sparse if it has at most $s$ nonzero entries. This is equivalent to saying that the $\ell_0$-pseudonorm, or counting ‘norm’, defined as
$$\|x\|_0 = \#\{ i : x_i \ne 0 \}$$
is at most $s$. The $\ell_0$-pseudonorm is a strict sparsity measure, and most optimization problems based on it are combinatorial in nature, and hence in general NP-hard. To overcome these difficulties, it is common to replace the $\ell_0$ function with relaxed variants or smooth approximations that measure and induce sparsity. One possible variant is to use the $\ell_1$ norm instead, which is a convex surrogate for the $\ell_0$ pseudonorm, defined as
$$\|x\|_1 = \sum_{i=1}^{n} |x_i|$$
It is also the best surrogate in the sense that the $\ell_1$ ball is the smallest convex body containing all $1$-sparse objects of the form $\pm e_i$ (see [3]). Another possibility is to replace the $\ell_0$ pseudonorm with some approximation, as for instance
for a given . It is worth noticing that is a or function (continuous with Lipschitz gradient).
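The difference between the counting pseudonorm and its convex surrogate can be seen on two unit vectors and their midpoint. The following minimal sketch uses hypothetical helper names and covers only the counting measure and the absolute-sum norm; the smooth approximation mentioned at the end of this section is not reproduced.

```python
def l0_pseudonorm(x):
    """Number of nonzero entries of x (the counting 'norm')."""
    return sum(1 for xi in x if xi != 0)

def l1_norm(x):
    """Sum of the absolute values of the entries of x."""
    return sum(abs(xi) for xi in x)

def is_s_sparse(x, s):
    """True if x has at most s nonzero entries."""
    return l0_pseudonorm(x) <= s

e1 = (1.0, 0.0, 0.0)
e2 = (0.0, 1.0, 0.0)
mid = tuple(0.5 * (a + b) for a, b in zip(e1, e2))

# The counting measure is not convex: the midpoint of two 1-sparse vectors
# has two nonzero entries, more than either endpoint.
print(l0_pseudonorm(e1), l0_pseudonorm(e2), l0_pseudonorm(mid))

# The absolute-sum norm is convex: the midpoint does not exceed the
# average of the endpoint norms.
print(l1_norm(e1), l1_norm(e2), l1_norm(mid))
```

This lack of convexity is one way to see why problems built directly on the counting measure are combinatorial, while the surrogate leads to tractable convex programs.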
6 The Model
We now propose a new collage-based approach for solving inverse problems based on Multiple Criteria Optimization, which combines the Collage Distance, the Entropy, and the Sparsity. We consider the following criteria, to be maximized or minimized simultaneously:
$CD(\lambda)$ is the Collage Distance, to be minimized over $\lambda \in \Lambda$. This criterion describes the accuracy of the approximation;
$ENT(\lambda)$ is the Entropy, to be maximized over $\lambda \in \Lambda$. This criterion models the amount of information carried by the model's parameters;
$SP(\lambda)$ is the Sparsity, to be minimized over $\lambda \in \Lambda$. This criterion instead describes the complexity of the solution in terms of the number of basis elements used to approximate the target.
It is worth noticing that these three criteria are, in general, conflicting. Reducing the sparsity criterion $SP$ negatively affects $CD$, as fewer elements in the basis are available to construct the solution. To observe that the Entropy and the Sparsity criteria are also conflicting, let us take a simple example where $X$ is a random variable with only two possible outcomes $x_1$ and $x_2$ with probabilities $p$ and $1-p$, respectively. If $p$ increases beyond $1/2$, and $1-p$ decreases accordingly, $x_1$ gets more and more likely to happen. This would produce a decrement in $ENT$ while the sparsity of the vector $(p, 1-p)$ would increase (see also [16] for a nice discussion on the importance of the concepts of entropy and sparsity). By introducing the neg-entropy $-ENT$, the multiple criteria model can be formulated as a minimization program as follows:
$$\min_{\lambda \in \Lambda} \left( CD(\lambda), \ -ENT(\lambda), \ SP(\lambda) \right)$$
This multiple criteria problem can be transformed into a single criterion model by using one of the approaches presented above. In particular, one can construct the following single-criterion models:
Model 1: We scalarize the model by introducing three different positive weights, namely , , . The scalarized model boils down to:
The next section shows how the method works for different combinations of the weights.
Model 2: We move two out of three criteria into the constraints. Then Model 2 reads as:
Model 3: In the GP formulation, let us , , be the goals of , , respectively. Then Model 3 reads as:
7 A Computational Study
To show the implementation of the algorithm, consider the following steady-state equation:
with true , , and . The following tables show the results of our parameter estimation technique. Here we implement Model 1 presented in the previous section; Models 2 and 3 can be implemented similarly. Let us recall that is the coefficient of the generalized collage distance, is the coefficient of the entropy criterion, and is the coefficient of the sparsity criterion. In the following tables, we denote by the value of the minimal generalized collage distance, by the value of the minimal entropy, and by the value of the minimal sparsity. is the distance between the true and the recovered . We work with overlapping "hat" bases, the first with 11 interior elements, the second with 23 interior elements, so every other element in the finer basis has the same peak point as an element from the coarser basis. We include the "half hats" at each end as well, since we recover in these bases and is nonzero at the endpoints. So, in total there are 38 elements. In particular, Table 4 considers the effects of all three criteria simultaneously. This table shows how the three criteria interact differently when the three weights vary.
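The structure of the overlapping hat bases can be sketched as follows. Several choices here are assumptions rather than details given in the text: the domain is taken to be the unit interval, the coarser basis is read as having 11 interior elements (as in the 11+23=34 count quoted in Section 8), and the third, finer basis used for Table 3 is inferred to have 47 interior elements so that its peaks are shared and the total reaches 87.

```python
from fractions import Fraction

def hat_peaks(n_interior):
    """Peak points of a hat basis on [0, 1]: n_interior interior hats plus two half hats."""
    step = Fraction(1, n_interior + 1)
    return [k * step for k in range(n_interior + 2)]

def hat(peak, width, x):
    """Piecewise-linear hat: 1 at the peak, 0 beyond distance `width` from it."""
    return max(0.0, 1.0 - abs(x - float(peak)) / float(width))

def evaluate(coeffs, peaks, x):
    """Value at x of the piecewise-linear function with the given hat coefficients."""
    width = peaks[1] - peaks[0]
    return sum(c * hat(p, width, x) for c, p in zip(coeffs, peaks))

coarse = hat_peaks(11)   # 11 interior hats + 2 half hats
fine = hat_peaks(23)     # 23 interior hats + 2 half hats
finer = hat_peaks(47)    # inferred size of the third basis

two_level_total = len(coarse) + len(fine)
three_level_total = two_level_total + len(finer)

# Every other peak of a finer basis coincides with a peak of the coarser one.
shared = set(coarse) <= set(fine) <= set(finer)

print(two_level_total, three_level_total, shared)
```

Because each coarse element can also be written as a combination of finer ones, the combined family is redundant. This redundancy is what gives the sparsity criterion room to select a small number of elements.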
Table 1: CD versus ENT
Table 2: CD vs sparsity
Table 3: CD versus SP. Add another set of even finer basis elements, also sharing peak points with the other bases, for a total of 87 elements.
Table 4: How the three criteria interact
8 An Application to Population Dynamics
Population dynamics is an extremely important field and it is the basis of many economic models. Recently a lot of attention has been devoted to a new stream of research, called Economic Geography, that aims at understanding the evolution of population over space and time and the effects of migration flows on economies. A spatial population model can be formulated as follows: Given a compact interval , consider the following differential model with Dirichlet boundary conditions on :
where is the population level at time in location , is the initial distribution of population, and is an exogenous flow of population. For simplicity, we also suppose that is equal to a constant value at and and over time. The steady-state level of population is the unique solution to the model
Let us consider the following numerical example where , , , , and . In this case the true solution is .
The following tables present the results for the inverse problem. The first table shows the results with no noise added, interior data points, and 11+23=34 interior basis functions. The second table, instead, reports the results of the algorithm with relative noise added to the data.
9 Conclusion
The analysis of inverse problems for dynamical systems driven by differential equations is a crucial area in applied science. In practical applications, it is important to be able to estimate the unknown parameters of a given equation starting from samples of the solution collected by experiments or observations. In this paper we have extended a previous algorithm, based on the so-called "Collage Distance", to estimate the unknown parameters of steady-state equations. This extended version also includes the notions of entropy and sparsity, and it is formulated as a multicriteria model. These three criteria are conflicting by nature, as an increment in precision usually implies an increment in sparsity. We have solved the model using a scalarization technique with different weights and conducted several numerical experiments to show how the method works in practice.
References
- [1] Barnsley M, Fractals Everywhere, Academic Press, New York, 1989.
- [2] M.I. Berenguer, H. Kunze, D. La Torre, M. Ruiz Galán, Galerkin method for constrained variational equations and a collage-based approach to related inverse problems, J. Comput. Appl. Math. 292 (2016), 67–75.
- [3] E. J. Candès (2014), Mathematics of sparsity (and a few other things), Proceedings of the International Congress of Mathematicians, Seoul, South Korea, 2014.
- [4] L.C. Evans (2010), Partial Differential Equations, Graduate Studies in Mathematics, American Mathematical Society.
- [5] F. Flores Camacho, N. Ulloa Lugo, H. Covarrubias Martínez (2015), The concept of entropy, from its origins to teachers, Revista Mexicana de Física E 61, 69–80.
- [6] A. Kirsch, An introduction to the mathematical theory of inverse problems, Springer, 2011.
- [7] Kunze H and Vrscay E R, Solving inverse problems for ordinary differential equations using the Picard contraction mapping, Inverse Problems 15 (1999) 745–770.
- [8] Kunze H and Gomes S, Solving An Inverse Problem for Urison-type Integral Equations Using Banach’s Fixed Point Theorem, Inverse Problems 19 (2003) 411–418.
- [9] Kunze H, Hicken J and Vrscay E R, Inverse Problems for ODEs Using Contraction Maps: Suboptimality of the “Collage Method”, Inverse Problems 20 (2004) 977–991.
- [10] Kunze H, La Torre D and Vrscay E R, A generalized collage method based upon the Lax–Milgram functional for solving boundary value inverse problems, Nonlinear Anal. 71 12 (2009) e1337–e1343.
- [11] Kunze H, La Torre D, Vrscay E R, Solving inverse problems for DEs using the collage theorem and entropy maximization, Applied Mathematics Letters, 25 (2012), 2306–2311.
- [12] Kunze H, La Torre D, Mendivil F and Vrscay E R, Fractal-based methods in analysis, Springer, 2012.
- [13] Kunze H and La Torre D, Collage-type approach to inverse problems for elliptic PDEs on perforated domains, Electronic Journal of Differential Equations, 48, 2015.
- [14] J. Hadamard, Lectures on the Cauchy problem in linear partial differential equations, Yale University Press, 1923.
- [15] F.D. Moura Neto, A.J. da Silva Neto, An Introduction to Inverse Problems with Applications, Springer, New York, 2013.
- [16] G. Pastor, I. Mora-Jimenez, R. Jantti, and A.J. Caamano (2013), Mathematics of Sparsity and Entropy: Axioms, Core Functions and Sparse Recovery, Proceedings of the Tenth International Symposium in Wireless Communication Systems (ISWCS 2013).
- [17] Sawaragi, Y., Nakayama, H., Tanino, T. (1985). Theory of multiobjective optimization (Academic Press, Inc.)
- [18] C.E. Shannon (1948), A Mathematical Theory of Communication, Bell System Technical Journal, 27 (3), 379–423.
- [19] A.N. Tychonoff, N.Y. Arsenin, Solution of Ill-posed Problems, Washington: Winston & Sons, 1977.
- [20] C.R. Vogel, Computational Methods for Inverse Problems, SIAM, New York, 2002.