Many problems in the natural sciences involve finding an explanatory model that best fits a set of field data. Such problems are called inverse problems[2, 20]. Examples of inverse problems are varied and include tomography, oceanographic sensing, gravitational and seismic sensing, and magnetotellurics. Inverse problems are often under-determined; that is, there are many models that explain the field data. Because of this multiplicity of explanations there is a role for population-based stochastic search in solving inverse problems. By providing a population within each run, and by producing a different population between runs, these search algorithms deliver a variety of well-fitting solutions. Field experts can then query these solutions to look for distributions of features.
Unfortunately, inversion using stochastic search is computationally expensive and can even be infeasible when models are described by many parameters. This is typically the case in magnetotellurics, where 3D models defining the resistivity of the Earth's subsurface can consist of thousands of parameters. In earlier work we described a methodology, called blob-modelling, which reduced the number of parameters needed to describe magnetotelluric models. This parameter reduction made stochastic search practical for 3D magnetotelluric inversion. A later case study showed that, while the blob-modelling method was reasonably effective in a variety of settings, it was sensitive to the starting configurations for the search process. We also found that this initial approach struggled to represent more detailed models.
This paper describes a substantially refined approach to inversion of problems involving 2D and 3D geometries. The technique is a multi-stage search involving an initial greedy search phase, followed by interleaved stages of evolutionary search, model simplification (culling) and cellular division (blob-splitting). The cellular division phase is carried out by splitting the ellipse functions that make the largest contribution to fitness. We also substantially improve the representation by adding a strength parameter to each ellipse function to smoothly express the degree of local dominance.
These changes have improved on the outcomes and reliability of our previous work. The cell-growth model in particular allows detailed models to be evolved with more success than our previous all-at-once approach.
The contributions of this paper are as follows: we describe a novel approach to inversion that combines evolutionary search with cellular division; we describe an improved way to express 2D and 3D models using ellipse functions with a strength parameter that denotes the degree to which an ellipse is in the foreground; and we describe an improved initial search phase, based on the Hough transform but applicable to a large number of diffuse ellipse functions of different colours.
We apply our technique to an artificial benchmark problem of discovering unknown 2D pictures. We then extend the technique back to the original problem of 3D magnetotelluric inversion. We conduct experiments that demonstrate the advantages of adding blob-splitting to the search process. We show that our technique ports between the benchmark problems with very little modification, which suggests that the technique may generalise to other fields.
1.2 Paper Structure
The rest of this article is laid out as follows. In section 2 we put our work in the context of other work in the problem domain. In section 3 we describe our technique as applied to both 2D and 3D problems in picture discovery and magnetotelluric inversion. In section 4 we describe the customisation of our framework to our two benchmark problem domains. In section 5 we present our results and, finally, section 6 concludes and canvasses future work.
2 Related Work
There is a long history of the application of stochastic search to inverse problems. This history is reflected in stochastic approaches to inversion in magnetotellurics (MT)[10, 16, 6, 15, 18, 21, 12, 11, 8]. The early approaches included Monte-Carlo techniques[10, 16, 6], simulated annealing and evolutionary methods[15, 18, 21, 12, 11].
Work using evolutionary methods is diverse, ranging from direct evolution of 2D models using specialised operators (Pérez-Flores and Schultz); through hybrid schemes combining genetic algorithms and local search (Tong et al.); to Pareto-optimisation techniques that balance model fitness against model smoothness (Moorkamp, Jones, and Fishwick). Work in 3D inversion has been limited by the large number of parameters involved. Liu, Li, et al. inverted 3D models of 3600 parameters using a real-valued GA with informed mutation operators and inequality constraints to prevent divergence. (Anonymised)[8, 9] used reduced-parameter representations and model priming to invert 3D models; this latter work forms the foundation for the work described here. Some work has also been done on reduced-parameter representations for MT[10, 17]; our work differs in the number of separate elements and in making fewer assumptions about prior knowledge of the model.
In relation to picture discovery, because our benchmark was deliberately set up to take no advantage of information beyond a scalar error value, there are few direct analogues to this benchmark in the signal-processing literature. However, there are works that share features. The scanning step in the priming stage is similar to some implementations of the Hough transform. Our work is also related to work in circle detection and image segmentation[1, 4]. More recently, Cuevas et al. used a competitive evolutionary process, based on animal behaviour, to detect ellipses in an image.
Our work differs in that it expresses a whole picture in terms of ellipses rather than detecting specific elements within pictures. In addition, our work does not exploit priors, which makes it much slower for applications where detailed prior information is available but much more applicable to situations where prior knowledge is limited.
3 Methodology and Background
The aim of our method is to perform 2D and 3D spatial inversion. The search process cyclically refines guesses of the model in response to error values. An inverse problem can be formulated as d = F(m), where d is the data signal (responses) collected and m is the model being sought. The function F is the forward function, which maps a model to its data signal. Given d and F it is possible to derive m through a process of estimation and refinement as outlined in Algorithm 1.
The input to the process is a field data vector. The output is the set of best models produced by the process. The process first initialises a population of candidate models. It then iterates through successive steps of improvement and testing. The improve function perturbs the models through either random or informed search heuristics and can vary during the search process. The forward function takes the models and uses them to produce simulated responses that can then be compared to the field data. This forward-modelling step is key to the inversion process and is different for each problem domain. The loop terminates when the minimum error falls below a threshold, at which point the best models are returned.
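The estimate-and-refine loop of Algorithm 1 can be sketched as follows. This is a minimal illustration rather than the framework's actual implementation: the population size, the hill-climbing improve step, and the toy forward function (which simply echoes the model) are all assumptions for demonstration.

```python
import random

def invert(field_data, forward, n_models=10, n_params=4,
           err_threshold=1e-3, max_steps=2000, seed=0):
    """Sketch of Algorithm 1: estimate-and-refine inversion."""
    rng = random.Random(seed)
    # Initialise a population of candidate models (random parameter vectors).
    models = [[rng.uniform(0.0, 1.0) for _ in range(n_params)]
              for _ in range(n_models)]

    def error(m):
        # Scalar misfit between the simulated response and the field data.
        return sum((a - b) ** 2 for a, b in zip(forward(m), field_data))

    for _ in range(max_steps):
        # "improve": perturb each model, keeping a perturbation if it helps.
        for i, m in enumerate(models):
            cand = [p + rng.gauss(0.0, 0.05) for p in m]
            if error(cand) < error(m):
                models[i] = cand
        if min(error(m) for m in models) < err_threshold:
            break
    # Return all models, best first, so a variety of fits can be inspected.
    return sorted(models, key=error)

# Toy problem: the forward function echoes the model, so the target
# model equals the field data itself.
target = [0.2, 0.8, 0.5, 0.3]
best = invert(target, forward=lambda m: m)[0]
```

Returning the whole final population, rather than a single model, reflects the role of population-based search noted in the introduction: experts can query several well-fitting solutions.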
In this work our two benchmarks have different forward-modelling functions. For the picture discovery (PD) problem our forward model samples our 2D model representation into a 100-by-100 grayscale picture. For magnetotellurics (MT) our forward model samples our 3D model representation into a hexahedral mesh. This mesh is then given to an MT forward-modelling function written by Siripunvaraporn which generates a simulated field response for comparison to the real field response.
We describe our model representation, evaluative function, and search process next. In our discussion we describe the process for the PD problem and highlight any modifications required for the MT problem. We also highlight the differences in methodology from our previously published work.
3.1 Model Representation

Our representation needs to be compact enough to make stochastic search feasible but expressive enough to admit realistic models. We represent our model as a background colour combined with a set of overlapping diffuse ellipse functions.
The background value is new to this work and allows the model's background colour to be determined as part of the search process. (We found that including this term helped search performance in our early experiments.)
The ellipse functions (blobs) are parameterised by central intensity, strength, diffuseness, position, size, and rotation. These parameters are described in Table 1. All parameters are normalised to the range [0, 1] and represent values pertinent to their purpose. Thus a value of 0 for the x-position will place the centre of the blob on the left side of the image/model; a value of 0.5 for the rotation will rotate the blob by π (in theory, because ellipses are symmetrical, we could limit rotations to half this range without loss of generality, but this can make search difficult for values just beyond that limit); and a value of 0.5 for the x-radius will give the blob an x-radius of half the size of the model.
(Table 1 lists these parameters; the rotation parameter denotes rotation about the z axis.)
For the 3D ellipses used for the MT problem we add parameters for z-location and z-size, and for rotation about the x and y axes of the ellipse.
The strength parameter is new to this work and defines the extent to which the blob appears in the foreground or background.
The value at a location in a picture or model element is determined both by the local intensities of the blob functions that overlap at that location and by the relative strengths of those blobs at that location.
The local intensity of a blob function at a location is computed from the blob's parameters as follows. The position and size parameters are first transformed to account for the translation and rotation of the blob. The central intensity appears in the numerator and sets the peak intensity of the blob. The diffuseness parameter determines how sharply this intensity trails off: a value close to one produces a very sharp edge while a value close to zero produces a very diffuse ellipse. Generalisation to three dimensions entails adding a z term to the equation.
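One plausible local-intensity function consistent with this description is sketched below. The falloff expression is an assumption, chosen to reproduce the stated behaviour: peak intensity c at the centre, with the diffuseness parameter d controlling how sharply intensity trails off at the ellipse boundary.

```python
import math

def blob_intensity(x, y, c, d, px, py, rx, ry, theta):
    """Local intensity of a diffuse 2D ellipse function (sketch).

    c: central (peak) intensity; d in [0, 1): diffuseness control,
    with d near 1 giving a sharp edge and d near 0 a diffuse one;
    (px, py): centre; rx, ry > 0: radii; theta: rotation in radians.
    The falloff form is an illustrative assumption.
    """
    # Translate to blob-centred coordinates, then undo the rotation.
    dx, dy = x - px, y - py
    xr = dx * math.cos(theta) + dy * math.sin(theta)
    yr = -dx * math.sin(theta) + dy * math.cos(theta)
    # Normalised elliptical distance: r2 = 1 on the ellipse boundary.
    r2 = (xr / rx) ** 2 + (yr / ry) ** 2
    # Sharpness exponent grows as d -> 1, so intensity stays near c
    # inside the ellipse and falls quickly outside it.
    k = 1.0 / (1.0 - min(d, 1.0 - 1e-9))
    return c / (1.0 + r2 ** k)
```

For a 3D blob the same form applies with a z term added to the normalised distance.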
At each point of the model the local intensities of each blob are combined to produce a value for that location. In previous work we combined intensities using a generalised mean, but this method tended to smear features of moderate intensity. In this work we use the strength parameter to determine the extent to which each blob is locally dominant. Local dominance can be viewed as the extent to which the blob is in the foreground.
To create a combined model value we first assign the background intensity to a dummy blob and set its strength to zero. Then, for each remaining blob, we normalise the local intensity with respect to the maximum intensity of that blob. We also normalise strengths with respect to the maximum-strength blob. To help make foreground and background blobs more distinct we raise this normalised strength to the sixth power. From these adjusted strengths and normalised intensities the combined local value is produced by equation 1,
where bg represents the extent to which the model background influences this location. Its dual, fg, denotes the extent to which the foreground influences this location. The denominator term wi is a weighted intensity for the whole model and is used to preserve the influence of the background term when the overall model is low intensity at the given location. Equation 1 allows relatively high-strength blobs to dominate the value at each location. Conversely, if a location is overlapped only by low-strength blobs, the background term will dominate. Extending Equation 1 to the third dimension trivially involves the addition of a z term. The effect of the strength parameter on blob influence is illustrated in Fig 1.
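The combination rule can be sketched as follows. The dummy background blob, per-blob intensity normalisation, and sixth-power strength weighting follow the description above; the exact arithmetic of the weighted sum and the background-preserving denominator is an assumption.

```python
def combine(intensities, maxima, strengths, background, power=6):
    """Combine local blob intensities into one model value (sketch).

    intensities: each blob's local intensity at this location;
    maxima: each blob's peak intensity (for normalisation);
    strengths: each blob's strength parameter;
    background: the model's background intensity.
    """
    if not intensities:
        return background
    # Normalise local intensities with respect to each blob's peak.
    norm_i = [i / m for i, m in zip(intensities, maxima)]
    # Normalise strengths against the maximum-strength blob and raise
    # to the sixth power to separate foreground from background blobs.
    smax = max(strengths)
    w = [(s / smax) ** power if smax > 0 else 0.0 for s in strengths]
    # wi: weighted intensity of the foreground at this location.
    wi = sum(wj * nj for wj, nj in zip(w, norm_i))
    # The constant 1 in the denominator preserves the background's
    # influence where the foreground is weak.
    fg = sum(wj * nj * ij for wj, nj, ij in zip(w, norm_i, intensities))
    return (background + fg) / (1.0 + wi)
```

With this weighting, a location covered by a high-strength blob takes its value largely from that blob, while a location covered only by weak blobs reverts toward the background.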
Finally, it should be noted that the combined value lies in a fixed range and has to be normalised for each application as described in section 4.
3.2 Evaluative Function
Our framework evaluates each individual by measuring the difference between the response of the individual and the input data for the inversion. For an individual evaluation this difference is represented by a single scalar error value, which is the only value used to guide the search process. In this work we do not attempt to add a regularisation term to our error value, primarily because the use of blobs keeps models relatively simple. The details of the extraction of the response value for our picture-discovery and magnetotellurics benchmarks are described in section 4.
3.3 Search Process
The search process is defined by the algorithm shown in Algorithm 2.
The first step in the algorithm is prime. This greedy function takes a blank model and adds and adjusts blobs until no further improvements can be made. Each blob is added by scanning a diffuse light blob and then a diffuse dark blob across the model until the position returning the highest fitness for each is found. The position, shape, orientation and intensity are then refined by half-interval search until no further improvements can be made. The priming function then keeps whichever of the refined dark or light blobs contributes most to model fitness and discards the other. The background is then adjusted to improve fitness. This process of adding blobs is repeated until no further improvement in fitness is possible.
It should be noted that this initial, scanning, stage of blob addition is similar to the process for the Hough transform. However, priming differs in that the search for the non-positional parameters is greedy and non-exhaustive. The priming process also differs from our earlier work, which started with a user-defined number of blobs in set locations before proceeding straight to half-interval search. In that previous work it was challenging to profitably place more than 16 blobs, whereas the current priming process, depending on target model complexity, can routinely place more than 30 blobs.
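The half-interval refinement used during priming can be sketched as a one-dimensional search that halves its step size whenever no improving move is found. The function name, stopping rule and iteration guard below are illustrative assumptions; err evaluates the model error with the parameter set to a candidate value.

```python
def refine_param(err, value, lo=0.0, hi=1.0, tol=1e-3, max_iters=10000):
    """Half-interval refinement of a single blob parameter (sketch)."""
    step = (hi - lo) / 2.0
    best, best_err = value, err(value)
    for _ in range(max_iters):
        if step <= tol:
            break
        moved = False
        # Try a move of the current step size in either direction.
        for cand in (best - step, best + step):
            if lo <= cand <= hi and err(cand) < best_err:
                best, best_err = cand, err(cand)
                moved = True
        # No improving move at this scale: halve the interval.
        if not moved:
            step /= 2.0
    return best

# Refining a parameter against a simple quadratic error surface.
refined = refine_param(lambda v: (v - 0.3) ** 2, 0.9)
```

In the framework this refinement would be applied in turn to each of a blob's position, shape, orientation and intensity parameters.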
After priming, Algorithm 2 invokes the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to optimise the parameters of the best model from the priming stage. After CMA-ES the best value from the population is extracted and a loop is entered. The first operation in the loop is culling. This greedy operation tests the contribution of each blob and deletes any blob whose contribution is negative. The next operation is splitting, which divides a user-defined number (usually 3-5) of the most significant blobs into two. The division takes place along the longest axis of the original blob; the resulting blobs are 2/3 the radius of the original blob and are shifted along the long axis so that they just touch. The effect of splitting on a picture-discovery benchmark image is shown in figure 2.
The aim of splitting is two-fold: first, it allows significant elements to become further refined; second, it disrupts the model, allowing a new phase of exploration. After splitting, the model is further evolved and culled in later iterations until a set limit on the number of splits is reached.
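The splitting operation described above can be sketched directly: a blob divides along its longest axis into two children of 2/3 the parent radius, shifted so that they just touch. The dictionary field names are illustrative, not the framework's actual representation.

```python
import math

def split_blob(blob):
    """Split a 2D blob into two children along its longest axis (sketch)."""
    long_axis = 'rx' if blob['rx'] >= blob['ry'] else 'ry'
    r_child = blob[long_axis] * 2.0 / 3.0
    theta = blob['theta']
    # Direction of the long axis after the blob's rotation.
    if long_axis == 'rx':
        ux, uy = math.cos(theta), math.sin(theta)
    else:
        ux, uy = -math.sin(theta), math.cos(theta)
    children = []
    for sign in (-1.0, 1.0):
        child = dict(blob)
        child[long_axis] = r_child
        # Shift each child by its own radius so the pair just touch
        # at the parent's centre.
        child['px'] = blob['px'] + sign * r_child * ux
        child['py'] = blob['py'] + sign * r_child * uy
        children.append(child)
    return children
```

Because each child is offset by exactly its own new radius, the two children meet at the parent's centre, preserving the rough footprint of the original blob while doubling the representational detail there.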
This concludes the description of the search algorithm. Next we describe the setup for the benchmarks for our experiments.
4 Benchmark Setup
We use two types of benchmarks in our experiments: picture-discovery (PD) and magnetotellurics (MT). For picture discovery we use the target images shown in Fig 3(a). For PD our evaluative function uses Equation 1 to sample its input genome at each position of a 100x100 grayscale image. These samples are then scaled to discrete integers in the range 0-255 to create pixel values in a grayscale picture. The error is calculated by subtracting the target picture from the candidate picture pixel-wise and averaging the absolute values of the result.
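The PD error computation amounts to a mean absolute pixel difference, which can be written directly (the flat-list picture representation here is an assumption):

```python
def pd_error(candidate, target):
    """Mean absolute pixel error between candidate and target pictures.

    Both pictures are flat sequences of integer grey levels in 0-255;
    a 100x100 image is a sequence of 10000 values.
    """
    assert len(candidate) == len(target)
    diffs = (abs(c - t) for c, t in zip(candidate, target))
    return sum(diffs) / len(target)
```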
For the MT benchmarks we aim to model the resistivity of a 3D segment of planetary crust by finding a model to match the response of ground-based field stations to naturally occurring electromagnetic radiation. To do this we sample the genome into each element of a 3D hexahedral mesh. The samples are mapped into resistivity by a mapping that allows the model to express relatively shallow, conductive earth with reasonable fidelity. To evaluate the error for a model, the sampled model is run through a forward-modelling function using Siripunvaraporn's wsinv3dmt to produce an artificial response that can then be subtracted from the field response to obtain an RMS error value.
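A logarithmic map is a natural choice for this sample-to-resistivity mapping, since subsurface resistivity spans several orders of magnitude and a log scale gives finer resolution at the conductive (low-resistivity) end of the range, matching the stated aim of expressing shallow, conductive earth with good fidelity. The specific form and bounds below are assumptions rather than the paper's exact equation.

```python
def to_resistivity(v, rho_min=1.0, rho_max=1000.0):
    """Map a normalised model sample v in [0, 1] to resistivity in ohm-m.

    Interpolates log-linearly between illustrative bounds rho_min and
    rho_max; these are not the paper's actual values.
    """
    return rho_min * (rho_max / rho_min) ** v
```

Under this map, equal steps in v correspond to equal multiplicative steps in resistivity, so half the parameter range is devoted to the conductive decades.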
Our first MT benchmark is the COMMEMI 3d2 model. This model is one of a series of models developed to test MT inversion techniques. It is a reasonably challenging target in that it has highly contrasting bodies near the surface and alternating conductive and resistive layers. We use a relatively low-resolution model of 13x14x10 cells to keep the evaluation time for each model to less than 2 seconds. To create artificial field data from the model we use wsinv3dmt to run a forward model converting the model parameters into impedance tensors for five frequencies (2Hz to 0.01Hz) measured by a set of simulated field stations positioned over the model.
Our second MT benchmark is real field data from a site near Paralana in South Australia that shows promise as an enhanced geothermal power source. The response data was collected from 54 stations and processed into tensors at six frequencies ranging from 50Hz to 0.01Hz. For the inversion we again use a low-resolution model (13x13x10 cells).
5 Results

We conducted three experiments (one picture discovery and two MT). Our first experiment was applied to the PD benchmark, with splitting (PDSplit) and without splitting (PDNoSplit). This experiment was used to verify that the framework could be used to model complex environments and that splitting is potentially profitable.
Our second experiment applied the framework to the COMMEMI 3d2 MT benchmark. In this experiment we compare the results of multiple runs using splitting (MTSplit) and no splitting (MTNoSplit), and a single run of the standard (deterministic) wsinv3dmt gradient-search program (MTWsinv).
Our third experiment (ParaSplit) completes several runs of the configuration used in MTSplit on the Paralana data set. We compare the runs to each other and to the outcome of wsinv3dmt (ParaWsinv) applied to the data set.
All experiments were conducted on a 48-processor Intel i7 platform with 64GB of memory. In all cases we configured CMA-ES with , and . We describe each experiment in turn.
5.1 Picture Discovery
In this experiment we ran a splitting (PDSplit) and non-splitting (PDNoSplit) version of our search algorithm once on each of the PD benchmarks pictured in Figure 3(a). PDSplit ran with five splits of up to five blobs each interspersed with six runs of CMA-ES with a maximum of 10000 iterations (producing a maximum of 3000000 evaluations in the CMA-ES stage). The PDNoSplit algorithm ran with the same priming as PDSplit and one stage of CMA-ES with a maximum of 60000 iterations.
The results show that PDSplit (part (d)) performs visibly better on all three benchmarks than PDNoSplit (part (c)). The error values for PDSplit are also smaller by 26%, 17% and 12% respectively on the three benchmarks. However, PDNoSplit terminated after only 6000 of its 60000 iterations due to lack of progress. This means that PDSplit could be benefiting both from the additional model parameters produced by the splitting and from the additional evaluations enabled by restarting CMA-ES after splitting. PDSplit was able to run between 6000 and 9000 iterations in each of its CMA-ES phases before terminating for flat fitness. In total each run for the PDSplit benchmark took approximately 3 days, with 98% of the processing time devoted to CMA-ES.
Finally, it should be noted that a 2D version of our previous framework (implemented early in the refinement process) was not able to make significant progress in the priming stage of the benchmarks.
5.2 Magnetotellurics: COMMEMI 3d2
For the MT experiments we used the same model setup and number of evaluations as for our previous proof-of-concept work to allow for a comparison with the fittest values produced for that work. The target model is the COMMEMI 3d2 model pictured in Fig 4(a).
We ran two trials of 20 and 17 runs respectively (three of the runs in the second trial were aborted due to system limits). In the first trial (MTSplit) we primed the model to 4 blobs and then ran CMA-ES four times with 1000 iterations each, interspersed with splitting and culling to bring the model to 13 blobs. In the second trial (MTNoSplit) we primed to 13 blobs and ran CMA-ES once with 4000 iterations, thus equalising the number of blobs and the number of iterations. For reference we also ran a deterministic gradient-search inversion using wsinv3dmt for 50 iterations. In all experiments the starting background was set to the background resistivity at the surface of the target model.
The results of the experiment are summarised in Table 2.
MTSplit performed significantly better than MTNoSplit under the Wilcoxon rank-sum test. Note again that wsinv3dmt, from a given starting model, is deterministic, so all runs yield the same results. The best value for wsinv3dmt was achieved in the 31st iteration.
Fig 4 shows cross-sections for the target model (part (a)) and the best models produced by MTNoSplit, MTSplit and wsinv3dmt respectively.
As can be seen, MTSplit appears to produce the best approximation to the target model, especially at the surface where signals are most distinct. The conductive layer at depth is thinner than in the target but, given the relative opacity of conductive layers in MT, these models will appear nearly identical to the surface sensors. (This is, in part, an artifact of MT inversion being under-determined: MT is analogous to tomography carried out from one side of the target object.) It should be noted that the performance of wsinv3dmt, being a gradient-search method, will depend on the starting model as well as the model resolution, so further experiments are needed for a fully valid comparison with the other methods. It should also be noted that wsinv3dmt is much faster than the stochastic methods, taking 2 hours of runtime in contrast to 4 days for MTNoSplit and MTSplit. As an additional note, both MTSplit and MTNoSplit outperformed the runs described in our earlier work, which produced an RMS of 1.28.
In order to contrast the evolutionary processes followed by MTNoSplit and MTSplit we plot the log of error values against evaluations in Fig 5.
The figure shows that the best MTSplit run exhibits improved performance through faster evolution of simpler models at the start (4 blobs versus 13 blobs) and through the restarting of evolution after the splits, which are evident as peaks in the error value. MTNoSplit also takes slightly longer to get started due to a longer priming stage (approximately 19000 evaluations versus 8000 evaluations for MTSplit).
5.3 Magnetotellurics: Paralana
As with the COMMEMI 3d2 model, we used the same model parameters and data as in our previous work. In this experiment we conducted 5 trial runs with splitting (ParaSplit) and one reference run of 50 iterations with wsinv3dmt (ParaWsinv). Each run of ParaSplit primed to 4 blobs and interspersed 5 runs of CMA-ES (1000 iterations each) with 4 rounds of splitting and culling, with each round adding 4 blobs (culling removed approximately two blobs in each run). The runs for ParaSplit each took 5 days to complete. The RMS values of the ParaSplit runs ranged between 2.440 and 2.556. Four out of five of these runs are slightly better than the RMS of 2.51 from our previous experiments. The ParaWsinv run performed better than ParaSplit, with a minimum RMS of 2.18 in its third iteration. Figure 6 shows the models produced by the best three runs of ParaSplit and by ParaWsinv respectively.
All four models in the figure exhibit some common structure, with a relatively conductive surface layer underpinned by a thick region of less conductive material and indications of more conductivity at depth. However, the three ParaSplit models (the remaining two ParaSplit models are very similar in structure to the three displayed) have much more extensive regions of moderate resistivity compared to ParaWsinv, which has a smaller region of much higher resistivity close to known structures of hot dry rocks (marked with a black arrow in the image). ParaWsinv also has a strongly defined area of higher conductivity at depth (marked with a red arrow). This corresponds with a posited heat source arising from radioactive decay in host rocks. One possible explanation for these differences is that the experimental setup for ParaSplit capped the maximum resistivity, which prevents its models expressing the higher levels of resistivity shown in part (d). The larger areas of moderate resistivity in parts (a) through (c) may be compensating for an inability to express higher resistivity, but confirmation of this will require further experimentation.
6 Conclusions and Future Work
In this paper we described a new methodology for refining spatial models during inversion. We have demonstrated that this methodology can enhance search and enable the building of more detailed 2D models in picture-discovery benchmarks. We applied our methodology to a popular 3D magnetotelluric benchmark and demonstrated significant advantages in search compared to non-splitting search and to our earlier published results. We also showed that our methodology can be applied to real 3D models, though perhaps with some caution in setting the search-space bounds for resistivity.
This work can be extended in several ways. First, for problems, such as picture discovery, where significant prior knowledge is available, exploitation of that knowledge can be built into the priming stage of the search. Second, the technique should be applied to higher-resolution models to verify that its performance is preserved as sampling intervals are decreased. Finally, given the relatively small number of parameters in our models, it may be possible to automatically learn a surrogate magnetotelluric forward-modelling function, at least for very simple models. Such a function could potentially speed up the initial stages of inversion by two or three orders of magnitude.
Thank you to Stefan Thiel and Jared Peacock for their support and advice in the development of the refinements described in this paper. Thank you to Petratherm Pty Ltd for access to the Paralana modelling data.
-  V. Ayala-Ramirez, C. H. Garcia-Capulin, A. Perez-Garcia, and R. E. Sanchez-Yanez. Circle detection on images using genetic algorithms. Pattern Recognition Letters, 27(6):652–657, 2006.
-  D. Colton, H. Engl, A. K. Louis, J. McLaughlin, and W. Rundell. Surveys on solution methods for inverse problems. Springer Science & Business Media, 2012.
-  E. Cuevas, M. González, D. Zaldívar, and M. Pérez-Cisneros. Multi-ellipses detection on images inspired by collective animal behavior. Neural Computing and Applications, 24(5):1019–1033, 2014.
-  E. Cuevas, H. Sossa, et al. A comparison of nature inspired algorithms for multi-threshold image segmentation. Expert Systems with Applications, 40(4):1213–1219, 2013.
-  S. E. Dosso and D. W. Oldenburg. Magnetotelluric appraisal using simulated annealing. Geophysical Journal International, 106(2):379–385, 1991.
-  H. Grandis, M. Menvielle, and M. Roussignol. Thin-sheet electromagnetic inversion modeling using Monte Carlo Markov Chain (MCMC) algorithm. Earth Planets Space, 54:511–521, 2002.
-  N. Hansen, S. D. Müller, and P. Koumoutsakos. Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES). Evolutionary Computation, 11(1):1–18, 2003.
-  john doe. deidentified publication as per conference requirements. noid, 2100.
-  john doe. deidentified publication as per conference requirements. noid, 2100.
-  A. G. Jones and R. Hutton. A multi-station magnetotelluric study in southern Scotland: Monte-Carlo inversion of the data and its geophysical and tectonic implications. Geophysical Journal of the Royal Astronomical Society, 56(2):351–368, 1979.
-  B. Liu, S. Li, L. Nie, J. Wang, Q. Zhang, et al. 3D resistivity inversion using an improved genetic algorithm based on control method of mutation direction. Journal of Applied Geophysics, 87:1–8, 2012.
-  M. Moorkamp, A. G. Jones, and S. Fishwick. Joint inversion of receiver functions, surface wave dispersion, and magnetotelluric data. J. Geophys. Res., 115, 2010.
-  P. Mukhopadhyay and B. B. Chaudhuri. A survey of Hough transform. Pattern Recognition, 48(3):993–1010, 2015.
-  J. R. Peacock, S. Thiel, G. S. Heinson, and P. Reid. Time-lapse magnetotelluric monitoring of an enhanced geothermal system. Geophysics, 78(3):B121–B130, 2013.
-  M. A. Pérez-Flores and A. Schultz. Application of 2-D inversion with genetic algorithms to magnetotelluric data from geothermal areas. Earth, Planets, and Space, 54:607–616, May 2002.
-  M. Sambridge and K. Mosegaard. Monte Carlo methods in geophysical inverse problems. Rev. Geophys., 40(3), 2002.
-  P.-A. Schnegg. A computing method for 3D magnetotelluric modelling directed by polynomials. Earth, Planets, and Space, 51:1005–1012, Oct. 1999.
-  C. Schwarzbach, R.-U. Börner, and K. Spitzer. Two-dimensional inversion of direct current resistivity data using a parallel, multi-objective genetic algorithm. Geophysical Journal International, 162:685–695, Sept. 2005.
-  W. Siripunvaraporn. Three-Dimensional magnetotelluric inversion: An introductory guide for developers and users. Surveys in Geophysics, pages 1–23, May 2011.
-  A. Tarantola. Inverse problem theory and methods for model parameter estimation. SIAM, 2005.
-  T. Xiao-zhong, L. Jian-xin, S. Ya, L. Wen-tai, and X. Ling-hua. A hybrid optimization method based on genetic algorithm for magnetotelluric inverse problem. In Computational Intelligence and Software Engineering, 2009. CiSE 2009. International Conference on, pages 1–4, Dec. 2009.
-  M. Zhdanov, I. Varentsov, J. Weaver, N. Golubev, and V. Krylov. Methods for modelling electromagnetic fields: results from COMMEMI, the international project on the comparison of modelling methods for electromagnetic induction. Journal of Applied Geophysics, 37(3–4):133–271, 1997.