The use of iterative solvers for numerical problems, such as inverse elasticity, is very common in the literature (Delingette2004). Most of them result from combining various methods and techniques that together provide a solution (Delingette2004). Their application to real problems has shown their usefulness, and they have come to be regarded as reliable methods for solving such problems (Delingette2004; tromeur2006; Palomar2008; Rajagopal2008; mohamed2018).
Despite their reliability, these methods are considered costly in terms of time (Azar2002; Delingette2004; tromeur2006; Lopes2018). The time consumption is directly related to the number of iterations required to find a solution (Delingette2004; tromeur2006), which in turn depends strongly on the initial estimate (Delingette2004; marques2016; Lopes2018). With that in mind, several approaches have been developed to provide these solvers with better initial estimates, which can accelerate convergence and therefore reduce the time spent in the solver (tromeur2006; barabasz2014; simoncic2015; marques2016; Lopes2018).
These problems can be seen as regression problems and, on that account, machine learning methods such as multilayer neural networks (MNNs) have become a viable alternative in several fields, from computer science to biology (Hinton2006; LeCun2015; DePristo2016; Goodfellow2016; Zhou2018). In fact, there is already literature where they are used to help solve real-life problems involving iterative methods (Litjens2017; Sun2017; Martinez2017; Hamidinekoo2018; rezaee2019).
The main issue with MNNs and other machine learning methods is that they are not 100% reliable and can occasionally produce outputs outside the required safety margin. In problems where the health of a patient is involved, the use of these methods therefore becomes very limited, since inaccurate data can lead to health complications.
Let us consider, as a case study, the iterative solver proposed by Lopes2018, which is used to estimate the biomechanical parameters of the breast. The method receives a set of breast measurements and starts an iterative process from an initial estimate of the breast parameters. The process concludes when the measurements produced by the estimated parameters match (to a certain degree) the input breast measurements (Lopes2018). Although that work proposes ways to accelerate convergence, the method can, in some cases, need hours to provide an accurate solution (Lopes2018).
We propose a new hybrid method for estimation/regression problems that combines the nearly instantaneous results of a machine learning method (MNN) with the robustness and accuracy of iterative solvers. The method uses data from the known problem to train a neural network, which is then used to provide a solution to the problem. The iterative solver then validates the provided solution. If the solution does not meet the accuracy requirements, the iterative solver refines it.
Note that the intent behind this hybrid method is not to use the MNN as a rough initial estimator, as in other studies (Martinez2017; rezaee2019), but to use the MNN as a replacement for the slow iterative solvers while acknowledging its limitations.
We use the iterative solver mentioned above (Lopes2018) as a case study and conduct several tests. The results show that the method achieves its goals of being faster (MNN working of times) and robust (the iterative solver was able to refine the breast parameters when the MNN solution was not accurate enough).
Subsection 2.1 (in Section 2) details the iterative method (case study) used to demonstrate the validity of the hybrid approach. Section 3 presents and evaluates the neural network approach, comparing it with an iterative method (subsection 2.2), and the hybrid approach is described in subsection 3.2. Finally, in Section 4 we discuss the general results and present some conclusions and possible avenues for future work.
2.1 Case Study: Breast Model
We provide a brief description of the iterative method used to demonstrate the advantages of the hybrid approach. The method presented in Lopes2018 was selected because it is an analytical method and a large set of examples can be generated for both training and testing the neural network.
The breast is a complex structure consisting of a mass of glandular tissue encased in fat, which accounts for its characteristic round shape, and connected to the skin through a series of ligaments. These tissues have different biomechanical properties depending on the patient (age, fat layer length, skin elasticity, among others) and as such respond differently to external perturbations, namely gravity. The deformation of the breast under these external actions is evaluated by considering a simplified stress-free geometrical domain of the breast equipped with a neo-Hookean mechanical model (Palomar2008), where the breast inner tissues and the skin are discretised (as in Lopes2017; Lopes2018).
The breast’s shape and size in a stress-free domain consists in a spherical cap where the plane section is attached to the torso. This geometrical structure is defined by two parameters: the radius and the truncated length of the cap . The breast visco-elastic properties are defined by a set of four mechanical parameters: and for the glandular and fat tissues, and and for the breast skin. The geometrical parameters are denoted by while the mechanical parameters are denoted by and they can be used to generate a digital breast (figure 1).
The six parameters that characterise the breast shape, size and mechanical characteristics have to be determined for each patient. Lopes2018 uses 15 measurements to estimate these six parameters. These measurements consist of the breast's volume, skin surface area, breast height, and frontal and back depth for the patient in three different positions, depicted in Figure 2 (see Lopes2018 for the details).
We aim to find the set of parameters from a given set of measurements carried out on a specific patient. Several methods proposed in the literature provide the parameter identification operator (Azar2002; Palomar2008; Cardoso2010; Lopes2018). We chose the method presented in Lopes2018 because its values are easy to reproduce (it does not require medical imaging data and it allows several different breast configurations to be simulated).
The results obtained with this method showed good accuracy in the estimation of the parameters. The main issue is the large amount of time necessary to estimate the breast parameters (in some cases it can take hours; Lopes2018).
2.2 MNN Estimator
We present an MNN model and detail the training procedure we carry out to emulate the inverse problem solver, i.e. to provide an approximation of as a function of measurements . As in Lopes2018, the main goal of the breast parameter estimation method is to determine a set of parameters that produces a relevant digital model of the real breast, i.e. a model which accurately reproduces the real breast's mechanical behaviour as well as its aesthetics. Such a model is then used to support a surgeon's decision-making regarding surgical procedures and, hence, reduces the risk of errors.
Although accurate, inverse solver methods are too time consuming to be of practical use. Despite several code optimisations, the running times reported in Lopes2018 can go up to several hours of computation whereas MNNs can provide an answer in real-time.
To evaluate the accuracy of the proposed approach we will compare our results with the values presented in Lopes2018 using three different meshes: coarse, medium and thin. The visual difference between these mesh resolutions is depicted in figure 3.
To train, validate and test the MNNs, datasets were created using the numerical breast model proposed in Lopes2017 carrying out numerical simulations with a wide range of variation of coefficients . We elaborate a dataset for each mesh (, and ) with . Each dataset is split into a test set with 10000 cases (), a validation set with 4000 samples (), the remaining 36000 cases being the training set ().
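The split described above can be illustrated with a minimal sketch. It assumes a dataset of 50000 cases per mesh, which is simply the sum of the three subset sizes quoted; the function name `split_dataset` and the shuffling seed are hypothetical and not taken from the original work.

```python
import random

def split_dataset(cases, n_test=10000, n_val=4000, seed=0):
    """Shuffle the cases and split them into training, validation and test sets."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)      # reproducible shuffle (assumed seed)
    test = shuffled[:n_test]                   # 10000 test cases
    val = shuffled[n_test:n_test + n_val]      # 4000 validation cases
    train = shuffled[n_test + n_val:]          # remaining cases for training
    return train, val, test

# Case indices stand in for the (measurements, parameters) pairs.
train, val, test = split_dataset(range(50000))
print(len(train), len(val), len(test))
```

The three subsets are disjoint by construction, so no test case can leak into training.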
2.2.1 Datasets generation
To generate the datasets, random samples are drawn from a Gaussian distribution of the parameters centred on a set of values suggested by surgeons as representing an average looking breast, , , , and . Then a dataset is created as follows:
valid vectors, such that each component is chosen randomly with the Normal law . Set , , . A vector is considered invalid when some of the parameters, or a combination of them, do not represent plausible breast models Lopes2017;
Compute the MNN inputs corresponding to the MNN outputs , using the method detailed in Lopes2018 and create the elements .
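The first step above amounts to rejection sampling: draw a candidate from the Gaussian law and keep it only if it is valid. The sketch below is illustrative only. The mean values, the standard deviation (taken here as a fraction `rel_std` of each mean) and the validity rule `is_plausible` are all placeholders, since the actual values and plausibility criteria come from the surgeons and from Lopes2017.

```python
import random

def is_plausible(params):
    """Hypothetical validity rule: every parameter must be strictly positive."""
    return all(p > 0.0 for p in params)

def draw_valid_vectors(n, means, rel_std, seed=0):
    """Draw n valid parameter vectors by rejection sampling."""
    rng = random.Random(seed)
    vectors = []
    while len(vectors) < n:
        # One Gaussian draw per component, centred on the suggested mean value.
        candidate = [rng.gauss(mu, rel_std * abs(mu)) for mu in means]
        if is_plausible(candidate):            # discard invalid vectors
            vectors.append(candidate)
    return vectors

# Placeholder means for the six parameters (not the surgeon-suggested values).
means = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
samples = draw_valid_vectors(1000, means, rel_std=0.3)
print(len(samples))
```

Each accepted vector would then be passed to the direct solver to compute the corresponding measurements.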
When the dataset is fully generated, its elements are normalised as follows:
where and stand for the mean values over the whole dataset.
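As an illustration only, the sketch below assumes that normalisation divides each component by its mean over the whole dataset. The exact formula is not reproduced in this text, so this form is an assumption, and the helper names are hypothetical.

```python
def column_means(rows):
    """Mean of each component over the whole dataset."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def normalise(rows, means):
    """Assumed normalisation: divide each component by its dataset mean."""
    return [[x / mu for x, mu in zip(row, means)] for row in rows]

def denormalise(rows, means):
    """Inverse operation, needed to map MNN outputs back to physical values."""
    return [[x * mu for x, mu in zip(row, means)] for row in rows]

rows = [[1.0, 10.0], [3.0, 30.0]]      # toy dataset with two components
means = column_means(rows)
scaled = normalise(rows, means)
print(means, scaled)
```

With this choice every normalised component has unit mean, which puts quantities of very different scales (for example a volume and a length) on a comparable footing.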
2.2.2 MNN Configuration
An MNN configuration comprises its architecture, the activation functions and the optimiser, among other parameters. We experimented with several configurations and found that the best results were obtained with an MNN of 12 hidden layers with 128 nodes per layer, using the exponential linear unit activation function for each layer, and the RMSProp optimiser. The input layer has fifteen inputs for the measurements, and the output layer has six nodes for the geometric and mechanical parameters.
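To give a sense of the size of this configuration, the sketch below lists the layer widths and counts the trainable weights and biases of a fully connected network with those widths. It also writes out the exponential linear unit, assuming the common default `alpha = 1.0`, which is not stated in the text. No particular deep learning framework is implied.

```python
import math

# 15 measurements -> 12 hidden layers of 128 nodes -> 6 parameters
LAYER_SIZES = [15] + [128] * 12 + [6]

def elu(x, alpha=1.0):
    """Exponential linear unit; alpha = 1.0 is an assumed default."""
    return x if x > 0.0 else alpha * (math.exp(x) - 1.0)

def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(count_parameters(LAYER_SIZES))  # 184454
```

A network of this size is small by current standards, which is consistent with inference being close to instantaneous.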
The values per mesh reported in this section are an average of 5 runs. Each run is considered complete after 3000 epochs, and the best epoch with respect to the validation set is selected. Note that the training and validation datasets result from splitting a dataset of 40000 cases at each run, where 36000 cases go to training and the remaining 4000 cases go to validation.
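The selection and averaging procedure can be sketched as follows. The per-epoch validation errors are toy values, and the function names are hypothetical.

```python
def best_epoch(val_errors):
    """Return (epoch index, error) of the lowest validation error in one run."""
    epoch = min(range(len(val_errors)), key=val_errors.__getitem__)
    return epoch, val_errors[epoch]

def average_over_runs(runs):
    """Average the best validation error over several independent runs."""
    best = [best_epoch(errors)[1] for errors in runs]
    return sum(best) / len(best)

# Two toy runs of three epochs each (the text uses 5 runs of 3000 epochs).
runs = [[0.9, 0.4, 0.5], [0.8, 0.6, 0.3]]
print(best_epoch(runs[0]), average_over_runs(runs))
```

Selecting the best epoch on the validation set, rather than the last epoch, acts as a simple form of early stopping.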
We draw comparisons between the original Inverse Method (IM) and the Neural Network approach. In table 1, we report on the MNN performance for both the training and validation sets, together with the results in Lopes2018 for the IM approach (relative error with respect to the exact value given in %).
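For reference, a relative error in % with respect to the exact value can be computed per parameter and then averaged over the cases, as in the sketch below. The function names and the toy values are illustrative.

```python
def relative_error_percent(estimated, exact):
    """Relative error of each parameter with respect to its exact value, in %."""
    return [100.0 * abs(e - x) / abs(x) for e, x in zip(estimated, exact)]

def mean_error_per_parameter(estimates, exacts):
    """Average the per-parameter relative errors over all cases."""
    per_case = [relative_error_percent(e, x) for e, x in zip(estimates, exacts)]
    n = len(per_case)
    return [sum(col) / n for col in zip(*per_case)]

# Two toy cases with two parameters each.
estimates = [[1.1, 1.8], [0.9, 2.0]]
exacts = [[1.0, 2.0], [1.0, 2.0]]
print(mean_error_per_parameter(estimates, exacts))
```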
The training results show that the MNNs are capable of associating the set of given measurements to the breast model parameters with different degrees of success depending on the parameters and on the mesh size. Comparing the errors per parameter of the MNN and the IM, we observe that the MNN approach obtains a significantly better approximation for every parameter. Such an improvement comes from the averaging character of the training that reduces the random component of the error contained in the data. In particular, the MNN provides the best approximations for the geometrical parameters ( and ) with only and error respectively with a coarse mesh and and error respectively with a thin mesh. Regarding the mechanical parameters (, and ), the MNN provides the best approximations with the medium mesh size.
A closer analysis of the results for each parameter shows that the geometrical parameters and as well as the mechanical parameters of the breast bulk tissue and are well estimated, with very small errors (values around or lower). The mechanical parameters of the skin present larger errors with approximately error for and approximately error for . The same difficulties in estimating were observed in Lopes2018. In fact, they pointed out that, even at the numerical level, i.e., using the iterative inverse solver, it is difficult to obtain a good approximation (relative error larger than for ) due to the high sensitivity of this parameter to measurement errors.
Taking into consideration the values obtained with the IM, it could be expected that the thinner the mesh, the better the approximation obtained with the MNN. However, Lopes2018 stated that the relation between the parameters and the measurements is nonlinear. This means that any error estimating a certain parameter will affect the other estimated parameters disproportionately (Lopes2018). Therefore, using very coarse or very thin meshes can compromise the overall estimation of the breast parameters, and a possible optimal mesh size can be found where it balances the geometrical aspects of the breast and the detail for the mechanical parameters (Lopes2018).
The validation values of table 1 are similar to the training values. They present values of average error per parameter of for a coarse mesh, for a medium mesh and for a thin mesh. These results are very good when compared with the ones obtained with the IM whose average error per parameter was for a coarse mesh, for a medium mesh and for a thin mesh.
3.1 Multilayer Neural Networks
3.1.1 MNN Performance
We assess the efficiency of the trained MNN model using the test datasets with 10000 cases for each mesh resolution. The results for the MNN model and the IM are presented in table 2, showing the relative errors as well as the time spent by each method. The values for the IM are the same as those presented in table 1 in section 2.2.3.
Test errors are very similar to the validation ones with average error per parameter around for a coarse mesh, for a medium mesh and for a thin mesh. These results show that the MNN method is capable of estimating the breast model parameters with a very good accuracy in comparison with the error of the IM. For all mesh resolutions the MNN method is clearly more efficient since it provides more accurate approximations in a very short time. The error is reduced by and the computational time reduction, reported in the last column, is very significant.
Clearly, such results highlight the interest of replacing the inverse problem solvers, based on the resolution of the mechanical problem, with a neural network at the production level. However, the MNN model can sometimes predict a set of parameters that differs from the exact parameters by a large margin (greater than 30% error per parameter, for instance). On the other hand, the iterative method always provides an accurate estimate within a guaranteed maximum error level (Lopes2018), since it actually solves the physical problem. Therefore, the values obtained from the MNN model need to be used with caution. In other words, a validation and correction procedure has to be established to guarantee the validity of the parameters and make the method more robust.
3.1.2 Methods robustness
The performance results of the MNN model presented in section 3.1.1 show a good accuracy with relative errors. However, these tests were done with the MNN models trained with the exact values of the measurements. In a real scenario the surgeon might not obtain the exact measurements of the breast, i.e. but an approximation with some inaccuracy, and so it is important to test the robustness of the model, i.e. its capacity to handle the uncertainties by limiting the error propagation to the outputs. From surgeons' experience, a potential error up to for each measurement is assumed. With that in mind, we introduced a variation on each measurement of the test datasets (, where ) up to .
The sensitivity analysis performed on the estimation method in Lopes2018 showed that even if the difference in some measurements is small (a few percent), it may have a great impact in several situations by strongly magnifying the error in the parameters. Therefore, it is relevant to evaluate how the MNN models deal with potential errors in the measurements.
For that purpose, we introduce errors in the measurements and evaluate their effect on the estimation of the parameters. To this end, we create a dataset of perturbed measurements using a uniform law that produces a maximal error of 10%. We then compute the associated output and evaluate its error with respect to the exact value. The relative error of the outputs is thus evaluated and compared with the relative errors of the inputs. The propagation of the measurement errors to the parameters by the MNN is presented in table 3, together with the error propagation for the IM previously reported in Lopes2018.
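A perturbed measurement set of this kind can be generated as in the sketch below. It assumes a multiplicative perturbation, where each measurement is scaled by one plus a uniform draw in [-0.10, 0.10]; the exact form used in the study is not reproduced in this text, and the function name and toy values are hypothetical.

```python
import random

def perturb(measurements, max_rel_error=0.10, rng=None):
    """Scale each measurement by (1 + eps), eps uniform in [-max, +max]."""
    rng = rng or random.Random(0)              # assumed seed, for reproducibility
    return [m * (1.0 + rng.uniform(-max_rel_error, max_rel_error))
            for m in measurements]

exact = [350.0, 210.0, 12.5, 8.0, 6.5]         # toy measurement values
noisy = perturb(exact)
rel_errors = [abs(n - e) / e for n, e in zip(noisy, exact)]
print(max(rel_errors))
```

Feeding `noisy` instead of `exact` to the estimator and comparing the outputs with the exact parameters gives the propagated error.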
The difference between the estimated parameters after perturbation of the inputs and the exact values increased as expected due to the errors in the measurements. We report a substantial degradation from the original errors , and (table 2) to , and for the coarse, medium and thin mesh sizes respectively.
The magnification of the errors is essentially determined by the Jacobian matrix of the functional evaluated at the exact measurement value (i.e. without perturbation), and a first-order approximation of the errors is given by
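As an illustration of this first-order argument, the sketch below approximates the Jacobian by forward differences and compares the predicted output error, J times the input error, with the actual one. The map `toy_estimator` is a hypothetical stand-in for the IM or MNN estimator, and all names are illustrative.

```python
def jacobian(f, m, h=1e-6):
    """Forward-difference Jacobian of f at m; J[i][j] = d f_i / d m_j."""
    f0 = f(m)
    J = [[0.0] * len(m) for _ in f0]
    for j in range(len(m)):
        shifted = list(m)
        shifted[j] += h
        fj = f(shifted)
        for i in range(len(f0)):
            J[i][j] = (fj[i] - f0[i]) / h
    return J

def first_order_error(J, dm):
    """First-order estimate of the output error: J times the input error."""
    return [sum(Jij * dj for Jij, dj in zip(row, dm)) for row in J]

def toy_estimator(m):
    """Hypothetical smooth map from measurements to parameters."""
    return [m[0] * m[1], m[0] + 2.0 * m[1]]

m = [2.0, 3.0]                       # exact measurements
dm = [0.01, -0.02]                   # measurement errors
J = jacobian(toy_estimator, m)
predicted = first_order_error(J, dm)
perturbed = [mi + di for mi, di in zip(m, dm)]
actual = [a - b for a, b in zip(toy_estimator(perturbed), toy_estimator(m))]
print(predicted, actual)
```

When the Jacobian coefficients are bounded, the output error stays proportional to the input error, which is the sense in which the propagation is contained.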
According to Lopes2018, the Jacobian coefficients are uniformly bounded and the error propagation is almost contained, leading to reliable values as a first estimator of the breast model parameters. Since the deviations observed with the MNN are very similar to the ones obtained with the IM, we conclude that the MNN is robust.
3.1.3 Measurements deviation
The question arises of the one-to-one correspondence between the measurements and the parameters, namely whether the two operations (with the IM or MNN) and (with the direct solver) are inverses of one another at the numerical level. To this end, as in Lopes2018, we assess the error between given reference measurements and the measurements obtained with the estimated parameters after computing with one of the two methods. We evaluate the relative error using
where represents the relative weight of each measurement (the scale of some measurements such as the volume is very different compared to measurements such as the skin surface area) (Lopes2018). Lopes2018 considered that a set of parameters is accurately evaluated when the value returned by the cost function in equation 1 is lower than a certain threshold in order to ensure very small errors.
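Purely as an illustration, a weighted cost of this kind could take the form of a weighted sum of relative deviations between the reference and the reproduced measurements. This form is an assumption and is not claimed to be equation 1; the weights, the threshold value and the function names below are hypothetical.

```python
def measurement_cost(reference, reproduced, weights):
    """Assumed cost: weighted sum of relative deviations per measurement."""
    return sum(w * abs(r - q) / abs(r)
               for r, q, w in zip(reference, reproduced, weights))

def is_accurate(reference, reproduced, weights, threshold):
    """A parameter set is accepted when the cost is below the threshold."""
    return measurement_cost(reference, reproduced, weights) < threshold

reference = [100.0, 10.0]            # toy reference measurements
reproduced = [101.0, 10.0]           # measurements from estimated parameters
weights = [0.5, 0.5]                 # hypothetical relative weights
print(measurement_cost(reference, reproduced, weights))
```

Using relative deviations keeps large-scale measurements from dominating the cost, while the weights allow further tuning of each measurement's influence.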
We first start with the IM and evaluate the difference between and considering the three mesh sizes. We report in table 4 the results obtained in Lopes2018 for all the measurements , .
We observe an average total error per parameter lower than and we see that, with the exception of , and , every parameter has an error lower than , which indicates an excellent one-to-one correspondence. The error obtained with the other mentioned measurements was also considered small according to Lopes2018 because their values ranged within a few millimetres. Note that the average error per parameter is slightly better for thinner meshes ( of error for a coarse mesh and just and error for the medium and thin mesh respectively).
We performed the same tests using the MNN as a predictor for the parameters and report the errors in table 5.
Results for the MNN (table 5) show smaller errors than the IM (table 4). The average total error per parameter is smaller than which is an less amount of error compared to the results obtained with the IM (table 4). The error values follow the same pattern presented in tables 1 and 2, with the best results being obtained with the medium mesh. Just like with the IM, apart from , and which show higher values of error (, and respectively), the error obtained with the other measurements is below .
In conclusion, when compared to the IM approach, the MNN is capable of producing a set of parameters that yields a lower error in the retrieved measurements and preserves the one-to-one correspondence well.
3.2 Hybrid Method
Most of the parameters provided by the MNN are very good approximations and can be used directly in the breast simulation to determine the shape and behaviour of the breast. However, the MNN model may predict a set of parameters that is not correct, i.e. the error in the measurements given by equation 1 is above the accuracy threshold (), which, in practical terms, is too large to produce a relevant breast simulation. This suggests that assessing the validity of the parameters provided by the MNN is critical and that a correction is necessary in a few situations.
The IM workflow uses a pre-determined set of parameters as the initial guess and then refines them until the measurements generated from those parameters match the input measurements (generally given by the surgeon - Lopes2018). As the literature shows (Delingette2004; barabasz2014), a very good initial guess is a great advantage, since the iterative procedure is reduced to very few stages. Despite being reliable, the iterative method can take several minutes (sometimes hours) to estimate the breast parameters. In a medical context, poor estimations can lead to health problems, but long running times for parameter estimation are not practical and cannot be accommodated during a consultation.
To achieve our goal of reliability within an acceptable time frame, we propose a hybrid method that takes advantage of the two methods: on the one hand, the quick response of the MNN, which provides a fast and in most cases good solution for the parameters; on the other hand, the robustness of the IM, which validates and refines the MNN solution, guaranteeing the overall solution accuracy. The hybrid method starts by running the MNN to obtain an initial estimation. It then uses the iterative method as a validator (i.e. the cost functional of the inverse problem) to check the error. If the error is above a threshold, the IM uses the initial approximation provided by the MNN as a starting point and iterates until the error falls below the threshold (see Figure 4).
If this error is lower than the previously mentioned threshold (), then according to Lopes2018 the estimated parameters are suitable to be used to produce the digital representation of the breast.
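The workflow just described can be sketched on a toy problem. Everything in the block below is hypothetical: the scalar direct solver `forward`, the relative cost, the damped refinement step and the threshold of 1e-3 are stand-ins chosen so that the example runs, and none of them is the solver of Lopes2018.

```python
def hybrid_estimate(measurements, mnn_predict, cost, refine_step,
                    threshold, max_iter=1000):
    """MNN prediction, validated by the cost and refined by the IM if needed."""
    params = mnn_predict(measurements)         # fast initial estimation
    iterations = 0
    while cost(params, measurements) >= threshold and iterations < max_iter:
        params = refine_step(params, measurements)   # one IM iteration
        iterations += 1
    return params, iterations

def forward(params):
    """Toy direct solver: one measurement, twice the single parameter."""
    return [2.0 * params[0]]

def cost(params, measurements):
    """Relative mismatch between reproduced and input measurements."""
    m = measurements[0]
    return abs(forward(params)[0] - m) / abs(m)

def refine_step(params, measurements):
    """Damped correction that halves the residual at each iteration."""
    residual = forward(params)[0] - measurements[0]
    return [params[0] - 0.25 * residual]

m = [10.0]
_, it_accurate = hybrid_estimate(m, lambda x: [x[0] / 2.0], cost, refine_step, 1e-3)
_, it_warm = hybrid_estimate(m, lambda x: [0.6 * x[0]], cost, refine_step, 1e-3)
_, it_cold = hybrid_estimate(m, lambda x: [0.0], cost, refine_step, 1e-3)
print(it_accurate, it_warm, it_cold)
```

In this toy setting an accurate prediction needs no refinement at all, and an inaccurate prediction still converges in fewer iterations than a cold start, which mirrors the two behaviours the hybrid method relies on.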
We want to evaluate the performance of this hybrid method on two metrics:
Fast and accurate solutions - percentage of cases where the MNN obtained an accurate solution by itself;
Validation & refining performance - reliability of the validation as well as the average number of iterations required by the IM to refine non-accurate MNN estimations.
The first metric reveals to what extent a machine learning method like the MNN can replace iterative solvers. The second metric tests the capacity of the MNN to provide good approximations even when it is not accurate, i.e., we are evaluating the premise behind the observations made in Lopes2018, which state that closer initial guesses lead to fewer iterations by the iterative method to obtain an accurate set of parameters.
In summary, we want to determine if under these circumstances, the hybrid method can significantly outperform the IM. In table 6, we present an error analysis estimated by the MNN model for each mesh size with a total number of cases per mesh.
The values presented in table 6 show, for each mesh size, the 4 different types of estimations performed by the MNN trained models: Accurate; Close (difference of parameters up to 10%); Medium (difference of parameters up to 20%) and Far (difference of parameters above 20%). The MNN models are accurate approximately of the times (, and with the MNN coarse, medium and thin models respectively). This means that in of the cases, the hybrid model will be able to estimate the breast parameters almost instantaneously.
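The four categories can be expressed as a simple classification rule. The sketch assumes that the "difference of parameters" is the largest relative difference over the parameters of a case, which is an interpretation rather than a definition given in the text; the `accurate` flag stands for the outcome of the cost-function check.

```python
def categorise(estimated, exact, accurate):
    """Classify an MNN estimate as Accurate, Close, Medium or Far."""
    if accurate:                               # cost below the accuracy threshold
        return "Accurate"
    # Assumed metric: worst relative difference over all parameters.
    worst = max(abs(e - x) / abs(x) for e, x in zip(estimated, exact))
    if worst <= 0.10:
        return "Close"
    if worst <= 0.20:
        return "Medium"
    return "Far"

exact = [1.0, 2.0]
print(categorise([1.05, 2.0], exact, accurate=False))
print(categorise([1.15, 2.0], exact, accurate=False))
print(categorise([1.50, 2.0], exact, accurate=False))
```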
From the total of 30000 cases (10000 cases per mesh size) only 333 cases were not accurate: 230 ( of the cases) were close to the reference values; 63 were medium ( of the cases) and 40 cases were poor (far) estimations ( of the cases). Note that these cases do not necessarily have extreme measurement values; they are spread roughly evenly across the tested parameter spectrum. For these cases we show that the initial guess derived from the trained MNN significantly reduces the time spent to obtain an accurate estimation of the parameters. Instead of an average of 3.27, 11.6 and 26.45 minutes for the coarse, medium and thin meshes respectively, the hybrid method results are always significantly below these averages even when the MNN estimation was poor (1.89, 6.68 and 14.36 minutes for the coarse, medium and thin meshes respectively). The hybrid model improves the robustness of the IM method while providing results in a significantly smaller time frame.
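The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch uses only the counts and times stated in the text; expressing the shares relative to all 30000 cases is an assumption about how the percentages are meant.

```python
total_cases = 30000
not_accurate = {"close": 230, "medium": 63, "far": 40}

# Share of each category, relative to all tested cases (assumed reference).
share = {k: 100.0 * v / total_cases for k, v in not_accurate.items()}
refined = sum(not_accurate.values())           # cases needing IM refinement

# Average IM time versus hybrid time for poor MNN estimates, in minutes.
im_minutes = {"coarse": 3.27, "medium": 11.6, "thin": 26.45}
hybrid_minutes = {"coarse": 1.89, "medium": 6.68, "thin": 14.36}
reduction = {mesh: 100.0 * (1.0 - hybrid_minutes[mesh] / im_minutes[mesh])
             for mesh in im_minutes}

print(refined, {k: round(v, 2) for k, v in share.items()})
print({k: round(v, 1) for k, v in reduction.items()})
```

With these numbers, the cases requiring refinement amount to about 1.1% of the total, and the quoted hybrid times correspond to reductions of roughly 42%, 42% and 46% for the coarse, medium and thin meshes.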
Traditional methods for computing the solution of inverse problems (e.g. estimation of parameters) usually involve iterative solvers and techniques. Such popular techniques require significant computational resources and response times that range from a few minutes to several hours, depending on the required accuracy. Machine learning based approaches offer an alternative way to efficiently address the same problems by providing a solution in real time. However, they are not fully reliable, and that limits their usage, mainly when dealing with real problems (e.g. medical) where reliability is crucial.
A new hybrid method was proposed to combine the strengths of the traditional methods (iterative methods) and the new machine learning methods (multilayer neural networks). We used as a case study an existing iterative method that estimates the breast parameters of a patient from a set of given measurements. The new method uses data from the problem to create and train an MNN that provides an estimation of the breast parameters. The method then uses an IM to validate the estimated parameters and, if the solution is not considered accurate, the IM refines those parameters until the desired accuracy is achieved.
The overall results show that the proposed hybrid method achieved its goals: a fast and reliable method. The MNN was capable of providing an accurate solution for of cases. On the remaining , this method was capable of refining the MNN solution, and when comparing its performance against a standalone iterative method, we observed that it very significantly reduced the number of iterations required to converge to an accurate solution.
This new approach to solving inverse problems shows the potential of combining MNNs with other numerical methods, such as the iterative inverse problem solver, to create more efficient as well as more accurate estimators/predictors for critical scenarios where an inaccurate solution is not tolerated. The tests performed with this new model show great potential for its usage in real situations.
A very large thanks to Dra. Augusta Cardoso for her contribution to this paper.
This work has been supported by the Portuguese Foundation for Science and Technology (FCT) within the Project Scope: UID/CEC/00319/2019 as well as in the framework of the Strategic Funding UID/FIS/04650/2019.