1 Introduction
In 1943 Tikhonov introduced the regularization method for ill-posed problems [9]. His fundamental concept consists of the following three conditions, which should hold when solving an ill-posed problem.
1. One should assume a priori that an ideal exact solution of the problem exists for ideal noiseless data.
2. The correctness set should be chosen a priori, meaning that some a priori bounds should be imposed on the solution of the problem.
3. One should construct a stable numerical method for the problem.
The question of global uniqueness theorems for a broad class of Coefficient Inverse Problems with single measurement data was first addressed positively by A.L. Bukhgeim and M.V. Klibanov in 1981, in their paper [2], and the result has since been applied widely to many physical examples. The first complete proofs were published in two separate papers, [3] and [6]. This technique is now called the "Bukhgeim-Klibanov method." It is the only method enabling proofs of global uniqueness results for multidimensional Coefficient Inverse Problems with single measurement data. The Bukhgeim-Klibanov method is based on the idea of applying so-called Carleman estimates to proofs of uniqueness results for Coefficient Inverse Problems. The main interest in applications, for example in the hyperbolic case, is the case when one of the initial conditions is identically zero and the other is either the δ-function, or the wave field is initialized by a plane wave. The uniqueness question in the latter case remains a long-standing and well-known unsolved problem.
2 The mathematical model
Find an approximate solution u(s, t) of the Black-Scholes equation
(1) \frac{\partial u}{\partial t} + \frac{\sigma^2(t)\, s^2}{2}\, \frac{\partial^2 u}{\partial s^2} = 0
subject to the boundary conditions
(2) u(s_b(t), t) = u_b(t), \quad u(s_a(t), t) = u_a(t)
and the initial condition
(3) u(s, 0) = f(s).
Here s is the stock price, t is time, \sigma(t) is the volatility of the stock option, and u(s, t) is the price of the stock option.
We predict the option price from "today" to "tomorrow" and to "the day after tomorrow". There are 255 trading days annually, so "one day" corresponds to 1/255 of a year; "today" is the initial moment of time, "tomorrow" is one day later, and "the day after tomorrow" is two days later, which defines the time interval of the problem.
To solve the problem, we minimize the functional,
where is the regularization parameter and is a function satisfying the boundary condition (2) and the initial condition (3). It follows from (2) and (3) that
(4) 
(5) 
Uniqueness and existence of the minimizer follow from the Riesz theorem.
Convergence of the minimizers to the exact solution, as the level of error in the boundary and initial data tends to zero, was proven by Klibanov (2015) [7] using a Carleman estimate.
3 T-shape finite difference scheme
3.1 Discretization
The domain of the PDE is a rectangle in the (s, t) plane. The domain can be discretized by tuples (s_i, t_j) [10]. The actual number of grid points can be chosen depending on the performance of the computer and the required accuracy; a compatibility condition on the grid sizes must hold, otherwise it is impossible to solve the system, and we noticed that the results converge well (the relative difference between runs with different grid sizes is small) when the number of grid points is greater than 20.
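As a concrete illustration, the discretization can be sketched as follows in Python, assuming a rectangular domain [s_min, s_max] × [0, t_max]; the function and variable names here are hypothetical, not taken from the paper's code.

```python
import numpy as np

def make_grid(s_min, s_max, n_s, t_max, n_t):
    """Discretize the rectangular domain [s_min, s_max] x [0, t_max]
    into an n_s x n_t grid of (stock price, time) tuples."""
    s = np.linspace(s_min, s_max, n_s)   # stock-price axis
    t = np.linspace(0.0, t_max, n_t)     # time axis
    h = s[1] - s[0]                      # price step
    tau = t[1] - t[0]                    # time step
    return s, t, h, tau

# Example: 21 price nodes between 99 and 101, two trading days ahead
s, t, h, tau = make_grid(99.0, 101.0, 21, 2.0 / 255.0, 3)
```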
Let u_{i,j} denote the option price corresponding to stock price s_i at time t_j [10].
For discretizing the partial derivatives, we considered the derivatives in t and in s separately. We used the backward difference scheme for the first (time) derivative and the standard central difference scheme [4] for the second (price) derivative, i.e.
(6) \frac{\partial u}{\partial t} \approx \frac{u_{i,j} - u_{i,j-1}}{\tau}
(7) \frac{\partial^2 u}{\partial s^2} \approx \frac{u_{i+1,j} - 2u_{i,j} + u_{i-1,j}}{h^2}
where \tau and h denote the time and price grid steps. Then the partial differential equation can be approximated by:
(8) \frac{u_{i,j} - u_{i,j-1}}{\tau} + \frac{\sigma_j^2 s_i^2}{2} \cdot \frac{u_{i+1,j} - 2u_{i,j} + u_{i-1,j}}{h^2} = 0
Consider the equation above at a single point (i, j). It involves four adjacent grid points, including (i, j) itself: (i, j), (i, j-1), (i-1, j), and (i+1, j). The four points form a letter "T" shape in Cartesian coordinates; therefore we call it the T-shape finite difference scheme. With this notation, the Tikhonov-like functional [7] introduced above can also be approximated by:
(9) 
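To make the stencil concrete, here is a minimal sketch of evaluating the T-shape residual at every interior grid point with NumPy slicing; the array layout u[i, j] (price index i, time index j) and all names are assumptions of this sketch, not the paper's code.

```python
import numpy as np

def t_shape_residual(u, s, sigma, h, tau):
    """Residual of the T-shape scheme at every interior grid point.

    u[i, j] ~ option price at stock price s[i] and time index j;
    backward difference in time, central difference in price,
    following the discretization described in the text;
    sigma[j] is the volatility at time index j.
    """
    u_t = (u[1:-1, 1:] - u[1:-1, :-1]) / tau                   # backward in t
    u_ss = (u[2:, 1:] - 2 * u[1:-1, 1:] + u[:-2, 1:]) / h**2   # central in s
    coeff = 0.5 * (sigma[1:] ** 2)[None, :] * (s[1:-1] ** 2)[:, None]
    return u_t + coeff * u_ss
```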
3.2 Serialization
In order to make the minimization of the functional "code friendly", we need a better way to represent it than a big summation. Notice that the T-shape finite difference scheme, when the four points are regarded as a vector, is a linear map from R^4 to R. Also, we can consider the sum of squares in the Tikhonov functional as the squared 2-norm of a vector. These facts imply that we can find a matrix that represents the T-shape differential operator, and a corresponding function that is "nearly equivalent" to the original functional, the only difference being its domain (which is feasibly interchangeable). This leads us to the idea of serialization of the grid values:
(10) 
By doing this, we could form a vector that helps convert the Tikhonov functional to a much simpler form:
(11)  
(12)  
(13) 
where is the vector of values of the function at positions corresponding to the serialized grid. The next step is to find the matrix. Following [10], we found that the matrix can be constructed from Toeplitz-like matrices, diagonal matrices, and the Kronecker product.
Let
and
The matrix can then be constructed by
with being a diagonal matrix with elements corresponding to the factor .
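Under the assumption of a row-major serialization u[i, j] → i·n_t + j, the Kronecker construction described above can be sketched with SciPy sparse matrices (the exact ordering and boundary-row handling in the paper may differ; boundary rows are left as plain stencil rows here):

```python
import numpy as np
import scipy.sparse as sp

def build_operator(s, sigma, h, tau):
    """Sparse matrix M such that M @ u.ravel() approximates the PDE
    residual, assuming row-major serialization u[i, j] -> i * n_t + j."""
    n_s, n_t = len(s), len(sigma)
    # Backward difference in time (Toeplitz-like lower bidiagonal)
    B = sp.diags([np.ones(n_t), -np.ones(n_t - 1)], [0, -1]) / tau
    # Central second difference in stock price
    D2 = sp.diags([np.ones(n_s - 1), -2 * np.ones(n_s), np.ones(n_s - 1)],
                  [1, 0, -1]) / h**2
    # Diagonal factor 0.5 * s_i^2 * sigma_j^2 for each serialized point
    coeff = sp.diags(0.5 * np.outer(s**2, sigma**2).ravel())
    M = sp.kron(sp.identity(n_s), B) + coeff @ sp.kron(D2, sp.identity(n_t))
    return M.tocsr()
```

At an interior point, a row of M reproduces exactly the four-point T-shape stencil.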
4 Preprocessing boundary and initial conditions
The boundary and initial conditions need to be eliminated from the equation, since these variables are prescribed and cannot be considered free variables in the system of equations.
4.1 Why we must remove the boundaries
The necessity of preprocessing the boundary and initial conditions originates from the nature of differential equations. Consider the original PDE.
Suppose u satisfies the equation above; then u + c, where c is an arbitrary constant, is another solution of the PDE, because every term of the equation involves a derivative of u, and the derivatives of a constant vanish. This phenomenon reflects the fact that the boundary and initial conditions are essential to ensure the uniqueness of the final solution.
4.2 Why Tikhonovlike regularization is not enough
We have the Tikhonovlike functional
to prevent the solution from deviating too much from the interpolated data. Although the data function satisfies the boundary and initial conditions, the regularization term does not pin the solution to the boundary/initial conditions, since the regularization parameter is adjustable between runs. The parameter is fine-tuned during the β-selection stage to remove as much noise as possible from the input data; if the β we eventually find is sufficiently small, the effect of the regularization term in constraining the boundary and initial conditions is negligible.
4.3 Assigning values to free variables in system
Let’s take a small real matrix as a quick example. Consider the system :
with the following requirements:
1. ;
2. is the free variable;
3. .
Then and are linearly dependent on , and there exists a unique solution once is given.
Now we define , and solve the system.
Firstly, we perform finitely many row operations to eliminate any nonzero elements in the first row; we then have the new system:
In this case must equal , otherwise there is no solution to the system.
Secondly, we construct an auxiliary vector ( stands for boundary, using the same notation as in the PDE case) to eliminate the known values from the system. We subtract it from the RHS to yield the new system:
Then the solution to the new system exists only if . This lets us remove the corresponding variables from the system, since they are always multiplied by zero in the matrix:
Finally, we shrink the matrix to get our reduced-order system:
Suppose the solution to the reduced system is ; then the final answer for the original system, given that , is .
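The same elimination can be sketched generically in Python; the helper below is hypothetical (not from the paper's code), but it follows the steps of the worked example above: move the contribution of the prescribed unknowns to the right-hand side, then shrink the system.

```python
import numpy as np

def eliminate_fixed(A, b, fixed_idx, fixed_val):
    """Remove prescribed (boundary) unknowns from the system A x = b.

    Subtracts the known contribution A[:, fixed] @ x_fixed from b, then
    deletes the corresponding rows and columns. Returns the reduced
    system and a function that rebuilds the full solution vector.
    """
    fixed_idx = np.asarray(fixed_idx)
    free_idx = np.setdiff1d(np.arange(A.shape[1]), fixed_idx)
    b_shift = b - A[:, fixed_idx] @ fixed_val   # move known values to RHS
    A_red = A[np.ix_(free_idx, free_idx)]       # drop fixed rows/columns
    b_red = b_shift[free_idx]

    def rebuild(x_free):
        """Reassemble the full solution from the reduced one."""
        x = np.empty(A.shape[1])
        x[fixed_idx] = fixed_val
        x[free_idx] = x_free
        return x

    return A_red, b_red, rebuild
```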
4.4 Removing boundaries from
The finite difference approximation of the Tikhonov-like functional is given by
(14) 
from above, where . As we can observe from the previous section, the rows of the matrix corresponding to the boundaries are all zeros. This means that the matrix has a nullity of at least the number of points on the boundaries. Using the same method described above, we first construct the boundary vector, containing the given values on the boundaries and zeros elsewhere, and then subtract the corresponding matrix product from the equation to get
(15) 
Then we delete the all-zero rows and columns corresponding to the boundaries to reduce the order of the matrix. Notice that the data function always satisfies the given boundary values, so we can apply the same reduction to its vector. After the deletion, the operator becomes a smaller matrix and the vectors shrink accordingly. We finally obtain the reduced minimization problem without changing the solution:
(16) 
This method ensures that the minimizer agrees exactly with the prescribed values on the boundaries, producing a credible solution, and it slightly reduces the required computational resources (although the asymptotic runtime complexity is not reduced).
5 Minimization using the conjugate gradient method
Our ultimate goal is to minimize the function
In this chapter, Python is used to implement all the data structures needed and to solve the given numerical minimization problem.
5.1 Normalizing the matrix before applying the conjugate gradient method
We noticed through practical use of the code that the matrix would typically contain huge floating-point values. This occurs mainly because the difference between the ask and bid price of the underlying stock is often very small (or even smaller in more accurate stock markets). This results in very large element values in the matrix, where repeated multiplications of these elements will eventually cause the floating-point numbers to overflow (in Python 3, the largest representable float is about 1.8 × 10^308). Therefore, we normalized each row of the matrix in order to prevent this from happening.
For each row r_i of the system (with corresponding right-hand-side entry b_i), we do
norm_i = L2_norm(r_i)
r_i = r_i / norm_i
b_i = b_i / norm_i
This will eliminate the possibility of overflowing while maintaining the correctness of the solution.
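Vectorized with NumPy, the whole normalization is one shot; this is a sketch, written for a dense matrix for clarity, while the actual operator may be stored as a sparse matrix.

```python
import numpy as np

def normalize_rows(A, b):
    """Divide each row of A and the matching entry of b by the row's
    L2 norm; the solution set of A x = b is unchanged, but all matrix
    entries stay at order one, avoiding floating-point overflow."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    return A / norms, b / norms.ravel()
```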
5.2 Minimization using the Python SciPy module
In order to minimize the functional, the conjugate gradient method (CG) is used, due to the resemblance of our system of equations to the symmetric positive-definite system required by the linear (direct) CG method. However, we chose the general iterative variant of CG, since the loss function minimized by direct CG does not match our given Tikhonov-like functional.
The SciPy module in Python offers the functionality of minimizing a function using CG. Let j_beta be a Python function that takes a vector of the appropriate dimension and returns a floating-point number, and let u be an array holding the initial guess; then calling
scipy.optimize.minimize(j_beta, u, method='CG')
returns an OptimizeResult containing the minimizer, the number of iterations, and the result status (whether CG succeeded). Extracting the result vector and reshaping it into a matrix gives the desired output representing the mesh grid of the domain. The results obtained with this method are shown in the next chapter.
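A self-contained sketch of this minimization follows, with a small random stand-in for the reduced operator and data; the names M, F, and beta follow the text, while the sizes, the test data, and the analytic gradient are illustrative assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import minimize

# Small stand-in problem: M plays the role of the reduced T-shape
# operator, F the interpolated boundary data, beta the regularization
# parameter (all values here are illustrative only).
rng = np.random.default_rng(0)
M = rng.standard_normal((12, 8))
F = rng.standard_normal(8)
beta = 0.01

def j_beta(u):
    """Discrete Tikhonov-like functional ||M u||^2 + beta * ||u - F||^2."""
    return np.sum((M @ u) ** 2) + beta * np.sum((u - F) ** 2)

def j_grad(u):
    """Analytic gradient, which lets CG converge without finite differences."""
    return 2.0 * M.T @ (M @ u) + 2.0 * beta * (u - F)

res = minimize(j_beta, np.zeros(8), jac=j_grad, method="CG")
u_min = res.x  # the minimizer; res.success reports whether CG converged
```

Since the functional is quadratic, the minimizer also solves the normal equations (MᵀM + βI)u = βF, which gives a convenient correctness check.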
6 Results
Tables 1 and 2 display our earlier results for the 368 options we tested in 2015 [7].
Table 1. Profits and losses by three different methods
Method  Number of tested options  Total profit/loss 

BlackScholes  
Last price extrapolation  
Ask price extrapolation 
Table 2. Percentages of options with profits/losses
Method  profit  loss  zero 

BlackScholes  72.83%  16.85%  10.32% 
Last price extrapolation  39.40%  56.80%  3.80% 
Ask price extrapolation  10.86%  88.34%  0.8% 
Figure 2(a) displays the histogram of profits/losses with the trading strategy devised by Professor Klibanov. Figure 2(b) is a zoomed part of Figure 2(a). The profit was , our loss was , and the total profit was the difference between them.
Figure 3 reflects our new, improved results. From the data offered by the Bloomberg Terminal, we obtained 50,446 "data blocks", each consisting of option and stock price data for three consecutive trading days (thus Thursday and Friday connected with Monday are considered consecutive). We then produced the same number of minimizers, one for the prediction of each of these data blocks. Let REAL_1 denote the real end-of-day (EOD) option price for tomorrow and REAL_2 for the day after tomorrow, and let EST_1 and EST_2 be the corresponding estimated option prices; we define the absolute error:
In particular, Figure 3(a) displays the histogram of absolute errors of these 50k estimates compared to real data from Bloomberg terminal. Figure 3(b) represents a zoomed part of figure 3(a).
The same histograms with the best resolution are shown below.
From the histograms above, we calculated that the median of the errors is .
7 Application of machine learning to improve profitability of options trading strategy
Method: We have improved these results by using machine learning to classify the set of minimizers, filtering which trades the strategy includes. We built a 13-element input vector consisting of the minimizers (for ), the stock ask and bid prices (for ), and the option ask and bid prices and volatility (for ).
Supervised machine learning has been applied: a binary classification neural network with the logistic probability loss function with regularization:
(17) 
where are the weights, which are optimized by minimizing the loss function using the method of gradient descent; is the regularization parameter; is one of our normalized 13-dimensional vectors; is the output of the neural network; is the number of vectors in the training set; and is the label (the ground truth). The labels are set to 1 for profitable trades and 0 otherwise. All vectors and labels are split into three parts: training, verification/validation, and test sets. The training set is used for weight learning. The verification/validation set is used for tuning the hyperparameters of the neural network. The test set is used for generating the outcomes of the trading strategy. We compared the profitability of the trading strategy based on the original set of minimizers with the profitability based on the classified set.
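The regularized logistic loss in (17) can be sketched as follows; this is a simplified single-layer stand-in for the full network (all names are hypothetical), where `lam` plays the role of the regularization parameter and the exact scaling of the penalty term is an assumption of this sketch.

```python
import numpy as np

def logistic_loss(w, X, y, lam):
    """Regularized logistic (binary cross-entropy) loss: mean
    cross-entropy over the m training samples plus an L2 penalty
    on the weights."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))    # predicted profit probability
    ce = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    return ce + 0.5 * lam * np.dot(w, w)

def logistic_grad(w, X, y, lam):
    """Gradient used by plain gradient descent on the loss above."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - y) / len(y) + lam * w
```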
7.1 Results:
We created a heuristic model for the neural network with three hidden layers of 50, 25, and 14 dimensions, respectively. We also found that the learning rate and iteration count are proper parameter settings for this neural network. Figure 4 shows the learning curve of the neural network, using the MSE loss function with mean reduction as the evaluation loss function.
We used k-fold cross-validation to ensure the consistency of the neural network, with the value of k indicated in the graph above. The dots in the graph are the mean values of the loss at each epoch (1 epoch = 20 iterations), while the vertical bars represent the standard deviation among the different training datasets. From the graph, we can conclude that the loss converges eventually, and thus the machine learning technique is a feasible way to reduce the noise and errors produced by the purely mathematical solution.
References
 [1] https://bloomberg.com.
 [2] A.L. Bukhgeim and M.V. Klibanov (1981) Uniqueness in the large of a class of multidimensional inverse problems. (Russian). Soviet Math. Doklady (17), pp. 244–247.
 [3] (1981) Carleman estimates for Volterra operators and uniqueness of inverse problems. In: Non-Classical Problems of Mathematical Physics. (Russian). Computing Center of the Siberian Branch of the USSR Academy of Science.
 [4] (2004) Finite Difference Schemes and Partial Differential Equations. SIAM.
 [5] https://github.com/kss39/IllPosedBlackScholesEquation.
 [6] (1981) Uniqueness of solutions in the 'large' of some multidimensional inverse problems. In: Non-Classical Problems of Mathematical Physics. (Russian). Computing Center of the Siberian Branch of the USSR Academy of Science, pp. 101–114.
 [7] (2015) An ill-posed problem for the Black-Scholes equation for a profitable forecast of prices of stock options on real market data. Inverse Problems 32 (1).
 [8] https://money.cnn.com/data/markets/russell.
 [9] A.N. Tikhonov (1943) On the stability of inverse problems. (Russian). Doklady of the USSR Academy of Science (39), pp. 195–198.
 [10] (2015) Inverse Problems in Black-Scholes: Option Price Forecasting with Locally Reconstructed Volatility. Master's Thesis, SimTech 6.